May sound stupid, but I need some advise. Will there be any performance difference between
select colx from tab1 inner join tab2 on tab2.col = tab1.col
and
select colx from tab1 inner join tab2 on tab1.col = tab2.col
I couldn't find any reference on this, but I see the second one used everywhere.
Thanks
Hello, everyoneI have one question about the standard join and inner join, which oneis faster and more reliable? Can you recommend me to use? Please,explain me...ThanksChamnap
Please let me know the way to increase the performance of the below query :
SELECT DISTINCT a.* FROM a INNER JOIN #temp1 b on (a.col1 = b.col1 OR a.col1 IS NULL) INNER JOIN #temp2 c on (a.col2 = c.col1 OR a.col2 IS NULL)
Here, there are no indexes/pk on the columns in any table. But I am sure that the table #temp1 and #temp2 has distinct/unique values in columns col1 used here. The table 'a' has redandant values in its column used here.
Should I create pk on the columns for #temp1 and #temp2 used here. Is that enough ? Or should I also create index on the columns of the table 'a' used here.
Also please let me know is there anyother way to increase the performance of the query.
I am having two tables contains 6 lakhs record and 30,000 record respectively. i make a innner join between these two tables it take min 15 sec. i dint know wats the problem can any 1 help me.
Hi everyone I need a solution for this query. It is working fine for 2 tables but when there are 1000's of records in each table and query has more than 2 tables. The process never ends. Here is the query (select siqPid= 1007, t1.Gmt909Time as GmtTime,(t1.engValue+t2.engValue+t3.engValue+t4.engValue) as EngValue, t1.Loc1Time as locTime,t1.msgId into #temp5 from #temp1 as t1,#temp2 as t2,#temp3 as t3,#temp4 as t4 where t1.Loc1Time = t2.Loc1Time and t2.Loc1Time = t3.Loc1Time and t3.Loc1Time = t4.Loc1Time) I was trying to do something with this query.
But the engValues cant be summed up. and if I add that in the query, the query isnt compiling. (select siqPid= 1007, t1.Gmt909Time as GmtTime, t1.Loc1Time as locTime,t1.msgId,(t1.engValue+t2.engValue+t3.engValue+t4.engValue) as engValue --into #temp5 from #temp1 as t1 where exists (Select 1 from #temp2 as t2 where t1.Loc1Time = t2.Loc1Time and exists (Select 1 from #temp3 as t3 where t2.Loc1Time = t3.Loc1Time and exists (Select 1 from #temp4 as t4 where t3.Loc1Time = t4.Loc1Time))))
I need immediate help on that, I would appreciate an input on it.
I am developing reporting service and using lots of 'LEFT OUTER JOIN',I am worried about the performance and want to use some subquery toimprovethe performance.Could I do that like below,[the origin source]SELECT *FROM TableALEFT OUTER JOIN TableBON TableA.item1 = TableB.item1WHERE TableA.item2 = 'xxxx'TableB.item2 > yyyy AND TableB.item2 < zzzzI add the subquery to query every table before 'LEFT JOIN'--------------------------------------------------------------------------SELECT *FROM(SELECT *FROM TableAWHERE TableA.item2 = 'xxxx') TableCLEFT OUTER JOIN(SELECT *FROM TableBWHERE TableB.item2 > yyyy AND TableB.item2 < zzzz) TableDON TableC.item1 = TableD.item1WHERE TableC.item2 = 'xxxx'TableD.item2 > yyyy AND TableD.item2 < zzzz--------------------------------------------------------------------------Can anyone give me some suggestion?Thanks a lot.Leland Huang
We are using SSIS to transfer data from OLTP database to DataWarehouse database on a daily basis. Our solution is modelled on / a copy of the (excellent) Project Real solution.
first stage of process, we extract the daily "records" into 4 stage files (raw file source) Then we process these 4 files to populate our Fact and Dimension tables.
We are having a problem with the population of a couple of our dimension tables. e.g, Comment dimension table - DimComment Our aim is to only add records to the table that do not already exist
So on "left side" of dataflow
1) read contents of stage file - just get commenttext column 2) use derived column component to add three columns (updatedby, lastupdated, ETLLoadID) and to Trim(commenttext) 3) use sort component to sort output - sort on commenttext and remove duplicates on "right side" of dataflow
1) read the contents of the DimComments table
select ltrim(rtrim(CommentText)) as CommentText, CommentKey from DimComment order by ltrim(rtrim(CommentText)) - remembering to set both the IsSorted and SortOrder properties for the component.
then we use a Merge Join component to merge the two dataflows. Within the Merge Join, we use a Left Outer Join so that we get all of the commenttext records from the daily stage file - which will have a CommentKey from the dimcomment table if there is a match (matching on comment text) then we use a Conditional Split component to remove records from the data flow where the CommentKey is not null - i.e. we only want records that dont already exist onthe comment table. finally, update DimComment table
Problem Database table not being updated correctly - has duplicates Problem appears to be with the Merge Join component.
From the existing records on the DimComment table, we get 1,943,309 records From the daily stage file, we get 2,578 records - that after the sort (and duplicate removal) is reduced to 776 records After the Merge Join, this is reduced to 771 records - only the first 5 records are matched - but it should be reduced by more....
if i amend the reading of the DimComment table to have the following clause
where (commenttext like '%automation comm%') then it returns 8,347 rows - and 20 rows are matched - correct if i amend the reading of the DimComment table to have the following clause
where (commenttext like '%automation com%') then it returns 603,286 rows - and 358 rows are matched - correct if i amend the reading of the DimComment table to have the following clause
where (commenttext like '%automation%') then it returns 899,462 rows - and 0 rows are matched - incorrect if i amend the reading of the DimComment table to have the following clause
where (commenttext like 'a%' or commenttext like 'b%'
or commenttext like 'c%' or commenttext like 'd%' or commenttext like 'e%') then it returns 899,462 rows - and 29 rows are matched - incorrect - did most of them - but not all ???
in theory - if i run the process twice, i should get NO updates second time through - this is NOT the case !!
so it would appear that in some cases, the Merge Join component is doing its merge join before it has got all of its records from the DimComment dimension table - matching appear to work on diffferent sets of records - as long as there are not too many records to process
Question 1 is it possible to correct this ? Question 2 is there a limitation on the processing of the Merge Join component in number of records it can handle ? - if so, what is it ? - we have two other SSIS packages doing similar processing to this (but not as many records - YET) (tablerow structure on dimensions tables is short - only 3-4 columns per row) Question 3 might it be better to process the daily comments by after doing the sort, doing a lookup on the DimComment table - and then inserting into the DimComment table if there is no match found
FYI above details got from running app on my PC on local database (SQL2005) - database is a copy of production (couple of weeks old)
I'm experiencing performance problems with the merge join task. Every time I'm building a nice package using this task, I'm ending up deleting it and using SQL statement in the OLE DB source to accomplish the join since it takes forever to run and crushing my computer at the process. It makes me feel I don't use the abilities SSIS has to offer compared to DTS. Of course for the use of several thousands of records it works fine, but in a production surrounding with hundred of thousands of rows, it seems to be futile.
I am making a ASP.NET web application that involves 2 SQL Server(A & B). I created a view in SQL server A pointing to the table in SQL Server B. I found out my application will run REALLY slow when accessing such a view. so I try to avoid using them. But in the case of 2 table joining from 2 different SQL Servers, I have no choice. Can anyone help me with this? Thanks!
Hello folks,first of all I really don't know how you gurus call this way ofwriting joins:SELECTA.FIELD,B.FIELDFROMTABLE_A A,TABLE_B BWHEREA.ID_FIELD = B.ID_FIELDI find this way very useful and readable. It works also with left andright Joins (using *= or =* instead of = )A friend of mine found that the inner join way (using = ) in Access ismuch more slower than using the classic INNER JOIN TABLE ON FIELDsintax. My question is: was MSSQL Server studied for using the shortway, or it is just a workaround found by someone? Is there aperformance degrade folllowing this way?TIA,tK
We have a table with a couple of computed columns. The value of the computed column represents a foreign key reference into another table. We're seeing a major performance problem doing a query joining between the two tables with one of the columns, but not the other. In other words, this kind of query is very fast:
select * from TheTable A, FKeyTable B where A.ComputedColumn1 = B.KeyColumn
but this one sends the CPU usage of SQL Server to 99% for a very long time:
select * from TheTable A, FKeyTable B where A.ComputedColumn2 = B.KeyColumn
The main difference we can see that the computed column that causes problems is based on a UDF, and the other one isn't (but again, both are computed). When I look at the execution plan, the slow query shows a Nested Loop (Inner Join) with a "No Join Predicate" warning, with the estimated # of rows being 70 million (which correponds to the product of 1016 rows in TheTable and 69K rows in FKeyTable). The fast query doesn't have that warning, and shows 1016 rows (the # of rows in TheTable).
Does anyone know why the usage of a UDF would induce this horribly inefficient join behavior? Anything we can do to fix it?
1. Right now in my queries I am using lots of LEFT Joins and INNER JOINs... and I was suggested to look at 'IN'... But with IN I did face some performance issues previously and stopped using it... but I have got new doubts on which query will give me better performance...
A query using LEFTJoin or a query using IN/NOT-IN
2. This question is about CONVERT...
I have a stored proc which is used for updating a table... and multiple columns [of the same table] and corresponding values are sent to the proc [only a subset of the columns might be sent for updates everytime and the columns to update is not fixed for each run of the SP]...
I have to construct a UPDATE String out of it using string concatenation to finally be able to use "sys.sp_executesql" on that update statement...
This results in me having to use CONVERT() lots of times... and one of the columns among them on which I am doing a CONVERT is of the type XML...
So the question is as follows... a. Is it preferrable to construct a single UPDATE statement string and execute it using "sys.sp_executesql" b. Or Is it preferrable to give multiple UPDATE statments... i.e. one update statement for each column [Depending on whether that column has to be updated for that run or not]
i.e. The question essentially is:
Does a single update query constructed using lots of CONVERTS [Basically on INT and XML types] give more performance over using multiple UPDATE statments on the table Or is it the other way round..
Hello Everyone,I have a very complex performance issue with our production database.Here's the scenario. We have a production webserver server and adevelopment web server. Both are running SQL Server 2000.I encounted various performance issues with the production server with aparticular query. It would take approximately 22 seconds to return 100rows, thats about 0.22 seconds per row. Note: I ran the query in singleuser mode. So I tested the query on the Development server by taking abackup (.dmp) of the database and moving it onto the dev server. I ranthe same query and found that it ran in less than a second.I took a look at the query execution plan and I found that they we'rethe exact same in both cases.Then I took a look at the various index's, and again I found nodifferences in the table indices.If both databases are identical, I'm assumeing that the issue is relatedto some external hardware issue like: disk space, memory etc. Or couldit be OS software related issues, like service packs, SQL Serverconfiguations etc.Here's what I've done to rule out some obvious hardware issues on theprod server:1. Moved all extraneous files to a secondary harddrive to free up spaceon the primary harddrive. There is 55gb's of free space on the disk.2. Applied SQL Server SP4 service packs3. Defragmented the primary harddrive4. Applied all Windows Server 2003 updatesHere is the prod servers system specs:2x Intel Xeon 2.67GHZTotal Physical Memory 2GB, Available Physical Memory 815MBWindows Server 2003 SE /w SP1Here is the dev serers system specs:2x Intel Xeon 2.80GHz2GB DDR2-SDRAMWindows Server 2003 SE /w SP1I'm not sure what else to do, the query performance is an order ofmagnitude difference and I can't explain it. To me its is a hardware oroperating system related issue.Any Ideas would help me greatly!Thanks,Brian T*** Sent via Developersdex http://www.developersdex.com ***
I was writing a query using both left outer join and inner join. And the query was ....
SELECT Â Â Â Â Â Â Â S.companyname AS supplier, S.country,P.productid, P.productname, P.unitprice,C.categoryname FROM Â Â Â Â Â Â Â Production.Suppliers AS S LEFT OUTER JOIN Â Â Â Â Â Â (Production.Products AS P Â Â Â Â Â Â Â Â INNER JOIN Production.Categories AS C
[code]....
However ,the result that i got was correct.But when i did the same query using the left outer join in both the cases
i.e..
SELECT Â Â Â Â Â Â Â S.companyname AS supplier, S.country,P.productid, P.productname, P.unitprice,C.categoryname FROM Â Â Â Â Â Â Â Production.Suppliers AS S LEFT OUTER JOIN (Production.Products AS P LEFT OUTER JOIN Production.Categories AS C ON C.categoryid = P.categoryid) ON S.supplierid = P.supplierid WHERE S.country = N'Japan';
The result i got was same,i.e
supplier   country   productid   productname   unitprice   categorynameSupplier QOVFD   Japan   9   Product AOZBW   97.00   Meat/PoultrySupplier QOVFD   Japan  10   Product YHXGE   31.00   SeafoodSupplier QOVFD   Japan  74   Product BKAZJ   10.00   ProduceSupplier QWUSF   Japan   13   Product POXFU   6.00   SeafoodSupplier QWUSF   Japan   14   Product PWCJB   23.25   ProduceSupplier QWUSF   Japan   15   Product KSZOI   15.50   CondimentsSupplier XYZ   Japan   NULL   NULL   NULL   NULLSupplier XYZ   Japan   NULL   NULL   NULL   NULL
and this time also i got the same result.My question is that is there any specific reason to use inner join when join the third table and not the left outer join.
OLEDB source 1 SELECT ... ,[MANUAL DCD ID] <-- this column set to sort order = 1 ... FROM [dbo].[XLSDCI] ORDER BY [MANUAL DCD ID] ASC
OLEDB source 2 SELECT ... ,[Bo Tkt Num] <-- this column set to sort order = 1 ... FROM ....[dbo].[FFFenics] ORDER BY [Bo Tkt Num] ASC
These two tasks are followed immediately by a MERGE JOIN
All columns in source1 are ticked, all column in source2 are ticked, join key is shown above. join type is left outer join (source 1 -> source 2)
result of source1 (..dcd column) ... 4-400-8000119 4-400-8000120 4-400-8000121 4-400-8000122 <--row not joining 4-400-8000123 4-400-8000124 ...
result of source2 (..tkt num column) ... 4-400-1000118 4-400-1000119 4-400-1000120 4-400-1000121 4-400-1000122 <--row not joining 4-400-1000123 4-400-1000124 4-400-1000125 ...
All other rows are joining as expected. Why is it failing for this one row?
I'm having trouble with a multi-table JOIN statement with more than one JOIN statement.
For each order, I need to return the following: CarsID, CarModelName, MakeID, OrderDate, ProductName, Total ordered the Car Category.
The carid (primary key) and carmodelname belong to the Cars table. The makeid and orderdate belong to the OrderDetails table. The productname and carcategory belong to the Product table.
The number of rows returned should be the same as the number of rows in OrderDetails.
Why would I use a left join instead of a inner join when the columns entered within the SELECT command determine what is displayed from the query results?
I have a merge join (full outer join) task in a data flow. The left input comes from a flat file source and then a script transformation which does some custom grouping. The right input comes from an oledb source. The script transformation output is asynchronous (SynchronousInputID=0). The left input has many more rows (200,000+) than the right input (2,500). I run it from VS 2005 by right-click/execute on the data flow task. The merge join remains yellow and the task never finishes. I do see a row count above the flat file destination that reaches a certain number and seems to get stuck there. When I test with a smaller file on the left it works OK. Any suggestions?
A piece of software I wrote starting timing out on a query that left outer joins a table to a view. Both the table and view have approximately the same number of rows (about 170000).
The table has 2 very similar columns, one is a varchar(1) and another is varchar(100). Neither are included in any index and beyond the size difference, the columns have the same properties. One of the employees here uses the varchar(1) column (called miscsearch) to tag large sets of rows to perform some action on. In this case, he had set 9000 rows miscsearch value to "g". The query then should join the table and view for all rows where miscsearch is set to g in the table. This query takes at least 20 minutes to run (I stopped it at this point).
If I remove the "where" clause and join all rows in the two tables, the query completes in about 20 seconds. If set the varchar(100) column (called descrip) to "g" for the same rows set via miscsearch, the query completes in about 20 seconds.
If I force the join type to a hash join, the query completes using miscsearch in about 30 seconds.
So, this works:
SELECT di.File_No, prevPlacements, balance,'NOT PLACED' as status FROM Info di LEFT OUTER HASH JOIN View_PP pp ON di.ram_file_no = pp.file_no WHERE miscsearch = 'g' ORDER BY balance DESC
and this works:
SELECT di.File_No, prevPlacements, balance,'NOT PLACED' as status FROM Info di LEFT OUTER JOIN View_PP pp ON di.ram_file_no = pp.file_no WHERE descrip = 'g' ORDER BY balance DESC
But this does't:
SELECT di.File_No, prevPlacements, balance,'NOT PLACED' as status FROM Info di LEFT OUTER JOIN View_PP pp ON di.ram_file_no = pp.file_no WHERE miscsearch = 'g' ORDER BY balance DESC
What should I be looking for here to understand why this is happening?
We are trying to migrate from sql 2005 to 2012. I am changing one of the implicit join to explicit join. As soon as I change the join, the number of rows returned are fewer than before.
INSERT #RIF_TEMP1 (rf1_row_no,rf1_rif, rf1_key_id_no, rf1_last_date, rf1_start_date) SELECT currow.rf0_row_no, currow.rf0_rif, currow.rf0_key_id_no, prevrow.rf0_start_date, currow.rf0_start_date FROM #RIF_TEMP0 currow LEFT JOIN #RIF_TEMP0 prevrow ON (currow.rf0_row_no = prevrow.rf0_row_no + 1)
[Code] ....
the count returned from both the queries is different.
I am not sure what am I doing wrong. The count of #RIF_TEMP0 is always 32, it never changes, but the variable @countTemp is different for both the queries.
Why does this right join return the same results as using a left (or even a full join)?There are 470 records in Account, and there are 1611 records in Contact. But any join returns 793 records.
select Contact.firstname, Contact.lastname, Account.[Account Name] from Contact right join Account on Contact.[Account Name] = Account.[Account Name] where Contact.[Account Name] = Account.[Account Name]
Is there a way to do a super-table join ie two table join with no matching criteria? I am pulling in a sheet from XL and joining to a table in SQLServer. The join should read something like €œfor every row in the sheet I need that row and a code from a table. 100 rows in the sheet merged with 10 codes from the table = 1000 result rows.
This is the simple sql (no join on the tables):
select 1.code, 2.rowdetail from tblcodes 1, tblelements 2
I read that merge joins work a lot faster than hash joins. How would you convert a hash join into a merge join? (Referring to output on Execution Plan diagrams.) THANKS
Hello Everyone,I have a very complex performance issue with our production database.Here's the scenario. We have a production webserver server and adevelopment web server. Both are running SQL Server 2000.I encounted various performance issues with the production server witha particular query. It would take approximately 22 seconds to return100 rows, thats about 0.22 seconds per row. Note: I ran the query insingle user mode. So I tested the query on the Development server bytaking a backup (.dmp) of the database and moving it onto the devserver. I ran the same query and found that it ran in less than asecond.I took a look at the query execution plan and I found that they we'rethe exact same in both cases.Then I took a look at the various index's, and again I found nodifferences in the table indices.If both databases are identical, I'm assumeing that the issue isrelated to some external hardware issue like: disk space, memory etc.Or could it be OS software related issues, like service packs, SQLServer configuations etc.Here's what I've done to rule out some obvious hardware issues on theprod server:1. Moved all extraneous files to a secondary harddrive to free up spaceon the primary harddrive. There is 55gb's of free space on the disk.2. Applied SQL Server SP4 service packs3. Defragmented the primary harddrive4. Applied all Windows Server 2003 updatesHere is the prod servers system specs:2x Intel Xeon 2.67GHZTotal Physical Memory 2GB, Available Physical Memory 815MBWindows Server 2003 SE /w SP1Here is the dev serers system specs:2x Intel Xeon 2.80GHz2GB DDR2-SDRAMWindows Server 2003 SE /w SP1I'm not sure what else to do, the query performance is an order ofmagnitude difference and I can't explain it. To me its is a hardware oroperating systemrelated issue.Any Ideas would help me greatly!Thanks,Brian T
There is a table called "tblvZipCodes" that contain a zipcode of all cities, area code that are located in that zip code.
The problem I have with the inner join is that there are more than 1 cities in one zipcode code. Is there a way to just return only the 1st row and not return the rest of the rows from the tblvZipCodes in the INNER JOIN query?
Thanks..
Code:
SELECT TOP 100 PERCENT dbo.tblPurchaseRaw.Year, dbo.tblPurchaseRaw.Make, dbo.tblPurchaseRaw.Model, dbo.tblPurchaseRaw.ModelType, dbo.tblPurchaseRaw.Color, dbo.tblvZipCodes.ZIPCode, dbo.tblvZipCodes.City, dbo.tblvZipCodes.County, dbo.tblvZipCodes.State, dbo.tblvZipCodes.AreaCode, dbo.tblvZipCodes.Region, dbo.tblaAccounts.Name, dbo.tblaAccounts.PhoneOne, dbo.tblaAccounts.AccountID, dbo.tblPurchaseRaw.AcceptedID, dbo.tblPurchaseRaw.Series, dbo.tblPurchaseRaw.BodyStyle, dbo.tblaAccounts.WebSite, dbo.tblaAccounts.SalesEmail, dbo.tblPurchaseRaw.EmailTo, dbo.tblPurchaseRaw.PhotoURL, dbo.tblPurchaseRaw.Mileage, dbo.tblPurchaseRaw.RawID, dbo.tblvRegions.Name AS RegionName, dbo.tblPurchaseRaw.VIN, dbo.tblPurchaseRaw.Style, dbo.tblPurchaseRaw.StockDate FROM dbo.tblPurchaseRaw INNER JOIN dbo.tblaAccounts ON dbo.tblPurchaseRaw.AccountID = dbo.tblaAccounts.AccountID INNER JOIN dbo.tblvZipCodes ON dbo.tblPurchaseRaw.ZipCode = dbo.tblvZipCodes.ZIPCode INNER JOIN dbo.tblvRegions ON dbo.tblvZipCodes.Region = dbo.tblvRegions.RegionID WHERE (CONVERT(char, dbo.tblPurchaseRaw.StockDate, 101) <> '01/01/1900') AND (dbo.tblPurchaseRaw.SoldRawID IS NULL) AND (dbo.tblPurchaseRaw.AcceptedID <> - 10) AND (dbo.tblPurchaseRaw.AcceptedID <> - 1) ORDER BY dbo.tblvZipCodes.ZIPCode