Please let me know how I can improve the performance of the query below:
SELECT DISTINCT a.*
FROM a
INNER JOIN #temp1 b ON (a.col1 = b.col1 OR a.col1 IS NULL)
INNER JOIN #temp2 c ON (a.col2 = c.col1 OR a.col2 IS NULL)
Here, there are no indexes or primary keys on the columns in any of the tables. But I am sure that tables #temp1 and #temp2 have distinct/unique values in the col1 columns used here. Table 'a' has redundant values in the columns used here.
Should I create a primary key on those columns in #temp1 and #temp2? Is that enough? Or should I also create an index on the columns of table 'a' used here?
Also, please let me know if there is any other way to increase the performance of the query.
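If the temp tables really are unique on col1, declaring that usually helps the optimizer. A minimal sketch (assuming col1 is NOT NULL in both temp tables, since a primary key requires it):

ALTER TABLE #temp1 ADD PRIMARY KEY (col1);
ALTER TABLE #temp2 ADD PRIMARY KEY (col1);
CREATE INDEX IX_a_col1_col2 ON a (col1, col2);

Note, though, that the OR a.col1 IS NULL / OR a.col2 IS NULL terms in the join conditions tend to defeat index seeks regardless; if those NULL rows are genuinely needed, rewriting the query as an indexable equality join UNION'ed with the IS NULL cases is often the bigger win.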
We have a table with a couple of computed columns. The value of each computed column represents a foreign key reference into another table. We're seeing a major performance problem with a query that joins the two tables on one of the columns, but not on the other. In other words, this kind of query is very fast:
select * from TheTable A, FKeyTable B where A.ComputedColumn1 = B.KeyColumn
but this one sends the CPU usage of SQL Server to 99% for a very long time:
select * from TheTable A, FKeyTable B where A.ComputedColumn2 = B.KeyColumn
The main difference we can see is that the computed column that causes problems is based on a UDF, and the other one isn't (but again, both are computed). When I look at the execution plan, the slow query shows a Nested Loop (Inner Join) with a "No Join Predicate" warning, with the estimated number of rows being 70 million (which corresponds to the product of 1016 rows in TheTable and 69K rows in FKeyTable). The fast query doesn't have that warning, and shows 1016 rows (the number of rows in TheTable).
Does anyone know why the usage of a UDF would induce this horribly inefficient join behavior? Anything we can do to fix it?
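One avenue worth exploring, sketched below with illustrative names (dbo.TheUdf and SourceColumn are assumptions, not from the original post): if the UDF can be made deterministic and schema-bound, the computed column can be persisted and indexed, which gives the optimizer real statistics and a usable join predicate instead of a compute-per-row black box:

ALTER TABLE TheTable DROP COLUMN ComputedColumn2;
ALTER TABLE TheTable ADD ComputedColumn2 AS dbo.TheUdf(SourceColumn) PERSISTED;
CREATE INDEX IX_TheTable_ComputedColumn2 ON TheTable (ComputedColumn2);

PERSISTED requires the UDF to be created WITH SCHEMABINDING and to be deterministic, so the function itself may need altering first.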
Hello, everyone. I have one question about the standard join and the inner join: which one is faster and more reliable? Which can you recommend I use? Please explain. Thanks, Chamnap
This may sound stupid, but I need some advice. Will there be any performance difference between

select colx from tab1 inner join tab2 on tab2.col = tab1.col

and

select colx from tab1 inner join tab2 on tab1.col = tab2.col
I couldn't find any reference on this, but I see the second one used everywhere. Thanks
Hello everyone, I have a very complex performance issue with our production database. Here's the scenario. We have a production web server and a development web server. Both are running SQL Server 2000.

I encountered various performance issues on the production server with a particular query. It would take approximately 22 seconds to return 100 rows, that's about 0.22 seconds per row. Note: I ran the query in single-user mode. So I tested the query on the development server by taking a backup (.dmp) of the database and moving it onto the dev server. I ran the same query and found that it ran in less than a second.

I took a look at the query execution plans and found that they were exactly the same in both cases. Then I took a look at the various indexes, and again I found no differences in the table indices.

If both databases are identical, I'm assuming that the issue is related to some external hardware issue like disk space, memory, etc. Or it could be OS software related issues, like service packs, SQL Server configurations, etc.

Here's what I've done to rule out some obvious hardware issues on the prod server:
1. Moved all extraneous files to a secondary hard drive to free up space on the primary hard drive. There is 55 GB of free space on the disk.
2. Applied the SQL Server SP4 service pack.
3. Defragmented the primary hard drive.
4. Applied all Windows Server 2003 updates.

Here are the prod server's system specs: 2x Intel Xeon 2.67 GHz; total physical memory 2 GB, available physical memory 815 MB; Windows Server 2003 SE w/ SP1.

Here are the dev server's system specs: 2x Intel Xeon 2.80 GHz; 2 GB DDR2-SDRAM; Windows Server 2003 SE w/ SP1.

I'm not sure what else to do; the query performance is an order of magnitude different and I can't explain it. To me it is a hardware or operating system related issue. Any ideas would help me greatly!

Thanks,
Brian T
I have two tables containing 600,000 (6 lakh) records and 30,000 records respectively. When I do an inner join between these two tables it takes at least 15 seconds. I don't know what the problem is; can anyone help me?
Hi everyone, I need a solution for this query. It works fine for 2 tables, but when there are thousands of records in each table and the query has more than 2 tables, the process never ends. Here is the query:

select siqPid = 1007, t1.Gmt909Time as GmtTime,
       (t1.engValue + t2.engValue + t3.engValue + t4.engValue) as EngValue,
       t1.Loc1Time as locTime, t1.msgId
into #temp5
from #temp1 as t1, #temp2 as t2, #temp3 as t3, #temp4 as t4
where t1.Loc1Time = t2.Loc1Time
  and t2.Loc1Time = t3.Loc1Time
  and t3.Loc1Time = t4.Loc1Time

I was trying to do something with this query. But the engValues can't be summed up that way, and if I add the sum to the query, the query doesn't compile:

select siqPid = 1007, t1.Gmt909Time as GmtTime, t1.Loc1Time as locTime, t1.msgId,
       (t1.engValue + t2.engValue + t3.engValue + t4.engValue) as engValue
--into #temp5
from #temp1 as t1
where exists (Select 1 from #temp2 as t2
              where t1.Loc1Time = t2.Loc1Time
              and exists (Select 1 from #temp3 as t3
                          where t2.Loc1Time = t3.Loc1Time
                          and exists (Select 1 from #temp4 as t4
                                      where t3.Loc1Time = t4.Loc1Time)))

I need immediate help with this; I would appreciate any input.
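The EXISTS version can't compile because t2, t3 and t4 exist only inside their subqueries, so their engValue columns are out of scope in the select list. A sketch of what usually helps instead (assuming Loc1Time is the common key, as in the first query): keep the join form so all four engValue columns stay in scope, and index the join column on each temp table:

CREATE INDEX IX_t1_Loc1Time ON #temp1 (Loc1Time);
CREATE INDEX IX_t2_Loc1Time ON #temp2 (Loc1Time);
CREATE INDEX IX_t3_Loc1Time ON #temp3 (Loc1Time);
CREATE INDEX IX_t4_Loc1Time ON #temp4 (Loc1Time);

SELECT siqPid = 1007,
       t1.Gmt909Time AS GmtTime,
       (t1.engValue + t2.engValue + t3.engValue + t4.engValue) AS EngValue,
       t1.Loc1Time AS locTime,
       t1.msgId
INTO #temp5
FROM #temp1 AS t1
INNER JOIN #temp2 AS t2 ON t1.Loc1Time = t2.Loc1Time
INNER JOIN #temp3 AS t3 ON t2.Loc1Time = t3.Loc1Time
INNER JOIN #temp4 AS t4 ON t3.Loc1Time = t4.Loc1Time;

Without the indexes, every added table multiplies the comparison work; with them, each join becomes an index seek per row.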
I am developing a reporting service and using lots of LEFT OUTER JOINs. I am worried about the performance and want to use subqueries to improve it. Could I do that like below?

[the original source]

SELECT *
FROM TableA
LEFT OUTER JOIN TableB ON TableA.item1 = TableB.item1
WHERE TableA.item2 = 'xxxx'
  AND TableB.item2 > yyyy AND TableB.item2 < zzzz

I add a subquery to filter every table before the LEFT JOIN:

SELECT *
FROM (SELECT * FROM TableA WHERE TableA.item2 = 'xxxx') TableC
LEFT OUTER JOIN (SELECT * FROM TableB WHERE TableB.item2 > yyyy AND TableB.item2 < zzzz) TableD
  ON TableC.item1 = TableD.item1
WHERE TableC.item2 = 'xxxx'
  AND TableD.item2 > yyyy AND TableD.item2 < zzzz

Can anyone give me some suggestions? Thanks a lot. Leland Huang
We are using SSIS to transfer data from an OLTP database to a data warehouse database on a daily basis. Our solution is modelled on (in fact, a copy of) the excellent Project Real solution.
In the first stage of the process, we extract the daily "records" into 4 stage files (raw file source). Then we process these 4 files to populate our Fact and Dimension tables.

We are having a problem with the population of a couple of our dimension tables, e.g. the Comment dimension table, DimComment. Our aim is to only add records to the table that do not already exist.

So on the "left side" of the dataflow:

1) read the contents of the stage file - just get the commenttext column
2) use a derived column component to add three columns (updatedby, lastupdated, ETLLoadID) and to Trim(commenttext)
3) use a sort component to sort the output - sort on commenttext and remove duplicates

on the "right side" of the dataflow:
1) read the contents of the DimComment table
select ltrim(rtrim(CommentText)) as CommentText, CommentKey from DimComment order by ltrim(rtrim(CommentText)) - remembering to set both the IsSorted and SortOrder properties for the component.
Then we use a Merge Join component to merge the two dataflows. Within the Merge Join, we use a left outer join so that we get all of the commenttext records from the daily stage file - each will have a CommentKey from the DimComment table if there is a match (matching on comment text). Then we use a Conditional Split component to remove records from the data flow where the CommentKey is not null - i.e. we only want records that don't already exist in the comment table. Finally, we update the DimComment table.

Problem: the database table is not being updated correctly - it has duplicates. The problem appears to be with the Merge Join component.

From the existing records in the DimComment table, we get 1,943,309 records. From the daily stage file, we get 2,578 records, which after the sort (and duplicate removal) is reduced to 776 records. After the Merge Join, this is reduced to 771 records - only the first 5 records are matched - but it should be reduced by more...
If I amend the reading of the DimComment table to have the following clause:

where (commenttext like '%automation comm%')

then it returns 8,347 rows - and 20 rows are matched - correct.

If I amend it to:

where (commenttext like '%automation com%')

then it returns 603,286 rows - and 358 rows are matched - correct.

If I amend it to:

where (commenttext like '%automation%')

then it returns 899,462 rows - and 0 rows are matched - incorrect.

If I amend it to:

where (commenttext like 'a%' or commenttext like 'b%' or commenttext like 'c%' or commenttext like 'd%' or commenttext like 'e%')

then it returns 899,462 rows - and 29 rows are matched - incorrect - it did most of them, but not all ???
In theory, if I run the process twice, I should get NO updates the second time through - this is NOT the case!!

So it would appear that in some cases the Merge Join component is doing its merge join before it has got all of its records from the DimComment dimension table - matching appears to work on different sets of records, as long as there are not too many records to process.

Question 1: is it possible to correct this?
Question 2: is there a limitation on the number of records the Merge Join component can process? If so, what is it? We have two other SSIS packages doing similar processing to this (but not as many records - YET). (The row structure on the dimension tables is short - only 3-4 columns per row.)
Question 3: might it be better to process the daily comments by, after doing the sort, doing a lookup on the DimComment table and then inserting into the DimComment table if there is no match found? (A set-based sketch of that approach follows below.)

FYI, the above details came from running the app on my PC against a local database (SQL 2005); the database is a copy of production (a couple of weeks old).
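Regarding Question 3: if the sorted daily comments can be landed in a staging table first, the whole "insert only what doesn't exist" step collapses into one set-based statement. A minimal sketch, assuming a staging table named StageComment (the name is an assumption) holding the trimmed commenttext plus the derived columns:

INSERT INTO DimComment (CommentText, UpdatedBy, LastUpdated, ETLLoadID)
SELECT DISTINCT s.CommentText, s.UpdatedBy, s.LastUpdated, s.ETLLoadID
FROM StageComment AS s
WHERE NOT EXISTS (SELECT 1
                  FROM DimComment AS d
                  WHERE LTRIM(RTRIM(d.CommentText)) = s.CommentText);

This sidesteps the Merge Join sort/buffering question entirely and is idempotent - running it twice inserts nothing the second time.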
I'm experiencing performance problems with the Merge Join task. Every time I build a nice package using this task, I end up deleting it and using a SQL statement in the OLE DB source to accomplish the join, since it takes forever to run and crushes my computer in the process. It makes me feel I'm not using the abilities SSIS has to offer compared to DTS. Of course, for several thousand records it works fine, but in a production environment with hundreds of thousands of rows, it seems to be futile.
I'm having trouble with a multi-table query that uses more than one JOIN.
For each order, I need to return the following: CarsID, CarModelName, MakeID, OrderDate, ProductName, Total Ordered, and the Car Category.
The carid (primary key) and carmodelname belong to the Cars table. The makeid and orderdate belong to the OrderDetails table. The productname and carcategory belong to the Product table.
The number of rows returned should be the same as the number of rows in OrderDetails.
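A sketch of the join shape that satisfies this (the key columns linking OrderDetails to Cars and Product are assumptions, and TotalOrdered stands in for however "Total Ordered" is stored):

SELECT c.CarsID, c.CarModelName, od.MakeID, od.OrderDate,
       p.ProductName, od.TotalOrdered, p.CarCategory
FROM OrderDetails AS od
INNER JOIN Cars AS c ON od.CarsID = c.CarsID
INNER JOIN Product AS p ON od.ProductID = p.ProductID;

Driving the query FROM OrderDetails with INNER JOINs to the two lookup tables is what keeps the row count equal to the number of rows in OrderDetails - each OrderDetails row matches exactly one car and one product.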
A piece of software I wrote started timing out on a query that left outer joins a table to a view. Both the table and the view have approximately the same number of rows (about 170,000).

The table has 2 very similar columns: one is a varchar(1) and the other is a varchar(100). Neither is included in any index, and beyond the size difference the columns have the same properties. One of the employees here uses the varchar(1) column (called miscsearch) to tag large sets of rows to perform some action on. In this case, he had set the miscsearch value of 9,000 rows to "g". The query then should join the table and view for all rows where miscsearch is set to 'g' in the table. This query takes at least 20 minutes to run (I stopped it at that point).

If I remove the "where" clause and join all rows in the two tables, the query completes in about 20 seconds. If I set the varchar(100) column (called descrip) to "g" for the same rows set via miscsearch, the query completes in about 20 seconds.
If I force the join type to a hash join, the query completes using miscsearch in about 30 seconds.
So, this works:
SELECT di.File_No, prevPlacements, balance,'NOT PLACED' as status FROM Info di LEFT OUTER HASH JOIN View_PP pp ON di.ram_file_no = pp.file_no WHERE miscsearch = 'g' ORDER BY balance DESC
and this works:
SELECT di.File_No, prevPlacements, balance,'NOT PLACED' as status FROM Info di LEFT OUTER JOIN View_PP pp ON di.ram_file_no = pp.file_no WHERE descrip = 'g' ORDER BY balance DESC
But this doesn't:
SELECT di.File_No, prevPlacements, balance,'NOT PLACED' as status FROM Info di LEFT OUTER JOIN View_PP pp ON di.ram_file_no = pp.file_no WHERE miscsearch = 'g' ORDER BY balance DESC
What should I be looking for here to understand why this is happening?
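Since neither column is in any index, one concrete thing to try (a sketch, not a guaranteed fix): index miscsearch and refresh the statistics, so the optimizer can both locate the 9,000 qualifying rows and estimate their count correctly:

CREATE INDEX IX_Info_miscsearch ON Info (miscsearch);
UPDATE STATISTICS Info WITH FULLSCAN;

If a bad cardinality estimate was pushing the optimizer into a nested-loops plan, it should then pick the hash join on its own - consistent with the fact that forcing HASH already fixes the runtime.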
I am making an ASP.NET web application that involves 2 SQL Servers (A & B). I created a view in SQL Server A pointing to a table in SQL Server B. I found that my application runs REALLY slowly when accessing such a view, so I try to avoid using them. But in the case of joining 2 tables from 2 different SQL Servers, I have no choice. Can anyone help me with this? Thanks!
Hello folks, first of all I really don't know what you gurus call this way of writing joins:

SELECT A.FIELD, B.FIELD
FROM TABLE_A A, TABLE_B B
WHERE A.ID_FIELD = B.ID_FIELD

I find this way very useful and readable. It also works for left and right joins (using *= or =* instead of =). A friend of mine found that an inner join written this way (using =) is much slower in Access than the classic INNER JOIN ... ON syntax. My question is: was MS SQL Server designed for the short way, or is it just a workaround someone found? Is there a performance degradation when writing joins this way? TIA, tK
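For reference, the ANSI-92 equivalents of the styles above - worth preferring, since the old-style outer-join operators *= and =* are deprecated and rejected by later SQL Server versions, while the comma-style inner join is merely harder to read:

-- inner join, ANSI style
SELECT A.FIELD, B.FIELD
FROM TABLE_A A
INNER JOIN TABLE_B B ON A.ID_FIELD = B.ID_FIELD

-- left outer join, ANSI style (replaces WHERE A.ID_FIELD *= B.ID_FIELD)
SELECT A.FIELD, B.FIELD
FROM TABLE_A A
LEFT OUTER JOIN TABLE_B B ON A.ID_FIELD = B.ID_FIELD

On SQL Server both inner-join spellings optimize identically; the ON syntax is a readability and safety win, not a performance one.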
1. Right now in my queries I am using lots of LEFT JOINs and INNER JOINs, and it was suggested that I look at 'IN'. But I faced some performance issues with IN previously and stopped using it. Now I have new doubts about which query will give me better performance:

a query using LEFT JOIN, or a query using IN/NOT IN?
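For what it's worth, the third common shape for "rows with no match" is NOT EXISTS, which, unlike NOT IN, is not derailed by NULLs in the subquery column. A sketch with illustrative names (t1, t2, id are assumptions):

SELECT t1.id
FROM t1
WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.id = t1.id);

The optimizer usually compiles LEFT JOIN ... WHERE ... IS NULL, NOT EXISTS, and NOT IN (over non-nullable columns) to similar anti-join plans, so NULL handling is normally a better reason to choose between them than raw speed.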
2. This question is about CONVERT...
I have a stored proc which is used for updating a table. Multiple columns [of the same table] and their corresponding values are sent to the proc [only a subset of the columns might be sent for update each time, and the set of columns to update is not fixed for each run of the SP].

I have to construct an UPDATE string out of this using string concatenation, to finally be able to run the update statement with "sys.sp_executesql".

This results in me having to use CONVERT() lots of times, and one of the columns on which I am doing a CONVERT is of type XML.

So the question is as follows:
a. Is it preferable to construct a single UPDATE statement string and execute it using "sys.sp_executesql"?
b. Or is it preferable to issue multiple UPDATE statements, i.e. one update statement for each column [depending on whether that column has to be updated for that run or not]?

i.e. the question essentially is:

Does a single update query constructed using lots of CONVERTs [basically on INT and XML types] give better performance than using multiple UPDATE statements on the table, or is it the other way round?
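One option that sidesteps most of the CONVERTs entirely: build only the SET list dynamically, but pass the values as typed parameters to sys.sp_executesql instead of concatenating converted strings. A hedged sketch with illustrative names (MyTable, Col1, XmlCol, Id are assumptions):

DECLARE @sql nvarchar(max);
SET @sql = N'UPDATE MyTable SET Col1 = @p1, XmlCol = @p2 WHERE Id = @id;';
EXEC sys.sp_executesql @sql,
     N'@p1 int, @p2 xml, @id int',
     @p1 = 42, @p2 = N'<root/>', @id = 7;

On the actual question: a single UPDATE touches (reads, writes, and logs) each row once, whereas one UPDATE per column does that once per column, so the single statement usually wins even with the extra CONVERTs.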
I have 2 tables; I will add sample data to them to help me explain...

Table1 (Fields: A, B)
=====
1, One
2, Two
3, Three

Table2 (Fields: A, B)
=====
2, deux
9, neuf

I want to create a query that will only return data where the key (field A) is in both tables, and nothing otherwise. How can I do this? I am thinking about using a 'JOIN' but not sure how to implement it... i.e. 2 would return data, but 9 would not. Any help would be appreciated.
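An inner join does exactly this - it returns only the rows whose key appears in both tables:

SELECT t1.A, t1.B AS B1, t2.B AS B2
FROM Table1 AS t1
INNER JOIN Table2 AS t2 ON t1.A = t2.A;

With the sample data above, this returns the single row (2, Two, deux).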
Is there any specific place where I can find out which SQL query is more efficient?
Is an INNER JOIN faster, or is SELECT ... WHERE ID IN (SELECT ...) faster?
I have two tables:

1. FLEET (the number of rows is not very large)
Attributes: Company_Id (PK), Fleet_Id (PK), Fleet_Name, Fleet_Description

2. USER_PRIVILEGE (the number of rows can reach up to 3 times the number of rows in the fleet table)
Attributes: Company_Id (PK), Fleet_Id (PK), User_Id (PK), Privilege_Id (PK), Comment, Category

I want to select Fleet_Id and Fleet_Name from the fleet table where the current user has privilege_id = 1.

I have two possible select statements:

1. Option 1

SELECT Fleet_Name, Fleet_Id
FROM FLEET
WHERE (Company_Id = 2)
  AND (Fleet_Id IN (SELECT fleet_id FROM user_privilege
                    WHERE user_id = 11 AND company_id = 2 AND privilege_id = 1))
ORDER BY Fleet_Name

2. Option 2

SELECT F.Fleet_Name, F.Fleet_Id
FROM USER_PRIVILEGE U
INNER JOIN FLEET F ON U.Fleet_Id = F.Fleet_Id
WHERE (F.Company_Id = 2) AND (U.Privilege_Id = 1) AND (U.User_Id = 11)
ORDER BY F.Fleet_Name

Which one is actually faster? Can the SQL statement with the INNER JOIN (option 2) execute faster than the one with the nested select statement (option 1)?
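For comparison, a third equivalent shape is a correlated EXISTS, which asks exactly "does this fleet have such a privilege row" and cannot introduce duplicate fleets:

SELECT F.Fleet_Name, F.Fleet_Id
FROM FLEET F
WHERE F.Company_Id = 2
  AND EXISTS (SELECT 1 FROM USER_PRIVILEGE U
              WHERE U.Fleet_Id = F.Fleet_Id
                AND U.Company_Id = 2
                AND U.User_Id = 11
                AND U.Privilege_Id = 1)
ORDER BY F.Fleet_Name;

In practice the optimizer frequently compiles IN, EXISTS, and the join form to the same plan for a query like this, so the honest answer is to compare the actual plans on real data rather than assume one syntax wins.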
Hi: I have the following query; can somebody help me?

SELECT s.Id, s.Name
FROM Switch s
INNER JOIN SwitchTelephoneRange r ON s.Id = r.Id
WHERE '1526858' BETWEEN FromTelephone AND ToTelephone

where '1526858' is a phone number. My problem is, I want to run the above query for each record in:

select Telephone from PhoneDirectory

So each telephone number in the second query would take the place of the '1526858' in the first query. How can I do so? Do I need a loop? A cursor? Can you help please? Thanks
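No loop or cursor is needed; the directory can be joined straight onto the range predicate:

SELECT pd.Telephone, s.Id, s.Name
FROM PhoneDirectory AS pd
INNER JOIN SwitchTelephoneRange AS r
    ON pd.Telephone BETWEEN r.FromTelephone AND r.ToTelephone
INNER JOIN Switch AS s ON s.Id = r.Id;

This returns one row per telephone number per matching range - the same set a cursor over PhoneDirectory would have produced, in a single set-based statement.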
Hi, I want to know the query execution time as output. I want the execution time only; this is for tuning purposes. The time displayed in the status bar is not helpful for me. Thanks.
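SET STATISTICS TIME reports the parse/compile and execution times for each statement in the Messages output; alternatively, the elapsed time can be captured as a result:

SET STATISTICS TIME ON;
-- run the query here
SET STATISTICS TIME OFF;

-- or measure it explicitly:
DECLARE @t0 datetime;
SET @t0 = GETDATE();
-- run the query here
SELECT DATEDIFF(ms, @t0, GETDATE()) AS elapsed_ms;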
Hi there, I'm having big performance problems with a SQL query. What I have done is change the physical layout (rearranged the columns) of one of the tables in the database. I used bcp to get the data out and then back in. There are about a quarter million rows in this table. I have created the same indexes, but now the same query takes a long time to execute. I have noticed that the showplan is somehow different than it used to be. This query uses the table I have changed and another one that I haven't touched. I have updated the stats, to no avail. Here are the showplans.

This one is slow:

STEP 1
The type of query is INSERT
The update mode is direct
Worktable created for ORDER BY
FROM TABLE SW_PERSON
Nested iteration
Index : swiPERSON10
FROM TABLE SW_CASE
Nested iteration
Table Scan
TO TABLE Worktable 1
STEP 2
The type of query is SELECT
This step involves sorting
FROM TABLE Worktable 1
Using GETSORTED
Table Scan

This one used to be fast:

STEP 1
The type of query is INSERT
The update mode is direct
Worktable created for ORDER BY
FROM TABLE SW_CASE
Nested iteration
Table Scan
FROM TABLE SW_PERSON
Nested iteration
Index : PK_SW_PERSON_1__27
TO TABLE Worktable 1
STEP 2
The type of query is SELECT
This step involves sorting
FROM TABLE Worktable 1
Using GETSORTED
Table Scan

I think the problem is that the slow one doesn't use the PK index, which is the one that links both tables. My question is how to force the query to use this index. PS: One thing I haven't done is recreate the indexes on the other table, but I don't think that would have made a difference. Thanks.
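An index hint will force the choice. A sketch in the SQL Server table-hint form (the index name is taken from the fast plan above; the select list and join column are placeholders, since the original query isn't shown):

SELECT *
FROM SW_PERSON p (INDEX (PK_SW_PERSON_1__27))
INNER JOIN SW_CASE c ON c.person_id = p.person_id  -- person_id is a placeholder
ORDER BY p.person_id

That said, hints are a last resort: since only the physical layout changed, dropping and recreating the indexes on the rebuilt table (which refreshes their structure as well as their statistics) often brings the original plan back without any hint.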
CREATE FUNCTION dbo.fnProductsRetrieveBySupplierID ( @SupplierID int ) RETURNS TABLE AS RETURN ( SELECT * FROM Products WHERE SupplierID = @SupplierID )
CREATE FUNCTION dbo.fnSuppliersRetrieveBySupplierID ( @SupplierID int ) RETURNS TABLE AS RETURN ( SELECT * FROM Suppliers WHERE SupplierID = @SupplierID )
I have been testing the performance of the following SQL statements:
Code:
1. SELECT * FROM Products INNER JOIN Suppliers ON (Products.SupplierID = Suppliers.SupplierID) WHERE Products.SupplierID = 3
2. SELECT * FROM dbo.fnProductsRetrieveBySupplierID (3), dbo.fnSuppliersRetrieveBySupplierID (3)
I have built a loop to execute each statement multiple times and then compare the execution times. Although both queries produce the same result, the 2nd one (which uses the functions) is about twice as slow. Does anyone know why?
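A hedged guess at the cause: query 2 is a cross join of two already-filtered derived tables with no join predicate between them, so after the inline TVFs are expanded the optimizer works with a different (and worse) shape than query 1. If the goal is encapsulation, putting the join inside a single inline TVF keeps one optimizable statement (the function name is illustrative, and the column list assumes the Northwind schema):

CREATE FUNCTION dbo.fnProductsWithSupplier (@SupplierID int)
RETURNS TABLE
AS RETURN
(
    SELECT P.ProductID, P.ProductName, S.SupplierID, S.CompanyName
    FROM Products AS P
    INNER JOIN Suppliers AS S ON P.SupplierID = S.SupplierID
    WHERE P.SupplierID = @SupplierID
);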
I am joining three tables, each of which has about 1.5 million rows, selecting data from these three tables and inserting it into a fourth table. To avoid transaction log issues I am running the query in batches of 50,000 rows; it is taking about 5 hours to insert all 1.5 million rows.

All the columns in the where clause have proper indexes. I ran showplan for the query and it is using the indexes properly and not doing any table scans. I also updated the statistics for all the indexes.

The query looks something like this:

insert into d (col1, col2, col3, ...)
select a.col1, b.col2, c.col3, ...
from a, b, c
where a.id = b.id
  and a.id = c.id
  and a.id between @minid and @currid

@minid starts at 1 and @currid at 50,000. I am running this in a loop; in the next iteration @minid becomes 50,001 and @currid 100,000, and so on.
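For reference, a sketch of that batching loop in one piece (variable names from the post; @maxid and the explicit join form are assumptions):

DECLARE @minid int, @currid int, @maxid int;
SET @minid = 1;
SET @currid = 50000;
SELECT @maxid = MAX(id) FROM a;

WHILE @minid <= @maxid
BEGIN
    INSERT INTO d (col1, col2, col3)
    SELECT a.col1, b.col2, c.col3
    FROM a
    INNER JOIN b ON a.id = b.id
    INNER JOIN c ON a.id = c.id
    WHERE a.id BETWEEN @minid AND @currid;

    SET @minid = @currid + 1;
    SET @currid = @currid + 50000;
END

If the 5 hours is dominated by the join rather than the log, it is worth confirming that the clustered index on table a leads on id, so each batch is one range seek instead of a fresh scan per iteration.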
I have two tables:

Employee
  EmployeeCode int Primary Key

Employee_Stock
  EmployeeCode int
  StockCode varchar(10)
  Primary key on (EmployeeCode, StockCode)

There is no foreign key relationship between these 2 tables. Now my question is, which gives more performance, and why?

1. Select * from Employee INNER JOIN Employee_Stock on Employee.EmployeeCode = Employee_Stock.EmployeeCode

2. Create a foreign key between Employee and Employee_Stock on EmployeeCode, and run the same query.

Actually we forgot to put the foreign key relationship between these 2 tables and we have a lot of queries joining them. Now if we add the foreign key, is it going to improve the performance or not?
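For the optimizer to make use of the constraint (for example, to skip the join entirely when no Employee columns are referenced), it has to be trusted, which means validating the existing rows when it is added. A sketch:

ALTER TABLE Employee_Stock WITH CHECK
ADD CONSTRAINT FK_EmployeeStock_Employee
    FOREIGN KEY (EmployeeCode) REFERENCES Employee (EmployeeCode);

WITH CHECK (the default for ADD CONSTRAINT) validates existing data and marks the constraint trusted; adding it WITH NOCHECK is faster but leaves the constraint untrusted, and the optimizer will ignore it. For a plain inner join that reads columns from both tables, expect little or no speedup from the FK itself - join performance comes from the indexes, not the constraint.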
I wrote a query that uses a cursor. When I run the query on the dev box it takes 3 minutes. I moved the query to the EPM database box and it takes forever to run. Usually EPM database query performance is much better. How do I start debugging the poor performance?

How can I check if the query is creating any table locks?

Purpose of the query: I get all the companies (20,000) and loop through each company in the cursor and do calculations.
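On the lock question: on SQL Server 2005 and later, the lock manager is exposed through a DMV, and a minimal check for object-level locks held by the session running the query (replace the spid) looks like the sketch below. On SQL Server 2000, sp_lock gives the same picture.

SELECT resource_type, request_mode, request_status,
       OBJECT_NAME(resource_associated_entity_id) AS locked_object
FROM sys.dm_tran_locks
WHERE request_session_id = 53        -- spid of the session running the query
  AND resource_type = 'OBJECT';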
I have a query like the one below and it takes a couple of seconds to run:
select a.Registration_Key, ag.Agreement_Type_Name, p.ServiceProvider
from dbo.Assessment a
INNER JOIN (
    select distinct Registration_Key, p.ServiceProvider, max(CSDS_Object_Key) as [Sequence]
    from dbo.Assessment a
    INNER JOIN dbo.CD_Provider_Xref p ON a.Provider_CD = p.Provider_CD
    where Creation_DT >= '07/01/2007' and Reason_CD = 1
    group by Registration_Key, p.ServiceProvider
) as s1 ON a.CSDS_Object_Key = s1.Sequence
INNER JOIN dbo.CD_Provider_XREF p ON a.Provider_CD = p.Provider_CD
INNER JOIN dbo.CD_Agreement_Type ag ON ag.Agreement_Type_CD = a.Agreement_Type_CD
LEFT OUTER JOIN (
    select distinct Registration_Key, p.ServiceProvider, 1 as served
    from dbo.Encounters e
    INNER JOIN dbo.CD_Provider_Xref p ON e.Provider_CD = p.Provider_CD
    where Encounter_Begin_DT between '08/01/2007' and '08/31/2007'
      and Procedure_CD is not null
      and Encounter_Units > 0
) as s2 ON a.Registration_Key = s2.Registration_Key
       and p.ServiceProvider = s2.ServiceProvider
group by a.Registration_Key, ag.Agreement_Type_Name, p.ServiceProvider
However, if I add the served field (stamped with 1) it takes forever to run. All of the join columns have indexes, clustered and non-clustered, and I don't see any index fragmentation:
select a.Registration_Key, ag.Agreement_Type_Name, p.ServiceProvider, served
from dbo.Assessment a
INNER JOIN (
    select distinct Registration_Key, p.ServiceProvider, max(CSDS_Object_Key) as [Sequence]
    from dbo.Assessment a
    INNER JOIN dbo.CD_Provider_Xref p ON a.Provider_CD = p.Provider_CD
    where Creation_DT >= '07/01/2007' and Reason_CD = 1
    group by Registration_Key, p.ServiceProvider
) as s1 ON a.CSDS_Object_Key = s1.Sequence
INNER JOIN dbo.CD_Provider_XREF p ON a.Provider_CD = p.Provider_CD
INNER JOIN dbo.CD_Agreement_Type ag ON ag.Agreement_Type_CD = a.Agreement_Type_CD
LEFT OUTER JOIN (
    select distinct Registration_Key, p.ServiceProvider, 1 as served
    from dbo.Encounters e
    INNER JOIN dbo.CD_Provider_Xref p ON e.Provider_CD = p.Provider_CD
    where Encounter_Begin_DT between '08/01/2007' and '08/31/2007'
      and Procedure_CD is not null
      and Encounter_Units > 0
) as s2 ON a.Registration_Key = s2.Registration_Key
       and p.ServiceProvider = s2.ServiceProvider
group by a.Registration_Key, ag.Agreement_Type_Name, p.ServiceProvider, served
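One thing that sometimes helps when pulling a column out of an outer-joined derived table degrades the plan this badly (a sketch, reusing the post's own subquery): materialize s2 once into an indexed temp table and join to that, so the optimizer gets a small input with known statistics instead of re-deriving it while also grouping by served:

select distinct Registration_Key, p.ServiceProvider, 1 as served
into #served
from dbo.Encounters e
INNER JOIN dbo.CD_Provider_Xref p ON e.Provider_CD = p.Provider_CD
where Encounter_Begin_DT between '08/01/2007' and '08/31/2007'
  and Procedure_CD is not null
  and Encounter_Units > 0;

create index IX_served on #served (Registration_Key, ServiceProvider);

-- then in the main query:
--   LEFT OUTER JOIN #served as s2
--     ON a.Registration_Key = s2.Registration_Key
--    and p.ServiceProvider = s2.ServiceProvider
-- and select isnull(s2.served, 0) as served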
Hello SQL gurus. In the query below, I am using 2 TOP clauses to return the desired row. I am wondering if someone can shed some light on how to avoid using 2 TOP clauses and combine this into just one select query?
select TOP 1 * from (select TOP 2 Num from A order by Num) X order by Num desc
I would truly appreciate your help, as this performance issue has been bugging me for quite some time...
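The double TOP returns the 2nd-smallest Num; on SQL Server 2005 or later the same row comes from a single ranking query:

SELECT Num
FROM (SELECT Num, ROW_NUMBER() OVER (ORDER BY Num) AS rn FROM A) AS x
WHERE rn = 2;

The nesting is still there syntactically, but there is only one ordering of A instead of a TOP feeding a second re-sort.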
I am usually all over answering these kinds of questions, but while I continue to work on this issue, maybe someone here can lend me a hand. A vendor application we run stores metadata about backup blobs stored on a NAS device. The app basically backs up selected folders on 1,400 remote computers in the back offices of our stores and stores the data on a NAS, while maintaining metadata about the BLOBs in SQL Server so that it can push recovery of the data back to the original store it came from. The database is roughly 80 GB in size, has a single filegroup, and is on its own dedicated LUN. It uses TempDB heavily, and this is not something that I can change, but TempDB is on a different disk array.

Today I spent hours on a conference call with them looking at a specific stored procedure that is used to clean up records in the database after a BLOB file is deleted. A single BLOB file can have millions of related records in the database. There is a LEFT JOIN in the code against a table with 150 million+ rows of data in it. The table is fairly small, only 5 GB of data, but the LEFT JOIN spools 2.4 GB of data to a Hash Match. It seems to me like the left join can't be removed, but I don't fully understand how all of this works, because I didn't write the application. It is an INDEX SCAN, and I can't seem to eliminate it. Is there anything I can do to help this thing out?
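Without the actual plan this is only a guess, but a hash match fed 2.4 GB by an index scan over 150 million rows often means no index covers the join key plus the columns the query reads. If schema changes are allowed (and this is SQL Server 2005 or later), a covering index is the usual lever; every name below is a placeholder, not the vendor's schema:

-- cover the join key and the referenced columns so the scan
-- shrinks to a narrow index, and can become a seek if the
-- outer side of the LEFT JOIN is filtered on the key
CREATE INDEX IX_BlobDetail_BlobId
    ON dbo.BlobDetail (BlobId)
    INCLUDE (ColA, ColB);

If the vendor forbids schema changes, batching the cleanup by key range at least keeps each hash build small enough to avoid spilling to TempDB.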