What Are Some Good Books On SQL Query Optimization And Other Advanced SQL Topics?
Aug 6, 2007
I graduated from college about three years ago and have been working as a programmer using MS SQL 2000. Even though I've learned a ton in these past three years, I know there is much more I can learn.
Because my company uses Microsoft SQL Server 2000 I'd like to focus on that, but anything ANSI-SQL is perfectly fine.
So... what are some good books on SQL query optimization and other advanced SQL topics?
Sorry if this has been posted before, but I searched the archives and didn't find anything!
I have just been handed a SQL 6.5 SP3 / NT 4.0 SP3 server that has several SQL errors in the event log. I need to find a good book to read/scan, so I can get up to speed on SQL.
I am an Oracle DBA with over 7 years of experience. I am new to SQL Server 2000 and will be given responsibility for SQL Server 2000 production databases in a few weeks. I already have the SQL Server 2000 DBA Survival Guide.
I would like to know if there are any good books out there for
Any recommendations for good advanced t-sql books/articles? I find myself involved with writing increasingly more complex queries and after spending a few hours on some, and then searching on this site for potential answers/help, I am wondering if there might be some good books on creating more advanced/complex t-sql for real world scenarios.
How can I optimize the following stored procedure running on MS SQL Server 2000 SP4?
CREATE PROCEDURE proc1
  @Franchise ObjectId,
  @dtmStart DATETIME,
  @dtmEnd DATETIME
AS
BEGIN
  SELECT p.Product, c.Currency, c.Minor, a.ACDef, e.Event, t.Dec,
         count(1) "Count", sum(Amount) "Total"
  FROM tb_Event t
  JOIN tb_Prod p ON ( t.ProdId = p.ProdId )
  JOIN tb_ACDef a ON ( t.ACDefId = a.ACDefId )
  JOIN tb_Curr c ON ( t.CurrId = c.CurrId )
  JOIN tb_Event e ON ( t.EventId = e.EventId )
  JOIN tb_Setl s ON ( s.BUId = t.BUId and s.SetlD = t.SetlD )
  WHERE Fran = @Franchise
    AND t.CDate >= @dtmStart
    AND t.CDate <= @dtmEnd
    AND s.Status = 1
  GROUP BY p.Product, c.Currency, c.Minor, a.ACDef, e.Event, t.Dec
END
Hi, can anyone help me optimize the SELECT statements in the 3rd step? I am writing a monthly report: for each employee (500 employees), one row displays his attendance totals for all days in a month. The problem is that the 3rd step actually consists of 31 SELECT statements, each assigned to one of 31 variables. After I assign these variables, I insert them into a table (4th step) and display it. The troublesome part is the 3rd step: as there are 500 employees, the variables are assigned 500 x 31 times and then inserted into the table. This takes more than 4 minutes, which I know should not be necessary :). Can anyone help me optimize the SELECT statements in the 3rd step, or suggest a better approach?
DECLARE @EmpID, @DateFrom, @Total1 ....   // Declaring different variables
SELECT @DateFrom =   // Set to start of any month e.g. 2007-06-01 ...... 1st
Loop (condition -- Get all employees, working fine)
BEGIN
  SELECT @EmpID =   // Get EmployeeID ...... 2nd
  SELECT @Total1 = SUM (Abences)   ...... 3rd
  FROM Attendance
  WHERE employee_id_fk = @EmpID (from 2nd step)
    AND Date_Absent = DATEADD ("day", 0, Convert (varchar, @DateFrom)) (from 1st step)
  SELECT @Total2 ........................... same as above
  SELECT @Total3 ........................... same as above
  INSERT IN @TABLE (@EmpID, @Total1, ...... @Total31)   ...... 4th
  Iterate (condition) to next employee   ...... 5th
END
It's only the loop that consumes the 4 minutes. If I can somehow optimize this part, I will be most satisfied. Thanks to anyone who can help.
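One direction that might help here is to replace the per-employee loop and the 31 per-day SELECTs with a single grouped query that builds one conditional SUM per day. The sketch below is only an illustration: it uses Python's built-in sqlite3 as a runnable stand-in for SQL Server, the table layout is guessed from the post, and SQLite's date(:start, '+N day') plays the role of T-SQL's DATEADD(day, N, @DateFrom). It also assumes Date_Absent holds plain dates.

```python
import sqlite3

def monthly_absence_totals(conn, month_start, days):
    """Return one row per employee: (employee id, Total1, ..., Total<days>)."""
    # One SUM(CASE ...) column per day replaces one SELECT per day per employee.
    day_columns = ", ".join(
        f"SUM(CASE WHEN Date_Absent = date(:start, '+{d} day') "
        f"THEN Abences ELSE 0 END) AS Total{d + 1}"
        for d in range(days)
    )
    sql = f"""
        SELECT employee_id_fk, {day_columns}
        FROM Attendance
        WHERE Date_Absent >= :start
          AND Date_Absent < date(:start, '+{days} day')
        GROUP BY employee_id_fk
        ORDER BY employee_id_fk
    """
    return conn.execute(sql, {"start": month_start}).fetchall()

# Hypothetical sample data; column names follow the post, types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Attendance (employee_id_fk INTEGER, Date_Absent TEXT, Abences INTEGER)"
)
conn.executemany(
    "INSERT INTO Attendance VALUES (?, ?, ?)",
    [
        (1, "2007-06-01", 1),
        (1, "2007-06-03", 2),
        (2, "2007-06-02", 1),
        (2, "2007-07-01", 5),  # outside the reporting window, must be ignored
    ],
)

# Three days instead of 31 keeps the demo short; pass days=31 for a full month.
rows = monthly_absence_totals(conn, "2007-06-01", days=3)
for row in rows:
    print(row)
```

Two differences from the loop are worth keeping in mind: days without rows come back as 0 rather than NULL, and employees with no attendance rows in the month do not appear at all unless the query is driven from the employee table with a LEFT JOIN.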
Trying to optimize a query, and having problems interpreting the data. We have a query that queries 5 tables with 4 INNER JOINS. When I use INNER HASH JOIN, this is the result:
(Using SQL Programmer)
SQL Server Execution Times: CPU time = 40 ms, elapsed time = 80 ms.
Now, when timing the code execution on my ASP page, it's "faster" not using the HASH. Using HASH, there are a few Hash Match/Inner Joins reported in the Execution Plan. Not using HASH, there are Bookmark Lookups/Nested Loops.
My question is which is better to "see": Bookmark Lookups/Nested Loops or Hash Match/Inner Joins for the CPU/Server?
Is there any way to rewrite this query in an optimized way?
SELECT dbo.Table1.EmpId E
from dbo.Table1
where EmpId in (
  SELECT dbo.Table1.EmpId
  FROM (SELECT DISTINCT PersonID, MAX(dtmStatusDate) AS dtmStatusDate FROM dbo.Table1 GROUP BY PersonID) derived_table
  INNER JOIN dbo.Table1 ON derived_table.PersonID = dbo.Table1.PersonID AND derived_table.dtmStatusDate = dbo.Table1.dtmStatusDate)
How can I optimize the following query: (SELECT e.SID FROM Students s JOIN Table1 e ON e.SID = s.SID JOIN Table2 ed ON ed.Enrollment = e.Enrollment JOIN Table3 t ON t.TNum = e.TNum JOIN Table4 bt ON bt.TNum = t.TNum JOIN Table5 b ON b.Batch = bt.Batch JOIN IPlans i ON i.IPlan = ed.IPlan JOIN PGroups g ON g.PGroup = i.PGroup
WHERE t.TStatus = 'ACP' AND ed.EStatus = 'APR' AND e.SID = (select distinct SID from Table1 where Enrollment = @DpEnrollment)) AND (ed.EffectiveDate = (SELECT EffectiveDate FROM Table2 ed JOIN Table1 e ON e.enrollment = ed.enrollment WHERE IPlan = @DpIPlan AND TCoord = @DpTCoord AND AGCoord = @DpAGCoord AND DCoord = @DpDCoord ) AND DSeq = @DpDSeq) AND e.SID = (select distinct SID from Table1 where Enrollment = @DpEnrollment)) ) AND ed.TerminationDate = (SELECT TerminationDate FROM Table2 ed JOIN Table1 e ON e.enrollment = ed.enrollment WHERE IPlan = @DpIPlan AND TCoord = @DpTCoord AND AGCoord = @DpAGCoord AND DCoord = @DpDCoord ) AND DSeq = @DpDSeq) AND e.SID = (select distinct SID from Table1 where Enrollment = @DpEnrollment)) ) ))
DECLARE @PTEffDate_tmp AS SMALLDATETIME SELECT @PTEffDate_tmp = DateAdd(day, -1, PDate) FROM PDates pd WHERE iplan = @DIPlan and pd.TCoord = @DTCoord and DType = 'EF'
DECLARE @PTCoord_tmp as char(3) SELECT @PTCoord_tmp = tc.TCoord FROM PDates pd JOIN TCoords tc ON (pd.TCoord = tc.TCoord) WHERE pd.Iplan = @DIPlan and tc.TGroup = @TGroup_tmp and PDate = @PTEffDate_tmp and DateType = 'TR1'
DECLARE @EStatus_tmp as char(3) SELECT @EStatus_tmp = EDStatus From EDetails ed JOIN ENR e ON (ed.enr = e.enr) JOIN Trans t ON (e.transID = t.TransID) WHERE iplan = @DIPlan and ed.TCoord = @PTCoord_tmp and t.TransS= 'ACP' and DCoord = @DCoord and CEnr is null
How can I optimize my query? Since my DB has more than 1 million rows, it takes a while to do all those joins.
select * FROM EEMaster eem JOIN NHistory nh ON eem.SNumber = nh.SNumber OR eem.OldNumber = nh.SNumber OR eem.CID = (Replicate('0', 12 - len(nh.SNumber)) + nh.SNumber)
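A join condition built from several ORs often prevents the optimizer from seeking on any one index. A rewrite people commonly try is one join per condition, combined with UNION. The sketch below only checks that the two forms return the same pairs. It runs on Python's sqlite3 as a stand-in for SQL Server, the key columns EEMasterId and NHistoryId are hypothetical, and substr('000000000000' || x, -12) stands in for the Replicate padding (equivalent only while SNumber is at most 12 characters long).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EEMaster (EEMasterId INTEGER PRIMARY KEY, SNumber TEXT, OldNumber TEXT, CID TEXT);
    CREATE TABLE NHistory (NHistoryId INTEGER PRIMARY KEY, SNumber TEXT);
    INSERT INTO EEMaster VALUES (1, 'A1', 'B1', '0000000000C1'), (2, 'A1', 'A1', 'x');
    INSERT INTO NHistory VALUES (10, 'A1'), (11, 'B1'), (12, 'C1'), (13, 'ZZ');
""")

# SQLite stand-in for: Replicate('0', 12 - len(nh.SNumber)) + nh.SNumber
PADDED = "substr('000000000000' || nh.SNumber, -12)"
COLUMNS = "eem.EEMasterId, nh.NHistoryId"
TABLES = "EEMaster eem JOIN NHistory nh"

# Original shape: a single join with three OR-ed conditions.
or_join = f"""
    SELECT {COLUMNS} FROM {TABLES}
    ON eem.SNumber = nh.SNumber
    OR eem.OldNumber = nh.SNumber
    OR eem.CID = {PADDED}
"""

# Rewrite: one simple equality join per condition; UNION removes pairs
# that satisfy more than one condition, as the OR version returns them once.
union_join = f"""
    SELECT {COLUMNS} FROM {TABLES} ON eem.SNumber = nh.SNumber
    UNION
    SELECT {COLUMNS} FROM {TABLES} ON eem.OldNumber = nh.SNumber
    UNION
    SELECT {COLUMNS} FROM {TABLES} ON eem.CID = {PADDED}
"""

or_rows = sorted(conn.execute(or_join).fetchall())
union_rows = sorted(conn.execute(union_join).fetchall())
print(or_rows == union_rows, union_rows)
```

The UNION form is only equivalent when the selected columns identify each row pair (here the two keys); with select * on tables that contain exact duplicate rows it would collapse them. Whether it is faster on the real data has to be confirmed from the execution plan. The padded comparison in particular may still need a computed, indexed column to be seekable.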
I work on tables containing 10 million plus records. What are the general steps needed to ensure that my queries run faster? I know a few:
- The join fields should be indexed
- Selecting only needed fields
- Using CTE or derived tables as much as I can
- Using good table references, e.g. select a.x, b.y from TableA a inner join TableB b on a.id = b.id
I will be happy if somebody could share or add more to my list.
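For the first item on that list, comparing the query plan before and after creating the index shows whether the index is actually picked up. The sketch below does this with Python's sqlite3 and its EXPLAIN QUERY PLAN statement, purely as a stand-in. On SQL Server the equivalent check would be the estimated or actual execution plan. The index name ix_TableB_id and the sample rows are made up for the illustration.

```python
import sqlite3

def plan(conn, sql):
    """Return the human-readable EXPLAIN QUERY PLAN details as one string."""
    # The detail text is the last column of each plan row.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableB (id INTEGER, y TEXT)")
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?)", [(i, f"row {i}") for i in range(1000)]
)

# A lookup on the join key, which is what a nested-loop join does per outer row.
lookup = "SELECT y FROM TableB WHERE id = 42"

plan_before = plan(conn, lookup)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX ix_TableB_id ON TableB (id)")
plan_after = plan(conn, lookup)   # with the index: a search on ix_TableB_id

print("before:", plan_before)
print("after: ", plan_after)
```

The same before/after comparison is a reasonable habit for every item on the list: a change is only an optimization if the plan, and the measured time, actually improve.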
Dear all, the query below takes 7 minutes to execute, so I want to optimize it. Any suggestions, please?
SELECT DISTINCT
  VC.O_Id C_Id, VC.Name C_Name, VB.Org_Id B_Id, VB.code S_Code, VB.Name S_Name,
  mt12.COLUMN003 M_D_Code, mt12.COLUMN004 M_D_Name,
  CQ.COLUMN004 R_Code, CQ.COLUMN005 R_Date, CQ.COLUMN006 Ser, CQ.COLUMN008 R_Nature, CQ.COLUMN011 E_Date,
  mt26.COLUMN003 W_Code, mt26.COLUMN004 W_Name,
  mt17.COLUMN005 V_Code, mt17.COLUMN006 V_Name,
  mt19.column002 I_Code, mt19.column003 I_Name, mt19.COLUMN0001 R_I_No,
  mt92.COLUMN001 B_Id, mt92.COLUMN005 B_No,
  CASE mt92.COLUMN006 WHEN '0' THEN 'Ser' WHEN '1' THEN 'Un-Ser' WHEN '2' THEN 'Ret' WHEN '3' THEN 'Retd' WHEN '4' THEN 'Rep' WHEN '5' THEN 'Repd' WHEN '6' THEN 'Con' WHEN '7' THEN 'Cond' ELSE mt92.COLUMN006 END S_C_Type,
  mt20.COLUMN003 T_G_Code, mt20.COLUMN004 T_G_Name,
  V.U_Code, V.U_Name,
  mt19.column005 I_Quantity,
  mt20.COLUMN003 T_Code, mt20.COLUMN004 T_Name,
  mt59.COLUMN005 T_Price,
  VR.code C_L_Code, VR.Name C_L_Name
FROM tab90 CQ
INNER JOIN tab91 mt19 ON mt19.COLUMN002 = CQ.COLUMN001
LEFT JOIN tab92 mt92 ON mt92.COLUMN002 = CQ.COLUMN001
LEFT JOIN tab93 mt93 ON mt93.COLUMN004 = CQ.COLUMN001
INNER JOIN tab12 mt12 ON mt12.COLUMN001 = CQ.COLUMN003
LEFT JOIN tab26 mt26 ON mt26.COLUMN001 = CQ.COLUMN009
LEFT JOIN tab20 mt20 ON mt20.COLUMN001 = mt93.COLUMN005
LEFT JOIN tab59 mt59 ON mt59.COLUMN002 = mt20.COLUMN001
LEFT JOIN tab17 mt17 ON mt17.COLUMN001 = CQ.COLUMN010
INNER JOIN VM V ON V.UOM_ID = mt19.COLUMN004
INNER JOIN tab19 mt19 ON mt19.COLUMN001 = mt19.COLUMN003
INNER JOIN vOrg VR ON CQ.COLUMN007 = VR.Org_Id
INNER JOIN vOr VB ON CQ.COLUMN002 = VB.Org_Id
INNER JOIN vOr VC ON VB.Top_Parent = VC.Org_Id
WHERE CQ.COLUMN005 Between '02/01/2007' and '08/25/2008'
  And VC.O_Id in ('fb243e92-ee74-4278-a2fe-8395214ed54b')
LATEST column value changes for Row 1 since there is a repetition of value 124, meaning this row is no longer the latest.
NEW COLUMN value changes for ROW 2 since it is no longer new; we already have an occurrence of 124 in the first row.
I'm not sure if I can solve this query using any option other than a cursor. It would mean taking the first row, comparing it with all the other rows, and then moving on to the next.
Please suggest a better approach for doing this if there is one.
(SELECT add_house FROM hs_address WHERE add_id = do_address_registration_id) as add_house, (SELECT add_flat FROM hs_address WHERE add_id = do_address_registration_id) as add_flat,
..... FROM hs_donor WHERE do_id = 400
The fields add_flat and add_house belong to the same table. How can this query be optimized?
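Since both scalar subqueries look up the same hs_address row, one option is a single LEFT JOIN that fetches both columns at once. The sketch below compares the two shapes on made-up data, using Python's sqlite3 as a stand-in for the real database. Only the two address columns from the post are selected, and add_id is assumed to be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hs_address (add_id INTEGER PRIMARY KEY, add_house TEXT, add_flat TEXT);
    CREATE TABLE hs_donor (do_id INTEGER PRIMARY KEY, do_address_registration_id INTEGER);
    INSERT INTO hs_address VALUES (7, '12', '3B'), (8, '40', '1A');
    INSERT INTO hs_donor VALUES (400, 7), (401, 8), (402, NULL);
""")

# Shape from the post: one correlated subquery per address column.
subquery_sql = """
    SELECT (SELECT add_house FROM hs_address
            WHERE add_id = do_address_registration_id) AS add_house,
           (SELECT add_flat FROM hs_address
            WHERE add_id = do_address_registration_id) AS add_flat
    FROM hs_donor
    WHERE do_id = ?
"""

# Rewrite: join the address row once and read both columns from it.
# LEFT JOIN keeps donors without an address, like the subqueries do (NULLs).
join_sql = """
    SELECT a.add_house, a.add_flat
    FROM hs_donor d
    LEFT JOIN hs_address a ON a.add_id = d.do_address_registration_id
    WHERE d.do_id = ?
"""

with_subqueries = conn.execute(subquery_sql, (400,)).fetchall()
with_join = conn.execute(join_sql, (400,)).fetchall()
print(with_subqueries, with_join)
```

The join version also scales better as more address columns are added to the select list, because each extra column costs nothing beyond the single lookup.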
I am writing a query that displays the details of the employee who is handling the maximum number of projects. I am joining two tables. One is LUP_EmpProject, which contains employee id, project id and project date; it has a composite primary key of employee id, project id and project date. The other table is EmployeeDetails, which contains employee names and employee id.
I want to display the details of the employee who is handling the most projects. The code given below works fine, but the query takes a long time to execute. Does anybody know how to optimize the code so that I can get the result quickly?
Code Snippet
SELECT EmployeeDetails.FirstName+' '+EmployeeDetails.LastName AS EmpName, COUNT(LUP_EmpProject.Empid) AS Number_Of_Projects
FROM LUP_EmpProject
INNER JOIN EmployeeDetails ON LUP_EmpProject.Empid=EmployeeDetails.Empid
GROUP BY EmployeeDetails.FirstName+' '+EmployeeDetails.LastName, LUP_EmpProject.Empid
HAVING COUNT(LUP_EmpProject.Empid)>0
  AND COUNT(LUP_EmpProject.Empid)=(SELECT MAX(Number_Of_Projects) FROM (SELECT COUNT(LUP_EmpProject.Empid) Number_Of_Projects FROM LUP_EmpProject GROUP BY LUP_EmpProject.Empid) AS sub)
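One way to simplify this is to count projects per Empid once, on the narrow LUP_EmpProject table alone, and only then join to EmployeeDetails for the names. The sketch below expresses that with a common table expression and runs on Python's sqlite3 as a stand-in. The column names ProjectId and ProjectDate are guesses, and SQLite's || replaces the + string concatenation of T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EmployeeDetails (Empid INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE LUP_EmpProject (
        Empid INTEGER, ProjectId INTEGER, ProjectDate TEXT,
        PRIMARY KEY (Empid, ProjectId, ProjectDate)
    );
    INSERT INTO EmployeeDetails VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Dee');
    INSERT INTO LUP_EmpProject VALUES
        (1, 100, '2008-01-01'), (1, 101, '2008-02-01'),
        (2, 100, '2008-01-01'), (2, 102, '2008-03-01'),
        (3, 101, '2008-02-01');
""")

top_employees_sql = """
    WITH ProjectCounts AS (
        -- Aggregate on the key column only, before touching the name columns.
        SELECT Empid, COUNT(*) AS Number_Of_Projects
        FROM LUP_EmpProject
        GROUP BY Empid
    )
    SELECT e.FirstName || ' ' || e.LastName AS EmpName, pc.Number_Of_Projects
    FROM ProjectCounts pc
    JOIN EmployeeDetails e ON e.Empid = pc.Empid
    WHERE pc.Number_Of_Projects = (SELECT MAX(Number_Of_Projects) FROM ProjectCounts)
    ORDER BY EmpName
"""

top_employees = conn.execute(top_employees_sql).fetchall()
print(top_employees)
```

Ties are kept, as in the original query. On SQL Server, SELECT TOP 1 WITH TIES ... ORDER BY COUNT(*) DESC is another way to express the same thing. Either variant should be compared against the original in the execution plan, since the optimizer may already expand them to similar plans.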
max(f1.WeekValue)/case when max(f2.WeekValue) = 0 then NULL else max(f2.WeekValue) end,
@GroupOrder,@MetricOrder --from @temptable
from @FinalData f1 inner join @FinalData f2 on f1.weekdate = f2.weekdate
where (f1.Grouptitle = @GroupPFR and f1.MetricName = '$ Products')
and ( f2.Grouptitle = @GroupRevenue and f2.MetricName = 'Net Revenue')
group by f1.weekdate
There are many calculations like this in my procedure, and the whole procedure now takes about 3 minutes to run because of the GROUP BY. The execution plan shows that 60% of the query time is taken by the SORT operation. Can anyone suggest another way to do this?
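One option to consider is reading @FinalData only once and picking the two metrics with CASE expressions instead of joining the table variable to itself. The sketch below checks that both shapes give the same ratios. It uses Python's sqlite3 as a stand-in, a regular table named FinalData in place of the table variable, and made-up group names and values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FinalData (weekdate TEXT, Grouptitle TEXT, MetricName TEXT, WeekValue REAL)"
)
conn.executemany(
    "INSERT INTO FinalData VALUES (?, ?, ?, ?)",
    [
        ("2008-01-07", "PFR", "$ Products", 10.0),
        ("2008-01-07", "Revenue", "Net Revenue", 40.0),
        ("2008-01-14", "PFR", "$ Products", 5.0),
        ("2008-01-14", "Revenue", "Net Revenue", 0.0),  # zero divisor gives NULL
        ("2008-01-21", "PFR", "$ Products", 7.0),       # no revenue row this week
    ],
)
params = {"pfr": "PFR", "rev": "Revenue"}  # stand-ins for @GroupPFR / @GroupRevenue

# Shape from the post: self-join on weekdate, then GROUP BY.
self_join_sql = """
    SELECT f1.weekdate, MAX(f1.WeekValue) / NULLIF(MAX(f2.WeekValue), 0)
    FROM FinalData f1
    JOIN FinalData f2 ON f1.weekdate = f2.weekdate
    WHERE f1.Grouptitle = :pfr AND f1.MetricName = '$ Products'
      AND f2.Grouptitle = :rev AND f2.MetricName = 'Net Revenue'
    GROUP BY f1.weekdate
    ORDER BY f1.weekdate
"""

# Single pass: each CASE picks one metric; HAVING mimics the inner join by
# dropping weeks where either metric is missing.
single_pass_sql = """
    SELECT weekdate,
           MAX(CASE WHEN Grouptitle = :pfr AND MetricName = '$ Products' THEN WeekValue END)
           / NULLIF(MAX(CASE WHEN Grouptitle = :rev AND MetricName = 'Net Revenue' THEN WeekValue END), 0)
    FROM FinalData
    WHERE (Grouptitle = :pfr AND MetricName = '$ Products')
       OR (Grouptitle = :rev AND MetricName = 'Net Revenue')
    GROUP BY weekdate
    HAVING COUNT(CASE WHEN Grouptitle = :pfr AND MetricName = '$ Products' THEN 1 END) > 0
       AND COUNT(CASE WHEN Grouptitle = :rev AND MetricName = 'Net Revenue' THEN 1 END) > 0
    ORDER BY weekdate
"""

joined = conn.execute(self_join_sql, params).fetchall()
single = conn.execute(single_pass_sql, params).fetchall()
print(joined == single, single)
```

This removes the join but not necessarily the sort, since grouping by weekdate may still need one. If the Sort operator stays expensive, it may be worth testing a temp table with an index on weekdate in place of the table variable.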
Hi all, I have the following query to be optimized. It just takes too long to complete the execution.
SELECT COUNT(*)
FROM Tbl_A a
INNER JOIN Tbl_B b ON a.AID = b.AID
INNER JOIN Tbl_C c ON a.AID = c.AID
INNER JOIN Tbl_D d ON d.DID = a.DID
INNER JOIN Tbl_E e ON e.DID = d.DID
INNER JOIN Tbl_F f ON e.EID = f.EID
WHERE a.Col_1 = 1
  AND (a.Col_2 LIKE N'%abc%')
  AND a.Col_3 <> CASE WHEN d.Col_1 ='ABC' THEN 'BR' ELSE '' END
  AND c.Col_1 = CASE WHEN d.Col_1 ='ABC' THEN 'ABC_COMPANY' ELSE 'PPRO' END
  AND f.Col_1 = 'val1'
Here are the estimated record counts for the tables:
Tbl_A has over 150,000 records
Tbl_B has over 150,000 records
Tbl_C has over 450,000 records
Tbl_D has over 33 records
Tbl_E has over 4000 records
Tbl_F has over 5000 records
I have a small tricky problem here...need help of all you experts.
Let me explain in detail. I have three tables
1. Emp table: columns -> EMPID and DeptID
2. Dept table: columns -> DeptName and DeptID
3. Team table: columns -> Date, EmpID1, EmpID2, DeptNo
There is a stored procedure which runs every day and, for EVERY DeptID that exists in the Dept table, selects two employees from the Emp table and puts them in the Team table. Assuming that there are several thousand departments in the Dept table, the amount of data entered in the Team table every day is tremendous.
If I continue to run the stored proc for 1 month, the team table will have lots of rows in it and I have to retain all the records.
The real problem is when I want to retrieve data for an employee (empid1 or empid2) from the Team table and view the related details like date, deptno and empid1 or empid2 from the Emp table. How do we optimize data retrieval and storage for the Team table? I cannot use partitions as I have SQL Server 2005 Standard Edition.
Please help me to optimize the query and data retrieval time from Team table.
I need help in optimizing this query. Most of the time is spent calling a remote database. Thanks in advance.
ALTER PROCEDURE dbo.myAccountGetCallLogsTest
  @directorynumber as varchar(10),
  @CallType as tinyint
AS
declare @dt as int
SELECT TOP 1 @dt=datediff(day,C.EstablishDate,getdate())
FROM ALBHM01CGSERVER.Core.dbo.Customer C
INNER JOIN ALBHM01CGSERVER.Core.dbo.UsgSvc U ON C.CustID = U.CustID
WHERE (U.ServiceNumber = @directoryNumber)
ORDER BY C.EstablishDate DESC
IF @dt>90
  select DN as Number, Remote_DN as [Remote Number], City, StartTime as [Start Time], EndTime as [End Time]
  from vw_Call_Logs
  where DN = '1' + @directoryNumber and call_type = @CallType and datediff(day,starttime,getdate())<90
  order by starttime desc
ELSE
  select DN as Number, Remote_DN as [Remote Number], City, StartTime as [Start Time], EndTime as [End Time]
  from vw_Call_Logs
  where DN = '1' + @directoryNumber and call_type = @CallType and datediff(day,starttime,getdate())< @dt
  order by starttime desc
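Apart from the remote call, the local filter datediff(day,starttime,getdate())<90 wraps the column in a function, which usually stops an index on starttime from being used for a seek. Comparing the bare column against a precomputed cutoff, e.g. starttime > DATEADD(day, -90, GETDATE()), avoids that. The sketch below shows the effect with Python's sqlite3 as a stand-in. The table, index name and sample rows are hypothetical, a fixed :now parameter replaces getdate(), and julianday/datetime stand in for DATEDIFF/DATEADD.

```python
import sqlite3

def plan(conn, sql, params):
    """Return the EXPLAIN QUERY PLAN detail text for a query as one string."""
    return " | ".join(
        row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Call_Logs (DN TEXT, call_type INTEGER, starttime TEXT)")
conn.execute("CREATE INDEX ix_Call_Logs_starttime ON Call_Logs (starttime)")
conn.executemany(
    "INSERT INTO Call_Logs VALUES (?, ?, ?)",
    [
        ("15551230000", 1, "2008-01-05 10:00:00"),  # older than 90 days
        ("15551230000", 1, "2008-05-20 09:30:00"),
        ("15551230000", 1, "2008-06-01 18:45:00"),
    ],
)
params = {"now": "2008-06-15 12:00:00"}  # fixed "current time" for repeatability

# Column wrapped in a function: every row has to be evaluated.
wrapped_sql = """
    SELECT DN, starttime FROM Call_Logs
    WHERE julianday(:now) - julianday(starttime) < 90
"""

# Bare column compared with a cutoff computed once: an index range search.
sargable_sql = """
    SELECT DN, starttime FROM Call_Logs
    WHERE starttime > datetime(:now, '-90 days')
"""

plan_wrapped = plan(conn, wrapped_sql, params)
plan_sargable = plan(conn, sargable_sql, params)
rows_wrapped = sorted(conn.execute(wrapped_sql, params).fetchall())
rows_sargable = sorted(conn.execute(sargable_sql, params).fetchall())

print("wrapped: ", plan_wrapped)
print("sargable:", plan_sargable)
```

One caveat for the T-SQL version: DATEDIFF(day, ...) counts calendar-day boundaries, so the DATEADD cutoff can differ from it by up to a day at the edge of the window. If that matters, the cutoff can be truncated to midnight first.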
select count(a.callid) from tbl1 as a inner join tbl2 as b on a.calldefid=b.calldefid where a.programid=175
select count(a.callid) from tbl1 as a inner join tbl2 as b on a.calldefid=b.calldefid where b.programid=175
callid - pk on tbl1
calldefid - nonclustered index on both tbl1 and tbl2
programid - nonclustered index on both tbl1 and tbl2
tbl2 is the smaller table
From my understanding, the second query will run faster because it reduces the records in the smaller table first and then joins to the larger table (tbl1).
But can you explain to me why limiting the rows on tbl1 first, then joining to tbl2, would take longer?