I'm looking for an in-depth book, article, FAQ, whatever, regarding the query optimizer...
I've read Books Online pretty thoroughly and have been SQL coding for a number of years. The system I work on relies heavily on real-time access to data, and the number-crunching procedures we use are a critical part of the design. For the most part, sometimes through trial and error, I have been able to find ways to achieve the performance we need, but I'm often surprised by the methods that prove most effective.
For example, I have cases where I can only get the performance I'm looking for using table functions, and other cases where indexed temporary tables are the only way. I have statements that run fast as a SELECT statement but limp along when converted to an UPDATE statement, forcing me to resort to cursors, temp tables, or table hints with varying degrees of success.
I'm wondering if anyone has come across material that takes an in-depth look at the various technologies available and how to tweak queries. I want to get away from hours of testing and hacking.
I have a SQL command which I run on two separate servers. Both servers are configured and built the same. On server 1 it takes mere seconds, but on server 2 it takes over 5 minutes.
I have checked the execution plan on both servers and they are completely different. I ran UPDATE STATISTICS WITH FULLSCAN on both servers, but the execution plans were still different.
My question is: why are the execution plans so different, and how do I get them to execute with the same plan?
I am trying to resolve performance issues in a third-party application. I have run the Profiler and found a transaction that performs a table scan against a 6 million row table. This transaction occurs repeatedly, so I thought: just add an index on the columns in the WHERE clause used here. After adding the index, I looked at the estimated execution plan in Query Analyzer, and I find that it is still performing the table scan. If I run the query it takes over 60 seconds to run; if I add an index hint, it runs in under a second. I ran DBCC SHOW_STATISTICS to see if the statistics were up to date:
Statistics for INDEX 'IX_Finish_dept'.

Updated              Rows     Rows Sampled  Steps  Density       Average key length
-------------------  -------  ------------  -----  ------------  ------------------
Jun 26 2007  5:18PM  6832336  6832336       150    2.1415579E-7  18.0

(1 row(s) affected)

All density   Average Length  Columns
------------  --------------  ------------
2.1875491E-7  8.0             finish
1.9796084E-7  18.0            finish, dept
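For reference, the index-hint form mentioned above would presumably look something like this sketch (the table name and literal values are assumptions; only the WITH (INDEX(...)) syntax is the standard form):

SELECT *
FROM dbo.FinishTable WITH (INDEX(IX_Finish_dept))   -- dbo.FinishTable is a hypothetical name
WHERE finish = 'Matte'
  AND dept = 'Paint'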
Hello all, I have a series of stored procedures with a query taking a join of 5 tables. These tables are quite large, with a couple of them having around 10 million rows. As this is a DSS application with periodic data loads, I thought of creating an indexed view on top of these tables. Now the problem is that the indexed view is not used by the optimizer directly: I need to change my queries and put a WITH (NOEXPAND) query hint to make sure the indexed views are used. This is in spite of getting a dramatic improvement in the query timings (from 64 secs down to 3 secs) after using the indexed views. I would like to know the possible reasons for the optimizer not using the indexed view by itself. Is it because my indexed view caters to multiple queries, or am I missing something basic?
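One hedged note that often explains this: automatic matching of indexed views is an Enterprise Edition feature, and on other editions the optimizer expands the view back to its base tables unless WITH (NOEXPAND) is given. A minimal sketch of the pattern, with hypothetical names throughout:

-- Indexed views require SCHEMABINDING, two-part names, and COUNT_BIG(*)
-- when grouping; all object names below are illustrative assumptions.
CREATE VIEW dbo.vSalesSummary
WITH SCHEMABINDING
AS
SELECT s.ProductID,
       COUNT_BIG(*)  AS RowCnt,
       SUM(s.Amount) AS TotalAmount
FROM dbo.SalesDetail AS s
GROUP BY s.ProductID
GO

CREATE UNIQUE CLUSTERED INDEX IX_vSalesSummary ON dbo.vSalesSummary (ProductID)
GO

-- With the hint the view's own index is read directly; without it,
-- non-Enterprise editions go back to the base tables.
SELECT ProductID, TotalAmount
FROM dbo.vSalesSummary WITH (NOEXPAND)
WHERE ProductID = 42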
I have SQL 7.0 SP2 on NT 4.0 SP5. My database is 180 GB, 23 tables. It has been up and running for 2 years without any problems. All of a sudden my queries have started taking a long time to run: the optimizer has decided that table scans are better than indexes. If I use query hints they work just fine, but I can't modify all of our code to make these changes.
This is happening on all tables. Record counts are in the same range they have always been.
Statistics and indexes are all fine and current. Have dropped and rebuilt both.
Our app has been distributed to more than 300 different sites. On one of the sites we get the error "Could not continue scan with NOLOCK due to data movement", indicating that the query optimizer takes a NOLOCK for our SELECT statement (it has been opened with adOpenDynamic, adLockOptimistic).
Changing the source is not an option; we have to solve this without touching the code.
Is there any way to tweak the query optimizer so that our app works correctly? I know that there will be a reduction in performance, but it's our only choice.
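A hedged, database-level idea to test, assuming the instance is SQL Server 2005 or later and that the NOLOCK behaviour comes from the provider/default isolation level rather than a hint baked into the statements themselves: row-versioned reads avoid the dirty, allocation-ordered scans that raise the "data movement" error, and they require no application changes.

-- YourAppDb is a placeholder name; test carefully before applying in production
ALTER DATABASE YourAppDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE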
I've got this query that runs in 30 seconds and returns about 24,000 rows. The table variable returns about 145 rows (no performance issue here), and the TransactionTbl table has 14.2 million rows, a compound clustered primary key, and 6 non-clustered indexes, none of which meet the needs of the query.
Actual execution plan shows SQL is doing an index seek, then a nested loop join, and then fetching the remaining data from the TransactionTbl using a Key Lookup.
I designed a new index based on the query which, when I force its usage via an index hint, reduces the run time to sub-second; but without the index hint the SQL optimiser won't use the new index, which looks like this:

CREATE INDEX IX_Test ON GLSchemB.TransactionTbl (CltID, Date)
    INCLUDE (Ledger_Code, Amount, CurrencyID, AssetID)

and I tried this:

CREATE INDEX IX_Test ON GLSchemB.TransactionTbl (CltID, Date, Ledger_Code, CurrencyID, AssetID)
    INCLUDE (Amount)

and even a full covering index!
I did some testing, including disabling all indexes but the PK, and the optimiser tells me I've got a missing index and recommends I create one EXACTLY like the one I designed, but when I put mine back it doesn't use it.
I thought this might be due to fragmentation and/or stats being out of date, so I rebuilt the PK and my index, and the optimiser started using my index, doing an index seek and running sub-second. Thinking I had solved the problem, I rebuilt all the indexes, testing after each one, and my index was used, BUT as soon as I flushed the related query plan the optimiser went back to using a less optimal index, with a seek and key lookup plan, taking 30 seconds.
For now I've resorted to using OPTION (TABLE HINT(G, INDEX(IX_Test))) to force this, but it's a workaround only. Why would the optimiser select a less optimal query plan?
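Two hedged things that might be worth trying here. First, rebuilding an index refreshes its statistics with a full scan, which may be exactly why the plan was good right after the rebuilds; refreshing the stats the same way is a cheaper test than a rebuild. Second, on SQL Server 2008+ a plan guide can carry the working hint outside the query text. Names follow the post; the @stmt placeholder must be replaced with the exact statement text the application submits:

-- Cheaper test than a rebuild: full-scan statistics refresh
UPDATE STATISTICS GLSchemB.TransactionTbl IX_Test WITH FULLSCAN

-- Plan guide carrying the same hint, keeping the query itself clean
EXEC sp_create_plan_guide
    @name            = N'PG_TransactionTbl_IX_Test',   -- hypothetical name
    @stmt            = N'<exact query text here>',     -- placeholder, must match exactly
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = NULL,
    @hints           = N'OPTION (TABLE HINT(G, INDEX(IX_Test)))'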
I am having an issue with large queries using Microsoft SQL Server 2005 - 9.00.2221.00 (X64).
I have a query with many INNER/LEFT OUTER/RIGHT OUTER joins which is taking a very, very long time to run. This looks exactly like the problem described in http://support.microsoft.com/kb/318530. However, that doc says it was fixed in SP1, which is already installed.
Basically I have a query:
SELECT .... FROM TABLEA
INNER JOIN TABLEB ... LEFT OUTER TABLEC... LEFT OUTER TABLED... RIGHT OUTER TABLEF... LEFT OUTER TABLEJ.. LEFT OUTER TABLEH... LEFT OUTER TABLEI... RIGHT OUTER TABLEK... LEFT OUTER TABLEM.. ... 17 joined tables in all...... WHERE TABLEB.field1 = 'abc'
The query plan for this is using TABLEA as the "main" table and joining everything else to it. The problem is, TABLEA has 117 MILLION records. TABLEB has 10,000 records which match the WHERE. I stopped this query after it ran for 62 HOURS.
If I simply change the query to:
SELECT .... FROM TABLEB
INNER JOIN TABLEA ... LEFT OUTER TABLEC... LEFT OUTER TABLED... RIGHT OUTER TABLEF... LEFT OUTER TABLEJ.. LEFT OUTER TABLEH... LEFT OUTER TABLEI... RIGHT OUTER TABLEK... LEFT OUTER TABLEM.. ... 17 joined tables in all...... WHERE TABLEB.field1 = 'abc'
The query runs in 15 mins. The query plan now uses TABLEB and the WHERE clause to join all the other tables.
The problem is, this query is generated by a report writer, and I have no control over the way it creates the SQL code.
I tried some tests and I have one doubt: why doesn't the optimizer do a Constant Scan for normal tables? For instance:
--drop table #tmp
create table #tmp (id Int Identity(1,1) Primary key, name VarChar(250))
go
insert into #tmp(name) values(NEWID())
insert into #tmp(name) values(NEWID())
go
set statistics profile on
go
-- Execution plan creates a Constant Scan
select * from #tmp where id = 1 and id = 5
go
set statistics profile off
go

--drop table tmp
create table tmp (id Int Identity(1,1) Primary key, name VarChar(250))
go
insert into tmp(name) values(NEWID())
insert into tmp(name) values(NEWID())
go
set statistics profile on
-- Why does the execution plan not create a Constant Scan for this case?
select * from tmp where id = 1 and id = 5
go
set statistics profile off
I tried this:

use northwind
go
SELECT OrderDate
FROM Orders
WHERE OrderDate > '19950101'

See the query plan? OK. Now:

SELECT OrderDate, EmployeeId
FROM Orders
WHERE OrderDate > '19950101'

See the query plan? What happened? The only way to get an index seek instead of an index scan is to force the index usage ( with(index=orderdate) ), but I don't like this solution.

Also try this:

SELECT * FROM Orders WHERE employeeId > 9

and

SELECT * FROM Orders WHERE employeeId > 8

Can someone explain why this happens, and how I can overcome the performance loss problem? (Well, not in the Orders table, but in my table there are 300K records, and doing a scan to retrieve 50 records is not exactly what I want.) Thanks to all.
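This looks like the classic "tipping point" behaviour: once EmployeeId is in the select list, the OrderDate index no longer covers the query, and past some estimated row count the optimizer prefers one scan to many bookmark lookups. A hedged sketch of one fix, a covering index (the index name is an assumption; INCLUDE requires SQL Server 2005+, so on 2000 a composite key serves the same purpose):

-- Covers both output columns, so the seek needs no bookmark lookups
CREATE INDEX IX_Orders_OrderDate_EmployeeId
    ON Orders (OrderDate)
    INCLUDE (EmployeeId)

-- SQL Server 2000 alternative:
-- CREATE INDEX IX_Orders_OrderDate_EmployeeId ON Orders (OrderDate, EmployeeId)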
Way back when, and at least in version 7 IIRC, the query optimizer gave up when the WHERE clause in a statement contained more than 4 search conditions. Does anyone know if such a limitation still exists in MS SQL 2005? The BOL seems to be silent on the issue.

Boa
I'm very puzzled by the choice of NC index being made by the optimizer in this example. I don't actually think it should use an NC index at all.

I have:

Table: CustomerStatus_T
Single data page
19 records
Clustered index on CustomerStatusID:

CREATE TABLE [CustomerStatus_T] (
    [CustomerStatusID] [int] NOT NULL,
    [Name] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [Description] [varchar] (200) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [Code] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [CodeAlt] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [Ordinal] [int] NULL,
    [Default] [int] NULL,
    [Display] [bit] NOT NULL,
    [StatusType] [varchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [DateCreated] [smalldatetime] NULL,
    [DateUpdated] [smalldatetime] NULL,
    [DateArchived] [smalldatetime] NULL,
    CONSTRAINT [PK_ROMS_CustomerStatus] PRIMARY KEY CLUSTERED ([CustomerStatusID]) ON [PRIMARY]
) ON [PRIMARY]

If I run the following query, it does exactly what I expect and scans the clustered index:

SELECT customerStatusID, [Name]
FROM CustomerStatus_T
WHERE dateArchived IS NULL
  AND Display = 1
  AND StatusType = 'Q'

and gives the following QEP and IO statistics:

|--Clustered Index Scan(OBJECT:([Reach_Roms].[dbo].[CustomerStatus_T].[PK_ROMS_CustomerStatus]),
       WHERE:(([CustomerStatus_T].[DateArchived]=NULL AND [CustomerStatus_T].[StatusType]='Q')
       AND Convert([CustomerStatus_T].[Display])=1))

Table 'CustomerStatus_T'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.

If I now put a NC index on the statustype column:

create index ix_nci_statustype on customerstatus_t(statustype)

the query plan for the same query changes to:

|--Filter(WHERE:([CustomerStatus_T].[DateArchived]=NULL AND Convert([CustomerStatus_T].[Display])=1))
   |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([Reach_Roms].[dbo].[CustomerStatus_T]))
      |--Index Seek(OBJECT:([Reach_Roms].[dbo].[CustomerStatus_T].[ix_nci_statustype]),
             SEEK:([CustomerStatus_T].[StatusType]='Q') ORDERED FORWARD)

Table 'CustomerStatus_T'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0.

For some bizarre reason, the optimizer thinks that a NC index lookup on a single-page table, which ultimately costs 7 IOs, is cheaper than a table (or clustered index) scan of a single page. Why? The showplan cost also shows that it expects the NC index to be cheaper (which is presumably why it goes and uses it), but even after running UPDATE STATISTICS on the table it still chooses the same idiotic query plan.

Any thoughts? Has anyone seen similar behaviour before, and can anyone please explain it to me?

p.s. I don't actually WANT to put a NC index on this table, but I noticed the behaviour by accident, which is why I'm asking the question :-)
Is there a DMV or similar in SQL 2012, or SQL 2008, that shows when a statistic was last used by the optimizer? I would like to clean up some of the auto-generated stats, assuming it's possible to do so. In particular, I'm looking to drop those statistics that were created by one-off queries, data loads, etc., and are now doing nothing but adding to the execution time of UPDATE STATISTICS jobs.
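A hedged note: I'm not aware of a DMV that records when a statistic was last *used* by the optimizer, but the auto-created stats can at least be enumerated (and dropped) with something like this sketch; the DROP at the end uses example names only:

SELECT OBJECT_NAME(s.[object_id])              AS table_name,
       s.name                                  AS stat_name,
       STATS_DATE(s.[object_id], s.stats_id)   AS last_updated
FROM sys.stats AS s
WHERE s.auto_created = 1
  AND OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
ORDER BY table_name, stat_name

-- DROP STATISTICS dbo.SomeTable.[_WA_Sys_00000002_0425A276]  -- example names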
declare @ContactId as integer
set @ContactId = 5

select *
from Person.Contact
where ContactId = @ContactId
   OR @ContactId = -1

If you run this in SQL 2005 on the AdventureWorks database, why are the logical reads 56:

Table 'Contact'. Scan count 1, logical reads 56

and not 2, as when you run it without the second OR condition:

declare @ContactId as integer
set @ContactId = 5

select *
from Person.Contact
where ContactId = @ContactId

How can I use the same SP and either get one record returned by passing the ID of the field, or pass a dummy parameter like -1 in order to get ALL the records returned? In this case, even when I pass a parameter like ContactID = 5, there is still a table scan (a clustered index scan in this case) happening for the other OR condition. There's no method to tell SQL to check the first condition and, only if it is false, check the second OR condition. On the same topic, does this mean all OR conditions are ALWAYS evaluated, regardless of whether one of them has already been determined to be true? Thank you
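A hedged sketch of the usual fix for this catch-all pattern. On SQL Server 2008 and later, adding OPTION (RECOMPILE) to the statement lets the optimizer embed the runtime value and prune the dead OR branch. On 2005, a reliable alternative is simply splitting the two cases, so each statement gets its own plan (a seek for the single-ID case, a scan for the get-all case):

if @ContactId = -1
    select * from Person.Contact
else
    select * from Person.Contact
    where ContactId = @ContactId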
Guys, what I need is a tool which gives details on the choice of an execution plan by SQL Server. For example, the cost for a hash join might be 200 and 100 for a nested loop, and therefore a nested loop is used. Same thing for the access paths for each table/view involved. In Oracle, we turn on event 10053 to see this kind of info.

Thanx
Daniel
An interesting discussion yesterday. One of the programmers asked about the use of the NOLOCK optimizer hint with an iterator table aka table of numbers. His comment was that this optimizer hint was not efficient. Rather than give a knee-jerk response I thought it would be better to ask. The main circumstance is that the iterator table is completely static with a fill factor of 100%. My purpose is to eliminate lock contention if I can.
Are there reasons to not use the NOLOCK hint in this case to potentially improve performance?
I cannot find this info anywhere. We have Hyperion 8.x running against SQL 2000. We will upgrade side-by-side in the coming weeks. It appears that Hyperion 8.x is assured to continue to work without issue with a 2K5 database in 80 compatibility level.
I have seen the BOL table on differences between 80 and 90 (the one that specifically affects us is WITH on index hints), but what I wonder about are 3 specific features.
1. Do you still get DMV capabilities in 80 mode? (I ran a test on sys.dm_db_index_usage_stats for an 80-compat DB, creating an index and using a query to hit the index, and yes - it does look as if this DMV does get populated; see the sketch after this list.)
2. Can you still Table Partition in 80 mode? (Ran a test in DEV, and yes - this does look possible - any gotchas?)
3. Does the optimizer behave quite differently in 80 mode?
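For reference, the kind of check described in item 1 might look like this hedged sketch (run it inside the 80-compat database so OBJECT_NAME resolves; the database name is a placeholder):

SELECT OBJECT_NAME(s.[object_id]) AS object_name,
       s.index_id,
       s.user_seeks,
       s.user_scans,
       s.user_lookups
FROM sys.dm_db_index_usage_stats AS s
WHERE s.database_id = DB_ID('YourDb80')   -- hypothetical database name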
I have tested in DEV moving a DB from 80 to 90 and back, and this seems to work with no problems.
Hello, I have the following query in Access 2000 that I need to convert to SQL 2000:

UPDATE tblShoes, tblBoxes
SET tblShoes.Laces1 = Null
WHERE (((tblShoes.ShoesID)=Int([tblBoxes].[ShoesID])) AND ((tblBoxes.Code8)="A" Or (tblBoxes.Code8)="B"))
WITH OWNERACCESS OPTION;

The ShoesID in the tblShoes table is an autonumber; however, the records in tblBoxes have the ShoesID converted to text.

This query runs OK in Access, but when I try to run it in the SQL Server 2000 Query Analyzer I get errors because of the comma in the "UPDATE tblShoes, tblBoxes" part. I only need to update the tblShoes field named Laces1 to NULL for every record matching the ones in tblBoxes that are marked with an "A" or a "B" in the tblBoxes.Code8 field.

Any help would be greatly appreciated.
JR
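A hedged sketch of the T-SQL equivalent: SQL Server expresses multi-table updates as UPDATE ... FROM with an explicit join rather than Access's comma-separated table list, and the CAST mirrors Access's Int() on the text ShoesID column:

UPDATE tblShoes
SET Laces1 = NULL
FROM tblShoes
INNER JOIN tblBoxes
    ON tblShoes.ShoesID = CAST(tblBoxes.ShoesID AS int)   -- assumes every tblBoxes.ShoesID is numeric text
WHERE tblBoxes.Code8 IN ('A', 'B')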
Here is what I have, but I'm not confident it is correct:

CASE WHEN (PTicketNum = '0' OR PTicketNum IS NULL) THEN 'M999Z' ELSE PTicketNum END AS Ticket
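If it helps, the CASE above looks logically sound. A hedged, equivalent shorthand uses NULLIF to turn '0' into NULL and COALESCE to replace any NULL with the fallback (the table name below is an assumption):

SELECT COALESCE(NULLIF(PTicketNum, '0'), 'M999Z') AS Ticket
FROM dbo.Tickets   -- hypothetical table name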
I am new to SQL transactions and I'm currently trying to write an SP that will check the database sizes and print the error to the Windows event logs so that MOM can monitor them. I do not wish to retain the information, so I am using a temp table, which will be dropped once this is written to the logs. I also want it to loop through all the databases except for an exemption list. SQL 2000 query (hope I'm not completely going in the wrong direction):
SET NOCOUNT ON

DECLARE @size int
DECLARE @name AS sysname
DECLARE @string varchar(1000)

CREATE TABLE ##temp ([size] int)
SET @name = ''

WHILE 1 = 1
BEGIN
    SET @name = (SELECT TOP 1 name FROM master..sysdatabases
                 WHERE -- exemption list
                       name NOT IN ('master', 'tempdb', 'model', 'msdb')
                   AND name > @name
                 ORDER BY name)

    -- once done all dbs then get out
    IF @name IS NULL BREAK

    SET @string = 'INSERT INTO ##temp ([size]) SELECT SUM([size]) FROM '
                + QUOTENAME(@name) + '.dbo.sysfiles'
    EXEC (@string)
    -- SELECT @string  -- uncomment to debug the dynamic SQL

    SELECT @size = [size] FROM ##temp

    -- Raise error input to the event log
    RAISERROR ('Database size %s is %i', 10, 2, @name, @size) WITH LOG

    TRUNCATE TABLE ##temp
END

DROP TABLE ##temp
I would welcome any help from someone on amending this query, or indeed showing me a best-practice way of doing this. Thank you in advance to everyone that replies :-)
Hi guys, I have a program which imports data into SQL Server from CSV files. This program uses something like this:
CREATE PROCEDURE MyImport
(
    @FileName AS VARCHAR(200)
)
AS
SET NOCOUNT ON

DECLARE @SQL VARCHAR(4000)
SET @SQL = 'SELECT *
            FROM OpenRowset(''MSDASQL'',
                            ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=C:;'',
                            ''SELECT * FROM ' + QUOTENAME(@FileName) + ''') as t'
EXEC (@SQL)
When I execute this I get a permission error saying the account I specified in the web config doesn't have rights to run these types of scripts. I tried using the sa account, which works, but it's not recommended to set up applications to access the database using the sa account. I did some research and found out that you can actually enable this by changing a registry key, but I can't find the MSDASQL data source under the SQL Server providers in the registry. Any other ways I can enable this?
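A hedged note, applicable only if the server is SQL Server 2005 or later: there, ad hoc OPENROWSET/OPENDATASOURCE access is controlled by a server configuration option rather than the SQL 2000-era DisallowAdhocAccess registry value (which, on 2000, may need to be created by hand under the provider's key if it is absent):

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'Ad Hoc Distributed Queries', 1
RECONFIGURE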
select ZT_AuthDetail
where MasterKey in (select * from ZT_AuthDetail d, ZT_AuthMaster m
                    where (ReplyTime between '2004/11/22' and '2004/12/31')
                      and (m.CustId = '33196700')
                      and (m.Prikey = '1840'))
When I run this query, the error message says something is wrong with "Sub Query from Exists...". If I run the inner (select * from ZT_AuthDetail d, ZT_AuthMaster m where ...) alone, the result is fine and OK, but with "select ZT_AuthDetail where MasterKey in..." I get the error message. Can you please tell me why, and how to fix it? ** I use SQL 2000. Thank you very much.
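A hedged guess at the intended statement: the outer query is missing its FROM keyword, and an IN subquery must return exactly one column rather than SELECT *. Something along these lines (the d-to-m join condition is an assumption; adjust it to the real relationship between the tables):

SELECT *
FROM ZT_AuthDetail
WHERE MasterKey IN (SELECT d.MasterKey
                    FROM ZT_AuthDetail d
                    INNER JOIN ZT_AuthMaster m
                        ON d.MasterKey = m.Prikey   -- assumed relationship
                    WHERE d.ReplyTime BETWEEN '2004/11/22' AND '2004/12/31'
                      AND m.CustId = '33196700'
                      AND m.Prikey = '1840')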
I have a procedure (used to create a report) that was used in SQL 7.0 Service Pack 3.
Problem is that we are upgrading to SQL 2000 and this procedure now takes 1 minute and 30 seconds to execute vs. 10 seconds previously.
Everything is the same between the SQL 7 and SQL 2000 servers, i.e. database size, indexes, hardware, etc.
I looked at the query execution plan, and it seems to do a sort which is taking the majority of the resources on SQL 2000, even though there is no sort statement issued in the procedure itself.
Any help would be appreciated. I am more curious to find out why this is the case when all the variables are the same between the two servers, yet SQL 2000 performance is much worse than SQL 7.0. It should be the other way around!
If I use an Execute SQL Task like this ...
------------------------------------
DECLARE @VAR
SET @VAR = ?
EXEC sp_Test @VAR
------------------------------------
and then click Parse Query, I get the error "Syntax error or Access violation".
However, if I use it this way
------------------------------------
EXEC sp_Test ?
------------------------------------
it works.
Now here is the odd thing: if I put in a DECLARE statement and a global parameter "?", I get the syntax error.
But .... if I use a disconnected edit and use DECLARE and "?", then the task will run properly. So the problem seems to be with the query parser.
The problem is I don't have the time to edit everything in a disconnected manner. Is there a way, or an update, that will allow me to use DECLARE and "?"? I have tried many hotfixes but none have worked.
I am having an issue with a SQL query I wrote in SQL 2000. I am almost there but am lacking in one area, hoping a better programmer than me and a fresh set of eyes can pick off my mistake.
The goal is very straight forward, order and rank the following fields: R1.PSWins DESC, R1.PSPoints_Total DESC, R1.PSTBDifference
The issue lies in my sub query in the AND / OR joins... This query will work if I sort and rank on two fields, but if I try three or more it does not work properly.
( SELECT COUNT(*)
  FROM tblWeeklyStandings_All R2
  WHERE R2.AutoPoolID = R1.AutoPoolID
    AND R2.Week = R1.Week
    AND ( R2.PSWins > R1.PSWins
          OR R2.PSWins = R1.PSWins AND R2.PSPoints_Total > R1.PSPoints_Total
          OR R2.PSPoints_Total = R1.PSPoints_Total AND R2.PSTBDifference < R1.PSTBDifference ) ) + 1 AS Rank
FROM dbo.tblWeeklyStandings_All R1 INNER JOIN dbo.qryUsers_SDR ON R1.AutoPoolID = dbo.qryUsers_SDR.AutoPoolID AND R1.PoolID = dbo.qryUsers_SDR.PoolID WHERE (R1.PoolType = '2007' OR R1.PoolType = '2008') AND (dbo.qryUsers_SDR.OrderBy1 = '6') AND (dbo.qryUsers_SDR.PointSpread = 1)
ORDER BY R1.PSWins DESC, R1.PSPoints_Total DESC, R1.PSTBDifference
At this point I am ready to buy SQL 2005 because I think this gets easier, hoping a 2000 wiz can help me correct my error, or missing "(", or something!
Here are the results:

Rank  Username       Wins  Points  TB Difference
1.    test 0618084   16    50      0
4.    test 0618083   16    50      66
3.    test 0618081   15    50      55
4.    test 0618082   14    50      55
5.    admin          2     0       0

As you can see, the ranking is not correct. It should be the following:

Rank  Username       Wins  Points  TB Difference
1.    test 0618084   16    50      0
2.    test 0618083   16    50      66
3.    test 0618081   15    50      55
4.    test 0618082   14    50      55
5.    admin          2     0       0
Sometimes it works, sometimes it does not. Frustrating!
In different scenarios, I rank the following way when ties occur. My code sometimes does this correctly, but as you can see above it fails in certain areas, like it forgets R1.PSWins DESC, R1.PSPoints_Total DESC, R1.PSTBDifference....
Rank
1 Tied
1 Tied
3
4
5 Tied
5 Tied
5 Tied
8
etc...
The most frustrating thing is the sort order is correct; it is the rank sub query that fails under certain scenarios...
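For what it's worth, a hedged guess at the fix: in T-SQL, AND binds more tightly than OR, so the ranking predicate as written lets a row with an equal points total count as "better" even when its wins differ, which matches the bad rank of 4 in the results above. Nesting each tiebreak explicitly should repair it:

( SELECT COUNT(*)
  FROM tblWeeklyStandings_All R2
  WHERE R2.AutoPoolID = R1.AutoPoolID
    AND R2.Week = R1.Week
    AND (   R2.PSWins > R1.PSWins
         OR ( R2.PSWins = R1.PSWins
              AND (   R2.PSPoints_Total > R1.PSPoints_Total
                   OR ( R2.PSPoints_Total = R1.PSPoints_Total
                        AND R2.PSTBDifference < R1.PSTBDifference ) ) ) ) ) + 1 AS Rank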