SQL 2012 :: Query Optimizer Not Using Optimum Index
Jul 28, 2015
Running SQL 2012 SP2
I've got this query that runs in 30 seconds and returns about 24,000 rows. The table variable returns about 145 rows (no performance issue here), and the TransactionTbl table has 14.2 million rows, a compound clustered primary key, and 6 non-clustered indexes, none of which meet the needs of the query.
Actual execution plan shows SQL is doing an index seek, then a nested loop join, and then fetching the remaining data from the TransactionTbl using a Key Lookup.
I designed a new index based on the query which, when I force its use via an index hint, reduces the run time to sub-second. But without the index hint the SQL optimiser won't use the new index, which looks like this:
CREATE INDEX IX_Test on GLSchemB.TransactionTbl (CltID, Date) include (Ledger_Code, Amount, CurrencyID, AssetID)

and I tried this:

CREATE INDEX IX_Test on GLSchemB.TransactionTbl (CltID, Date, Ledger_Code, CurrencyID, AssetID) include (Amount)

and even a full covering index!
I did some testing, including disabling all indexes but the PK, and the optimiser tells me I've got a missing index and recommends I create one EXACTLY like the one I designed, but when I put mine back it doesn't get used.
I thought this might be due to fragmentation and/or stats being out of date, so I rebuilt the PK and my index, and the optimiser started using my index, doing an index seek and running sub-second. Thinking I had solved the problem, I rebuilt all the indexes, testing after each one, and my index was used, BUT as soon as I flushed the related query plan, the optimiser went back to using a less optimal index, with a seek and key lookup plan, taking 30 seconds.
For now I've resorted to using OPTION (TABLE HINT(G, INDEX(IX_Test))) to force this, but it's a workaround only. Why would the optimiser select a less optimal query plan?
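For reference, a minimal sketch of what that workaround looks like in context; the column list and WHERE clause below are placeholders rather than the original query, and G is assumed to be the alias of TransactionTbl:

SELECT G.CltID, G.[Date], G.Ledger_Code, G.Amount, G.CurrencyID, G.AssetID
FROM GLSchemB.TransactionTbl AS G
WHERE G.CltID = @CltID AND G.[Date] >= @DateFrom AND G.[Date] < @DateTo
OPTION (TABLE HINT(G, INDEX(IX_Test)));

One point in favour of the OPTION (TABLE HINT(...)) form over an inline WITH (INDEX(...)) hint: it can also be attached via a plan guide if the query text itself can't be changed.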
I am trying to resolve performance issues in a third party application. I ran the Profiler and found a transaction that performs a table scan against a 6 million row table. This transaction occurs repeatedly, so I thought: just add an index on the columns in the where clause used here. After adding the index, I looked at the estimated execution plan in Query Analyzer, and I found that it was still performing the table scan. If I run the query it takes over 60 seconds; if I add an index hint, it runs in under a second. I ran DBCC SHOW_STATISTICS to see if the statistics were up to date:
Statistics for INDEX 'IX_Finish_dept'.

Updated             Rows     Rows Sampled  Steps  Density       Average key length
------------------  -------  ------------  -----  ------------  ------------------
Jun 26 2007 5:18PM  6832336  6832336       150    2.1415579E-7  18.0

(1 row(s) affected)

All density   Average Length  Columns
------------  --------------  ------------
2.1875491E-7  8.0             finish
1.9796084E-7  18.0            finish, dept
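If staleness is suspected, a full-scan refresh of just that statistic is a cheap test; the table name dbo.MyTable below is a placeholder since the post doesn't give it:

UPDATE STATISTICS dbo.MyTable IX_Finish_dept WITH FULLSCAN;
DBCC SHOW_STATISTICS ('dbo.MyTable', 'IX_Finish_dept');

(In the output above, Rows equals Rows Sampled, so the existing statistics were already built from a full scan.)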
In another forum post, a poster was deleting large numbers of rows from a table in batches of 50,000.
In the bad old days ('80s - '90s), I used to have to delete rows in batches of 500, then 1000, then 5000, due to the size of the transaction rollback segments (yes - Oracle).
I always found that increasing the number of deleted rows in a single statement/transaction improved overall process speed - up to some magic point, at which some overhead in the system began slowing the deletes down, so that deleting a single batch of 10,000 rows took more than twice as much time as deleting two batches of 5,000 rows each.
Are there good rule-of-thumb numbers (or, even better, some actual statistics and/or explanations) as to how many rows should be deleted in a single transaction/statement for optimum speed? 50,000 - 100,000 - 1,000,000 or unlimited? Are there significant differences between 2008, 2012, and 2014?
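For concreteness, a minimal sketch of the batched-delete pattern being discussed; the table and predicate are placeholders, and @batch is the knob in question:

DECLARE @batch int = 50000;
WHILE 1 = 1
BEGIN
    DELETE TOP (@batch) FROM dbo.BigTable
    WHERE CreatedDate < '20100101';   -- placeholder purge predicate
    IF @@ROWCOUNT = 0 BREAK;          -- nothing left to delete
END;

Each DELETE auto-commits as its own transaction, which bounds log growth and gives log backups (or SIMPLE-recovery truncation) a chance to reclaim space between batches.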
Hello friends, I have one simple question. I have two tables. One (Table A) has about 2.5 million rows and the second one (Table B) has about 1 million. There are common ID fields in both tables. I want to join them on the ID field and get all rows of Table A which are not in Table B. When I ran the following two queries, I got the same result set, but the time each took was very different. The following query took 1:35 minutes:

SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1
WHERE NOT EXISTS (SELECT 1 FROM [Table B] Tbl2 WHERE Tbl1.UID = Tbl2.UID)

vs this one, which took .45 seconds:

SELECT Tbl1.UID, Tbl1.[LAST NAME], Tbl1.[FIRST NAME], Tbl1.[HOUSE NUMBER], Tbl1.ADDRESS, Tbl1.CITY, Tbl1.STATE
FROM [Table A] Tbl1 LEFT OUTER JOIN [Table B] Tbl2 ON Tbl1.UID = Tbl2.UID
WHERE Tbl2.UID IS NULL

Which option is better? I have subsequent joins to another table which has about 2 million more rows and am trying to optimize the response time. I appreciate all help from the community. JB
Is there a DMV or similar in SQL 2012, or SQL 2008, that shows when a statistic was last used by the optimizer? I would like to clean up some of the auto-generated stats, assuming it's possible to do so. In particular I'm looking to drop those statistics that were created by one-off queries, data loads, etc., and are now doing nothing but adding to the execution time of Update Statistics jobs.
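As far as I know there is no DMV that records when the optimizer last read a statistic. As a starting point for the cleanup, a sketch that lists auto-created stats with their last update time (sys.dm_db_stats_properties requires SQL 2008 R2 SP2 or 2012 SP1):

SELECT OBJECT_NAME(s.[object_id]) AS table_name,
       s.name                     AS stats_name,
       sp.last_updated,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE s.auto_created = 1
ORDER BY sp.last_updated;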
I have a SQL command which I run on two separate servers. Both servers are configured and built the same. On server 1 it takes mere seconds, but on server 2 it takes over 5 minutes.
I have checked the execution plan on both servers and they are completely different. I ran UPDATE STATISTICS WITH FULLSCAN on both servers, but the execution plans were still different.
My question is: why are the execution plans so different, and how do I get them to execute with the same plan?
I'm looking for an in depth book, article, faq, whatever, regarding the query optimizer...
I've read the books online pretty thoroughly and have been sql coding for a number of years. The system I work on relies heavily on real time access to data and the number crunching procedures we use are a critical part of the design. For the most part, sometimes through trial and error, I have been able to find ways to achieve the performance we need, but I'm often surprised by the methods that prove most effective.
For example, I have cases where I can only get the performance I'm looking for using table functions, and other cases where indexed temporary tables are the only way. I have statements that run fast as a select statement, but when converted to an update statement limp along, forcing me to resort to cursors, temp tables, or table hints with varying degrees of success.
I'm wondering if anyone has come across material that takes an in-depth look at the various technologies available and how to tweak queries. I want to get away from hours of testing and hacking.
Hello All, I have a series of Stored Procedures with a query joining 5 tables. These tables are quite large, with a couple of them having around 10 million rows. As this is a DSS application with periodic data loads, I thought of creating Indexed Views on top of these tables. Now the problem is that the Indexed View is not directly used by the optimizer. I need to change my queries and add a WITH (NOEXPAND) query hint to make sure the indexed views are used. This is in spite of getting a dramatic improvement in query timings (from 64 secs down to 3 secs) after using the Indexed Views. I would like to know the possible reasons for the optimizer not using the Indexed View by itself. Is it because my Indexed View caters to multiple queries, or am I missing out on something basic?
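For illustration, a minimal sketch of the NOEXPAND usage, with hypothetical view and column names; note that editions below Enterprise only match indexed views when the hint is present, while Enterprise can match them automatically but, as this post shows, is not guaranteed to:

SELECT v.ProductID, v.TotalQty
FROM dbo.vSalesSummary AS v WITH (NOEXPAND)
WHERE v.ProductID = 42;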
I have SQL 7.0 SP2 on NT 4.0 SP5. My database is 180GIG. 23 Tables. It has been up and running for 2 years without any problems. All of a sudden my queries have started taking a long time to run. The optimizer has decided that table scans are better than indexes. If I use query hints they work just fine, but I can't modify all of our code to make these changes.
This is happening on all tables. Record counts are in the same range they have always been.
Statistics and indexes are all fine and current. Have dropped and rebuilt both.
Our app has been distributed to more than 300 different sites. On one of the sites we get the error "Could not continue scan with NOLOCK due to data movement", indicating that the query optimizer takes a NOLOCK for our select statement (opened with adOpenDynamic, adLockOptimistic).
It's no option to change the source, we have to solve this without touching the code.
Is there any way to tweak the query optimizer so that our app works correctly? I know there will be a reduction in performance, but it's our only choice.
I am having an issue with large queries using Microsoft SQL Server 2005 - 9.00.2221.00 (X64).
I have a query with many INNER/LEFT OUTER/RIGHT OUTER joins which is taking very, very long to run. This looks exactly like the problem described in http://support.microsoft.com/kb/318530. However, that doc says it was fixed in SP1, which is already installed.
Basically I have a query:
SELECT .... FROM TABLEA
INNER JOIN TABLEB ... LEFT OUTER JOIN TABLEC ... LEFT OUTER JOIN TABLED ... RIGHT OUTER JOIN TABLEF ... LEFT OUTER JOIN TABLEJ ... LEFT OUTER JOIN TABLEH ... LEFT OUTER JOIN TABLEI ... RIGHT OUTER JOIN TABLEK ... LEFT OUTER JOIN TABLEM ... (17 joined tables in all) ... WHERE TABLEB.field1 = 'abc'
The query plan for this is using TABLEA as the "main" table and joining everything else to it. The problem is, TABLEA has 117 MILLION records. TABLEB has 10,000 records which match the WHERE. I stopped this query after it ran for 62 HOURS.
If I simply change the query to:
SELECT .... FROM TABLEB
INNER JOIN TABLEA ... LEFT OUTER JOIN TABLEC ... LEFT OUTER JOIN TABLED ... RIGHT OUTER JOIN TABLEF ... LEFT OUTER JOIN TABLEJ ... LEFT OUTER JOIN TABLEH ... LEFT OUTER JOIN TABLEI ... RIGHT OUTER JOIN TABLEK ... LEFT OUTER JOIN TABLEM ... (17 joined tables in all) ... WHERE TABLEB.field1 = 'abc'
The query runs in 15 mins. The query plan now uses TABLEB and the WHERE clause to join all the other tables.
The problem is, this query is generated by a report writer, and I have no control over the way it creates the SQL code.
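Since the generated SQL can't be edited, one avenue worth investigating (a general technique, not something from the original post) is a plan guide, which attaches an OPTION(...) hint to a statement whose text you can capture but not change. A sketch with placeholder statement text and a placeholder hint; finding a hint that actually corrects this join order would take testing:

EXEC sp_create_plan_guide
    @name            = N'PG_ReportQuery',
    @stmt            = N'SELECT .... FROM TABLEA INNER JOIN TABLEB ...',  -- must match the generated text exactly
    @type            = N'SQL',
    @module_or_batch = NULL,
    @params          = NULL,
    @hints           = N'OPTION (HASH JOIN)';  -- placeholder hint; a USE PLAN hint carrying a known-good plan is another option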
I am using Full Text Index to index emails stored in a BLOB column in a table. The indexing process parses stored emails, and, if there are one or more files attached to the email, those documents get indexed too. As a result, when I query the full text index for a word or phrase, I get a reference to the email containing the word or phrase of interest whether the word was used in the email body OR in any document attached to the email.
How do I distinguish in a Full Text query whether the result came from an embedded document rather than from the "main" document? Or, if that's not possible, how do I disable indexing of embedded documents?
My goal is either to give a user an option if he or she wants to search emails (email bodies only) OR emails AND documents attached to them, or at least clearly indicate in the returned result the real source where the word or phrase has been found.
I have a clustered index that consists of 3 int columns in this order: DateKey, LocationKey, ItemKey (there are many other columns in this data warehouse table such as quantities, prices, etc.).
Now I want to add a non-clustered index on just one of the other columns, say LocationKey, like this: CREATE INDEX IX_test on TableName (LocationKey)
I understand that the clustered index keys will also be added as key columns to any NC indexes. So, in this case the NC index will also get the other two columns from the clustered index added as key columns. But, in what order will they be added?
Will the resulting index keys on this new NC index effectively be:
LocationKey, DateKey, ItemKey OR LocationKey, ItemKey, DateKey
Do the clustering keys get added to a NC index in the same order as they are defined in the clustered index?
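My understanding (worth verifying, since the implicit columns don't show up in sys.index_columns) is that the clustering keys are appended in clustered-key order, i.e. LocationKey, DateKey, ItemKey here. One way to remove the ambiguity entirely is to name them explicitly; the index ends up the same size, because those columns would be stored anyway:

CREATE INDEX IX_test ON dbo.TableName (LocationKey, DateKey, ItemKey);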
I have an SQL Server 2000 DB running on a 5 year old server. It has 5 SCSI 10K RPM drives on an IBM ServeRAID 4Lx card. I'm maxing it out to 9 drives on the same backplane (all 10K RPM).
Not sure the best way to make them count. Here's the particulars:
1. Current config is: Vol1 = RAID1 for OS, swap, and Logging files. Vol2 = RAID5 (3 disks) for DB.
2. The app does heavy writes and use of Temp DB.
I don't have by-volume stats. This stat excludes backup (taken 3 hours after a daytime reboot). Windows Task Manager shows SQL task and SERVICES.EXE both have physical reads about 15% higher than physical writes. SERVICES.EXE has about 3x the IO count as the SQL task. I assume that's mainly SQL activity.
My question for you: How best to configure the 4 new drives.
Redundancy is critical, so any non-RAIDed volume is out.
Option 1: Vol1 = RAID1 for OS. Vol2 = RAID5 (3 disk) for app DB. Vol3 = RAID1 for Sys DBs (Master etc) plus Temp DB. Also OS Swap. Also .BAK scheduled backup files. Vol4 = RAID1 for all .ldf files.
Option 2: Abandon RAID5 due to write penalty (same division of files) Vol1 = RAID1 Vol2 = RAID1 Vol3 = RAID1 Vol4 = RAID1 9th drive = hot swap.
Option 3: Vol1 = RAID1. for OS and .BAK files. Vol2 = RAID10 (4 disk). for all .mdf files Vol3 = RAID1 for all .ldf files. 9th drive = hot swap.
I'm wondering if the RAID1 read penalty will outweigh the RAID5 write penalty (for 3-stripe RAID5). Will the RAID10 advantages outweigh separating tempdb + system DBs onto RAID1 volumes (or RAID5 + RAID1)?
So I started a new job recently and have noticed a few strange configurations. Typically I would never mess with the min memory per query and index create memory options, because I just haven't seen any need to. My typical thought is that if it isn't broke... They have been modified on every single server in my environment.
From Books Online: • This option is an advanced option and should be changed only by an experienced database administrator or certified SQL Server technician. • The index create memory option is self-configuring and usually works without requiring adjustment. However, if you experience difficulties creating indexes, consider increasing the value of this option from its run value.
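To see how far each server has drifted from the defaults (0 for index create memory, meaning self-configuring, and 1024 KB for min memory per query), a quick check:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'index create memory';
EXEC sp_configure 'min memory per query';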
SELECT a.AssetGuid, a.Name, a.LocationGuid FROM Asset a WHERE a.AssociationGuid IN ( SELECT ada.DataAssociationGuid FROM AssociationDataAssociation ada WHERE ada.AssociationGuid = '568B40AD-5133-4237-9F3C-F8EA9D472662')
takes 30-60 seconds to run on my machine, due to a clustered index scan on the Asset table [about half a million rows]. For this particular association fewer than 50 rows are returned.
Expanding the inner select into a list of GUIDs, the query runs instantly:
SELECT a.AssetGuid, a.Name, a.LocationGuid FROM Asset a WHERE a.AssociationGuid IN ( '0F9C1654-9FAC-45FC-9997-5EBDAD21A4B4', '52C616C0-C4C5-45F4-B691-7FA83462CA34', 'C95A6669-D6D1-460A-BC2F-C0F6756A234D')
It runs instantly because of doing a clustered index seek [on the same index as the previous query] instead of a scan. The index in question IX_Asset_AssociationGuid is a nonclustered index on Asset.AssociationGuid.
The tables involved:
Asset represents an asset. The primary key is AssetGuid, and there is an index/FK on Asset.AssociationGuid. The Asset table has 28 columns or so.

Association is kind of like a place. Associations exist in a tree where one association can contain any number of child associations. Each association has a ParentAssociationGuid pointing to its parent. Only leaf associations contain assets.

AssociationDataAssociation is a table consisting of two columns, AssociationGuid and DataAssociationGuid. This table is used to quickly find leaf associations [DataAssociationGuid] beneath a particular association [AssociationGuid]. In the above case the inner select returns 3 rows.
I'd include .sqlplan files or screenshots, but I don't see a way to attach them.
I understand I can specify the index manually (and this also runs instantly), but for such a simple query it is peculiar that it's necessary. This is the query with the index specified manually:
SELECT a.AssetGuid, a.Name, a.LocationGuid FROM Asset a WITH (INDEX (IX_Asset_AssociationGuid)) WHERE a.AssociationGuid IN ( SELECT ada.DataAssociationGuid FROM AssociationDataAssociation ada WHERE ada.AssociationGuid = '568B40AD-5133-4237-9F3C-F8EA9D472662')
To repeat/clarify my question, why might this not be doing a clustered index seek with the first query?
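One rewrite worth testing (an alternative formulation, not from the original post) is expressing the IN as a join, which sometimes steers the optimizer toward the nonclustered seek; the results match as long as AssociationDataAssociation has no duplicate DataAssociationGuid values for a given AssociationGuid:

SELECT a.AssetGuid, a.Name, a.LocationGuid
FROM Asset a
INNER JOIN AssociationDataAssociation ada
    ON a.AssociationGuid = ada.DataAssociationGuid
WHERE ada.AssociationGuid = '568B40AD-5133-4237-9F3C-F8EA9D472662';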
I tried some tests and I have one doubt: why doesn't the optimizer do a Constant Scan on normal tables? For instance:
--drop table #tmp
create table #tmp (id Int Identity(1,1) Primary key, name VarChar(250))
go
insert into #tmp(name) values(NEWID())
insert into #tmp(name) values(NEWID())
go
set statistics profile on
go
-- Execution plan creates a Constant Scan
select * from #tmp where id = 1 and id = 5
go
set statistics profile off
GO
--drop table tmp
create table tmp (id Int Identity(1,1) Primary key, name VarChar(250))
go
insert into tmp(name) values(NEWID())
insert into tmp(name) values(NEWID())
go
set statistics profile on
-- Why does the execution plan not create a Constant Scan for this case?
select * from tmp where id = 1 and id = 5
go
set statistics profile off
I have a table with a primary key and a clustered index on that primary key column. I need almost all columns from that table. When I write the select with the column names, the plan shows an Index Scan. How can I avoid that Index Scan and get an Index Seek instead? When I check the fragmentation of the index it shows more than 34%. Is that fragmentation OK, or do I need to reorg the index?
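For what it's worth, the commonly quoted rule of thumb (general guidance, not from the post) is to reorganize between roughly 5% and 30% fragmentation and rebuild above 30%, so at 34% a rebuild is the usual suggestion; the names below are placeholders. Note that defragmenting won't turn the scan into a seek if the query genuinely touches almost all columns:

ALTER INDEX PK_MyTable ON dbo.MyTable REORGANIZE;  -- lighter-weight, always online
ALTER INDEX PK_MyTable ON dbo.MyTable REBUILD;     -- usual advice once fragmentation exceeds ~30%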
How do I fragment an index so I can test its fragmented performance on an iSCSI LUN? I can test without an index, that's fine. I can test with a newly created index (of course that means it's not fragmented) and that's fine. But what I want to do is DELIBERATELY FRAGMENT an index to 90%+ fragmentation to test its performance.
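One way to manufacture fragmentation (a sketch with hypothetical table and column names): pack the pages full, then force page splits with row-widening updates, checking progress with sys.dm_db_index_physical_stats. Widening only fragments an index that stores the widened column (always true for the clustered index):

ALTER INDEX IX_Test ON dbo.TestTable REBUILD WITH (FILLFACTOR = 100);  -- pack pages completely full

UPDATE dbo.TestTable                      -- widen scattered rows to force page splits
SET padding = REPLICATE('x', 1000)
WHERE id % 5 = 0;

SELECT avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.TestTable'), NULL, NULL, 'LIMITED');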
I want to know more about the Clustered Index Delete operator. Is a Clustered Index Delete in the execution plan good or bad, or can its cost be neglected? Is there any way to avoid the Clustered Index Delete operator in the execution plan?
When creating a columnstore index, are there any reasons not to include all columns, besides index size of course? That is, will the index be more versatile with more columns, or should I treat it exactly like a standard index, putting in only the necessary columns, in the correct order?
When I connect to SQL 2012 through SSMS 2005, I get the message "Index outside the bounds of array". Will it be fixed if I install SSMS 2012? Do I need to remove SSMS 2005?
What are your thoughts on adding a clustered index on a (createdDate, native GUID) column pair? The data will be physically organized by the clustered index, allowing range operations to perform well. But will the GUID column have any impact (drawbacks) should it be made part of the clustered key?
The idea is that the GUID column, as part of the key, would give lookups the index support they require.
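A sketch of the design in question, with hypothetical names. Leading on the datetime keeps inserts append-only and makes date-range scans cheap, while the trailing GUID mostly serves to make the key unique; a lookup by GUID alone would still need its own nonclustered index, because the GUID is not the leading column of the clustered key:

CREATE CLUSTERED INDEX CIX_Events ON dbo.Events (CreatedDate, EventGuid);
CREATE NONCLUSTERED INDEX IX_Events_Guid ON dbo.Events (EventGuid);  -- for GUID-only lookups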
We have a large table with many columns and many indexes. One poorly performing query is having to do a key lookup when the where clause includes a particular column with no covering index.
Are you generally better off adding a new index, or adding the column to an existing index (as an included column)? The column: LAST_STATE_RESPONSE_CODE.
The Query Processor estimates that implementing the following index could improve the query cost by 88.9332%.
USE [database name]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[SERVICE_REQUEST] ([BUSINESS_PROCESS_STATUS], [LAST_STATE_RESPONSE_CODE], [CONCRETE_TYPE])
INCLUDE ([LIENHOLDER_PERFORMING_LIEN_FILING_ID], [MAKE], [YEAR], [MANUFACTURER_ID], [CLIENT_ID])
GO