Bad Performance Issues When Trying To SELECT TOP 10 * From &&<partitioned_view&&> ORDER BY 1

Nov 28, 2007

Hi.

We are now working with SQL2000sp4, planning migration to SQL2005 in few months though.

I've faced performance issues with large tables (200-500 mln rows, 50-100Gb of data+indexes)

New data are uploaded into tables once a day, around 1mln rows. Thats the only time of inserting data, during daytime tables are used for SELECTs only.

The problem that daily INSERTs are taking too much time now, because of rebuilding few indexes for the table.

I noticed that partitioning solution looks like solving this problem well. So i splitted master data table into 4 tables:

old master table:
CREATE TABLE [dbo].[DTB] (
[report_date] [smalldatetime] NOT NULL ,
[param] [char] (10) COLLATE Cyrillic_General_CI_AS NOT NULL ,
[param_value1] [decimal](18, 2) NULL ,
[param_value2] [decimal](18, 2) NULL ,
[param_value3] [smallint] NULL
)
CREATE INDEX [IX_DTB_DT_ACC] ON [dbo].[DTB]([report_date], [param])

new partition1:
CREATE TABLE [dbo].[be_data_DTB_part_2007_q1] (
[report_date] [smalldatetime] NOT NULL ,
[param] [char] (10) COLLATE Cyrillic_General_CI_AS NOT NULL ,
[param_value1] [decimal](18, 2) NULL ,
[param_value2] [decimal](18, 2) NULL ,
[param_value3] [smallint] NULL ,
CONSTRAINT [CK_be_data_DTB_part_2007_q1_report_date] CHECK ([report_date] >= '2007-Jan-01' and [report_date] <= '2007-Mar-31')
)
CREATE CLUSTERED INDEX [idc_be_data_DTB_part_2007_q1_report_date_param] ON [dbo].[be_data_DTB_part_2007_q1]([report_date], [param])

Similar are definitons for other partitions - q2, q3 and q4.

And here is partitioned view itself:
create view dbo.data_DTB
as
select * from dbo.be_data_DTB_part_2007_q1
union all
select * from dbo.be_data_DTB_part_2007_q2
union all
select * from dbo.be_data_DTB_part_2007_q3
union all
select * from dbo.be_data_DTB_part_2007_q4

I want users to access data SELECTing from view data_DTB, while I perform daily inserts right into be_data_DTB_part_2007_q4.

In general, this solution works well. For example:

Code Block
set statistics profile on
go
select * from data_DTB where report_date = '2007-Apr-16'
go
set statistics profile off
go

1290674 1 SELECT * FROM [data_DTB] WHERE [report_date]=@1
1290674 1 |--Concatenation
0 1 |--Filter(WHERE:(STARTUP EXPR(Convert([@1])<='Mar 31 2007 12:00AM' AND Convert([@1])>='Jan 1 2007 12:00AM')))
0 0 | |--Clustered Index Seek(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q1].[idc_be_data_DTB_part_2007_q1_report_date_param]), SEEK:([be_data_DTB_part_2007_q1].[report_date]=Convert([@1])) ORDERED FORWARD)
1290674 1 |--Filter(WHERE:(STARTUP EXPR(Convert([@1])<='Jun 30 2007 12:00AM' AND Convert([@1])>='Apr 1 2007 12:00AM')))
1290674 1 | |--Clustered Index Seek(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q2].[idc_be_data_DTB_part_2007_q2_report_date_param]), SEEK:([be_data_DTB_part_2007_q2].[report_date]=Convert([@1])) ORDERED FORWARD)
0 1 |--Filter(WHERE:(STARTUP EXPR(Convert([@1])<='Sep 30 2007 12:00AM' AND Convert([@1])>='Jul 1 2007 12:00AM')))
0 0 | |--Clustered Index Seek(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q3].[idc_be_data_DTB_part_2007_q3_report_date_param]), SEEK:([be_data_DTB_part_2007_q3].[report_date]=Convert([@1])) ORDERED FORWARD)
0 1 |--Filter(WHERE:(STARTUP EXPR(Convert([@1])<='Dec 31 2007 12:00AM' AND Convert([@1])>='Oct 1 2007 12:00AM')))
0 0 |--Clustered Index Seek(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q4].[idc_be_data_DTB_part_2007_q4_report_date_param]), SEEK:([be_data_DTB_part_2007_q4].[report_date]=Convert([@1])) ORDERED FORWARD)

As far as i see it checks filter parameter fitting CHECK constraint for each partition. Then it peforms clustered index seek for partition actually containing data and avoids using 3 other partitions. Thats great! This example just illustraits that partitioning actually works for me.

Unfortunately, there is another query with just awful performance on partitions comparing to single table. Lets try to get few rows entered last day:

Code Block
set showplan_text on
go
select top 10 * from DTB order by report_date desc
go
set showplan_text off
go

|--Top(10)
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([MYDB].[dbo].[DTB]))
|--Index Scan(OBJECT:([MYDB].[dbo].[DTB].[IX_DTB_DT_ACC]), ORDERED BACKWARD)

Excellent! It runs just for only few seconds. But using partitions:

Code Block
set showplan_text on
go
select top 10 * from data_DTB order by report_date
go
set showplan_text off
go

|--Top(10)
|--Merge Join(Concatenation)
|--Merge Join(Concatenation)
| |--Merge Join(Concatenation)
| | |--Sort(ORDER BY:([be_data_ DTB_part_2007_q1].[report_date] ASC, [be_data_ DTB_part_2007_q1].[param] ASC, [be_data_DTB_part_2007_q1].[param_value1] ASC, [be_data_ DTB_part_2007_q1].[param_value2] ASC, [be_
| | | |--Clustered Index Scan(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q1].[idc_be_data_DTB_part_2007_q1_report_date_param]))
| | |--Sort(ORDER BY:([be_data_DTB_part_2007_q2].[report_date] ASC, [be_data_ DTB_part_2007_q2].[param] ASC, [be_data_DTB_part_2007_q2].[prama_value1] ASC, [be_data_DTB_part_2007_q2].[param_value2] ASC, [be_
| | |--Clustered Index Scan(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q2].[idc_be_data_DTB_part_2007_q2_report_date_param]))
| |--Sort(ORDER BY:([be_data_DTB_part_2007_q3].[report_date] ASC, [be_data_DTB_part_2007_q3].[param] ASC, [be_data_DTB_part_2007_q3].[param_value1] ASC, [be_data_DTB_part_2007_q3].[param_value2] ASC, [be_data_
| |--Clustered Index Scan(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q3].[idc_be_data_DTB_part_2007_q3_report_date_param]))
|--Sort(ORDER BY:([be_data_DTB_part_2007_q4].[report_date] ASC, [be_data_DTB_part_2007_q4].[param] ASC, [be_data_DTB_part_2007_q4].[param_value1] ASC, [be_data_DTB_part_2007_q4].[param_value2] ASC, [be_data_
|--Clustered Index Scan(OBJECT:([MYDB].[dbo].[be_data_DTB_part_2007_q4].[idc_be_data_DTB_part_2007_q4_report_date_param]))

As one can see that��s just awful . When I make graphical execution plan with Ctrl+L it says costs for Sort operations are thousands. I didn��t run this query to check statistics profile, because on our server it will run for hours.

I found a topic regarding this problem: http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=1270479&SiteID=17

Mostafa Elhemali describes exactly my problem in the last post. I also though that getting top 10 * from partitioned view shouldn��t be a problem �� it��s quite obvious just to grab top 10 from each partition and then find top 10 amongst them. Looks like it doesn��t work this way though.

So the question is. Is there any new workarounds for this problem? Or maybe it is already solved in latest patches for SQL2005? I know that SQL2005 introduces new way of partitioning tables, maybe the problem will go if using SQL2005 partitioned tables instead of oldstyle partitioned views?

Thank you.

p.s. Upon reviewing my post i noticed that issued ORDER BY report_date DESC against unpartitioned table, and ORDER BY report_date against partitioned view. Well, specifying ORDER BY report_date DESC for partitioned view gives similar results, except for few ASCs are replaced with DESCs.

View 2 Replies

Bad Performance Issues When Trying To SELECT TOP 10 * From &&<partitioned_view&&> ORDER BY 1

Default Sort Order - Open Table - Select Without Order By

Specify Order For Select Results, Order By: Help!

Order Of Joins And Performance

Query Performance With Order By Clause?

INSERT-SELECT Depending On The Select:ed Order

Multiple Tables Select Performance - SQL 2005 - Should It Take 90 Seconds For A Select?

Server Query Execution - Where Filtration Order - Performance Impact

Select Top And Order By

Order By With Select Into

UNION: Select Order

How Do I Use Order By When I Use Select Distinct.

SELECT DISTINCT W/ORDER BY

How To Get Row From Select Query In Order

Select Top Alters Order

Order By In A INSERT INTO..SELECT

Select Parameter Order

Tricky SELECT And ORDER BY

Select Like And Order Results

INSERT INTO As SELECT With ORDER BY

Select At Random Order From Database

SELECT DISTINCT Type ORDER BY

Bug In SELECT TOP (100) PERCENT With ORDER BY In SQLExpress?

How To Select First Missing Number In Order

Can You Use Column Order Value In Select Statement

Select As VarChar But Order By DateTime

ORDER BY Columns In SELECT List?

Select Records By Order Of Insertion???

How Can I Use SELECT DISTINCT And Maintain The Original Order

Select Customer Last Inserted Order Details

SELECT DISTINCT MyId Order By MyDate

Multiple Order By Statements In Union Select

Select Records From Two Tables, Plus Page And Order Them?