Large Table, Really Slow Queries
Jul 26, 2007
I'm working with a table with about 60 million records. This monster is growing every minute of the day as well, by 200,000 - 300,000 records/day. It's 11 columns wide, and has one index on a datetime column. My task is to create some custom reports based on three of these columns, including the datetime one.
The problem is response time. Any query executed on this table takes forever--anywhere between 30 seconds and 4 minutes. Queries such as this one below, as simple as it is, can take a minute or more:
select
count(dt_date) as Searches
from
SearchRecords
where
datediff(day,getdate(),dt_date)=0
As the table gets larger and larger, the response time is only going to get worse. Long story short, what are my options for getting query times down to just a few seconds on a table this big? So far the best I can come up with is to index any other appropriate columns (of which there is one for sure, maybe two).
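Part of the problem, I suspect, is that DATEDIFF(day, GETDATE(), dt_date) = 0 has to be evaluated against every row, so the index on dt_date never gets used. If I have this right, a range version of the same count would look something like this (a sketch against the same table):
declare @today datetime, @tomorrow datetime
set @today = dateadd(day, datediff(day, 0, getdate()), 0) -- midnight this morning
set @tomorrow = dateadd(day, 1, @today)
select
count(*) as Searches
from
SearchRecords
where
dt_date >= @today and dt_date < @tomorrow -- range predicate, so the dt_date index can seek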
View 6 Replies
May 31, 2007
Hi,
I currently have a large table (35 million rows, over 80GB). I have one varchar(max) column on the table that is used in the fulltext index.
To query the complete index is fast, for example:
SELECT 'ipod', COUNT(*)
FROM CONTAINSTABLE(MyDB.dbo.Contents, [Body], 'ipod') CT
This took 70 seconds (which I can live with). However, I seldom run queries like this, most are more like:
SELECT 'ipod', COUNT(*)
FROM CONTAINSTABLE(MyDB.dbo.Contents, [Body], 'ipod') CT
JOIN Pages ITP ON ITP.PageID = CT.[Key]
JOIN Feeds ITF ON ITP.IPID = ITF.IPID
JOIN Buyers ITB ON ITB.IBID = ITF.IBID
WHERE ITB.ID IN (1342,246)
These queries are much slower (this example took 17 minutes). I understand that FT searches the index and returns all rows that match the query to SQL. SQL then performs the joins and counts only the correct results. (Correct me if I'm wrong here).
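To spell out my understanding as two explicit steps (same tables as the query above):
-- step 1: full-text hands back every 'ipod' hit...
SELECT CT.[Key]
INTO #ftMatches
FROM CONTAINSTABLE(MyDB.dbo.Contents, [Body], 'ipod') CT
-- step 2: ...and only then are the joins and the ID filter applied
SELECT COUNT(*)
FROM #ftMatches M
JOIN Pages ITP ON ITP.PageID = M.[Key]
JOIN Feeds ITF ON ITP.IPID = ITF.IPID
JOIN Buyers ITB ON ITB.IBID = ITF.IBID
WHERE ITB.ID IN (1342, 246)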
One solution I've seen to this is to put data or "tags" into the FT column - so my Body column would become something like:
'{ID:1342}' + [Body]
That sounds like a very good idea. I could then change the 2nd query above to be:
SELECT 'ipod', COUNT(*)
FROM CONTAINSTABLE(MyDB.dbo.Contents, [Body], '("ID:1342" OR "ID:246") AND "ipod"') CT
That all works well until I want to select 1000 different IDs, because the FT query will become very long and complex. Also, I'm only including one column (ID) in this example - but I have about 7 or 8 columns that I would need to include in these "tags". Querying multiple columns becomes very complex very quickly, and no doubt I will hit a query length limit at some point.
If anyone has any other suggestions to the above I'd love to hear them. Another thought I'm having is to partition the table. I can find very little online about how FT behaves on partitioned tables - I fear it behaves exactly the same. What I'd like to think is that I could partition the table on an ID, say 100 per partition or something, and then fulltext would only search the relevant partitions. If it behaves like this it may work. If no-one knows then I'll give it a go, but this will take me a while due to the table size - so I'm hoping one of you clever lot knows!
Many thanks for any advice.
Simon
View 2 Replies
View Related
Jul 20, 2005
I am having performance issues with a SQL query in Access. My query is accessing and joining several tables (one very large one). The tables are linked via ODBC. The client submits the query to the server, which is several states away. It appears the query is retrieving gigs of data from the table and processing the joins on the client. Is there a way to perform more of the work on the server, thereby minimizing the amount of extraneous table data moving across the network and improving performance (woefully slow, about 6 hours)?
View 3 Replies
View Related
May 20, 2008
Hello,
We have created several Table-Valued User Defined Functions in a Production SQL Server 2005 DB that are returning large (tens of thousands of) rows obtained through a web service. Our code is based on the MSDN article Extending SQL Server Reporting Services with SQL CLR Table-Valued Functions.
What we have found in our implementations of variations of this code on three separate servers is that as the rowset grows, the length of time required to return the rows grows exponentially. With 10 columns, we have maxed out at approximately 2,500 rows. Once our rowset hit that size, no rows were being returned and the queries were timing out.
Here is a chart comparing the time elapsed to the rows returned at that time for a sample trial I ran:
Sec / Actual Rows Returned
0 0
10 237
20 447
30 481
40 585
50 655
60 725
70 793
80 860
90 940
100 1013
110 1081
120 1115
130 1151
140 1217
150 1250
160 1325
170 1325
180 1430
190 1467
200 1502
210 1539
220 1574
230 1610
240 1645
250 1679
260 1715
270 1750
280 1787
290 1822
300 1857
310 1892
320 1923
330 1956
340 1988
350 1988
360 2022
370 2060
380 2094
390 2094
400 2130
410 2160
420 2209
430 2237
440 2237
450 2274
460 2274
470 2308
480 2342
490 2380
500 2380
510 2418
520 2418
530 2451
540 2480
550 2493
560 2531
570 2566
It took 570 seconds (just over 9 1/2 minutes) to return 2566 rows.
The minute breakdown during my trial is as follows:
1 = 655 (+ 655)
2 = 1081 (+ 426)
3 = 1325 (+244)
4 = 1610 (+285)
5 = 1822 (+212)
6 = 1988 (+166)
7 = 2160 (+172)
8 = 2308 (+148)
9 = 2451 (+143)
As you can tell, except for a few discrepancies in the resulting row count at minutes 4 and 7 (I will attribute these to timing, as the results grid in SQL Management Studio was being updated only once every 5 seconds or so), as time went on, fewer and fewer rows were being returned in a given time period. This was a "successful" run as the entire rowset was returned, but on more than several occasions we have reached the limit and had 0 new rows per minute towards the end of execution.
Allow me to explain the code in further detail:
[SqlFunction(FillRowMethodName = "FillListItem")]
public static IEnumerable DiscoverListItems(...)
{
ArrayList listItems = new ArrayList();
SPToSQLService service = new SPToSQLService();
[...]
DataSet itemQueryResult = service.DoItemQuery(...); // This is a synchronous call returning a DataSet from the Web Service
//Load the DS to the ArrayList
return listItems;
}
public static void FillListItem(object obj, out string col1, out string col2, out string col3, ...)
{
// each row handed back by DiscoverListItems is an ArrayList of column values
ArrayList item = (ArrayList) obj;
col1 = item.Count > 0 ? (string) item[0] : "";
col2 = item.Count > 1 ? (string) item[1] : "";
col3 = item.Count > 2 ? (string) item[2] : "";
[...]
}
As you will notice, the web service is called, and the DataSet is loaded into an ArrayList object (containing ArrayList objects), before the main ArrayList is returned by the UDF method. There are 237 rows returned within 10 seconds, which leads me to believe that all of this has occurred within 10 seconds. The DiscoverListItems method has executed completely and the ArrayList is now being iterated through by the code calling the FillListItem method. I believe that this code is causing the result set to be returned at a decreasing rate. I know that the DiscoverListItems code is only being executed once and that the Web Service is only being called once.
Now a lot of my larger queries ( > 20 000 rows) have timed out because of this behaviour, and my workaround was to customize my web service to page the data in reasonable chunks and call my UDFs in a loop using T-SQL. This means calling the Web Service up to 50 times per query in order to return the result set.
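Roughly what the T-SQL side of that workaround looks like (a sketch only; the @pageIndex/@pageSize parameters and the #results columns stand in for my real signature, which is trimmed above):
CREATE TABLE #results (col1 nvarchar(255), col2 nvarchar(255), col3 nvarchar(255))
DECLARE @pageIndex int, @pageSize int
SET @pageIndex = 0
SET @pageSize = 500
WHILE 1 = 1
BEGIN
-- each pass calls the web service for one page of data
INSERT INTO #results (col1, col2, col3)
SELECT col1, col2, col3
FROM dbo.DiscoverListItems(@pageIndex, @pageSize)
IF @@ROWCOUNT < @pageSize BREAK
SET @pageIndex = @pageIndex + 1
END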
Surely someone else who has used Table-Valued UDFs has come across this problem. I would appreciate some feedback from someone in the know as to whether I'm doing something wrong in my code, or how to optimize SQL Server properly to allow for better performance with CLR functions.
Thanks,
Dragan Radovic
View 7 Replies
View Related
Nov 29, 2000
We are inserting into a table which includes an identity primary key column. When the table gets really large (i.e. 1.5 million records), the performance of the inserts degrades.
I noticed that when we insert into the table an exclusive lock on the table is obtained. Do inserts into tables with identities always lock the table?
Given the table size is unavoidable, does anyone have a suggestion to improve the performance?
Thanks,
Matt
View 6 Replies
View Related
Mar 15, 2001
Hi,
Some of my queries are running too slowly. They're taking as long as 30 secs; earlier the same query was taking less than 5 secs.
I understand the db has grown, BUT I do not know how to look at this query - where should I start from and what should I look into?
It is on production server.
the db size is 15GB and unallocated is 9GB.
log space used is 4%.
TIA.
View 1 Replies
View Related
Oct 10, 2002
Howdy. I have a table in my DB that has about 2 million records. Searches are taking 15 - 30 seconds depending on the number of records I am returning. Is this normal? The machine is NT 4 sp6a, dual PIII 866's with 1 GB of RAM, on RAID5 SCSI disk. This seems like a long time to me. What kind of performance should I expect? Any kind of tuning steps I can take?
Thanks
Shane
View 8 Replies
View Related
Jan 5, 2005
Hi All,
Am very new to SQL server so don't really understand what affects the speed of queries. I have the two queries below, which are nearly the same apart from one having a right join and the other not. They both return about 5000 records, and I am running this query from an Access database with an ODBC link to SQL server. What I don't understand is that it takes about 8 seconds for the query with the right join to return the records and only about 4 seconds for the one without. What I'm after really is just some general advice on how to build fast queries, and any advice on the two queries below would be nice. Thanks
SELECT Employees.Name, Calls.CallDate, Calls.CallTime, Calls.Callername, Contacts.CompanyID, Contacts.ContactID, Calls.CallerNumber, Calls.CallerCompany, Calls.ActionTakenID, Calls.OperatorID, Calls.Confirmed, Calls.Charged, Calls.Notes, Company.CompanyName, Operators.Operatorname, Calls.CallID, Calls.ShortMessage
FROM (Contacts INNER JOIN Company ON Contacts.CompanyID = Company.CompanyID) INNER JOIN (Operators INNER JOIN (Employees RIGHT JOIN Calls ON Employees.EmployeesID = Calls.EmployeesID) ON Operators.ID = Calls.OperatorID) ON Contacts.ContactID = Calls.ContactID
WHERE (((Contacts.ContactID)=1442))
ORDER BY Calls.CallDate DESC , Calls.CallTime DESC;
SELECT Employees.Name, Calls.CallDate, Calls.CallTime, Calls.Callername, Contacts.CompanyID, Contacts.ContactID, Calls.CallerNumber, Calls.CallerCompany, Calls.ActionTakenID, Calls.OperatorID, Calls.Confirmed, Calls.Charged, Calls.Notes, Company.CompanyName, Operators.Operatorname, Calls.CallID, Calls.ShortMessage
FROM (Contacts INNER JOIN Company ON Contacts.CompanyID = Company.CompanyID) INNER JOIN (Operators RIGHT JOIN (Employees RIGHT JOIN Calls ON Employees.EmployeesID = Calls.EmployeesID) ON Operators.ID = Calls.OperatorID) ON Contacts.ContactID = Calls.ContactID
WHERE (((Contacts.ContactID)=1442))
ORDER BY Calls.CallDate DESC , Calls.CallTime DESC;
View 6 Replies
View Related
Jul 28, 2004
Hi All,
I'm currently in the middle of building quite a large CMS using ASP.NET and MSSQL2K and have begun to question whether the number of queries I am using to build one page is too many.
For one page (View Forum) I am getting all of the templates and checking access then pulling a list of threads, getting the first and last posts, then user info for the first and last posts... anyway to view 10 threads on the page the number of queries comes to about 54 and the page takes 0.064 seconds to load.
My question is, is this too many queries to be running for a single page load? All queries are using Stored Procedures.
Thanks Guys.
View 3 Replies
View Related
May 29, 2008
Hi guys,
I am asking this question on behalf of a friend. I have little knowledge of SQL 2005 but my friend is quite knowledgeable, although this is the first time he is dealing with a large database for a client. So here's the story.
His client has a database containing 1.5 million books. Now he is setting up a website which will enable users to search books. Searching by ISBN is no problem as it only takes 1 second. The problem is that searching by Title takes more than 20 seconds, which is unacceptable. My friend has only done smaller databases, and he just recently thought of implementing indexing and is now looking for other ideas.
Each row contains book details such as Title, Author1, Author2, Author3, Publisher, Publication Date, ISBN, etc.
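As I understand it, the indexing he has just started trying amounts to something like this (table, column and key names below are my guesses from his description):
-- a plain index on the title column
CREATE NONCLUSTERED INDEX IX_Books_Title ON dbo.Books (Title)
-- a plain index only helps searches that match from the start of the title;
-- for "word anywhere in the title" searches a fulltext index is probably needed
CREATE FULLTEXT CATALOG BookCatalog
CREATE FULLTEXT INDEX ON dbo.Books (Title, Author1, Author2, Author3)
KEY INDEX PK_Books ON BookCatalog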
Can anyone who is more experienced with large databases share some design ideas with me? His client is aiming for 8 seconds or less.
Thanks in advance!
View 14 Replies
View Related
Aug 6, 2007
Hi, I have absolutely no knowledge of PHP or SQL .... I moderate a PHPBB forum at www.savingshelterpets.com
Our web host (SiteGround) has taken our site down temporarily because we are overloading the server. I have no idea how to fix the problem, so hopefully someone here can help me out!
PHP version 4.4.4
MySQL version 5.0.27-standard-log
Here's the info sent to me by SiteGround (I don't understand a word of it!):
quote:Upon further investigation, it turned out that the following queries in your account are slow and heavily consume server resources:
# User@Host: savingsh_phpb1[savingsh_phpb1] @ localhost []
# Query_time: 4 Lock_time: 0 Rows_sent: 1 Rows_examined: 1284
use savingsh_phpbb2;
SELECT user_id, username, user_password, user_active, user_level, user_login_tries, user_last_login_try
FROM phpbb_users
--
delete from rs_stat_ip where platnost_do<'2007-08-03 16:49:43';
# User@Host: savingsh_phpb1[savingsh_phpb1] @ localhost []
# Query_time: 5 Lock_time: 3 Rows_sent: 1 Rows_examined: 0
use savingsh_phpbb2;
SELECT * FROM phpbb_optimize_db;
# User@Host: binaryte_lhlp1[binaryte_lhlp1] @ localhost []
--
# Time: 070803 16:50:27
# User@Host: savingsh_phpb1[savingsh_phpb1] @ localhost []
# Query_time: 4 Lock_time: 2 Rows_sent: 1 Rows_examined: 0
use savingsh_phpbb2;
SELECT t.topic_id, t.topic_title, t.topic_status, t.topic_replies, t.topic_time, t.topic_type, t.topic_vote, t.topic_last_post_id, f.forum_name, f.forum_status, f.forum_id, f.auth_view, f.auth_read, f.auth_post, f.auth_reply, f.auth_edit, f.auth_delete, f.auth_sticky, f.auth_announce, f.auth_pollcreate, f.auth_vote, f.auth_attachments
FROM phpbb_topics t, phpbb_forums f
In order to have the limitations removed, please optimize your script.
View 3 Replies
View Related
Jul 17, 2007
Some queries take a long time to complete.
Setup is:
- SQL Express SP2
- Windows Vista Business
- 2 GB RAM
- Core 2 Duo processor
- Connecting to (local) server with SQL Authentication
- only 1 Instance of MSSQLSERVER
Simple queries (SELECT * FROM TableName) where the table has only a few records may take up to 30 seconds or more to execute. This slowness is consistent for certain tables. Other, much larger tables run queries fine.
If a different computer logs in to the same server, queries provide instantaneous results.
View 4 Replies
View Related
Feb 14, 2008
Hi,
I am struggling with the response time for a simple count on a fulltext search; it is too slow.
Even using the simplest query on a good server (64-bit dual Opteron, 4GB RAM, high-speed 16-disk RAID storage):
select count(*) from content_books where contains(searchData,'"english"')
it takes 4 seconds to count the roughly 500,000 results. I have removed all the joins with real table data so that the query runs only inside the fulltext engine.
I would expect this to be down to 4 milliseconds. Isn't it just getting the size of the "english" word result index?
It seems the engine is going through all the results, because if I do a more complex search that returns fewer results, the performance is better.
Any clues on how to do this faster? I never read the thousands of records, BUT I need to count them...
Thank you very much.
View 2 Replies
View Related
Feb 27, 2008
Hi All
I am stuck with a slow-performing query. Can somebody please help me with how to analyze slow-performing queries?
View 6 Replies
View Related
Nov 26, 2007
Hi,
I want to put the query given below into a variable:
"if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[ <POS_MONTH>]]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
DROP TABLE [dbo].[ <POS_DATE>]
GO
SELECT * INTO [dbo].[<POS_DATE>] ]
FROM SG_POS_Template
WHERE 1 = 0;
GO"
Where [<POS_DATE>] is a parameter whose value will be assigned dynamically... can anybody please help me out?
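What I'm picturing is building the statement into a variable and executing it, something like this (a sketch, with a made-up value standing in for the dynamic part):
DECLARE @POSDate sysname
DECLARE @sql nvarchar(max)
SET @POSDate = N'POS_20071126' -- whatever value gets assigned at run time
SET @sql = N'if exists (select * from dbo.sysobjects '
         + N'where id = object_id(N''[dbo].[' + @POSDate + N']'') '
         + N'and OBJECTPROPERTY(id, N''IsUserTable'') = 1) '
         + N'drop table [dbo].[' + @POSDate + N']; '
         + N'select * into [dbo].[' + @POSDate + N'] from SG_POS_Template where 1 = 0;'
EXEC sp_executesql @sql -- note: GO cannot appear inside dynamic SQL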
View 5 Replies
View Related
Oct 12, 2005
I've been using MS-SQL Server for many years but never come across this problem before.
When I try and run a very simple query from Query Analyzer it takes a LONG time. Even when there are no tables involved!
Even:-
select 1
go
takes 28 seconds to return '1' when running against the local server. i.e. both QA and the Server are running on the same machine.
Can anyone help explain how to get my performance back! Thanks.
View 1 Replies
View Related
May 30, 2007
Hi,
I have successfully created a Stored Procedure which runs in under 2 seconds locally.
However, when I run the same proc from another machine on the LAN, the response times vary from 5 secs to over 40 secs, and it even occasionally times out.
My server is SQL 2005 Dev Edition (32 Bit) running on a Dual Core Box with 2GB memory.
Any Ideas why this would be happening?
View 6 Replies
View Related
Apr 11, 2006
I have some VB.NET code that starts a transaction and after that executes a lot of queries one by one. Somehow, when I take out the transaction part, my queries get executed in around 10 min. With the transaction in place it takes more than 30 min on one query and then I get a timeout.
I have checked sp_lock myprocessid and I've noticed there are a lot of exclusive locks on different objects. Using sp_who I could not see any deadlocks.
I even tried to set the isolation level to READ UNCOMMITTED and still have the same problem.
As I said, once I execute my queries without being in a transaction everything works great.
Can you help me to find out the problem?
Thanks,
Laura
View 11 Replies
View Related
Jul 4, 2007
Hi,
We are running SQL Server 2005 Ent Edition with SP2 on a Windows 2003 Ent. Server SP2 with an Intel E6600 dual-core CPU and 4GB of RAM. We have a C# application which performs a large number of calculations in a loop. The application first loads the transactions that need to be updated and then goes through the rows one by one, queries another table to get some values, and updates the transaction.
I have set a limit of 2GB of RAM for SQL Server, and when I run the application it performs 5 record updates (the process described above) per second. After roughly 10,000 records, the application slows down to about 1 record per second. I have tried to examine the activity monitor; however, I can't find anything that might indicate what's causing this.
I have read that there are some known issues with Hyper-Threaded CPUs; however, since my CPU is dual-core, I do not know if the issue applies to those CPUs too, and I have no way to disable one core in the BIOS.
The only thing that I have noticed is that if I change the Max Degree of Parallelism when the server slows down (i.e. from 0 to 1 and then back to 0), the server speeds up for another 10,000 record updates and then slows down. Does anyone have an idea of what's causing it? What does the property change do that makes the server speed up again?
If there is no solution for this problem, does anyone know if there is a stored procedure or anything else that can be used programmatically to speed up the server when it slows down? (This is not the optimal solution; however, I will use it as a workaround.)
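For reference, the change I make by hand is the server-level max degree of parallelism setting; I assume the programmatic equivalent is simply this (it is an advanced option, hence the first step):
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'max degree of parallelism', 1 -- restrict queries to one scheduler
RECONFIGURE
EXEC sp_configure 'max degree of parallelism', 0 -- back to letting SQL Server decide
RECONFIGURE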
Any advice will be greatly appreciated.
Thanks,
Joe
View 3 Replies
View Related
Jul 2, 2015
So async cursor population is supposed to create the cursor and return the cursor id quickly, while the server works on populating the results asynchronously. For a keyset-driven cursor, SQL Server stores the key sets in tempdb, which it then uses to fetch data for cursor results. Anyway, this works fine for smaller tables, but I'm finding that for large result sets the async cursor population is very slow and indeed seems to approximate synchronous time. The wait stat I get while it is running (supposedly asynchronously) is TRANSACTION_MUTEX.
Example:
--enable async cursor
exec dbo.sp_configure 'cursor threshold', 0; reconfigure;
declare @cursor int, @stmt nvarchar(max), @scrollopt int, @ccopt int, @rowcount int;
--example of giant result set
set @stmt = 'select * from sys.all_objects o1, sys.all_objects o2';
[code]...
Note that using the SQL "select * from sys.all_objects o1" is much faster than "select * from sys.all_objects o1, sys.all_objects o2". However, if cursor population is async, I'd expect the time to return a cursor id to be similar between the two.
View 7 Replies
View Related
Sep 10, 2007
I have a 2GHZ cpu with 1GB of RAM. I occasionally see very slow (long) queries against a local SQL Server 2005 Express (SP2) database. The issue occurs against different SQL queries, but all queries are rather basic select statements. Perfmon shows that the SQL Server counter for "MEMORY GRANT QUEUE WAIT Avg MS" gets extremely high (25000+ ms). Perfmon also shows that PAGING is not occurring, and the system is not under unusual stress. The problem is not reproducible with MSDE.
Has anyone seen this issue, or have any recommendations for a next course of action?
View 1 Replies
View Related
Mar 30, 2015
Our monitoring tool shows that our production system is periodically experiencing a high paging rate - up to 800 memory pages/sec. How can I find out which particular queries, stored procedures, or processes initiate this?
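The kind of per-query breakdown I'm after is something like ranking the cached plans by reads (a sketch using the plan-cache DMVs, if that is even the right place to look):
SELECT TOP (20)
    qs.total_physical_reads,
    qs.total_logical_reads,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text)
          ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_physical_reads DESC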
View 3 Replies
View Related
Jul 7, 2014
I have a remote server with SQL server 2014 instance on it. There is nothing else running on the SQL Server box(dedicated SQL box). There is only one instance of SQL 2014 on the server. No other versions of SQL server are on the server.
Issue:
1. When I execute a query connecting to the SQL server instance through my local SSMS, the query executes in 30 secs.
2. When I connect to the remote server through a Windows RDP session and execute the same query in SSMS (on the server), the query executes in 1 minute.
View 9 Replies
View Related
Jun 25, 2007
Hi, all experts here,
I am wondering if tempdb stores all results temporarily whenever I query a large fact table with over 4 million records that joins to another dimension table? Each time I run the query, tempdb grows to nearly 1GB, which nearly uses up all the space on my local system drive, and as a result performance is totally down. Is there any way to fix this problem? Thanks a lot in advance and I am looking forward to hearing from you shortly for your kind advice.
With best regards,
Yours sincerely,
View 11 Replies
View Related
Mar 7, 2014
Why do indexes on a table slow down DML operations on that table? What is the exact reason?
View 5 Replies
View Related
Aug 27, 2007
Hi everyone,
I use sql 2005. What is the best practice for dealing with a large table (more than a million rows)? Table partitioning, views, or something else?
Can you please give some suggestions? It will be very helpful if you can post some references or examples.
Thank you!
View 12 Replies
View Related
Nov 16, 2007
I am developing an application that has a table with lots of records (network traffic), but the data is summarized every so often to create summary records (old records are deleted). The problem is that I have a PK based on an autoincrement ID (int) that will run out of numbers. However, this ID is not referenced anywhere (it's not a foreign key from another table, it's not used for deletion, and there are no updates to this table whatsoever).
So my possibilites are:
1.- reseed the id when it is about to run out.
2.- make the id bigint
3.- remove the id and change the PK to 2 other fields
4.- remove the id and without PK
I am leaning toward option 4, because I do not see the need for a PK, but I understand that it is quite out of the norm. So I would like to hear from other people (I do not have much experience with DBs).
I also like option 3. I already have an index on one of the other fields (time).
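For option 1, I assume the reseed itself is just something like this (made-up table name; it only works because the old, already-summarized rows get deleted, so the low id values are free again):
DBCC CHECKIDENT ('dbo.NetworkTraffic', RESEED, 0)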
Any input will be appreciated.
Claudio Robles
View 7 Replies
View Related
Jul 23, 2005
If I use BCP to export a very large table, will that table be blocked for writes during the export process? I don't want to prevent users from accessing that table during the bcp process. Thank You, TFD.
View 1 Replies
View Related
May 26, 2008
Hi,
I have this page that uploads PDFs to a table. In principle this works fine.
Until I try to upload large files (3 to 4 MB), when I get TimeOut problems. I need to upload even larger files than that (don't really know as of yet what users are going to come up with). Now some people say it is not possible to exceed a limit of about 4 MB, but that there is a workaround by changing something in the web.config file. Can somebody give me info about that? (I am quite a novice really.) I tried to change it like this, but to no avail:
<system.web>
<httpRuntime maxRequestLength="102400" enable="True" requestLengthDiskThreshold="102400" useFullyQualifiedRedirectUrl="True" executionTimeout="102400" />
</system.web>
Thanks for any help!
View 2 Replies
View Related
Mar 3, 2004
I have a table of approx 1/2 million rows.
On a nightly basis, this table gets rebuilt in a temporary database. Once the table has been built and scrubbed, I need to move it into our webserver's db.
I'd like to do this with minimal interruption to the website.
Possible techniques:
1) I could set up a DTS package to copy the table object overwriting the destination table
2) I could export to a flat file and then bulk import into the live table (after truncating it)
3) I could run a process to update smaller chunks of data at a time running delete queries and insert queries.
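Roughly what I have in mind for option 3 (a sketch; the table and column names are made up, and rows that changed in place would still need a separate update pass):
SET ROWCOUNT 5000 -- work in 5,000-row chunks so locks stay short
-- remove rows that are no longer in the scrubbed copy
WHILE 1 = 1
BEGIN
DELETE FROM WebDB.dbo.LiveTable
WHERE KeyCol NOT IN (SELECT KeyCol FROM StagingDB.dbo.ScrubbedTable)
IF @@ROWCOUNT = 0 BREAK
END
-- add rows that are new in the scrubbed copy
WHILE 1 = 1
BEGIN
INSERT INTO WebDB.dbo.LiveTable (KeyCol, Col1, Col2)
SELECT s.KeyCol, s.Col1, s.Col2
FROM StagingDB.dbo.ScrubbedTable s
WHERE s.KeyCol NOT IN (SELECT KeyCol FROM WebDB.dbo.LiveTable)
IF @@ROWCOUNT = 0 BREAK
END
SET ROWCOUNT 0 -- back to normal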
Anybody have a thought on the best way to do this so that the web users would be virtually unaware that anything was happening?
View 4 Replies
View Related
Mar 1, 2002
Hi,
I am absolutely innocent as far as T-SQL is concerned. I need to detect all duplicates (key consists of 5 fields) in the table and delete the duplicates.
I tried different approaches like joins etc., but no luck.
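Is the right idea something like copying out the distinct rows and swapping them back in? A sketch with made-up names (and it assumes duplicate rows are identical in every column):
SELECT DISTINCT key1, key2, key3, key4, key5, otherCol
INTO #dedup
FROM MyTable
TRUNCATE TABLE MyTable -- or DELETE, if foreign keys prevent a truncate
INSERT INTO MyTable (key1, key2, key3, key4, key5, otherCol)
SELECT key1, key2, key3, key4, key5, otherCol
FROM #dedup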
Any help is appreciated
Thanks
View 2 Replies
View Related
Aug 6, 2004
OK, I imported 680 million records into an unindexed table. That went well.
Then, I went into Enterprise Manager and added a two column non-unique clustered index to that table to speed access.
It's been running for ~36 hours and I have no idea when it will complete. I have deadlines that I'm going to miss and am very nervous; what can I do?
SQL Server 2000 Enterprise Edition (8.00.818 - sp3 + hotfixes)
Dual 3Ghz Xeon (two physical CPUs each have HyperThreading enabled)
Windows 2000 SP4
4GB RAM (although I just noticed the 3GB OS switch wasn't on)
SCSI boot drive
tempdb, data, and transaction log are on a FibreChannel RAID SAN
Help! Thanks in advance!
View 8 Replies
View Related
Nov 14, 2007
Hi folks! I'm looking for advice on partitioning a large table. In the DDL below I've changed names to protect the guilty.
My table has this schema:
CREATE TABLE [dbo].[BigTable]
(
[TimeKey] [int] NOT NULL,
[SegmentID] [int] NOT NULL,
[MyVal] [tinyint] NOT NULL
) ON [BigTablePS1] (TimeKey) -- see below for partition scheme
alter table [dbo].[BigTable] add constraint [PK_BigTable]
primary key (timekey asc, SegmentID asc)
-- will evaluate whether this one is needed, my thinking is yes
-- based on the expected select queries.
create index NCI_SegmentID on BigTable(SegmentID asc)
The TimeKey column is sort of like a unix time. It's the number of minutes since 2001/01/01, but always floored to a 5 minute boundary, so only multiples of 5 are allowed.
Now, this table will be rather big. There are about 20k possible SegmentIDs. For every TimeKey from 2008/01/01 to 2009/01/01 (12 months), I'll have on the order of 20000 rows, one for each SegmentID.
For the 12 month period, there are 365*24*60/5=105120 possible TimeKey values. So the total rowcount is over 2 billion. (20k * 105120)
Select queries are expected to be something like this:
-- fetch just one particular row...
select MyVal from BigTable
where TimeKey=5555 and SegmentID=234234
--fetch for a certain set of SegmentID and a particular time...
select
b.SegmentID
,b.MyVal
from BigTable b
join OtherTable t on t.SegmentID=b.SegmentID
where b.TimeKey=5555
and t.SomeColumn='SomeValue'
Besides selects, I also need to be able to efficiently issue update statements against the table with new values in the MyVal column, based on a range of TimeKey values (a contiguous span of a few days) and sets of about 1000 SegmentIDs. Updates would always look like this:
update t
set t.MyVal=p.MyVal
from BigTable t
join #myTempTable p on t.TimeKey=p.TimeKey
and t.SegmentId=p.SegmentId
where #myTempTable would have on the order of 1000*24*60 rows in it, all with contiguous TimeKey values, and about 1000 different SegmentID values. #myTempTable also has a clustered pk on (timekey asc, SegmentId asc).
After the table is loaded, it would never get any inserts or deletes. only selects and updates.
Given the size, and the nature of the select and update queries, this table seems like a good candidate for partitioning. I'm thinking it makes sense to partition on TimeKey.
So my question is, is it stupid to create a separate partition for each day in the year long span of TimeKeys this table covers? That would mean 365 partitions in the partition function and partition scheme. Something like this:
CREATE PARTITION FUNCTION [BigTableRangePF1] (int)
AS RANGE LEFT FOR VALUES
(
3680640 + 0*1440, -- 3680640 is the number of minutes between 2001/01/01 and 2008/01/01
3680640 + 1*1440,
3680640 + 2*1440,
3680640 + 3*1440,
...snip...
3680640 + 363*1440,
3680640 + 364*1440,
3680640 + 365*1440
);
GO
CREATE PARTITION SCHEME [BigTablePS1]
AS PARTITION [BigTableRangePF1]
TO
(
[PRIMARY],[PRIMARY],[PRIMARY],
...snip...
[PRIMARY],[PRIMARY],[PRIMARY]
);
GO
Does anyone have any experience with partitioned tables with so many partitions? Is a few hundred partitions too many? From my understanding of partitions, it seems like having so many will be OK. Is it somehow worse than having hundreds of tables in a database?
Even with one partition for each day, I'll still have 24*60*20000/5 ~ 5m rows in each one.
5m seems like a manageable number. 2b does not.
elsasoft.org
View 2 Replies
View Related