Partitioning A Large Table - How Much Is Too Much?
Nov 14, 2007
Hi folks! I'm looking for advice on partitioning a large table. In the DDL below I've changed names to protect the guilty.
My table has this schema:
CREATE TABLE [dbo].[BigTable]
(
[TimeKey] [int] NOT NULL,
[SegmentID] [int] NOT NULL,
[MyVal] [tinyint] NOT NULL
) ON [BigTablePS1] (TimeKey) -- see below for partition scheme
alter table [dbo].[BigTable] add constraint [PK_BigTable]
primary key (timekey asc, SegmentID asc)
-- will evaluate whether this one is needed, my thinking is yes
-- based on the expected select queries.
create index NCI_SegmentID on BigTable(SegmentID asc)
The TimeKey column is sort of like a unix time. It's the number of minutes since 2001/01/01, but always floored to a 5 minute boundary. so only multiples of 5 are allowed.
Now, this table will be rather big. There are about 20k possible SegmentIDs. For every TimeKey from 2008/01/01 to 2009/01/01 (12 months), I'll have on the order of 20000 rows, one for each SegmentID.
For the 12 month period, there are 365*24*60/5=105120 possible TimeKey values. So the total rowcount is over 2 billion. (20k * 105120)
Select queries are expected to be something like this:
-- fetch just one particular row...
select MyVal from BigTable
where TimeKey=5555 and SegmentID=234234
--fetch for a certain set of SegmentID and a particular time...
select
b.SegmentID
,b.MyVal
from BigTable b
join OtherTable t on t.SegmentID=b.SegmentID
where b.TimeKey=5555
and t.SomeColumn='SomeValue'
Besides selects, also I need to be able to efficiently issue update statements against the table with new values in the MyVal column based on a range of TimeKey values (a contiguous span of a few days) and sets of about 1000 SegmentID. updates would always look like this:
update t
set t.MyVal=p.MyVal
from BigTable t
join #myTempTable p on t.TimeKey=p.TimeKey
and t.SegmentId=p.SegmentId
where #myTempTable would have order of 1000*24*60 rows in it, all with contiguous TimeKey values, and about 1000 different SegmentID values. #myTempTable also has a clustered pk on (timekey asc, SegmentId asc).
After the table is loaded, it would never get any inserts or deletes. only selects and updates.
Given the size, and the nature of the select and update queries, this table seems like a good candidate for partitioning. I'm thinking it makes sense to partition on TimeKey.
So my question is, is it stupid to create a separate partition for each day in the year long span of TimeKeys this table covers? That would mean 365 partitions in the partition function and partition scheme. Something like this:
CREATE PARTITION FUNCTION [BigTableRangePF1] (int)
AS RANGE LEFT FOR VALUES
(
3680640 + 0*1440, -- 3680640 is the number of minutes between 2001/01/01 and 2008/01/01
3680640 + 1*1440,
3680640 + 2*1440,
3680640 + 3*1440,
...snip...
3680640 + 363*1440,
3680640 + 364*1440,
3680640 + 365*1440
);
GO
CREATE PARTITION SCHEME [BigTablePS1]
AS PARTITION [BigTableRangePF1]
TO
(
[PRIMARY],[PRIMARY],[PRIMARY],
...snip...
[PRIMARY],[PRIMARY],[PRIMARY]
);
GO
does anyone have any experience with partitioned tables with so many partitions? Is a few hundred partitions too many? From my understanding of partitions, seems like having so many will be ok. Is it somehow worse than having hundreds of tables in a database?
Even with one partition for each day, I'll still have 24*60*20000/5 ~ 5m rows in each one.
5m seems like a manageable number. 2b does not.
elsasoft.org
View 2 Replies
ADVERTISEMENT
May 15, 2014
I have a very large table that I need to partition. Ideally the table will write to three filegroups. I have defined the Partition function and scheme as follows.
CREATE PARTITION FUNCTION vm_Visits_PM(datetime)
AS RANGE RIGHT FOR VALUES ('2012-07-01', '2013-06-01')
CREATE PARTITION SCHEME vm_Visits_PS
AS PARTITION vm_Visits_PM TO (vm_Visits_Data_Archive2, vm_Visits_Data_Archive, vm_Visits_Data)
This should create three partitions of the vm_Visits table. I am having a few issues, the first has to do with adding a new clustered index Primary Key to the existing table. The main issue here is that the closed column is nullable (It is a datetime by the way). So running the following makes SQL Server upset:
ALTER TABLE dbo.vm_Visits
ADD CONSTRAINT [PK_vm_Visits] PRIMARY KEY CLUSTERED
(
VisitID ASC,
Closed
)
ON [vm_Visits_PS](Closed)
I need to define a primary key on the VisitId column, but I need to include the Closed column in order to partition on it.how I would move data between partitions on a monthly basis. Would I simply update the Partition function, or have to to some sort of merge, split, or switch function?
View 2 Replies
View Related
May 9, 2014
The SQL CAT team's recommendation is to avoid partitioning dimension tables: URL.....I have inherited a dimension table that has almost 3 billion rows and is 1TB and been asked to look at partitioning and putting maintenance in place, etc.I'm not a DW expert so was wondering what are the reasons to not partition dimensions?
View 7 Replies
View Related
Feb 9, 2015
We have an existing BI/DW process that adds large chunks of data daily (~10M rows) to an existing table, as well as using Deletes to remove stale data. This scenario seems to beg for partitioning to support switching in/out data.
After lots of reading on this, I have figured out the mechanics of the switching, bit I still have some unknowns about the indexes needed to support this.
The table currently has several non-clustered indexes, including one on the partitioning column - let's call that column snapshotdate. Fortunately there are no FKs involved, and no constraints.
Most of the partitioning material I see focuses on creating a clustered PK to assist with switching. Not sure if this is actually necessary, but assume I create one using an Identity column (currently missing) plus snapshotdate.
For the other non-clustered, non-unique indexes, can I just add the snapshotdate to the end of the index? i.e. will that satisfy the switching requirement?
View 1 Replies
View Related
Oct 7, 2015
I have a requirement of table partitioning. we have 10 years of data on a table which is 30 billion up rows on 2005 server we are upgrading it to 2014. we have to keep 7 years of data. there is no keys on table or date column. since its a huge amount of data and many users its slow down the process speed. we are thinking to do partition on 7 years for Quarterly based. but as i said there is no date column on table we have to use reference table to get date. is there a way i can do the partitioning with out adding date column on table? also does partition will make query faster?Â
I have think three ways to do it.
1. leave as it is.
2. 7 years partition on one server
3. 3 years partition on server1 and 4 years partition on server2 (for 4 years is snapshot better?)
View 3 Replies
View Related
Mar 27, 2008
i have a table named "user" in which user which are located at different places within a city are recorded.
i want to group user with respect to there location like users of northern region are recorded first then users of western region and so on.
tell me from horizontal and vertical partitioning wh technique is better or i should use some other technique.
thanks for ur consideration.
View 5 Replies
View Related
Nov 14, 2000
Hi,
I want to know more on table partitioning.I do not know where to get the right info.from.
I have a doubt - if a table is partitioned horizontally how does a query identifies where to pick up the data from i.e. from which part of partitioned table?
View 2 Replies
View Related
Jun 16, 2008
i want to partition a table containing about 3 million rows. The partition column will be of datetime type.
following is the partition function i have used
create partition function MyPartFun
(datetime) as range left for values ('07/30/2007','09/30/2007','11/30/2007','01/30/2008','04/30/2008')
following is the partition scheme i have used
create partition scheme PartScheme as
partition MyPartFun all to ([primary])
i know how to add partition column while creating the table
But dont know how to add above partition scheme to an already populated table
Plz help...
View 2 Replies
View Related
Apr 24, 2007
Hi,
I have a database created using Enterprise Manager Wizard.
For example datafile db1_data.mdf and log file db1_log file exists.
All the tables are created in datafile db1_data.mdf.
Now to improve performance I want to implement table partitioning.
Can anybody tell me howto implement it with existing strutcure.
Suppose there is table Mytable in which all update and delete actions are performed regularly.And it contains about 10,0000 records.
I want to partition the table so that it contains 5000 records.
Solution with example highly appreciated.
Satish
View 7 Replies
View Related
Dec 13, 2007
Hi Experts,
I am new to Table Partitioning, Can any body guide me how to do table partitioning?
any way here is my scenario, we are having one database called "DATA" in SQL 2000 server and we have migrated to SQL 2005 by using backup and restore. and "DATA" is having about 15 tables and they are very very very big in size. and they dont have any index on a coulum name "DATETIME", but i want make table partition according to that perticular field "DATETIME" and right present we are having 6 months of data.
So, how to proceed further?
Your help will be appreciable..
View 1 Replies
View Related
Mar 19, 2007
i am trying to partition an sql table in sql server 2005, i created the partition schema and the data files that i want the data to be filled in after the partition. After the partition is finished sql gave me partition is successful , but i noticed that the size of data files i created has not increased and their sizes are the same.
notice: i have a clustered index on this table, so i dropped this index and recreated it
Bellow the script that i am using
and thank you for your help in advance
View 1 Replies
View Related
Mar 7, 2008
Hello frnds....what does mean by this ?
View 1 Replies
View Related
Oct 11, 2006
Hello,
I have a Sql Server 2005 database with many tables, each with millions of records within them.
They all have a Receive Date field, with records going back 10 years or so.
What would be the best way to partition it? I was thinking of partitioning them by years, but that would give me 10+ partitions -- would that be alot of overhead?
~Le
View 2 Replies
View Related
Jun 5, 2007
Hi!I have a question:I already have a DB that uses partitions to divide data in USCounties, partitioned by state.Can I use TWO levels of partitioning?I mean... 3077 filegroups and 50 partition functions that addressthem, but can I use another function to group the 50 states?Thanks!Piero
View 15 Replies
View Related
Apr 30, 2007
We had data in tables for multiple users (Logins) .Each user data is identified by a one column named €śUSER€?. No user has direct access to tables and only through views .we have created views and stored proc .Views will perform DML operations on tables using condition WHERE USER=SUSER_SNAME() (i.e Logged in user).So no point of getting others user data.
Each table has a column USER and we are queering data based on login user .this is the foreign key of USER table. Each view contains user column in where clause .So for every query we are searching all records .instead of that is there any way to get data with out searching all records.
I heard about table Partitioning, index Partitioning, view Partitioning. Are they helpful to boost my query performance?
And also let me know is there any good way of designing apart from above options
View 3 Replies
View Related
Aug 10, 2006
Hi,
I have the following doubt about table lockinglocking in case of partitioning:-
Say we have 5 partition on the table Employee on the key Joining_Date and when we run 5 select queries on each of the parition in parallel will there be locking on the table when the 1st query is running or all the 5 queries can run in parallel. Basically, I am trying to see if parallelism and partitioning can work in sync or there will be locking at the table level if I don't specify any query hints?
Any help in this regard is highly appreciated.
Thanks,
Ritesh
View 1 Replies
View Related
Jan 15, 2007
Conrad writes "I'm currently working on Table Partitioning. I have done everything succesfull for partitioning, what I'm struggling with is to use
" SELECT MAX(NAME) from sys.filegroups WHERE NAME NOT LIKE 'PRIMARY' " to get the last used FileGroup. Now this works just fine, but when I run the following script:
--Decalre variables
DECLARE @LastFilegroupName VARCHAR(50)
DECLARE @FilegroupName VARCHAR(50)
--Retuns the next FileGroup to be used
SET @LastFilegroupName = (select MAX(NAME) from sys.filegroups WHERE NAME NOT LIKE 'PRIMARY')
SET @LastFilegroupName = Replace(@LastFilegroupName,'FileGrp','')
SET @FilegroupName = 'FileGrp' + CAST((@LastFilegroupName + 1) as varchar(10))
--Alter database statement
ALTER DATABASE VadivelTesting
ADD FILEGROUP @NewFG_Name
This script gives the following error "Incorrect syntax near '@NewFG_Name'."
When I give it a static name it works fine, but not with the variable.
Please can someone help me, I'm in struggeling with this one."
View 1 Replies
View Related
Dec 2, 2006
Is there any benefit in creating seperate file groups for a partitionedtable on a multi-processor server with RAID5 and 1 Logical Drive?
View 8 Replies
View Related
Oct 30, 2015
In our environment we have some "Huge" tables with around 2 million records. Planning to implement table partition.We have 4 LUN's.Â
View 3 Replies
View Related
Sep 28, 2007
I have an existing database with a table of about 50 milion records. There are also about 20 other tables, but they are alot smaller. The large table has a uniqueidentifier as it's Primary key (not sequential) and a forien key to a 'parent' table. The table also has a column telling when it was created. So, a bit simplified, it looks like:
ChildTable
---------------
Id uniqueidentifier <PK>
ParentId uniqueidentifier <FK>
CreationDate DateTime
ParentTable
-----------------
Id uniqueidentifier <PK>
CreationDate DateTime
Most of the questions accessing the Child table (the large table) is doing so by referensing the parent table, and not the CreatingDate, i.e.
SELECT *
FROM ChildTable
WHERE ParentId = '......'
All records with a specific ParentId will have very similiar CreationDates.
Now, my question is, will Partitioning the ChildTable boost performance for me? In case it will, what column(s) would define the Partitions? If I do it by CreationDate, a select-query like the one above will have to scan all partitions anyway, doesn't it? Doing it by Id isn't soo easy either I guess? If it helps, it might be possible to change the primary keys in the tables to have sequential guids.
Is there perhaps a performance tool to get help with suggestions about how to partition the table? Something like the 'Performance dashboard' reports, but for partitioning?
Regards Andreas
View 10 Replies
View Related
Mar 27, 2008
Hi,
I have a big table which is partitioned , I need to change the portioning function (including the column used by partition key). I don€™t want to change the table or use a temp table as there are a lot of dependent objects to this table
I wrote:
DECLARE @partition_count INT,@cmd VARCHAR(8000)
SELECT @partition_count=MAX(p.partition_Number) FROM sys.allocation_units A,sys.partitions P ,sys.data_spaces D
WHERE P.OBJECT_ID=OBJECT_ID('TABLEA')
AND A.container_id=P.Partition_id
AND A.data_space_id=D.data_space_id
SELECT @partition_count
WHILE (@partition_count>1)
BEGIN
SELECT @cmd='ALTER PARTITION FUNCTION MainPartitionFunction() MERGE range ('+CONVERT(VARCHAR(10),@partition_count-1)+')'
exec (@cmd)
SELECT @partition_count=@partition_count-1
End
Which will merge all the existing partitions to one. But I don€™t seem to be able to find a way change the existing funtion. ( in need to change the partition key all together)
Is there a way to disable or drop parition of the table all together (convert the table to a non-partitioned table) and then I can re define the parition function again.
Any ideas?
Shaunt
View 2 Replies
View Related
Dec 3, 2014
I need Dynamic Partition of SQL Table.
1. What is the best practice for partitioning (on date column)
2. The project on which i am working correctly have a case where in i get the update of my status flag after few days (Say 15 - 30) in that case if my data got into partition table how to update and how to search which partition has my data
3. Is creating partition occupies more disk space?
4. Is every partition would need index?
View 7 Replies
View Related
Jul 31, 2015
I am new to Partitioning tables. My scenario is as listed below.
I am getting Monthly Transaction data on Every First Monday of the Month and I want to do partition for those data.
For Example: Let's say I will get my next monthly data on August 3rd 2015 which is First Monday of the month of August.
I want those Transaction data to go in new partitioned FileGroup in my existing partitioned table. How can I do partition for this kind of scenario ? Can we create one or multiple Stored Procedure which will create New Partition and load data in that partition ?Â
FYI, this monthly data will be loaded in Staging table and that table has LoadDate column which will have 2015-08-03 in it. I am using SQL 2012 Enterprise edition.
View 17 Replies
View Related
Oct 25, 2007
We are facing few issues pertaining to creation of primary key on a non - partitioned column in sql server 2005. Herewith attaching the text file containing the detailed scenario.
Pls advice.
Pls find some of the scenario with the example given below:
We have done the following steps
1.Creating a Partition Function
CREATE PARTITION FUNCTION pf_EncounterODS_StateID (CHAR(2))
AS RANGE RIGHT FOR VALUES
(
'CA', -- CA
'MI', -- MI
'NM', -- NM
'OH', -- OH
'TX', -- TX
'UT', -- UT
'WA' -- WA
)
GO
2.Creating a Partition Schema
CREATE PARTITION SCHEME [ps_EncounterODS_StateID]
AS PARTITION pf_EncounterODS_StateID TO
(
[PRIMARY],
[ENC_DM_DATA_01], -- CA
[ENC_DM_DATA_03], -- MI
[ENC_DM_DATA_04], -- NM
[ENC_DM_DATA_05], -- OH
[ENC_DM_DATA_06], -- TX
[ENC_DM_DATA_07], -- UT
[ENC_DM_DATA_02] -- WA
)
GO
3.Creating a table using above partitioned schema
CREATE TABLE [dbo].[Fact] (
[ValidationErrorSID] [bigint] NOT NULL IDENTITY(1,1),
[ProcessID] [int] NULL ,
[StateID] [char] (2) NOT NULL ,
[ClaimID] [char] (15) NULL ,
) ON [ps_EncounterODS_StateID](StateID)
4.Creating a primary key on validationErrorSID column in the fact table
ALTER TABLE [Fact] ADD CONSTRAINT NQValidationError PRIMARY KEY CLUSTERED ([ValidationErrorSID])
Step 4 throws an error as
---------------------------------
Column 'StateID' is partitioning column of the index 'NQValidationError'. Partition columns for a unique index must be a subset of the index key.
If we include StateID along with ValidationErrorSID for index,then it works fine.But we need to have only ([ValidationErrorSID]) for indexing.
View 3 Replies
View Related
May 23, 2008
I have gone through the table partitioning in MSDN, like Designing Partitions to Manage Subsets of Data . But how to do this in actual world, since some db need to partition for 7 days, then archive these days records on the 8th day, while other prefer 14 days/monthly, and run this repetitively for many years? If this is running weekly, how can i generate the scheme and function dynamically? What if ID for row count is not viable?
Sure this will not work:
Code Snippet
CREATE PARTITION FUNCTION [TransactionRangePF1] (datetime)
AS RANGE RIGHT FOR VALUES ('10/01/2003', '10/02/2003', '10/03/2003',
'10/04/2004', '10/05/2004', '10/06/2004', '10/07/2004');
GO
Then how to drop partition periodically?
View 6 Replies
View Related
Apr 18, 2006
Table partitioning with multiple key coulmns
Hi champs!
I have one table with around 600 million records.
I want to partition this table on 5 different coulumns.
year, month. coulumn3, coulumn4 and coulumn5.
How do I do this best?
What is the best way of doing this; is there any best practice on ths?
The obvious thing is to make a partition funktion based on the year and month coulmns, but I also want to partition on coulumn3, coulumn4 and coulumn5.
Many thanks
Kurlan
View 1 Replies
View Related
May 31, 2007
Hello,
I am building partitiong tables, partitioning on different file groups:
the question is:
Partitioned table referred to old data that are not frequent accessed for reporting can be stored on separate location(External storage, tape and so on) or to make partitioning functioning must all file groups must be presents?
If not, how can I separate old data from current ones (still using partitioning) to reduce the size of DB?
What it is the best for storage data and easy to access it when needs arise (eg reporting): Tape, external storage, others?
Thank
View 1 Replies
View Related
Feb 25, 2014
Script to do the table partitioning for a 500gb in sliding window technique?
View 9 Replies
View Related
Mar 6, 2014
If the partitioning MERGE command attempts to drop historic data at the wrong boundary point then data movement between file groups may be necessary before or during the next index rebuild. The script below creates 2 test tables, one using a range right function and the other using range left. The partitioning key is a number between 0 - 59, an empty partition is maintained at the start and end of ranges, 4 partitions contain data in the ranges between 0-14, 15-29, 30-44, 45-59. Data in the lowest range (0 - 14) is switched out and a merge command is run, edit the script to try the different merge boundaries, edit the variables at the start to suit runtime environment 'Data Drive' & 'Log Drive' paths.Variables are redeclared but commented out at the start of code blocks to allow stepping through if desired.
--=================================================================================
-- PartitionLabSetup_20140330.sql - TAKES ABOUT 1 MINUTE TO EXECUTE
-- Creates a test database (workspace)
-- Adds file groups and files
-- Creates partition functions and schema's
-- Creates and populates 2 partitioned tables (PartitionedRight & PartitionedLeft)
[Code] ....
The T-SQL code below illustrates one of the problems caused by MERGE at the wrong boundary point. File Group 3 of the Range Right table is empty according to the data space views, it cannot be dropped though. File Group 2 contains data according to the views but you are allowed to drop it's file.
USE workspace;
DROP TABLE dbo.PartitionedRightOut;
USE master;
ALTER DATABASE workspace
REMOVE FILE PartitionedRight_f3 ;
--Msg 5042, Level 16, State 1, Line 2
--The file 'PartitionedRight_f3 ' cannot be removed because it is not empty.
ALTER DATABASE workspace
REMOVE FILE PartitionedRight_f2 ;
-- Works surprisingly although contains data according to system views.
If the wrong boundary point is used then the system 'Data Space' views show where the data should be (FG2), not where it actually still is (FG3). You can't tell if data movement between file groups is pending and the file group files are not protected from deletion by the OS.
I'm not sure this is worth raising a connect item for but it would be useful knowing where data physically resided after a MERGE RANGE and before an INDEX REBUILD, the data space views reflect the logical rather than the physical location if a data movement is pending.
View 0 Replies
View Related
Jul 14, 2015
We have a requirment, we have different databases in different servers, we need to syncronize the data in data bases in different server. created Partitioned on these tables.Â
We tried some options:
1) Snapshot replication
2)CDC (Change Data Capture)Â
What are the best approaches to achieve the same.
View 2 Replies
View Related
Oct 11, 2007
Hi all
I cannot find this info anywhere. We have Hyperion 8.x running against SQL 2000. We will upgrade side-by-side in the coming weeks. It appears that Hyperion 8.x is assured to continue to work without issue with a 2K5 database in 80 compatibility level.
I have seen the BOL table on differences between 80 and 90 (specifically affects us - WITH on Index hints), but what I wonder are about 3 specific features.
1. Do you still get DMV capabilities in 80 mode? (I ran a test on sys.dm_db_index_usage_stats for an 80 compat DB, creating index and using query to hit index, and yes - it does look as if this DMV does get populated).
2. Can you still Table Partition in 80 mode? (Ran a test in DEV, and yes - this does look possible - any gotchas?)
3. Does the optimizer behave quite differently in 80 mode?
I have tested in DEV moving a DB from 80 to 90 and back, and this seems to work with no problems.
TIA from Grand Fenwick!! ;-)
View 1 Replies
View Related
May 8, 2013
we planning to create partitioning on existing tables. The partitioning is on date column, there should be one partition for each year.
Creating of new partitions should be automated, and also we dont have any plans of archiving old data, all we want is that new partition creation should be automated.
View 6 Replies
View Related
Aug 25, 2014
We have a database and have 6-7 growing tables. All the tables have Primary and foreign key relation. I want to do partition based on the date column.
I need 3 partitions
First partition has to hold present data
second partition need to hold the previous year data (SAS storage)
Third partition need to hold all the old data and need to be in the archive database
I understand that first we need to disable the constraints (Indexes PK & FK)
Then create partition function and partition schema
Then Create the Constraints again
View 9 Replies
View Related