DB Design :: Table Partitioning Using Reference Table Data Column
Oct 7, 2015
I have a requirement of table partitioning. we have 10 years of data on a table which is 30 billion up rows on 2005 server we are upgrading it to 2014. we have to keep 7 years of data. there is no keys on table or date column. since its a huge amount of data and many users its slow down the process speed. we are thinking to do partition on 7 years for Quarterly based. but as i said there is no date column on table we have to use reference table to get date. is there a way i can do the partitioning with out adding date column on table? also does partition will make query faster?Â
I have think three ways to do it.
1. leave as it is.
2. 7 years partition on one server
3. 3 years partition on server1 and 4 years partition on server2 (for 4 years is snapshot better?)
I have gone through the table partitioning in MSDN, like Designing Partitions to Manage Subsets of Data . But how to do this in actual world, since some db need to partition for 7 days, then archive these days records on the 8th day, while other prefer 14 days/monthly, and run this repetitively for many years? If this is running weekly, how can i generate the scheme and function dynamically? What if ID for row count is not viable?
Sure this will not work:
Code Snippet CREATE PARTITION FUNCTION [TransactionRangePF1] (datetime) AS RANGE RIGHT FOR VALUES ('10/01/2003', '10/02/2003', '10/03/2003', '10/04/2004', '10/05/2004', '10/06/2004', '10/07/2004'); GO
We are facing few issues pertaining to creation of primary key on a non - partitioned column in sql server 2005. Herewith attaching the text file containing the detailed scenario.
Pls advice.
Pls find some of the scenario with the example given below:
We have done the following steps 1.Creating a Partition Function CREATE PARTITION FUNCTION pf_EncounterODS_StateID (CHAR(2)) AS RANGE RIGHT FOR VALUES ( 'CA', -- CA 'MI', -- MI 'NM', -- NM 'OH', -- OH 'TX', -- TX 'UT', -- UT 'WA' -- WA ) GO 2.Creating a Partition Schema CREATE PARTITION SCHEME [ps_EncounterODS_StateID] AS PARTITION pf_EncounterODS_StateID TO ( [PRIMARY], [ENC_DM_DATA_01], -- CA [ENC_DM_DATA_03], -- MI [ENC_DM_DATA_04], -- NM [ENC_DM_DATA_05], -- OH [ENC_DM_DATA_06], -- TX [ENC_DM_DATA_07], -- UT [ENC_DM_DATA_02] -- WA ) GO
4.Creating a primary key on validationErrorSID column in the fact table ALTER TABLE [Fact] ADD CONSTRAINT NQValidationError PRIMARY KEY CLUSTERED ([ValidationErrorSID]) Step 4 throws an error as --------------------------------- Column 'StateID' is partitioning column of the index 'NQValidationError'. Partition columns for a unique index must be a subset of the index key. If we include StateID along with ValidationErrorSID for index,then it works fine.But we need to have only ([ValidationErrorSID]) for indexing.
We have a database and have 6-7 growing tables. All the tables have Primary and foreign key relation. I want to do partition based on the date column.
I need 3 partitions
First partition has to hold present data second partition need to hold the previous year data (SAS storage) Third partition need to hold all the old data and need to be in the archive database
I understand that first we need to disable the constraints (Indexes PK & FK) Then create partition function and partition schema Then Create the Constraints again
We have a requirment, we have different databases in different servers, we need to syncronize the data in data bases in different server. created Partitioned on these tables.Â
We tried some options:
1) Snapshot replication 2)CDC (Change Data Capture)Â
I have to tables like given below Landing table "A" (Data load will happen over here, No primary keys mentioned over here) table "B"Â .Now I want to move the data from A to B.I have made use of below query insert into B select * from A...Landing table "A" has huge no of records, MS SQL server is taking huge amount of time.any alternative way to make this insertion process faster?
Hi all, I am not over familiar with SQL, I am a VB programmer, simply I need to achieve the following within Enterprise Manager.
I have 2 tables, different designs, different number of rows, I simply need to check whether the contents of a column in the first table is in a column in the second table, just simply a table/column to table/column data check for the same data content.
Easy Peasy for you guys, any help would be appreciated.
We want to add a new int identity column as a primary key to an already existing table that has a primary key on Guid. Here is the DDL:
CREATE TABLE [dbo].[VRes]( [VResID] [uniqueidentifier] NOT NULL, [Mes] [varchar](max) NOT NULL, [PID] [uniqueidentifier] NOT NULL, [Segt] [int] NOT NULL,
[code]....
Also we currently have 3 million rows on this table. Is having an integer column as identity column and primary key better or shd I consider using BigInt?
I'd like to create a table that will store different order items. Several order items make up one single order. Order items can have 0 or more children (max depth will never be deeper than one). Order items can have up to 150 attributes/values. The way I think this should be done is using XML column instead of the EAV type of model. My table structure currently looks like this:
* child_order_item_id (PK) * parent_order_item_id (FK to child_order_item_id) * order_id (FK to Order table) * product_id (FK to Product table) * price * attribute_XML
How my attribute_XML should look like or how to validate the xml.
I have come up with one scenarios where I have three table like Product, Services and Subscription. I have to create one table say Bundle where I can have some of the product id , service id and Subscription id , i.e. a bundle may contains sum prduct , services and subscription . How I can design these relations ?
If on the source I have a new column, the script generated by SqlPackage.exe recreates the table on the background with moving the data into a temp storage. If the table is big, such approach can cause issues.
Example of the script is below: in the source project I added columns [MyColumn_LINE_1]  and [MyColumn_LINE_5].
Is there any way I can make it generating an alter statement instead?
BEGIN TRANSACTION; SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SET XACT_ABORT ON; CREATE TABLE [dbo].[tmp_ms_xx_MyTable] ( [MyColumn_TYPE_CODE] CHAR (3) NOT NULL,
[Code] ....
The same script is generated regardless the table having data or not, having a clustered or nonclustered PK.
I am trying to create a table that holds info about a user; with the usual columns for firstName, lastName, etc.... no problem creating the table or it's columns, but how can I "restrict" the values of my State column in the 'users' table so that it only accepts values from the 'states' table?
I am trying a create views that would join 2 tables:
Table 1: Has all the columns need by a view ( Name: Product Structure: ID, Attribute 1, Attribute 2, Attribute 3, Attribute 4, Attribute 5 etc Table 2: Is a lookup table that provides the names of columns Name: lookupTable Structure: tableName, ColumnName, columnValue Values: Product, Attribute1, Color Product, Attribute2, Size Product, Attribute3, Flavor Product, Attribute4, Shape
Hello,I have a query that I need help with.there are two tables...Product- ProductId- Property1- Property2- Property3PropertyType- PropertyTypeId- PropertyTypeThere many columns in (Product) that reverence 1 lookup table (PropertyType)In the table Product, the columns Property1, Property2, Property3 all contain a numerical value that references PropertyType.PropertyTypeIdHow do I select a Product so I get all rows from Product and also the PropertyType that corresponds to the Product.Property1, Product.Property2, and Product.Property3ProductId | Property1 | Property2 | Property3 | PropertyType1 | PropertyType2 | PropertyType3 PropertyType(1) = PropertyType for Property1PropertyType(2) = PropertyType for Property2PropertyType(3) = PropertyType for Property3I hope this makes sence.Thanks in advance.
Hi, I am new to SQL 2005. I have to design schema for scientific data warehouse. Data is available in 2 or more flat data files recorded at 1 sec interval. At Least 2 of the data files have 100+ columns. I am inclined to create a table per data file type. I want to know If this is correct/optimal for me to do?
I don't think I can create normalize tables based on the headers in these Data files.
Primary Objective of this data warehouse is make it available for reporting services and Analysis Services.
Hi, I am new to SQL 2005. I have to design schema for scientific data warehouse. Data is available in 2 or more flat data files recorded at 1 sec interval. At Least 2 of the data files have 100+ columns. I am inclined to create a table per data file type. I want to know If this is correct/optimal for me to do?
I don't think I can create normalize tables based on the headers in these Data files.
Primary Objective of this data warehouse is make it available for reporting services and Analysis Services.
I have data that is similar in nature to stock market data. There are about 100 entries per day. I would like to setup a second table to supply chart information. (assuming 1000 pixels per chart max)Â
What is the best way to condense the information in this table so that if I query for a chart, the table returns about 1000 points? I could have several tables setup, one for short term, and one or two for longer term. But I still don't know how to condense them.Â
Hi, All, I have agentID in product table. Now I add agentID column in transaction table. Now I want to copy all agentID from product table to transaction table based on the order_id in both table. Can you show me an example? Thanks Betty
i have a table named "user" in which user which are located at different places within a city are recorded. i want to group user with respect to there location like users of northern region are recorded first then users of western region and so on. tell me from horizontal and vertical partitioning wh technique is better or i should use some other technique. thanks for ur consideration.
Hi, I want to know more on table partitioning.I do not know where to get the right info.from. I have a doubt - if a table is partitioned horizontally how does a query identifies where to pick up the data from i.e. from which part of partitioned table?
i want to partition a table containing about 3 million rows. The partition column will be of datetime type.
following is the partition function i have used create partition function MyPartFun (datetime) as range left for values ('07/30/2007','09/30/2007','11/30/2007','01/30/2008','04/30/2008')
following is the partition scheme i have used create partition scheme PartScheme as partition MyPartFun all to ([primary])
i know how to add partition column while creating the table But dont know how to add above partition scheme to an already populated table Plz help...
Hi, I have a database created using Enterprise Manager Wizard. For example datafile db1_data.mdf and log file db1_log file exists. All the tables are created in datafile db1_data.mdf. Now to improve performance I want to implement table partitioning. Can anybody tell me howto implement it with existing strutcure. Suppose there is table Mytable in which all update and delete actions are performed regularly.And it contains about 10,0000 records. I want to partition the table so that it contains 5000 records.
Hi Experts, I am new to Table Partitioning, Can any body guide me how to do table partitioning? any way here is my scenario, we are having one database called "DATA" in SQL 2000 server and we have migrated to SQL 2005 by using backup and restore. and "DATA" is having about 15 tables and they are very very very big in size. and they dont have any index on a coulum name "DATETIME", but i want make table partition according to that perticular field "DATETIME" and right present we are having 6 months of data. So, how to proceed further? Your help will be appreciable..
i am trying to partition an sql table in sql server 2005, i created the partition schema and the data files that i want the data to be filled in after the partition. After the partition is finished sql gave me partition is successful , but i noticed that the size of data files i created has not increased and their sizes are the same.
notice: i have a clustered index on this table, so i dropped this index and recreated it
I have a very large table that I am trying to partition and use to reduce maintenance overhead as well as improve performance. The table contains about 12 years worth of data but only the most recent years is inserted/updated/deleted from thru the app. I created partitions on a computed(persisted) column which holds the "year" value derived from a date column. I have created the partitions with all the default set options, and the stored procedure which performs the delete against this table also was created with no special set options(basically database/session default). Yet, every time I try to run the proc to delete data thru the app, I get this error:
Msg 1934, Level 16, State 1, Procedure xxxx, Line 118 DELETE failed because the following SET options have incorrect settings: 'ANSI_WARNINGS'. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.
I've tried setting ANSI_WARNINGS on and off when creating the proc, inside the proc etc.., its always the same error whatever I set the option to.
I have 3 tables (accnt, jobcost, and servic15). all with the same fields (code, jno, ven, date). I need to insert the data from these tables into another table called dummy with the same fields, in one statement or query.
I have a Sql Server 2005 database with many tables, each with millions of records within them.
They all have a Receive Date field, with records going back 10 years or so.
What would be the best way to partition it? I was thinking of partitioning them by years, but that would give me 10+ partitions -- would that be alot of overhead?
Hi folks! I'm looking for advice on partitioning a large table. In the DDL below I've changed names to protect the guilty.
My table has this schema:
CREATE TABLE [dbo].[BigTable] ( [TimeKey] [int] NOT NULL, [SegmentID] [int] NOT NULL, [MyVal] [tinyint] NOT NULL ) ON [BigTablePS1] (TimeKey) -- see below for partition scheme
-- will evaluate whether this one is needed, my thinking is yes -- based on the expected select queries. create index NCI_SegmentID on BigTable(SegmentID asc)
The TimeKey column is sort of like a unix time. It's the number of minutes since 2001/01/01, but always floored to a 5 minute boundary. so only multiples of 5 are allowed.
Now, this table will be rather big. There are about 20k possible SegmentIDs. For every TimeKey from 2008/01/01 to 2009/01/01 (12 months), I'll have on the order of 20000 rows, one for each SegmentID.
For the 12 month period, there are 365*24*60/5=105120 possible TimeKey values. So the total rowcount is over 2 billion. (20k * 105120)
Select queries are expected to be something like this:
-- fetch just one particular row... select MyVal from BigTable where TimeKey=5555 and SegmentID=234234
--fetch for a certain set of SegmentID and a particular time... select b.SegmentID ,b.MyVal from BigTable b join OtherTable t on t.SegmentID=b.SegmentID where b.TimeKey=5555 and t.SomeColumn='SomeValue'
Besides selects, also I need to be able to efficiently issue update statements against the table with new values in the MyVal column based on a range of TimeKey values (a contiguous span of a few days) and sets of about 1000 SegmentID. updates would always look like this:
update t set t.MyVal=p.MyVal from BigTable t join #myTempTable p on t.TimeKey=p.TimeKey and t.SegmentId=p.SegmentId
where #myTempTable would have order of 1000*24*60 rows in it, all with contiguous TimeKey values, and about 1000 different SegmentID values. #myTempTable also has a clustered pk on (timekey asc, SegmentId asc).
After the table is loaded, it would never get any inserts or deletes. only selects and updates.
Given the size, and the nature of the select and update queries, this table seems like a good candidate for partitioning. I'm thinking it makes sense to partition on TimeKey.
So my question is, is it stupid to create a separate partition for each day in the year long span of TimeKeys this table covers? That would mean 365 partitions in the partition function and partition scheme. Something like this:
CREATE PARTITION FUNCTION [BigTableRangePF1] (int) AS RANGE LEFT FOR VALUES ( 3680640 + 0*1440, -- 3680640 is the number of minutes between 2001/01/01 and 2008/01/01 3680640 + 1*1440, 3680640 + 2*1440, 3680640 + 3*1440, ...snip... 3680640 + 363*1440, 3680640 + 364*1440, 3680640 + 365*1440 ); GO
CREATE PARTITION SCHEME [BigTablePS1] AS PARTITION [BigTableRangePF1] TO ( [PRIMARY],[PRIMARY],[PRIMARY], ...snip... [PRIMARY],[PRIMARY],[PRIMARY] ); GO
does anyone have any experience with partitioned tables with so many partitions? Is a few hundred partitions too many? From my understanding of partitions, seems like having so many will be ok. Is it somehow worse than having hundreds of tables in a database?
Even with one partition for each day, I'll still have 24*60*20000/5 ~ 5m rows in each one.
Hi!I have a question:I already have a DB that uses partitions to divide data in USCounties, partitioned by state.Can I use TWO levels of partitioning?I mean... 3077 filegroups and 50 partition functions that addressthem, but can I use another function to group the 50 states?Thanks!Piero
We had data in tables for multiple users (Logins) .Each user data is identified by a one column named €œUSER€?. No user has direct access to tables and only through views .we have created views and stored proc .Views will perform DML operations on tables using condition WHERE USER=SUSER_SNAME() (i.e Logged in user).So no point of getting others user data.
Each table has a column USER and we are queering data based on login user .this is the foreign key of USER table. Each view contains user column in where clause .So for every query we are searching all records .instead of that is there any way to get data with out searching all records.
I heard about table Partitioning, index Partitioning, view Partitioning. Are they helpful to boost my query performance?
And also let me know is there any good way of designing apart from above options
I have the following doubt about table lockinglocking in case of partitioning:-
Say we have 5 partition on the table Employee on the key Joining_Date and when we run 5 select queries on each of the parition in parallel will there be locking on the table when the 1st query is running or all the 5 queries can run in parallel. Basically, I am trying to see if parallelism and partitioning can work in sync or there will be locking at the table level if I don't specify any query hints?