1 Billion Row Table (Loan Performance)
Mar 28, 2008
I will be receiving data from a company called Loan Performance, that has one file/table that will hold 1 billion rows. They send data by period, and I plan to load the data via BCP via NT/DOS scripts. The 1 billion rows represents data for 200+ periods.
Are the following design plans feasible
1. Partition table by period value, I'm not sure of the max number of partitions per table in 2005, but I think we have periods data back to 1992 and a new one gets created every month, so the possibility of having > 1000 partitions exists. I plan on just pre-creating partitions for future data, instead of dynamically creating when a new period is sent.
2. Load data via BCP in DOS shell scripts that will drop index (by partition), BCP in data, and they re-create indexes by partition, is this possible ? and will I see a performance increase as opposed to one huge table (I'm pretty much sure I will). There is usually one periods data present per day, but sometimes the vendor resends all data (would get loaded on weekend).
I'm a bit unsure of where to start being I never worked with this amount of data. I worked with partitioning in Oracle a long time ago.
I plan on having an 2XQuadCore 2.66Ghz CPU with 32GB of RAM and SQL2005EE 64Bit connected to 1 Terabyte SAN Disk.
Thanks all,
PMA
View 7 Replies
ADVERTISEMENT
Jul 30, 2007
Hi;
I want to know, if a table has 1 billion recoreds then it will work eaisly with .net objects like datasets etc?
if not, what is the normal amount of data for sql server which is easy to manage for system.
Adnan
View 2 Replies
View Related
Aug 5, 2014
It was an interview question, what's the best way to update a table with 10 billion rows.
View 16 Replies
View Related
Mar 20, 2008
HELP. I've inherited some less-than-stellar dessign.
I have a table:
tbMD
ObjectID uniqueidentifier not null
Tag nvarchar(50) not null
Data nvarchar(400) not null
This table has 1.6Billion records and has filled up the drive (nearlly 1 TB in the DB, with about 600GB in this one table).
I need to move this table to another drive with ZERO down time.
I have created a new filegroup on a second drive array and created a new table (tbMD_TEMP) and my thought was to copy all the data from tbMD into tbMD_Temp then drop tbMD, and rename tbMD_Temp.
So, anyone have any great ideas on how to get this done very quickly?
View 5 Replies
View Related
Aug 19, 1999
After moving our database between servers, one table had a hiccup in it's
identity column. The number jumped from 761 to 1886863475 on the next
insert. This is a production server and I didn't catch it until yesterday.
So I have several days worth of numbers in a central table. I can't clean
them out, but I wondered if I could do the following:
DBCC CHECKIDENT ('table_name', RESEED, 800)
I tested it on our development server and it doesn't seem to cause any
problems. I know that when we get to 1.8 billion on the identity column
again we'll get an error, but I'm willing to risk it.
A few questions. 1) Am I insane for trying this? 2) Has anyone else ran into
a similar problem and what did you do to fix it? 3) If I do this, is there
anything in particular to watch out for? 4) What caused the jump in the
first place?
My other option is to change the datatype from int to decimal(19) or
something along those lines. This will upset our programmers and I'll have
to change all my foreign keys (not a big deal, just a pain).
Any input at all here would be appreciated.
Thanks,
Grant
View 1 Replies
View Related
Jan 31, 2008
hi,
I want to Display the image records of each employee of having 10 images of each.the number of records are more then one billion.......
please provid me the optimized query that can i used in SP and display in Gridview.
View 7 Replies
View Related
Feb 29, 2012
So I just got an email from Production Support saying an hour and a half downtime is unacceptable to move a half billion rows between 2 partitions because I am moving a clustered index and space is a consideration.
I can not use partition switching because the clustered index is changing.
This is what I am doing...
1. I am creating a new table with the new cluster on a new partiton
2. I am moving the records in 5K set based batches by doing a range search on the existing clustered index on the existing table.
3. I then reapply all of the nonclustered index from the original table to the new one.
4. I do a sp_rename swap out.
The same way I have done this many times before. Is there some new secret special sauce (other than partition switching) I can use?
View 14 Replies
View Related
Mar 30, 2007
Hi everyone. I hope that my question isn't too broad, but here it goes...
I am trying to figure out the best way to scale a SQL Server database so that it can handle a billion simultaneous users querying the same tables, and can easily scale to handle many billions of simultaneous users. The database must also have 999.99% availability. The number of licences and amount of hardware needed is not an issue.
Thanks in advance.
View 15 Replies
View Related
Jan 23, 2008
Hi!
What is the difference in performance if I use a Temp-table or a local-table variable in a storedprocedure?
Why?
//Daniel
View 5 Replies
View Related
Sep 27, 2005
Hi,I have a small theoretical issue.I have one table, which is prettyu large. There is lot of evaluationsrunning on this table, that's why, each process need to wait foranother to be finished. Sometimes, for some critical functions, ittakes to long time.I don't think that I can speed up processes, by changing the indexes onthe tables (to increase scan time for example), because this issomething what I was experimenting with already, and it was not enoughtgood.My question is, will it improve performance, if I will create secondtable, exactly like this one, and I will split some evaluations, thatthe one, which defenately need to run on the source table will run onthe first one, and the second evaluations, will run on the other one.To keep data consistance between this two tables, I was thinking baouttrigger on insert on the mother table, which will transport the data toanother one.Second part is: to improve selects on the table, should I set indexeswith option of Fill factor as possible close to 100% or as possibleclose to 0%. Or maybe should I set the pad index option?What about clustered indexes. Is it better to use them if I would liketo increase performace for selects?Thanks in advanceMateusz
View 4 Replies
View Related
Jan 4, 2004
I've read that table variables give better performance than temporary tables as they are kept in memory and don't need to be recorded in transaction logs ect however I have a stored proceedure which takes 0.183 seconds to execute but when I change the one temporary table used in the proceedure to a table variable the execution time increases to 0.223.
Not much of an increase I admit but it just seems contrary to what I've read.
I want to get the best performance possible so can someone explain to me what is going on ?
View 2 Replies
View Related
Apr 30, 2007
I find that joining to the same table twice is OK, but as soon as you do it 3 or more times you get a massive performance hit.
Does anyone know the reason for this? Whats special about 3?
What's the best approach to do this sort of thing?
(I've used the SQL Server 2005 Tuning Advisor to add indexes for the query).
Rather than:
Select ..., sum(a1.<column>), sum(a2.<column>), sum(a3.<column>) from master_table
left join table_1 a1 on ...
left join table_1 a2 on ...
left join table_1 a3 on ...
group by ...
I have to select all the table and filter it using case:
Select ...,
sum(case when table_1.<column> = '...') as a1,
sum(case when table_1.<column> = '...') as a2,
sum(case when table_1.<column> = '...') as a3,
from master_table
left join table_1
group by ...
View 1 Replies
View Related
Jul 26, 2006
I know we are not allowed to benchmark SQL Server but..... It would be nice to have material to present which demonstrates the performance gains using a queue compared to insert/delete in a SQL table.
Logically it seems faster to use a queue due to the conversation grouping locking and the service broker itself. But there seems to be some overhead involved just to manage these queues that the service broker has to perform.
I am sure we are not unique with the choice to figure out if we will get a boost in performance using SQL a queue between services rather than a table to queue data. What is available to help understand the performance gains of using a queue?
View 2 Replies
View Related
Dec 28, 2007
Hi,
I have a denormalized table (done so with reason) with around 40 columns. I would never have to retrieve data for all of those columns together.
I haven't done any performance measurements yet but just wondering if anyone has ready answer to this: Will there be a performance degradation if I retrieve data from a table with many columns, even if not all columns are referred in the query? (for making it simple, lets assume that all or varchar type of columns, I just want to find out if performance degrades if there are too many columns in table)
Thanks in advance,
Sandeep
View 1 Replies
View Related
Jan 11, 2007
I created two tables one is based on partition structure and one is non-partition structure.
File Groups= Jan,Feb.....Dec
Partition Functions='20060101','20060201'......'20061201'
I am using RIGHT Range in Partition function.
Then I defined partition scheme on partition function.
I have more than 7,00,000 data in my database.
I checked filegroups and count rows. It works fine.
But When I check the estimation plan time out for query it is same for both partition table and non partition table.
View 1 Replies
View Related
Jan 11, 2007
I created two tables one is based on partition structure and one is non-partition structure.
File Groups= Jan,Feb.....Dec
Partition Functions='20060101','20060201'......'20061201'
I am using RIGHT Range in Partition function.
Then I defined partition scheme on partition function.
I have more than 7,00,000 data in my database.
I checked filegroups and count rows. It works fine.
But When I check the estimation plan time out for query it is same for both partition table and non partition table.
View 1 Replies
View Related
Jul 11, 2007
Hello -- thank you for taking the time to read this.
I have a very large table that is used both for archives and new information. To get the current information, the table is queried by many different users at various polling periods. The SELECT required includes about fifteen JOINS, and only returns about 200 rows at any given time.
So I got to thinking if it might be faster to periodically run the big query as a SELECT INTO into a smaller table and letting the polling clients query the smaller table with SELECT *. Periodically, the smaller table would be DROPPED and refereshed with another SELECT INTO.
Trouble is, the data would have to be updated once every 30 seconds, and there are inbound polls coming at the rate of about 200 per minute. It got me to thinking what might happen if a client attemtped to query the smaller table when it was in the process of being dropped and refilled.
So my question is three-part:
1) assuming a larger table of about 500,000 records and only 500 pertinent at any given time, is there any real potential of performance enhancement by switching to a SELECT INTO table?
2) if so, is there a chance of a client failing a query if the inbound query somehow collides with the DROP/SELECT INTO procedure?
3) if so, is there any way to prevent it or a better way of doing this?
Thanks again for reading, and in advance for any help you can provide. I apologize if I sound like a dummy - it's hard to fake intelligence!
View 3 Replies
View Related
Oct 21, 2007
Hello,
Working with SQL Server 2000, I have a table with the following structure:
ID (INT)
userID (INT, foriegn key)
productID (INT)
productQTY (DECIMAL(5,2))
purchaseDate(smalldatetime)
I have about a 1000 users, entering about 20-30 rows per day each, i.e ~20,000 - 30,000 new rows per day. The table might be queried with a simple "SELECT" for the products a user ordered per day or per time frame (purchaseDate column).
My question (finally) is - when should I expect to see performance degradation? Is there anything I can do to prevent it (i.e splitting this table somehow to several tables)?
Thank you all in advance
View 2 Replies
View Related
Mar 31, 2008
Monthly, I copy a table from one database to another database. I delete the original table and copy the table back speed the performance of the query on the order of 10 to 1. Why does this work?
Detail:
I have a legacy table that a small application queries about once a month. The table was poorly designed and the query runs a date range comparison on one field and has a sub query that runs string comparison against six fields. I cannot change the calling app or table design. When the app calls the query, the call times out due to the inordinate length of time. To fix this until next months query, I copy the table out, delete the original and copy back. What changes when the table is copied to another database and then copied back? The performance of the query changes from 10sec to 1.
View 3 Replies
View Related
Sep 6, 2005
Hello all,I've following problem. Please forgive me not posting script, but Ithink it won't help anyway.I've a table, which is quite big (over 5 milions records). Now, thistable contains one field (varchar[100]), which contains some data inthe chain.Now, there is a view on this table, to present the data to user. Theproblem is, in this view need to be displayed some data from this onelarge field (using substring function or inline function returningvalue).User in the application, is able to filter and sort threw this fields.Now, the whole situation starts to be more complicated, if I would likecombine this table, with another one, where is one additional much morlarger field, from which I need to select data in the same way.Problem is: it takes TO LONG to select the data according to userrequest (user access view, not table direct)Now the question:- using this substring (as in example) is agood solution, or beter todo a inline function which will return me the part of this dataset(probably there is no difference)- will it be much faster, if i could add some fields in toSource_Table, containing also varchar data, but only this part whichI'm interested in and binde these fields in view instead off usingsubstring function?Small example:CREATE TABLE [dbo].[Source_Table] ([CID] [numeric](18, 0) IDENTITY (1, 1) NOT NULL ,[MSrepl_tran_version] uniqueidentifier ROWGUIDCOL NULL ,[Date_Event] [datetime] NOT NULL ,[mama_id] [varchar] (6) COLLATE Latin1_General_CI_AS NOT NULL ,[mama_type] [varchar] (4) COLLATE Latin1_General_CI_AS NULL ,[tata_id] [varchar] (4) COLLATE Latin1_General_CI_AS NOT NULL ,[tata_type] [varchar] (2) COLLATE Latin1_General_CI_AS NULL ,[loc_id] [nvarchar] (64) COLLATE Latin1_General_CI_AS NOT NULL ,[sn_no] [smallint] NOT NULL ,[tel_type] [smallint] NULL ,[loc_status] [smallint] NULL ,[sq_break] [bit] NULL ,[cmpl_data] [varchar] (100) COLLATE Latin1_General_CI_AS NOT NULL ,[fk_cmpl_erp_data] [numeric](18, 0) NULL ,[erp_dynia] [bigint] NULL) ON [PRIMARY]GOcreate view VIEW_AllDataasselect top 100 percentisnull(substring(RODZ.cmpl_data,27,10),'-') as ASO_NO,(RODZ.mama_type + RODZ.mama_Id) as MAMA,isnull(substring(RODZ.cmpl_data,45,5),'-') as MI,isnull(substring(RODZ.cmpl_data,57,3),'-') as ctl_EC,isnull(substring(RODZ.cmpl_data,60,3),'-') as ctl_IC,RODZ.Date_Event as time_time,RODZ.sn_no as SNFROMSource_Table RODZ with (nolock)goThanks in advanceMateusz
View 6 Replies
View Related
May 2, 2007
Hi :I have a TableA with around 10 columns with varchar and numericdatatypesIt has 500 million records and its size is 999999999 KB. i believe itis kbi got this data after running sp_spaceused on it. The index_size wasalso pretty big in 6 digits.On looking at the tableAit didnot have any pks and hence no clustered index.It had other indicesIX_1 on ColAIX_2 on ColBIX_3 on ColCIX_4 on ColA, ColB and ColC put together.Queries performed in this table are very slow. I have been asked totune up this table.I know as much info as you.Data prior to 2004 can be archived into another table. I need to run aquery to find out how many records that is.I am thinking the following, but dont know if i am correct ?I need to add a new PK column (which will increase the size of thetableA) which will add a clustered index.Right now there are no clustered indices2. I would like help in understanding should i remove IX_1, IX_2, IX_3as they are all used in IX_4 anyway .3. I forget what the textbox is called on the index page. it is set to0 and can be set from 0 to 100. what would be a good value for it ?thank you.RS
View 8 Replies
View Related
Aug 13, 2007
Right now, a client of mine has a T-SQL statement that does thefollowing:1) Create a temp table.2) Populate temp table with data from one table using an INSERTstatement.3) Populate temp table with data from another table using an INSERTstatement.4) SELECT from temp table.Would it be more efficient to simply SELECT from table1 then UNIONtable 2? The simply wants to see the result set and does not need tore-SELECT from the temp table.
View 1 Replies
View Related
Oct 19, 2007
Hi folks,
Environment:
We change from SQL Server CE 2.0 to SQL Server CE 3.0, and we got our customer complaining about the performance lost in the process that makes 50 updates in the Pocket PC with windows mobile 5.0.
Our customer says that when they used the SQL Server CE 2.0 the process was quicker.
Process:
We need to do an update to 50 rows every time we lose the focus. Those 50 update are creating a deterioration of performance.
There is any Patch to add to SQL Server CE 3.0? There is any problem updating 50 times on a row losing performance?
Thanks in advance.
View 1 Replies
View Related
Sep 28, 2007
I have an existing database with a table of about 50 milion records. There are also about 20 other tables, but they are alot smaller. The large table has a uniqueidentifier as it's Primary key (not sequential) and a forien key to a 'parent' table. The table also has a column telling when it was created. So, a bit simplified, it looks like:
ChildTable
---------------
Id uniqueidentifier <PK>
ParentId uniqueidentifier <FK>
CreationDate DateTime
ParentTable
-----------------
Id uniqueidentifier <PK>
CreationDate DateTime
Most of the questions accessing the Child table (the large table) is doing so by referensing the parent table, and not the CreatingDate, i.e.
SELECT *
FROM ChildTable
WHERE ParentId = '......'
All records with a specific ParentId will have very similiar CreationDates.
Now, my question is, will Partitioning the ChildTable boost performance for me? In case it will, what column(s) would define the Partitions? If I do it by CreationDate, a select-query like the one above will have to scan all partitions anyway, doesn't it? Doing it by Id isn't soo easy either I guess? If it helps, it might be possible to change the primary keys in the tables to have sequential guids.
Is there perhaps a performance tool to get help with suggestions about how to partition the table? Something like the 'Performance dashboard' reports, but for partitioning?
Regards Andreas
View 10 Replies
View Related
Aug 15, 2007
We have a table that is 800GB. We are planning to re-build the clustered index on this table to a different filegroup. The new filegroup and files associated with it will sit on a SAN which will have a 1.5TB allocation. Does anyone have any suggestions in regards to how many files to have associated with the filegroup to provide optimal performance? Apparently we could have 3 LUNS (500gb each), so would 1 file on each LUN provide additional performance as opposed to one file on 1 LUN?
View 1 Replies
View Related
Aug 18, 2006
I am making a ASP.NET web application that involves 2 SQL Server(A & B).
I created a view in SQL server A pointing to the table in SQL Server B. I found out my application will run REALLY slow when accessing such a view. so I try to avoid using them. But in the case of 2 table joining from 2 different SQL Servers, I have no choice.
Can anyone help me with this?
Thanks!
View 4 Replies
View Related
Feb 11, 2004
I need to know what is the best performance for needing to do calculations for a particular column. I want to do something like:
Select IID
, ItemNo
, StdRun
, ActRun
, dbo.fnCalc(OutCount)
From myTable
The function is basically a set of Case Statements and various calculations dependant upon the Case.
Is this the best (performance wise) way to do it or should I dump the needed info in a Temp Table and do the calcs on it and then tie the select statement to the table.
I've seen both approaches done, but they both seem to be a different way of getting to the same conclusion. I'm just wondering which puts the lightest load on the server.
Thanks,
Tim
View 2 Replies
View Related
Sep 20, 2007
I have two table A and B:
A 2000000 Rows 569288KB 8KB index
B 3000000 Rows 853712KB 8KB index
But when do "SELECT COUNT(*) FROM A/B", table B is significantly slower than A:
A 0 secend
B 8 seconds
Does anyone know why? So I can boost the performance to search table B.
Thanks in advance.
View 8 Replies
View Related
Nov 19, 2007
I have two databses SIS and SIS_Pro. Users tables should be used in both of them because I have some relations between this table with other table in SIS and SIS_Pro. Users in SIS only have one column and it is the UserId which is the primary Key in both of them, but in SIS_Pro Users table have Firstname Lastname and... now. In my program I need some informatin from SIS and some from SIS_Pro so I create a view which is joining of forexample exam in SIS and Users in SIS_Parnian, becuase I don't have the firstname and lastname in a Users table which is in SIS_Pro databse.Does it reduce the performance?is it better to copy datas which are in Users in SIS to Users in SIS_Pro( I mean all columns firstname, lastname ,,.....)
Sincerely
Kianoosh
View 1 Replies
View Related
Sep 16, 2006
An application uses a database table having proprietary information. We do not want our customer to be able to look at that.
This being a real-time application, performance can not be sacrificed. What is the best way to keep the table data non-viewable without sacrificing the performance?
View 4 Replies
View Related
Sep 15, 2015
Altering a table which is having more than 100 million rows. Would like to know the best possible way to add a new column to this table without impacting the performance much.
View 7 Replies
View Related
Jul 9, 2001
In my database/MY SERVER (SQL7/Win2K), I run a simple query with a Table/10000 rows (without cluster index):
SELECT * FROM TABLE
it take over 30s. Why it's slow? How can I check for reason? How to configure my server to improve performance?
Thanks in advance.
TH
----------------------------------
SP_CONFIGURE's RESULT in MY SERVER
----------------------------------
Table 'spt_values'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0.
name minimum maximum config_value run_value
----------------------------------- ----------- ----------- ------------ -----------
affinity mask 0 2147483647 0 0
allow updates 0 1 1 1
cost threshold for parallelism 0 32767 5 5
cursor threshold -1 2147483647 -1 -1
default language 0 9999 0 0
default sortorder id 0 255 52 52
extended memory size (MB) 0 2147483647 0 0
fill factor (%) 0 100 0 0
index create memory (KB) 704 1600000 0 0
language in cache 3 100 3 3
language neutral full-text 0 1 0 0
lightweight pooling 0 1 0 0
locks 5000 2147483647 0 0
max async IO 1 255 32 32
max degree of parallelism 0 32 0 0
max server memory (MB) 4 2147483647 2147483647 2147483647
max text repl size (B) 0 2147483647 65536 65536
max worker threads 10 1024 255 255
media retention 0 365 0 0
min memory per query (KB) 512 2147483647 1024 1024
min server memory (MB) 0 2147483647 0 0
nested triggers 0 1 1 1
network packet size (B) 512 65535 4096 4096
open objects 0 2147483647 0 0
priority boost 0 1 1 1
query governor cost limit 0 2147483647 0 0
query wait (s) -1 2147483647 -1 -1
recovery interval (min) 0 32767 0 0
remote access 0 1 1 1
remote login timeout (s) 0 2147483647 5 5
remote proc trans 0 1 0 0
remote query timeout (s) 0 2147483647 0 0
resource timeout (s) 5 2147483647 10 10
scan for startup procs 0 1 0 0
set working set size 0 1 0 0
show advanced options 0 1 1 1
spin counter 1 2147483647 10000 10000
time slice (ms) 50 1000 100 100
two digit year cutoff 1753 9999 2049 2049
Unicode comparison style 0 2147483647 196609 196609
Unicode locale id 0 2147483647 1033 1033
user connections 0 32767 0 0
user options 0 4095 0 0
Table 'spt_values'. Scan count 43, logical reads 108, physical reads 0, read-ahead reads 0.
Table 'sysconfigures'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 2.
View 4 Replies
View Related
Oct 16, 2013
I need to figure out the correct update statement syntax for the following integration.
I have a "Performance Table" which i insert weekly performance numbers into for each store. The table is constructed w/ columns such as Store, Weekenddate, Sales, Refunds, #ofPatients
In a "Averages Table" i have every weekenddate for each store populated. So 52 Weeks for 10 stores = 520 Rows of Store numbers & WeekendDates.
What i would like to do is run a loop or update statement which would update the store average for each weekendate based on the last 13 weeks.
This is my query
update performancestore_avgs set SalesAvg =
(select sum(SalesHit)/Count(Store) from performance_store where performance_store.weekenddate >= performancestore_avgs.weekenddate-84 and performancestore_Avgs.store = performance_store.store)
The update statement runs but the averages are completely wrong.
View 3 Replies
View Related