DataWarehouse - 140 Million Rows - Load Performance Issue
Mar 10, 2000
Need help, I am managing a Data Warehouse (80 G.B Database Size), I purge older than 6 months data from a table which has more than 140 Million rows on daily basis. The daily data load performance is degrading. The table has no clustered indexes (only non-clustered indexes).
Tried dropping and rebuilding the non-clustered indexes, didn't work.
One way to solve the problem is drop the non-clustered index, bcp out the data, truncate the table and bcp in the data and rebuild the non-clustered indexes. This is too risky and taking 14 hours to bcp out the data.
This was not the issue in SQL Server 6.5, because SQL 6.5 always insert new record indexes at the end of the heap link (heap = non-clustered indexes without clustered index). In contrast, SQL Server 7.0 first checks for available space in existing pages by using percent free space pages (this is where it is killing the performance ).
Thanks for your help!
View 4 Replies
ADVERTISEMENT
Mar 2, 2007
Could not load file or assembly 'Microsoft.DataWarehouse, version 9.0.242.0, Culture=neutral,publicKeyToken=89845dcd8080cc91' - the system cannot find the file specified.
I get this message when I try to do a forcast from the Data Mining - Forecast menu.
Forecast from Table Tools, Analyse, Forecast works perfect.
Running Vista and using the DMAddins_SampleData workbook.
Any ideas ?
Trond
View 8 Replies
View Related
Dec 12, 2014
I run the following statement and it will not update beyond 7 million plus rows and I have about 38 million to complete. I keep checking updated row counts and after 1/2 day it's still the same so I know something is wrong because it was rolling through no problem when I initiated it. I need to complete ASAP so it's adding to my frustration. The 'Acct_Num_CH' field is an encrypted field (fyi).
SET rowcount 10000
UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL
WHILE @@ROWCOUNT > 0
BEGIN
SET rowcount 10000
UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL
END
SET rowcount 0
View 5 Replies
View Related
Apr 9, 2008
I'm new to using a DB and have a few questions about what I'm trying to do. I have some historical options data and want to place it into a sql express database. (I understand I might need to use a none express version once the db gets to big.) A months worth of data is over 5.5 million rows of data. So six years worth is ~400 million rows. Is it possible to put this into a sql db and be able to search it very fast? I have a months worth in a db now and it is pretty slow. Should I use a new table for each month and then have 6 years * 12 month = 72 tables to increase the search speed? I search by date and stock_symbol and the data looks like this:
Date, Stock_Symbol, Option_Symbol, Strike, BidPrice, AskPrice, Volume, OpenInterest, (and a few others)
The select statement is simple: SELECT * FROM Options WHERE Date = @Date and StockSymbol = @Symbol
Thanks
View 4 Replies
View Related
Mar 21, 2000
In our database, we have a very large table that gets updated every morning, start of the day is copying 4 million rows from the fact table from previous date to today's date in the same table and then some other processing. It takes 1 1/2 to 2 hrs to do this. There is a dts package created to copy these rows into temp table and then to this fact table.
This table has more than 200 million rows
Any ideas on how to accomplish this without doing the copy twice and not running into locking problems.
Thanks for any suggestions.
View 5 Replies
View Related
Jul 16, 2013
i am deleating 8 Million rows from my database,I am wondering how to control T-Log,also I heard something about row lock and table lock
View 4 Replies
View Related
Feb 27, 2015
i have a following table
table name : emp_master
empid efname emname elamane efathername emothername deptno edob edoj createdby updateby lastupdatedatetime lastactionperformed
empid is primarykey.
this table contains 20million of records and i want to fire following query on this to get employye all data where eployee is more than 10 year old
select empid ,efname, emname, elamane, efathername, emothername, deptno ,edob ,edoj ,createdby, updateby, lastupdatedatetime ,lastactionperformed
from emp_master
where year(doj)+10 > year(getadate())
this will return approx 10 million rows and taking 18 mins. tune this query what approaches should i take to reduce the time of execution.
View 6 Replies
View Related
Mar 24, 2008
Hi,
i have to load 1million rows from database( or flatfile) to the database(or flat file).
which task is used as the best solution for this?
Appreciate any assistance in this regard.
Thanks,
Das
View 5 Replies
View Related
May 4, 2000
I need to update about 1.3 million rows in a table of mine.
I am getting the data from one of the columns of the same table and
updating the new column.
I am doing this using a cursor which I have put in a stored procedure.
As this is a production table which users might be accessing.It is a
web based application and I can't slow the system down.
So I am willing to run the stored prcedure during off peak hours.
However, do I need to put this in a transaction?
If I did put it in a transaction what type of isolation level should I
opt for?
Data integrity is very important for me and I don't mind to compromise
on the performance.
I am doing this because one of the columns which has "short description"
entry is has become too small for business purposes and we want to increase it's
length from varchar(100) to varchar(150).
As this is SQL 6.5, I can't increase the lenght of the column.
So Iadded a new column and will run the stored proc.
What precautions are to be taken?
This is on a high priority basis and very important too.
Thanks in advance...
Stored procedure code:
USE DB_Registration_Dev
GO
IF EXISTS (SELECT NAME FROM SYSOBJECTS WHERE NAME='usp_update_product' AND TYPE='P')
DROP PROCEDURE usp_update_product
GO
CREATE PROC usp_update_product
AS
DECLARE @short_desc varchar(100)
DECLARE @prod_id int
DECLARE sdesc_curs CURSOR
FOR
SELECT [Product].[product_id] , [Product].[short_description]
FROM Product
OPEN sdesc_curs
FETCH NEXT FROM sdesc_curs
INTO @prod_id, @short_desc
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE Product
SET [Product].[sdesc] = @short_desc
WHERE Product_id=@prod_id
FETCH NEXT FROM sdesc_curs
INTO @prod_id, @short_desc
IF @@FETCH_STATUS <> 0
PRINT ' Finished Updating the table...go ahead and have fun ...! '
END
DEALLOCATE sdesc_curs
GO
View 1 Replies
View Related
Mar 12, 1999
Hi, I used the /e in my bcp code. yet did not get all the rows from the main frame into the sql talbes... here is the case I have 11 million rows in an ftp server I use this code to bcp into sql server can anyonecheck if this code is good for the process, I am missing one million row in the bcp process and do not know why??? I put the /e to see if there is any error but could not see any error file in my hard drive?
Please check it out and let me know
regards
Ali
Exec master..xp_cmdshell "bcp dbname..tablename in c:ftprootNbtorder.txt /fd:ftprootformatfileablename.fmt /Servername /Usa /Password /b250000 /a8000 /eerrfileORD"
View 2 Replies
View Related
Feb 27, 2008
I'm looking for some performance assistance on updating a column value in a table that contains approximately 50 million rows. I have a permanent table in another database that has the key column and value to be set. My query is listed below, but I'm afraid it will run quite awhile. Any suggestions would be appreciated.
update mytable
set column2 = b.column2
from mytable as a
join mytable1 as b
on a.column1 = b.column1
There is a one to one relationship between the two tables.
View 8 Replies
View Related
Aug 21, 2007
I have 1+ CSV files (using a foreach loop) which I'm doing a lot of transform work on and then inserting into a SQL database table.
Each CSV file usually contains about 2 days worth of data (contains date stamps) - somewhere in the region of 60k records per day.
The destination table currently contains 3 million+ rows and will get bigger.
I need to make sure that before inserting into the destination table, the data doesn't already exist.
I've read the following article: http://www.sqlis.com/311.aspx
While the lookup method works, it takes ages and eats up memory as it caches the 3m+ records before running for each CSV. Obviously this will only get worse as the table grows in size.
To make things a little more efficient what I'd like to do, is first derive the dates I'm dealing with in the current file - essentially storing the max(date) and min(date) in variables. Then in the lookup SQL use those vars, to reduce the amount of data that needs to be brought into the transformation to check against before inserting into the destination table.
Lookup SQL eg. SELECT * FROM MyTable WHERE Date BETWEEN varMinDate AND varMaxDate.
Ideally I'd use an aggregate transformation and then use the subsequent output from that either in the lookup query or store the output in vars, but I don't think you can do that and I get the feeling I'm approaching this with the wrong mindset.
Any thoughts would be great!
View 6 Replies
View Related
Jun 12, 2015
I have a requirement to delete 1 Million records from a table having 10 Million data and it's being queried on 24/7 basis (don't have a downtime). how can I achieve that?
View 13 Replies
View Related
Feb 1, 2005
1) I write a select statemnet
select a.columname1, b.columname1
from table1 a,table2 b
where a.columname2 = b. columname3
How do i compare
a.columname2 <--> int type column
while b. columname3 <--->varchar type
how should i use convert function
2) whats the best way to import 1.5 million rows ?
How about text file
View 2 Replies
View Related
Jul 29, 2014
I am doing a performance testing for In-memory option is sql server 2014. As a part I want to insert 500 million rows of records into a in-memory enabled test table I have created.
I need a sample script to insert 500 million records into a table ....
View 9 Replies
View Related
Sep 17, 2015
I have been tasked with writing an update query to update a table with more than 150 million rows of data. Here are the table structures:
Source Tables :
OC
CREATE TABLE [dbo].[OC](
[OC] [nvarchar](255) NULL,
[DATE DEBUT] [date] NULL,
[DATE FIN] [date] NULL,
[Code Article] [nvarchar](255) NULL,
[INSERTION] [nvarchar](255) NULL,
[Code] ....
The update requirement is as follows:
DECLARE @Counter INT=0 --This causes the @@rowcount to be > 0
while @@rowcount>0
BEGIN
SET rowcount 10000
update r
set Comp=t.Comp
[Code] ....
The update took more than 48h and didn't terminate , how to accelerate it ?
View 6 Replies
View Related
Aug 30, 2007
I have a row that is being used log track plays on our website.
Here's the table:
CREATE TABLE [dbo].[Music_BandTrackPlays](
[ListenDate] [datetime] NOT NULL DEFAULT (getdate()),
[TrackId] [int] NOT NULL,
[IPAddress] [varchar](20)
) ON [PRIMARY]
There's a CLUSTERED INDEX on ListenDate ASC and a NON CLUSTERED INDEX on the TrackId.
I have a TRIGGER on the Music_BandTrackPlays table that looks like the following:
CREATE TRIGGER [trig_Increment_Music_BandTrackPlays_PlayCount]
ON [dbo].[Music_BandTrackPlays] AFTER INSERT
AS
UPDATE
Music_BandTracks
SET
Music_BandTracks.PlayCount = Music_BandTracks.PlayCount + TP.PlayCount
FROM
(SELECT TrackId, COUNT(*) AS PlayCount
FROM inserted
GROUP BY TrackId) AS TP
WHERE
Music_BandTracks.TrackId = TP.TrackId
When a simple INSERT statement is done on the Music_BandTrackPlays table, it can take quite a long time. When I remove the TRIGGER the INSERTs are immediate. The Execution plan for the TRIGGER shows that a 'Inserted Scan' is taking up most of the resources.
How exactly is the pseudo 'inserted' table formed?
For now, I think the easiest thing to do is update my logging page so it performs 2 queries. One to UPDATE the Music_BandTracks table and increment the counter, and perform the INSERT into the Music_BandTrackPlays table seperately.
I'm ok with that solution but I would really like to understand why the TRIGGER is taking so long. The 'inserted' pseudo table will be 1 row 99% of the time. Does SQL Server perform a table scan on all 20 million rows in order to determine what's new and put it in the inserted pseudo table?
Thanks!
View 6 Replies
View Related
Apr 24, 2013
IF NOT EXISTS (SELECT TOP 1 1 FROM dbo.syscolumns WHERE id = OBJECT_ID(N'dbo.Employee) and name = 'DoNotCall')
BEGIN
ALTER TABLE [dbo].[Employee] ADD [DoNotCall] bit not null Constraint DoNot_Call_Default DEFAULT 0
IF ( @@ERROR <> 0 )
GOTO QuitWithRollback
END
It just takes a LOT of time in SQL Server Management studio. I have to cancel the query and cancelling takes a whole lot time. I am using SQL Server 2008.
View 4 Replies
View Related
Aug 17, 2007
Hi i tried designing a SSIS package which loads only those rows which were different from existing rows in the table , i need to timestamp the existing row with an inactive date when a update of that row is inserted (ex: same studentID )
and the newly inserted row with a insert time stamp
so as to indicate the new row as currently active, in short i need to maintain history and current rows in same table , i tried using slowly changing dimension but could not figure out, anyone experience or knowledge regarding the Data loads please respond.
example of Data would be like
exisiting data
StudentID Name AGE Sex ADDRESS INSERTTIME UPDATETIME
12 DDS 14 M XYZ ST 2/4/06 NULL
14 hgS 17 M ABC ST 3/4/07 NULL
New row to insert would be
12 DDS 15 M DFG ST 4/5/07
the data should reflect
StudentID Name AGE Sex ADDRESS INSERTTIME UPDATETIME
12 DDS 14 M XYZ ST 2/4/06 4/5/07
12 DDS 15 M DFG ST 4/5/07 NULL
14 hgS 17 M ABC ST 3/4/07 NULL
Please provide your input as much as you can even though it might not be a 100% solution.
View 4 Replies
View Related
Sep 22, 2015
I'm trying to improve the loading of some tables with large amounts of data that forms part of an ETL. I was going to try removing any indexes before the inserting to speed up the process, but I had some questions on whether or not I should include the clustered index (assuming one exists).
I was originally planning on including a step to disable all indexes on the destination table using the following:
ALTER INDEX ALL ON MyTable DISABLE
Once the load had finished I'd simply rebuild all the indexes.
should I simply disable the non-clustered indexes?
View 9 Replies
View Related
Nov 27, 2007
I think this question has been asked number of times. However, I amlooking for some specific information. Perhaps some of you can helpclose the gap. Or perhaps you can point me towards right direction.Perhaps this group can help me fill in ms-sqlserver related followingquestions.1. Do this database have data Clustering capabilities?1a. If yes, what mechanism is used such as shared disk, share nothing,etc.2. Do these dB have Security features?2a. If yes, what security features are supported? For instance do theysupport encryption or SSL connection?3. How does the database perform and what is the criteria for theperformance matrix?4. Do they have inbuilt load balance capabilities?I want to thank everyone for taking your time to read thiscorrespondence. I will also greatly appreciate your efforts in sharingyour thoughts.Regards,Manish
View 3 Replies
View Related
Aug 31, 2007
Hello!! guys..
I am using sql server 2005 enterprise edition in the clustered environment with netapp storage (one server) I monitor the server performance, basically using windows performance monitor counters such as avg.disk queue length,avg. disk reads/sec, avg disk writes/sec, processor time, available memory Mbytes, cache hit ratio etc. and using some sql server dmvs .
At this point all looks good.
My problem i dont know how much load sql server can handle more .How much it being utilized now? whehet it has reached its limits...how do i know it is just about to reach its linits??
how much throughput sql server is handling at present, how much more it can handle if incoming traffic increases.
Basically i want to find performance baselines , how much more it can handle so that i can plan for the future ..
How do i measure all this?
what are different methods available if i want to handle increased incoming traffic? (e.g. adding multiple servers etc.)
It will be really great if someone can share his experience on this..
Is there any article/white paper on this ..
Any suggestion/help appreciated
Thanks
View 1 Replies
View Related
Feb 13, 2006
Is there any information around what the SSIS packages are doing in the first 5-10 seconds of execution, and ways to speed this process up?
View 3 Replies
View Related
May 2, 2006
Hello to all the members.
I have a SQl server on a Win2000 box. The server is located behind a screened subnet, and has no access to the outside world except for 1433 and 445 inbound from a webserver located in a DMZ, forward of the internal firewall. Rules are very tight, and the IIS box is patched and has been hardened further including running urlscan. (allowed verbs=get,head,post. all executable and script types disallowed, permissions very tight) I have written an appication that monitors bad http requests in real time to check that the URL scan is working, and it seems to be. The only way to get to the SQL box, would be running injection on page code (majority written by myself no raw sql) on the web server or launching an exploit over HTTP against known ADODB flaws on the web server. The SQL server is a domain member, the IIS server is a standalone. If theres a way to compromise the SQL box without first breaching both firewalls, or first compromising the IIS server, I dont know how to do it. Still, the sql box began acting up, refusing to accept a terminal services connection (the error stated that there was a failure to load Win32.sys) No sign that the machine had been compromised. I restored from a complete back up, which seemed to fix the terminal services problem, but now the sql performance counters fail to load. Specifically MQPERF.dll. Any ideas on what may have caused the server to have 'issues' (prevailing theory centers around evil spirits), and what can I do to get the performance counters to load. Tried replacing the library.
Thanks in advance,
VLG
View 4 Replies
View Related
Feb 8, 2006
Hi I Need to know about datawarehouse what I need to start working with dw.
View 1 Replies
View Related
Aug 11, 2014
I need to use Bulk insert statement for copying a table with 200 million rows to another table on the same server...the table has no primary key or identity column.... script for BULK INSERT ...
View 9 Replies
View Related
Sep 29, 2006
We are working on a DataWarehouse app. The DW has been loaded wiith transactional data from the start of September. and we want refresh the DW with a full load from the original source. This full load wil consist largely of the same records that we loaded initially in the DW but some records will be new and others will have changed.
During the load I want to direct input records NOT already in the DW to a "mods" table and ignore those input records that alreayd exist in the DW. Can SSIS help out with this task?
TIA,
Barkingdog
View 1 Replies
View Related
Jan 7, 2005
Hi Guys,
I am looking for a large datawarehouse database implemented in SQL Server to use it to run some examples in my thesis. if someone knows where I can find a large datawarehouse database (e.g Sales database) please tell me.
I appreciate any help.
Thanks
View 1 Replies
View Related
Feb 19, 2008
I have been advised to design multidimensional model based on the operation data model.
What I have is ----
Couple of documents about the application.
Data model of operational data.
Now I have been told to create a datawarehouse from which any type of wuestion can be answered.
This seems horrendous.
Please guide.
View 5 Replies
View Related
Oct 11, 2000
I am re-engineering the data warehouse and my client is currently using autogenerate keys, their concern is that after a certain amount of keys (can't remember the figure) sql server starts having problems, dose anyone know how i should handle it when i am doing the designing?
thanks any input will be appreciated
View 2 Replies
View Related
Apr 22, 2006
i am writing a graduation thesis about clickstream datawarehouse with SQL Server , now who can give me some samlie about it? I just only want to learn it,thank you! :)
View 1 Replies
View Related
Mar 8, 2002
I am testing different raid setups in a Datawarehousing environment.
E.G
3 X Raid 5 (with four filegroups)
3 X Raid 10 (with four filegroups)
At the moment I have chance to trial different hardware configurations.
I want to measure what are the performance differences
I need to stress test these configs. Is there any tools I can use to do this.
Any loads I can use. What should I measure.
Are there any aricles for the environment
Any help would be great
Pete
View 2 Replies
View Related
Jun 4, 2007
Hi
Iam using SQl server 2005 Integration Service to transfer the data from Source Data Base to Datawarehouse DB. Here my main problem is after trnsfering the data from main Data Base to datawarehouse DB. If any records is delete from main database, how can we delete same records in Datawarehouse DB. If I want to delete any record in the datawarehouse that record mapped with some other tables [Foriegn Keys].
Regards,
Hanu
View 1 Replies
View Related