Performance Issue With Huge Data
Mar 20, 2007
Hello All,
I am using SSIS to transfer data between two SQL Servers (2000). There is no transformation involved, as the source and destination table structures are the same. Even then, the package execution takes a lot of time.
The data in the tables is of the order of 66,000,000 rows, and we were required to kill the package execution after it had run for more than 24 hours. The CPU usage was more than 13,000 s and disk I/O was well above 330,000,000. I am new to the tid-bits of SSIS. Can anyone please tell me why the package has become so resource hungry?
Thanks in advance,
Atul
View 3 Replies
Jan 23, 2007
I have a medical DB that loads 150,000 transactions per month. Each month, I load the transactions into a table for the current year. I also have to update records for prior months based on current-month information.
For example, out of 186,000 dump records...150,000 will be loaded into the main table and 36,000 will be used to update records already loaded into the main table.
The tables have 90 columns, I have a clustered PK using [Soc_Sec_Number] & [Month] & [Row Index]. I need the row index counter (like auto number in MS Access) because I can have multiple transactions per month for the same Soc Sec Number.
===========================================================
My steps are
1) Load 150,000 records into the main table (for December, this brings the table to 1,800,000 rows).
2) Run queries for the remaining 36,000 rows to update records already loaded into the table containing the 1,800,000 rows.
3) The 36,000 rows have to be split depending upon the update type code, so I am actually running 6 queries of 6,000 rows each against the 1,800,000 records.
The update queries are using inner joins with [Soc Sec #] and [Date], part of my composite Primary Key on both tables.
=============================================================
Problem
This process takes forever, about 4 hours per monthly update. As the months go on, the main table gets larger and the time increases. It took almost 24 hours to get from January 2004 to June 2004.
I am running SQL Server on my PC, no separate server. My PC is 2.8 GHz with about 1 GB of RAM. Could my PC specs be too low? I noticed that Task Manager shows sqlservr.exe using over 657,000 K of memory when running.
I also ran a simple Select MAX(Soc_Sec_Number) query that took over 5 minutes. This is way too long especially since Soc_sec_Number is part of the composite PK.
Could my queries actually take that long, or are my PC specs too low? My PC seemed to freeze after the June update. Any help appreciated.
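For illustration, one of the six monthly update queries might look like this batched version; table and column names are hypothetical, and the SET ROWCOUNT batching assumes SQL Server 2000-era syntax:
-- MainTable holds the year's 1,800,000 rows; UpdateStage holds the 36,000 update rows (hypothetical names)
SET ROWCOUNT 10000                 -- cap each batch so locks and the transaction stay small
WHILE 1 = 1
BEGIN
    UPDATE m
    SET    m.Some_Column = u.Some_Column
    FROM   MainTable m
           INNER JOIN UpdateStage u
               ON  m.Soc_Sec_Number = u.Soc_Sec_Number
               AND m.[Month]        = u.[Month]
    WHERE  u.Update_Type_Code = 1
      AND  m.Some_Column <> u.Some_Column   -- skip rows already updated so the loop terminates
    IF @@ROWCOUNT = 0 BREAK
END
SET ROWCOUNT 0
An index on UpdateStage(Soc_Sec_Number, Month, Update_Type_Code) would keep each batch a seek rather than a scan of the staging rows.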
View 17 Replies
View Related
Oct 26, 2007
I was trying to performance-tune a query and came up against this weird issue (weird to me at least, possibly because I am a newbie :-)).
When I run the query mentioned below, I get the results in 1 second, while the same query, if I declare a variable for the value I am comparing, takes 40 seconds. I am on SQL Server 2000. Both queries are run in the same DB on the same tables, and the only difference between the two is that the date has been declared as a variable. I am able to consistently reproduce this. I tried casting the variable, declaring it as varchar, and replacing the date with getdate() in both places, but the moment I use the variable, the performance goes for a toss. Has anyone come across any similar issues?
TIA
Callista
Query 1
select ae.Col1, ae.Col2, ae.Col3, ae.Col4, ge.Col4, c.Col5, 0
from table1 ae WITH (INDEX=ind_Col3 NOLOCK), table2 ge with (nolock), table3 c with (nolock)
where ae.Col1 = ge.Col1
AND ae.Col4 = c.Col1
AND ae.Col3 > '2007-10-25 14:18:12.380'
and ae.Col6 is not null
Query 2
declare @MyVariable datetime
select @MyVariable = '2007-10-25 14:18:12.380'
select ae.Col1, ae.Col2, ae.Col3, ae.Col4, ge.Col4, c.Col5, 0
from table1 ae WITH (INDEX=ind_Col3 NOLOCK), table2 ge with (nolock), table3 c with (nolock)
where ae.Col1 = ge.Col1
AND ae.Col4 = c.Col1
AND ae.Col3 > @MyVariable
and ae.Col6 is not null
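A likely explanation, offered as a hedged guess: on SQL Server 2000 the optimizer cannot see the value of a local variable at compile time, so it falls back to a generic estimate instead of the index histogram. One workaround is to pass the value as a parameter through sp_executesql, since parameters, unlike local variables, are sniffed at compile time:
declare @MyVariable datetime
select @MyVariable = '2007-10-25 14:18:12.380'
-- the plan for the inner statement is compiled with the actual parameter value
exec sp_executesql
    N'select ae.Col1, ae.Col2, ae.Col3, ae.Col4, ge.Col4, c.Col5, 0
      from table1 ae WITH (INDEX=ind_Col3 NOLOCK), table2 ge with (nolock), table3 c with (nolock)
      where ae.Col1 = ge.Col1
      AND ae.Col4 = c.Col1
      AND ae.Col3 > @p1
      and ae.Col6 is not null',
    N'@p1 datetime',
    @p1 = @MyVariable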
View 5 Replies
View Related
Dec 13, 2007
Hi,
I created a CLR UDF that returns a large number of rows. When I run it from my VPC (XP, SQL Server Developer Edition, 1 GB memory) it takes approx 2 min 30 sec to start displaying the rows (using Management Studio). When I run the same query on our development server (Win 2003, SQL Server Enterprise Edition, 8 GB memory and 8 processors) it takes more than 15 min to start displaying the results. Does anybody have an idea why this is happening?
Thanks in advance
View 2 Replies
View Related
Jun 22, 2015
I have encountered a problem with a specific set of tables. The same select yields slightly differing execution plans in two different environments (instances), but the slight variation seems to involve huge differences in the statistics. I don't know the significance of these stats. The two tables have exactly the same indexes.
This is the select statement:
SELECT 'xx' FROM DUKS.dbo.Profiler
WHERE DNA_Løbenummer IN
(SELECT DNA_Løbenummer FROM DUKS.dbo.Effektregister
WHERE Sagsnummer = '2015-00002')
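When the same query compiles into different plans on two instances, the usual first suspect is the statistics rather than the indexes themselves. A minimal sketch for comparing them on both instances; the index name IX_Sagsnummer is hypothetical:
-- refresh the statistics the optimizer uses for the subquery's predicate and the join
UPDATE STATISTICS DUKS.dbo.Effektregister WITH FULLSCAN
UPDATE STATISTICS DUKS.dbo.Profiler WITH FULLSCAN
-- inspect the histogram behind the Sagsnummer predicate on each instance and compare
DBCC SHOW_STATISTICS ('DUKS.dbo.Effektregister', IX_Sagsnummer)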
View 17 Replies
View Related
Jun 7, 2006
I need to periodically import a (HUGE) table of data from an external data source (not SQL Server) into SQL Server, with the following scenarios:
1) Some of the records in the external data source may not exist in SQL Server.
2) Some of the records in the external data source may have a different value at different imports, but these records are identified univocally by the same primary key in the external data source and in SQL Server.
3) Some of the records in the external data source may be the same as in SQL Server.
Due to the massive volume of the import, I would like to import only the records which are different from what I have in SQL Server (cases 1 and 2 above). In fact case 2 is the most critical.
I thought of making a query with a left outer join between the data in the external data source table (SOURCE) and the data in the SQL Server table (DESTIN). The join is done on the respective primary keys (composed keys of up to 10 columns) and one of the WHERE conditions will be that the value in SOURCE is different from the value in DESTIN.
The result of this query would be exactly what I need to import.
How can I do this in SSIS? I couldn't figure out how to join tables in different data sources yet.
In fact, I cannot write a stored procedure to do it, since one of the sources is a data source other than SQL Server.
I have seen the Lookup transformation in this article http://www.sqlis.com/default.aspx?311 but this is not exactly what I want to do.
Another possibility is to use the Merge Join, but due to the sorting I believe its performance would be terrible!
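If it helps, here is the kind of comparison query I have in mind, assuming the external data is first staged into a SQL Server table (all names hypothetical):
SELECT s.*
FROM   StagingSource s
       LEFT OUTER JOIN Destin d
           ON  s.Key1 = d.Key1
           AND s.Key2 = d.Key2       -- and so on for all columns of the composite key
WHERE  d.Key1 IS NULL                -- case 1: the record does not exist in SQL Server
   OR  s.SomeValue <> d.SomeValue    -- case 2: the record exists but its value changed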
Thanks in advance for your suggestions!
View 9 Replies
View Related
Jul 7, 2015
We have a daily process which copies millions of rows of data from one DB to another over a linked server. Just checking on best practice: are there more efficient ways than the linked server to copy millions of rows from one DB to another? I checked bulk insert, but that transfers data from a file to the DB, not DB to DB.
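One hedged alternative, assuming a linked server named SRCSRV and hypothetical table names: a single set-based INSERT...SELECT, with OPENQUERY so the query runs on the remote server before the rows travel:
-- TABLOCK allows minimal logging under the simple or bulk-logged recovery model (SQL Server 2008+)
INSERT INTO dbo.DestTable WITH (TABLOCK) (Col1, Col2)
SELECT Col1, Col2
FROM   OPENQUERY(SRCSRV, 'SELECT Col1, Col2 FROM SourceDB.dbo.SourceTable')
SSIS or bcp out/in via a file are the other common routes when the row counts climb into the tens of millions.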
View 6 Replies
View Related
Apr 3, 2000
SQL 7 SP1 NT4 SP5
I have a TRANSACTION table with 150 million rows.
I have a USER table.
Each user has about 600 records in the TRANSACTION table.
The TRANSACTION clustered index is on USERID + RECID. The second index is on USERID + Fieldx + Fieldy.
The TRANSACTION table gets about 1.4 million inserts in a normal day and about 40,000 updates.
I want to go through the USER table and delete all users who have not visited me in a while.
I want to do this without substantially hindering performance in a production environment. I can perform this over a week period or two if needed.
The best way I thought of doing this was to grab x amount of users in a cursor and loop through deleting their corresponding TRANSACTION records.
Does anyone have any ideas on a better way? What is going to happen to my indexes during this time?
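For what it's worth, a sketch of the batched delete I had in mind; SET ROWCOUNT works on SQL 7, and the LastVisit column is hypothetical:
SET ROWCOUNT 5000                  -- at most 5000 rows per batch
WHILE 1 = 1
BEGIN
    DELETE t
    FROM   [TRANSACTION] t
           INNER JOIN [USER] u ON u.USERID = t.USERID
    WHERE  u.LastVisit < DATEADD(month, -6, GETDATE())
    IF @@ROWCOUNT = 0 BREAK
    WAITFOR DELAY '00:00:05'       -- let production activity breathe between batches
END
SET ROWCOUNT 0
-- remove the USER rows themselves once their transactions are gone
DELETE FROM [USER] WHERE LastVisit < DATEADD(month, -6, GETDATE())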
Thanks !!!
View 3 Replies
View Related
Jun 12, 2006
Hi,
I have 4 tables, each consisting of approx. 10,000,000 rows. They have the same columns (fTime [datetime] and bid [money]). What I want to do is collect all of the data into one of the tables, in ascending order by fTime.
PS: I want to do it as fast as possible as well.
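A minimal sketch, assuming the rows are collected into table1 (all table names hypothetical); a clustered index on fTime is what keeps the combined data in ascending fTime order, so the INSERT itself needs no ORDER BY:
-- the clustered index keeps the consolidated table physically ordered by time
CREATE CLUSTERED INDEX IX_table1_fTime ON dbo.table1 (fTime)

INSERT INTO dbo.table1 (fTime, bid)
SELECT fTime, bid FROM dbo.table2
UNION ALL
SELECT fTime, bid FROM dbo.table3
UNION ALL
SELECT fTime, bid FROM dbo.table4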
View 1 Replies
View Related
Nov 20, 2006
Hi, I've an application, let's call it simply "A", which creates two huge tables in a Microsoft SQL database. Let's call them "table1" and "table2". It saves a great deal of data into these tables. After application "A" has finished, another application is executed which deletes these two tables. Then application "A" is started again and creates the two tables again, but the amount of data becomes bigger. It can only proceed if the tables were deleted completely before and the database is empty. This is a procedure I repeat very often, and every time the amount of data becomes bigger (table1 and table2 become bigger). A couple of times it works fine, but at some point the data becomes too big and application "A" fails, most likely because the data wasn't removed correctly/completely. This is my code for deleting the two tables; maybe there is something I have to change:

try
{
    SqlConnectionStringBuilder builder =
        new SqlConnectionStringBuilder("Server=mycomputer\\dbname;Integrated Security=SSPI;Initial Catalog=testing");
    builder["Server"] = "(local)\\dbname";
    builder["Connect Timeout"] = 10;
    builder["Trusted_Connection"] = true;
    builder["Initial Catalog"] = ((ComponentConfiguration)this.componentConfig).Persistency.DatabaseName;

    SqlConnection sqlconnection = new SqlConnection();
    sqlconnection.ConnectionString = builder.ConnectionString;
    sqlconnection.Open();

    SqlCommand cmd1 = new SqlCommand("DROP TABLE table1"); // TODO: delete all tables
    SqlCommand cmd2 = new SqlCommand("DROP TABLE table2"); // TODO: delete all tables
    cmd1.Connection = sqlconnection;
    cmd2.Connection = sqlconnection;
    cmd1.ExecuteNonQuery();
    Thread.Sleep(7000);
    cmd2.ExecuteNonQuery();
    Thread.Sleep(7000);
    sqlconnection.Close();
    Thread.Sleep(3000);
}
catch { }   // note: this empty catch swallows every failure, so a DROP that errors goes unnoticed

Thanks for help! mulata
View 6 Replies
View Related
May 31, 2007
Hi Good morning to all,
My day started with loading huge volume of data and my data flow task failed to do so.
My data flow has a flat file connected to an OLE DB target. This is a one-to-one mapping. My source file contains 50 lakh (5 million) records and is 500 MB in size.
I'm processing the data with all the default buffer settings. I have 4 CPUs in my server.
The system process DTSDebug.exe is utilizing more than 2 GB in the page file. My average CPU usage is 70%, with one of the CPUs hitting 100% utilization.
I'm very new to SSIS. So, please provide me some info on how to set my buffers. Is there any PDF on performance and tuning in SSIS?
Is there any bulk-load transformation in SSIS to load into DB2 UDB?
If so, how do I get it installed?
Thanks in advance,
Suresh N
View 2 Replies
View Related
Mar 13, 2007
In my transformation I am transferring a huge amount of data.
I have been using the OLE DB Command at the end to dump my incoming data into the respective tables.
For Example :
if you have two tables
table 1,table 2
In my incoming data I have a lookup that checks two unique columns against the unique columns in table 1. If the record does not exist, I try inserting a record into table 2, get the unique field of that record, and store it in a particular column of table 1.
The data is very large. Is this the better way? If you have any suggestions, do let me know.
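As a hedged suggestion with hypothetical names: staging the incoming rows and running two set-based statements after the data flow touches each table once, instead of firing the OLE DB Command once per row:
-- rows whose unique columns are not yet in table 1 get inserted into table 2
INSERT INTO dbo.Table2 (UniqueCol1, UniqueCol2)
SELECT s.UniqueCol1, s.UniqueCol2
FROM   dbo.Staging s
       LEFT JOIN dbo.Table1 t1
           ON  t1.UniqueCol1 = s.UniqueCol1
           AND t1.UniqueCol2 = s.UniqueCol2
WHERE  t1.UniqueCol1 IS NULL

-- write table 2's generated key back into the matching column of table 1
UPDATE t1
SET    t1.Table2_ID = t2.ID
FROM   dbo.Table1 t1
       INNER JOIN dbo.Table2 t2
           ON  t2.UniqueCol1 = t1.UniqueCol1
           AND t2.UniqueCol2 = t1.UniqueCol2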
View 5 Replies
View Related
Aug 19, 2007
Hi All!
I have an Issue.
I am calling an RDL file through the URL, and I am passing Format=Excel in the URL.
Eg. http://harinarayana/ReportServer/......&Format=Excel&...........
If the data is more than some 20,000 records, it is not able to export, and
an error like "The service is not available" is displayed.
Does anyone have any solution for scenarios like this? It would be of great help to me.
Regards
Hari
View 1 Replies
View Related
Mar 16, 2004
I have a huge table with a composite primary key of 4 columns. I need to delete data from this table (approx. 5.6 million records to be deleted). Deleting it with a normal query takes a hell of a lot of time.
Can someone please suggest me a better way?
Any help will be appreciated.
View 14 Replies
View Related
Oct 12, 2006
Hi,
I have a situation where I have 4 tables:
1. 2 dimension tables (parent): DIM1 with 50,000 rows, and DIM2 with 1,000 rows
2. Fact 1 with 50 columns, 25 million rows, and FKs to DIM1 and DIM2
3. Fact 2 with 40 columns, 25 million rows, and FKs to DIM1 and DIM2
Fact 1 and fact 2 actually hold the same related data, but since our Analysis cube person wanted no fact table to have more than 50 columns, we divided the table into 2; they share the same compound key.
Above said, I have a situation where I have to select all the columns in both fact tables and do a group by. I wrote the query and ran "Analyze Query in the Database Engine Tuning Advisor" for it. It gave a bunch of recommendations about statistics and indexes, which I created. When I executed the query, the result came up in a matter of seconds, which was good.
In the query I had a condition having MarketName='Bridgeview' and DateID = 344 (FK of today-1).
When I wanted the data for the last 30 days, I changed it to DateID IN (> FK of today - 32 and < FK of today); the query responded and worked fine.
But when I changed the query to get MarketName='Aurora' (other than the one I used when I ran the Tuning Advisor), the result returned was an empty set. When I removed the MarketName condition, it was supposed to return all markets' data, but it returned only Bridgeview data.
I know the data is in the table for all markets, since reports are rendered from these fact tables for all of these markets (I also ran queries to check the fact table data).
I am unable to work out why the query behaves like this. It responds to the date change, but not to the MarketName change.
I really appreciate if anyone can help me point out the problem.
Thanks,
Venkat
View 3 Replies
View Related
Aug 21, 2007
Hi all,
I've faced a problem with the below error when I load 1.5 million rows into an Oracle database.
The buffer manager detected that the system was low on virtual memory, but was unable to swap out any buffers.
Please help. Thanks
View 19 Replies
View Related
Dec 3, 2006
Hi all,
In building an ETL tool, we are in a situation where a table has to be loaded on an incremental basis. On the first run, all the records, approximately 100 lakh (10 million), have to be loaded. From the next run on, only the records that were updated since the last run of the package, or newly added, are to be pulled from the source database. One idea we had was to have two OLE DB Source components: in one, get the records that were updated or newly added (since we have update columns in the DB, getting them is fairly simple); in the other, load all the records from the destination; pass them into a Merge Join and then a Conditional Split down the pipeline, and handle the updates and inserts.
Now the question is, how slow is the show going to be? Will there be a case where the source DB returns records pretty fast and the Merge Join fails in anticipation of all the records from the destination?
What might be the ideal way to go about my scenario? Please advise...
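For the first OLE DB Source, the incremental query we have in mind is roughly the following; the audit column and log table names are hypothetical:
DECLARE @LastPackageRunTime datetime
SELECT @LastPackageRunTime = LastRunTime FROM dbo.EtlRunLog   -- hypothetical audit table

SELECT *
FROM   dbo.SourceTable
WHERE  UpdatedOn > @LastPackageRunTime   -- only rows added or changed since the previous run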
Thanks in advance.
View 13 Replies
View Related
Aug 30, 2005
Has anyone implemented splitting an application's data between two databases because the data size is extremely large? If so, could you please point me to relevant information? In this split-data scenario, a table would automatically carry over to another database whenever the size limit for the current database is reached. The challenge here is for the DAL (data access layer) to automatically look into the appropriate database when the next row of data is in another database. Or perhaps there is another solution to this tera-size data problem. Any help on this would be greatly appreciated.
View 8 Replies
View Related
Jun 14, 2006
Hi.
I am trying to put a huge chunk of text into my database, for example information about a particular product which has more than 2,000 characters. I saw the datatype nvarchar(MAX) in SQL Server 2005 and was wondering if I can use it to store my text.
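From what I have read, nvarchar(MAX) holds up to 2^31-1 bytes, so 2,000+ characters should be no problem; a minimal sketch of what I mean, with hypothetical names:
CREATE TABLE dbo.Product
(
    ProductID   int IDENTITY(1,1) PRIMARY KEY,
    Name        nvarchar(200) NOT NULL,
    Description nvarchar(max) NOT NULL   -- stores far beyond the 4,000-character limit of plain nvarchar
)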
Thanks
View 1 Replies
View Related
Oct 31, 2015
I have a database data file almost at 2tb maxing out a windows drive. Only 16gb left. Should I just add another data file on another Windows drive for growth? Or just move current huge data file to a new GPT drive? Or do both adding another data file and moving existing to its own new GPT drive?
Primary objective is to make do for now.
View 1 Replies
View Related
Jul 20, 2005
I have a table which contains approx. 300,000 records. I need to import this data into another table by executing a stored procedure. This stored procedure accepts the values from the table as params. My current solution reads the table in a cursor and executes the stored procedure per row. This takes too long, approx. 5-6 hrs. I need to make it better. Can anyone help? Samir
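One hedged fix: if the stored procedure's per-row logic can be expressed as a set, a single INSERT...SELECT replaces 300,000 procedure calls; a sketch with hypothetical names:
INSERT INTO dbo.TargetTable (Col1, Col2, Col3)
SELECT Col1, Col2, Col3
FROM   dbo.SourceTable
-- any per-row computation the procedure performed can usually move into the SELECT list or a JOIN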
View 2 Replies
View Related
Dec 30, 2007
Hi.
I am working on a serial-tracking application using SQL Server 2005 and .NET. One of the requirements is an ad-hoc file export utility in which users can drag and drop fields from a set of tables and export the results to CSV. It all sounds OK, and SQL Server Reporting Services' Report Builder seems to be just the right tool for it, but there is one problem:
The report size is big, about 7K - 8K pages and 4 - 5 columns wide; while rendering the report, IIS memory usage shoots up to about 2GB and remains at about 2GB.
Any idea if something can be done to mitigate this problem? Note that I don't need the HTML rendering at all. All I need is the CSV at the end of the day, while users are able to choose columns in an ad-hoc manner.
View 7 Replies
View Related
May 14, 2015
declare @error int, @rowcount int
select @rowcount = COUNT(1) FROM STG_BCDR;
while @rowcount > 0
begin
BEGIN TRAN Deletion
[code]....
With the code above I try to delete records batch by batch to avoid table locking on the BCDR table. The total number of records in the BCDR table is 40,000. However, when I run the code and look at the execution plan, the BCDR table still gets a clustered index scan, which means the locking still happens.
If I change the DELETE TOP (5000)... to DELETE TOP (5)..., then there is a clustered index seek, which is good. The problem is that each iteration then deletes only the top 5 records, which means it will take a really long time to remove the data.
How can I cater for this situation in order to delete this huge amount of data without table locking happening? If table locking happens, other users will not be able to access the table at the same time.
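One hedged observation: SQL Server escalates to a table lock at roughly 5,000 locks taken by a single statement, which would explain why TOP (5000) locks the table while TOP (5) does not. Keeping each batch under the threshold, or disabling escalation on the table (SQL Server 2008 and later), sidesteps it:
ALTER TABLE dbo.STG_BCDR SET (LOCK_ESCALATION = DISABLE)   -- use with caution: more lock memory

DECLARE @rows int = 1
WHILE @rows > 0
BEGIN
    DELETE TOP (4000) FROM dbo.STG_BCDR   -- stays below the ~5,000-lock escalation threshold
    SET @rows = @@ROWCOUNT
END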
View 6 Replies
View Related
Jul 20, 2005
I'm searching for information regarding database replication performance. We need to compare the performance of replication for SQL Server and Oracle, and it is urgent! Can anyone describe the performance bottlenecks for each database when performing replication, or point me to a white paper or webpage?
View 1 Replies
View Related
Apr 13, 2008
Hi,
I have a table (replica of my original table for test purpose) with half a million records. The structure of the table is as following:
id (int), Name (varchar), DataComplete (DateTime), Age (int), bit1 (bit), bit2 (bit), bit3 (bit), bit4 (bit), bit5 (bit), bit6 (bit), bit7 (bit), bit8 (bit), bit9 (bit), bit10 (bit), bit11 (bit), bit12 (bit)
Data retrieval on this table is (in my opinion) pretty slow. A simple SELECT * FROM MYTABLE takes 5 seconds. How can I improve the retrieval time? I have tried to define some indexes, but no improvement. Can some guru help me in this regard? At least the SELECT * must return data in 1 second or less.
Thanks in advance...
View 2 Replies
View Related
Dec 12, 2005
Well, I hope, after my endless search on the web... maybe somebody can help me here. I'm porting an application to a web-based application, therefore I have to use ADO.NET to access a SQL Server DB. But the performance is extremely poor! Just to compare: with the normal application or with the SQL Query Analyzer, the query needs about 2 seconds. By using ADO.NET, it's about 200 seconds! Of course, the query is quite long, but the difference is extreme... too extreme for the same query. I already tried a lot, actually everything I found on the web. What is really strange is that neither the processor nor the memory is fully used, and the web server uses more of the CPU power than the SQL server itself (the 2 servers are still on my machine where I'm programming). So, could it be that the problem is in filling up the DataSet with the SqlDataAdapter or something similar? Thx for your help!!
View 8 Replies
View Related
Dec 13, 2005
I was wondering if someone could clear this up for me...
I have a table that will be used to store information about products that I will be selling on a new website. I would like to store each product in a table which will include a description of around 1,000 words. I was deliberating over whether to store this chunk of text in a column with the data type set as 'text', or to store the text in a separate .txt file on the server. The search facility on the site will not be required to query this text, so which would offer the best performance?
View 3 Replies
View Related
Nov 9, 2007
I am running a trace, and one of the columns is start time.
I want to import corresponding performance log data into Profiler.
It is a new feature in 2005;
however, this option is disabled in Profiler.
This option is at
File -> Import Performance Data
Please advise on how to enable this option
thanks in advance
View 5 Replies
View Related
Feb 2, 2006
Hi SQL gurus, I have a table structure question. I will have a table 'Models' that has one-to-many 'incomes' and one-to-many 'costs'. These 2 entities have exactly the same structure, which is 7 smallmoney columns and a name. Is it better to create a table 'Incomes' and a table 'Costs', both with the same number of fields, like this:
Incomes
-------------
in_id
model
in_1
in_2
in_3
in_4
in_5
in_6
in_7
in_name
Costs
-------------
c_id
model
c_1
c_2
c_3
c_4
c_5
c_6
c_7
c_name
or is it better to create one single table that will contain both entities, like this:
Incomes_Costs
-------------
ic_id
model
ic_1
ic_2
ic_3
ic_4
ic_5
ic_6
ic_7
ic_name
ic_isIncome
which only differs from the 2 above by the isIncome field, to know which row is an income and which row is a cost. I'd like to know which method is the best in terms of performance and general structure, and would greatly appreciate it if you explain a little the reasons that drove you to suggest one method over the other. Thanks all for your time! ibiza
View 4 Replies
View Related
Jul 20, 2005
Hello guys, wonder if any of you could help me out here. I have just created a new empty database and imported data from another database into it. This was done with the import wizard from MMC. First thing that I noticed was the size difference: the old database was well over 1 GB, but the new one was only about 400 MB. Second thing I've noticed, and this is the problem, is that accessing the new (smaller) database instead of the old one causes a huge speed degradation, about 5 times slower than the old version. We are using MS SQL Server 7. Any help would be very gratefully received. Regards, Gethyn
View 1 Replies
View Related
May 27, 2007
Dear all,
I have problems with the log.
SQL Server has written a 4 GB log, so please, how can I eliminate the huge log size?
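A hedged sketch of the usual remedy, assuming the database is in full recovery with no log backups (database name, file name, and path are hypothetical): back up the log, or switch to simple recovery if point-in-time restore is not needed, then shrink the file:
-- back up the log so the space inside it can be reused
BACKUP LOG mydb TO DISK = 'D:\backup\mydb_log.trn'
-- or, if point-in-time recovery is not required:
ALTER DATABASE mydb SET RECOVERY SIMPLE
-- then shrink the physical log file (logical file name is hypothetical)
DBCC SHRINKFILE (mydb_log, 500)   -- target size in MB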
View 1 Replies
View Related
Apr 22, 2008
I have an opportunity to rebuild a database model with the express purpose of improving query performance. So given the following I have a few questions.
Table A (~500M records)
Primary Key Field (int)
Field 1 (varchar)
Field 2 (varchar)
Field 3 (varchar)
Field 4 (varchar)
Field 5 (varchar)
Table B (1B+ records)
Primary Key Field (int)
Foreign Key Field (int)
Field 1 (varchar)
Field 2 (varchar)
Field 3 (varchar)
Field 4 (varchar)
Field 5 (varchar)
* Assumed: Tables are inner joined on all queries. The database is readonly.
-- Most of my lookups are based on querying Field 1 of Table A. The data content of Field 1, Table A is 90% unique.
1) Would it be more beneficial to put the clustered index on Field 1 instead of the PK field in Table A?
2) Can an Identity column be non-clustered?
3) Alternatively, would it be beneficial to build a separate lookup table with just the PK & Field 1 of Table A, with a clustered index on the lookup table Field 1 which I join on Table A? (did that make sense?)
-- I have a secondary lookup that performs queries on Fields 1, 2, 3, 4 & 5 of Table B
1) Would it be more beneficial to create an additional indexed lookup column of the concatenated values of Fields 1-5 of Table B versus a covering index of all 5 columns?
2) Does a clustered index have to be unique?
3) Would a clustered index be more beneficial over Fields 1-5 or the special lookup column versus the PK or FK fields?
4) Would creating a special lookup table with just the requisite fields be more beneficial?
An extra question: the existing data model uses the CHAR datatype for all columns less than 9 characters wide, and the columns are set to allow NULLs. This requires every SELECT statement to COALESCE() and RTRIM() all these columns. I intend to make all (affected) columns VARCHAR, NOT NULL, with a default value of a zero-length string.
Will this enhance query performance?
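For concreteness, the migration I intend per column looks like this (names hypothetical); the win should come from eliminating the per-row COALESCE()/RTRIM() work rather than from the datatype itself:
ALTER TABLE TableA ALTER COLUMN Field1 varchar(8) NULL      -- converted values keep their CHAR padding
UPDATE TableA SET Field1 = RTRIM(COALESCE(Field1, ''))      -- one-time trim and NULL removal
ALTER TABLE TableA ALTER COLUMN Field1 varchar(8) NOT NULL
ALTER TABLE TableA ADD CONSTRAINT DF_TableA_Field1 DEFAULT ('') FOR Field1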
Thanks in advance for any insight.
View 7 Replies
View Related
Jul 23, 2005
Hi, I am using SQL 2000 and have a table that contains more than 2 million rows of data (and growing). Right now, I have encountered 2 problems:
1) Sometimes, when I try to query against this table, I get a SQL command timeout. Hence, I did more testing with Query Analyzer, only to find out that the same queries would not always take about the same time to execute. Could anyone please tell me what would affect the speed of the query, and which is the most important among all the factors? (I can think of the open connections, the server's CPU/memory...)
2) I am not sure if 2 million rows is considered a lot or not; however, it has started to take 5~10 seconds for me to finish some simple queries. I am wondering what the best practices are for handling this amount of data while keeping decent performance?
Thank you, Charlie Chang [Charlies224@hotmail.com]
View 5 Replies
View Related