My day started with loading a huge volume of data, and my data flow task failed to do so.
My data flow has a flat file source connected to an OLE DB destination. This is a one-to-one mapping. My source file contains 50 lakh (5 million) records and is about 500 MB in size.
I'm processing the data with all the default buffer settings. I have 4 CPUs in my server.
The system process DTSDebug.exe is using more than 2 GB of memory. My average CPU usage is about 70%, with one of the CPUs hitting 100% utilization.
I'm very new to SSIS, so please give me some information on how to set my buffers. Is there any PDF on performance tuning in SSIS?
Is there any bulk load transformation in SSIS to load into DB2 UDB?
I am copying data from a database to an Excel file through SSIS. The database is MS SQL 2005 and BIDS is also 2005. However, the job doing this task fails every time. As per our investigation, the result of the query is more than 100,000 rows, and we know that Excel has a limit of about 65,000 rows of data per sheet. Is there a way of raising the limit, or a better approach, so that everything will be copied to the Excel file successfully?
The PrimeOutput method on component "Source - Query" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure. End Error
Error: 2015-10-22 04:34:58.05 Code: 0xC0047021 Source: Data Flow Task Description: SSIS Error Code DTS_E_THREADFAILED. Thread "SourceThread0" has exited with error code 0xC0047038. There may be error messages posted before this with more information on why the thread has exited. End Error
DTExec: The package execution returned DTSER_FAILURE (1). Started: 4:30:00 AM Finished: 4:35:05 AM Elapsed: 304.844 seconds. The package execution failed. The step failed.
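For reference, Excel 97-2003 worksheets are capped at 65,536 rows, so a single destination sheet cannot hold 100,000+ rows no matter how the job is configured. One common workaround is to split the source query into buckets that each fit on their own sheet or file. A minimal T-SQL sketch, assuming a hypothetical SourceTable with an ID key:

    -- Assign every row to one of N buckets, each small enough for an Excel 97-2003 sheet.
    -- dbo.SourceTable, ID and the bucket count are placeholders; raise N until each bucket
    -- stays under 65,536 rows.
    ;WITH numbered AS
    (
        SELECT  *,
                NTILE(2) OVER (ORDER BY ID) AS SheetNo
        FROM    dbo.SourceTable
    )
    SELECT *
    FROM   numbered
    WHERE  SheetNo = 1;   -- run once per SheetNo, sending each result to its own sheet or file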
Hi, I am using VS2003 and SQL Server 2005. I will get a huge XML document from a web service, and I need to load this XML into a SQL Server 2005 table as records. I got the solution below from the Microsoft site: load the XML into a DataSet and then from the DataSet into SQL Server 2005. Note: if you call ReadXml to load a very large file, you may encounter slow performance. To ensure the best performance for ReadXml on a large file, call the DataTable.BeginLoadData method for each table in the DataSet, then call ReadXml, and finally call DataTable.EndLoadData for each table in the DataSet. Is there any better solution for loading huge XML into SQL Server 2005?
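As an alternative to round-tripping through a DataSet, SQL Server 2005 can ingest and shred the XML on the server side. A rough sketch under assumed names (the file path, target columns and XML structure are all placeholders):

    -- Load the whole file as a single XML value, then shred it into rows with nodes()/value().
    DECLARE @x xml;

    SELECT @x = BulkColumn
    FROM   OPENROWSET(BULK 'C:\data\bigfile.xml', SINGLE_BLOB) AS src;

    INSERT INTO dbo.TargetTable (OrderID, CustomerName)       -- hypothetical target columns
    SELECT  n.value('@OrderID', 'int'),
            n.value('(CustomerName)[1]', 'nvarchar(100)')
    FROM    @x.nodes('/Orders/Order') AS t(n);                 -- hypothetical XML structure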
Hi there - can anyone advise on the following issue? We have recently performed some server-side tracing on a particular SQL instance over a 24-hour period. We are now attempting to load the traces into a database for analysis. Herein lies the problem.
When we load the Profiler trace files (one at a time) into the database, the transaction log grows at an excessive rate, even though the database is in SIMPLE recovery mode.
We are loading the traces using the command:
INSERT INTO sqlTableToLoad SELECT * FROM ::fn_trace_gettable('MytraceFileName', DEFAULT)
Can anyone advise how we could get around this issue, as we're running out of space due to the transaction log?
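One thing worth noting: even in SIMPLE recovery, the log must hold everything changed by a single open transaction, and one INSERT ... SELECT over a whole trace file is one big transaction. Two hedged alternatives, keeping the original names: let SELECT ... INTO create the table (it can be minimally logged), or load one rollover file at a time so each insert commits separately.

    -- Option 1: SELECT ... INTO a new table; this can be minimally logged in SIMPLE recovery.
    SELECT *
    INTO   dbo.sqlTableToLoad_new
    FROM   ::fn_trace_gettable('MytraceFileName', DEFAULT);

    -- Option 2: read a single file per statement (1 instead of DEFAULT for the file count),
    -- so each INSERT is a smaller, separately committed transaction.
    INSERT INTO dbo.sqlTableToLoad
    SELECT *
    FROM   ::fn_trace_gettable('MytraceFileName', 1);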
I am facing problems with concurrent access in SQL Server 2000. The scenario is that the DB contains one huge de-normalized table containing 40 million records.
The application frequently queries this table to populate other derived tables, and the SQL queries take a long time to return results.
So if one query is executing, another user's query goes into a wait state. Please suggest how I can improve this.
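If the readers can tolerate dirty reads, one stopgap on SQL Server 2000 is to read without shared locks so the long-running queries stop blocking each other; the longer-term fix is usually indexing the 40-million-row table for the derived-table queries. A hedged sketch, with table and column names as placeholders:

    -- Read without taking shared locks; this trades consistency (dirty reads) for concurrency.
    SELECT  col1, col2
    FROM    dbo.BigDenormalizedTable WITH (NOLOCK)     -- placeholder names
    WHERE   LoadDate >= '20070101';

    -- Or set the behaviour for the whole batch:
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;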
I am in the process of choosing between either SQL Workgroup or Standard Edition. I see the differences in features on the comparison table, but do not see any references to the differing capabilities in handling transactions.
Are there any differences between Workgroup and Standard in terms of transaction/data handling capabilities? I.e., does Standard have the superior capability of handling X times more TPM than Workgroup?
If not, am I correct to assume that this is totally determined by hardware configuration (# of CPUs, processor speed, HD speed, RAM) ?
If data volume / transaction handling is solely determined by hardware configuration, and I know the number of transactions and the amount of reads/writes per second, where would be a good reference for what kind of hardware configuration I need? (Ideally, once I know the hardware configuration, I would be able to determine whether I need Workgroup or Standard.)
I have been asked to design a solution for a client of mine who basically requires the daily analysis and reconciliation of the differences between 2 extremely large text files.
The files are not in an identical format but are both in some form of delimited format (one is CSV, the other is a little more complex). For the sake of this question, let's assume that I can effectively import each file into an MS SQL table.
Each file will have in excess of 100,000 rows each day (new data for each day).
Whilst I know that MS SQL easily has the capacity to store the data, is there a recommended way to tackle the potential problems? (I imagine that performance is important... they will be running the report every day.)
Or is building the solution as simple as importing the data into 2 tables, and then querying the differences and outputting as a report using Crystal?
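Once both files are staged in SQL tables, the reconciliation itself usually reduces to a couple of set-based queries. A rough sketch, assuming hypothetical staging tables FileA and FileB that share a business key and a load date:

    -- Rows in today's A that are missing from B (swap the tables for the reverse check).
    SELECT a.*
    FROM   dbo.FileA AS a
    LEFT JOIN dbo.FileB AS b
           ON  b.BusinessKey = a.BusinessKey
           AND b.LoadDate    = a.LoadDate
    WHERE  a.LoadDate = '20080101'            -- placeholder for "today"
      AND  b.BusinessKey IS NULL;

    -- Rows present in both files but with differing values.
    SELECT a.BusinessKey, a.Amount AS AmountA, b.Amount AS AmountB
    FROM   dbo.FileA AS a
    JOIN   dbo.FileB AS b
           ON  b.BusinessKey = a.BusinessKey
           AND b.LoadDate    = a.LoadDate
    WHERE  a.LoadDate = '20080101'
      AND  a.Amount <> b.Amount;              -- placeholder comparison column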
I need to periodically import a (HUGE) table of data from an external data source (not SQL Server) into SQL Server, with the following scenarios:
1. Some of the records in the external data source may not exist in SQL Server.
2. Some of the records in the external data source may have a different value at different imports, but these records are identified uniquely by the same primary key in the external data source and in SQL Server.
3. Some of the records in the external data source may be the same as in SQL Server.
Due to the massive volume of the import, I would like to import only the records which are different from what I have in SQL Server (cases 1 and 2 above). In fact case 2 is the most critical.
I thought of making a query with a left outer join between the data in the external data source table (SOURCE) and the data in the SQL Server table (DESTIN). The join is done on the respective primary keys (composite keys of up to 10 columns), and one of the WHERE conditions will be that the value in SOURCE is different from the value in DESTIN.
The result of this query would be exactly what I need to import. How do I do this in SSIS? I haven't figured out how to join tables in different data sources yet.
In fact, I cannot write a stored procedure to do that, since one of the sources is a data source other than SQL Server. I have seen the Lookup transformation in this article http://www.sqlis.com/default.aspx?311 but it is not exactly what I want to do. Another possibility is to use the Merge Join, but due to the required sorting I believe its performance would be terrible!
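One hedged alternative to joining across data sources inside the pipeline is to land the external extract in a SQL Server staging table first and then do the comparison in T-SQL, where the composite-key join is straightforward. A sketch, with STG_SOURCE as a hypothetical staging copy of SOURCE and DESTIN as the existing table:

    -- New rows (case 1) and changed rows (case 2) relative to DESTIN.
    SELECT s.*
    FROM   dbo.STG_SOURCE AS s
    LEFT JOIN dbo.DESTIN  AS d
           ON  d.Key1 = s.Key1          -- repeat for all (up to 10) key columns
           AND d.Key2 = s.Key2
    WHERE  d.Key1 IS NULL               -- case 1: the key is not in DESTIN at all
       OR  d.SomeValue <> s.SomeValue;  -- case 2: the key exists but the value differs (placeholder column)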
We have a daily process which copies millions of rows of data from one DB to another over a linked server. Just checking on best practice: are there more efficient ways than the linked server to copy millions of rows of data from one DB to another? I checked bulk insert, but that transfers data from a file to the DB, not DB to DB.
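For comparison, one pattern that avoids a row-by-row trip over the linked server is to let the remote side run the query and stream the result back with OPENQUERY (or to export with bcp and reload with BULK INSERT). A minimal sketch, with the linked server, table names and filter as placeholders:

    -- The query inside OPENQUERY executes on the remote server and streams rows back.
    INSERT INTO dbo.TargetTable WITH (TABLOCK)    -- TABLOCK cuts per-row locking overhead on the target
    SELECT *
    FROM   OPENQUERY(MyLinkedServer,
                     'SELECT * FROM SourceDb.dbo.SourceTable WHERE LoadDate >= ''20080101''');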
I have 4 tables, each consisting of approximately 10,000,000 rows. They have the same columns (fTime [datetime] and bid [money]). What I want to do is collect all of the data into one of the tables, in ascending order by fTime.
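A hedged sketch of one way to do this: put a clustered index on fTime in the target table (that is what actually keeps the rows ordered), then append the other three tables with a single INSERT ... SELECT. Table names table1 through table4 are assumed:

    -- Ordering by fTime comes from the clustered index, not from the order of insertion.
    CREATE CLUSTERED INDEX IX_table1_fTime ON dbo.table1 (fTime);

    INSERT INTO dbo.table1 (fTime, bid)
    SELECT fTime, bid FROM dbo.table2
    UNION ALL
    SELECT fTime, bid FROM dbo.table3
    UNION ALL
    SELECT fTime, bid FROM dbo.table4;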
Hi, I've an application, let's call it simply "A", which creates two huge tables in a Microsoft SQL database. Let's call them "table1" and "table2". It saves a really large amount of data into these tables. After application "A" has finished, another application is executed which deletes these two tables. Then application "A" is started again and it creates the two tables again, but the amount of data becomes bigger. It can only proceed if the tables were deleted completely beforehand and the database is empty. This is the procedure which I repeat very often, but every time the amount of data becomes bigger (table1 and table2 become bigger). A couple of times it works fine, but at some point the data seems to become too big and application "A" fails, most likely because the data wasn't removed correctly/completely. This is my code for deleting the two tables; maybe there is something I have to change:

    try
    {
        SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder(
            "Server=mycomputerdbname;Integrated Security=SSPI;" + "Initial Catalog=testing");
        builder["Server"] = "(local)dbname";
        builder["Connect Timeout"] = 10;
        builder["Trusted_Connection"] = true;
        builder["Initial Catalog"] = ((ComponentConfiguration)this.componentConfig).Persistency.DatabaseName;

        SqlConnection sqlconnection = new SqlConnection();
        sqlconnection.ConnectionString = builder.ConnectionString;
        sqlconnection.Open();

        SqlCommand cmd1 = new SqlCommand("DROP TABLE table1"); // TO DO: delete all tables
        SqlCommand cmd2 = new SqlCommand("DROP TABLE table2"); // TO DO: delete all tables
        cmd1.Connection = sqlconnection;
        cmd2.Connection = sqlconnection;

        cmd1.ExecuteNonQuery();
        Thread.Sleep(7000);
        cmd2.ExecuteNonQuery();
        Thread.Sleep(7000);

        sqlconnection.Close();
        Thread.Sleep(3000);
    }
    catch
    {
    }

Thanks for help! mulata
Actually, in my transformation I am transferring a huge amount of data.
I have been using the OLE DB Command at the end to dump my incoming data into the respective tables.
For example, suppose you have two tables, table 1 and table 2.
In my incoming data I have a Lookup that checks two unique columns against the unique columns in table 1. If the record does not exist, I insert a record into table 2, get the unique field of that record, and store it in a particular column of table 1.
The data is very large. Is this the better way, or if you have any suggestions please let me know.
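For what it's worth, the described pattern can often be done set-based after the incoming rows land in a staging table, instead of issuing one OLE DB Command per row. A loose sketch under assumed names (STG_Incoming for the staged rows, with table 1 and table 2 as in the example above):

    -- 1) Insert into table2 the incoming rows whose two unique columns are not yet in table1.
    INSERT INTO dbo.table2 (UniqueColA, UniqueColB)
    SELECT  i.UniqueColA, i.UniqueColB
    FROM    dbo.STG_Incoming AS i
    WHERE   NOT EXISTS (SELECT 1
                        FROM   dbo.table1 AS t1
                        WHERE  t1.UniqueColA = i.UniqueColA
                          AND  t1.UniqueColB = i.UniqueColB);

    -- 2) Write table2's generated key back into the corresponding column of table1.
    UPDATE  t1
    SET     t1.Table2Key = t2.Table2Key
    FROM    dbo.table1 AS t1
    JOIN    dbo.table2 AS t2
            ON  t2.UniqueColA = t1.UniqueColA
            AND t2.UniqueColB = t1.UniqueColB;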
I am using SSIS to transfer data between two SQL Servers (2000). There is no transformation involved, as the source and destination table structures are the same. Even then, the package execution takes a lot of time.
The data in the tables is of the order of 66,000,000 rows, and we were required to kill the package execution after it took more than 24 hours. The CPU usage was more than 13,000 s and disk I/O was well above 330,000,000. I am new to the finer points of SSIS. Can anyone please tell me why the package has become so resource hungry?
I have a huge table with a 4-column primary key on it. I need to delete data from this table (approx. 5.6 million records to be deleted). It takes a very long time to delete them with a normal query. Can someone please suggest a better way? Any help will be appreciated.
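One widely used approach is to delete in batches so that each transaction stays small and the log can clear between batches. A hedged sketch, with the table name and filter as placeholders:

    -- Delete in chunks until no qualifying rows remain.
    DECLARE @rows int;
    SET @rows = 1;

    WHILE @rows > 0
    BEGIN
        DELETE TOP (10000)
        FROM   dbo.HugeTable                 -- placeholder table
        WHERE  ArchiveDate < '20080101';     -- placeholder filter identifying the 5.6 million rows

        SET @rows = @@ROWCOUNT;
    END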
1. 2 dimension tables (parents): DIM1 with 50,000 rows, and DIM2 with 1,000 rows.
2. Fact 1 with 50 columns and 25 million rows, with FKs to DIM1 and DIM2.
3. Fact 2 with 40 columns and 25 million rows, with FKs to the DIM1 & DIM2 tables.
Actually, Fact 1 and Fact 2 hold the same related data, but since our Analysis cube person wanted each fact table to have no more than 50 columns, we divided the table into 2; they share the same compound key.
That said, I have a situation where I have to select all the columns in both fact tables and do a group by. I wrote the query and ran "Analyze Query in the Database Engine Tuning Advisor" for it. It gave a bunch of recommendations about statistics and indexes, which I created. When I executed the query, the result came up in a matter of seconds, which was good.
In the query I had a condition having MarketName='Bridgeview' and DateID = 344 (FK of today-1).
When I wanted the data for the last 30 days, I changed the condition to DateID > (FK of today - 32) and DateID < (FK of today), and the query responded and worked fine.
But when I changed the query to get MarketName='Aurora' (a different market from the one I used when I ran the Tuning Advisor), the result returned was an empty set. When I removed the MarketName condition, it was supposed to return all markets' data, but it returned only the Bridgeview data.
I know the data is in the table for all markets, since reports are rendered from these fact tables for all of these markets (I also ran queries to check the fact table data).
I am unable to point out the reason why the query behaves like this. It responds to the date change, but not to the MarketName change.
I would really appreciate it if anyone could help me point out the problem.
In building an ETL tool, we are in a situation wherein a table has to be loaded on an incremental basis. On the first run all the records, approximately 100 lakh (10 million), have to be loaded. From the next run onward, only the records that were updated since the last run of the package, or that were newly added, are to be pulled from the source database. One idea we had was to use two OLE DB Source components: in one, get the records that were updated or newly added (since we have update columns in the DB, getting them is fairly simple); in the other OLE DB Source, load all the records from the destination; pass them on to a Merge Join, then have a Conditional Split down the pipeline, and handle the updates and inserts.
Now the question is: how slow is this going to be? Could it be the case that the source DB returns records fairly quickly while the Merge Join stalls waiting for all the records from the destination?
What might be the ideal way to go about my scenario? Please advise.
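Since the source has update columns, the source-side extract described above usually reduces to a dated query, so only the delta ever crosses the pipeline. A minimal sketch with placeholder names; the last-run timestamp would typically come from a package variable or a control table:

    -- Pull only rows inserted or updated since the last successful package run.
    DECLARE @LastRunTime datetime;

    SELECT @LastRunTime = LastSuccessfulRun
    FROM   dbo.ETL_Control                       -- hypothetical control table
    WHERE  PackageName = 'LoadBigTable';

    SELECT *
    FROM   dbo.SourceBigTable                    -- placeholder source table
    WHERE  UpdatedDate >= @LastRunTime
       OR  CreatedDate >= @LastRunTime;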
Has anyone implemented splitting an application's data between two databases because the data size is extremely large? If so, could you please point me to relevant information? In this split-data scenario, a table would automatically carry over to another database whenever the size limit for the current database is reached. The challenge here is for the DAL (data access layer) to automatically look in the appropriate database when the next row of data is in another database. Or perhaps there is another solution to this tera-size data problem. Any help on this would be greatly appreciated.
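One technique in this general direction (offered tentatively; it may or may not fit the requirement) is a partitioned view: each database holds a key range of the table, and a UNION ALL view gives the DAL a single name to query. A sketch with hypothetical names:

    -- Member tables live in different databases, each CHECK-constrained to a key range, e.g.:
    --   Db1.dbo.Orders_1: CHECK (OrderID <  10000000)
    --   Db2.dbo.Orders_2: CHECK (OrderID >= 10000000)

    CREATE VIEW dbo.Orders
    AS
    SELECT * FROM Db1.dbo.Orders_1
    UNION ALL
    SELECT * FROM Db2.dbo.Orders_2;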
Hi. I am trying to put a huge chunk of text into my database, for example information about a particular product, which has more than 2,000 characters. I saw the datatype "nvarchar(MAX)" in SQL Server 2005 and was wondering if I can use this to store my text. Thanks.
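For reference, nvarchar(max) in SQL Server 2005 can hold up to roughly 2 GB of data, so a description of a few thousand characters fits comfortably. A trivial sketch with a hypothetical products table:

    CREATE TABLE dbo.Products
    (
        ProductID    int           NOT NULL PRIMARY KEY,
        ProductName  nvarchar(200) NOT NULL,
        Description  nvarchar(max) NULL      -- long free text, up to ~2 GB
    );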
I have a database data file at almost 2 TB, maxing out a Windows drive; only 16 GB is left. Should I just add another data file on another Windows drive for growth? Or just move the current huge data file to a new GPT drive? Or do both: add another data file and move the existing one to its own new GPT drive?
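If the choice is to add a second data file on another drive, the command itself is short; moving the existing file instead requires taking the database offline (or detaching it) first. A hedged sketch of the add-a-file option, with names and sizes as placeholders:

    -- Add a new data file on another drive so future growth lands there.
    ALTER DATABASE MyBigDb
    ADD FILE
    (
        NAME       = 'MyBigDb_data2',                -- logical name (placeholder)
        FILENAME   = 'E:\SQLData\MyBigDb_data2.ndf',
        SIZE       = 100GB,
        FILEGROWTH = 10GB
    );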
I have a table which contains approximately 3,00,000 (300,000) records. I need to import this data into another table by executing a stored procedure. This stored procedure accepts the values from the table as parameters. My current solution is to read the table with a cursor and execute the stored procedure for each row. This takes too long, approximately 5-6 hours. I need to make it better. Can anyone help? Samir
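If the stored procedure ultimately just inserts the row it is handed, the per-row cursor can usually be replaced with one set-based statement, which is where most of the 5-6 hours goes. A hedged sketch with placeholder names; if the procedure does more than an insert, its logic would have to be folded into the SELECT:

    -- Set-based equivalent of calling the procedure once per source row.
    INSERT INTO dbo.TargetTable (Col1, Col2, Col3)
    SELECT Col1, Col2, Col3
    FROM   dbo.SourceTable;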
I am working on a serial tracking application using SQL Server 2005 and .NET. One of the requirements is an ad-hoc file export utility in which users can drag and drop fields from a set of tables and export the results to CSV. It all sounds OK, and SQL Server Reporting Services' Report Builder seems to be just the right tool for it, but there is one problem: the report size is big, about 7K-8K pages and 4-5 columns wide; while rendering the report, IIS memory usage shoots up to about 2 GB and stays there. Any idea if something can be done to mitigate this problem? Note that I don't need the HTML rendering at all. All I need is to have the CSV at the end of the day, while users are able to choose columns in an ad-hoc manner.
Hi, I have to transform about 60 million rows of data, and it runs so slowly that it never finishes in my testing. Should I process it chunk by chunk? Or are there any other techniques I can use (I am using a data flow task)? Thanks for any advice.
declare @error int, @rowcount int
select @rowcount = COUNT(1) FROM STG_BCDR;
while @rowcount > 0
begin
    BEGIN TRAN Deletion
[code]....
In the above code I try to delete records batch by batch to avoid table locking on the BCDR table. The total number of records in this BCDR table is 40,000. However, when I run the code and check the execution plan, the BCDR table still shows a clustered index scan, which means the locking still happens.
If I change the DELETE TOP (5000)... to DELETE TOP (5)..., then there is a clustered index seek, which is good. The problem is that each batch then deletes only the top 5 records, which means it will take a very long time to remove the data.
How do I handle this situation so that I can delete this huge amount of data without table locking happening? If table locking happens, other users will not be able to access this table at the same time.
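One hedged way to keep the index seek and still delete in useful chunks is to drive each batch by a range of the clustered key (so the optimizer can seek) and keep batches comfortably below the roughly 5,000 locks at which SQL Server attempts escalation to a table lock. A sketch assuming a placeholder clustered key column and that the whole table is being purged; fold the original DELETE's filter back into the WHERE if only part of it is:

    -- Delete by key range so each batch is a clustered index seek rather than a scan.
    DECLARE @minId int, @batch int;
    SET @batch = 4000;                                    -- stays under the ~5,000-lock escalation point

    SELECT @minId = MIN(BCDR_ID) FROM dbo.BCDR;           -- BCDR_ID: placeholder clustered key

    WHILE @minId IS NOT NULL
    BEGIN
        DELETE FROM dbo.BCDR
        WHERE  BCDR_ID >= @minId
          AND  BCDR_ID <  @minId + @batch;

        SELECT @minId = MIN(BCDR_ID)
        FROM   dbo.BCDR
        WHERE  BCDR_ID >= @minId + @batch;
    END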
I'm using a Script Component to load data into an Oracle DB because of a poor performance issue. Now I have found that it is missing some data during the transmission. Please see the screenshot below:
I am getting ErrorCode 8 while loading the data from stage to model. I have checked my error view; it states that "Member Code is Inactive".
Initially I loaded the same set of data into the model from the MDS staging table, but then deleted it with ImportType = 5, which removed all the data from the MDM model.
Now I want to load it back, but it's giving Error Code 8. Before loading the same data I changed the staging table Importtype to 2 and Importstatusid to 0.
Hi, all experts here. Do we always have to use the SCD component when loading data into a data warehouse to handle changes to rows? I am looking forward to hearing from you, and thank you very much in advance for your help. With best regards,
Hi, I am trying to do a straightforward load from a flat file source. I have defined the columns according to the lengths defined in the data dictionary provided, but when I try to run the task I encounter this error:
The column data for column "Column 20" overflowed the disk I/O buffer.
I tried to add another column 21 at the end and truncate it or leave that column unmapped to the destination, but the same problem occurs for column 21. What should I do to overcome this?
In case of bad data, how do I clean up the source? Please help me with this.
Hello!! Searching for information about how to migrate some data from an old database (any type) into SQL, I've found this:

    LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
        [REPLACE | IGNORE]
        INTO TABLE tbl_name
        [FIELDS
            [TERMINATED BY 'string']
            [[OPTIONALLY] ENCLOSED BY 'char']
            [ESCAPED BY 'char']
        ]
        [LINES
            [STARTING BY 'string']
            [TERMINATED BY 'string']
        ]
        [IGNORE number LINES]
        [(col_name_or_user_var,...)]
        [SET col_name = expr,...]

Does anybody know how it works and how to use it? I'd like to know because I have to load data from a text file into a SQL database, and this seems to be the fastest and easiest way to do it... Thanks!!!! Bye!
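A note of caution, in case it helps: LOAD DATA INFILE is MySQL syntax and will not run on SQL Server. The roughly equivalent SQL Server command is BULK INSERT (or the bcp command-line utility). A minimal sketch, with the file path, table name and delimiters as placeholders:

    -- SQL Server equivalent of a delimited text-file load.
    BULK INSERT dbo.tbl_name
    FROM 'C:\data\file_name.txt'
    WITH
    (
        FIELDTERMINATOR = ',',      -- column delimiter in the text file
        ROWTERMINATOR   = '\n',     -- row delimiter
        FIRSTROW        = 2         -- skip a header row, if the file has one
    );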