Why could the following query be going slowly (over 30 minutes and running)? The ID is the PK and clustered index on the table. The estimated execution plan is a basic clustered index seek.
Primary platform: Windows XP SP2 with 2 GB RAM. SP2 for SQL Server 2005 has not been applied yet.
Does anyone have any tips on how to speed up development when a package has plenty of tasks? My .dtsx contains several loop and For Each containers, and ten tasks split between SQL and script tasks.
Hi group, I have a 175MM record table, with a record length of 200 bytes (about 20 columns). Sometimes when I run a very simple DTS to import our monthly text file of new records (about 10 million records) it really flies (takes less than an hour or 45 minutes). However, sometimes it takes forever... running over 6 or 7 hours before finishing. When it takes forever I run sp_who and don't see any blocking processes... To ensure that things move as quickly as possible I always drop the primary key and the indexes before importing, so it shouldn't be getting tied up trying to update the indexes. What sort of things could be holding up what I would expect to be a very simple process of appending the records to the end of the current table...? Thanks, Warren Wright, Experian-Scorex
I am new to SSIS and I am investigating using the Slowly Changing Dimension transform.
The data source that I receive is a daily snapshot of the external source system table. I need to store the history of the entity attributes (Type 2 SCD) and I am using the Start / End Date mechanism.
When an entity (identified by the business key) is no longer received in the source snapshot, I would like the data flow to update the End Date of the current row to show that the entity has now expired.
Does anyone have any suggestions for a good way to achieve this?
NB: Changing the source system extract to include and flag expired entities is not an option for me.
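For what it's worth, the logic being asked for is an anti-join: any business key that has a current row in the dimension but is absent from today's snapshot gets its End Date set. Below is a minimal Python sketch of that rule only, not of an SSIS data flow. The names `expire_missing`, `business_key` and `end_date` are made up for illustration, and a `None` end date is assumed to mark the current row.

```python
from datetime import date


def expire_missing(dimension_rows, snapshot_keys, as_of):
    """Set end_date on current rows whose business key is absent from the snapshot.

    dimension_rows: list of dicts with 'business_key' and 'end_date'
                    (assumption: end_date of None marks the current row).
    snapshot_keys:  iterable of business keys received in today's snapshot.
    Returns the business keys that were expired.
    """
    present = set(snapshot_keys)
    expired = []
    for row in dimension_rows:
        is_current = row["end_date"] is None
        if is_current and row["business_key"] not in present:
            row["end_date"] = as_of
            expired.append(row["business_key"])
    return expired


dimension = [
    {"business_key": "A", "end_date": None},
    {"business_key": "B", "end_date": None},
    {"business_key": "C", "end_date": date(2008, 1, 1)},  # already expired, left alone
]
print(expire_missing(dimension, ["A"], date(2008, 4, 8)))  # prints ['B']
```

In a database the same rule would normally be a single set-based UPDATE run after the load, rather than a row-by-row loop.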
Hi, I have a report with 18 cascading report parameters. Each report parameter has a unique dataset which passes the value of the previous parameter into the sql string. As I am selecting the report parameters it is taking longer to query the further down I go. I think this is because of the number of where conditions that are being passed through the sql query. The last report parameter is passing 17 where conditions.
In access when I have done this the parameters near the bottom were being refreshed quicker than the top ones - why is it the opposite way round in reporting services??? Any ideas of how to speed this process up?
I have one question regarding the Slowly Changing Dimension component in SSIS. Does the SCD also delete records in the warehouse if they no longer exist in the source, or does it only insert new and update existing records? Can someone also explain inferred members in a little more detail? Thanks.
I have an Access front end and a SQL Server back end. It works fine, but when I search by name in my customer table it takes a very, very long time. I connect through ODBC.
I've just had a call saying that their SQL Server 7 box on a Compaq 7000, which has been installed for about 6 months, is now running slowly. A look at the CPU shows that SQL Server is grabbing upwards of 50% of the CPUs (dual processor) even when the front-end application is not running. Is there anything that I should be checking?
I have a SQL Server that had Named Pipes, Multiprotocol, IPX/SPX and TCP/IP all loaded. It is running on a 10 Mbps network. I have run setup and turned off all protocols except TCP/IP to increase speed. This has increased speed on all the workstations, but the server now runs very, very slowly. If anyone out there has any ideas please let me know.
I have a Type 1 SCD situation, i.e., insert if new (by checking the business ID) and update if any attribute for a given business ID has changed.
The way I usually do this (and I believe this is how most people do it) is to use a LOOKUP TASK to determine whether the business ID exists in the target table. If it doesn't, I insert. If the business ID exists, I bring back the associated attributes and use a CONDITIONAL SPLIT TASK to check whether any of the incoming attributes are different. If there are changes, I update.
In doing this comparison, I often run into situations where I end up comparing a NULL value to something, which results not in FALSE but in NULL. To get around this, I first check for NULLs and convert them into something valid before I do the comparison, but this results in a messy comparison expression, especially if I have to compare a lot of attributes.
So, how do you guys handle this?
As an alternative, I am looking into the SLOWLY CHANGING DIMENSION TASK, which I also have some questions on, but I would like to first address the above. Thank you.
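One way to keep the expression tidy is to wrap the NULL handling in a single NULL-safe "differs" test and apply it to every attribute. The Python sketch below only illustrates the truth table; `differs` and `row_changed` are hypothetical names, and `None` stands in for NULL. Compared with converting NULLs into a placeholder value, it cannot collide with a real value.

```python
def differs(old, new):
    """NULL-safe inequality.

    Two NULLs (None) count as equal; NULL versus a real value counts as a change.
    """
    if old is None or new is None:
        return old is not new
    return old != new


def row_changed(existing, incoming, columns):
    """True if any compared attribute differs between the two rows."""
    return any(differs(existing.get(c), incoming.get(c)) for c in columns)


existing = {"name": "Acme", "region": None}
incoming = {"name": "Acme", "region": "EMEA"}
print(row_changed(existing, incoming, ["name", "region"]))  # prints True
```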
I have a package using the Slowly Changing Dimension in the data flow task. It works fine when the number of records is small, but for a large file the package fails with a "Violation of Primary Key" error even though there are no duplicate records in the table. For example, I have an employee table with a composite primary key comprising Name and Employee Id. I need to do an UPSERT depending on the Name and Employee Id combination. I have a file with 100,000 records, and when I try to execute the package it gives the error 'cannot insert duplicate data' even though the combination does not exist in the database.
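One thing that may be worth ruling out (a guess, not a diagnosis) is that the same Name and Employee Id combination appears more than once inside the incoming file itself. Two such rows could both be treated as new if the first has not reached the table yet when the second is looked up. A small Python sketch for checking a batch follows. `duplicate_keys` and its `normalize` hook are hypothetical, and the trimming and lower-casing assumes the database compares keys case-insensitively and ignores trailing spaces.

```python
from collections import Counter


def duplicate_keys(rows, key_columns, normalize=lambda v: v):
    """Return {composite_key: count} for keys that occur more than once in rows."""
    counts = Counter(
        tuple(normalize(row[c]) for c in key_columns) for row in rows
    )
    return {key: n for key, n in counts.items() if n > 1}


def trim_and_fold(value):
    """Assumed normalisation: trim and lower-case strings, leave other types alone."""
    return value.strip().lower() if isinstance(value, str) else value


batch = [
    {"Name": "Ann", "EmployeeId": 1},
    {"Name": "ann ", "EmployeeId": 1},  # same key after trimming and case-folding
    {"Name": "Bob", "EmployeeId": 2},
]
print(duplicate_keys(batch, ["Name", "EmployeeId"]))                 # prints {}
print(duplicate_keys(batch, ["Name", "EmployeeId"], trim_and_fold))  # prints {('ann', 1): 2}
```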
We have been using tasks generated from the SCD wizard. We have smaller dimensions (< 30,000 rows) that work well. Our Product Dimension package is giving us performance problems (taking 7 hours to do 600,000 rows when 80,000 records are updated; the rest are new inserts). It is similar to the smaller dimensions. Several columns are type 1 and are doing update statements; several are type 2 doing updates and inserts. The package had a complicated view as the initial task, but we have since modified it to use a SQL command with a variable, and now the initial read appears quick but is chunking in 10,000 record increments and taking the 7 hours (we never let it finish previously). So the package is pretty basic now (reading a source, a small derive and data conversion, a small lookup (cached 30,000 records) for a description, then the SCD). Before I start replacing what the SCD generates with stored procedures, does anyone have any suggestions as to what might be the issue? We believe we have increased the number of type 2 columns and the SCD definitely has more to do than just an insert or update, but 7 hours for 600,000 records seems excessive. Interestingly, the source task never turns green. Previously when we had a Merge Join it completed the read and bottlenecked at a sort and a Merge Join. Now that has been removed and simplified, and all tasks remain yellow with the 10,000 (actually 9,990 I think) chunks appearing at the source, then the SCD before the next chunk appears to be read. We are on the general release (not the beta). Thanks in advance!
I am using SQL Server 2005 Developer Edition and have created several reports, deployed them to SQL Server and tested them.
Everything works correctly, but when I open the Report Manager using localhost/Reports, the page takes several minutes to connect and open. I am doing this on a new laptop I use for development and testing, so I can see where it would be a little slow, but a few minutes seems extreme. I don't have any problem with SQL Server itself, nor with VS 2005.
Are there any tips or setup issues which might affect the startup speed?
I am using the SCD to insert or update. My source and destination tables are in Oracle and I am using the Oracle OLE DB provider. I am getting the following error while executing the package. What could be the solution?
[Slowly Changing Dimension [58]] Error: An OLE DB error has occurred. Error code: 0x80040E5D. An OLE DB record is available. Source: "Microsoft OLE DB Provider for Oracle" Hresult: 0x80040E5D Description: "Parameter name is unrecognized.".
Just wondering if any of you have implemented a (Kimball Type 2) dimension structure, in which a ParentID column points to a record in the same dimension table, using the SCD object in SSIS. The ParentID column would have to be "Historical".
The challenge here is that you would need to go through the table twice somehow, because if I did a lookup of the parent record in the first run, I couldn't be sure I had got the right parent record.
I have a Company Dimension table that consists of various sources. One source will provide me address information, another source will provide industry info, etc. I created a historical load package that will pull all of this together so that I have all the necessary data related to a company in one record. All is well.
Since my company data is coming from various sources, how can I tell the SCD to update certain fields but not others for a type 2 change? In essence, I would like to "pull forward" the data that was in the original database row and then update it with only the changes coming from the proper source data. For example, if an address changed I will get the new address from the source but will not have the industry info. I would like to create the new record with the new address but also keep the industry data intact. Is this possible?
Currently I will get the new record with the new address but will have null values for the industry data.
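The "pull forward" behaviour described here amounts to copying the current row and overwriting only the columns the source actually supplied. Here is a minimal Python sketch of that merge rule. `pull_forward` is a hypothetical name, and a `None` value is assumed to mean "this source does not carry the column", so a source could never deliberately blank a field under this rule.

```python
def pull_forward(current_row, incoming, columns):
    """Build the new type 2 row: start from the current row, overlay supplied values.

    Assumption: None in `incoming` means "not provided by this source",
    so the existing value is carried forward.
    """
    new_row = dict(current_row)  # copy, so the historical row is left untouched
    for col in columns:
        value = incoming.get(col)
        if value is not None:
            new_row[col] = value
    return new_row


current = {"company": "Acme", "address": "1 Old Rd", "industry": "Retail"}
address_feed = {"company": "Acme", "address": "9 New St", "industry": None}
print(pull_forward(current, address_feed, ["address", "industry"]))
# prints {'company': 'Acme', 'address': '9 New St', 'industry': 'Retail'}
```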
I have questions about Slowly Changing Dimensions. I am quite confused about when we should use type 1 (changing), type 2 (historical), or type 3 (fixed) for the dimensions in each table. Are there any good suggestions on that?
Thank you in advance and I am looking forward to hearing from you.
I am trying to move data from a transactional database to a data warehouse using a slowly changing dimension. The transactional data comes from a view in SQL server that takes <60 seconds to run and returns about 60k rows. The warehouse table is currently 80k rows long (and growing), and contains 7 historical (type 2) dimensions. When I execute the package in BIDS the DataFlow Task begins to execute, and shows that between 20k and 30k rows have been pulled from the data source into the SCD Transform in the first hour before it simply stops doing anything. This is not to say execution stops; it continues. There is no error thrown. No warning given. System resources are 98% free. The database is not being hit at all. And yet, I have let the package sit 'still' as it were for over 8 hours, and nothing ever happens.
4/8/2008 9:36 4/8/2008 9:36 PrimeOutput will be called on a component. : 1715 : Union All
4/8/2008 9:36 4/8/2008 9:36 A component has returned from its PrimeOutput call. : 1715 : Union All
4/8/2008 9:36 4/8/2008 9:36 PrimeOutput will be called on a component. : 2912 : Staged Queues
4/8/2008 9:36 4/8/2008 9:36 Rows were provided to a data flow component as input. : : 2970 : DataReader Output : 70 : Slowly Changing Dimension : 81 : Slowly Changing Dimension Input : 9947
4/8/2008 9:37 4/8/2008 9:37 A component has returned from its PrimeOutput call. : 2912 : Staged Queues
4/8/2008 9:37 4/8/2008 9:37 A component has returned from its PrimeOutput call. : 2912 : Staged Queues
4/8/2008 9:59 4/8/2008 9:59 Rows were provided to a data flow component as input. : : 1718 : New Output : 1715 : Union All : 1716 : Union All Input 1 : 3825
4/8/2008 9:59 4/8/2008 9:59 Rows were provided to a data flow component as input. : : 1688 : Historical Attribute Inserts Output : 1682 : Get End Date : 1683 : Derived Column Input : 645
4/8/2008 9:59 4/8/2008 9:59 Rows were provided to a data flow component as input. : : 1702 : Derived Column Output : 1692 : Update End Date : 1697 : OLE DB Command Input : 645
4/8/2008 10:01 4/8/2008 10:01 Rows were provided to a data flow component as input. : : 1759 : OLE DB Command Output : 1715 : Union All : 1758 : Union All Input 2 : 645
4/8/2008 10:01 4/8/2008 10:01 Rows were provided to a data flow component as input. : : 2970 : DataReader Output : 70 : Slowly Changing Dimension : 81 : Slowly Changing Dimension Input : 9947
4/8/2008 10:24 4/8/2008 10:24 Rows were provided to a data flow component as input. : : 1718 : New Output : 1715 : Union All : 1716 : Union All Input 1 : 3859
4/8/2008 10:24 4/8/2008 10:24 Rows were provided to a data flow component as input. : : 1688 : Historical Attribute Inserts Output : 1682 : Get End Date : 1683 : Derived Column Input : 641
4/8/2008 10:24 4/8/2008 10:24 Rows were provided to a data flow component as input. : : 1702 : Derived Column Output : 1692 : Update End Date : 1697 : OLE DB Command Input : 641
4/8/2008 10:26 4/8/2008 10:26 Rows were provided to a data flow component as input. : : 1759 : OLE DB Command Output : 1715 : Union All : 1758 : Union All Input 2 : 641
4/8/2008 10:26 4/8/2008 10:26 Rows were provided to a data flow component as input. : : 2970 : DataReader Output : 70 : Slowly Changing Dimension : 81 : Slowly Changing Dimension Input : 9947
4/8/2008 10:49 4/8/2008 10:49 Rows were provided to a data flow component as input. : : 1718 : New Output : 1715 : Union All : 1716 : Union All Input 1 : 3969
4/8/2008 10:49 4/8/2008 10:49 Rows were provided to a data flow component as input. : : 1688 : Historical Attribute Inserts Output : 1682 : Get End Date : 1683 : Derived Column Input : 662
4/8/2008 10:49 4/8/2008 10:49 Rows were provided to a data flow component as input. : : 1702 : Derived Column Output : 1692 : Update End Date : 1697 : OLE DB Command Input : 662
4/8/2008 10:49 4/8/2008 10:49 Rows were provided to a data flow component as input. : : 1793 : Union All Output 1 : 1787 : Get Start Date : 1788 : Derived Column Input : 9947
4/8/2008 10:49 4/8/2008 10:49 Rows were provided to a data flow component as input. : : 1814 : Derived Column Output : 1797 : Insert Destination : 1810 : OLE DB Destination Input : 9947
4/8/2008 15:34 4/8/2008 15:34 The pipeline received a request to cancel and is shutting down.
4/8/2008 15:34 4/8/2008 15:34 Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown or an error in another thread is causing the pipeline to shutdown.
4/8/2008 15:34 4/8/2008 15:34 Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown or an error in another thread is causing the pipeline to shutdown.
4/8/2008 15:34 4/8/2008 15:34 The pipeline received a request to cancel and is shutting down.
4/8/2008 15:34 4/8/2008 15:34 Thread "WorkThread1" has exited with error code 0xC0047039.
4/8/2008 15:34 4/8/2008 15:34 Thread "WorkThread1" has exited with error code 0xC0047039.
Notice the time difference between the last OnPipelineRowsSent event and the first OnError event (when I clicked the stop button): 5 hours! In all that time, SSIS did not log a single event, or use more than 2% of my processor or exceed 1GB page file or hit the database even once! I am assuming this means it is simply not doing anything. It is not failing, nor is it executing, it is just sitting there.
Has anyone experienced a similar problem? Does anyone know how I might troubleshoot this? Thanks in advance for any help, and let me know if I need to clarify. Also, I am new to SSIS, so if I am missing something obvious, go easy on me! Thanks.
The warehouse I am writing packages for has sort of a "Type 1.5" design for most of its DIMs that I am trying to get to work with the slowly changing dimension object.
Basically it should behave like a type 1 with updates in place BUT send the prior rows/values to an "archive" server to hold the historical data. Unlike a Type 2, this data will not be used for any processes, but it needs to be kept for historical research and auditing.
Any ideas on how to do this easily with the SCD wizard? I thought using the wizard as a type 1 and then, after the wizard is done, attaching the "Historical Attribute Inserts Output" to the archive db/table would do the trick, but that output from the SCD object never has data. I could do it manually with a lookup and so forth, but I thought I'd check in here first to see if I am just overlooking something with the SCD object.
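The "Type 1.5" rule described above can be written down as: on a change, copy the prior values to the archive first, then overwrite in place. The Python sketch below shows that ordering only. It does not reproduce what the SCD wizard generates, and `apply_type1_with_archive` and its arguments are hypothetical names, with an in-memory dict and list standing in for the dimension and archive tables.

```python
def apply_type1_with_archive(dimension, archive, incoming, key_column, columns):
    """Type 1 overwrite that first copies the prior values to an archive.

    dimension: dict mapping business key -> dict of attribute values.
    archive:   list receiving a copy of each overwritten row.
    Returns 'inserted', 'unchanged' or 'updated'.
    """
    key = incoming[key_column]
    current = dimension.get(key)
    if current is None:
        dimension[key] = {c: incoming[c] for c in columns}
        return "inserted"
    if all(current[c] == incoming[c] for c in columns):
        return "unchanged"
    archive.append({key_column: key, **current})  # snapshot taken before the overwrite
    current.update({c: incoming[c] for c in columns})
    return "updated"


dim, arch = {}, []
print(apply_type1_with_archive(dim, arch, {"id": 7, "city": "Leeds"}, "id", ["city"]))  # prints inserted
print(apply_type1_with_archive(dim, arch, {"id": 7, "city": "York"}, "id", ["city"]))   # prints updated
print(arch)  # prints [{'id': 7, 'city': 'Leeds'}]
```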
Is there a way to change the data source for an SCD component without having to go back and re-enter the matches between Source and Destination columns? Note that the underlying data table hasn't changed, just the server the table resides on. Whenever I change the data source I find that I have to painfully go back and match columns one by one.
Hi, I need to transfer the data in table A on a 2005 instance to table B, which has the same structure as table A, on a 2000 instance. There are 200,000 records in table A. If I use <insert B select * from linkedserver.....>, it takes only 30 seconds. I created an SSIS package to do this, but it is very slow. After it had run for 10 minutes I had to stop it, and I found that it transfers about 100 records per second. Then I swapped the source and destination servers, that is, transferring the same data from the 2000 instance to the 2005 instance. It takes only 50 seconds. Why? How can I make the package that transfers data from the 2005 instance to the 2000 instance run fast?
Does it build hash values for the inbound values and compare them to a stored hash value to determine whether a change exists?
Is each of the attributes checked one by one for a change?
Does a lookup occur at some point on the target table? I.e., if I have a 16m row table, will it look up and cache all 16m rows, or is it smart enough to look up and cache only the rows that it expects based on a PK value?
Basically what I'm asking is for a technical explanation as to how it works and how to tune it or the underlying data to make it perform well.
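To make the first question concrete, here is what the hash-comparison approach looks like in general. It is a Python sketch of the technique the question describes and should not be read as a description of what the SCD transform does internally, which this thread does not establish. `row_hash`, `classify` and the stored-hash dictionary are hypothetical.

```python
import hashlib
import json


def row_hash(row, columns):
    """Hash the tracked attributes of a row in a fixed column order.

    json.dumps keeps None distinct from '' and avoids delimiter collisions
    such as ('ab', 'c') versus ('a', 'bc').
    """
    payload = json.dumps([row.get(c) for c in columns], default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(incoming_rows, stored_hashes, key_column, columns):
    """Sort incoming business keys into new / changed / unchanged by stored hash."""
    result = {"new": [], "changed": [], "unchanged": []}
    for row in incoming_rows:
        key = row[key_column]
        if key not in stored_hashes:
            result["new"].append(key)
        elif stored_hashes[key] != row_hash(row, columns):
            result["changed"].append(key)
        else:
            result["unchanged"].append(key)
    return result
```

With this design only the key and one hash per existing row need to be held in memory, rather than every tracked attribute, which is the usual argument for it on very large tables.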
I have a problem with the SCD transformation in SSIS. I have a variable that holds the batchid for the current batch and I want to add this variable to the data pipeline in the Data Flow Task.
This is done by using a Derived Column, so far so good. The problem occurs in the Slowly Changing Dimension transformation, where I do some evaluations of changed columns BUT I don't want to do any evaluation of the batchid variable, because then all historical batchid values will be updated.
I only want to update the batchid for row that have changed in the current batch.
Is it possible to do this in any way without adding the Derived Column after the SCD transformation??
We're using slowly changing dimensions to control a number of data tables in our system. Each table has five or six business keys, but the indexes of the tables are built so they're as efficient as possible (i.e. the fields with the highest diversity are listed first). How does the SCD wizard determine the order of the business key fields? Is there a way I can view or manipulate the statement the SCD task is using to make sure either (a) the indexes match the statement, or (b) the statement matches the indexes?
I am working on creating an SSIS package and adding a Slowly Changing Dimension to the package programmatically.
I have done the following steps:
1. Choosing the connection manager to access the data source that contains the dimension table that you want to update. You can select from a list of connection managers that the package includes.
2. Choosing the dimension table or view you want to update. After you select the connection manager, you can select the table or view from the data source.
3. Setting key attributes on columns and map input columns to columns in the dimension table. You must choose at least one business key column in the dimension table and map it to an input column. Other input columns can be mapped to columns in the dimension table as non-key mappings.
4. Choose the change type for each column.
   o Changing attribute overwrites existing values in records.
   o Historical attribute creates new records instead of updating existing records.
   o Fixed attribute indicates that the column value must not change.
Code:
// Set the Key Element as part of Creating the SCD Transformation:
sbquery.Append("UPDATE SIRWorkdm..[Engagement] SET [BillingType] = ? WHERE [EngagementId] = ?"); //Here BillingType is the "ChangingColumnAttribute" and EngagementId is the key
On the face of it, the Slowly Changing Wizard seems a great idea, but is it really any good in the longer term? After you have selected your Type 1 and/or Type 2 changing columns ... then you have "customised" the generated data flow and carefully saved it ... how do you change the column definitions in the future?
If I want to add another column and make it Type 2 how do I go about it? If I run the "Wizard" doesn't this just destroy all my previous customisation work?
I have tried looking in the .dtsx xml file ... hmmm. I notice it's not really editable without some inside information. All those magic numbers in there ... I've fallen foul of them in the past with cut/paste or trying to INSERT a data flow task on the IDE ... luckily I back up my projects on a regular basis. I now have quite a large collection of Projects with the suffix "_corrupt". Are things going to get better?
Dim cnn As ADODB.Connection
Dim rst As ADODB.Recordset

Private Sub Form_Load()
    Set cnn = New ADODB.Connection
    cnn.ConnectionString = "driver={SQL Server};" & "server=SCHS-SQL;uid=sa;pwd=sa;database=Library"
    cnn.Open
    Call loadrst
End Sub

Public Sub loadrst()
    Set rst = New ADODB.Recordset
    Dim sql1 As String
    sql1 = "select * from Books order by srno"
    rst.Open sql1, cnn, adOpenDynamic, adLockOptimistic, adCmdText
    If rst.EOF = True Then
        MsgBox ("No records are present")
        Command1.Enabled = False
    Else
        Call display
        Command1.Enabled = True
    End If
End Sub
This is basically the code I use to connect my VB6 application to SQL Server 2005. I recently started trying to use SQL Server instead of Access. So far none of the programs has given any problems, as their databases hold at most 120 records. But the one this code connects to has about 5,200 records. I imported the tables from Access into SQL Server. The database was around 17.67 MB, so I shrank it and it became 4 MB. But it still takes roughly 2 minutes for the user to see the records in the grid. Could you tell me what to do?
I have an SSIS package which contains a number of slowly changing dimension transformations. While the majority work, I have one which gives me the following error: 'Error: The variable "System::LocaleID" is already on the read list. A variable may only be added once to either the read lock list or the write lock list. '. This error only occurs if the destination table holds data. If I truncate the table and reload the data then the package completes successfully. The only difference I can see between this dimension transformation and the other dimension transforms is that the one in question has 2 business keys while the rest have 1.
We're currently running into an interesting situation where it seems that a Slowly Changing Dimension Transformation believes that 'œ' (ASCII #156) is the same as 'oe' (ASCII #111 + ASCII #101).
To make a long story short, one of our integration package updates some Product table based on the result of a Slowly Changing Dimension Transformation with three outputs: Unchanged, New and Changing Attribute Updates. Among the columns leading to a row redirection in the Changing Attribute Updates output is some French Description column defined in SQL Server as a varchar(60) (external column) and in the SSIS package as a (DT_STR, 60, 1252) (input column). Now, when the SCD Transformation compares the word 'coeur' (external column) with the word 'cœur' (input column) in this French Description column for a given row, the row is redirected to the Unchanged output (no other columns changed...) instead of to the Changing Attribute Updates output.
Is my example clear enough? Any idea what could explain this unfortunate result? Note that from a strict French point of view, 'œ' instead of 'oe' is a typographical fantasy and that in my example above, the word 'coeur' ("heart") is really spelled 'c' + 'o' + 'e' + 'u' + 'r', but we're talking programmed comparison here, not linguistic, right?
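The difference between a "programmed" (ordinal) comparison and a linguistic one can be shown in a few lines. In the Python sketch below the ordinal test compares code points, while the toy linguistic test expands ligatures before comparing. Whether the SCD transformation applies a linguistic comparison is an assumption here, not something this post confirms, and `LIGATURES`, `expand_ligatures` and `linguistic_equal` are made up for illustration.

```python
# Byte 156 in code page 1252 (the 1252 in DT_STR, 60, 1252) decodes to the ligature.
LIGATURE_FROM_CP1252 = bytes([156]).decode("cp1252")

# Toy expansion table; a real linguistic comparison uses locale collation rules.
LIGATURES = {"œ": "oe", "Œ": "OE", "æ": "ae", "Æ": "AE"}


def expand_ligatures(text):
    """Replace each ligature with its two-letter expansion."""
    return "".join(LIGATURES.get(ch, ch) for ch in text)


def ordinal_equal(a, b):
    """Code-point-by-code-point comparison."""
    return a == b


def linguistic_equal(a, b):
    """Toy stand-in for a locale-aware comparison that expands ligatures."""
    return expand_ligatures(a) == expand_ligatures(b)


print(ordinal_equal("cœur", "coeur"))     # prints False
print(linguistic_equal("cœur", "coeur"))  # prints True
```

If the comparison really is linguistic, the row landing in the Unchanged output would be consistent with the second result rather than the first.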
In our application we have created an SSIS package which extracts data from a staging table and places it in a destination table. We have created a slowly changing dimension for this. The slowly changing dimension uses a composite business key of two columns to decide whether a record is old or new.
Problem: On execution, the package inserts duplicate records with the same business keys instead of updating the existing ones. This does not happen for all records. For a few records the update works fine, but for others it inserts a new duplicate record.
I would appreciate it if anybody could point out where I am going wrong.
Hello, I use SSIS to load a Unicode file into a single table. I use a "Slowly Changing Dimension" task to load the destination table, and when I map a DT_WSTR column to a column with the data type nvarchar(max) I get an error message saying that I can't map these columns because they do not have the same data type.
I found a workaround: I map all my columns except those that must fill the nvarchar(max) columns, and afterwards I manually modify the two subtasks generated by the "Slowly Changing Dimension" task (the insert and the update). This way I don't get error messages. It works fine, but is it the right way?