I have developed a large SSIS package to extract data from flat files (200+ Data Flows).
The situation is the following: inside the SSIS package there are a lot of validations before extracting and loading the flat files. I'm running these validations in parallel, so that when a file arrives it enters the "validation process" and the package starts extracting the file.
When I run the SSIS package from BIDS it works the way I designed it... but when I run the package on the server, the tables that are loaded during the process only become "available" when the SSIS package ends. It is imperative that, throughout the process, a table becomes ready as soon as it receives new data, instead of only becoming available when the SSIS package finishes...
I have attached a .jpeg illustrating the process.
It is important for the tables to be available so that the stored procedures (outside the SSIS package) that depend on some of those tables can start working before the SSIS package ends.
I'm trying to do some custom SSIS logging using event handlers, similar to the ideas Jamie Thomson has provided in the past. My problem is that when I use System::SourceID as one of the items to be logged, I can't match the SourceID up to any of the GUIDs displayed in the Properties window for the various tasks in my package.
Where is this SourceID coming from and how can I track it down?
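For context, one way the value gets picked up for this kind of custom logging is via a Script Task inside the event handler (a minimal VB.NET sketch; the variables must be listed in the task's ReadOnlyVariables, and the actual logging destination is omitted):

' SSIS 2005 event-handler Script Task (VB.NET): read the system variables being logged.
' System::SourceID / System::SourceName must appear in the task's ReadOnlyVariables list.
Dim sourceId As String = CStr(Dts.Variables("System::SourceID").Value)
Dim sourceName As String = CStr(Dts.Variables("System::SourceName").Value)
' ... write sourceId and sourceName to the logging table here
Dts.TaskResult = Dts.Results.Success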
I have some "Execute T-SQL Statement Tasks" in a package. I would like to run this same package on another SQL Server without having to change it on the other server. Since the server name can be given when setting up the connection, I think if I leave the server name out then the package could run on any server? Is my assumption correct?
I've made a query like the one in MSDN (SELECT * FROM __InstanceCreationEvent WITHIN 10 WHERE TargetInstance ISA "CIM_DirectoryContainsFile" and TargetInstance.GroupComponent= "Win32_Directory.Name="e:\\temp""). I have 20 similar tasks watching different folders, but when there are too many tasks in parallel it doesn't work anymore. I changed the number of executables to 128 (in the general properties of the package, just to test) but it doesn't seem to help.
I don't understand why it works when there are only 1 or 2 tasks (6 seems to be the maximum) and not when there are more than 6.
Could you help me with this issue?
Configuration: Windows Server 2003, SQL Server 2005, SSIS, SQL Server Agent
Hi, is it possible to change a Data Flow task's properties with VB code? I've managed to change the properties of Control Flow tasks with VB code. I want to create a generic data flow that changes every time I run it. Cheers
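In case it helps frame the question, here is a minimal VB.NET sketch of reaching into a Data Flow task through the runtime API (SQL Server 2005, so the *90 interfaces); the package path, task name, component name and SQL text are placeholders, not real ones:

' References: Microsoft.SqlServer.ManagedDTS and Microsoft.SqlServer.DTSPipelineWrap
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

Module ChangeDataFlow
    Sub Main()
        Dim app As New Application()
        ' Path and task name below are placeholders
        Dim pkg As Package = app.LoadPackage("C:\packages\GenericLoad.dtsx", Nothing)
        Dim host As TaskHost = CType(pkg.Executables("Data Flow Task"), TaskHost)

        ' A Data Flow task's InnerObject is the pipeline itself (MainPipe)
        Dim pipe As MainPipe = CType(host.InnerObject, MainPipe)

        ' Walk the components and change a design-time property on one of them
        For Each comp As IDTSComponentMetaData90 In pipe.ComponentMetaDataCollection
            If comp.Name = "OLE DB Source" Then
                Dim inst As CManagedComponentWrapper = comp.Instantiate()
                inst.SetComponentProperty("SqlCommand", "SELECT * FROM dbo.SomeTable")
                ' A metadata refresh (AcquireConnections / ReinitializeMetaData) may also be needed
            End If
        Next

        app.SaveToXml("C:\packages\GenericLoad.dtsx", pkg, Nothing)
    End Sub
End Module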
I've created a package that currently uses 5 Data Flow tasks connected in series to get data from 5 different files and place that information into 5 different temp tables. Each Data Flow task contains only an OLE DB source, a row count and an OLE DB destination. My question is: is it normal practice to keep each of these separate, or should I put them all into a single Data Flow? The package should only continue if all five Data Flow tasks complete successfully.
How would you do logging in a massive row load? I'm having problems because every row error (caused by casting, format or lookup failures) in a transformation task is redirected to a text file as a log. This is OK when there is only one error per row, but when I have two errors in the same row, detected by different transformation tasks, only the first one is reported to the text file. I have to wait for the second load of the data, after I correct the first error, to find the second one. I need to catch as many errors as exist per row in the same load...
Which component, or which strategy, can I use in an SSIS package to achieve this?
These refer to Microsoft.SqlServer.DTSPipelineWrap.dll and Microsoft.SQLServer.DTSRuntimeWrap.dll. While these assemblies were already there on my dev machine, I don't find these files in the production environment for SQL Server 2005. I am referencing these assemblies from the following path on my local machine: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies.
How do I install these assemblies in the production environment? Of course, one option is to copy them and then put them in the GAC through a script. Why don't they get installed during the installation of SQL Server 2005? Are these assemblies dependent on SP1?
p.s. Does anyone have any needles I can borrow? I think sticking them in my eyes would be nicer than working with SSIS.
===================================
An error occurred while objects were being copied. SSIS Designer could not serialize the SSIS runtime objects. (Microsoft Visual Studio)
===================================
Could not copy object 'Preparation SQL Task' to the clipboard. (Microsoft.DataTransformationServices.Design)
------------------------------ For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.762&EvtSrc=Microsoft.DataTransformationServices.Design.SR&EvtID=SerializeComponentsFailed&LinkId=20476
------------------------------ Program Location:
at Microsoft.DataTransformationServices.Design.DtsClipboardCommandHelper.SerializeRuntimeObjects(ICollection logicalObjects) at Microsoft.DataTransformationServices.Design.ControlFlowClipboardCommandHelper.InternalMenuCopy(MenuCommand sender, CommandHandlingArgs args)
===================================
Invalid access to memory location. (Exception from HRESULT: 0x800703E6) (Microsoft.SqlServer.ManagedDTS)
------------------------------ Program Location:
at Microsoft.SqlServer.Dts.Runtime.PersistImpl.SaveToXML(XmlDocument& doc, XmlNode node, IDTSEvents events) at Microsoft.SqlServer.Dts.Runtime.DtsContainer.SaveToXML(XmlDocument& doc, XmlNode node, IDTSEvents events) at Microsoft.DataTransformationServices.Design.DtsClipboardCommandHelper.SerializeRuntimeObjects(ICollection logicalObjects)
Hi, I have one package with a complex Control Flow. I need to import a data flow from another package into it. Cut and paste works, but it looks like my package gets corrupted after that (for example, the database login now has an authentication problem). Is there any import or export tool for this? Regards
I have 4 data flows in a For Loop; is it possible to roll the whole For Loop back if any of the transactions fail? Also, is it possible to roll back a data flow if any of the components inside the data flow fail?
I'm building packages programmatically and all is well. I have a new custom transform that I developed. It also works fine. Now I'm trying to add my new component to my packages when I programmatically build them, and I'm unable to do that.
Has anyone added their own custom components to a programmatically built package successfully?
I get a COM error on the line that calls ProvideComponentProperties. I've attempted various modifications including not overriding ProvideComponentProperties or just having it do nothing. I always get the same result. What I don't understand is that the custom transform works and handles ProvideComponentProperties fine when it is added to a package in BIDS.
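For reference, the pattern I'm following when building the data flow is roughly this (a sketch; the Data Flow moniker is the standard SSIS 2005 one, but the class and assembly names in ComponentClassID are placeholders for my custom transform):

' References: Microsoft.SqlServer.ManagedDTS and Microsoft.SqlServer.DTSPipelineWrap
Imports Microsoft.SqlServer.Dts.Runtime
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper

Module BuildWithCustomTransform
    Sub Main()
        Dim pkg As New Package()

        ' Add a Data Flow task and get at its pipeline
        Dim exec As Executable = pkg.Executables.Add("DTS.Pipeline.1")
        Dim pipe As MainPipe = CType(CType(exec, TaskHost).InnerObject, MainPipe)

        ' Add the custom transform; for a managed component, ComponentClassID is the
        ' assembly-qualified type name (the one below is a placeholder, not the real one)
        Dim comp As IDTSComponentMetaData90 = pipe.ComponentMetaDataCollection.New()
        comp.ComponentClassID = "MyCompany.Ssis.MyCustomTransform, MyCompany.Ssis, " & _
            "Version=1.0.0.0, Culture=neutral, PublicKeyToken=0123456789abcdef"
        Dim inst As CManagedComponentWrapper = comp.Instantiate()
        inst.ProvideComponentProperties()   ' this is the call that raises the COM error
    End Sub
End Module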
I have done a search and have read some of the posts, but am left more confused than before. I am fairly new to SSIS. Here is my situation and what I am trying to accomplish.
I have a package that has a sequence container, in which there are multiple SQL tasks (about 20) running in parallel. I have checkpoints enabled, and FailPackageOnFailure enabled as well. If the package fails, when I re-run the package it will run the last task as well as all the other tasks. What I am looking to accomplish is that when the package is re-run, only the SQL tasks that failed are run, not the previously successful tasks.
I think the best way would be to disable tasks on successful completion, by writing the name of each completed SQL task to a temp table, but I am skeptical.
Can anyone point me in a direction to help me accomplish what I am looking for, please?
We want to develop an error handling process that will log errors to multiple destinations (event log, text files or a SQL database) depending upon a variable set in the package. We also want this error handling process to be initiated by all the tasks in the package on error.
Is this possible? Can the same event handler be called from multiple tasks in the package? Also, in the event handler can we call another package which actually does the error handling? This way we would have only one place to change our error handling process if required.
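What we have in mind is roughly the following (a VB.NET sketch of a Script Task sitting in the package's OnError event handler; the variable User::LogDestination, the event source name and the file path are placeholders):

' SSIS 2005 OnError event-handler Script Task. User::LogDestination, System::SourceName and
' System::ErrorDescription must be listed in the task's ReadOnlyVariables.
Dim target As String = CStr(Dts.Variables("User::LogDestination").Value)
Dim source As String = CStr(Dts.Variables("System::SourceName").Value)
Dim msg As String = CStr(Dts.Variables("System::ErrorDescription").Value)

Select Case target
    Case "EventLog"
        System.Diagnostics.EventLog.WriteEntry("MyEtlPackage", source & ": " & msg, _
            System.Diagnostics.EventLogEntryType.Error)
    Case "TextFile"
        System.IO.File.AppendAllText("C:\logs\etl_errors.log", _
            Date.Now.ToString("s") & " " & source & ": " & msg & Environment.NewLine)
    Case Else
        ' "Database": hand off to an Execute SQL Task in the same handler, or write with ADO.NET here
End Select

Dts.TaskResult = Dts.Results.Success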
This may sound a little anal-retentive, but I have a number of SSIS packages where, when I open them, the first thing I have to do is scroll left or up to get to where the tasks are displayed. Even if I move the tasks right or down, they still end up in that initial position. This happens even if I use auto arrange.
Is there a way for me to set the package so it has a consistent point or display at which it opens?
Hello. I'm trying to figure out a clever solution for splitting multivalued columns out into n columns. For that I've built a custom component in SSIS, using
ms-help://MS.VSCC.v80/MS.VSIPCC.v80/MS.SQLSVR.v9.en/dtsref9/html/4dc0f631-8fd6-4007-b573-ca67f58ca068.htm as an example.
I need to be able to add columns to the output collection at design time, but the designer returns an error: Error at Data Flow Task [Uppercase [5910]]: The component "Uppercase" (5910) does not allow setting output column datatype properties.
How do I enable the designer to accept design-time changes to the column collection?
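For reference, the design-time code that adds an output column currently does roughly this (a sketch; the column name and the DT_WSTR(50) type are just examples, and I'm assuming the usual Microsoft.SqlServer.Dts.Pipeline, Pipeline.Wrapper and Runtime.Wrapper imports inside the component class):

' Inside the custom component (Inherits PipelineComponent), SSIS 2005
Public Overrides Sub ProvideComponentProperties()
    ' Start clean, then add one input and one synchronous output
    RemoveAllInputsOutputsAndCustomProperties()
    Dim input As IDTSInput90 = ComponentMetaData.InputCollection.New()
    input.Name = "Input"
    Dim output As IDTSOutput90 = ComponentMetaData.OutputCollection.New()
    output.Name = "Output"
    output.SynchronousInputID = input.ID

    ' Add an output column at design time (name and type are illustrative)
    Dim col As IDTSOutputColumn90 = output.OutputColumnCollection.New()
    col.Name = "SplitValue1"
    col.SetDataTypeProperties(DataType.DT_WSTR, 50, 0, 0, 0)
End Sub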
I created a data flow that transferred about 1 million records from a SQL database on one server to a different SQL database on the same server. The processing took about 30 minutes. I used the Fast Load option.
I then created an "Execute SQL Task" and wrote a "SELECT * INTO TABLE" statement, and this processing took about 30-60 seconds.
Can someone tell me why creating a Data Flow task would take so much longer, or explain the differences between the two options above? Can someone give some pointers on how to make a Data Flow task more efficient?
I am trying to recode a SQL2K DDQ (Data Driven Query) into an SSIS data flow. I have no issue recoding it in SSIS, except I am not sure that I am doing it the correct way.
My DDQ has a source, a destination, update/delete/insert statements and a few lookups. I used an ActiveX script for the transformation because I have at least 10 if-conditions. For every if-condition the destination columns are populated from different lookups/source columns and constant values.
Now, when I start doing it with SSIS, I have to use at least 10 Conditional Splits, and then at least one Lookup, one OLE DB Command and one OLE DB Destination for each of them. That brings my count of data flow objects to 30-40.
It makes my data flow too complex, with too many objects. Earlier I could do all of this in one single DDQ. It makes me wonder if I am doing it the correct way. Should I be using an ActiveX Script task for this kind of activity?
I need to create an SSIS package. I want to import the data from a flat file to a table.
Let's say the table has 5 columns -- col1, col2, col3, col4, col5 (assume that all columns are nullable). The data file contains data related to only three columns, say col1, col2, col3. So when I use a Data Flow task to import the data from the file to the table, I will only get three columns: col1, col2, col3. Columns col4 and col5 will be NULL. However, I want to populate columns col4 and col5 with some values which are stored in variables.
When I drop a new component onto the design surface, it appears with dotted lines around it, as it is selected.
But, the F2 key (the hotkey for rename) does not work.
I have to click on some other component, and then back on the new component, and then the F2 key works.
I remember reporting this bug back in the beta cycle, but it is still present even in the release -- I actually think I'm using the SP1 version of 2005, but see version info below to be sure:
Remote terminal services broke my copy buffer again, as it so often does, so pasting in the version info failed -- falling back to manual typing -- SSIS 9.00.2047.00.
It just occurs to me that it might not be a bad idea to post again the problems in SSIS that I found back in the beta cycle which do not seem to have been fixed, to increase the chance that someone will notice them.
(During the beta cycle, I spent hours trying to get bugs posted into the beta bug system, but the beta web bug site had so many problems that I could never get it to work, unfortunately, so I fell back to just posting my bugs and hoping some developers would notice them. Kirk had a thread for bugs and RFEs at one point, where I posted some of the important ones.)
Here is another fairly bad one:
The dialog that pops up to show fields in the dataflow which have been orphaned, after some fields have been removed earlier in the dataflow, and which allows the user to rename them or delete them, has a very bad display problem:
It shows the box names before the field names, so if the box names are descriptive, the field names are far off-screen, and it is very difficult to get them to display on screen.
So the user starts off with all the items in the drop-down being indistinguishable, as they all start with the box names, and the important parts of the strings -- the actual field names -- are far off-screen at the right.
I don't recall if I actually ever posted this one; I may not have.
Okay, I'm not sure what to do next (S2K5 64-bit SP2). I migrated across a simple DTS package which picks data up from MS Access and ships it into a database table!
I switched the 64-bit runtime to false and, happy days, it shipped the information across.
I then decided I wanted to deploy the package, so I created a deployconfig.xml with all properties ticked against my two connections in the connection manager (the MS Access db and the S2K5 database).
After that I switched the deployment utility to true.
Then I fired the deployment manifest.
Then I created a job to schedule it; it fails with:
Executed as user: SEA-SRV-00009\SYSTEM. ...00.3042.00 for 64-bit Copyright (C) Microsoft Corp 1984-2005. All rights reserved. Started: 4:09:33 PM Error: 2007-09-10 16:09:38.70 Code: 0xC0202009 Source: Trackitnew Connection manager "TRACKIT" Description: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040154. An OLE DB record is available. Source: "Microsoft OLE DB Service Components" Hresult: 0x80040154 Description: "Class not registered". End Error Error: 2007-09-10 16:09:38.72 Code: 0xC020801C Source: Data Flow Task Task Collection [22] Description: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "TRACKIT" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed. End Error Error: 2007-09-10 16:09:38.72 Code: ... The package execution fa... The step failed.
So firstly I changed the surface area configuration to enable xp_cmdshell.
I then created another job, but ran T-SQL: xp_cmdshell 'dtexec /FILE "D:DTSTrackit.dtsx" /MAXCONCURRENT " -1 " /CHECKPOINTING OFF /REPORTING EW'
I ran the job and it says successful, but when I do a query on the table there is no data. Could anyone please point me in the right direction? Thanks
I have several sequence containers in one package that fire off Execute Package tasks. I would like each of the sequence containers to start at the same time when the job starts running. However, when I set them up to do that, I get an error that the variable cannot be read because it is locked. I have the variables set up as read-only, so I'm not sure why they are being locked. When I run the package with each sequence container firing off after the previous one ends, it runs fine.
I'm trying to keep track of the ETL process by inserting/updating a row in one table for each package that finishes in my ETL process during execution. So far, I created a Script Task that increments a variable (counter) by one and then opens a connection to my database and inserts/updates my table. What I want to see is Step 1/30, Step 2/30 and so on. Right now I can display Step 1, Step 2, but how can I get the overall number of tasks within a package?
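For reference, the Script Task currently does something like this (a VB.NET sketch; the variable names are placeholders -- in particular User::TotalSteps stands in for whatever would eventually supply the overall count -- and the actual INSERT/UPDATE is omitted):

' SSIS 2005 Script Task; User::StepCounter must be in ReadWriteVariables and
' User::TotalSteps (a placeholder) in ReadOnlyVariables.
Dim counter As Integer = CInt(Dts.Variables("User::StepCounter").Value) + 1
Dts.Variables("User::StepCounter").Value = counter

Dim total As Integer = CInt(Dts.Variables("User::TotalSteps").Value)
Dim progress As String = String.Format("Step {0}/{1}", counter, total)
' ... open a connection here and insert/update the tracking row with the progress string

Dts.TaskResult = Dts.Results.Success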
My apologies if this is a very basic question, but I am having a very difficult time finding the answer.
My very, very simple Data Flow task is PAINFULLY slow. (It took over an hour to transfer about 300,000 records.) I'm doing no transformations whatsoever. In fact, the only reason I'm using the Data Flow component here is for its error tracking capabilities.
Here's a brief description:
1) The source is an OLE DB source object that uses an OLE DB connection to access a SQL Server 2000 database.
2) The output from the source is dumped directly (no data transformations) into an OLE DB Destination object (which uses an OLE DB connection to access a view on a SQL Server 2005 database). Individual row errors are pushed to a separate logging table.
Based on the advice of an article I read, I removed the "OLE DB Destination" object and used the records from the OLE DB source as the input to a Row Count transformation. This still took a SIGNIFICANT amount of time. I'm guessing that my problem is with using an OLE DB Source component???? That seems really strange though... wouldn't it be optimized? What are my workaround options?
I am using a parameterized SELECT query, as displayed below, to fetch values from the source.
SELECT A.Account_GUID, M.Merchant_GUID, H.Household_GUID, B.BankAU_GUID, SR.SalesRep_GUID,
       E.Entitlement_GUID, I.Income_GUID, Exp.Expense_GUID, SP.Sales_Product_GUID, P.Product_cd,
       SUM(S.Sales) AS Monthly_gross_MC_VI_amt,
       SUM(S.Sales) - SUM(S.[Returns]) AS Monthly_net_MC_VI_amt,
       SUM(S.SaleTxns) AS Monthly_gross_MC_VI_tran_cnt,
       SUM(S.SaleTxns) - SUM(S.ReturnTxns) AS Monthly_net_MC_VI_tran_cnt
FROM Account AS A
LEFT OUTER JOIN dbKAIExtract.dbo.tblSales_STG AS S
       ON A.Account_No = S.AccountNo And S.BucketNo = ? And S.ProductCode in ('01','02')
LEFT OUTER JOIN Merchant AS M ON A.Account_No = M.Account_no
INNER JOIN SalesRep AS SR ON SR.Rep_SSN = isnull(A.rep_SSN,'000000000')
INNER JOIN Household AS H ON A.Account_No = H.Account_no
LEFT OUTER JOIN BankAU AS B ON A.Assigned_AU = B.AU_No
LEFT OUTER JOIN SalesProduct AS SP ON A.Account_No = SP.Account_no And SP.Reporting_Interval_Id = ?
LEFT OUTER JOIN Entitlement AS E ON E.Account_no = A.Account_No AND SP.Product_Cd = E.Entitlement_Card_Type
LEFT OUTER JOIN Income AS I ON I.Account_no = A.Account_No And I.Reporting_Interval_Id = ?
LEFT OUTER JOIN Expense AS Exp ON Exp.Account_no = A.Account_No And EXP.Reporting_Interval_Id = ?
LEFT OUTER JOIN Product AS P ON P.Product_cd = SP.Product_cd
WHERE (A.current_ind = 1) AND (SR.current_ind = 1)
GROUP BY A.Account_GUID, M.Merchant_GUID, H.Household_GUID, B.BankAU_GUID, SR.SalesRep_GUID,
       E.Entitlement_GUID, I.Income_GUID, Exp.Expense_GUID, SP.Sales_Product_GUID, P.Product_cd
My problem is that I am not able to assign any variables to the parameterized query. Can anybody guide me on how to assign the respective variables to the parameterized query? I have the above query as part of an OLE DB Source step within a Data Flow task.