Programmatically Iterating Tasks/components In The Data Flow Portion Of A Package.
Mar 6, 2007
HI All,
In several threads there has been discussion regarding adding connection managers to a package's data flow, etc. My challenge is that I have a large solution that contains many packages, and I need to change the connection manager linked to the data flow in all of the packages. When the solution was initially designed, data sources were used, and it has become a tedious maintenance issue to keep those in sync. We want to use a standard OLEDB connection manager, but adding a connection manager to each package and editing the corresponding data flow tasks in each package to use that new connection manager is a daunting task. I've coded a .Net module to access the packages, remove the old connection manager (data source) and add the new OLEDB data source. However, as I traverse the objects in the package hierarchy, when I come to the data flow object, the innerobject is not a dts object, but rather a _com object.. I can't seem to find any documentation/examples as to how to iterate the tasks within a data flow and change the connection manager. If you have any information, that would be quite helpful. If you reply with a code sample, if you would be so kind as to relate it to one of the sample packages provided with SSIS so I can run it, that would be great.
I have come across a situation where there 10 tasks. The second task on the flow is a script task which disables all further tasks based on a condition. I thought that the logic would be better if we force terminate the package successfully at this stage itself. How can this be done.
I have Data Flow task that contains 50 components.
My computer configuration: 1 GB RAM Microsoft Windows Server 2003
Periodicaly when i try to save package after making some changes Out of memory ... exceptions message box appears , and soon after this Not fatal error occurs ... message box shows . If i close solution and open it again all my 50 components disappears --instead I see clear list, and all my work losen.
Such "Not fatal errors" making hell out of job -- every time I need to change package i must add package to archive!!!
We're experiencing a problem where intermittently our SSIS packages will hang. There are no log errors or events in the event viewer. It will happen whether the package is executed from the SQL Job Agent or run from BIDs. When running from BIDs it appears to hang inside one of the data flows (several parallel pipes with sorts, merge joins etc...). It appears to hang in multiple pipes within the data flow component. The problem is reproducable, we just kill it and re-run, and it appears to hang in the same places.
Now here's the odd thing: as we simply open and close some of the components in the pipe line after the place it hangs, a subsequent run will go further in the pipeline before hanging. If we open and close all the components after the point it initially hung, the data flow will run fine, from there on out. When I say "open and close" I mean no changes are made, we simply double-click the component, like a merge join, then click 'close.'
To me this does not seem like a memory problem but likely something is wrong with the metadata, where opening a component and closing it somehow alters the metadata to "right it".
This seems to occur intermittently after we make modifications to the package. It's like if you make any mod, even unrelated to the data flow, you then have to go through and open and close every component in your package to ensure it will work. Again, no errors or warnings are fired.
Everything I've read says that custom data flow components are built by inheriting from the Microsoft.SqlServer.Dts.Pipeline.PipelineComponent class.
But the stock components such as the Derived Column data flow transformation must each be implemented by their own class. So how do I base my custom components on those classes? The documentation for the PipelineComponent class doesn't list any such subclasses.
I'm trying to keep track of the ETL process inserting/updating a row in one table for each package that finish in my ETL process when executing. So far, I created a Script task that increments by one a variable (counter) and then open a connection to my database an insert/update my table. What I want to see is Step 1/30, Step 2/30 and so on. Right know I can display Step 1, Step 2 but how can I get the overall number of tasks within a package?
I am trying to programmatically execute a package that contains an Execute SQL Task component bound to a variable for its "SqlStatementSource" property (via an expression). The variable is of type String and contains a simple value of "SELECT 1". The Execute SQL Task contains an expression that sets the SqlStatementSource property to the value of this variable.
The package runs fine when I execute it via dtexec or BIDS, but when I attempt to run it via the object model, I receive the following error message:
The result of the expression ""@[User::Sql]"" on property "SqlStatementSource" cannot be written to the property. The expression was evaluated, but cannot be set on the property.
I did a search on this forum and noticed quite a few threads about this same issue, but no explanation/solution. We have quite a few packages that have dynamically constructed SQL statements for Execute SQL Tasks, and they are all failing to run via the object model. Is there something that I am missing?
Dear All! My package has a Data Flow Task. In Data Flow Task, I use a Script Component and a OLE BD Destination to transform data from txt file to database. Within Data Flow Task, I want to call File System Task to move file to a folder or any Task of "Control Flow" Tab. So, Does SSIS support this task? Please show me if any Thanks
I am developing an SSIS solution where the Data Flow task extracts data from a source csv file, then performs several transformations on the source data and then starts inserting the cleansed data into several destination tables.
The Data Flow task is getting too large!
Question: Is there a way (best practice) for grouping components in the Data Flow - similar to the Container concept in the Control Flow?
I know this question sounds too luxurious, but I really loose the overall picture, when the Data Flow canvas gets too crowded.
I have a SSIS Package which I would like to modify using SSIS API. I need to put new component between some two existing data flow's components. During this process I need to disconnect two data flow's components using SSIS API. How can I do that?
When I copy over an SSIS package I have been developing from my laptop to my desktop with Windows File Sharing (shared folders) across a home network, the moved package fails to load properly. I can see the Control Flow tasks but not the Data Flow tasks - they simply disappear! I have update the Connection Managers to point to the new machine, and tested them (OK).
Its the same as this issue posted here
And Here
Creating an empty package and clicking to create a new data flow cause this error:
TITLE: Microsoft Visual Studio ------------------------------
The designer could not be initialized. Microsoft.DataTransformationServices.Design)
Seems the only way is to uninstall and reinstall Client Components.
How do I uninstall and reinstall Client Components? when I try to do that, SQL Server Setup tells me "None of the selected features can be installed or upgraded. Setup cannot procees since there is no effective change being made to the machine...."
I had selected "Workstation components, Books online and Development tools". This is already installed - so how I can reinstall it if I get this message above which does not let me proceed.
Is it possible to add a Fuzzy Grouping Transformation in a Data flow task by Programmatically ? If it possible, what is the C# or VB .net code for that ?
I have created around 10 seperate packages for our application data load. Now I am planning to create a master package (or a wrapper package) which will execute all the 10 packages (thru execute package task). Then I have a job which executes the master package at a given date and time.
Question : How can I enable / disable execution of each package within the master package depending upon a flag variable. The reason why I need this mechanism is if the flag = 0 then I don't want all 10 packages within master package to execute and if flag = 1 then master package execution should begin and subsequently execute all packages within that master package.
i got a foreach loop that has about 20 data flow tasks(same database connections but different extractions) but i notice that when i execute the project it only runs 4 data flow tasks at a time.
i know that there is an option for each data flow to set the "Engine Threads", but is there a way to set the thereads in a foreach loop or for the whole project so it will execute all data flow tasks in one go for each loop.
i have a for each loop and it has about 20 data flow tasks (simple data extractions). i notice when i run the package it only runs up to 4 data flow tasks at a time. others have to wait till one of the first 4 flows finishes.
i was wondering if there's a way to change the limit of how many data flow tasks can run at a time. is there a property some where ?
i know this will be stressfull to the server, but the server is well equiped with CPU power and memory, so performance will not be an issue.
I've been having an issue with the integration of a third-party DLL into a custom data flow component.
The company sent me a C# project that generates a simple console application. The project includes a class that calls their DLL with DllImport. The console application runs fine.
I created a stand-alone class using the C# class they sent to expose the methods of their DLL. In my custom component, I'm referencing this class to pass data to and from their DLL.
The first method from that stand-alone class that my component encounters simply gets their installation path from the registry and does not use DllImport. That path retrieval works fine. The next method calls a function that is declared with DllImport. Each time the call fails with "System.DllNotFoundException = {"Unable to load DLL AMZip.dll': Exception from HRESULT: 0xE06D7363"}".
I've copied this DLL to countless locations (e.g., the PipelineComponents directory, the project/solution bin directory) and included these paths in all manner of path variables.
What am I missing here? Their DLL is not strong named (does this matter since I'm using DllImport?), my stand-alone class is, and of course, the custom component itself is. I appreciate the help.
Hey all, is there any explanation why just opening a data flow task causes a VSS check out?
The issue with this is that I have multiple developers, and when one wants to look at a data flow task in a package already checked out, it generates multiple VS errors.
Not big on the priority list, (not as big as the modal dialog issue) but still a pain.
I am developing tools for automatic creation of data warehouse tables, cubes and SSIS packages. Generating the SSIS Data Flows works very well using the SSIS components for OLE DB Source, Derived Column, Lookup and OLE DB Destination.
However for some of the advanced functionality I need to use Script Component. I have managed to add it in the Data Flow with all inputs and outputs, but how do I populate it with my code? I've seen there is a component property called "SourceCode" and one called "BinaryCode". The "SourceCode" contains the code, but also some extra metadata.
Questions:
Do you know if there is any programmatic support to generate the Source Code property with the metadata necessary?
Do you know how to compile the Source Code and generate the property BinaryCode?
Has anyone done this? I can't find anything in the documentation that describes this. The closest I get is to the InnerObject property of the TaskHost class. There is an example of programming a bulk insert task. But I can't find anything on programmatically setting the column mappings (source to dest) of a simple data flow task. Any help is appreciated!
I need to parameterize some values in the data flow so that i can chnage the values directly in parameter file and re run the data flow for new value in the passed in the parameter. This can be easy for other who do not know about the flow of data flow task as to where to change the variable/parameter.
How can this be accomplished. I want the data flow task to refer to this file before it starts executing and pick the appropriate value from the file.
Or is their any better way to accompalish what i want to do here in SSIS.???
I have a requirement to read an encrypted file as a data source. I am not allowed to save an unencrypted text file version on disc at any time for any length of time, therefore I created a custom source component that reads an encrypted csv file, decrypts it, and then passes each row of data to the pipeline and ultimately to an ole data destination. Basically it is just a text file reader with an added class that adds functionality that decrypts the file before the component sets columns or reads rows.
The custom component, “Encrypted File Source”, has a custom property “encryptionkey” with the encryption required flag set to true (code below) and is declared as eligible to be set in the expressions.
IDTSCustomProperty100 EncryptionKey = ComponentMetaData.CustomPropertyCollection.New(); EncryptionKey.Name = "EncryptionKey"; EncryptionKey.Description = "Secure String key value to decrypt the file"; EncryptionKey.Value = string.Empty; EncryptionKey.ExpressionType = DTSCustomPropertyExpressionType.CPET_NOTIFY; EncryptionKey.EncryptionRequired = true;
I want to be able to set the password for the encrypted file in the SQL Agent job that executes the SSIS project. This means I have an environment with a variable, “DataPassword”, that is set to sensitive. It maps to a Project parameter in the SQL Agent job that is also set to sensitive. And I now I want to access that sensitive Project Password inside my data flow, specifically in the Encrypted File source task that I created and set my EncryptionKey to that Project Parameter.
The problem is that SSIS says.
"expression cannot be evaluated. ... The Expression will not be evaluated because it contains sensitive parameter value "$Project::DataFilePassord" . Verify that the expression is used properly and that it portects sensitive information" ((Microsoft.DataTransformationsServices.Controls) "<v:shapetype coordsize="21600,21600" filled="f" id="_x0000_t75" o:preferrelative="t" o:spt="75" path="m@4@5l@4@11@9@11@9@5xe" stroked="f">
[Code] ....
I am using SQL Server 2012, on a windows 7 box with VS2010 premium.
I have yet another question. How can i pass data b/w 2 data flow tasks? I'm trying to get some data from one sql server and then I want to pass this whole bunch of data to another data flow task which is going to get some data from second sql server. I would probably combine them using union all and then save that as a transaction in a third sql server?
Information regarding how to pass the data with some detailed discussion would be fine with me.
How do I retrieve the connections (connection managers) collections from Custom Data Flow destination? ComponentMetadata.RuntimeConnectionCollection is empty. I would like to be able to access all the connections defined in the package from the custom data flow task.
I came across code in which it was possible to access the Connections collection using the IDtsConnectionService for custom task (destination). The custom task has access to serviceProvider, whcih can be used to get access to the IDtsConnectionService interface but not the custom data flow task.
I would like to fetch the data flow component name while package is executing. Since system variable named [System::SourceName] only fetches name of the control flow tasks? Is there a way to capture them?
I had an idea that it would be nice to be able to extend the functionality of an existing SSIS component or task by inheriting from it. Perhaps in a similar way to how it is possible to extend user controls in .Net.
e.g. The rowcount is a useful component but it would be good to create a new component that inherits from it and then override the PostExecute() method to fire an Information event containing the number of rows. That's a very simple example but I think you get the gist.
Does anyone think that would be useful? Or even made possible?
I was also wondering whether it would be possible for the SSI team to make all the icons used within SSIS available as .ico files so that we could modify them for tasks/components that might do similar things.