Using Expressions In Custom Data Flow Transformation
Sep 18, 2006
Hi,
I'm creating a custom data flow transformation in c#.
I would like to use expressions within this component in the same way as in the derived column component: specifying the expression as a custom property of an output column, then evaluating this expression for each row of the buffer and using this evaluated expression to populate my output column values.
So I've added an custom expression on my output column, and set its expression type to CPET_NOTIFY
But in the ProcessInput method I don't manage to get the evaluated expressions, when I use exp.Value I get my expression definition and not its evaluation.
Is there a way to get these evaluated expressions ?
I've created my own custom data flow transformation task (using C#) that will parse a fullname and output the various name parts. In the ProvideComponentProperties method, I create 5 output columns (prefix, first, middle, last, and suffix). In the ProcessInput method, I parse the input and add the name parts to the buffer. The bad thing is that I€™m making an assumption on the position of the Full Name input column within the buffer.
I would like the €œuser€? to be able to map their "full name" input column to a known Full Name column so I don€™t have to make any assumptions. This is the first SSIS task I€™ve tried to create and I haven€™t been able to find very many examples online.
I created a custom transform that has a custom interface and is a wizard that uses a web service. It creates custom properties and output columns on the fly. I set the dialog result to Ok and close at the end of the steps. The transform then has the custom fields and output columns I created in the wizard. I've verified this by right clicking on the transform and going to the advanced editor. If I then immediately run the package, the custom fields don't exist in the CustomPropertiesCollection. If I close the package and reopen it, the properties now are gone. If I then go through the wizard again, thus recreating the properties, they stay and don't disappear. The quickest way to get a working transform is to add it to my data flow then save, close and reopen the package and then go through the wizard. Just saving after I add the transform does not help.
Does anyone know what might be causing this very strange problem?
I've built a simple custom data flow transformation component following the Hands On Lab (http://www.microsoft.com/downloads/details.aspx?familyid=1C2A7DD2-3EC3-4641-9407-A5A337BEA7D3&displaylang=en) and the Books Online (ms-help://MS.MSDNQTR.v80.en/MS.MSDN.v80/MS.SQL.v2005.en/dtsref9/html/adc70cc5-f79c-4bb6-8387-f0f2cdfaad11.htm and ms-help://MS.MSDNQTR.v80.en/MS.MSDN.v80/MS.SQL.v2005.en/dtsref9/html/b694d21f-9919-402d-9192-666c6449b0b7.htm).
All it is supposed to do is create an output column and set its value to the result of calling a web service method (the transformation is synchronous). Everything seems fine, but when I run the data flow task that contains it, it doesn't generate any output. The Visual Studio debugger displays it as yellow, with 1,385 rows going into it, but the data viewer attached to its output is empty. The output metadata looks just like I expect: all of my input columns plus the new column, correctly typed. No validation or run-time warnings or errors are reported.
I'll include the entire C# file below, which only overrrides the ProvideComponentProperties, Validate, PreExecute, ProcessInput, and PostExecute methods of the parent PipelineComponent class.
Since this is effectively a specialization of the DerivedColumn transformation, could I inherit from the class that implements the DC component instead of PipelineComponent? How do I even find out what that class is?
Thanks! Here's the code: using System; // using System.Collections.Generic; // using System.Text;
using Microsoft.SqlServer.Dts.Pipeline; using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using Microsoft.SqlServer.Dts.Runtime.Wrapper;
namespace CustomComponents { [DtsPipelineComponent(DisplayName = "GID", ComponentType = ComponentType.Transform)] public class GidComponent : PipelineComponent { /// /// Column indexes for faster processing. /// private int[] inputColumnBufferIndex; private int outputColumnBufferIndex;
/// /// The GID web service. /// private GID.WS_PDF.PDFProcessService gidService = null;
/// /// Called to initialize/reset the component. /// public override void ProvideComponentProperties() { base.ProvideComponentProperties(); // Remove any existing metadata: base.RemoveAllInputsOutputsAndCustomProperties(); // Create the input and the output: IDTSInput90 input = this.ComponentMetaData.InputCollection.New(); input.Name = "Input"; IDTSOutput90 output = this.ComponentMetaData.OutputCollection.New(); output.Name = "Output"; // The output is synchronous with the input: output.SynchronousInputID = input.ID; // Create the GID output column (16-character Unicode string): IDTSOutputColumn90 outputColumn = output.OutputColumnCollection.New(); outputColumn.Name = "GID"; outputColumn.SetDataTypeProperties(Microsoft.SqlServer.Dts.Runtime.Wrapper.DataType.DT_WSTR, 16, 0, 0, 0); }
/// /// Only 1 input and 1 output with 1 column is supported. /// /// public override DTSValidationStatus Validate() { bool cancel = false; DTSValidationStatus status = base.Validate(); if (status == DTSValidationStatus.VS_ISVALID) { // The input and output are created above and should be exactly as specified // (unless someone manually edited the persisted XML): if (ComponentMetaData.InputCollection.Count != 1) { this.ComponentMetaData.FireError(0, ComponentMetaData.Name, "Invalid metadata: component accepts 1 Input.", string.Empty, 0, out cancel); status = DTSValidationStatus.VS_ISCORRUPT; } else if (ComponentMetaData.OutputCollection.Count != 1) { this.ComponentMetaData.FireError(0, ComponentMetaData.Name, "Invalid metadata: component provides 1 Output.", string.Empty, 0, out cancel); status = DTSValidationStatus.VS_ISCORRUPT; } else if (ComponentMetaData.OutputCollection[0].OutputColumnCollection.Count != 1) { this.ComponentMetaData.FireError(0, ComponentMetaData.Name, "Invalid metadata: component Output must be 1 column.", string.Empty, 0, out cancel); status = DTSValidationStatus.VS_ISCORRUPT; } // And the output column should be a Unicode string: else if ((ComponentMetaData.OutputCollection[0].OutputColumnCollection[0].DataType != DataType.DT_WSTR) || (ComponentMetaData.OutputCollection[0].OutputColumnCollection[0].Length != 16)) { ComponentMetaData.FireError(0, ComponentMetaData.Name, "Invalid metadata: component Output column data type must be (DT_WSTR, 16).", string.Empty, 0, out cancel); status = DTSValidationStatus.VS_ISBROKEN; } } return status; }
/// /// Called before executing, to cache the buffer column indexes. /// public override void PreExecute() { base.PreExecute(); // Get the index of each input column in the buffer: IDTSInput90 input = ComponentMetaData.InputCollection[0]; inputColumnBufferIndex = new int[input.InputColumnCollection.Count]; for (int col = 0; col < input.InputColumnCollection.Count; col++) { inputColumnBufferIndex[col] = BufferManager.FindColumnByLineageID(input.Buffer, input.InputColumnCollection[col].LineageID); } // Get the index of the output column in the buffer: IDTSOutput90 output = ComponentMetaData.OutputCollection[0]; outputColumnBufferIndex = BufferManager.FindColumnByLineageID(input.Buffer, output.OutputColumnCollection[0].LineageID); // Get the GID web service: gidService = new GID.WS_PDF.PDFProcessService(); }
/// /// Called to process the buffer: /// Get a new GID and save it in the output column. /// /// /// public override void ProcessInput(int inputID, PipelineBuffer buffer) { if (! buffer.EndOfRowset) { try { while (buffer.NextRow()) { // Set the output column value to a new GID: buffer.SetString(outputColumnBufferIndex, gidService.getGID()); } } catch (System.Exception ex) { bool cancel = false; ComponentMetaData.FireError(0, ComponentMetaData.Name, ex.Message, string.Empty, 0, out cancel); throw new Exception("Could not process input buffer."); } } }
/// /// Called after executing, to clean up. /// public override void PostExecute() { base.PostExecute(); // Resign from the GID service: gidService = null; } } }
Hi, I'm trying to implement an incremental data pull (Oracle to SQL) based on Andy's blog: http://sqlblog.com/blogs/andy_leonard/archive/2007/07/09/ssis-design-pattern-incremental-loads.aspx
My development machine is decent: 1.86 GHz, Intel core 2 CPU, 3 GB of RAM. However it seems the data flow task gets hung whenever I test the package against the ~6 million row source, as can be seen from these screenshots. I have no memory limitations on the lookup transformation. After the rows have been cached nothing happens. Memory for the dtsdebug process hovers around 1.8 GB and it uses 1-6 percent of CPU resources continuously. I am not using fast load to insert new records into my sql target table. (I am right clicking Sequence Container 3 and executing this container NOT the entire package in the screenshots)
The same package works fine against a similar test table with 150k rows. http://i248.photobucket.com/albums/gg168/boston_sql92/7.jpg http://i248.photobucket.com/albums/gg168/boston_sql92/8.jpg
The weird thing is it only takes 24 minutes for a full refresh of the entire source table from Oracle to the SQL target table. Any hints,advice would be appreciated.
I am trying to use a CTE in an OLE DB Command data flow transformation object. However, when I enter the cte and corresponding query in the SqlCommand field of the OLE DB command editor dialog, I get a syntax error. Can CTE's be used data flow objects? I have been able to use them in an Execute SQL Control Flow Item, but not in any data flow item.
I have one data flow control. Source is SQL server and destination is flat file destination. I have one derived column placed in between these two. This functionality works fine. I would like to sum one column data and count total no. of columns and put it in global variable. How can I achieve it?
In Excel they are showing Sale as -ve quantity and purchase as +ve quantity.
The database has quantity always as +ve figure and a separate column "isPurchase" set to true or false depending on whether purchase or sale.
So I need Derived Column to return a bolean (or int) depending whether quantity is positive or negative. I tried each of the following in the Expression but all of them were invalid expressions.
CASE WHEN [TotalQuantity] > 0 THEN 0 ELSE 1 END
If [TotalQuantity] > 0 THEN 1 ELSE 0 END
IIF ([TotalQuantity] > 0, 1,0)
Can anyone help me with correct syntax, or correct Data Flow Transformation if Derived column is wrong.
I'm trying to edit the Expressions of a Data Flow task. This seems to happen when I rename some of the Data Flow components but not always. The error I get is:
Element "[ADO Net Source].[SqlCommand]" does not exist in the collection "Properties"
However, if you look at the XML, this property does exist. So I'm not sure why this should occur.
I'm using SSIS 2008 R2 with Visual Studio 2008 V 9.0.30729.4462 QFE.
<component id="1" name="ADO Net Source" componentClassID="{2E42D45B-F83C-400F-8D77-61DDE6A7DF29}" description="Extracts data from a relational database by using a .NET provider." localeId="-1" usesDispositions="true" validateExternalMetadata="True" version="4" pipelineVersion="0" contactInfo="Extracts data from a relational database by using a .NET provider.;
I need to know how to use my private function - created as a scalar-valued-function in SQL Server 2005 - in script component (here a transformation is used) in a data flow task to transform a two-digit-month into a tree-sign-month:
I have Variable , data source and conditional transformation which checks the count(*) if the count == 0 then I connect an script component and change variable to false(initial it is True) and write into a log file...
Then I check that variable on predence constarint at workflow if variable==True then success. BUT Whenever I run the package my dataflow gets green even the condition does not meet like count==0 . So my variable's value is "False". Actually if the condition doesnt meet then my script shouldnt work. Am I missing something???
The logic I am trying to recreate via SSIS is the following SQL statement:
insert into db3.dbo.targettable1 -- Target database table (SiteC, Objecte, Attrib1)
select distinct ?, ?, from ? -- Source database table join dbo.targettable2 c1 -- Target database table on c1.Alias = ? and c1.CSetID = ? and c1.FacID = (select f.PFacID from dbo.Fac f where f.FacID = ?) where not exists (select * from dbo.targettable2 c -- Target database table where c.Alias = ? and c.FacID = ? and c.CSetID = ?)
I have an OLE DB Source that consists of an expression to approximate the following portion of the Above Select statement:
Select ?, from ? -- Source database table and
The package has 2 global variables User:CSetID and User::FacID whose scope is global to the package and whose values are set within a Foreach Loop Container outside of the Data Flow Task
I was trying to reference the 2 global variables within the Looup Transformation to recreate the following portion of the SQL statement.but encounter errors:
join dbo.targettable2 c1 -- Target database table on c1.Alias = ? and c1.CSetID = ? and c1.FacID = (select f.PFacID from dbo.Fac f where f.FacID = ?)
In the Advanced Editor window of Lookup Transaction
select * from (select * from [dbo].[targettable2 ]) as refTable where [refTable].[Alias] = ? and [refTable].[FacID] = ? and [refTable].[CSetID] = ?
Is there away to reference global variables in a Lookup Transformation that are set outside a Data Task Flow?
I have developed some packages to load data into "Fact" tables in the data warehouse. Some packages are OK, other ones not. What is the problem?: some packages load fact tables with lots of "Lookup - Data Flow Transformation" into the "data flow task" (lookup against dimension tables) but they are very very slow, too much slow to be choosen as a solution.
Do you have any other solutions to avoid using "Lookup - Data Flow Transformation"? Any other solution (SSIS, TSQL and so on....) is welcome to speed up the Fact table loading process.
I have a situation where I have multiple source tables, I need to populate the destination tables that have the same schema as the source ones. I dont want to do the repetitive task of creating a Dataflow for each source-destination load.
I want to create one custom dataflow component and loop through all my source tables and provide destination tables dynamically.
Is there a way to do that? Any custom data flow component out there??
I am creating a custom transformation component, and a custom user interface for that component.
In my custom UI, I want to show the custom properties, and allow users to edit these properties similar to how the advanced editor shows the properties.
I know in my UI I need to create a "Property Grid". In the properties of this grid, I can select the object I want to display data for, however, the only objects that appear are the objects that I have already created within this UI, and not the actual component object with the custom properties.
How do I go about getting the properties for my transformation component listed in this property grid?
I'm trying to create a custom data flow destination, and it has a custom property that needs to get value from variable(similar to the FileNameVariable property of Raw File Destination), how can I do that?
I've been having an issue with the integration of a third-party DLL into a custom data flow component.
The company sent me a C# project that generates a simple console application. The project includes a class that calls their DLL with DllImport. The console application runs fine.
I created a stand-alone class using the C# class they sent to expose the methods of their DLL. In my custom component, I'm referencing this class to pass data to and from their DLL.
The first method from that stand-alone class that my component encounters simply gets their installation path from the registry and does not use DllImport. That path retrieval works fine. The next method calls a function that is declared with DllImport. Each time the call fails with "System.DllNotFoundException = {"Unable to load DLL AMZip.dll': Exception from HRESULT: 0xE06D7363"}".
I've copied this DLL to countless locations (e.g., the PipelineComponents directory, the project/solution bin directory) and included these paths in all manner of path variables.
What am I missing here? Their DLL is not strong named (does this matter since I'm using DllImport?), my stand-alone class is, and of course, the custom component itself is. I appreciate the help.
I'm having my first go at developing a destination adapter which will send data to an update Web Service.
I've got some rather big gaps in my understanding. I've been following the various samples I've found on the net and have validated my mapping and picked up all the available column names and datatypes which are appearing in the Input and Output Properties tab of the Advanced Editor but I only have a tab for "Input Columns" and not "Column Mappings".
Which method defines the availble columns for the user to map?
Let me know if I haven't given enough information.
I am using the unpivot transformation, but I can't figure out how to use an expression in the Pivot Key Value.
The denormalized table I want to unpivot has columns like Sunday_Qty, Monday_Qty, Tuesday_Qty, etc. Just before the unpivot component, I inserted a derived column component that adds fields like DateSun, DateMon, DateTue, etc. that resolves to values like 01/07/2007, 01/08/2007, etc.
So for the various rows in the Unpivot Transformation Editor, I entered DateSun, DateMon, DateTue, etc. for the Pivot Key Value, and "EntryDate" for the pivot key value column name.
The data pipeline gets unpivoted correctly, but the rows have the literal values "DateSun", "DateMon", etc. in the EntryDate column.
How do I tell SSIS to use the DateSun column instead of the string "DateSun"?
I implemented a custom source adaptor. I want to be able to associate custom properties with each of the output columns. I want them to be passed downsteam. The idea is to be able to retrieve these information in a downstream custom transformations of ours and process the various columns accordingly. How do I go about doing this?I noticed that the IDTSCustomProperty90 seems to have a local scope only.
I developed a custom data flow task in .net 2.0 using Visual Studio 2005. I installed it into GAC using GACUTIL and also copied it into the pipeline directory. This task runs absolutely fine when I run it on my local machine both in BIDS and using the script in windows 2000 environment. However, when I deployed this package into a windows 2003 server, the package fails at the custom task level. I checked the GAC in windowsassembly directory and it is present. Also I copied the file into the PipeLine directory and verified that I copied it into the correct pipeline directory by checking the registry. The version of the assembly is still Debug. I looked up documentation in MSDN but there is very little information about the errors I am seeing.
The error I get is pasted below, Can somebody please help me as I am currently stuck and running out of ideas to fix this problem.
Code: 0xC0047067
Source: DFT Raw File DFT Raw File (DTS.Pipeline)
Description: The "component "_" (2546)" failed to cache the component metadata object and returned error code 0x80131600.
Code: 0xC004706C
Source: DFT Raw File DFT Raw File (DTS.Pipeline)
Description: Component "component "_" (2546)" could not be created and returned error code 0xC0047067. Make sure that the component is registered correctly.
I am creating a customer data flow component for SSIS for use in a package. I've got some custom properties that I am exposing using the supplied advanced editor (no custom property editor here).
Some of my properties are enumerated types, and I have deciphered how to get those properties to show as dropdown lists of their respective enumerations. (For those of you who may be looking as hard as I did as to how to accomplish this, see the end of this post.)
I also have a few properties which request SSIS package variable names - such as an file name variable. However, I can't figure out how to tell the advanced editor that the property is looking for an SSIS variable, so that it can show a dropdown list of package variables, much like virtually any other Microsoft supplied Data Flow component can.
Is there a Type Converter I could specify for those custom properties? Is there another way to instruct SSIS that my custom property is expecting a variable? Or do I need to code a custom UI for editing my Data Flow Task?
To create a dropdown list of values for a custom property that represents an enum, do the following:
1. Create your enum definition, such as "public enum ThisIsMyEnum { one, two }"
2. Create a new class that inherits from TypeConverter, such as "public class MyEnumConverter : TypeConverter"
3. Override "CanConvertFrom", and return true if "sourceType == typeof(string)"
4. Override "CanConvertTo", and return true if "destinationType == typeof(string)"
5. Override "ConvertFrom", and return the enum value (such as "one" or "two" in my example) that corresponds to the string passed in the parameter "value"
6. Override "ConvertTo", and return a string that corresponds to the enum value passed in the parameter "value"
7. Override "GetStandardValuesSupported" and return true
8. Override "GetStandarValuesExclusive" and return true to indicate that ONLY the enum values should be accepted
9. Override "GetStandardValues", and return a new StandardValuesCollection constructed with Enum.GetValues() of your enum, such as "return new StandardValuesCollection(Enum.GetValues(typeof(ThisIsMyEnum)));"
10. Just above your "public enum" declaration, add a "TypeConverter" attribute to link your type converter to your enum, such as "[TypeConverter(typeof(MyEnumConverter))]"
11. In "ProvideComponentProperties", after you've created your custom property like this: "IDTSCustomProperty90 propEnum = ComponentMetaData.CustomPropertyCollection.New()", add another line to specify the TypeConverter property of the property to the full assembly name of your type converter, like so: "propEnum.TypeConverter = typeof(MyEnumConverter).AssemblyQualifiedName;"
How do I retrieve the connections (connection managers) collections from Custom Data Flow destination? ComponentMetadata.RuntimeConnectionCollection is empty. I would like to be able to access all the connections defined in the package from the custom data flow task.
I came across code in which it was possible to access the Connections collection using the IDtsConnectionService for custom task (destination). The custom task has access to serviceProvider, whcih can be used to get access to the IDtsConnectionService interface but not the custom data flow task.
I'm having trouble with a Script Component in a data flow task. I have code that does a SqlCommand.ExecuteReader() call that throws an 'Object reference not set to an instance of an object' error. Thing is, the SqlCommand.ExecuteReader() call is already inside a Try..Catch block. Essentially I have two questions regarding this error:
a) Why doesn't my Catch block catch the exception? b) I've made sure that my SqlCommand object and the SqlConnection property that it uses are properly instantiated, and the query is correct. Any ideas on why it is throwing that exception?
I would like to add additional string functions and other types of functions to the expression builder in SQL Server Integration Services. Right now the list of functions is relatively limited to such things as FINDSTRING, RIGHT, LEN, etc.
Everything I've read says that custom data flow components are built by inheriting from the Microsoft.SqlServer.Dts.Pipeline.PipelineComponent class.
But the stock components such as the Derived Column data flow transformation must each be implemented by their own class. So how do I base my custom components on those classes? The documentation for the PipelineComponent class doesn't list any such subclasses.
I have a table A with a KEY column and SSN column. KEY = 12 digits ( first 3 digits are Department Id , and last 9 digits are SSN)
I have a table B with SSN column only.
both KEY and SSN columns are Primary keys so duplicate entries must be Avoided.
Table A is intended to be popluated weekly from TXT file (SSIS package RUN). I want to achieve somethign like this..!
P-Code sample:
for each Row in TXT file if TXTfile.KEY = TableA.KEY then skip and Read/Go to next Row in TXT file else INSERT TXTfile.KEY into TABLEA.KEY SSN_Var = EXTRACT the SSN part (SSNpart.READ) if SSN_VAR.Exists In TableB.AnyRow then skip else Insert into TableB End If. End If End For Loop.
----------------------------------------------------------------- Using SSIS controls, what will be best flow and logic to achieve this.....? any sample scripting code ????
I've been trying to figure this out on my own for pretty much all of today, and part of last week. I've downloaded samples, searched this forum, blogs, etc. So I figured I would post, since it's the end of the day, and I'm not much further along.
I'm working on a custom transformation component, whose main function is to use SQL encryption/decryption to encrypt/decrypt data from the input columns, into the output columns. The component needs two strings, a key name and a certificate name, as well as the connection manager it should use to connect to SQL which will do the encryption/decryption.
Here's where I'm stuck:
1) How can I provide the key/certificate names via properties? What I'm expecting/looking for is a way to add these two properties at the component-level, which would show up under the "Custom Properties" section of the properties pane (currently, this only has one property, "UserComponentTypeName"). These key/certificate values will be used for all input columns.
2) How do I access the connection managers from within the component? What is the best way to go about using a connection manager from within my component to connect to SQL and perform the encryption/decryption? In a custom task, this was fairly simple, but it seems that same concept won't work on a transformation component.
3) Is there a better way to go about accomplishing this (column encryption via SQL from within SSIS)? Am I going about this all wrong?
As I said, I've searched for direction, but there seems to be next to nothing in the regards of a good reference for creating custom transformation components. I've looked at two MS samples, but can't seem to make any sense out of them.
Is there any tutorial to learn how custom transformation component works? maybe a blog, pdf or something... Specifically, i need to learn how to generate an output column composed from 3 input columns. The problem is i dont know how to set the column value... anyone have some sample code?
I've created a custom data flow tranformation and it isn't showing up in the Tool Box Items to be added under the Data Flow Items tab (right click on tool box, 'Choose Items...', then clicked Data Flow Items).
I have done the following:
signed the assembly, added to GAC, copied the dll to C:Program FilesMicrosoft SQL Server90DTSPipelineComponents.
It worked previously when I was just starting out, however now I cannot see it. What would cause it to not show up? Everything compiles fine. How would I determine how to fix it so that it shows up?