I am currently working on a project where I want to load information from a web service into a table in SQL Server. I wrote a web service that returns a dataset. Next, I went into SSIS and added a Web Service Task. I had the Web Service Task call the method and output the result to an .xml file. I then wrote a data flow to load the XML file, using the XML Source, into the database, using the OLE DB Destination. I set the connection manager on the XML Source to use an inline schema, because the dataset returned from the web service has the schema information included. Next, I picked the columns I wanted and proceeded to pick the table and mappings I wanted in the OLE DB Destination. However, when I execute the package, the .xml file gets generated correctly but none of the rows get added to the database. How can everything be set up "correctly", yet no rows get added? Is there a specific format that an XML source must be in for SSIS to use it correctly? I would think a dataset generated from a .NET web service should just work in SSIS. Any help would be greatly appreciated. Thanks.
I have a situation where I need to update the destination server with the changes made on the source server.
I have MS Access as my source server, with XYZ.mdb as my source database. My destination server is MS SQL Server. During the first install I create a dynamic DTS package to move data from XYZ.mdb to the destination server. Now suppose I make some changes in the source database (XYZ.mdb), say I update one row in a table and also insert one row into that table. How do I import only the updated and inserted rows to the destination server using DTS?
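For what it's worth, a minimal sketch of the usual approach, assuming each table has a key column (ID here) and that the .mdb is reachable from SQL Server via the Jet OLE DB provider; all object names are hypothetical:

<code>
-- Insert rows present in the Access source but missing from the destination.
-- A matching UPDATE (joining on ID) handles changed rows; without a key or a
-- last-modified column there is no reliable way to detect what changed.
INSERT INTO dbo.MyTable (ID, Col1)
SELECT a.ID, a.Col1
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
                'C:\XYZ.mdb'; 'admin'; '', MyTable) AS a
WHERE NOT EXISTS (SELECT * FROM dbo.MyTable AS d WHERE d.ID = a.ID);
</code>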
I would like to be able to call a DTS package to which I pass as parameters the name and path of a DBF file, and whose function is to create a table in my SQL Server identical in structure and contents to the DBF. The problem is that I never know the structure of the DBF in advance, yet in DTS Designer I am forced to specify the name and structure of the target up front and then map the columns.
PS: I know how to pass parameters to a DTS package via the DTSRUN utility (/A etc.) in order to supply it the name and path of the DBF.
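Not a pure DTS answer, but one hedged sketch: SELECT ... INTO with OPENROWSET creates the SQL Server table with whatever structure the DBF exposes, so the layout need not be known in advance. The target table name and the dBASE version string are assumptions to adjust:

<code>
SELECT *
INTO   dbo.ImportedDbf                           -- hypothetical target table
FROM   OPENROWSET('Microsoft.Jet.OLEDB.4.0',
                  'dBASE IV;Database=C:\data\',  -- folder containing the DBF
                  'SELECT * FROM myfile');       -- DBF file name, no extension
</code>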
I need to use DTS to create a dynamic destination filename based on the value of a field. I tried creating a DTS package using the wizard, which works fine if I want a static destination filename. I then tried creating a Dynamic Properties Task, selecting Connection2/OLEDB Properties/Data Source and using a SQL statement to return a value for the name of the file. It doesn't seem to run the query every time the DTS package is run: the first time I run it, it names the file correctly, but after that it keeps naming the file with the old value from the SQL statement. Any idea what I need to do to fix this?
I want two different sets of rows in a single output. For example, the query gets records from the same tables, but the first condition is a date range of 60 days with value = '1', and the second condition is a date range of 180 days with value = '2'.
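A sketch of the usual UNION ALL shape for this, with hypothetical table and column names:

<code>
SELECT col1, col2
FROM   dbo.MyTable
WHERE  value = '1'
  AND  recdate >= DATEADD(day, -60, GETDATE())
UNION ALL
SELECT col1, col2
FROM   dbo.MyTable
WHERE  value = '2'
  AND  recdate >= DATEADD(day, -180, GETDATE());
</code>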
There are several events. Each event has several different sessions (stored in EventOptionGroups), and each session has a certain number of options (stored in Options).
A user can sign up for an event, and their information is stored in EventRegistration. They can choose an option for each session in the event. For each option they choose, a new row is added to RegistrantOptions.
For each row in EventRegistration, I want to output the user's information, and then the option they chose for each session in the event. Like this:
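The shape described (one row per registrant, plus the chosen option for each session as its own column) is a classic cross-tab; a sketch under assumed column names, with one MAX(CASE ...) per session:

<code>
SELECT r.RegistrationID,
       r.UserName,          -- whatever user columns you need
       MAX(CASE WHEN og.GroupName = 'Session 1' THEN o.OptionName END) AS Session1,
       MAX(CASE WHEN og.GroupName = 'Session 2' THEN o.OptionName END) AS Session2
FROM   EventRegistration  AS r
JOIN   RegistrantOptions  AS ro ON ro.RegistrationID = r.RegistrationID
JOIN   Options            AS o  ON o.OptionID = ro.OptionID
JOIN   EventOptionGroups  AS og ON og.GroupID = o.GroupID
GROUP BY r.RegistrationID, r.UserName;
</code>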
I have a problem with a Merge Join producing no output (when it should produce 1890 rows). My Data Flow Task has 4 OLE DB Sources, 3 Multicasts, and 1 OLE DB Destination. The problem occurs near the end of my data flow, where two Multicasts create two parallel flows of data (see Level 1 below). I have two Merge Joins, each of which joins one leg from one Multicast with a leg from the other (see Level 2 below). Then the two remaining legs use a Merge to produce my destination output (see Level 3 below).
I am experiencing my problem with the Merge Join (inputs A2, B2) --> (output C2). The Merge Join producing output C1 correctly outputs 1890 rows, but C2 outputs 0 rows. Both Merge Joins are configured identically. The data is identically sorted before entering the problematic Merge Join, and a Data Viewer (grid) verified that the data enters it correctly. The Merge Join (inputs A2, B2) --> (output C2) has 667 rows as input A2 and 1890 rows as input B2 (using an inner join, just like the other Merge Join), but C2 baffles me with 0 rows of output (when it too should have 1890). I receive no output errors, and the execution completes showing all green.
I read about mysterious behavior with Merge Joins and have attempted modifying my EngineThreads property to values between 2 and 10, with no luck. Any help/ideas would be appreciated.
I'm using a simple data flow to extract rows from a table (using a SQL query) and put them in a flat file. If the query returns no rows, I don't want a file to be created. Right now it creates a file with the headers (since I do want the headers if there is data).
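One common workaround, sketched under assumptions: an Execute SQL Task runs a count into a package variable (say User::RowCnt), and an expression on the precedence constraint into the data flow (@RowCnt > 0) keeps the file from being created when the query is empty. The table name and filter below are placeholders for the extract query's own:

<code>
SELECT COUNT(*) AS RowCnt
FROM   dbo.MySourceTable   -- hypothetical: reuse the extract query's FROM/WHERE
WHERE  status = 'NEW';
</code>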
I've got a Derived Column component as part of a data flow. On it I've set the error output for my columns to Redirect Row in all cases. I've set a Data Viewer on the error output and then ran the package.

There are several rows which I'm expecting to fail, about 3 of them. These fail, but there are also another 697 which seem to have no problem. So I fixed one of the problem columns in the source data and re-ran the package. I had only updated one row in the source, so that row no longer appeared in the error output; but neither did several hundred of the other rows.

Is it possible that the error output has been tripped for that one row, but for some reason it sends several hundred more rows with it? The IDs on these additional rows follow on from the erroneous row, and when I fix that row, the rows following it no longer appear in the error output.
I have a stored procedure that I use to return Purchase Orders from our PO system. It returns the data rows for POs that match the criteria passed in (including the page to show, etc.), plus it returns two output params, number of rows and number of pages. Using Query Analyzer I can confirm the query works exactly as we want. I cannot, however, seem to get the data out to our ASP.NET app. Here is a function that I use in one of my classes:

<code>
Function fnListPOsByCoordinatorIDPaged(ByVal strCoordinatorID As String, ByVal intPOStatusID As Int16, _
        ByVal intUserTypeID As Int16, ByVal intArchived As Int16, _
        ByVal intPageNum As Int32, ByVal intPerPage As Int32, _
        ByVal strConn As String) As SqlClient.SqlDataReader

    Dim dr As SqlClient.SqlDataReader

    SqlConnection1.ConnectionString = strConn

    prListPOByCoordinatorPaged.Parameters("@CoordinatorID").Value = strCoordinatorID
    prListPOByCoordinatorPaged.Parameters("@POStatusID").Value = intPOStatusID
    prListPOByCoordinatorPaged.Parameters("@UserTypeID").Value = intUserTypeID
    prListPOByCoordinatorPaged.Parameters("@Archived").Value = intArchived
    prListPOByCoordinatorPaged.Parameters("@PageNum").Value = intPageNum
    prListPOByCoordinatorPaged.Parameters("@PerPage").Value = intPerPage

    SqlConnection1.Open()
    dr = prListPOByCoordinatorPaged.ExecuteReader(CommandBehavior.CloseConnection)

    Me.Pages = prListPOByCoordinatorPaged.Parameters("@Pages").Value
    Me.Rows = prListPOByCoordinatorPaged.Parameters("@Rows").Value

    If Me.Rows / intPerPage > Me.Pages Then
        Me.Pages = Me.Pages + 1
    End If

    Return dr

    prListPOByCoordinatorPaged.Dispose()
    SqlConnection1.Close()
    SqlConnection1.Dispose()

End Function
</code>

It does not crash, and it returns my data reader with the correct records. Unfortunately, my property values are returned as 0. They should have values. Anyone know how to do this? Thanks.
How can I summarise the data in this table to a single row of output per site (there are 2 records for every SiteID)? I am based in the UK, so the data copied from SQL is in the US format; I can convert this to a UK date without any issues.
CREATE TABLE [dbo].[MRMReadings](
    [SiteIncomeID] [int] IDENTITY(1,1) NOT NULL,
    [SiteID] [nchar](5) NOT NULL,
[Code] ....
Is it possible to return the data in the following format, merging 2 lines of data into one output row?
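A hedged sketch, since the column names beyond SiteID are assumptions (Reading and ReadDate here): ROW_NUMBER splits the two rows per SiteID apart, and MAX(CASE ...) folds them back into one output row:

<code>
SELECT SiteID,
       MAX(CASE WHEN rn = 1 THEN Reading END) AS Reading1,
       MAX(CASE WHEN rn = 2 THEN Reading END) AS Reading2
FROM  (SELECT SiteID, Reading,
              ROW_NUMBER() OVER (PARTITION BY SiteID ORDER BY ReadDate) AS rn
       FROM   dbo.MRMReadings) AS x
GROUP BY SiteID;
</code>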
For both OLE DB destinations (and hopefully for the forthcoming ADO.NET destination adapter), it would be useful to have the following two output columns: NativeErrorCode and NativeErrorMessage.

For SQL Server, this would let you distinguish between the multiple errors that all roll up to the SSIS error "the value violated the integrity constraints of the column", which is far too generic for proper decision making (via Conditional Split).

For example, as with SQL Server replication, you should be able to ignore duplicate key errors (error number 2627), which indicate the record already existed, but error out on nullability constraint errors (number 515), where the record could not be inserted. With a native error code you could make this decision; as it stands, the SSIS error for the two different native errors is precisely the same.

There is a known and accepted race condition between a Lookup transform and a subsequent OLE DB destination insert attempt (assuming a non-transacted container and a common target table), which is why 2627 can safely be ignored in certain instances, while a 515 should not be.
Agent       State  Exposure  Insured Name
Rogers Inc  MA     100,000   John Smith
SAN Group   RI     200,000   Jim Morrison
SAN Group   RI     100,000   Jimi Hendrix
123 Agency  MA     300,000   Mickey Mouse
Rogers Inc  MA     50,000    Mike Greenwell
I want to be able to read the file and create new Excel files for each Agent listed. So for example, the above file would create 3 separate files, since there are 3 different Agents listed. Each Agent file would contain the same information from the original file. The name of the file would be something like AgentName.xls, so the SAN Group file would have this:
Agent       State  Exposure  Insured Name
Rogers Inc  MA     100,000   John Smith
SAN Group   RI     200,000   Jim Morrison
SAN Group   RI     100,000   Jimi Hendrix
I need to transform a FoxPro table into a SQL Server table, merging all rows where every column value is the same except one. For that one column with differing values, I want the values merged as well, into a comma- or space-delimited string. The question is whether SSIS is a good candidate for this kind of data munging; I would also be interested in knowing as many ways of doing this as possible. Granted, I could produce in 5 minutes a FoxPro script which will do this as a pre-processor action before SSIS starts.
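SSIS can do it (a Script Component after a sort, for instance), but plain T-SQL on the landed table is arguably simpler. A sketch assuming key columns k1, k2 and one varying column v, all names hypothetical; FOR XML PATH('') builds the comma-delimited string (SQL Server 2005+):

<code>
SELECT t.k1, t.k2,
       STUFF((SELECT ',' + t2.v
              FROM   dbo.ImportedFoxPro AS t2
              WHERE  t2.k1 = t.k1 AND t2.k2 = t.k2
              FOR XML PATH('')), 1, 1, '') AS merged_v
FROM   dbo.ImportedFoxPro AS t
GROUP BY t.k1, t.k2;
</code>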
I've written an asynchronous Script Component and I have created the output buffer. 500,000 rows go into the input, but only 1,500 to 2,000 rows come out before I get an SSIS "Object reference not set to an instance of an object" error. The error occurs at the AddRow method of the output buffer (that's how I know where it died). Why does this happen? Is there a way to sync up the output buffer with the input buffer?
I am using a Foreach Loop over an ADO recordset that contains the names of the tables I wish to write to an OLE DB destination. The table names are retrieved by an Execute SQL Task into an object variable. Within the Foreach Loop, for each table name, I use a DataReader Source over an ADO.NET connection to pull data from that table, via an expression built into a variable, i.e. "select * from " + @[User::table_name]. This works fine for the first table, for which the mappings were set up in the SSIS design environment, and the data is retrieved. I then use a variable and set the data access mode of the OLE DB Destination to "Table name or view name variable". This too saves the data fine for the first table in the loop. But when the next table name is retrieved in the Foreach Loop, the DataReader fails, because it still expects the metadata mappings of the first table, which was used for the mapping at design time; e.g., FIN_CLASS is a column from the first table in the loop.
Error: 0xC0202005 at Data Flow Task, DataReader Source [7181]: Column "FIN_CLASS" cannot be found at the datasource.
I have set the properties that I thought (in my feeble mind) were supposed to avoid this behavior: for the DataReader, I set ValidateExternalMetadata to false, and for the Data Flow Task (the container of the DataReader), I set DelayValidation to true. According to the documentation, these settings are supposed to cause the DataReader source's metadata to be evaluated at runtime (not design time), so that the column metadata is dynamic and the subsequent OLE DB Destination can use the "Table name or view name variable" data access mode.
If I cannot get this to work, I have 2 options: use OPENQUERY via dynamic T-SQL statements, or create 30 separate flows in SSIS, one for each table. I'm not looking forward to that one.
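For completeness, a sketch of the first fallback: since the source and destination tables share names, one Execute SQL Task with dynamic T-SQL inside the loop can replace the metadata-bound data flow. @table_name stands in for the loop variable; the linked server and database names are hypothetical:

<code>
DECLARE @table_name sysname, @sql nvarchar(4000);
SET @table_name = ?;   -- mapped from the Foreach Loop variable
SET @sql = N'INSERT INTO dbo.' + QUOTENAME(@table_name)
         + N' SELECT * FROM SRCSERVER.srcdb.dbo.' + QUOTENAME(@table_name);
EXEC sp_executesql @sql;
</code>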
I have access to a stored procedure that was written previously for a process that uses the procedure's output to feed a BCP operation in a .bat file, which builds a flat text file for use in a different system.
To continue with the setup, here is the stored procedure in question:

CREATE PROCEDURE [dbo].[HE_GetStks] AS
select top 15 Rating, rank, coname, PriceClose, pricechg, DailyVol, symbol
from (select f.rating, f.rank, s.coname,
             cast(f.priceclose as decimal(10,2)) as PriceClose,
             cast(f.pricechg as decimal(10,2)) as pricechg,
             f.DailyVol, f.symbol
      from dailydata f, snames s
      where f.tendcash = 0 and f.status = 1 and f.typ = 1 and f.osid = s.osid) tt
order by rating desc, rank desc
GO
The code in the calling .bat file is:

REM *************************
REM BCP .WRK FILE
REM *************************
bcp "exec dailydb.[dbo].[HE_GetStks]" queryout "d:\TABLES\INPUT\HE_GetStks.WRK" -S(local) -c -U<uname> -P<upass>
This works just peachy in the process for which it was designed, but I need to use the same stored procedure to grab the same data and store it in a historical table in the database. I know I could duplicate the code in a separate stored procedure that does the inserting into my table, but I would like to avoid that and reuse this stored procedure, in case the select statement is changed at some point in the future.
Am I missing something obvious in how to utilize this stored procedure from inside an insert statement in order to use the data it outputs? I know I cannot use an EXECUTE HE_GetStks as a subquery in my insert statement, but that is, in essence, what I am trying to accomplish.
I just wanted to bounce the issue off y'all before I go to The Boss and ask him to change the procedure to SET the data into a database table directly (change the select in the proc to an INSERT into a local table), then have the external .bat file use a GET procedure that just selects from the local table. This is the method most of our similar jobs use when faced with this type of "intercept" task.
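For what it's worth, INSERT ... EXEC is allowed even though EXEC as a subquery isn't, so the existing procedure can be reused as-is; a sketch, assuming a history table whose columns match the procedure's output:

<code>
INSERT INTO dbo.HE_GetStksHist   -- hypothetical history table
      (Rating, rank, coname, PriceClose, pricechg, DailyVol, symbol)
EXEC dbo.HE_GetStks;
</code>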
I created a package that seems to work fine with a small amount of data. When I run the package with more data (as in production), however, the Merge Join output is limited to 9963 rows, no matter how I change the number of input rows.
Situation as follows.
The package has 2 OLE DB Sources, in which SQL-statements have been defined in order to retrieve the data.
The flow of source 1 is: retrieving source data -> trimming (non-key) columns -> sorting on the key-columns.
The flow of source 2 is: retrieving source data -> deriving 2 new columns -> aggregating the data to the level of source 1 -> sorting on the key columns.
Then both flows are merged and other steps are performed.
If I test with just a couple of rows, it works fine. But when I change the WHERE clause in the data source retrieval so that the number of rows is, for instance, 15,000 or 150,000, the number of rows after the Merge Join is 9963.
When I run the package in debug mode, the step is colored green; nevertheless, an error is displayed:
Error: 0xC0047022 at Data Flow Task, DTS.Pipeline: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "Merge Join" (4703) failed with error code 0xC0047020. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
To be honest, a few more error messages appear, but they don't seem related to this issue. The package stops running after some 6000 rows have been written to the destination.
Is there a way to build a select statement that will output related rows with different column data per row? I want to return something like:
rowtype | ID  | value
A       | 123 | alpha
B       | 123 | beta
C       | 123 | delta
A       | 124 | some val
B       | 124 | some val 2
C       | 124 | some val 3
etc...
where for each ID, I have 3 rows associated with it, each with a different corresponding value.

I'm thinking that I will have to build a temp table/cursor that gets all the ID data and then loops through it to insert each rowtype's data into another temp table.
i.e. each ID iteration will do something like:

insert into #someTempTable (rowtype, ID, value) values ('A', 123, 'alpha')
insert into #someTempTable (rowtype, ID, value) values ('B', 123, 'beta')
insert into #someTempTable (rowtype, ID, value) values ('C', 123, 'delta')
etc...
After my loop, I will just do a select * from #someTempTable
Is there a better, more elegant way than using two temp tables? I am using MSSQL 2005.
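If the three values per ID live in columns of one source table (names below are hypothetical), a single UNION ALL avoids both temp tables and the cursor:

<code>
SELECT 'A' AS rowtype, ID, alpha_col AS value FROM dbo.MySource
UNION ALL
SELECT 'B', ID, beta_col  FROM dbo.MySource
UNION ALL
SELECT 'C', ID, delta_col FROM dbo.MySource
ORDER BY ID, rowtype;
</code>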
We have several hundred very simple ETL SSIS 2K8 package files (*.dtsx).
I'd like to be able to interrogate them to determine source and destination fields.
There's no great need to map source to dest or to extract data types.
So far, the most promising candidate is to load them using OPENROWSET into an XML column in a SQL Server table. No problem there, but querying it using OPENXML has me stumped.
The package files will change a couple of times per year, so the process needs to be repeatable with minimal manual intervention.
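One hedged sketch, using the xml type's nodes() method rather than OPENXML; it assumes the packages already sit in a table dbo.PackageXml(PackageName, PackageBody xml), and the element names are from the SSIS 2008 pipeline layout, so verify them against an actual .dtsx before relying on this:

<code>
SELECT p.PackageName,
       c.comp.value('@name', 'nvarchar(128)') AS ComponentName,
       x.col.value('@name',  'nvarchar(128)') AS ColumnName
FROM   dbo.PackageXml AS p
CROSS APPLY p.PackageBody.nodes('//component')        AS c(comp)
CROSS APPLY c.comp.nodes('.//externalMetadataColumn') AS x(col);
</code>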
I am looking for a way to leave a Data Flow Task destination table name as-is, and have SSIS auto-create the table if it doesn't exist already.
I searched on this in the forums, but based on the questions it's difficult to know whether it has been answered or not.
Details:
I am writing some SSIS packages that need to be executable on another server. Many of the Data Flow Tasks copy data (such as from a Fuzzy Grouping transformation, and lots of other stuff) into a new table. But the other server will not have these tables set up for the first run.
My current solution is to check information_schema.tables and drop the table if it exists. But then the Data Flow Task will not work (because the table does not exist). So I script a CREATE TABLE statement to a new window, based on the existing table I use in my dev environment. This is a hack and I want to find a better method.
It is quite possible (although unlikely) that the source columns could be changed in the future, or some query used to pull the data might be modified. If this happens, I would need to change the CREATE TABLE Execute SQL Task. I want my package to accommodate this without having to modify it.
When I use the Import/Export Wizard, I can select a table name from the drop-down list OR type in a new name. When I type in a new name, it assumes I want to create the table. NOW, is there a way to mimic this in BI Development Studio? Yep, I saved the wizard version of the SSIS package, and all it does is run a CREATE TABLE statement first.
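Short of saving wizard output, one hedged alternative: an Execute SQL Task ahead of the Data Flow Task that creates the table from the source query itself, so later source changes don't require editing a hand-written CREATE TABLE. Object names below are hypothetical, and it assumes the destination can see the source (or a representative query):

<code>
IF OBJECT_ID(N'dbo.MyDestTable', N'U') IS NULL
    SELECT TOP (0) *
    INTO   dbo.MyDestTable
    FROM  (SELECT col1, col2 FROM dbo.MySourceTable) AS src;
</code>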
I have a stored procedure that returns a result set AND an output parameter; in pseudocode:

myspGetPoll
    @pollID int,
    @totalvoters int output

select questionID, question from [myPoll] where pollID = @pollID
@totalvoters = (select count(usercode) from [myPoll] where pollID = @pollID)

1. In my code-behind I'd like to read both the rows (questionID and question) as well as the total (totalvoters). How could I do so?

2. What would be the signature of my function so that I can retrieve BOTH a result set AND a single value? E.g.:

private function getPollResults(byval pollID as integer, byref totalvoters as integer) as dataset
    while reader.read
        dataset.addrow <read from result>
    end while
    totalvoters = <read from result>
end function

Thanks!
Hello All,

I am trying to create a DTS package. I have two tables, tbl_A and tbl_B, with similar data/rows but no primary keys. tbl_A is the master. I would like this package to query tbl_A and tbl_B and find:

1) all rows in tbl_A that are different in tbl_B,
2) all rows in tbl_A that are not present in tbl_B, and
3) all rows in tbl_B that are not present in tbl_A,

and then just show those rows. Can this be done with a simple UNION? Perhaps this could produce a temp table that can be dropped once the DTS package exits successfully.

The 2nd part, after all the above rows are retrieved, is that I would like to add an additional column to the retrieved data called STATUS, which has 3 possible values (letters) at the end of each row:

M (modified) means that row exists in tbl_B but has 1 or more different columns
A (add) means this row exists in tbl_A but not in tbl_B
D (delete) means this row exists in tbl_B but not in tbl_A

I'm hoping this DTS package would output a nice comma-separated TXT file with only:

1) rows from tbl_A that are different in tbl_B (STATUS M)
2) rows from tbl_A that are not present in tbl_B (STATUS A)
3) rows from tbl_B that are not present in tbl_A (STATUS D)

Can a DTS package in MS SQL be used to perform all of the above tasks? I would very much appreciate any help or any advice. Thanks in advance :-)
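DTS can do it, e.g. via an Execute SQL Task feeding a text-file destination. Without a primary key you will need to pick a column (or column list) that identifies a row; a sketch assuming col1 plays that role and col2 stands in for the compared columns:

<code>
SELECT a.col1, a.col2, 'M' AS STATUS
FROM   tbl_A AS a JOIN tbl_B AS b ON b.col1 = a.col1
WHERE  a.col2 <> b.col2
UNION ALL
SELECT a.col1, a.col2, 'A'
FROM   tbl_A AS a
WHERE  NOT EXISTS (SELECT * FROM tbl_B AS b WHERE b.col1 = a.col1)
UNION ALL
SELECT b.col1, b.col2, 'D'
FROM   tbl_B AS b
WHERE  NOT EXISTS (SELECT * FROM tbl_A AS a WHERE a.col1 = b.col1);
</code>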
I am using SSIS in SQL Server 2005 Enterprise Edition. I have two OLE DB data sources from two disparate databases (IBM DB2 and Microsoft SQL Server), some columns from each of which are to be included in the merged output results. I have noted the various requirements in the forum postings with regard to sorting the OLE DB sources and marking the output source columns as sorted, as well as the requirement that the join fields in the two sources be close/exact matches. Yet when I run this in VS, while the work area reflects the expected number of rows being input into the Merge Join transformation, no count is reflected as output from that transformation into the final destination table. Specifically, my two data sources (IBM DB2 and MS SQL) are configured as follows:
The IBM DB2 source contains an SQL statement that uses CAST operations to create the result columns, and an ORDER BY clause to ensure that the output is sorted by the desired two columns. The OLE DB source's IsSorted property is set to true; in the Output Columns folder, the column definitions for "key_source_dtsy" and "key_source_dtrt" have their SortKeyPosition properties set to 1 and 2, respectively. Those fields are both defined as data type DT_STR, with lengths of 4 and 2, respectively. Below is the path metadata from the Data Flow Path Editor for the path from this source:
IBM DB2 source"Name" "Data Type" "Precision" "Scale" "Length" "Code Page" "Sort Key Position" "Comparison Flags" "Source Component""ID_CODE" "DT_STR" "0" "0" "10" "1252" "0" "" "Source F0005 User Defined Codes""CODE_DESCR_1" "DT_STR" "0" "0" "30" "1252" "0" "" "Source F0005 User Defined Codes""CODE_DESCR_2" "DT_STR" "0" "0" "30" "1252" "0" "" "Source F0005 User Defined Codes""key_source_dtsy" "DT_STR" "0" "0" "4" "1252" "1" "" "Source F0005 User Defined Codes""key_source_dtrt" "DT_STR" "0" "0" "2" "1252" "2" "" "Source F0005
User Defined Codes:
The MS SQL source contains an SQL statement that takes the columns as they are in the MS SQL table (no CAST operations needed); it also uses an ORDER BY clause to ensure the output is sorted by the join columns. The OLE DB source's IsSorted property is set to true; in the Output Columns folder, the columns "key_source_dtsy" and "key_source_dtrt" have their SortKeyPosition properties set to 1 and 2, respectively. Those fields are both defined as data type DT_STR, with lengths of 4 and 2, respectively. Below is the path metadata from the Data Flow Path Editor for the path from this source:
MS SQL source"Name" "Data Type" "Precision" "Scale" "Length" "Code Page" "Sort Key Position" "Comparison Flags" "Source Component""id_code_name" "DT_I2" "0" "0" "0" "0" "0" "" "Source CodeName in db dwVdFY""key_source_dtsy" "DT_STR" "0" "0" "4" "1252" "1" "" "Source CodeName in db dwVdFY""key_source_dtrt" "DT_STR" "0" "0" "2" "1252" "2" "" "Source CodeName in db dwVdFY"
The Merge Join transformation specifies an INNER JOIN using the columns named "key_source_dtsy" and "key_source_dtrt" from the respective data sources. I know there are alternative ways of accomplishing my intent (Lookup, porting the MS SQL table to IBM DB2 so the join can occur in the SELECT statement, etc.); however, I'd like to use this functionality and assume that it should work.
Inside some T-SQL programmable object (a stored procedure, or a query in Management Studio), I have a parameter containing the name of a stored procedure plus the required arguments for that procedure (for example, it's between the brackets []).
I can't change those SPs (and I don't have a nice output parameter to catch; I would need to change the SP definitions for that). All these SPs 'return' a 'select' of 1 record of the same datatype, i.e. NVARCHAR. Unfortunately there is no output parameter (it would have been so easy otherwise). So I am working on something like this, but I can't find anything that works.
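One hedged sketch that works when each procedure returns a single one-column NVARCHAR result set: INSERT ... EXEC over the dynamic string captures the 'select' into a table variable. @cmd stands in for your parameter; the procedure name and argument inside it are hypothetical:

<code>
DECLARE @cmd nvarchar(4000);
DECLARE @result TABLE (val nvarchar(4000));

SET @cmd = N'EXEC [dbo].[SomeProc] @SomeArg = 42';  -- comes from your parameter

INSERT INTO @result (val)
EXEC (@cmd);

SELECT val FROM @result;
</code>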
I have a Lookup task to determine whether source data should be updated in or inserted into the customer table. After the Lookup task, the error output pipeline redirects to insert the new data into the table, and the output pipeline updates the customer table. But these two tasks process at the same time, which causes the process to stall and never end.
The job is similar to what the Slowly Changing Dimension transformation does, but it won't update the table at the same time.
I have a Lookup Transformation that matches the natural key of a dimension member and returns the dimension key for that member (surrogate key pipeline stuff).
I am using an OLE DB Command as the Error flow of the Lookup Transformation to insert an "Inferred Member" (new row) into a dimension table if the Lookup fails.
The OLE DB Command calls a stored procedure (dbo.InsertNewDimensionMember) that inserts the new member and returns the key of the new member (using scope_identity) as an output.
What is the syntax in the SQL Command line of the OLE DB Command Transformation to set the output of the stored procedure as an Output Column?
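A sketch of what usually goes in the SqlCommand property, with hypothetical parameter names; each ? becomes a parameter on the Column Mappings tab, and the one marked OUTPUT can be mapped to the pipeline column that should receive the new key:

<code>
EXEC dbo.InsertNewDimensionMember @NaturalKey = ?, @NewDimensionKey = ? OUTPUT
</code>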
I know that I can 1) add a second Lookup with "Enable memory restriction" on (no caching) in the Success data flow after the OLE DB Command, 2) find the newly inserted member, and 3) Union both Lookup results together, but this is a large dimension table (several million rows) and searching for the newly inserted dimension member seems excessive, especially since I have the ID I want returned as output from the stored procedure that inserted it.
Thanks in advance for any assistance you can provide.