I have a very simple SSIS package that is moving data from a DB2 database to a Teradata box. I've run it around 10 times, twice it pushed data over, the balance of the time, it executes with no error, but moves nothing over. In the "incomplete" runs, a command line box pops up for half a second, then the package ends.
Does anyone have ideas as to why this behavior is occurring?
I would like to know what happens when a very large reference data set for a lookup transform with full caching enabled is getting loaded during package execution and the computer memory runs out or is very low. Does SSIS a) give an out of memory error of some sort b) resort to a no caching or partial caching mode c) maintain the full caching mode but will switch to using the paging file(virtual memory).
I think it will resort to using the page file in which case the benefits of in memory lookups are lost and performance would suffer. If I cannot upgrade the memory or shrink the reference set somehow, i should switch that lookup task to use partial caching or no caching with an indexed lookup table. Would this make sense?
In the preview view in Visual Studio, I can see all of the items in my document map (over 100 labels). After deployment, only the first 30 items are displayed and I can not scroll down to see the missing items. Is this a known bug? is there any workaround?
I am trying to interpret some of the results I observe when trying to match similar records using a fuzzy lookup transform, but it's not entirely clear how the overall row similarity score is calculated. In particular, sometimes rows with lower individual column similarity scores will achieve a higher similarity and confidence score than a matching row with higher individual column scores.
The transform is configured with 6 text fields set to fuzzy mapping and a minimum similarity of 0, and 3 additional numeric fields with an exact mapping. It is set to return a maximum of 2 matches per lookup and to do an exhaustive search of the reference table.
For example, from the following matching pair of records Match 1 is picked over Match 2 even though it's individual scores are lower.
I have an OLE DB Source and i want to transform the data type fields of the table before i export the table in an OLE DB Destination. Is there a way to transform numeric value to float, and numeric to nvchar?
I am trying to read in a flat file, transform the fields and store into a destination database.
In DTS, this works using Transform Data Task Properties. I define the columns and then have a VB script on the Transformations tab that changes any bad data.
Is there a way to do this in SSIS that I can define the column transformations and re-use my VB scripts?
i have too many DTS packages to migrate to SSIS, and while examining a DTS package in BIDS (converted with the migration utility) i tried to edit the resulting migrated package, which opened the DTS interface with the two connection icons joined by the big fat arrow with a gear on it...not exactly what i had in mind, iow, it looks like SSIS on the outside, but its still DTS on the inside. So I stripped out a series of components from a more complex package hoping that simplifying it would reveal the contents of old DTS Transformations tab at least partially set up in a Derived Column transformation. Can i get there from here, or must i recreate every stinking definition in a derived column manually from the ground up? thanks very much for your help
I can't figure out how to put nested tables into the Data Mining Model Training Transform (SSIS). I can do a simple case table, but how do you get those nested tables with DM Training Transformation? Any ideas? Samples?
Hi JayH (or anyone). Another week...a new set of problems. I obviously need to learn .net syntax, but because of project deadlines in converting from DTS to SSIS it is hard for me to stop and do that. So, if someone could help me some easy syntax, I would really appreciate it.
In DTS, there was a VBScript that copied a set of flat files from one directory to an archive directory after modifying the file name. In SSIS, the directory and archive directory will be specified in the config file. So, I need a .net script that retrieves a file, renames it and copies it to a different directory.
I have a series of SSIS packages which populate 12 different databases. Which data source & target they use is controlled by values passed down from the SQL Agent Scheduler Job Step using "Set Values"
These values are passed to Variables which are used in the Expressions in the connection manager for the database to change the connection string and initial catalog.
e.g. REPLACE( @[User::ConString] , "Royalty", @[User::RoyDb] )
These jobs run successfully the majority of the time but each day I get a significant number of failures where one or more target or error trapping table cant be found. They are all using the same connection manager and most of the tables in the database get updated correctly but the job fails on trying to access one or two of the tables, with the following message in the On Error event:
-1071636248,0x,Opening a rowset for "[dbo].[RIGDUE_DAY]" failed. Check that the object exists in the database.
This happens both when I schedule the jobs in parallel with other jobs running the same packages & when I run the job by itself using the right click, start at step option. e.g.
I had one fail last night, I ran it by itself this morning, it failed, I ran it again, it succeeded. Nothing concerning the data it was transforming had changed.
I have applied the hotfix to service pack one concerning the paralel use of variables in a package as referred to in the following link.
I am having a very wierd issue regarding a DB2 sql query. I need to get data from Db2 and insert into our sql server database. Using data flow task, to get data I am using the data reader source. COnnection is ado.netodbc connection. THis sql query also has some comments in it.
The first wierd thing is...
1. On Development server, when I run this query manually, meaning using toad, winsql (connection to the db2 database), the query runs fine. Brings back approx 667 rows which is correct. ON the same server when I try to run this query, via a SSIS pkg, data flow task, using data reader source, gives me error on those comments that exist in that query. But if I run the same SSIS pkg on another server (Integration server). It runs fine. The same pkg also runs fine if I run it from my machine. SO What is different on my Dev server compared to the Integration server.
2. Say if I take those comments out from the sql query, then try to run the ssis pkg. The query is stuck at the first record and goes in an infinite loop mode. though my query is not a procedure, it is just a sql statement. But this ssis pkg with the query runs absolutly fine on the other server. I aslo tried using the other types of connection and ole db source but still the same problem on the Dev server.
What do I need to look for that is so different on the dev server compare to the INT server. I also checked the version on both these server for Visual Studio 2005(by going to About Microsoft Visual Studio), it is the same.
This is what I have on both the servers.... Microsoft SQL Server Integration Services Designer Version 9.00.3042.00
I created a very simple SSIS package (it just updates a single row in a table). When I execute the package from the command line (using dtexec), it takes about a second to finish, as expected. But when I execute it using dtexec via xp_cmdshell, it takes about 91 seconds. When I use a SQL job to execute the package as an operating system type, it takes 91 seconds. Using a SQL job to execute it as a SSIS package takes again 91 seconds. It appears that something is causing a delay of about 90 seconds before the package actually gets executed. I tried changing the SSIS service account, but that didn't change anything. Why is executing the package through SS2005 different than executing it directly from the command prompt?
I have an SSIS package, (actually many packages at this point because I can duplicate this problem each and every time) with an FTP Task and three connection managers (one for FTP, one for a file share, one OLEDB).
The package in its simplest form, grabs a file from an FTP site and transfers it to the file share. When there are no files found on the FTP site, the FTP task fails as expected.
Here is where it gets strange...I turned on a package configuration, with all variables / properties saved to a SQL table; all except for the OLEDB connection which I am leaving hardcoded in the package for the time being.
The problems I am seeing as soon as I activate the package configuration are twofold: The first occurs with the property "FileUsageType" defined to the file share connection. I have it set to "existing folder" in the SSIS tool, the SQL config table also has a row for this item defined with a value of 2. I have even tried manually changing this value in the SQL config table and the result is always the same error when I run the package:
Error: 0xC002F314 at FTP Task, FTP Task: File usage type of connection "Z:" should be "FolderExists" for operation "Receive".
If I remove this entry totally from the SQL config table, and leave that property hardcoded in the package then I am able to continue execution in the package and that particular error goes away. But then I come to weird problem #2: I stated above that when there are no files in the FTP directory to retrieve, then the FTP task fails, as I would expect. However, as soon as I activate the package configuration, even when there are no files at the FTP site, the FTP task doesnt fail anymore, it marks itself as successful instead. This is wreaking havoc on my precedence constraints that follow the FTP task naturally. As soon as I deactivate the package configuration, the FTP task behaves normally again.
The problems only seem to happen with config packages stored in SQL. I tested this also with XML file config instead and everything seemed to behave as it should. Unfortunately, I need to store these configs in SQL, not XML.
Does anyone have any ideas on what I need to look at to fix these weird issues?
Requirement: I have a simple package with one dataflow task. In that I need to read from a sql table and for every row in that table loop through n times and generate new output rows based on certain conditions (which are best evaluated in custom script as they are rather complex). Hence, if I have 100 rows in the table as my input, I may end up with 100*n rows as output.
My Design: To implement this I have used an OLE DB Source which outputs to a Script Transform (ST). In the ST I intend to loop through in custom code and generate new rows using the .AddRow feature when I need new rows. This ST then feeds into another OLE DB Destination which writes the data to the table. Simple! I am using the default buffer settings. All I have tweaked is the Synchronous... property on the script transform (otherwise I do not get to the Output0Buffer within the script!).
Problem: I wish to do as much as possible in parallel. So I would expect the OLE DB Source to provide more than one row at a time to the script transform and that should process more than one input row simultaneously. It seems the script componenet is serializing input, so it seems to take one row at a time from the OLE DB source, loop through and process in the script transform).
AM I RIGHT IN THINKING THAT THE SCRIPT TRANSFORM IS EXECUTING THE INPUT IN A SEQUENTIAL MANNER? CAN I PARALLELISE THIS? If so, how?
I am trying to write a ssis surrogate key data transform, my problem is I can't find an example how to add a column to the incoming columns and add some data to it. If anyone has a sample, can you please post it. I found a script option that works but I would like an actual transform.
I have an SSIS with several data flows I need to do some complex data evaluations so I have used a script as transform in two of the DFT's. If I run these separately everything works great and there are no problems what so ever. If I run them together I notices I was getting an error on the second one. I discovered that this seems to be some kind of namespace problem since both Scripts were using Input_0 buffer. So I changed the name of the second one and retested.
Well I no longer get the error and in fact it seems to run through the entire SSIS just fine. However when I look closer I notice that the second Script task just does not seem to do anything at all. The script task does a lot of evaluation of the incoming data and then does some calculations depending on the value in the service code. however when it runs through this in the second script task all of the define output rows are just empty.
I have gone through and made sure that all input and output buffers are unique names thinking this was a similar problem but no luck. I even changes all column and variable names to unique with no luck. Again If I run them separately everything work fine it is only when I run the entire package that this problem occurs.
I am getting inconsistent results when BULK INSERTING data from a tab-delimited text file. As part of my testing, I run the same code on the same file again and again, and I get different results every time! I get this on SQL 2005 and SQL 2012 R2.We have an application that imports data from a spreadsheet. The sheet contains section headers with account numbers and detail rows with transactions by date:
AAAA.1234 /* (account number)*/ 1/1/2015 $150 First Transaction 1/3/2015 $24.233 Second Transaction
BBBB.5678 1/1/2015 $350 Third Transaction 1/3/2015 $24.233 Fourth Transaction
My Import program saves this spreadsheet at tab-delimited text, then I use BULK INSERT to bring the data into a generic table full of varchar(255) fields. There are about 90,000 rows in each day's data; after the BULK INSERT about half of them are removed for various reasons. Next I add a RowID column to the table with the IDENTITY (1,1) property. This gives my raw data unique row numbers.
I then run a routine that converts and copies those records into another holding table that's a copy of the final destination table. That routine parses though the data, assigning the account number in the section header to each detail row. It ends up looking like this:
AAAA.1234 1/1/2015 $150 First Purchase AAAA.1234 1/3/2015 $24.233 Second Purchase BBBB.5678 1/1/2015 $350 Third Purchase BBBB.5678 1/3/2015 $24.233 Fourth Purchase
My technique: I use a cursor to get the starting RowID for each Account Number: I then use the upper and lower RowIDs to do an INSERT into the final table. The query looks like this:
SELECT RowID, SUBSTRING(RowHeader, 6,4) + '.UBC1' AS AccountNumber FROM GenericTable WHERE RowHeader LIKE '____.____%'
Results look like this:
But every time I run the routine, I get different numbers! my results are not accurate. I get inconsistent results EVERY TIME.Here is my code, with table, field and account names changed for business confidentiality.This is a high profile project at my company;
TRUNCATE TABLE GenericImportTable; ALTER TABLE GenericImportTable DROP COLUMN RowID; BULK INSERT GenericImportTable FROM 'SERVERGeneralAppnameDataFile.2015.05.04.tab.txt' WITH (FIELDTERMINATOR = ' ', ROWTERMINATOR = ' ', FIRSTROW = 6)
I have a ssis package that has multiple large lookups without memory restriction. When running the package manually from SSMS on the same server it runs on when running automatically under the job agent, the package errors out when the server memory gets depleted by the loading of the large lookup reference data. One of the messages I get is "An out-of-memory condition prevented the creation of the buffer object. "
Anyway, the package runs successfully when it runs automatically under the job agent.
I was curious as to why the above happens. Is that a bug or is the run time behavior different under these 2 environments by design.
I have a Pivot Transform in SSIS (2005) working perfectly, EXCEPT for that the first column of the output (the date) repeats for each of the following columns, which are themselves falling into the correct column, but not on the same line for a particular date as the others. Snipet of result from Data Viewer is:
I work in the healthcare area, and am handling the survey data ETL's. There are around 8 different survey areas and based on information received from them for the visit they reference, I want to pull in more info from our invoicing database. My idea is this:
1.) Pull in the flat file to an ODBC staging table 2.) Cache all invoice records that fall between the MIN(Date of Service) and MAX(Date of Service) from the staging table. 3.) First lookup the information needed on patientID, providerID, date of service, and billing location. 4.) For the surveys that didn't match on those 4 columns, try looking up based on patientID, date of service, and billing location (since I could be 99% sure this would still return the record I need). 5.) For the remaining surveys, lookup based just on patientID and date of service. These records will be flagged for manual review because clearly, if a patient has multiple appointments in the same day, this will be prone to error.
However, in trying to use only 3 of the columns in the lookup, I get the error saying basically that I need to utilize all 4. Is there a way around this, or is there an entirely different way I should be approaching this? The reason I thought cache transform was the answer is because I will need to run a different package for each lookup, as the data and logic between each survey will vary, but the invoice data "pool" will stay the same regardless.
In my current project i have a requirement to assign value of an aggregate transform to a variable. But i need to accomplish it without using a script task.
Hi friends, Can somebody tell me how to do this- How can we Analyze existing code used to transform data into the Operations Data Warehouse, and make changes to correspond to upcoming changes in the SAP data sources. Thanks
We are using lookup transformation in SSIS 2012. The lookup transformation queries a table with two date columns. When we hover the mouse over the two columns in the 'columns' tab of the lookup transformation editor, the two columns show as DT_WSTR instead of DT_DBDATE. This causes the SSIS package to fail due to data type mismatch.A similar abandoned thread is available at: URL....
On my MS SQL Server 2000, I am trying to create a generic way to load tables into my datawarehouse.
I have as input to the process a large number of table definition(s) stored individually as files on my server. And, ascii delimited data files in various locations but mostly accessible via NFS mounts.
I created two DTS package in MSSQL2K that in theory represents what I want to do:
package1 ... invoke package2 with global variables to load a system of related tables
package2 ... check for a trigger file ... set the "Execute SQL Task" statement to my first file ... run the "Execute SQL Task" which drop/add's a table ... set a "Connection" to a data source file that I want to use ... run the transformation and, with that my package starts to fall apart ... set the "Execute SQL Task" statement to the next file, and ...... goback and execute it
I can't figure out how to set the table in the transformation section to the table I want to use. And, I assume next to have the transformations links between the source and new table relinked.
The source files contain in the first row the column names as found in the tables I just created.
Okay all I have a problem. I have two list of adresses and phonenumbers. I do not have control over the contents of the lists The onlyunique field between the two is the phone number. I need to be able toinner join the two lists on phone number.This would normally be straigt forward but the problem is that they areformated different and one of them does't even have a control on theformating.*Phone numbers are US phone numbers only.The one list (table) that does have a control uses this formatAAA3334444where AAA is the area code333 is the 3 digit prefix4444 is the four digit suffixthe second list (table) does not have any standardized formating andcan be filled with extraneous spaces, parentheses and dashes not tomention the leading 1 in some instances.I thought that I could do some kind of regular expression to do acomparison but I havn't as yet found a good resource to tell me how todo it or if it is even possible. Or maybe break up the one I know hasa standardized format into something like this:'%AAA%333%4444%' and doing my comparison that way. however It is veryimportant that only those list items in both list that are truly thesame place be listed.suggestions and solutions are apreciated