I am busy designing a ETL solution and have a question about how to design the packages.
We have over 30 source systems for different customers. We are building a WH that will combine all this data for analysis. The main issue is that these systems are always at different versions of the software. When a patch is released, it usually goes to one or two for a Beta process before it moves to the rest. These patches can affect the DB design, and we would want to be able to extract any new data as it comes available from the systems that have it.
Solution 1 - Package Versions
The idea was to create the SSIS packages with a version number in their name. For each change, you would create a new version. The Batch control application that is being developed will store which system needs to use which version of the package.
Solution 2 - Multiple paths within a package
This was to create a single package for each table, with a conditional split as the first task. The batch system will still provide which path the package needs to take with different Data Flow tasks containing the different column mappings.
Both have pros and cons, but was wondering if anyone has experience with this type of setup and which way worked best, or if there are any other options.
Many Thanks
Michael
I am in the process of designing a package for populating a Dimension table for my new data warehouse. I would like to discuss with you my proposed solution in the hope that some of you will be able to advise me on any mistakes that I have made so far.
I have 3 source tables which are used to populate my Dimension table (I will restrict each source table to a few columns to keep this simple).
Please note that I haven't included the start and end dates to keep this simple, but in my real solution, the Dimension table has many more columns.
The Meetings and Events tables contains hundreds of thousands of records so when my package is run, I should only retrieve information that has changed since the last time my Dimension table was populated. This is why I store the time stamps in my Dimension table.
So when my package runs, I get max(Meeting_Time_Stamp) and max(Event_Time_Stamp) from my Dimension table then retrieve all records from the source table where their timestamps are GREATER than the max values.
The problem is this :
If in the source system, an event is changed, the time stamp associated with that record will change value but the meeting record associated with that meeting will not have its time stamp changed.
How do I ensure that when my package is run, I retrieve all meeting and events information even if one hasn't changed?
Should I build a table made of IDs?! And if I do need to build a table made up of IDs, should I store this as a permanent table in my database or is there an SSIS data structure I can use to hold this information?
We are downloading 4 large (500mb) zip files in the package. Those come from 4 different FTP sites. Sometimes the FTP download on one of those fails.
Zip files contain images which need to be uploaded for existing listings. Each zip file is processed the exact same way. The question is how can I make the SSIS package so that each downloaded zip file can be processed immediately after the download - and not duplicating the processing logic?
I am new to SSIS. I need to transfer data from SQL Server 2005 Operational Database to SQL Server 2005 Report Database. The upload needs to work every night. There are few master tables and remaining are transactions tables.
I am planning to create 2 packages one for master tables and other for transactions tables. Is it the good approach?
Also few of transaction tables are heavy in terms of number of records. Will it better if i further break them in many packages?
I am using book "Microsoft SQL Server 2005 Integration Services Step by Step". Can you suggest any other good book?
Win 7 SP1 x64 PC. I installed SQL Server 2014 Dev Edition + Visual Studio 2015.
I'd like to create some basic ETL SSIS packages, and I worked very comfortably in 2008R2.
For 2014, I started with this tutorial:[URL]However, it says to go to Start->All Programs->Microsoft SQL Server->SQL Server Data Tools.Â
I did explicitly install SSDT when I installed VS2015. I also installed it separately. I see SSDT listed in Programs, and SSIS is running according to SQL Server Config Manager, and Services. Half of Microsoft's docs seem to be 2012 era, which is a shame because 2014 is out and it's nearly 2016...
how do I get to the GUI where I can design ETL packages?Â
OK, I was able to successfully migrate all of my DTS packages to SSIS, for SQL 2005. I can log into intergration services and see my packages listed under:
servername --> stored packages --> msdb. Now my question is, how can I open these packages, not run them open them in a design mode like you can in SQL 2000, you can double click on the package name and view the design of the package. how can i do that now that I have them in SQL 2005?
Hi i tried designing a SSIS package which loads only those rows which were different from existing rows in the table , i need to timestamp the existing row with an inactive date when a update of that row is inserted (ex: same studentID ) and the newly inserted row with a insert time stamp so as to indicate the new row as currently active, in short i need to maintain history and current rows in same table , i tried using slowly changing dimension but could not figure out, anyone experience or knowledge regarding the Data loads please respond.
example of Data would be like
exisiting data
StudentID Name AGE Sex ADDRESS INSERTTIME UPDATETIME 12 DDS 14 M XYZ ST 2/4/06 NULL 14 hgS 17 M ABC ST 3/4/07 NULL
New row to insert would be
12 DDS 15 M DFG ST 4/5/07
the data should reflect
StudentID Name AGE Sex ADDRESS INSERTTIME UPDATETIME 12 DDS 14 M XYZ ST 2/4/06 4/5/07
12 DDS 15 M DFG ST 4/5/07 NULL
14 hgS 17 M ABC ST 3/4/07 NULL
Please provide your input as much as you can even though it might not be a 100% solution.
Hey, I've a few jobs which call SSIS packages. If I run the SSIS package, it runs fine but if I try to run the job which calls this package, it fails. Can someone help me troubleshoot this issue? None of my jobs that call an SSIS package work. All of them fail.
And there is a task (Execute SSIS package) in First package that calls the execution of second package.
I m continuously receiving an error "Failed to decrypt protected XML node "PackagePassword" with error 0x8009000B "Key not valid for use in specified state.". You may not be authorized to access this information. This error occurs when there is a cryptographic error. Verify that the correct key is available."
As we are running first package by job, job runs successfully logging above error
The protection level of second package is set to "EncryptSensitiveWithUserKey"
I am completely new to SSIS and have been given a large project (of course with a tight deadline) that has the absolute requirement of using SSIS. I am/was very, very good with DTS and could easily accomplish what I need to do with an ActiveX script task in DTS in no time, but as this is new development, we are not to use ActiveX script tasks within SSIS since it will not be supported in the next SQL Server release. I'm thinking script task, but please give some comments on how you would accomplish the following in SSIS (please remember I'm new to SSIS, so don't assume I know anything. )
I must accomplish this: in a nutshell, I need to create separate tab delimited text files of customer informaion. One for each region. Each region consists of X amount of states and we have X amount of regions. (Pseudo code followed by standard explanation)
Select a max value from region lookup table in SQL (this is the # of regions)
for N=1 to MyMaxValue
select states from region lookup table where region code = N (the current region we are on) 'this returns a list of states in a region, need these in array or recordset object or something Open an output file which will be a tab delimited text file we will write results below in loop to (in DTS I would programatically kick off a transformation task in the package) 'loop thru states returned, so if in a rs object... do while not rs.eof
execute customer stored procedure, passing as a variable the current state we are on 'this will return all customers within a state, this whole result set (approx 1 million) needs to go to the tab delimited file 'I have to execute this stored procedure for each state & then write results to the SAME file, until we are onto a different region
rs.movenext close file loop next
OK, so basically, as you can see, Its sort of simple in a way what I need to do, i just have no idea how to go about doing this in SSIS. I can not hard code any state or region values. I MUST read them in from the lookup tables as region codes are constanatly changing and we are constantly adding in new states and new regions, so with above coding idea, it would always dynamically pick up any new states, new regions or changes.
So in a nutshell, I need to create separate tab delimited text files of customer informaion. One for each region. Each region consists of X amount of states and there are X amount of regions. Pretty strait forward, huh? The requirements are strait forward, but SSIS is throwing me for a loop... it does not seem flexible enough to be as dynamic as I need it to be to do this. I'm sure it is, just my understanding of it is very basic so far.
Please provide your suggestions! I think a lot of newbies would benefit from some SSIS design info... how to do common things in SSIS, but beyond just retrieving a recordset and writing it to a file... what do you do when you need to add just a few layers of decision processing, and retriving recordsets and writing files based on that decision processing?????
Is Microsoft thinking over the possibility to implement Control+Z in order to avoid drawbacks when you're writing a package? I mean, when you drop a container then you aren't able to retrieve again . That same behaviour happened with Sql2k-dts.
Can anybody give me design ideas on the following?
I have numerous tasks, any one of which can fail. I want to point them all (via 'Failure' constraint) to a SendMail task for a "failure notification" email. This I have setup, and is working fine. Now, I want to have a changing message for the email's body (MessageSource) to say "Process A has failed", or "Process B has failed", etc.
My initial thought was to add a variable, then add a ScriptTask between each task and the single SendMailTask, have the script update the variable, and have the MessageSource (body of email) mapped to the variable. Is this the proper way to go about this? Seems like if I have 20 processes that could fail, then I'll have to add 20 script files; this becomes a bit unwieldy the more processes that I want to monitor.
Does any one have a workaround this issue. If I designed a package from SQL2000 and save it on SQL 7, I can't open/execute it from SQL 7.0. My workstation is SQL 2000 and the server is SQL 7.0, even if I have the same SQL 7 but different SP.
Hi all,I'm fairly new to this...I't trying to design a DTS package on SQL Server 2000`, which willconnect to the server, export some table date into a .txt file (basedon a select statement) and then delete the data from the table (basedon a delete statement) upon successful completion of the export.So far so good..Now, upon the successful deleteion, I want to export some more data andthen (upon the successful export) to delete the data. I want to repeatthis process approx. 10 times.Though, when I try to link the deletion to the new export on success, Iget the message: "Defining the precedences between the selected itemsis not valid."Any idea on how I can accomplish what I want to do?thanks a lot
All: This proably is an unsual request. I have developed a package that runs fine and does what it is supposed to do. I am jsut not sure if I have developed it in the most efficient way. I have several years of ETL experience but only about 6 months with SSIS. I know I can benefit a lot if I had my package reviewed by an expert.
I realize that I am asking for some time committment on your part and so would understand if I do not get any takers. But if you would like to review my package and offer suggestions on its improvement please let me know. We can work on the logistics of getting the package to you.
I have a SSIS job, one of the last steps it performs is to execute a SQL 2000 DTS package. This has to be done as a SQL 2000 DTS package as it is performing rebuilds of SQL 2000 Analysis Services dimensions and cubes. We've found that when the DTS fails the SSIS job is happily completing showing as a success, we would prefer to know it went wrong.
As far as I'm aware SSIS merely starts the DTS off and doesn't care about it's result. I've taken a look in to turning on the logging for the execute DTS package and thought that the ExecuteDTS80PackageTaskTaskResult would give me the answer I need...but is merely written to the log not available as an event-handler. It also looks like it is not safe to put a SQL task in as the next item to go look at the SQL 2000 system tables to look at the log for the DTS package as the SSIS documentation warns that the DTS package can continue to run after the execute DTS package task has ended.
Ideally I want any error raised within the DTS package to cascade up to be an error in the SSIS job, I can then handle it appropriately. I cannot find a way to do this. Is there a way?
If not, can anyone suggest how in the remainder of the SSIS tasks I can be sure that the DTS has completed before I start any other tasks that will check for the SQL 2000 log of its execution?
I have developed an SSIS package for ETL purpose. I am invoking the SSIS package through .Net console application by referencing the ManagedDTS Assembly. I am able to execute the package in Sql Server 2005 Developer Edition and it runs fine till completion.
But when i try to execute the packahe in Sql Server 2005 Standard edition, by invoking the package through .Net console application the status of the package is failure.
Can any one help me how to over come this problem.
I am in the process of moving from a 32-bit SQL Server 2005 Enterprise (9.0.3054) to a 64-bit SQL Server 2005 Enterprise (9.0.3054 with 4 CPUs and 8GB of memory on Win 2003 SP2) and the process has been very frustrating to say the least. I am having a problem with packages that I created on my 64-bit SQL Server. I am importing a few tables from the 32-SQL Server into the 64-bit SQL Server using the Task --> Import to create the package.
Sometimes when I am creating a package I get the following error in a message box:
SQL Server Import and Export Wizard
The SSIS Runtime object could not be created. Verify that DTS.dll is available and registered. The wizard cannot continue and it will terminate.
Additional information: Attempted to read or write protected memory. This is often an indication that other memory is corrupt. (System.Windows.Forms)
Other times when I run a package that has run successfully before I get the following error:
Faulting application dtexecui.exe, version 9.0.3042.0, stamp 45cd726d, faulting module unknown, version 0.0.0.0, stamp 00000000, debug? 0, fault address 0x025d23f0.
The package appears to hang when running. By this I mean that the Package Execution Progress shows progress up to a point then it just stops. (The package takes about 17 seconds to run normally) CPU usage is at 1% and the package cannot be stopped.
I have deleted and re-created the package several times and I have also re-installed the service pack on the SQL Server (9.0.3054) but that did not help.
I would like to standardize SSIS development so that developers all start with the same basic template. I have set it up so it is an available template ( http://support.microsoft.com/kb/908018 ) but I would like it to be the default when a new project or package is created. Is this an option?
I would like to fetch the data flow component name while package is executing. Since system variable named [System::SourceName] only fetches name of the control flow tasks? Is there a way to capture them?
The master package has a configuration file, specifying the connect strings The master package passes these connect-strings to the child packages in a variable Both master package and child packages have connection managers, setup to use localhost. This is done deliberately to be able to test the packages on individual development pc€™s. We do not want to change anything inside the packages when deploying to test, and from test to production. All differences will be in the config files (which are pretty fixed, they very seldom change). That way we can be sure that we can deploy to production without any changes at all.
The package is run from the file system, through a job-schedule.
We experience the following when running on a not default sql-server instance (called dkms5253uedw)
Case 1: The master package starts by executing three sql-scripts (drop foreign key€™s, truncate tables, create foreign key€™s). This works fine.
The master package then executes the first child package. We then in the sysdtslog get:
Error - €œcannot connect to database xxx€? Info - €œpackage is preparing to get connection string from parent €¦€?
The child package then executes OK, does all it€™s work, and finish. Because there has been an error, the master package then stops with an error.
Case 2: When we run exactly the same, but with the connection strings in the config file pointing to the default instance (dkms5253), the everything works fine.
Case 3: When we run exactly the same, again against the dkms5253uedw instance, but now with the exact same databases defined in the default instance, it also works perfect.
Case 4: When we then stop the sql-server on the default instance, the package faults again, this time with
Error - €œtimeout when connect to database xxx€? Info - €œpackage is preparing to get connection string from parent €¦€?
And the continues as in the first case
From all this we conclude, that the child package tries to connect to the database before it knows the connection string it gets passed in the variable from the master package. It therefore tries to connect to the default instance, and this only works if the default instance is running and has the same databases defined. As far as we can see, the child package does no work against the default instance (no logging etc.).
We have tried delayed validation in the packages and in the connection managers, but with the same results (error).
So we are desperately hoping that someone can help us solve this problem.
I am interested in Passing value from a child Package variable to the Parent package that calls it in ssis.
I am able to call the Child package using the execute package task and use Configurations to pass values from the parent variable to the child, but I am not able to pass the value from the child to the parent.
I have a variable called datasetId in both the parent and child. it gets computed in the child and needs to be passed to the parent...
Deployed Report having SSIS package as source do not work when Indirect Package configuration is used in ETL package. It seems ETL package when called/executed from Report manager does not recognize environment variable to pick up the dtsconfig file.
The Report works when Direct package configuration is used to same dtsconfig file.
What could be the reason? Any solution for this? This will cause our build/deployment to QA and Prod very difficult.
I now have two SSIS package, "TESTING" and "LOADING". The "TESTING" package have an execute package task that call the "LOADING" package. When I want to execute the TESTING package, how can I setup the connection string so that I can edit the password of the database connected by the "LOADING" package?
I have two SSIS packages in a project, one calling the other. The parent package works fine in my local mechine. After they are deployed to the production, I schedeul jobs to run the packages in the SqlServer. The child package works fine if I run it alone, but the parent package could not find its child package if I run the parent package . As I checked, all xml config files and the connection string pointing to the child package were set correctly. It seems the parent package did not use the xml config file. Can someone help me? Thanks in advance.
I have successfully created a SSIS package which execute a DTS 2000 package and with no problem to execute the task. But I failed to schedule this package. I was not success in setting the logging. When running the package in command line:
dtexec file "C:Documents and SettingslyangMy DocumentsVisual Studio 2005ProjectsTraingDTSTraingDTSDTSTraining.dtsx"
Error: 2008-03-24 08:03:24.36 Code: 0xC0012024 Source: Execute DTS 2000 Package Task Description: The task "Execute DTS 2000 Package Task" cannot run on this edit ion of Integration Services. It requires a higher level edition. End Error Warning: 2008-03-24 08:03:24.38
Code: 0x80019002 Source: DTSTraining Description: The Execution method succeeded, but the number of errors raised (2) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the M aximumErrorCount or fix the errors. End Warning DTExec: The package execution returned DTSER_FAILURE (1).
I am making use of the DtUtil tool to deploy my package to SQL Server. Following is my configuration: 32-bit machine and 32-bit named instance of Yukon.
I have some package variables which need to be set in the code.
Previously I did it as follows:
Set the package variables in the code. For example:
Here the package was successfully deployed and when i open those packages using BIDS, I am able to see that the variables are set to the values as doen in teh code.
Because of oen problem I am not using SaveToSQLServer method. So I switched to DTUtil tool. Now I am doing it this way:
Set the package variables as before. Deploy the package to SQL Server using DTUtil tool.
Now is the problem: The package is successfully deployed. But the variables are not set to the value that I have specified in the code.
I also tried DTexec utility to set the package variable. Even that does n't work. Can anyone help me out? Is there any alternate method to set package variables?
when i run the job using network service account credentials job is failing. But when i run the package individually, it is tasting success. when it runs as job, this is the error message i am getting
SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009. There may be error messages posted before this with more information on why the AcquireConnection method call failed.
I have changed to 32-Bit run time and ran the excel package, even then it is failing...i tried to use my credentials (i am admin on the box), even then it is failing...please suggest
I would like to merge data from several excel sheets and load into a SQL Server 2005 database. The excel sheet are uploaded by users from several locations and I'm storing them on a folder on the web-server. I've written the SSIS packages for the required transformations, and for loading the data onto the database. Now, my question is:
1. What is the most efficient way to trigger execution of the SSIS package each time a file is uploaded onto the server (and stored in that specific folder)?
2. Would it be possible to not statically set the "excel source" for the SSIS package data flow transformations, but instead use the folder as the source and start the SSIS package each time a file is newly transfered onto the folder? (Should I use the Script task to get a list of excel files in the folder ordered by the system time?)
I tried using the Service Agent and scheduling it to execute the package on the server. But, my current SSIS design has a static excel source filename and thus when the package is executed more than once, it fails due to DB key constraints.
I need some help in designing the SSIS Package. The requirement is as follows:
I need to import data from an excel sheet to a table in the database. I have 2 lookup tables which has the master data. Before importing, I need to check whether the values of 2 columns in Excel sheet exist in those Master data tables or not. If not exist i need to log that error along with the row number and proceed to next row. It has to loop through all the rows and log the errors. The Import happens when there is no error occurred ie i need to roll back (but loop through all rows to log the errors) if there is any error occurs.
It would be great if you could help me out in desining this scenario in SSIS Package.
I want to achieve the following in (SSIS/SSDT for SQL 2012) -Â
I have a generic SSIS package which simply sends out email notifications using SMTP email task (this package is within its own project, and has project level input parameters).
I need to be able to call this package in the Event handler section of every package (numbering in about less than 60) that we have. These packages are within their own respective projects.
I thought I could use the "execute package task", but it turns out , using this, I cannot call a package that is part of some other project. I also cannot call a package that is stored in the CATALOG. Is there any way I can do this ?
When I call the child package , I should be able to send in parameters like - error information and package name of the Parent package.