Hi all! I recently started working with SSIS and one of the things that is puzzling me the most is what's the best way to go:
A small control flow, with large data flow tasks
A control flow with more, but smaller, data flow tasks?
Any help will be greatly appreciated.
Thanks,
Ricardo
I need to pass a parameter from the control flow to a data flow. The data flow will use this parameter to get data from an Oracle source.
I have an Execute SQL task in the control flow to assign a value to the parameter; the next step is a data flow that needs to take a parameter in the SQL statement used to query the Oracle source.
The SQL Looks like this:
select * from ccst_acctsys_account
where to_char(LAST_MODIFIED_DATE, 'YYYYMMDD') > ?
The problem is the OLE DB Source editor doesn't have anything for mapping the parameter.
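(A common workaround, offered as a sketch rather than the only option: Oracle providers often won't accept ? parameters in the source, so the statement is instead built in a string package variable whose expression splices the value in, and the source's data access mode is switched to "SQL command from variable". Assuming a hypothetical string variable User::LastModified holding the cutoff date, the variable's expression might read:

    "select * from ccst_acctsys_account where to_char(LAST_MODIFIED_DATE, 'YYYYMMDD') > '" + @[User::LastModified] + "'"

)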
I have an Execute SQL Task that returns a Full Rowset from a SQL Server table and assigns it to a variable objRecs. I connect that to a foreach container with an ADO enumerator using objRecs variable and Rows in first table mode. I defined variables and mapped them to the columns.
I tested this by placing a Script task inside the foreach container and displaying the variables in a messagebox.
Now, for each row, I want to write a record to an MS Access table and then update a column back in the original SQL Server table from which I retrieved data in the Execute SQL task (I have the primary key). If I drop a Data Flow Task inside my foreach container, how do I pass the variables as input to an OLE DB Destination on the Data Flow?
Also, how would I update the original source table where source.id = objRecs.id?
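(For the write-back leg, the usual shape is an Execute SQL Task inside the loop running a parameterized statement, with the loop variables bound to the ? markers on the Parameter Mapping page; a sketch, with hypothetical table and column names:

    UPDATE dbo.SourceTable
    SET processed = 1
    WHERE id = ?

)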
Thank you for your assistance. I have spent the day trying to figure this out (and thought it would be simple), but I am just not getting SSIS. Sorry if this has been covered.
Dear all! My package has a Data Flow Task. In the Data Flow Task, I use a Script Component and an OLE DB Destination to transform data from a txt file into the database. Within the Data Flow Task, I want to call a File System Task to move the file to a folder, or indeed any task from the Control Flow tab. So, does SSIS support this? Please show me how if so. Thanks
I'm currently setting variables at the package level with an ExecuteSQL task. This works fine. However, I'm now starting to think about restartability midway through a package. It would be nice to have the variable(s) needed in a data flow set within the data flow so that I only have to restart that task.
Is there a way to do that using an SQL statement as the source of the value in a data flow?
Or, when using checkpoints, will variable values be saved so that they are available when the package is restarted? That would make my issue moot.
I recently upgraded to 2012 SP1 CU5 and have found the SSDT GUI for SSIS to be almost unusable. I can't drag or resize items. Any time I try, they either automagically shrink to the tiniest possible size, shoot off to some extreme, or just shake uncontrollably. I didn't have these problems on previous versions (I don't remember which version it was).
I am new to SSIS. Can anyone tell me the difference between control flow and data flow? If all the transformations are done using the data flow, then why do we use the control flow? Sorry if I am asking a very basic question.
Hi, I'm trying to implement an incremental data pull (Oracle to SQL) based on Andy's blog: http://sqlblog.com/blogs/andy_leonard/archive/2007/07/09/ssis-design-pattern-incremental-loads.aspx
My development machine is decent: a 1.86 GHz Intel Core 2 CPU with 3 GB of RAM. However, the data flow task seems to hang whenever I test the package against the ~6 million row source, as can be seen from these screenshots. I have no memory limitations on the lookup transformation. After the rows have been cached, nothing happens. Memory for the DtsDebug process hovers around 1.8 GB and it uses 1-6 percent of CPU continuously. I am not using fast load to insert new records into my SQL target table. (In the screenshots I am right-clicking Sequence Container 3 and executing just this container, NOT the entire package.)
The same package works fine against a similar test table with 150k rows. http://i248.photobucket.com/albums/gg168/boston_sql92/7.jpg http://i248.photobucket.com/albums/gg168/boston_sql92/8.jpg
The weird thing is it only takes 24 minutes for a full refresh of the entire source table from Oracle to the SQL target table. Any hints or advice would be appreciated.
I am having a hard time with what appears to be something simple. I want to import an Excel spreadsheet into a table on a daily basis from a command line. I created a package with the Import Wizard in SQL Management Studio and saved it. Since I want a clean table each day, my process needs to be: create a temp table, import from the Excel file into the temp table, and if that succeeds, delete the original table and rename the temp table to the original name. The point of this process is to provide a fail-safe if there is some unforeseen problem downloading the data on a particular day.
When I run the package, the first thing it does is delete the original table. I know this because the time it shows for finishing that step is before anything else has started or finished. The time shown for the completion of the data flow task is about 2 minutes after that.
This is maddening!!! The one thing I do not want to happen is the one thing I cannot seem to prevent. I have my control flow set on success. Why does it do this?
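(For reference, the swap itself is often done in a final Execute SQL Task that only runs on a success constraint downstream of the data flow; a sketch with hypothetical table names:

    BEGIN TRANSACTION;
    DROP TABLE dbo.DailyImport;
    EXEC sp_rename 'dbo.DailyImport_Temp', 'DailyImport';
    COMMIT;

Keeping the drop and rename together, after the import has succeeded, is what guarantees the original table survives a failed download.)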
Is it possible to set up a Control Flow at the solution level rather than at the package level? I'd like to set up a Control Flow that truncates multiple tables in a staging database and then runs multiple packages that reload those tables. Each package has a Control Flow tab that seems to be specific to the package. Is it possible to set something up that governs the execution of multiple packages?
Essentially I have an incoming file, each line in the file is a record. The records share the same initial key fields for the first 10 columns, then the field structure varies depending on a rectype and sequence number.
My initial plan was load the keys into fields, and load the remaining data into a long varchar field.
Then the stored procedure would evaluate the Rectype and Seqno of each record and chop up the Varchar accordingly.
So I set up a cursor to read the temporary table, do a fetch into variables, and go to evaluate the variables.
I want to be able to use a CASE statement to evaluate the fields and then perform various logic, but it's giving me fits because it seems like CASE only really works in Select statements, and won't really allow you to do any sort of GOTO logic.
I chopped the following SQL up and put in a rough cut of what I thought I was doing.
FETCH NEXT FROM Transaction_Cursor INTO @eepssn, @rectype, @seqno, @data

WHILE @@FETCH_STATUS = 0
BEGIN
    SELECT @companycode = cmpcompanycode,
           @eeccoid = eeccoid,
           @eeceeid = eeceeid,
           @eecempno = eecempno
    FROM company, empcomp
    WHERE eeceeid = (SELECT eepeeid FROM emppers WHERE eepssn = @eepssn)
      AND eecemplstatus = 'A'
      AND cmpcoid = eeccoid

    CASE @rectype + @seqno
        WHEN '001001' THEN GOTO parse_pretax
        WHEN '002001' THEN GOTO parse_loan
        ELSE SELECT @rectype + @seqno + ' not recognized!'
    END

process_it:
    INSERT INTO foo2 (empno, companycode, amt, pct)
    VALUES (@eecempno, @companycode, @eeamt, @eepct)

    FETCH NEXT FROM Transaction_Cursor INTO @eepssn, @rectype, @seqno, @data
END

CLOSE Transaction_Cursor
DEALLOCATE Transaction_Cursor

GOTO bypass

parse_pretax:
    SET @eepct = SUBSTRING(@data, 1, 5)
    GOTO process_it

parse_loan:
    SET @eeamt = SUBSTRING(@data, 27, 11)
    GOTO process_it

bypass:
I could sketch it out a little bit better in Northwind or Pubs, but I think I just need a smack upside the head and a little edification.
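(For orientation: in T-SQL, CASE is an expression that returns a value inside a statement, not a control-of-flow construct, which is why it fights the GOTOs. A minimal sketch of the same branch written with IF/ELSE, using the variables from the post:

    IF @rectype + @seqno = '001001'
        GOTO parse_pretax
    ELSE IF @rectype + @seqno = '002001'
        GOTO parse_loan
    ELSE
        SELECT @rectype + @seqno + ' not recognized!'

)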
Hi, I would like to have a decision-maker component in the Control Flow to decide which Data Flow to run. For example: if myVar = A, run Data Flow A; if myVar = B, run Data Flow B; and so on.
We have a Conditional Split component available in the data flow but not in the Control Flow. What component in the control flow can be used to do the same job that the Conditional Split does in the data flow? Or what's a workaround for this?
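(One common workaround, sketched here rather than prescribed: connect the upstream task to both data flows, then edit each precedence constraint to use Evaluation operation = Expression, with @[User::myVar] == "A" on the arrow into Data Flow A and @[User::myVar] == "B" on the arrow into Data Flow B. Only the branch whose expression evaluates to true will run.)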
I have a problem when using nested loops in my Control Flow. The package contains an outer Foreach Loop using the Foreach File Enumerator, which in my test case loops over two files found in a directory. Inside this loop is another Foreach Loop using the Foreach NodeList Enumerator. Before entering the inner loop, a variable, xpath, is set to a value that depends on the current file, i.e. /file[name = '@CurrentFileName']/content. The NodeList Enumerator is set to use this variable as its OuterXPathString. Now, this is what happens:
First Iteration: The first file is found and the value of xpath = /file[name = 'test1.txt']/content. When the inner loop is entered it iterates over the content elements under the file with name test1.txt as expected.
Second Iteration: The second file is found and the value of xpath = /file[name = 'test2.txt']/content. When the inner loop is entered it unexpectedly still iterates over the content elements under the file with name test1.txt.
My question is: Should it not be possible to change the loop condition of an inner loop in an outer loop such that the next time it is entered it will be done based on the new condition? It seems that the xpath variable is read once, the first time, and never again. If that is the case, does anyone know of a workaround?
I recently installed SQL Server 2005 with Integration Services. While trying to drag control flow items into the control flow pane, I get the error below...
The task with the name 'Execute SQL Task' and creation name 'Microsoft.sqlserver.dts.tasks.executesqltask, Microsoft.sqlserver.SqlTask, Version=9.0.242.0, culture=neutral, public token=7895cd6666345' is not registered for use on this computer.
Contact Information:
Execute SQL Task
Can anybody please help me understand why this happens and how to solve it?
Hello, my package "splits" into a number of work flows through the use of precedence constraints.
After a number of steps I want to combine the flows back to one and finish the package.
What is the mechanism to combine the flows into a single, linear process again, so that I do not have to define identical tasks multiple times to finish each flow leg?
I have run into a problem while creating a simple UDF on SQL Server 2000.
Code Snippet
CREATE FUNCTION [dbo].[GetSectionNum] (@section varchar(4))
RETURNS varchar(2)
AS
BEGIN
    DECLARE @sTemp varchar(2), @s char
    DECLARE @count int
    DECLARE @length int

    SET @length = LEN(@section)
    SET @count = 1

    WHILE (@count <= @length)
    BEGIN
        SET @s = SUBSTRING(@section, @count, 1)
        IF (ISNUMERIC(@s))
        BEGIN
            SET @sTemp = @sTemp + @s
        END
        SET @count = @count + 1
    END

    IF (LEN(@sTemp) = 1)
    BEGIN
        SET @sTemp = '0' + @sTemp
    END

    RETURN @sTemp
END

When I perform a syntax check I receive an error: "Error 156: incorrect syntax near keyword 'BEGIN'". I have narrowed the problem to the IF statement inside the WHILE block. If I remove the IF statement the syntax check is successful. This is the first UDF I have written, so I'm swimming in uncharted water. Thanks ahead of time for your help.
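(For what it's worth: in T-SQL, IF requires a Boolean expression, and ISNUMERIC returns an int, so that line would ordinarily be written with an explicit comparison:

    IF (ISNUMERIC(@s) = 1)

which is consistent with the parser tripping on the BEGIN that follows.)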
I am using the event handling mechanism to do my custom logging. This works fine. Using the OnPreExecute and OnPostExecute my log tables fill up with the start- and enddates of all the containers and tasks in the complete package hierarchy.
However, what I am missing (e.g. in a system variable) is the source ID of the parent container that started the task or container. In other words, in my logging reports, I would like to build up a tree starting with the topmost package, like the progress indication in the IDE.
The built-in Logging features do not log this information either as far as I can tell.
I have a package which has 3 data flows and a script component to write a log file. Those 3 data flows run in sequence (meaning upon success, the next one runs). If any of the first 2 data flows fails, control goes to the script, and upon completion of the 3rd, control goes to the script to create the log file.
Now, my question is: within my script component, is there any way to find out which data flow ran last and what its status was? What I know is: adding an ActiveX Script task between each data flow and the script component which updates a variable, and then using that variable in the script component to know the last-run data flow. But what I want to know is: is there any system variable (or something similar) that will do the same thing without the need to add an additional task?
Hi all of you. I have no idea whether the following description is an issue or not; anyway:
I'll begin with the Control Flow layer.
I've created a sequence container, and inside I've got two groups: one owns a SQL task and the other owns a Data Flow task. Both are linked by a completion connector. Up to here everything is fine. But when I collapse my sequence container, the arrow remains there for these tasks, and you can see the sequence container "closed" and the arrow on its own.
Not very aesthetic, and not practical.
Any clarification or thought will be, as usual, welcomed.
I have read several postings about various modes of trapping errors, but none seem to directly address what I am looking for (SQLIS.com, MSDN, etc.).
Coming from a Java/C# background, I am looking for a way to trap errors that arise within the SSIS control flow much like those languages do:
try {
    do something
} catch (AnExceptionType myException) {
    handle my exception
}
/* my code at this point is unperturbed by the exception unless I explicitly
   raise the exception out of the scope of the exception handler. */
To make the analogy in SSIS, I want to be able to handle an error within a "container" and not have to handle the same error in surrounding containers.
Example:
I have a "Foreach" container (call it container FEC) that contains several other containers. One of the subordinate containers is a "For Loop" (call it FLC). The FLC in turn has some nested tasks, some of which are expected to fail and therefore I want to handle in a graceful manner. The tasks that are expected to fail have a "fail" constraint that links them to a task that I want to occur when the failure occurs, and that works, but the failure is not trapped as it percolates out of the container to the FEC. I also tried to trap it with event handler, but that is also an incorrrect trail to follow.
I don't want the failure to percolate up to the FEC. I have set the max errors to a reasonable value for FLC and my "program" is not exceeding that value; however, the FEC still sees that error so it fails. How do I keep FEC from seeing the error (without upping the max errors for the FEC)?
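(One knob that may be relevant here, sketched under the assumption it applies to this layout: an OnError event handler attached to the FLC exposes a system variable named Propagate, and setting it to False inside that handler is the documented way to stop the error event from bubbling up to the FEC:

    Scope: FLC's OnError event handler
    System::Propagate = False

)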
BTW, I am using the script task to set a variable value to indicate success or failure for those tasks where I can set the max errors to a high enough level (allow the error to occur, then let the fail/success precedence constraint pass control to the script task so that the variable can be set). This is only a partial solution.
I am new to SSIS, in fact to the MS world having been a code slinger for Java and Oracle. So far I have been very impressed with SSIS. Analogous structures that I expect to find in modern development environments have been within easy reach. This is my first serious challenge. Please help.
A branching element is critical to any process flow. Currently, as far as I know, there's only a Conditional Split data flow element. There is no direct way I can branch out in a control flow.
In some cases, I could branch by using the conditional operator ?: to either create a dynamic SQL string for each path, or a package name for an Execute Package task, and so on.
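(For instance, a property expression on an Execute SQL Task's SqlStatementSource might read, with hypothetical variable and procedure names: @[User::branch] == "A" ? "EXEC dbo.LoadPathA" : "EXEC dbo.LoadPathB".)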
This approach is not always good enough.
Are there others out there who want a Conditional Control Flow element?
I am somewhat new to SSIS and have a question on branching/control flow. We have 3 manufacturing facilities on AS400. The requirement was to create independent SSIS packages so that if certain facilities were down, the others would not be interrupted or fail. To be honest, I didn't want to maintain 3 * (each plant package) packages, and decided to create one that takes in a command-line parameter and creates separate batch files (a requirement of our archaic scheduling system) that pass in the plant to be processed.
Everything works fine: the package analyzes the passed parameter with no problem and evaluates it when calling stored procedures in a SQL Task. I am required (I think; let me know if I'm wrong here) to have 3 separate Data Flow Tasks depending on which plant is being processed, because they are different connections (same server, different DBs, AS400). Which Data Flow Task runs is based on a precedence constraint that evaluates the passed-in plant number.
After the Data Flow Tasks are complete, I'd like to merge back into one path again (even though only 1 out of N will run at a time), because the remaining steps are all simply stored proc calls with different parameters. It seemed silly to me to have the same tasks repeated 3 different times. I looked into starting tasks from events, but I don't know a lot about it, and it seems as though I would still be maintaining 3 different paths anyway.
Can anyone point me in the right direction?
P.S. (Even if the solution is to have 1 Data Flow Task, I'm still interested in knowing how to converge paths back together.)
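(A sketch of the convergence, assuming the standard constraint editor options: point all 3 plant branches at the single shared task, then in the Precedence Constraint Editor for its inputs select "Logical OR. One constraint must evaluate to True" instead of the default Logical AND, so the shared task fires when whichever branch actually ran completes.)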
I've been using Konesans' File Watcher control flow task successfully in design mode on my PC, which runs the package against a remote server.
I have installed the Konesans File Watcher on the remote SQL Server machine. I then imported the package to the server (File System folder). I then select the package, right-click 'Run Package', then Execute, and receive the error:
"Error: The task 'File Watcher Task' cannot run on this edition of Integration Services. It requires a higher level edition"
...in the 'Package Execution Progress' dialog. All other validation seems to be OK.
(Note: I'm executing the above steps using SQL Server Management Studio from my PC; I'm not doing it from the SQL Server machine itself... not sure if this matters or not.)
The SSIS version installed on the server is 9.0.3054. It shouldn't be an "SSIS version issue", as it is the same SQL Server that I used (successfully) from my PC in design mode...
Hello, I'm trying to do something simple but can't make it work. I want to take the result of this SQL, "SELECT MAX(Stoptime) AS LatestStop FROM Stoptime", and use it in "DELETE FROM Stoptime WHERE Stoptime > ?", but here with just the date. How do I set this up with parameters in the Control Flow in Integration Services?
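(A sketch of the usual two-task shape, assuming a hypothetical package variable User::LatestStop of type DateTime:

    Execute SQL Task 1: SELECT MAX(Stoptime) AS LatestStop FROM Stoptime
        ResultSet = Single row; the Result Set page maps LatestStop to User::LatestStop
    Execute SQL Task 2: DELETE FROM Stoptime WHERE Stoptime > ?
        The Parameter Mapping page maps User::LatestStop to parameter 0

)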
I have a For Each Loop container, and inside the container I have an Execute SQL task, which runs first; then I need to kick off 5 Data Flow Tasks. Do I need to connect the 5 DFTs to each other using the green arrows (precedence constraints)? How would you usually do this? Thanks.
I'm trying to populate a 3-field Dimension table from two existing databases (on two separate servers).
From databaseA.tableX I'm taking two fields as is.
The third field needs to come from databaseB.tableY.
However, to find the row with the value I want, I need to:
-- Take one of the fields from A.X and massage its string value (let's call it @Value)
-- Use that value to query B.Y for the first row having a value LIKE (@Value + '%') in a certain field
-- Get the value of that matched field and put it in my Dimension table.
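(For reference, the probe against B.Y would presumably look something like this; table and column names are hypothetical:

    SELECT TOP 1 TheWantedField
    FROM databaseB.dbo.tableY
    WHERE SomeKeyField LIKE @Value + '%'

)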
Can someone help sketch out a rational Control Flow for this? What Data Flows would it contain, etc.
I'm just a little unsure of something. If I have, for instance, 2 containers both containing control flow items, and container 1 flows into container 2 (thus ensuring that container 1's items are executed first), how do I set the control flow to NOT execute container 2 at all if any of the items of container 1 has failed, BUT ensure that container 1's items are ALL executed even though some in container 1 might fail?
Thus I want all items in container 1 to at least execute, but only if all are successful should the control flow execute container 2's items.
I'm trying to conditionally execute a data flow based on the presence of a data file. If the data file isn't present, I'd like to exit gracefully without error.
Logic is as follows:
If FileExists Then
    execute dataflow
Else
    exit w/o error
End If
I've got the code ready to go, but I'm not sure how to do this conditional branch logic. Right now, the script returns Dts.Results.Success / Dts.Results.Failure. The problem, however, is that Failure is exactly that... which doesn't result in the graceful exit I'm looking for.
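(One sketch of the branch, assuming a hypothetical Boolean package variable User::FileExists that the Script Task sets: always return Success from the script, then edit the precedence constraint into the data flow so that Evaluation operation = Expression, with the expression @[User::FileExists] == TRUE. The data flow is then simply skipped, without error, when the file is absent.)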
Is there documentation somewhere about multiple execution paths in SSIS control flow? I didn't find documentation anywhere. I have a situation where I have two tasks that take considerable time but could be executed in parallel (to speed things up), and I was wondering whether SSIS supports parallelism.
To illustrate the issues in simultaneous execution, I created a test SSIS package. In the package, I have five tasks; let's call them T1, T2, T3, T4 and T5. The tasks are connected with "green arrows" like this:
T1->T2
T1->T3
T2->T4
T3->T4
T5 is not connected. The tasks can be e.g. Send Mail tasks; that's not relevant to this issue. I put a breakpoint in each task and execute the package.
When I execute the package, T1 and T5 become active, i.e. the arrow that shows where package execution currently is appears in two tasks simultaneously. Now F10 (step over) doesn't seem to work: "Unable to step. Not implemented". If I press F5, nothing happens. After I press F5 a second time, tasks T1 and T5 are executed. Why don't they execute on the first press of F5? I would additionally like to know whether these two tasks are executed in parallel or sequentially, i.e. in the same thread or in two threads? Is there documentation of this?
The execution stops at T2 and T3. Again, pressing F5 doesn't do anything, but the second time I press F5, T2 and T3 are executed.
What SSIS task or process can I use to stop my control flow from running?
I created a SQL task to do a count on a table to see if there is data. If the count is > 0 then the control flow must continue; otherwise it uses a RAISERROR statement, which I catch with the event handler. But I want to put something in the event handler to stop the process then and there, not continue.
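(A sketch of the check itself, with a hypothetical table name; a severity of 16 is enough to fail the Execute SQL Task and fire OnError:

    IF (SELECT COUNT(*) FROM dbo.StagingTable) = 0
        RAISERROR('No rows to process; stopping package.', 16, 1)

)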