I have a requirement that seems like it would be very common in many ETL scenarios.
It has been called "upsert". From Wikipedia:
The SQL-like UPSERT statement inserts a record to a table in a database if the record does not exist; if the record already exists, an update operation is performed. This is not a standard SQL statement, but it is frequently used to abbreviate the equivalent pseudo-code. The term upsert is a portmanteau of update and insert and is common slang among database hackers. The SQL:2003 defines a MERGE statement that provides similar functionality.
Example:
IF FOUND
THEN UPDATE
ELSE
INSERT;
See also
merge (SQL)
Other ETL tools I am aware of support this operation with pre-built functionality that only requires the developer to enter the key columns for the target and choose how to handle the data when rows with matching keys are found in both the stream and the target (ignore, update).
Is there an SSIS way to do this without resorting to hand coded SQL or script? Given that this is likely to be needed for many of the "fact" tables in a schema, it will end up being a _lot_ of SQL or script to write and maintain.
My organization is required to ETL millions of rows of medical insurance claims data per month into our data structures from many different external systems. The data structures vary tremendously between the different source systems. We need to develop an efficient "upsert" algorithm: when a new row arrives from a particular source system, determine:
a: does a row with an identical unique natural key (often a composite) already exist in our system?
b: if "a" above is true, is any non-key data in the newly arrived row different from the existing row with the same key?
c: depending on the results of tests a and b above, perform either an insert or an update.
So we need to develop an internal "upsert" logic solution. We also want to have a solution that will support a generalized solution as early as possible in the data processing. Therefore, we are considering applying two identical MD5 hashing algorithms (probably C or C# extended stored procs) to each incoming source row (and a one time generation of the two hashes for all legacy loaded rows). One hash will be created from the unique natural key for each row, and one will be created from all the remaining columns in the row. Together these two MD5 hashes should allow us to perform comparisons on all newly arrived vs. previously loaded data, and reliably determine results of tests "a" and "b" above.
This approach also supports the requirement that the code that performs the upsert logic be generalized as early as possible for all different source systems, as the actual test logic will be identical for all source data and independent of the custom logic used to generate the MD5 hashes for each unique source structure.
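The two-hash comparison described above can be sketched roughly as follows. This is only an illustration of the test logic, not the extended stored procedure itself: Python's `hashlib` stands in for the C/C# hashing code, and the names `md5_of`, `classify`, and the sample claim columns are hypothetical.

```python
import hashlib


def md5_of(values):
    """Hash a sequence of column values into one MD5 hex digest.

    A unit-separator character between values keeps ('ab', 'c') and
    ('a', 'bc') from hashing the same. Assumption: None and the empty
    string are treated as equal here.
    """
    joined = "\x1f".join("" if v is None else str(v) for v in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def classify(row, key_cols, existing):
    """Decide what to do with one incoming row.

    `existing` maps key hash -> data hash for rows already loaded.
    Returns (action, key_hash, data_hash), where action is
    'insert', 'update' or 'skip'.
    """
    data_cols = sorted(c for c in row if c not in key_cols)
    key_hash = md5_of(row[c] for c in key_cols)
    data_hash = md5_of(row[c] for c in data_cols)
    if key_hash not in existing:          # test "a": key not seen before
        return "insert", key_hash, data_hash
    if existing[key_hash] != data_hash:   # test "b": non-key data changed
        return "update", key_hash, data_hash
    return "skip", key_hash, data_hash


# Hypothetical sample data
loaded = {}
keys = ["claim_id", "source"]
first = {"claim_id": 1, "source": "A", "amount": 100}
action_1, k, d = classify(first, keys, loaded)
loaded[k] = d
action_2 = classify(first, keys, loaded)[0]
action_3 = classify(dict(first, amount=120), keys, loaded)[0]
```

Because only the two hashes are compared, the same `classify` logic can be reused for every source system; only the choice of key columns differs per source.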
Is there a better way? How do others approach this problem? What kind of performance impact can be expected from the xp calls?
I have read what people have posted on this but can't seem to find exactly what I want. I tried the methods from this website, as many suggest: http://www.sqlis.com/default.aspx?311. But all this seems to do is insert new records; it does not update the existing records.
This is what I want to do. I have an online store that uses a SQL DB with a products table. I have an Excel/CSV spreadsheet with the same columns as in the SQL database. It is much quicker for me to update the spreadsheet than to log in to SQL to do it. So I would like to push all the product prices from the flat file to the SQL table. There will be more updating than adding new rows.
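As a rough illustration of the "update first, insert only if nothing matched" pattern described above, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for the SQL Server database, and the table name `products` and the columns `sku`, `name`, and `price` are assumptions, not the real schema.

```python
import csv
import io
import sqlite3


def upsert_products(conn, csv_file):
    """Update each product from the CSV; insert it only when no row matched."""
    updated = inserted = 0
    for rec in csv.DictReader(csv_file):
        cur = conn.execute(
            "UPDATE products SET name = ?, price = ? WHERE sku = ?",
            (rec["name"], float(rec["price"]), rec["sku"]),
        )
        if cur.rowcount == 0:
            # No existing row had this key, so this is a new product.
            conn.execute(
                "INSERT INTO products (sku, name, price) VALUES (?, ?, ?)",
                (rec["sku"], rec["name"], float(rec["price"])),
            )
            inserted += 1
        else:
            updated += 1
    conn.commit()
    return updated, inserted


# Hypothetical demo data: one existing product, one new one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES ('A1', 'Widget', 9.99)")
sample = io.StringIO("sku,name,price\nA1,Widget,12.50\nB2,Gadget,4.00\n")
result = upsert_products(conn, sample)
```

Trying the update first suits this case because most rows are expected to exist already, so the insert branch runs rarely.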
I want to achieve the following in SSIS/SSDT for SQL 2012:
I have a generic SSIS package which simply sends out email notifications using SMTP email task (this package is within its own project, and has project level input parameters).
I need to be able to call this package in the event handler section of every package that we have (fewer than 60 in total). These packages are within their own respective projects.
I thought I could use the "Execute Package Task", but it turns out that with it I cannot call a package that is part of some other project. I also cannot call a package that is stored in the catalog. Is there any way I can do this?
When I call the child package, I should be able to send in parameters such as the error information and the package name of the parent package.
I have an SSIS package (TransAgentMaster) that I recently modified to include a call to a child package via the file system. The child package creates a text file. When I run the package in dev studio then the child package/text file is produced.
I then imported TransAgentMaster into SSIS on the SQL Server as a Stored Packages\File System package and executed the package. The child package produced the text file.
I then ran the SQL Server Agent job to see if the child package would work, and it did not generate the text file. So, after updating an SSIS package and importing it into SSIS, the job that calls the package will not call the child package. Please note that the TransAgentMaster package calls 7 child packages... just not my new one.
Any thoughts on why the agent will not run the newly created child package?
p.s. Does anyone have any needles I can borrow? I think sticking them in my eyes would be nicer than working with SSIS.
===================================
An error occurred while objects were being copied. SSIS Designer could not serialize the SSIS runtime objects. (Microsoft Visual Studio)
===================================
Could not copy object 'Preparation SQL Task' to the clipboard. (Microsoft.DataTransformationServices.Design)
------------------------------ For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.762&EvtSrc=Microsoft.DataTransformationServices.Design.SR&EvtID=SerializeComponentsFailed&LinkId=20476
------------------------------ Program Location:
at Microsoft.DataTransformationServices.Design.DtsClipboardCommandHelper.SerializeRuntimeObjects(ICollection logicalObjects) at Microsoft.DataTransformationServices.Design.ControlFlowClipboardCommandHelper.InternalMenuCopy(MenuCommand sender, CommandHandlingArgs args)
===================================
Invalid access to memory location. (Exception from HRESULT: 0x800703E6) (Microsoft.SqlServer.ManagedDTS)
------------------------------ Program Location:
at Microsoft.SqlServer.Dts.Runtime.PersistImpl.SaveToXML(XmlDocument& doc, XmlNode node, IDTSEvents events) at Microsoft.SqlServer.Dts.Runtime.DtsContainer.SaveToXML(XmlDocument& doc, XmlNode node, IDTSEvents events) at Microsoft.DataTransformationServices.Design.DtsClipboardCommandHelper.SerializeRuntimeObjects(ICollection logicalObjects)
Hi. I need to import an Excel file into a database. I first need to do an unpivot task. The column names are dates, and SSIS seems to be unable to pick up the column names, as they are replaced by F2, F3, F4, etc. Can you advise of a solution? Thanks, Ken
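To make the intended unpivot concrete, here is a small sketch of the transformation itself, outside SSIS. It assumes the header row can be read as ordinary data (so the date labels arrive as values in the first row rather than as column names); the function name `unpivot` and the sample sheet are hypothetical.

```python
def unpivot(rows):
    """Turn a wide sheet into long (key, date_label, value) rows.

    Assumes the first row holds the labels and the first column holds
    the row key. Empty cells (None) are skipped.
    """
    header, *body = rows
    date_labels = header[1:]
    long_rows = []
    for record in body:
        key = record[0]
        for date_label, value in zip(date_labels, record[1:]):
            if value is not None:
                long_rows.append((key, date_label, value))
    return long_rows


# Hypothetical sheet: the first row carries the date labels as data.
sheet = [
    ["Product", "2007-01-01", "2007-02-01"],
    ["Apples", 10, 12],
    ["Pears", None, 7],
]
long_rows = unpivot(sheet)
```

The point of the sketch is that the date labels are taken from the first data row, so generated column names such as F2 or F3 never need to carry meaning.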
I'm finding that the standard components often just don't quite meet my needs, but would only need some fairly minor changes to save me and my team a lot of work (and produce more elegant solutions). So I was just wondering whether the source code is available for the standard components that come with SSIS, or if there is any way to extend their functionality? Or do you just have to start from scratch?
I need to build an asp.net/C# application to read values from an Excel spreadsheet. Once the values are read from the spreadsheet, the C# code will do some elementary statistics on the values read. Then the values read and their computations will be written to a SQL Server database. My manager suggested that SSIS might be a good candidate technology for doing this type of work. Does that sound correct? My only hesitation with using SSIS is that I want to keep the application as simple as possible, so that the code can be more portable. Maybe my argument is not a good one, but maybe someone can help me out here. Ralph
Dear friends, I store several configurations for my SSIS packages in the main database. I need to get the server name from an XML or TXT file in order to reach the configurations stored in that database. What do you think is the best way to do that? Using a Flat File Source to read the file and a script to save the value into an SSIS variable? I can't see how to do that with package configurations. Maybe I just don't know how, but I can save the SSIS variable to the configuration file, whereas what I need is the inverse: read the configuration file and save the value in the SSIS variable. What is the best way you would suggest? Regards, and thanks.
We have SQL 2008 in development but only SQL 2005 in production. I have an SSIS package that was created in 2008 but need to deploy it to a SQL 2005 server. The '05 server will not import the package because of its version. Is there a way to convert back or 'save as' SSIS '05?
I have two questions to ask in this one thread. I would appreciate any feedback.
1. Is it possible to create GUI from SSIS using macro so that it can display forms or dialogs? If so how can I create a form that can be used to pass the parameters for the execution of the SSIS??
2. Is it possible to pass parameter(s) to SSIS? If yes, how can we do it...Please provide me with any example.
Scenario:
1. An SSIS package executes tasks on a SQL Server 2000 database.
2. Execution takes place using Business Intelligence Studio.
Question:
1. How can I track that the tasks on SQL 2000 took place using an SSIS package?
I am new to SSIS. I am trying to install just SSIS on one machine ("SSIS Machine") and just the DB Engine on another machine ("SQL Server Machine"). What I am trying to do is separate the SSIS service and packages from the Database Engine and run them on another machine. I have a few questions on this topic. I searched this forum but couldn't find a concrete answer to them. Forgive me if this has already been asked and answered multiple times.
1. When I install SSIS on "SSIS Machine", do I need to install client components on the same machine as well?
2. I have already established this setup (SSIS with client components on one machine and SQL Server on another), but when I try to connect to SSIS through Management Studio from the SQL Server machine, I keep getting an "Access Denied" error. Is it possible to connect to the SSIS server from another machine (using Management Studio)? I tried the DCOM security permission options I found on the internet (I don't have a domain ID, so I gave "Everyone" full access) but I still get the same error. Any help would be appreciated.
3. Do I need 2 SQL Server licenses (Enterprise) if I go with this environment?
4. Is it possible to configure a SQL job to run SSIS installed on another machine?
I am completely new to SSIS and have been given a large project (of course with a tight deadline) that has the absolute requirement of using SSIS. I am/was very, very good with DTS and could easily accomplish what I need to do with an ActiveX script task in DTS in no time, but as this is new development, we are not to use ActiveX script tasks within SSIS since it will not be supported in the next SQL Server release. I'm thinking script task, but please give some comments on how you would accomplish the following in SSIS (please remember I'm new to SSIS, so don't assume I know anything. )
I must accomplish this: in a nutshell, I need to create separate tab-delimited text files of customer information, one for each region. Each region consists of X amount of states and we have X amount of regions. (Pseudocode followed by standard explanation.)
Select a max value from region lookup table in SQL (this is the # of regions)
for N = 1 to MyMaxValue
    select states from region lookup table where region code = N (the current region we are on)
    'this returns a list of states in a region, need these in array or recordset object or something
    Open an output file, which will be a tab delimited text file that the loop below writes results to (in DTS I would programmatically kick off a transformation task in the package)
    'loop thru states returned, so if in a rs object...
    do while not rs.eof
        execute customer stored procedure, passing as a variable the current state we are on
        'this will return all customers within a state, this whole result set (approx 1 million) needs to go to the tab delimited file
        'I have to execute this stored procedure for each state & then write results to the SAME file, until we are onto a different region
        rs.movenext
    loop
    close file
next
OK, so basically, as you can see, what I need to do is sort of simple in a way; I just have no idea how to go about doing it in SSIS. I cannot hard code any state or region values. I MUST read them in from the lookup tables, as region codes are constantly changing and we are constantly adding new states and new regions, so with the above coding idea it would always dynamically pick up any new states, new regions or changes.
So in a nutshell, I need to create separate tab-delimited text files of customer information, one for each region. Each region consists of X amount of states and there are X amount of regions. Pretty straightforward, huh? The requirements are straightforward, but SSIS is throwing me for a loop... it does not seem flexible enough to be as dynamic as I need it to be to do this. I'm sure it is; it's just that my understanding of it is very basic so far.
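For readers who want to see the control flow above in runnable form, here is a minimal sketch of the nested loop: one output file per region, with every state's customers appended to that region's file. It is not an SSIS package. The lookup table and the customer stored procedure are replaced by plain Python stand-ins, and all names (`export_regions`, `open_region_file`, the sample data) are hypothetical.

```python
import csv
import os
import tempfile


def export_regions(region_states, customers_for_state, open_output):
    """Write one tab-delimited file per region; return rows written per region.

    region_states:       dict of region code -> list of state codes
                         (stand-in for the region lookup table)
    customers_for_state: callable returning customer rows for a state
                         (stand-in for the customer stored procedure)
    open_output:         callable returning an open, writable file for a region
    """
    counts = {}
    for region, states in sorted(region_states.items()):
        with open_output(region) as out:      # one file per region
            writer = csv.writer(out, delimiter="\t", lineterminator="\n")
            written = 0
            for state in states:              # every state goes to the SAME file
                for customer in customers_for_state(state):
                    writer.writerow(customer)
                    written += 1
        counts[region] = written
    return counts


# Hypothetical sample data in place of the lookup table and stored procedure.
lookup = {1: ["NY", "NJ"], 2: ["CA"]}
customers = {
    "NY": [("Ann", "NY")],
    "NJ": [("Bob", "NJ")],
    "CA": [("Cy", "CA"), ("Di", "CA")],
}
out_dir = tempfile.mkdtemp()


def open_region_file(region):
    path = os.path.join(out_dir, f"region_{region}.txt")
    return open(path, "w", newline="")


counts = export_regions(lookup, lambda state: customers[state], open_region_file)
```

Because the regions and states are read from the lookup data on each run, new regions or states are picked up without any change to the loop.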
Please provide your suggestions! I think a lot of newbies would benefit from some SSIS design info: how to do common things in SSIS, but beyond just retrieving a recordset and writing it to a file. What do you do when you need to add just a few layers of decision processing, and to retrieve recordsets and write files based on that decision processing?
Hey, I've a few jobs which call SSIS packages. If I run the SSIS package, it runs fine but if I try to run the job which calls this package, it fails. Can someone help me troubleshoot this issue? None of my jobs that call an SSIS package work. All of them fail.
I've been looking into ways to accomplish a fuzzy search, and SSIS makes that possible if I want to do a bulk import or something like it. But what if I just want to look stuff up at any given time without having to run the package?
Is it possible to expose the fuzzy lookup outside of SSIS to, for example, T-SQL?
Here's an example: I want to lookup the music artist "Notorious BIG" but in the database it is "Notorious B.I.G." if I use the SSIS fuzzy lookup I basically get what I'm looking for. But how would I call this from a web application? So then I tried Full text search but this doesn't really work out as well.
Will I have to re-write the logic that the fuzzy lookup uses to enable it to work? i.e. using Full Text Indexes and FreeTextTable, ContainsTable, SoundEx and the like to somewhat even come close to what the Fuzzy Lookup has?
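To illustrate the kind of on-demand lookup being asked about, here is a small sketch using Python's standard `difflib`. It is not the algorithm the SSIS Fuzzy Lookup uses; it is a plain similarity ratio over normalized strings, and the function names, the cutoff of 0.8, and the sample artist list are all assumptions.

```python
import difflib
import re


def normalize(text):
    """Lower-case and drop punctuation so 'B.I.G.' and 'BIG' compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def fuzzy_lookup(query, candidates, cutoff=0.8):
    """Return (candidate, score) pairs scoring at or above cutoff, best first."""
    q = normalize(query)
    scored = []
    for cand in candidates:
        score = difflib.SequenceMatcher(None, q, normalize(cand)).ratio()
        if score >= cutoff:
            scored.append((cand, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference data
artists = ["Notorious B.I.G.", "Nirvana", "The Notorious Byrd Brothers"]
matches = fuzzy_lookup("Notorious BIG", artists)
```

A function like this can be called from a web application at any time, at the cost of scanning every candidate on each call, so it only suits modest reference lists.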
I am new to using FTP and executing .bat files in SSIS. First, the .ftp file should run. When it runs the FTP process, the user should be prompted for a username and password. Then it should run the .bat file. What steps do I need to follow, and how do I set the properties for the FTP task and the Execute Process task (which holds the .bat file)?
And there is a task (Execute SSIS package) in the first package that calls the execution of the second package.
I am continuously receiving the error "Failed to decrypt protected XML node "PackagePassword" with error 0x8009000B "Key not valid for use in specified state.". You may not be authorized to access this information. This error occurs when there is a cryptographic error. Verify that the correct key is available."
We run the first package from a job; the job completes successfully but logs the above error.
The protection level of second package is set to "EncryptSensitiveWithUserKey"
According to Microsoft, we can cluster the SSIS service, but it is NOT RECOMMENDED. http://msdn2.microsoft.com/en-us/library/ms345193.aspx
This is the situation I have, where I need to understand how SSIS works.
Environment: Active/Active cluster environment for SQL Server, with SSIS installed as a stand-alone default service on both nodes.
Name              Node 1        Node 2
----------------  ------------  ------------
Server name       Nd1           Nd2
SQL server name   cs-nd1in01    cs-nd2in02
SSIS server name  Nd1           Nd2
BTW, this is a consolidated environment, so more than one application is expected to reside on each instance of SQL Server.
The question is about SSIS: what would be the best practice for developing SSIS packages that can work with the above environment?
Scenario: What if Nd1 fails? SQL Server cs-nd1IN01 will fail over to Nd2 and remain available. But what about the SSIS packages? How do they know to use the SSIS service on Nd2 when the one on Nd1 is not available? Does anyone have similar experience setting up SSIS in a cluster environment, but as a non-clustered service?
Database Transfer Services has now been replaced by SMO, as far as I know. In SQL Server 2005 there is SSIS (SQL Server Integration Services). From the Microsoft web page: SQL Server Integration Services, or SSIS, is an engine for building data import and export solutions and performing transformations on data as it is transferred. Please explain SSIS, then. What is the latest technology?
I am having all sorts of access issues when trying to execute a package from a .NET interface (an intranet webform) using packagename.execute. There are millions of posts out there raising this issue (including some on this website), but no one has yet come up with a solution. This is a summary of the matter; I hope some .NET guru comes across this post.
I created an intranet webpage that takes 4 parameters and passes them to an SSIS package. Then I use packagename.execute to execute this package. All this works fine when I run the web page from Visual Studio. But if I configure my local IIS and run it through that, then I get all sorts of access denied messages.
I started off using integrated Windows authentication. I tried all sorts of combinations (on IIS) where I selected digest authentication and Windows authentication, with and without anonymous access, and so on. (I also played around with impersonation.) But I always got stopped at an access denied message, usually about access being denied to NT Authority. It's worth mentioning, though, that if I just access data from a table and display it on the webpage without any package being involved, then it works fine.
From my understanding, the difference between running it from within Visual Studio and local IIS is that Visual Studio's inbuilt IIS uses Kerberos authentication and installed IIS (5.1, 6.0, etc.) uses NTLM by default. I tried changing the default authentication for the inbuilt IIS to Kerberos by fiddling with the metabase. But then I couldn't even get to the webpage; it always denied access no matter how many times I entered my user ID and password. Now, I didn't want to go through this huge documentation (http://www.microsoft.com/technet/prodtechnol/windowsserver2003/technologies/security/tkerbdel.mspx) on Kerberos to fix my problem, so I turned to SQL authentication.
I tried placing the package on the local file system and in SQL Server, but either way I always ended up facing a brick wall due to some sort of access denied message. On the webpage I get a success message for the package execution, but when I check the database I see that the package has not executed at all. Then when I check the log I see these messages:
Error: 18456, Severity: 14, State: 8. Login failed for user '*******'. [CLIENT: <local machine>]
Instead of the stars it shows the actual username (I put in the stars). I tried giving the SQL Server account all sorts of access rights; I set it up as the dbo of all tables in the database. Nevertheless, I was sure even before trying that it wouldn't help, because if I am not executing a package and am just executing stored procedures through that account, then everything works fine.
On another note, does Microsoft just build stuff and throw it out there without even testing? Go figure. Is there anyone who has executed a package through a web page using the installed IIS (5.1 or above) and deployed that web page on a server? Anybody??
Hi, I am in the job market at the moment and keep seeing DTS listed as a requirement in the specs, typically right after "excellent T-SQL Skills". I kind of thought the latter made the former redundant. I don't really use DTS anymore - debugging a complex package is about as excruciating as pulling teeth - and just do it all in SQL. Load the data into staging tables using BULK INSERT and bugger about with it there before loading into the data tables. Am I missing something? Is there some great feature and functionality in DTS (or latterly in SSIS) that I am missing out on? Perhaps I just haven't come across the complex problems these organisations have.... Or is it just another drag and drop GUI that keeps you at arm's length from the application and is therefore less effective? Are DTS advocates the sort that would swear by EM over QA too? I have DTS on my CV since I can use it (and have), but I am nervous that there are perhaps some advanced features I am not familiar with that could catch me out at interview. I know there are some vehement critics of DTS on this forum and some agnostics, but I don't know of anyone that is a real fan. Anyone got any opinions? Ta
if my data is residing at SQL 2005 Express? Okay, here is the scenario. ServerA contains some data but stored under SQL 2005 Express. ServerB contains another set of data and stored under SQL 2005 Standard Edition. I have to export data from ServerA to Excel via DTS manually (which can be saved as dtsx, the SSIS packages) . Can I make use of the resources in ServerB to run the SSIS packages? DTS is good but there are too many steps. :(
I do not have the DBA access to ServerB, what shall I ask the DBA to grant me in order to run this SSIS against the remote SQL 2005 Express sitting on ServerA?
I've written thousands of DTS Packages with thousands of lines of ActiveX code which does thousands of things that would have been really simple in T-SQL.