Hi
I am new to SSIS, so what I am asking is probably simple, but I don't know how to do it. I have a requirement: my data source is an Excel file and the destination is SQL Server. The Excel file contains around 10k unnormalized rows that should be loaded into multiple normalized master and child tables in the destination, with primary keys (identity columns) and foreign key constraints. Sometimes the package should also check a destination lookup table for a match between a source (Excel) code column and the lookup table's code, get the lookup table's primary key, and put it into the destination table. Please help me find a way to solve this.
I have a "source" table that is being populated by a DTS bulk importof a text file. I need to scrub the source table after the importstep by running appropriate stored proc(s) to copy the source data to2 normalized tables. The problem is that table "Companies" needs tobe populated first in order to generate the Identity ID and then usethat as the foreign key in the other table.Here is the DDL:CREATE TABLE [dbo].[OriginalList] ([FirstName] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,[LastName] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,[Company] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[Addr1] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[City] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[State] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[Zip] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[Phone] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL) ON [PRIMARY]GOCREATE TABLE [dbo].[Companies] ([ID] [int] IDENTITY (1, 1) NOT NULL ,[Name] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL) ON [PRIMARY]GOCREATE TABLE [dbo].[CompanyLocations] ([ID] [int] IDENTITY (1, 1) NOT NULL ,[CompanyID] [int] NOT NULL ,[Addr1] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[City] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[State] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[Zip] [varchar] (10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,[Phone] [varchar] (14) COLLATE SQL_Latin1_General_CP1_CI_AS NULL) ON [PRIMARY]GOThis is the stored proc I have at this time that does NOT work. Ituses the last Company insert for all the CompanyLocations which is notcorrect.CREATE PROCEDURE DataScrubSP ASBegin Transactioninsert Companies (Name) select Company from OriginalListIF @@Error <> 0GOTO ErrorHandlerdeclare @COID intselect @COID=@@identityinsert CompanyLocations (CompanyID, Addr1, City, State, Zip) select@COID, Addr1, City, State, Zip from OriginalListIF @@Error <> 0GOTO ErrorHandlerCOMMIT TRANSACTIONErrorHandler:IF @@TRANCOUNT > 0ROLLBACK TRANSACTIONRETURNGOThanks for any help.Alex.
I have an allergy table which has a PatientID and an AllergyID. I would like to create a view (or SQL statement) that will give me a crosstab of each patient and their allergies (like below).
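For reference, one way to get that kind of crosstab is a CASE-per-allergy pivot grouped by patient. A minimal sketch, assuming a table named PatientAllergies(PatientID, AllergyID) and a known, fixed set of allergy IDs (both names and IDs are guesses):

SELECT PatientID,
       MAX(CASE WHEN AllergyID = 1 THEN 'Y' ELSE '' END) AS Allergy1,  -- one column per known AllergyID
       MAX(CASE WHEN AllergyID = 2 THEN 'Y' ELSE '' END) AS Allergy2,
       MAX(CASE WHEN AllergyID = 3 THEN 'Y' ELSE '' END) AS Allergy3
FROM PatientAllergies
GROUP BY PatientID;

If the set of allergies is open-ended, the column list would have to be built dynamically instead.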
I get the following error on my OLE DB Destination: column "Oprcode" cannot convert between Unicode and non-Unicode string data types. Please assist on what I should do!
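The usual options are a Data Conversion transformation in the data flow, or casting in the source query so the column already arrives with the matching type. A sketch of the latter, assuming Oprcode is varchar in the source and the destination column is nvarchar (the length and table name are guesses):

SELECT CAST(Oprcode AS nvarchar(50)) AS Oprcode   -- forces a Unicode (DT_WSTR) column in the pipeline
FROM dbo.SourceTable;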
My OLE DB Source and Excel destination values are all assigned at run time. The package works at design time, but at run time the columns are different, and that is why it does not work.
Here is what I want to accomplish: I have a table which contains all my reports, which need to be dumped to Excel at month end.
An Execute SQL Task loads the records (one per report); a Foreach Loop container with an ADO enumerator then takes each record, creates the Excel file on the fly using one of the variables from my table, and uses a stored procedure to dump data to Excel via a Data Flow Task.
Does this mean that for 10 reports I have to create 10 different Data Flow Tasks, or can it be done with one Data Flow Task whose columns change at run time?
I am building an SSIS package for 3 different files that have identical schema and mapping logic.
In my OLE DB Source (object name "OLEDBSource_SourceTable") the data access mode is "Variable name". As soon as I switched to this data access mode it started to give me an error:
[OLEDBSource_SourceTable [1]] Warning: The external metadata column collection is out of synchronization with the data source columns.
The column "DEAL_NUM" needs to be updated in the external metadata column collection. The "external metadata column "DEAL_NUM_Flag" (34529)" needs to be removed from the external metadata column collection. The "external metadata column "recordID" (33740)" needs to be removed from the external metadata column collection.
I want to copy 2 columns from one database to another. I managed to do this using an OLE DB source and an OLE DB destination.
Now I want to merge 2 columns into 1. Source database: Column A: first name, Column B: last name. Destination database: Column 1: customer first and last name.
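One way is to do the concatenation in the source query itself (a Derived Column transformation with an equivalent expression would also work); a sketch with guessed table and column names:

SELECT FirstName + ' ' + LastName AS CustomerFullName   -- merged into the single destination column
FROM dbo.Customers;                                     -- source table name is hypothetical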
Hello, what I'm trying to accomplish is to have variables named "SourceTable" and "DestinationTable". For each SourceTable, the DestinationTable will have the same columns. All I need is to auto-map these columns between source and destination via code. Is that possible?
Given the following scenario, what kind of performance should be expected in transferring about half a million rows? We are seeing about a 9 minute execution time. Is this reasonable for about 460,000 records moving from source to target, with 3 inner joins from the source?
Source: ServerA.OLTPDB
Target: ServerA.DataMartDB
Server A is running SQL Server 2000. SSIS is running on a different machine, Server B.
The reason for this is that we are distributing the SSIS package for use with a BI product built on SSAS 2005 and the requirements are such that the client could very well have the source OLTP database on a different physical machine than the data mart.
My understanding is therefore that:
1. SSIS will do all of the heavy lifting on Server B.
2. Even though OLTPDB and DataMartDB are on the same server, it is expensive for Server B to pull the records from Server A and then send them back to Server A, but with SSIS being on a different machine, this is inevitable.
3. In the OLE DB source adapter, specifying a table or view has the effect of an OPENQUERY command, whereas a SQL command with straight SQL will be executed on Server B, so the former would be somewhat more performant.
I have searched the SSIS forum for an answer to my question, and I think I have found my answer, but what I have read does not directly answer my question.
I am attempting to create an SSIS package that imports data from a Visual FoxPro table into a SQL Server table. From what I have read, the source and destination tables must match column for column. Is this correct? My SQL Server table has a few more columns in it. However, I consistently get "Cannot create connector. The destination component does not have any available inputs...blah".
From what I have read, it sounds like the reasoning is that my source and destination tables do not match.
Is this correct, or am I barking up the wrong tree?
I am having a brain-fart or something. I need help!!!
I'm developing a transform that selects data via a SQL query, transforms that data, and then needs to update the data back into the same source table. The transform is working great, but I can't figure out the update part. I've got it adding the transformed records to the table, but that's not what I need to do; I need to update the table with the changed data.
Any sample code or blog links would be greatly appreciated!
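For reference, a common pattern here is to land the transformed rows in a staging table from the data flow and then run a single set-based UPDATE in an Execute SQL Task, rather than updating row by row. A sketch with made-up table and column names:

UPDATE t
SET t.Amount = s.Amount,            -- columns changed by the transform
    t.Status = s.Status
FROM dbo.SourceTable AS t
JOIN dbo.SourceTable_Staging AS s
    ON s.SourceTableID = t.SourceTableID;   -- key linking staged rows back to the source rows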
I want to process data from a source table to a destination table without using a cursor.
At this point, we are creating a temp table and inserting data from the source into it; once the data is in the temp table, we use a WHILE loop to process the records one by one into the destination table.
While executing the stored procedure, we noticed that there are a few invalid records in the source table, and the stored procedure terminates on such records.
Is there a better approach to log the INVALID data and resume processing with the next record instead of terminating?
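One set-based approach is to split the work into two statements: log the rows that fail validation into an error table, then insert only the rows that pass, so no WHILE loop or cursor is needed. A sketch with hypothetical table and column names and a made-up validation rule:

-- 1) Log the invalid rows with a reason
INSERT INTO dbo.InvalidRows (SourceID, Reason)
SELECT s.SourceID, 'Amount is NULL or negative'
FROM dbo.SourceTable AS s
WHERE s.Amount IS NULL OR s.Amount < 0;

-- 2) Load only the valid rows into the destination
INSERT INTO dbo.DestinationTable (SourceID, Amount)
SELECT s.SourceID, s.Amount
FROM dbo.SourceTable AS s
WHERE s.Amount IS NOT NULL AND s.Amount >= 0;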
I want to write an SSIS package that picks up the source and destination tables at run time. Is that possible? We have an SSIS package that is used to pull data from Oracle, but the source and destination table names change.
I have read-only permission in the source (OLTP) database. The source database is running on SQL Server 2000 and its size is more than 200 GB. I need to pull data from the source and load a target that is running on SQL Server 2005.
Following are the objectives I want to achieve.
Data should be loaded on an incremental basis.
Whatever changes take place in the source (updates/deletes) should be replicated to the data already uploaded.
Here I want to mention that the source database does not have any identification key or timestamp column, such as an Updated_Date, by which I can filter the data recently inserted or updated in the source and upload just that. The source does not maintain any history data either, so I have no track of deleted records.
I don't have any scope to change the schema in the source. In this scenario, can anybody suggest the best approach to achieve the above objectives?
Can I retrieve only the recently updated or inserted data from transaction log backups? Can log shipping provide the solution?
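For what it's worth, with no timestamps or audit columns, one common approach is a full-compare load: stage the whole source extract on the target server and use set operations to find inserts/updates and deletes. A rough sketch with hypothetical names; EXCEPT needs SQL Server 2005, so the comparison would run on the target side, and it still means reading the full source each time, which may be painful at this size:

-- Rows in the fresh extract that are new or changed compared with the target
SELECT * FROM stg.Customers
EXCEPT
SELECT * FROM dbo.Customers;

-- Keys in the target that no longer exist in the source: candidates for delete
SELECT t.CustomerID
FROM dbo.Customers AS t
WHERE NOT EXISTS (SELECT 1 FROM stg.Customers AS s WHERE s.CustomerID = t.CustomerID);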
One more question. Say I have a table and I am exporting/importing all the data from/to my target table using SSIS or DTS. In this scenario, does using a query, rather than selecting the table directly, affect performance?
I need to write a T-SQL script that will insert, update, and delete rows in a production DB from new, updated, and deleted data in a staging DB. They are on different servers and I can only use T-SQL. Basically, the production DB needs to be synchronized with the staging DB.
I was thinking of using dynamic SQL, cursors, and linked servers. Any ideas?
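Since the servers are different, a linked server plus three set-based statements (no cursors needed) is one way to do it. A sketch with made-up names, where STAGINGSRV is a hypothetical linked server and CustomerID the key:

-- New rows: in staging but not in production
INSERT INTO dbo.Customers (CustomerID, Name)
SELECT s.CustomerID, s.Name
FROM STAGINGSRV.StagingDB.dbo.Customers AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Customers AS p WHERE p.CustomerID = s.CustomerID);

-- Changed rows (add NULL handling and further columns as needed)
UPDATE p
SET p.Name = s.Name
FROM dbo.Customers AS p
JOIN STAGINGSRV.StagingDB.dbo.Customers AS s ON s.CustomerID = p.CustomerID
WHERE p.Name <> s.Name;

-- Deleted rows: in production but no longer in staging
DELETE p
FROM dbo.Customers AS p
WHERE NOT EXISTS (SELECT 1 FROM STAGINGSRV.StagingDB.dbo.Customers AS s WHERE s.CustomerID = p.CustomerID);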
Hello everyone. Please tell me how I can check whether a record coming from the source is new or changed.
NOTE: my source is a flat file and the destination is an Oracle table.
What I need is a history load (Type 2). This is not possible through the SCD component in Integration Services if my source is Oracle (even after I added the parameters in my OLE DB connection). Please advise me about this process. It is very important.
I want to compare the source and destination data. My source is an OLE DB source and my destination is a SQL Server destination. I want to compare the data in the destination with the source, with validation such as: if a row exists in the destination but not in the source, delete that row; if a row exists in the source but not in the destination, insert that row; and, comparing source with destination on the primary key, if any column of a row has changed, update it, otherwise move on to the next record. Please tell me which objects are suitable and what to do. Please help me out. Thanks, Sandeep
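If both tables are reachable from the destination server (directly or via a linked server) and it is SQL Server 2008 or later, an Execute SQL Task running a MERGE covers all three cases in one statement. A sketch with hypothetical table and column names:

MERGE dbo.TargetTable AS t
USING dbo.SourceTable AS s
    ON t.Col1 = s.Col1                           -- primary key
WHEN MATCHED AND t.Col2 <> s.Col2 THEN           -- add NULL handling if Col2 is nullable
    UPDATE SET t.Col2 = s.Col2                   -- changed rows
WHEN NOT MATCHED BY TARGET THEN
    INSERT (Col1, Col2) VALUES (s.Col1, s.Col2)  -- rows only in the source
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;                                      -- rows only in the destination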
I am new to SSIS; can you help me out with this? I have a source (flat file) and a target in SQL Server 2005. The source has 2 columns, Col1 and Col2; the target also has 2 columns, with Col1 being the PK. I have created a package that checks the target: if the record exists it updates it, and if it does not, it inserts it. My package looks like this:
Source - Lookup on target - Conditional Split - Derived Column, and then 2 OLE DB destinations: 1 for inserts and 1 for updates.
In the Lookup I created a relationship between Col1 from the source and Col1 from the target, with Col2 as the lookup column, and connected the red (error) output to one OLE DB destination for inserts, set to redirect rows. The green output I took to the Conditional Split, with a condition like: Col2 from the source is not equal to the lookup Col2. When I run the package it inserts new records but does not update.
I tried to add a Derived Column after the Conditional Split, but I am stuck writing an expression that updates Col2 records when they have changed at the source. Can you help me out in this scenario? I would appreciate it if you could get back to me ASAP.
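For the update path, the usual approach is not a Derived Column but an OLE DB Command transformation on the Conditional Split's "changed" output; it runs a parameterized statement per row, something like the following (the target table name is a guess, and the ? parameters are mapped to the pipeline's Col2 and Col1):

UPDATE dbo.TargetTable
SET Col2 = ?     -- mapped to the source Col2 value
WHERE Col1 = ?;  -- mapped to the key Col1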
I am a complete newbie to SSIS. I can create a simple package to transfer data between SQL instances, and that's about it.
I have TableA (source data) and TableB (destination data). TableA has 4 columns and TableB has 5. I want to transfer all of the columns from TableA into TableB, but the 5th column in TableB needs to be populated with the server instance name of the server TableA sits on. Do I need multiple data sources to achieve this? I have tried, but no matter how I set it up, the column in the destination is set to Ignore.
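One data source should be enough if the extra value is added in the source query itself (a Derived Column fed from a variable would also work); a sketch, with the TableA column names guessed:

SELECT Col1, Col2, Col3, Col4,
       CAST(SERVERPROPERTY('ServerName') AS nvarchar(128)) AS SourceServerInstance  -- instance TableA lives on
FROM dbo.TableA;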
My company is moving to a SQL Server-based packaged application early next year. We're planning our SQL Server architecture but have some questions that I can't readily find answers for. I'm hoping someone here can point me in the right direction.
We have three servers; I'll call them A, B, and C. We want to duplicate all changes to certain databases on server A to server B, then duplicate changes to selected databases and tables on server B to server C.
Ideally we'd run SQL Server 2005 Enterprise Edition on all three servers, but the packaged application vendor does not support SQL Server 2005 yet, only SQL Server 2000. Our license agreement with them does not allow us to use replication on server A. We're free to do whatever we want on our other SQL Servers, but server A must sit alone, untouched, like a monolith on a far-away moon. (I'm lobbying to have the server named Tycho, or TMA2.) Stranger still, they're OK with log shipping from server A to other servers. We've tried to explain that replication and log shipping are both core functions built into SQL Server, and that if one is acceptable, then both should be. Their fear is that replication could cause performance and stability problems, and to eliminate this possibility they're ruling out replication on server A.
Given these constraints we're resigned to using SQL Server 2000 Enterprise Edition on servers A and B, and SQL Server 2005 Enterprise Edition on server C. We plan on periodically shipping logs from server A to server B and applying them at server B.
We'd like to know if it is possible to also use transactional replication on server B to duplicate changes from server B to server C. I've used log shipping and replication in the past, but never at the same time. My understanding is that a database goes into recovery mode while a transaction log is being applied, and that any user changes to the database after the log has been applied will cause later log applications to fail. The scripts I've seen that are used to apply the transaction logs put the database into single-user mode after the log has been applied to prevent this.
This raises a few questions:
If we try to RESTORE a log to a database being used as a source for transactional replication articles, will the RESTORE fail? Or will the RESTORE start and break the transactional replication? I'll test this on my own, but it'd be nice to know if anyone has already experienced this.
Is it possible for us to have a database in read-only mode serve as the source for transactional replication articles? (I can't imagine why not, even though it seems counter-intuitive: why would you want to replicate transactions from a database that has no transactions?)
If the answer to number two is yes, can we suspend transactional replication on a database, RESTORE a log to the database, put the database into read-only mode after the RESTORE, and restart the replication on the database? Thanks in advance for sharing your wisdom, everyone!
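For reference on the read-only point: restoring with STANDBY (rather than NORECOVERY) is the usual way to leave a log-shipped database readable between restores, e.g. (database name and paths are made up; whether replication tolerates this is exactly the open question above):

RESTORE LOG WarehouseDB
FROM DISK = N'D:\LogShip\WarehouseDB_tlog.trn'
WITH STANDBY = N'D:\LogShip\WarehouseDB_undo.dat';   -- leaves the database read-only rather than "in recovery"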
Inside a Data Flow Task, I have an OLE DB source and destination. In my situation, I need to pull data from a table in the source, but also hard-code some columns myself, which means my source is a blend of table data and hard-coded data that will then have to be mapped to columns in the OLE DB destination. Does anyone know which option to choose in the OLE DB source's data access mode dropdown? Keep in mind, I need to run a select query as well as get data from a table. Is it possible to use multiple OLE DB sources and connect them to one destination? That is really what I intend to do here, but I am not sure how it will work, or even if it's possible. Basically my source access mode needs to be a blend of SQL command and table columns; how would that be implemented? Any help or advice is appreciated.
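A single OLE DB source in "SQL command" mode usually covers this, because the hard-coded values can simply be literal columns in the query (a Derived Column transformation is the other common option); a sketch with invented names:

SELECT t.CustomerID,
       t.Amount,
       'BATCH01'  AS BatchCode,   -- hard-coded constant column
       GETDATE()  AS LoadDate     -- derived value added alongside the table data
FROM dbo.SourceTable AS t;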
I am bringing a date from an OLE DB source (SQL command) as [select cast(convert(varchar,getdate(),101) as varchar(10))] to a Flat File destination (CSV file). The data in the source shows as "1/1/2007", which is how I need it to display in the file. The flat file defaults to showing the data as "1/1/2007 00:00:00". I changed the destination field to a string and I am still getting the timestamp. I tried using a Data Conversion transformation, and I am getting garbage data when converting the date to a string.
Can anyone give me insight on how to populate a date into a comma-delimited file as "1/1/2007", not "1/1/2007 00:00:00"?
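For what it's worth, a source query along these lines should arrive in the pipeline as a string (DT_STR) rather than a date, as long as the flat file connection manager also defines that column as a string; style 101 gives mm/dd/yyyy, and the alias is made up:

SELECT CONVERT(varchar(10), GETDATE(), 101) AS LoadDateText;   -- e.g. '01/01/2007', already a string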
I have an SSIS package which calls a web service and returns a Dataset object in the form of an xml file (it does this successfully - and stores it successfully).
The Data Flow then picks up the file using an XML Source task (in which "use inline schema" is specified), and passes it to a SQL Server Destination task for bulk load.
I've checked the mappings and everything seems to be ok, the package gives no warnings or errors when run.
The destination table is created on the SQL Server without problem, but at that stage the task finishes "successfully", without actually loading any table rows into the destination table.
Also in the SQL Server Destination task - when I select "preview" in the connection manager section - I can see the column definitions, but again - no rows of data.
Thanks to Simon Sabin for pointing me here - I did post on the managed msdn newsgroups in sql server programming but have had no replies as yet.
Below is the beginning of the xml Dataset object which is output as an xml file for information. If you need any more information just let me know.
Not quite sure where I'm going wrong here - thanks in advance.
I have implemented a simple package that has only a Data Flow Task in the Control Flow. The corresponding Data Flow Components are as follows (in the correct order):
1. A Flat File Source adapter which parses a file containing circa 1 million rows.
2. A Script Component that generates IDs for two of my destination tables.
3. A Multicast Component that distributes all the columns retrieved from the source file as well as the 2 generated keys.
4. Four OLE DB Destination adapters where I insert the data from the different columns together with the generated keys as primary or foreign keys respectively.
My questions are mainly about how the "Flat File Source" and "OLE DB Destination" work behind the scenes.
Questions: A. "Flat File Source" adapter: Does the "Flat File Source" adapter load all the rows from the source file into the memory and then start pushing the rows one by one into the data flow? My concern is actually about the huge sizes of my import files, which might eat up the memory. Or does the "Flat File Source" adapter load a certain number of rows, pushes them into the pipeline and then fetches the next batch based on a certain configuration?
B. "OLE DB Destination": I have set the four OLE DB destinations to "Fast Load" with a commit size of 5,000. Can I ever run into the problem of having for instance 10,000 records inserted in Table1, but only 5,000 inserted in Table2. The error case I am thinking about is as follows: After the second commit on Table1, an erroneous record occurs during the insertion in Table2. Thus, the second commit for Table2 gets rolled back and the whole package fails. In this case, I get an inconsistent state in my database. In other words, how can I synchronize the commit operation in all my destination tables, s.t. they either all commit the same batch or all do not commit it?
I'm trying to build a DTSX package that FTPs an XML file to the local file system and then imports it into an existing table.
My Data Flow for the package starts with an XML Source component, then goes to a Data Conversion component, and then to an OLE DB Destination component.
It all executes without error and seems to work fine, but when I look at the data in the table after I've run the package, it has inserted the appropriate number of rows from the XML file but all of the column values are NULL.
All of the data in the XML file is surrounded by <![CDATA[ ]]> and I discovered that if I remove the CDATA wrapper by hand then it inserts the data properly. The only problem is that I'm not in a position to have the data provider remove the CDATA tags and some of the data in the XML file needs the CDATA wrapper or else it will not validate.
Anyone know of anything I can do to get the CDATA to import properly?
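One possible workaround, if an Execute SQL Task is an option, is to hand the file to SQL Server's xml type and shred it there, since value() returns the content of CDATA sections as plain text. A rough sketch; the file path and element names are invented, and the cast assumes the file's encoding is acceptable to the xml type:

DECLARE @x xml;
SELECT @x = CAST(BulkColumn AS xml)
FROM OPENROWSET(BULK N'C:\Import\feed.xml', SINGLE_BLOB) AS f;   -- hypothetical path

SELECT r.value('(Name)[1]',  'nvarchar(100)') AS Name,            -- element names are guesses
       r.value('(Descr)[1]', 'nvarchar(max)') AS Descr            -- CDATA content comes back as plain text
FROM @x.nodes('/Root/Row') AS t(r);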
I am trying to import data into SQL Server 2008 using Management Studio. The source data is an Access database. I am trying to do this with queries, as the tables do not match, but I do need to copy specific columns from the source to the destination. Could someone give a brief example of selecting a column from the source table and just entering a dummy value for the other columns (a column of data needed in the destination table does not exist in the source table)?
For example:
Source: Access database T2dbase, table Employee, just two columns, no primary key.
New (destination) table:
First Name, Description
Tom, manager
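A sketch of the kind of query that could go into the import, assuming [First Name] exists in the source and Description has to be filled with a constant (names taken from the example above):

SELECT [First Name],
       'manager' AS Description   -- dummy/constant value for the column missing from the source
FROM Employee;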
Hi all, I have a Unicode flat file source with these fields:
- DT_WSTR (100), originally DT_STR(100)
- DT_WSTR (100), originally DT_STR(100)
- DT_NTEXT
- DT_WSTR (20), originally DT_DBTIMESTAMP
- DT_WSTR (5), originally DT_BOOLEAN
I exported a query result to a file (see above) as a Unicode TXT destination.
OK, now I must re-import it into another DB, and here is my difficulty: the DT_NTEXT field is HTML code and I always get this error: [Flat File Source [1050]] Error: The column delimiter for column "scheda" was not found. The "scheda" field is the DT_NTEXT one.
In the connection manager's Advanced tab I set up my fields, setting them all to Unicode string [DT_WSTR] with varying lengths, but I also tried defining each one as the right type for the SQL destination, like:
Whatever I do I see no alert message and everything seems to be good, but when I try to execute I always get the same error... So I hope someone can help me.
Here are the first lines of my Unicode TXT source file:
"codven" "manufacturer" "scheda" "last_modified" "modificata"
"CDGI2120" "Altri" "<datasheet><section ncellmax="1" id="1"><row order="1"><cell><![CDATA[Combat possiede mitragliatrici per intraprendere battaglie testa a ~testa del genere "spara o sei finito" in mezzo a territori ~butterati di crateri su carri armati del 23esimo secolo.~Caratteristiche:~* Cinque modalita' di gioco~* Tre tipi di carri armati~* Partita singola o in multiplayer~* Grafica in 2D, 3D]]></cell></row></section></datasheet>" "2007-12-11 13:02:26.290000000" "1"
"CDGI2586" "Disney Interactive" "<datasheet><section ncellmax="1" id="1"><row order="1"><cell><![CDATA[Entra con Tigro ed i suoi amici nel meraviglioso Bosco dei 100 Acri aiutalo a cercare il miele nella natura incantata di questo fantastico mondo! ~~Il giocatore vestirÓ i panni di Tigro, il simpatico e buffo amico di Winnie The Pooh, il quale dovrÓ raccogliere quanto pi¨ miele possibile, per rendere la festa di Winnie qualcosa di veramente speciale!!!]]></cell></row></section></datasheet>" "2007-12-11 13:02:26.290000000" "1"
After:

If (Row.RegenerateXML_IsNull()) Then
    Command.Parameters.AddWithValue("@RegenerateXML", System.DBNull.Value)
Else
    Command.Parameters.AddWithValue("@RegenerateXML", Row.RegenerateXML)
End If
Which is great, but it does turn 1 line into 5 lines. I have 20 parameters, which take up 20 lines of code, which will now be 20 x 5 = 100 lines of code.
My question is :
Is there a better way to write this code without having to take up so many lines?
I have a data task with the following requirements:
1) Run a query against the database to retrieve rows.
2) Add a header and footer row to the result set. The footer row must contain a count of the records.
3) Write the rows to a fixed-width file if there were any data rows.
I have got to the point that I can create the file (using a set of tasks that includes derived columns, sorts, aggregation and merges). However the file is created regardless of whether there were data rows returned. I can't check the row count before proceeding as this isn't set until the data task ends. And if I try to split them into separate data tasks (so that I can access this variable and perform conditional execution) it becomes harder to access the original rows.
Do you have any recommendations on the best way to achieve this? It all seems to be very complex and I'm starting to feel that it would be easier to do this outside of SSIS... Please help me to keep the faith!
For those interested this is a slightly simplified version of what I have so far (all within a single data task):
1. Run dummy SQL to create the header row
2. Run main SQL to retrieve the rows
3. Multicast
4. Create the footer row by doing a sum() in an Aggregate task
5. Merge body and footer
6. Merge header with body and footer
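An alternative that sidesteps the multicast/merge plumbing is to build header, body and footer in the source query itself, so the data flow just writes lines out. A sketch with invented column names; SortKey keeps the header first and the footer, with its record count, last:

WITH Body AS (
    SELECT CustomerID, Amount                                   -- hypothetical data columns
    FROM dbo.SourceTable
)
SELECT 1 AS SortKey,
       'HDR ' + CONVERT(char(8), GETDATE(), 112) AS Line        -- header with run date
UNION ALL
SELECT 2,
       RIGHT('0000000000' + CAST(CustomerID AS varchar(10)), 10)
     + RIGHT(SPACE(12) + CAST(Amount AS varchar(12)), 12)       -- crude fixed-width padding
FROM Body
UNION ALL
SELECT 3,
       'TRL ' + CAST((SELECT COUNT(*) FROM Body) AS varchar(10)) -- footer with record count
ORDER BY SortKey;

The row count could also be captured into a variable by an Execute SQL Task beforehand, with a precedence constraint skipping the data flow when it is zero, which would cover the "only create the file if there were data rows" requirement.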