Integration Services :: Data Synchronization Using Table Partitioning
Jul 14, 2015
We have a requirment, we have different databases in different servers, we need to syncronize the data in data bases in different server. created Partitioned on these tables.
We tried some options:
1) Snapshot replication
2)CDC (Change Data Capture)
I have a transaction table having about 40 crore rows in source. It don't have timestamp and unique key columns. It have only Bill_month and Bill_Year columns. Actually for loading this table into staging I have added a new datetime column by adding default bill_date as 01. Then
* First we delete last 3 month data from staging tables. * Get last 3 months data from source table. * Load that 3 months data from source to staging table.
We do this because we only get update for last three months data. Now I have to include this transaction table as Fact table in DW. What will be the best practice for loading the fact table by picking data form staging table. Also we have to look up with dimensions for Foreign Keys.
* Should I implement the same method of deleting last 3 months records and loading them again.
I have a requirement to load bulk of csv files to sql table. some times, some columns could not come in csv file(some times 100 columns and some times 80 cloumns). That time the package is getting failed. How to create a table dynamically based on csv file structure.
I'm importing comma-delimited text files into a SQL table. The data imports in a seemingly random order. One time I import and the lines appear one way and the next time I import they import another way.
Is there a way to force the text files to import in the same order the data is found in the file?
I've a SSIS 2008 parent/child package solution to manage data transfers between two different data sources, so we can copy multiple tables and capture how many rows were transferred and duration for each transfer. This solution was working fine up until last week, when I made some changes to allow the package to perform a source count using standard SQL determined by an expression, or SQL provided from configuration tables, I also changed the package to Truncate or not the destination table, again controlled by configuration settings in a table. The child packages which perform the data flows have not changed!
The day after the controlling package promotion to live, I saw the bizarre behaviour of the Package log stating all rows transferred, but the actual table counts were not what the log stated, see attached file. The package solution works ok on other servers and was ok in DEV, but there were less tables and rows transferred.Re-running the package gave the same errors, but on some of the same tables and some different ones.
As it is the child packages doing the transfers and nothing has changed in them. I cannot see how the log would be able to say all rows are transferred and yet not all of the rows are actually moved?
Process output - where you can see counts and log Table transfer controller (as txt not dtsx)
An example of the data transfer child packages (as txt not dtsx)When I set the ExecuteOutOfProcess = True the package worked fine, unfortunately, this is not a good solution as SSIS 2008 does not tidy up the Dtshost.exe processes it starts and I'd be left with a memory issue after a very short time, we transfer hundreds of tables each day. ( I could write a .net script in the controlling package to kill the child processes, but that would still have hundreds of processes running before I could end them, as we have three parallel streams to allow a bit better performance.
I've a dataflow task on For Each Loop container at control flow of SSIS package. This For Each Loop container reads the CSV files from the specified location one by one, and populates a variable with current file name. Note, the tables where I would like to push the data from each CSV file are also having the same names as CSV file names.On the dataflow task, I've Flat File component as a source, this component uses the above variable to read the data of a particular file.
Now, here my question comes, how can I move the data to destination, SQL table, using the same variable name?I've tried to setup the OLE DB destination component dynamically but it executes well only for first time. It does not change the mappings as per the columns of the second CSV file. There're around 50 CSV files, each has different set off columns in it. These files needs to be migrated to SQL tables using the optimum way.
Which is the best way to setup the Dataflow task for this requirement?Also, I cannot use Bulk insert task here as we would like to keep a log of corrupted rows.
Using SSIS 2012 (within Visual Studio) on Windows 7.
Before allowing my Data Flow task to fire, I'd like to check the target table (OLE DB Destination) for a specific date value in a specific field. I've seen how the Lookup Task is commonly used to check for dupes before inserting, but I'm not able to use that method because the data value I want to search the table for is contained in a Global Variable (let's say "MyVariableDate").
Is there any way to check for any records in a target table where Date1 = MyVariableDate (i.e. scanning the entire table for any occurrence of MyVariableDate in the Date1 field)?
In my SSIS package I have flat files as a source, I have to load numbers of flat files into SQL target table. I am using For each loop container for that. I am doing it correctly. My aim is to validate the source data from all angle before writing it into target sql table. I am using below points to validate the source data , if I found any bad data I am redirecting those data to error output.
To Checking
1. To check whether data type of column. 2. To check whether buisinesskey column null.
Is there any thing which I am missing to validate source data.
I have a requirement of table partitioning. we have 10 years of data on a table which is 30 billion up rows on 2005 server we are upgrading it to 2014. we have to keep 7 years of data. there is no keys on table or date column. since its a huge amount of data and many users its slow down the process speed. we are thinking to do partition on 7 years for Quarterly based. but as i said there is no date column on table we have to use reference table to get date. is there a way i can do the partitioning with out adding date column on table? also does partition will make query faster?
I have think three ways to do it. 1. leave as it is. 2. 7 years partition on one server 3. 3 years partition on server1 and 4 years partition on server2 (for 4 years is snapshot better?)
We have created SSIS package to load a text file into a table. Source system shares 10 text files and recently they stopped generating data for one of the text file (comping empty), after few months they will start generating the data for the empty file batch processing.
The Issue here is Data Flow task is getting failed while loading empty text file into table. How to handle this empty file load issue in SSIS package.
I am wondering if it is possible to use SSIS to sample data set to training set and test set directly to my data mining models without saving them somewhere as occupying too much space? Really need guidance for that.
I'm using Script Component to load data into Oracle DB due to the poor performance issue. Now, I found it will missing some data during the transmission. Please see the screenshot below:
I setup this package to import data from a Sharepoint list to a SQL Server data table. The primary key of my SQL table is mapped to the Title column of my Sharepoint list. There is a possibility that duplicate values will be entered in the Title field of the Sharepoint list. So when importing data into my table via SSIS, my package always error-out when there it comes across duplicate values. how you others have managed data integrity when importing from a Sharepoint list with the Title column being mapped to the primary key of a table.
I have to value [CreateDate] in the data pump of my Flat File Source into my OLE DB Destination SQL Server Table. With a Variable within the SSIS Package or with a Derived Column task within the Data Flow between the Flat File Source and OLE DB Destination?
Is there a way within analysis services to perform a partition on an automated basis? Not sure if this is necessarily the best forum for my question. Apologies if it falls outside the scope of SQL server.
Please help! I am trying to import data from an ODBC data source to a SQL Server database using Integration Services. I am new to SQL Server 2005 but all was working happily on 2000 using DTS.
I am trying to follow the tutorials using a data flow task but cannot get my ODBC database into the connection managers tab, because OLE DB for ODBC isn't one of the options! Am I missing something? Any help on this would be greatly appreciated as I am struggling to come to terms with 2005 and cannot migrate the 2000 DTS packages
Hi, I have a question regarding the Integration Services Data Types.
From http://msdn2.microsoft.com/en-us/library/ms141036(d-printer).aspx, I found a table that shows me the Mapping of Integration Services Data Types to Database Data Types.
For example, how the DT_BOOL Data Type maps to bit for SQL Server.
In this case, I am okay, as I know exactly what the mapping is, however, for some of the datatypes, I do not.
Here is an example. The DT_CY datatype maps to smallmoney and money ... how do I know which one to map to? For me, which one I map to does indeed matter because their representation is different.
DT_NUMERIC maps to decimal and numeric ... this one does not matter as much
DT_STR/DT_WSTR ... I need to know whether its char, varchar, ncahr, or nvarchar for padding purposes mostly.
I am having a requirement where I need to load the correct data into the target table and needs to save the bad data for analysis, how can I do that in SSIS.
Hi friends , Can any buddy tell me how can i update a particular table by integration services.... I just need to update some of column value
if i write query ...i need to write approx 35 update statement (Query) So is there is any way by which i can replace existing data to my current data .
I am having one store procedure which use to load data from flat file to staging table dynamically.everything is working fine. staging_temp table have single column.all the data stored in that single column below is the sample row.
after the staging_temp data gets inserted into main table.my probelm is to handle such a file where number of columns are more than the actual table.if you see the sample rows there are 4 column separated by "¯".but actual I am having only 3 columns in my main table.so how can I get only first 3 column from the satging_temp table.output should be like below.
It's SQL 2008 R2. I need to bring data from Oracle using .Net Providers/ODBC Data Provider to MS SQL table converting Oracle UTC dates to PST. The source connection type cannot be changed as it's given. For the Destination I'm using the OLE DB.
As the truncate all and load could take time I'm trying to use a temp table or a variable to use it further with t-sql merge or not exists to bring/add the only new records to the destination table.
I'm trying different scenarios that is all failed.
Scenario A:
1. In DTF after OLE DB Source I'm using the Derived Colum to convert dates. It's working well.
2. Then use Recordset Destination with an object variable User::obj_TableACD. It's also working well.
3. Then I created a string variable with a simple query that I could modify later "select * from " + (DT_WSTR,10)@[User::obj_TableACD] trying to get data from the recordset object variable but it's not working.
Scenario B:
1. Created a store procedure to create a temp table.
2. Created a string variable to execute SP str_CreateTempTable: "EXEC dbo.TempTable". It's working well with the SQL Task with SQLSourceType as Variable.
3. Then how to populate the temp table from the Oracle source to bring data into the Destination?
How do I add only new rows to a destination table (when copying a table from another database every night) ?Every night I am copying a number of tables from one database to another.I only want to insert news rows (that are not in the destination table, but are in the source table) to the destination table.I might normally drop the destination table and just copy over the whole table, but in this case rows can be deleted from the source table, but I want to keep these old rows in the destination table (to maintain history). So I only want to add in rows to the destination table that have been added to the source table since last time.I guess I could copy the whole of the source table to a temporary table in the warehouse, then use a T-SQL merge command to compare and just add new rows to the destination table- but suspect that this is not the best way.
I'm using - Destination - Oracle driver - oraOLEDB.Oracle.1 (native ole dboracle provider for ole db)
Source - SQL driver - microsoft ole db prover for sql server. I want to import data from sql server to oracle. Challenge is, I have 1 million records on oracle. I have 100 records on sql server (these 100 records count will change daily). So, I thought of using 'lookup' task looking taking record from ms sql and fetch corresponding record from oracle. But when I use lookup, all records from oracle are loading into cache, which is taking approx 3 hrs.
I have a requirement to compare data between two tables in SQL Server.
What is the fastest way to do it using SSIS? There are approx 6~7 millions of records in each table.
My solution: Read both the tables and store the data in Object Type variable. Then run an except query. But I am stuck at except query part. How do I implement it?