We are on SQL Server 2005. Does anyone have an SSIS package developed for automating a database refresh?
Say we want to do a weekly refresh from prod to staging.
I want to use that as a model template for developing my own package.
I would need one package to refresh one specific database
and one for the user databases.
MOST IMPORTANT: dev permissions should be restored after the production data is synced to staging.
Secondly, I want to create a job based on the SSIS package to run on a schedule.
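I don't have a ready-made package to share, but the core of such a refresh is usually just a couple of Execute SQL Tasks. Below is a minimal T-SQL sketch of what those tasks could run, using hypothetical names (ProdDb, a backup share, a dev user devuser1 and a role db_devrole); the re-grant at the end covers the 'MOST IMPORTANT' part, because a restore overwrites staging's users and permissions along with its data.

-- 1) on the production server: take a full backup to a share the staging server can read
BACKUP DATABASE ProdDb
TO DISK = N'\\StagingServer\Refresh\ProdDb.bak'
WITH INIT;

-- 2) on the staging server: restore over the existing staging copy
RESTORE DATABASE ProdDb
FROM DISK = N'\\StagingServer\Refresh\ProdDb.bak'
WITH REPLACE,
     MOVE 'ProdDb_Data' TO N'D:\Data\ProdDb.mdf',
     MOVE 'ProdDb_Log'  TO N'E:\Log\ProdDb_log.ldf';

-- 3) re-apply the dev permissions the restore wiped out (names are placeholders)
USE ProdDb;
EXEC sp_change_users_login 'Auto_Fix', 'devuser1';   -- re-map the orphaned dev user
EXEC sp_addrolemember 'db_devrole', 'devuser1';
GRANT EXECUTE TO devuser1;

Once the package works, scheduling is just a SQL Server Agent job with a step of type 'SQL Server Integration Services Package' pointing at it.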
I have a database db1 on server1 and server2. The DB on server1 is a production DB and the DB on server2 is a staging DB. All the new data comes into the production DB, and I want to update the data and database structures on the staging DB from the production DB on a weekly basis. How can I reflect the data and data structures of my production DB on my staging DB?
Hi all, I've been assigned the task of refreshing data from the production env to the development env. What I got is a backup file of a DB in the prod env; I now need to bring that into the development env. I can restore it to the dev env no problem, but the warning I got is that the table owner in the prod and dev envs needs to be different, that is, the owner is A in the prod env and B in the dev env. So I need to: 1) restore the DB in the dev env, 2) change the table owner from A to B, 3) change the related triggers, 4) change the related views. Can anyone suggest the most efficient approach? Thanks a lot.
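A minimal sketch of step 2, assuming the dev owner/schema B already exists: this loops over the user tables owned by A and reassigns them with sp_changeobjectowner. The change carries the triggers and permissions along with the table, but any views or triggers whose definitions hard-code A.TableName would still need to be altered individually (or rebuilt from script with the owner replaced).

DECLARE @sql nvarchar(max);
SET @sql = N'';

-- build one sp_changeobjectowner call per table currently owned by A
SELECT @sql = @sql + N'EXEC sp_changeobjectowner ''A.' + t.name + N''', ''B'';' + CHAR(13)
FROM sys.tables t
JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE s.name = 'A';

EXEC sp_executesql @sql;

On SQL Server 2005 and later, ALTER SCHEMA B TRANSFER A.TableName does the same job and is the non-deprecated form.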
Hi, I have a production table in SQL Server 2005 that has approximately 500,000 records; every 6 hours this table needs to be truncated and refilled.
The basic SSIS package uses a Script Component and imports the data into a staging table which has the same structure as the production table. I have a final Execute SQL Task that truncates the production table and does an INSERT INTO production SELECT * FROM the stage table.
It takes around 30 seconds to run this last Execute SQL Task. The problem is there is a risk that our web service will query this table during the Execute SQL Task and return incorrect results.
Q1: In this last Execute SQL Task, if I used a BEGIN TRANSACTION and COMMIT TRANSACTION, would this be any quicker?
Q2: In this last Execute SQL Task, would it be better to use a rename-table technique in T-SQL? Any code examples?
Q3: Is there any way in T-SQL I can compare EVERY FIELD in two tables, i.e. Stage and Production, which have identical data structures and figure out a way to update only the records that changed? Is SQL Server replication the best way to do this?
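For Q2, a minimal sketch of the rename/swap technique, assuming hypothetical table names dbo.ProdTable and dbo.StageTable with identical structures. The swap is a metadata operation and completes in milliseconds, so the window in which the web service could see an empty or half-loaded table effectively disappears; objects that reference the tables by name simply start resolving to whichever table currently holds that name.

BEGIN TRANSACTION;
    EXEC sp_rename 'dbo.ProdTable',     'ProdTable_old';  -- move current prod data aside
    EXEC sp_rename 'dbo.StageTable',    'ProdTable';      -- freshly loaded stage becomes prod
    EXEC sp_rename 'dbo.ProdTable_old', 'StageTable';     -- old prod becomes next cycle's stage
COMMIT TRANSACTION;

For Q3, comparing every column is possible (for example with CHECKSUM(*) or a column-by-column comparison joined on the key), but with only 500,000 rows the swap above is usually simpler than applying deltas or setting up replication.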
I've found staging tables to still be necessary (for example, when you have a large dimension table that you can't load into memory; the memory-restricted lookup is too slow as it sends a query for each row). I have a few questions:
I'm trying to import the contents of a CSV file into a staging table that I've created in SQL Server 2005. To perform the import I have used the BCP utility with a BCP format file.
The problem I'm having is that the data in some of the fields in the CSV file is longer than the length of the corresponding column in my staging table. So when I try to import I get the following error:
[Microsoft][ODBC SQL Server Driver]String data, right truncation
And the record with the error does not import.
My question would be: is there a way of telling BCP to trim the data before importing into the staging table? Could I somehow specify in the BCP format file to trim the data before importing, or is there a switch I can specify in the BCP command which tells BCP to trim data to the length of the destination column?
If it helps, I'm using the command below to run BCP.
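BCP has no switch that truncates oversized values (the ODBC driver just rejects the row), so a common workaround is to let the staging table accept the raw data and trim on the way out. A sketch with hypothetical table and column names; the original BCP command wasn't included above, so the bcp line is only indicative:

-- staging columns made generously wide so BCP never hits the truncation error
CREATE TABLE dbo.Stage_Customer (
    CustomerName nvarchar(500) NULL,
    City         nvarchar(500) NULL
);

-- bcp MyDb.dbo.Stage_Customer in customers.csv -f customers.fmt -S MyServer -T

-- trim to the real lengths when moving rows to the destination table
INSERT INTO dbo.Customer (CustomerName, City)
SELECT LEFT(LTRIM(RTRIM(CustomerName)), 100),
       LEFT(LTRIM(RTRIM(City)), 50)
FROM dbo.Stage_Customer;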
I'm populating a table (B) in SQL Server from a staging table (A) using a stored procedure. At any point of time, the staging table holds 60 months of old data. In the first load of the destination table B, I get 13 months of old data, whereas for every subsequent load I need to load the data for the most recent month and delete the data for the 1st (oldest) month. For example, if the load procedure runs on December 02, 2006, it should pick data for the month of November 2006 from the staging table and delete data for the 1st month.
I have a column DATA_MONTH_KEY in table B which maps to the column DATA_MONTH in my staging table A. I get the data for the first 13 months using:
(B.DATA_MONTH_KEY BETWEEN (DATEADD(month,-13,@startdate)) AND @startdate), where @startdate is the current date on which the procedure for populating table B is run. I get the value of @startdate from a function.
How do I get data for the most recent month and delete the oldest month in subsequent loads?
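A minimal sketch of the subsequent loads, assuming DATA_MONTH / DATA_MONTH_KEY hold date values and using hypothetical table names dbo.StagingA and dbo.TableB; the month boundaries are derived from @startdate so that a run on December 02 picks up all of November and then trims the window back to 13 months:

DECLARE @monthEnd datetime, @monthStart datetime;
-- first day of the month in which the procedure runs (e.g. 2006-12-01)
SET @monthEnd   = DATEADD(month, DATEDIFF(month, 0, @startdate), 0);
-- first day of the most recent complete month (e.g. 2006-11-01)
SET @monthStart = DATEADD(month, -1, @monthEnd);

-- add the most recent complete month from staging
INSERT INTO dbo.TableB (DATA_MONTH_KEY /*, other columns */)
SELECT A.DATA_MONTH /*, other columns */
FROM dbo.StagingA A
WHERE A.DATA_MONTH >= @monthStart
  AND A.DATA_MONTH <  @monthEnd;

-- drop the oldest month so the table keeps a rolling 13-month window
DELETE FROM dbo.TableB
WHERE DATA_MONTH_KEY < DATEADD(month, -13, @monthEnd);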
Scenario: We want to stage all data coming into our database from various applications. We have a generic control where all data goes through to get to the database.
We want to have the data staged in a 'staging database'. My question is, what would be the best way to store the staged data in that database?
I'm thinking of storing it as XML in a SQL column. The reason for this is that the data could change from application to application. It would be impossible to create a relational table for it because maintenance would be a nightmare.
Has anyone ever done something like this? Would you save the node names in another table in SQL?
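A minimal sketch of what such a generic staging table could look like, with hypothetical names; the source application and load time stay relational so rows can be filtered cheaply, while the variable payload lives in the xml column, and node names can be read straight out of the payload rather than maintained in a separate table:

CREATE TABLE dbo.StagingInbound (
    StagingID  int IDENTITY(1,1) PRIMARY KEY,
    SourceApp  varchar(50) NOT NULL,               -- which application sent the data
    LoadedAt   datetime    NOT NULL DEFAULT GETDATE(),
    Payload    xml         NOT NULL                -- application-specific structure
);

-- example: pull a node value out of the payload for one application
SELECT StagingID,
       Payload.value('(/Order/CustomerID)[1]', 'varchar(20)') AS CustomerID
FROM dbo.StagingInbound
WHERE SourceApp = 'OrderEntry';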
We are in the process of setting up a shared staging/development environment in which multiple developers will deploy their packages for testing etc.
The plan is to allow the developers to remote in to execute their packages, but we are worried about the concurrency limit of 2 users per terminal services.
It is my understanding that you cannot execute packages remotely (i.e. you have to be on that box to kick them off). (FYI - we are not planning on using SQL Agent at this point - that will be with the DBAs in UA.)
Please let me know what options are available/best practices...
I have a flat file with columns from a geographical hierarchy such as:
Country Zone State County City Store Sub Store , etc.
The file also has data columns for months to the right of the above columns such as:
Jul Aug Sept ......... basically 25 of these columns for two years' data for one product and another set of 25 columns for another kind of product. A typical record in the file looks like:
Country Zone State County City Store Substore
USA Southeast FL Hillsborough Tampa walmart Fletcher
I need to upload this data into a staging table in SQL Server 2005 using SSIS. I created a table with the geographical hierarchy columns but am trying to figure out a way to load the monthly data. I could create 50 columns for the 50 months (25 months for each product), but that would be very crude.
Is there a better way of inserting data from this flat file into a destination table? I need all the data in the staging table in one upload.
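One option, sketched below with hypothetical column names, is to load the file as-is into a wide staging table in a single data flow and then UNPIVOT it into a tall table with one row per geography/product/month, which is usually easier to work with downstream than 50 measure columns (SSIS also ships an Unpivot transformation that does the same thing inside the data flow):

-- the wide staging table mirrors the file; only a few month columns are shown here
-- CREATE TABLE dbo.StageSales (Country varchar(50), ..., SubStore varchar(50),
--                              ProdA_Jul07 decimal(18,2), ProdA_Aug07 decimal(18,2), ...)

INSERT INTO dbo.SalesByMonth (Country, Zone, State, County, City, Store, SubStore, MonthColumn, Amount)
SELECT Country, Zone, State, County, City, Store, SubStore, MonthColumn, Amount
FROM dbo.StageSales
UNPIVOT (Amount FOR MonthColumn IN (ProdA_Jul07, ProdA_Aug07, ProdA_Sep07 /* ... all 50 month columns */)) AS u;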
I am extracting source data from a txt file to an OLE DB destination. But I want to save each day's data in a different staging table, e.g. tblProduct20081206, tblProduct20081207. How can it be done? I have seen lots of postings and scripts where the destination is a txt file. I want to use the same table structure for staging but want to create a different table for each day with a date extension added.
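A minimal sketch of one approach, assuming a hypothetical base table dbo.tblProduct that matches the file layout: an Execute SQL Task creates today's empty copy, and the OLE DB destination's data access mode is set to 'Table name or view name variable' with a package variable whose expression builds the same tblProductYYYYMMDD name, so the data flow itself never changes.

DECLARE @tableName sysname, @sql nvarchar(max);
SET @tableName = N'tblProduct' + CONVERT(char(8), GETDATE(), 112);   -- e.g. tblProduct20081206

IF OBJECT_ID(N'dbo.' + @tableName) IS NULL
BEGIN
    -- create an empty copy of the base staging table for today
    SET @sql = N'SELECT * INTO dbo.' + QUOTENAME(@tableName)
             + N' FROM dbo.tblProduct WHERE 1 = 0;';
    EXEC sp_executesql @sql;
END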
I've migrated a package that has worked in our development environment. In both environments (dev & staging) I am in the BUILTIN/Administrators local group which is in the sysadmin server role.
In our staging environment, I execute DTEXEC command from the command line and get the following results.
D:\Reporting>"D:\Program Files\Microsoft SQL Server\90\DTS\Binn\DTEXEC.EXE" /SQL "MSDBProcessReportingDatabase" /SERVER USFKL16DB1CI01 /MAXCONCURRENT " -1 " /CHECKPOINTING OFF /REPORTING V /SET "\Package.Variables[User::RunID].Value";65 /SET "\Package.Connections[RSAnalytics].InitialCatalog";"VR New Test 1 Dec28-FndPrj 1 Dec28"
Microsoft (R) SQL Server Execute Package Utility Version 9.00.1399.06 for 32-bit
Copyright (C) Microsoft Corp 1984-2005. All rights reserved.
Started: 1:46:43 PM
Could not load package "MSDBProcessReportingDatabase" because of error 0xC0014062. Description: The LoadFromSQLServer method has encountered OLE DB error code 0x80004005 (Login timeout expired). The SQL statement that was issued has failed. Source:
Started: 1:46:43 PM Finished: 1:47:02 PM Elapsed: 18.797 seconds
In addition to my login, all logins that access this package are in the sysadmin role; and I have gone back and included these logins in msdb's dts_operator role. Has anyone seen this before?
I have a question about staging design using SSIS. Has anyone come up with an ETL design that reads table names from a generic table and dynamically creates the ETL to stage those tables?
1. Have a generic table which would have table name and description and whatever else that was required.
2. Have a master ETL that would enumerate through the table and stage all the table names found in the generic table.
This way I wouldn't have to create an ETL which would hardcode the names of 300-500 tables and have the appropriate 300-500 data sources and targets listed.
Not sure if I am making sense but I hope someone understands the attempt.
I have been working on this without any success. Anybody out there to help?
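It does make sense. The catch is that a single SSIS data flow cannot re-map its columns at run time, so the usual metadata-driven pattern is to enumerate the control table and generate the staging load per table (either as T-SQL like the sketch below, or by generating packages programmatically). A minimal sketch, assuming a hypothetical control table dbo.TableList, a staging schema stg, and a linked server to the source:

DECLARE @tableName sysname, @sql nvarchar(max);

DECLARE table_cursor CURSOR FOR
    SELECT TableName FROM dbo.TableList;         -- the generic control table

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @tableName;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- drop yesterday's staging copy and pull a fresh one across
    SET @sql = N'IF OBJECT_ID(''stg.' + @tableName + N''') IS NOT NULL DROP TABLE stg.' + QUOTENAME(@tableName) + N'; '
             + N'SELECT * INTO stg.' + QUOTENAME(@tableName)
             + N' FROM SourceServer.SourceDb.dbo.' + QUOTENAME(@tableName) + N';';
    EXEC sp_executesql @sql;

    FETCH NEXT FROM table_cursor INTO @tableName;
END
CLOSE table_cursor;
DEALLOCATE table_cursor;

This avoids hand-building 300-500 data sources and destinations; adding a table to the load is just an INSERT into dbo.TableList.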
1. New files are uploaded to the FTP site every day by the other regions: 3 regions and 1 file each, i.e. 3 files at 3 different times. The file name will be that day's date, e.g. today's files will be A_20071106.csv, B_20071106.csv and C_20071106.csv; tomorrow's will be A_20071107.csv, B_20071107.csv and C_20071107.csv. How do I make this run every day and take only that day's file from FTP straight into the SQL staging table?
2. Every day, immediately after each file is uploaded, a marker file (20071106.dummy and so on for all three files) shows up in the same FTP folder to indicate the file was loaded successfully. I must make this package runnable only if the .dummy file for that day's date has arrived, and then process the .csv file. Otherwise it should check again after 30 minutes whether the .dummy file is there yet, and keep doing this until it arrives, then execute the package for that slot. The same applies to the other two files for that particular day. The times are 4pm for A, 6pm for B and 8.30pm for C.
3. Then, if there is a new row it should INSERT, if anything changed (based on 5 columns) it should UPDATE, and if a row was there yesterday but is not there today, DELETE (see the sketch below).
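For step 3, a minimal sketch of the synchronisation on SQL Server 2005 (which has no MERGE statement), assuming a hypothetical staging table stg.Daily and target dbo.Target keyed on KeyCol, with the five compared columns abbreviated as Col1...Col5 (NULL-safe comparison omitted for brevity):

-- update rows whose compared columns changed
UPDATE t
SET Col1 = s.Col1, Col2 = s.Col2, Col3 = s.Col3, Col4 = s.Col4, Col5 = s.Col5
FROM dbo.Target t
JOIN stg.Daily s ON s.KeyCol = t.KeyCol
WHERE t.Col1 <> s.Col1 OR t.Col2 <> s.Col2 OR t.Col3 <> s.Col3
   OR t.Col4 <> s.Col4 OR t.Col5 <> s.Col5;

-- insert rows that are new today
INSERT INTO dbo.Target (KeyCol, Col1, Col2, Col3, Col4, Col5)
SELECT s.KeyCol, s.Col1, s.Col2, s.Col3, s.Col4, s.Col5
FROM stg.Daily s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Target t WHERE t.KeyCol = s.KeyCol);

-- delete rows that were there yesterday but are missing from today's file
DELETE t
FROM dbo.Target t
WHERE NOT EXISTS (SELECT 1 FROM stg.Daily s WHERE s.KeyCol = t.KeyCol);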
I want to create a mirrored DB set for data entry in an extremely busy OLTP DB. I want to add transactional replication between the production server and a staging server outside my quorum that I will use to index the data and prepare it for reporting and warehousing purposes.
If/when failover takes place, what happens to my transactional replication between the former production server (now presumably offline) and my staging DB? Does it switch to the new production server automatically, or do I have to manually set up the replication between the new production server and the staging DB?
I'm attempting to load some data into an explicit hierarchy in MDS 2012 via the staging table and struggling with the HierarchyName field. Specifically I'm loading data into stg.[Entity Name]_Consolidated and using the exact name of the explicit hierarchy I've set up in the front end web application.
Originally my hierarchy was labelled "Reporting Hierarchy"; when loading the data into staging using this name and then running the batch from the Import Data screen, I can see the error message "Error - The HierarchyName is missing or is not valid." I've checked the table mdm.tblHierarchy and can see that the name there is exactly as it was in the staging table, and I have since renamed the hierarchy to "Reporting_Hierarchy" with the same results.
I have a large table with 100 Million records that has around 1 million duplicate records that need to be deleted.
I am running a script that creates a staging table called DuplicateTable, which collects all the duplicates, and then I want to write an efficient delete statement.
Is it possible to write something like:
delete from OrigTable O join DuplicateTable D on O.Key = D.key
Or do I have to run a loop over the DuplicateTable and run a delete statement record by record?
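No loop needed: T-SQL supports a join in a DELETE, with the alias named after DELETE and the join in a FROM clause. A sketch, assuming both tables share a key column named Key; for a 100-million-row table the batched form is often gentler on the transaction log:

-- single-statement form
DELETE O
FROM OrigTable O
JOIN DuplicateTable D ON O.[Key] = D.[Key];

-- batched form: keep each transaction small, loop until nothing is left
WHILE 1 = 1
BEGIN
    DELETE TOP (50000) O
    FROM OrigTable O
    JOIN DuplicateTable D ON O.[Key] = D.[Key];

    IF @@ROWCOUNT = 0 BREAK;
END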
I'm looking at SSIS and SqlBulkCopy as a possible method to quickly insert and process large amounts of data. My current method uses sp_xml_preparedocument and OPENXML to parse an XML document of the data I want to process and insert into the database, but I'm noticing that SQL Server's performance parsing the XML document isn't that good.
I found the C# SqlBulkCopy method (new in .NET 2.0) and I was thinking it would be faster to use that to load my data into a staging table and then use SSIS to extract the data from the staging table, process and transform it as necessary and finally load it into the final destination tables. I was able to create an Integration Services Project that selects the data from the staging table, does a bit of processing on one of the fields (by calling a stored procedure), and finally loads the processed data into the final table.
The problem with this is that I need to clean out the rows that were extracted from the staging table, and I'm not sure how I can accomplish this (I can't just issue a "delete from staging_table" because there may be new records in the staging table that were not processed). Perhaps I can either delete each record as it is processed, or somehow get the last processed identity ID from the staging table and delete all records less than or equal to that ID.
thanks in advance for any help you can provide, maybe there is an easy way to accomplish what I'm trying to do.
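The identity-watermark idea works well. A minimal sketch, assuming the staging table has an ascending identity column named StagingID: capture the highest ID in an Execute SQL Task before the data flow, have the source query read only rows up to that ID, and delete the same range once the load succeeds, so rows bulk-copied in while the package is running are simply picked up next cycle.

DECLARE @maxId int;
SELECT @maxId = MAX(StagingID) FROM dbo.StagingTable;   -- store this in a package variable

-- the data flow's source query is limited to the captured watermark:
-- SELECT ... FROM dbo.StagingTable WHERE StagingID <= @maxId

-- after the load completes successfully, clear only what was processed
DELETE FROM dbo.StagingTable
WHERE StagingID <= @maxId;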
We are designing a staging layer to handle incremental loads. I want to start with a simple scenario to design the staging.
In the source database there are two tables, e.g. tbl_Department and tbl_Employee. Both of these tables load a single table in the destination database, e.g. tbl_EmployeRecord.
The query which loads tbl_EmployeRecord is: SELECT EMPID, EMPNAME, DEPTNAME FROM tbl_Department D INNER JOIN tbl_Employee E ON D.DEPARTMENTID = E.DEPARTMENTID.
Now, we need to identify the incremental changes in tbl_Department and tbl_Employee, store them in staging, and load only the incremental changes to the destination.
The columns of the tables are,
tbl_Department : DEPARTMENTID,DEPTNAME
tbl_Employee : EMPID,EMPNAME,DEPARTMENTID
tbl_EmployeRecord : EMPID,EMPNAME,DEPTNAME
How should the staging be designed for this to handle Insert, Update and Delete?
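A minimal sketch of one common pattern, assuming staging keeps both a full extract of the current run (stg.tbl_Employee, stg.tbl_Department) and a copy of what was sent to the destination last run (stg.EmployeRecord_Previous); EXCEPT against the previous image yields the inserts/updates, and the reverse key check yields the deletes:

-- rows that are new or changed since the previous load (inserts + updates)
SELECT EMPID, EMPNAME, DEPTNAME
INTO stg.EmployeRecord_Delta
FROM (
    SELECT E.EMPID, E.EMPNAME, D.DEPTNAME
    FROM stg.tbl_Employee E
    INNER JOIN stg.tbl_Department D ON D.DEPARTMENTID = E.DEPARTMENTID
    EXCEPT
    SELECT EMPID, EMPNAME, DEPTNAME
    FROM stg.EmployeRecord_Previous
) AS delta;

-- keys that disappeared from the source, i.e. rows to delete at the destination
SELECT P.EMPID
FROM stg.EmployeRecord_Previous P
WHERE NOT EXISTS (SELECT 1 FROM stg.tbl_Employee E WHERE E.EMPID = P.EMPID);

Delta rows whose EMPID already exists in tbl_EmployeRecord are updates, the rest are inserts; after applying them and the deletes, stg.EmployeRecord_Previous is refreshed with the new full image ready for the next run.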
In my ETL job I would like to truncate the stg table, but before truncating the staging table I want to make sure that all the records have been inserted into the data model. A sample is below:
create table #stg (
    CreateID int,
    Name nvarchar(10)
)

insert into #stg
select 1, 'a' union all
select 2, 'b' union all
[Code] ....
How can I check between these tables to make sure that all values are loaded into the data model, so that the staging table can safely be truncated?
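A minimal sketch of the check, assuming a hypothetical data-model table dbo.ModelTable and that rows match on CreateID and Name; if the EXCEPT comes back empty, every staging row is present in the model and the truncate is safe:

IF NOT EXISTS (
    SELECT CreateID, Name FROM #stg
    EXCEPT
    SELECT CreateID, Name FROM dbo.ModelTable
)
BEGIN
    TRUNCATE TABLE #stg;   -- every staging row is accounted for
END
ELSE
BEGIN
    RAISERROR('Staging rows missing from the data model; truncate skipped.', 16, 1);
END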
I'm trying to use Excel in SSIS to import the data from a spreadsheet into a staging table. The package runs well from the web server using SSMS. But when I deploy it and try to execute the package, I get the error below. My question: do I have to install the AccessDatabaseEngine driver on the SQL database server or on the web server where I'm executing the SSIS package?
Error: The requested OLE DB provider Microsoft.Jet.OLEDB.4.0 is not registered. If the 64-bit driver is not installed, run the package in 32-bit mode.
I apologize if this has been asked, but I can't find a complete answer.
We have a situation with parent/child tables which have an identity column as their PK. We need to be able to insert into the live tables from staging tables. The data in the staging tables are related via a surrogate key.
I have found the OUTPUT clause, but that can only refer to columns of the actual table (since there is no FROM clause in an INSERT). Our current best solution to this problem involves adding bogus "staging" columns to the destination tables, and removing them after we've inserted everything from staging. This is an unattractive solution to say the least.
I'll give an example that mirrors our actual solution, and ask if anyone has a better solution? ----------
CREATE TABLE [dbo].[TABLE_A](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [DATA] [nchar](10) NOT NULL,
    [STAGING_COLUMN] [bigint] NULL,
    CONSTRAINT [PK_TABLE_A] PRIMARY KEY ([ID] ASC)
)
GO
CREATE TABLE [dbo].[TABLE_B](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [A_ID] [int] NOT NULL,
    [DATA] [nchar](10) NOT NULL,
    [STAGING_COLUMN] [bigint] NULL,
    CONSTRAINT [PK_TABLE_B] PRIMARY KEY ([ID] ASC)
)
GO
ALTER TABLE [dbo].[TABLE_B] ADD CONSTRAINT [FK_TABLE_A_TABLE_B]
    FOREIGN KEY([A_ID]) REFERENCES [dbo].[TABLE_A] ([ID])
GO
CREATE TABLE [dbo].[STAGE_TABLE_A](
    [A_Key] [bigint] NOT NULL,
    [DATA] [nchar](10) NOT NULL
)
GO
CREATE TABLE [dbo].[STAGE_TABLE_B](
    [B_Key] [bigint] NOT NULL,
    [DATA] [nchar](10) NOT NULL,
    [A_Key] [bigint] NOT NULL
)
GO
The STAGING_COLUMN columns are the ones that will be added before, and dropped after.
DECLARE @TABLE_A_MAP TABLE (
    A_ID INT,
    A_Key BIGINT
)

INSERT INTO TABLE_A (DATA, STAGING_COLUMN)
OUTPUT INSERTED.ID, INSERTED.STAGING_COLUMN INTO @TABLE_A_MAP
SELECT DATA, A_Key
FROM STAGE_TABLE_A

INSERT INTO TABLE_B (A_ID, DATA)
SELECT TAM.A_ID, STB.DATA
FROM STAGE_TABLE_B STB
INNER JOIN @TABLE_A_MAP TAM ON TAM.A_Key = STB.A_Key
This seems to work, but I'd really like another alternative. Even though this is happening when nobody else is using the database, I cringe at the thought of adding and removing columns just to make this work.
Here are a few of my constraints:
The above is a simplification of the actual problem. The actual problem goes about five levels deep (hence the B_Key in STAGE_TABLE_B). At the top level, our larger customer will have 100,000 rows to insert. Each level will average 3 times as many rows as the next higher level, so we're talking about real volumes here.
This has to finish over the course of a weekend.
This has to be delivered to QA this Friday Thanks for any help or insight.
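One alternative worth knowing about, offered here as an assumption about your upgrade path rather than something available today: from SQL Server 2008 onward, the OUTPUT clause of MERGE (unlike that of INSERT) can reference source columns, which removes the need for the bogus STAGING_COLUMN entirely. A sketch against the same tables:

DECLARE @TABLE_A_MAP TABLE (A_ID INT, A_Key BIGINT);

MERGE INTO dbo.TABLE_A AS tgt
USING dbo.STAGE_TABLE_A AS src
    ON 1 = 0                                  -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (DATA) VALUES (src.DATA)
OUTPUT INSERTED.ID, src.A_Key INTO @TABLE_A_MAP (A_ID, A_Key);

INSERT INTO dbo.TABLE_B (A_ID, DATA)
SELECT TAM.A_ID, STB.DATA
FROM dbo.STAGE_TABLE_B STB
INNER JOIN @TABLE_A_MAP TAM ON TAM.A_Key = STB.A_Key;

The same pattern repeats for each lower level, carrying the parent's map table down; volume-wise it behaves like the current OUTPUT approach, just without the extra columns.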
I have one stored procedure which is used to load data from a flat file into a staging table dynamically. Everything is working fine. The Staging_temp table has a single column, and all the data is stored in that single column; below is the sample data.
I have a case where including/excluding a single column in an empty staging table influences the result set returned from a distributed query. Both servers are SQL Server 2012. Nothing special about the staging table: it contains 12 columns with a mixture of INT and NVARCHAR(256) columns. In one case I exclude the column and the query returns in 17 seconds. When I include it, the query does not return. Excluding the INSERT INTO the staging table, the query returns in 17 seconds with and without the column.
I am working on an HR project and I have one final component that I am stuck on.
I have an Excel File that is loaded into a folder every month.
I have built a package that captures the data from the excel file and loads it into a staging table (transforming a few bits of data).
I then combine it with another table in a view.
I have another package that loads that view into a Master table and I have added a Slowly Changing Dimension so that it only updates what has been changed. (it’s a table of all employees, positions, hire dates, term dates etc).
Our HR wants to have this data in a report (with charts and tables) and they wanted it to be in a familiar format. So I made a data connection with Excel loading the data into a series of pivot tables.
I have one final component that I can't seem to figure out. At the end of every year I need to capture a count of all active employees and all termed employees for that year. Just a count.
The data is in one table labeled [EEMaster]. To test the count I have the following.
SELECT COUNT([PersNo]) AS HistoricalHC FROM [dbo].[EEMaster] WHERE [ChangeStatus] = 'Current' AND [EmpStatusName] = 'Active'
this returns the HistoricalHC for 2013 as 418.
SELECT COUNT([PersNo]) AS NumbOfTermEE FROM [dbo].[EEMaster] WHERE [ChangeStatus] = 'Current' AND [EmpStatusName] = 'Withdrawn' AND [TermYear] = '2013'
This returns the Number of Termed employees for 2013 as 42.
I have created a table to report from called [dbo.TORateFY] that I have manually entered previous years data into.
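A minimal sketch of automating that year-end capture, assuming the reporting table has hypothetical columns FiscalYear, HistoricalHC and NumbOfTermEE; scheduled once a year (for example from a SQL Agent job), it appends the counts using the same logic as the two test queries above:

DECLARE @Year char(4);
SET @Year = CAST(YEAR(GETDATE()) AS char(4));

INSERT INTO dbo.TORateFY (FiscalYear, HistoricalHC, NumbOfTermEE)
SELECT @Year,
       (SELECT COUNT([PersNo]) FROM [dbo].[EEMaster]
        WHERE [ChangeStatus] = 'Current' AND [EmpStatusName] = 'Active'),
       (SELECT COUNT([PersNo]) FROM [dbo].[EEMaster]
        WHERE [ChangeStatus] = 'Current' AND [EmpStatusName] = 'Withdrawn' AND [TermYear] = @Year);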
Kimberly Tripp describes a recipe for switching partitions in and out, through the use of staging tables, when it comes time to "slide the window" on a partitioned table. She says that the clustered index (on staging) must be the same as the one chosen for the partitioned table itself, but she doesn't discuss whether or not all of the nonclustered indexes need to be the same too once ALTER TABLE Orders SWITCH PARTITION 1 TO OrdersOctober2002 and ALTER TABLE OrdersOctober2004 SWITCH TO Orders PARTITION 24 run. For the data being switched out, I wouldn't want to do anything extra. For the data being switched in, I'd like to understand whether she is implying that all other indexes would be built automatically as a result of the 2nd ALTER statement?
Kimberly's article is at http://www.sqlskills.com/resources/Whitepapers/Partitioning%20in%20SQL%20Server%202005%20Beta%20II.htm#_Toc79339965
After that, the staging_temp data gets inserted into the main table. My problem is handling a file where the number of columns is more than in the actual table.
If you look at the sample rows, there are 4 columns separated by "¯", but I actually have only 3 columns in my main table. So how can I get only the first 3 columns from the staging_temp table?
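A minimal sketch of slicing just the first three delimited values out of the single staging column, assuming the column is called RawLine, the delimiter really is the single character "¯", and every row has at least three delimiters (as in the sample); any extra trailing columns in the file are simply ignored:

SELECT
    SUBSTRING(RawLine, 1,      d1 - 1)       AS Col1,
    SUBSTRING(RawLine, d1 + 1, d2 - d1 - 1)  AS Col2,
    SUBSTRING(RawLine, d2 + 1, d3 - d2 - 1)  AS Col3
FROM (
    SELECT RawLine,
           CHARINDEX('¯', RawLine) AS d1,
           CHARINDEX('¯', RawLine, CHARINDEX('¯', RawLine) + 1) AS d2,
           CHARINDEX('¯', RawLine, CHARINDEX('¯', RawLine, CHARINDEX('¯', RawLine) + 1) + 1) AS d3
    FROM dbo.Staging_temp
) AS s;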
I have an existing data warehouse that is refreshed nightly from system1.
We will be moving to a new system2 1/1/06. The old system1 will be phased out by the end of 2006. Several tables on system2 will have different keys.
I have to move the data from both systems in the data warehouse.
Here's the issue: I have to totally refresh the data warehouse nightly. Once system1 is phased out, how do I keep the data that came from system1 and refresh only the data from system2?
Hello, has anyone done this before? I have a table where I store SOURCE_TABLE_NAME and DESTINATION_TABLE_NAME; there are about 120+ tables. I need to make an SSIS package which selects SOURCE_TABLE_NAME from the source OLE DB and loads it into DESTINATION_TABLE_NAME in the destination OLE DB.
I made such an SSIS package: I set the OLE DB source data access mode to 'table or view name variable', set the OLE DB destination data access mode to 'table or view name variable', and gave the variables default values (names of existing tables). But when I loop and the table names change, it reports an error that it cannot map the columns, because the new tables have different columns.
What is the best way to transfer data from the staging table into the main table.
Example: Staging Table Name: TableA_satge (# of rows - millions) Main Table Name: TableA_main (# of rows - billions)
Note: Staging table may have some data same as the main table.
Currently I am doing:
- Load data into the staging table (TableA_stage)
- Remove any duplicate rows from the staging table (TableA_stage)
- Disable all indexes on the main table (TableA_main)
- Insert into the main table (TableA_main) from the staging table (TableA_stage)
- Remove any duplicate rows from the main table using a CTE (TableA_main)
- Rebuild indexes on the main table (TableA_main)
The problem with the above method is that it takes a lot of time and the log file grows very large.
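One variation to sketch, under the assumption that the tables share a key column (called KeyCol here) and that the main table keeps an index on it: filter out rows that already exist while copying instead of de-duplicating the billion-row table afterwards, and load in batches so each transaction (and therefore the log) stays small.

-- copy only rows not already in the main table, in modest batches
WHILE 1 = 1
BEGIN
    INSERT INTO dbo.TableA_main (KeyCol /*, other columns */)
    SELECT TOP (500000) s.KeyCol /*, other columns */
    FROM dbo.TableA_stage s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.TableA_main m WHERE m.KeyCol = s.KeyCol);

    IF @@ROWCOUNT = 0 BREAK;   -- nothing new left to copy
END

This skips the 'remove duplicates from the main table' step entirely, and keeping the index on KeyCol in place makes the NOT EXISTS lookup cheap, which usually more than pays for not disabling it.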