How To Optimize Data Import With Huge Volumes And Joins Across Data Sources Not All SQL Server Based?
Jun 7, 2006
I need to periodically import a (HUGE) table of data from an external data source (not SQL Server) into SQL Server, with the following scenarios:
Some of the records in the external data source may not exist in SQL.Some of the records in the external data source may have a different value at different imports, but this records are identified univocally by the same primary key in the external datasource and in SQL Server.Some of the records in the external data source may be the same in SQL.
Due to the massive volume of the import, I would like to import only the records which are different from what I have in SQL Server (cases 1 and 2 above). In fact case 2 is the most critical.
I thought of making a query with a left outer join between the data in the external data source table (SOURCE) and the data in the SQL Server table (DESTIN). The join is done on the respective primary keys (composed keys of up to 10 columns) and one of the WHERE conditions will be that the value in SOURCE is different from the value in DESTIN.
The result of this query would be exactly what I need to import.
How to do this in SSIS??? I couldn't figure out how to join tables in different data sources yet.
In fact I cannot write a stored procedure to do that, since one of the sources is in a datasources not SQL Server.
I have seen the Lookup transformation in this article http://www.sqlis.com/default.aspx?311 but this is not exacltly what I want to do.
Another possibility is to use the merge join, but due to the sorting I believe its performances would be terrible!
Thanks in advance for your suggestions!
View 9 Replies
ADVERTISEMENT
Oct 30, 2007
Hi,
I'm trying to replicate a SQL join across two seperate data sources in SSIS. If I were to write SQL to do this, it would be as follows:
SELECT Costs.CostRateEntryId,
Costs.UserId,
Costs.HourlyRate * 8 AS DailyCostRate,
Dates.DateKey,
Dates.ActualDate,
FROM Costs
INNER JOIN Dates ON Dates.ActualDate >= Costs.EffectiveDate AND Dates.ActualDate <= Costs.EndDate
Unfortunately, as the tables 'Dates' and 'Costs' are in two seperate SQL2005 systems, I can't really do this. I was hoping that it could be achieved in SSIS, but I cant seem to find any way that I can do a join that's <= or >=.
Can anyone help?
Thanks
Jeremy
View 7 Replies
View Related
Apr 21, 2007
Can I import in my query a file from other sources (in this case it's a job that has elaborated server database data, but I could be in the need of using for instance Excel files or others) and compare its fields with the tables in my query?
Thank you.
Anna - Verona (Italy)
View 3 Replies
View Related
Jul 20, 2005
I have a table which contains approx 3,00,000 records. I need toimport this data into another table by executing a stored procedure.This stored procedure accepts the values from the table as params. Mycurrent solution is reading the table in cursor and executing thestored procedure. This takes tooooooo long. approx 5-6 hrs. I need tomake it better.Can anyone help ?Samir
View 2 Replies
View Related
Dec 30, 2005
I'm just beginning to use SSIS (bracing for a steep learning curve due to lack of helpful documentation) and am starting out trying use the Import and Export Wizard. On the "Choose a Data Source" page there is a dropdown for the Data Source. I see a list of possible data providers, but not one of "Microsoft OLE DB Provider for ODBC drivers," which is the one I wanted to use because I'm trying to connect to an obscure database. So I figured that I need to use ".Net Framework Data Provider for Odbc." Unfortunately, regardless of what I enter for the Connection string or the Dsn or the Driver I invariably get an error, although it's somewhat dependent on that I have entered for those three items.
Either this (when I type in a DSN)
Cannot get the supported data types from the database connection "Dsn=Terrascan_Okanogan_WA".
or this (if I enter a full connection string and a driver)
The operation could not be completed.
------------------------------
ADDITIONAL INFORMATION:
ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified
of this (if I enter a DSN and a driver)
Cannot get the supported data types from the database connection "Dsn=Terrascan_Okanogan_WA;Driver={SmartWare Driver}".
------------------------------
ADDITIONAL INFORMATION:
Specified cast is not valid. (System.Data)
So I have a couple questions. First, why doesn't "Microsoft OLE DB Provider for ODBC drivers" appear in the list of data sources, and secondly, when using the ".Net Framework Data Provider for Odbc" data source what inputs are expected because whatever I'm doing doesn't seem to work?
View 12 Replies
View Related
Jul 6, 2006
Hello all,
I have recently been task with rewriting a database that holds large volumes of data, whilst ensuring that query can be run in optimal time. Having never really delved into this sort of thing before, I hoped you guys might be able to offer some advice and guidance.
The design I have inherited is based around 2 main tables:
[captured_traps]
[id] [int] IDENTITY (1, 1) NOT NULL
[snmp_version] [int] NULL
[community_name] [varchar] (255)
[packet_type] [varchar] (50)
[oid] [varchar] (500)
[source_ip] [varchar] (15)
[generic] [int] NULL
[specific] [int] NULL
[time_stamp] [varchar] (15)
[trap_entered] [datetime] NULL
[status] [int] NULL
[captured_varbinds]
[id] [int] IDENTITY (1, 1) NOT NULL
[captured_trap_id] [int] NOT NULL
[varbind_oid] [varchar] (500)
[varbind_text] [varchar (500)
The relationship between the two tables is on the "captured_traps (id)" to "captured_varbinds (captured_trap_id)". Currently the "captured_traps" table contains around 350 million rows, the "captured_varbinds" table contains around 900 million rows.
Now as you can probably gather this model runs like a....well it sort of hobbles more than runs hence the need to redesign.
My current thoughts on this are:
- Normalising all varchars - there is alot of duplicate values in most of the varchar fields.
- Full Text Indexing
However beyond that I am not sure which route to go down. After googling for most of today I have come across a number of "solutions" however I do not want to go steaming down the track of one of these to discover that it is fatally flawed somewhere.
View 6 Replies
View Related
Jul 7, 2015
We have a daily process, which copies millions of rows of data from one DB to another over Linked Server. Just checking on the best practise, are there more efficient ways than the Linked server to copy millions of rows of data from one DB to another? I checked bulk insert but that transfers the data from the file to DB not DB to DB.Â
View 6 Replies
View Related
Jan 14, 2008
Hi,
I have a datasource to an SQL server table which is used in a dataset with the following sql
e.g.
select *
from tblcustomer
I need to join tblcustomer to another SQL table located on a different server and database (datasource)
how would i go about doing this?
I tried the below which failed, anyone can help?
select *
from tblcustomer LEFT JOIN server1.database1.tbluser on server1.database1.tbluser.userid = tblcustomer.userid
View 4 Replies
View Related
Aug 4, 2006
Hi,
I am pretty new to SSIS. I am trying to create a package which can accept data in any of several formats. i.e. CSV, Excel, a SQL Server database/table and import the data into my destination database.
So far i've managed to get this working OK. However I am now TOTALLY stuck. I'm currently trying to just concentrate on the data sources being a CSV (using a Flat File Data Source) and/or an Excel Spreadsheet.
I can get the data in and to my destination using a UNION ALL component and mapping the data sources to it so long as both the CSV file and the Excel spreadsheet exist.
My problem is that I need my package to handle the possibility that only the CSV file might exist and there is no Excel spreadsheet. In which case i'd like the package to ignore the Excel datasource completely. Currently either of my data sources do not exist I get errors and the package terminates.
Is there any way in SSIS that I can check all my data sources to see which ones exist (i.e. are valid). If they exist I want to use them. If it doesn't exist i'd like to disgard it (without error - as long as there is a single datasource the package should run)
I've tried using the AcquireConnection method in a script task on each of my connections, hoping that it would error if the file/datasource did not exist. It doesn't though (in the case of an Excel datasource it just creates a empty excel file for me).
The only other option I can come up with are to have seperate packages depending on the type of data we want to import and then run a particular package depending on the format of the source data. This seems a bit long winded. I am pretty sure I must be able to do what I want to achieve but I can't work out how.
I'll be grateful to anyone who can send me any tips/hints/links on how I can achieve this.
Many thanks
Rob Gibson
View 5 Replies
View Related
Mar 12, 2007
I've encountered a few problems using SSIS against non-SQL Server data
sources and was hoping that others might have some experience. Google
searches and browsing MSDN hasn't led me to a solution, so any advice
on the following is appreciated:
1) When using the "Data Source Views" wizard to add a data source
view from an ODBC data source, only tables appear in the object
listing. Views do not. (I've also observed this with SQL Server 2005
databases as well, but it's a bigger issue when you can't use the native SqlClient, as is the case for many ODBC-only databases.) ODBC traces show that both table and
view metadata is being returned to SSIS correctly, so it appears as
though SSIS is filtering out views from the object listing in the
wizard.
2) When creating named queries against (non-SQL Server) ODBC data
sources, SSIS appears to use the SQL Server SQL syntax for referencing
schemas/objects (e.g. "SELECT * FROM [schema].[table]", rather than
"SELECT * FROM schema.table"). This isn't valid SQL in many databases.
Am I missing something? Is there a way to change this through some
configuration setting?
View 2 Replies
View Related
Oct 2, 2006
Hi:
I am new to SSIS. I would like to know if I want to transfer data from one Oracle schema to another Oracle schema and also to do scheduling of the packages, can I still use SSIS? If yes, what are the components that need to be installed on the database server and the development environment? I hope I don't need the full SQL Server database installation in order to use SSIS.
Thanks!
MuiSukYuen
View 1 Replies
View Related
Feb 26, 2008
Hi, i'm wondering which is the best way to search data in a SQL Server.
I reach data using Data Sources and Data Views and also with OLE DB Source with a Data access mode: Named query.
I have to write the data into a Flat File. So, does any one knows which is the best practice for this? Or any one of the two are good choices?
Thanks for your help.
Beli
View 6 Replies
View Related
Jan 9, 2007
Hi,
In my project i want a report. In that report data is getting from more than one data sources(systems). While creating data source view i used named query for both primary and secondary data source. But at the time of crating "Report Model" i am getting below error.
An error occurred while executing a command.
Message: Invalid object name 'Table2'.
Command:
SELECT COUNT(*) FROM (SELECT SerialNum, ModelNum AS com_model
FROM Table2) t
Is there any way to create a report with multiple data sources?
View 8 Replies
View Related
Mar 19, 2007
Hi all :)
I'm wondering if SSIS will be the solution to the problem I'm working on.
Some of our customers give us an Excel sheet with data they want to insert or update in the database.
I've created a package that will take an Excel sheet, do some data conversion so the data types match up and after that I use a Slowly Changing Data component to create the insert/update commands.
This works great. If a customer adds a new row to the Excel sheet or updates an existing row changes are nicely reflected in the database.
But now I€™ve got the following problem. The column names and the order of the columns in the Excel sheet are not standard and in the future it could happen a customer doesn't even use an Excel sheet but something totally different.
Can I use SSIS for this? Is it possible to let the user set the mappings trough some sort of user interface? I€™ve looked at programmatically creating the package but I€™ve got to say that€™s quit hard to do€¦ It would be easier to write the whole thing myself than to create the package trough code ;)
If not I thought about transforming the data in code before I pass it on to the SSIS package in something like XML. That way I can use standard column names and data types.
So how should I solve this problem? Use SSIS or not?
Thnx :)
Wouter de Kort
View 6 Replies
View Related
Jun 18, 2014
I have a project where I am creating a test environment. My objective is to give the client staff a set of instructions to follow to make sure the test mirrors production. I have migrated the SRS databases over to the new test server. All of the reports are there as is the security but where I run into an issue is the data sources for SRS. They are all pointed at the production servers. Is there an easy way to make the change possibly using a SQL update statement or exporting all the reports to XML and doing a find and replace. The shared data sources I went through manually and changed but there were 40 data sources. Now I realized some of the reports have embedded data connections. There are hundreds of reports to go through.
View 5 Replies
View Related
Oct 3, 2014
Currently, I am using T-SQL to combine some data and using an ISNULL function like this:
,ISNULL(H.Last_Name, S.PatientLastName)
,ISNULL(H.First_Name, S.PatientFirstName)
I am wanting to change from using a query to move this data to using an SSIS Data Flow. I am familiar with using Merge Join to combine the two tables (H & S in this case), but I'm not sure where I can use the ISNULL in the manner described above. Is there a way to do it in the Merge Join? Do I have to do it after the Merge Join?
View 3 Replies
View Related
Aug 3, 2015
I've
1. Data Warehouse
2. OLAP CUBE in Analysis Services
My question is - If my Data Warehouse is changed (Having Append Data) - My OLAP Cube will have the Append Data?
It's possible, my OLAP Cube always having Append Data if my Data Warehouse is changed? If yes, how to do it without re-deploy and re-process my Analysis Services Project....
View 2 Replies
View Related
Sep 4, 2015
I need to create a function that replaces the data in a column with an 'X' based on the LEN of the data in the column. I created one that does a replacement, but it fills the column based on the max data length, and not the current length of the string or integer. An example of what I'm trying to accomplish.
Original data in a varchar(30) column:
thisisavalue
thisisanothervalue
thisisanothervalueagain
shortval
replaced with
xxxxxxxxxx
xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxx
xxxxxxx
My current function is replacing the data like this:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
View 4 Replies
View Related
Apr 1, 2008
I'm in the middle of developing a Database for a hospital that measures its audits, inhouse operations, and finance. What we currently have and do everyday is collect data from a large database that is real time with patient data, progess, infomation, etc and dump it into a data warehouse that runs on TSI/Eclipsys. We run reports using a number of programs and dump it into Excel sheets that have charts, reports, etc. This Database for which I'm developing won't come soley from the TSI/Eclypsys source, but this is the only source thats updated regularly. I don't want to have in sync with TSI/Eclipysys in fear that every day when it updates data may be lost, not read, or worse won't be up date if there is a problem. My question is is it possible to run a query from Sequel Server 2005 that will take that data upon request using the reporing features on Sequel Server 2005. i.e. What if I need to run a report on measure B in department 12 from Jan 1-Feb 1, instead of being in sync, can I just write queries to take that information rather than double the data and take up twice the space and trouble. FYI, these datatypes rarely change in the TSI/Eclipsys data warehouse. This sure was long question and didn't intended it to be . Thanks for listening and hope to hear back.
Stephen
View 1 Replies
View Related
Jul 23, 2007
All,
I apologize for not doing the legwork to see if i can answer my own question, but I am close to a loosely planned SQL 2005 migration and don't have time and resources to test my own theories.
Is there a way for a Reporting Services 2000 server to connect to SQL server 2005 databases? I've tried creating a new data source and changing a report to this data source, but it seems like the report is still using the old data source. I'm guessing i might need to register new data providers on the 2000 RS server and then change the existsing data sources.
Ideas?
Thx in advance,
-Peter
View 3 Replies
View Related
Jul 9, 2007
Hi guys,
I manage to get the SSIS working. Now I would need to do these tasks.
I first want to get data from 2 different sql servers. What would be the best method to accomplish this? Both are in Sql Server 2005.
Secondly I want to make sure if any of the servers couldn't be found on the network or in any case the getting data task failed for any one of them the package won't continue and an email should be send to an email address.
Thirdly If everything is ok then I should combine both and generate one sequence no for them and save them on to another location and then generate a file with modified values.
Can anyone help me regarding these tasks?
Thank you
Gemma
View 17 Replies
View Related
Oct 4, 2007
Hi all,
It looks like these options are only available in the SQL Server Management Studio? I installed SQL Server Management Express Studio and I can't even find the DTSWizard.exe on my machine.
Can you please help how I can import data from excel or where can I download the SQL Server Management Studio?
Your prompt response is greatly appreciated.
Thanks!!
Tram
View 8 Replies
View Related
Sep 10, 2007
I have one column in SQL Server 2005 of data type VARCHAR(4000).
I have imported sql Server 2005 database data into one mdb file.After importing a data into the mdb file, above column
data type converted into the memo type in the Access database.
now when I am trying to import a data from this MS Access File(db1.mdb) into the another SQL Server 2005 database, got the error of Unicode Converting a memo data type conversion in Export/Import data wizard.
Could you please let me know what is the reason?
I know that memo data type does not supported into the SQl Server 2005.
I am with SQL Server 2005 Standard Edition with SP2.
Please help me to understans this issue correctly?
View 4 Replies
View Related
Feb 28, 2007
Can anyone shed light on why I cannot get away from named pipes in my SQL2005 Reporting Services Data Sources? Can it not just use TCP/IP? My configuration uses two servers, one in which the reportserver is setup on, and another which hosts the database that is reported against. We are trying to avoid having to open up named pipes on the server that stores the data.
Any help would be appreciated.
An error has occurred during report processing.
Cannot create a connection to data source 'wc_datasource'.
an error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
Thanks!
View 1 Replies
View Related
Jan 28, 2008
Hello Guys,
I'm searching for a way to compare informations from one database against another database.
E.g. i want to check if serialnumbers of my first database (eg hardware.xls or hardware.csv ) are already stored in my sql server database.
I know i can use sql querys (Joins) for that task but i don't know how i can access 2 different databases at the same time. Do i need a tool for it? Does Excel is able to compare those data? But how? I'm searching for that thread since 2 days testing with some tools without any success...
So please help...
Kindly Regards...
Be
View 4 Replies
View Related
May 8, 2006
An SSIS package to transfer data from a DB instance on SQL Server 2005 to SQL Server 2000 is extremely slow. The package uses an OLEDB Source to OLEDB Destination for data transfer which is basically one table from sql server 2005 to sql server 2000. The job takes 5 minutes to transfer about 400 rows at night when there is very little activity on the server. During the day the job almost always times out.
On SQL Server 200 instances the job ran in minutes in the old 2000 package.
Is there an alternative to this. Tranfer Objects task does not work as there is apparently a defect according to Microsoft. Please let me know if there is any other option other than using a Execute 2000 package task or using an ActiveX Script to read records from one source and to insert them into the destination source, which I am not certain how long it might take and how viable will that be?
Any inputs will be much appreciated.
Thanks,
MShah
View 5 Replies
View Related
Feb 25, 2008
A view named "Viw_Labour_Cost_By_Service_Order_No" has been created and can be run successfully on the server.
I want to import the data which draws from the view to a table using SQL Server Import and Export Wizard.
However, when I run the wizard on the server, it gives me the following error message and stop on the step Setting Source Connection
Operation stopped...
- Initializing Data Flow Task (Success)
- Initializing Connections (Success)
- Setting SQL Command (Success)
- Setting Source Connection (Error)
Messages
Error 0xc020801c: Source - Viw_Labour_Cost_By_Service_Order_No [1]: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. The AcquireConnection method call to the connection manager "SourceConnectionOLEDB" failed with error code 0xC0014019. There may be error messages posted before this with more information on why the AcquireConnection method call failed.
(SQL Server Import and Export Wizard)
Exception from HRESULT: 0xC020801C (Microsoft.SqlServer.DTSPipelineWrap)
- Setting Destination Connection (Stopped)
- Validating (Stopped)
- Prepare for Execute (Stopped)
- Pre-execute (Stopped)
- Executing (Stopped)
- Copying to [NAV_CSG].[dbo].[Report_Labour_Cost_By_Service_Order_No] (Stopped)
- Post-execute (Stopped)
Does anyone encounter this problem before and know what is happening?
Thanks for kindly reply.
Best regards,
Calvin Lam
View 6 Replies
View Related
Oct 16, 2006
I am attempting to import data from Microsoft Access databases to SQL Server 2000 using the DTS Import/Export Wizard. I have a few errors.
Error at Destination for Row number 1. Errors encountered so far in this task: 1.
Insert error column 152 ('ViewMentalTime', DBTYPE_DBTIMESTAMP), status 6: Data overflow.
Insert error column 150 ('VRptTime', DBTYPE_DBTIMESTAMP), status 6: Data overflow.
Insert error column 147 ('ViewAppTime', DBTYPE_DBTIMESTAMP), status 6: Data overflow.
Insert error column 144 ('VPreTime', DBTYPE_DBTIMESTAMP), status 6: Data overflow.
Insert error column 15 ('Time', DBTYPE_DBTIMESTAMP), status 6: Data overflow.
Invalid character value for cast specification.
Invalid character value for cast specification.
Invalid character value for cast specification.
Invalid character value for cast specification.
Invalid character value for cast specification.
Could you please look into this and guide me
Thanks in advance
venkatesh
imtesh@gmail.com
View 4 Replies
View Related
Jun 14, 2006
l've the following situation,
l've some excel files controlled by Vendor which changing frequently. The only thing does not change is the header name of each column.
So my question is, is there any way to create a new table based on the excel file selected including the column name in SSIS? So that l can use the data reader as source to select those columns l am interested on and start the integration.
Thanks.
Regards,
Yong Boon, Lim
p/s : The excel header is at the row 7.
View 3 Replies
View Related
Sep 9, 2006
Hi My project is in .NET 2003 i.e. framework 1.1 and database in SQLServer 2000. But the reports have been developed using SQLServer 2005 Reporting Services. Now when I am trying to deploy them through deployment project of .NET its giving me following error:"Using other editions of SQL Server for report data sources and/or the report server database" is not supported in this edition of Reporting Services. Now I am really confused with this. Can any one please guide me regarding this ASAP. Thanks, Falguni
View 4 Replies
View Related
Feb 21, 2008
Hi i have data on a Server in a different database which i like to access from within my ssis job.
I just need to look up information from one table on this database so i can references it. Is there a way of doing this is SSIS. Rather then me having to load the data from one database to another as the data may change.
i tried having 2 sources of data feed into a look up but that does not work..
View 1 Replies
View Related
Apr 1, 2008
Hi there,
On my home page I have several different folders to reports which require different data sources. the problem is that within these folders there are multiple copies of the same datasource. is it possible to store all of the datasources in one folder, one location? it would certainly be easier when changes to usernames and passwords need to be modified!
Thanks,
Rhonda
View 4 Replies
View Related
Apr 4, 2008
I searched and read about Data Sources and I'm seeing that there is no advantage in using it, which is what I found from playing around with it.
I expected that you would set a global connection in Data Sources and somehow link this to the things in your Connection Manager, giving you one place to switch from one environment to another. But reading the discussions here and playing around with it, this is not the case.
So, why is it there?
Next question.... another thing I gathered so far is something called "Configurations" that will do what I was describing above. Where do I do this?
View 8 Replies
View Related