Need Feedback On My Plan To Import A Terribly Formatted Excel Spreadsheet
Mar 19, 2007
Good morning, all,
I have an Excel workbook that needs to be imported. It has three
sheets, but it's really the first that is giving me fits. Each of the
three worksheets has header info and instructions in the first 8
rows. Worksheet 1 then has, on row 9, the column names for the group
information. Row 10 has the group information. Row 11 has the detail
column headers. Rows 12 and later have detail information. Worksheets
2 and 3 do not have detail information, just row 9 with the column
names for the group information and row 10 with the group information.
Here is how I am thinking of handling this.
Run a script, outside of SSIS to save each sheet as a CSV file to a
folder. I believe that this must be done because some of the first 8
rows are blank and according to the docs, SSIS cannot have blank rows
in imported Excel sheets.
Loop over the files in the folder.
For each file, exclude the first 8 rows.
  If the file is the first worksheet, then
    get the next two rows and process group info
    get the rest of the worksheet and process detail information
  If the file is not the first worksheet, then
    get the next two rows and process group info
My questions are: Does this seem feasible? Is there an easier way to
do this? Any hints or tricks that might be helpful? Any pitfalls
that I should watch out for?
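The per-file steps in the plan above can be sketched outside SSIS. This is a minimal Python sketch, not an SSIS package. It assumes each worksheet has already been saved as CSV text, and the names `split_sheet`, `HEADER_ROWS` and `has_detail` are made up for illustration.

```python
import csv
import io

HEADER_ROWS = 8  # instruction/header rows to discard on every sheet


def split_sheet(csv_text, has_detail):
    """Split one exported worksheet into a group record and detail records."""
    # csv.reader yields an empty list for a blank line, so blank rows
    # inside the first 8 rows still count toward the offset.
    rows = list(csv.reader(io.StringIO(csv_text)))[HEADER_ROWS:]
    group = dict(zip(rows[0], rows[1]))  # rows 9 and 10 of the sheet
    details = []
    if has_detail:
        detail_header = rows[2]  # row 11 of the sheet
        for row in rows[3:]:  # row 12 onward
            if any(cell.strip() for cell in row):  # skip blank rows
                details.append(dict(zip(detail_header, row)))
    return group, details
```

Only the first worksheet would be called with `has_detail=True`; the other two would pass `False` and get back an empty detail list.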
I am able to import an Excel spreadsheet into a table in SQL Server 2005 using SqlBulkCopy. The only thing that bothers me is how to check for duplicate entries and report an error to the user about them. The SQL table has no primary key. There are five columns, and the only way to find duplicates is to match on all five. Since the spreadsheet can have 40 to 500 entries, how can I check for those dupes?
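One way to think about the all-five-columns check is to treat each row as a tuple and look it up in a set before loading. The following is only a Python sketch of that idea, not SqlBulkCopy code. The function name `find_duplicates` is hypothetical, and it assumes both the spreadsheet rows and the rows already in the table fit in memory, which seems reasonable for 40 to 500 entries.

```python
def find_duplicates(incoming, existing=()):
    """Return incoming rows that repeat an earlier incoming or existing row.

    A row counts as a duplicate only when all of its columns match.
    """
    seen = set(tuple(row) for row in existing)
    duplicates = []
    for row in incoming:
        key = tuple(row)
        if key in seen:
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates
```

If the returned list is not empty, the import could be rejected and the offending rows shown to the user.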
I am looking for a way to import data from a CSV file or Excel spreadsheet and add the data directly into an extended property instead of a regular field in the table. For example, let's say I have a comma-delimited file with the following info:
NDC_M_FORMULARY,CUSTOM_EXTSIG,Custom EXT SIG
NDC_M_FORMULARY,DRUG_CODE,Alternate key, user defined
NDC_M_FORMULARY,CHARGE_CODE,From the Charge code table
The first column is the table name.
The second column is the column name in the table.
The third column contains the description that I would like to store as the value of the extended property named "MS_Description".
BTW, I did find the following T-SQL, which returns the description for a specific extended property.
Here it is:
SELECT [Table Name] = i_s.TABLE_NAME, [Column Name] = i_s.COLUMN_NAME, [Description] = s.value FROM INFORMATION_SCHEMA.COLUMNS i_s LEFT OUTER JOIN
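For the import side, one option is to turn each line of the comma-delimited file into a call to the system procedure `sp_addextendedproperty`. The Python sketch below only builds the T-SQL text. It assumes every table lives in the `dbo` schema and that everything after the second comma is the description (the sample descriptions contain commas themselves). The function name `extended_property_sql` is hypothetical.

```python
def extended_property_sql(line, schema="dbo"):
    """Build one sp_addextendedproperty call from 'table,column,description'."""
    # Split on the first two commas only; the description may contain commas.
    table, column, description = (part.strip() for part in line.split(",", 2))

    def quote(value):
        # Unicode string literal with embedded single quotes doubled.
        return "N'" + value.replace("'", "''") + "'"

    return (
        "EXEC sys.sp_addextendedproperty "
        f"@name = N'MS_Description', @value = {quote(description)}, "
        f"@level0type = N'SCHEMA', @level0name = {quote(schema)}, "
        f"@level1type = N'TABLE', @level1name = {quote(table)}, "
        f"@level2type = N'COLUMN', @level2name = {quote(column)};"
    )
```

Note that `sp_addextendedproperty` raises an error when the property already exists on the column, so columns that already have an MS_Description would need `sp_updateextendedproperty` instead.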
I have a situation where I need to read a spreadsheet of user names into a SQL table in which user name is just one of the columns. I tried using an OleDb connection to read the spreadsheet and SqlBulkCopy to import it into the SQL table. There was no error, but the data wasn't imported. Does anyone have any suggestions about what I did wrong or what the right way of doing this is? Thanks a lot. Mia
I am using the import wizard in SQL Server 2008 R to import data from an Excel spreadsheet into a table I have created.
The spreadsheet contains 3 columns that SQL recognises as DOUBLE and they contain a 1 or 0. What data type do the corresponding fields in SQL table need to be? I have tried BIT, INT and FLOAT but keep getting an error (can't view details of the error because I get chucked out every time the error pops up). I know the problem is with the DOUBLE data because when I 'ignore' those columns the import works fine.
Firstly, I'm new to Integration Services and have only done a little with DTS jobs.
I'm trying to create an Integration Services project which will import data from two worksheets in an Excel spreadsheet into two different tables in a database. I'm looking at only one table at present to make things a little more understandable.
One stipulation I have is that I need to be able to specify a variable value and insert it as an additional column in the database. I have an Excel source and a SQL destination, both of which have been set up with their specific connection managers. I also have a variable which I add using the Derived Column task.
When I try to debug this I get a few problems. I think these may have to do with the fact that, although the worksheet in Excel has 20 rows (the first column shows the row numbers), I only want the rows with data in them. If I preview the Excel table it shows all the rows, including those with null columns. Is there some way to get only the rows that have data in the columns after the row number? I.e., can I select only the rows whose second column is not NULL?
I hope this makes sense and that someone can help me out with this problem.
All help is greatly appreciated.
Cheers,
Grant
P.S.
Apologies. I have this resolved now. I didn't see the option to use a SQL command as opposed to a table or view when setting up the Excel source.
I am still, however, getting the following errors, which I'd appreciate some help with:
Error: 0xC0202009 at Data Flow Task, Excel Source [1]: An OLE DB error has occurred. Error code: 0x80040E21.
Error: 0xC0208265 at Data Flow Task, Excel Source [1]: Failed to retrieve long data for column "Rework Entry Information (BE SPECIFIC)".
Error: 0xC020901C at Data Flow Task, Excel Source [1]: There was an error with output column "Rework Entry Information" (170) on output "Excel Source Output" (9). The column status returned was: "DBSTATUS_UNAVAILABLE".
Error: 0xC0209029 at Data Flow Task, Excel Source [1]: The "output column "Rework Entry Information" (170)" failed because error code 0xC0209071 occurred, and the error row disposition on "output column "Rework Entry Information" (170)" specifies failure on error. An error occurred on the specified object of the specified component.
Error: 0xC0047038 at Data Flow Task, DTS.Pipeline: The PrimeOutput method on component "Excel Source" (1) returned error code 0xC0209029. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "SourceThread0" has exited with error code 0xC0047038.
Error: 0xC0047039 at Data Flow Task, DTS.Pipeline: Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "WorkThread0" has exited with error code 0xC0047039.
I get the following error when I use SQL Server 2005 Import/Export wizard to extract more than 255 columns from an excel file;
TITLE: SQL Server Import and Export Wizard
------------------------------
The preview data could not be retrieved.
------------------------------
ADDITIONAL INFORMATION:
Too many fields defined. (Microsoft JET Database Engine)
------------------------------
BUTTONS:
OK
------------------------------
I am new to SSIS. I am interested in using SSIS to import an Excel spreadsheet into a SQL Server database. My biggest concern is how to handle and manage errors that might occur during the import. Can anyone give me any guidance on this? I could write some C# code to do the import and to create a custom .txt file listing the errors that occur, but using C# for the import seems like I would just be reinventing the wheel, so to speak.
Hi, I'm a student, and for the past few months I have been learning Java. I'm creating an application to call up and compare times. For this I created a time table in Excel, which is quite big, and it would be a lot of typing to enter the data cell by cell into SQL Server, considering that I have to create 8 more tables. I was able to retrieve the data from Excel using the JXL API for Java, but it doesn't provide all the functions for math operations that JDBC does. That's why I need to move the tables from Excel to SQL Server. I found this site, http://davidhayden.com/blog/dave/archive/2006/05/31/2976.aspx, which gives code to do so, but I guess that some headers are missing, or maybe I don't know which compiler to use to run that code. I would like your help to identify which compiler to use, or whether some vital piece of code is missing.
// Connection String to Excel Workbook
string excelConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=Book1.xls;Extended Properties=""Excel 8.0;HDR=YES;""";
// Create Connection to Excel Workbook
using (OleDbConnection connection = new OleDbConnection(excelConnectionString))
{
    OleDbCommand command = new OleDbCommand("Select ID,Data FROM [Data$]", connection);
    connection.Open();
    // Create DbDataReader to Data Worksheet
    using (DbDataReader dr = command.ExecuteReader())
    {
        // SQL Server Connection String
        string sqlConnectionString = "Data Source=.; Initial Catalog=Test;Integrated Security=True";
        // Bulk Copy to SQL Server
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnectionString))
        {
            bulkCopy.DestinationTableName = "ExcelData";
            bulkCopy.WriteToServer(dr);
        }
    }
}
On the other hand, in this forum I saw that someone else used that link but implemented totally different code, which I'm also not able to compile: http://forums.asp.net/p/1110412/2057095.aspx#2057095. From what I was able to read, this code seems to work, but I do not know which language it is written in.
Dim excelConnectionString As String = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=Book1.xls;Extended Properties=""Excel 8.0;HDR=YES;"""
' Using
Dim connection As OleDbConnection = New OleDbConnection(excelConnectionString)
Try
    Dim command As OleDbCommand = New OleDbCommand("Select ID,Data FROM [Data$]", connection)
    connection.Open()
    ' Using
    Dim dr As DbDataReader = command.ExecuteReader
    Try
        Dim sqlConnectionString As String = WebConfigurationManager.ConnectionStrings("CampaignEnterpriseConnectionString").ConnectionString
        ' Using
        Dim bulkCopy As SqlBulkCopy = New SqlBulkCopy(sqlConnectionString)
End Try
The compilers I have are Eclipse, NetBeans, MS Visual C++ Express Edition and MS Visual C# Express Edition. In MS Visual C++
Thanks for your help.
Regards, Robert.
I'm trying to use Excel in SSIS to import data from a spreadsheet into a staging table. The package runs well from the web server using SSMS, but when I deploy it and try to execute it, I get the error below. My question is whether I have to install the AccessDatabaseEngine driver on the SQL database server or on the web server where I'm executing the SSIS package.
Error: The requested OLE DB provider Microsoft.Jet.OLEDB.4.0 is not registered. If the 64-bit driver is not installed, run the package in 32-bit mode.
I am attempting to run an SSIS package that, among other things, imports a spreadsheet from Excel into a database table. The package runs without any issues within Visual Studio. I have tried executing the package through both the MSDB Run Package UI and dtexec (trying to kick off the package through a stored procedure), and I get two different behaviors.
Using dtexec (the method I really need to use): The package will run successfully...up to the point when the spreadsheet is imported at which time it fails with Description: The AcquireConnection method call to the connection manager "Excel Connection Manager" failed with error code 0xC0202009. Here is the code:
Running it through the MSDB Run Package UI...It will also make it up to the point where the Excel spreadsheet is imported but errors with: The Product level is insufficient for the component "Lookup Station and Account Type: (1894) ...and 1 line with that same error for every single task in that dataflow. Here is the code it runs.
/DTS "MSDBPopulateTRTLStationandtRTLUnitMapping" /SERVER "SERVERNAME" /MAXCONCURRENT " -1 " /CHECKPOINTING OFF /REPORTING V
The machine is running a 32-bit OS, Windows Server 2003 SP1, and the database is 32-bit SQL Server 2005. I found one forum posting that suggested setting the Delay Validation property to True, but that did not fix the issue. I did create the package under my username with a ProtectionLevel of EncryptSensitiveWithUserKey. I don't think it is related to the account, however, because all of the tasks up to the Excel import (several work tables are created) do execute.
I really need to get this working as soon as possible so am open to any solutions someone can present.
I have 265 Word documents which I use with mail merge to create a master document. We are going to auto-generate this master document going forward, so I want to store these 265 Word documents, which range in size from 10 KB to 50 KB. As I mentioned, these are formatted and also contain a merge field/marker, so using a value in the database, my application will take the document, which is stored as a string/blob, and replace the "merge field" with the value from the database.
I can write an application to import each of these documents, but I am not sure what the best way to store them is in order to be able to find and replace the merge marker. Also, I will be joining a number of these together to make the master document. So, for example, I take 10 of these and replace the marker fields in them with the values from the database; I will then push that string out to a PDF/Word document for printing.
Using an Access project front-end to an MSDE database, I'm trying to import a spreadsheet using File->Get External Data->Import. After I select the spreadsheet name, I tell it I want to import into an existing table and select the table name from the list. The name in the list is dbo.Cost_Data. When I tell it to finish, it says the table does not exist. If I do the same steps as the database owner, the table name does not contain the "dbo." on the front of it and it is successful.
Thanks for any help.
Jerry
I need to create a query (SQL 2000) that renders a formatted excel (xml or xls) file for each row that is outputted.
The details, I have a Campaign table that contains information for Auto and Life "Leads" and the data is submitted by telemarketers directly into the database. I need to render a file for each line, and it would be good if It were an Excel XML or XLS file, because that's what we've been using for a while.
Anyone know why cells within a matrix that are formatted as numeric export to Excel with a cell format property of "General"? Cells within a table, however, export with an appropriate format.
I need to take my 'table stats' every week and put them into an Excel spreadsheet so that I can track the changes in my table sizes over time (basically I am watching to see how many rows are in each table). What I was planning on doing was to create a view of my table stats that I could then use DTS to transfer on a weekly basis into my Excel spreadsheet. I have only used DTS a couple of times, and I have never tried it with Excel. Now the problem: every time my DTS job runs, it appends the data to the end of the original columns that were created. Since my database has over 5000 tables, these columns grow quite quickly. What I would like to do is set it up so that every time the job runs it puts the new data into new columns in the same worksheet of my Excel file. Is this possible? Any suggestions?
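The layout being asked for (one row per table, one new column per weekly run) amounts to pivoting the weekly snapshots. Below is a rough Python sketch of that reshaping only; it is not a DTS solution. The names `add_snapshot` and `to_wide_rows` are hypothetical, and it assumes each run is labelled with an ISO date string so that sorting the labels gives chronological columns.

```python
def add_snapshot(history, week, counts):
    """Record one weekly run: counts maps table name -> row count."""
    for table, rows in counts.items():
        history.setdefault(table, {})[week] = rows
    return history


def to_wide_rows(history):
    """Yield a header row, then one row per table with one column per week."""
    weeks = sorted({week for per_table in history.values() for week in per_table})
    yield ["table"] + weeks
    for table in sorted(history):
        # Tables that did not exist in a given week get an empty cell.
        yield [table] + [history[table].get(week, "") for week in weeks]
```

The rows produced by `to_wide_rows` could then be written out as a fresh worksheet or CSV on each run, replacing the previous one instead of appending to it.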
I'm having an issue with an MS SQL DTS package that does a simple export from an MS SQL table to an Excel spreadsheet. I have three of these running, but one is failing. I’m using DTSRun to run all three DTS packages. The only recent change to the failing package was a fix to its first step, which deletes the data in the spreadsheet tab named “Results”. The process works correctly in development (on different servers). The same Active Directory ID is used for all three DTS packages, and all three do the same thing, i.e. export data to an Excel spreadsheet in the same file location but with different names. I’ve Googled this but only came across access issues, which does not make sense since it is writing the other two spreadsheets just fine. Curious.
Error I See:
Running DTS package with passed variables ... DTSRun: Loading...
Can somebody please tell me what I am doing wrong or what I need to do to resolve my issue? I am having problems with one of the columns in an Excel spreadsheet that I am trying to upload into the system. The column contains values with both text and numbers, such as 'ABC123', 'ABC124', '123456', etc. When I upload the sheet into SQL Server 2000 using DTS, the system removes all letters from that column, so the entry 'ABC123', for example, is truncated to '123'. I tried formatting the column in Excel as 'General' and also tried transforming the column to type varchar in SQL Server in the DTS wizard, but still had no luck. The problem is that SQL Server thinks the source column is a float column and therefore drops any letters.
Being new to SSIS, I wish to loop through a series of Excel workbooks and, within each workbook, loop through each sheet. I am aware of the For Each container, but how can each sheet in the workbook be referenced?
I have some XML that I'm passing through a variable into the XML task in SSIS. I also have an xsd file that I'm using and I want to validate (I think) that XML and have the XML task output an excel document. I've got the xsd file all set up in the "second operand" part of the XML task and the XML I'm passing in as a variable and that's in the "input" part of the XML task. My question is are there any tricks to make an excel document with these two things? Is there something I need to do in the xsd file to tell it that I want excel? Below is my XML and xsd:
I've got an Excel Spreadsheet with 5 columns of data which I need to get into an SQL Table. There's 13,000 rows in this Spreadsheet so manually doing it is out of the question.
I'm exporting via DTS to an Excel spreadsheet. However, my database has more than 65000 records. DTS croaks and reports that there are too many records for the spreadsheet.
What do I do? Any way to go around that? These are my daily hit logs that are recorded and they tend to get big.
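An Excel 97-2003 worksheet holds at most 65,536 rows, so one workaround is to spread the export over several sheets or files. The Python sketch below shows only the chunking arithmetic, not a DTS package. The function name `split_for_xls` is hypothetical, and it assumes one header row per sheet.

```python
XLS_MAX_ROWS = 65536  # row limit of an Excel 97-2003 worksheet


def split_for_xls(rows, header_rows=1):
    """Yield slices of rows that each fit on one sheet below the header."""
    per_sheet = XLS_MAX_ROWS - header_rows
    for start in range(0, len(rows), per_sheet):
        yield rows[start:start + per_sheet]
```

Each slice would go to its own worksheet (or its own file per day), with the column headings repeated at the top.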
I am new to SQL and can do queries OK on SQLTalk. I need to know if there is a script to retrieve data and then export to an Excel spreadsheet for internal company use. Is there such a beast and is this the right place to look???
Hi, I'm trying to import an Excel file into SQL Server (using an INSERT statement). I'm creating a DTS package (in Enterprise Manager) and have a VBScript. When I parse it, I get no errors, but when I run the package it says that it ran successfully and yet nothing happens; it doesn't insert into the table, even though I tested the INSERT statement. Can anyone help me? Here's the code:
Function Main()
    on error resume next
    Set objxl = CreateObject("Excel.Application")
    objxl.Visible = False
    Dim xlFile
    xlFile = "C:Datafile.xls"
    Set objWkb = objxl.Workbooks.Open(xlFile)
    '' Connecting to SQL Server
    set cn = server.CreateObject("ADODB.Connection")
    Dim serverName
    serverName = "myserver2"
    strCS = "Provider=SQLOLEDB; Data Source=myserver2;Initial Catalog=mycat; Integrated Security=SSPI"
    cn.ConnectionString = strCS
    On Error Resume Next
    cn.Open
    Set objsht = objWkb.Worksheets.Open("Sheet1")
    Dim client_name, rb, date_rvd, LOB
    Dim sql
    Dim row, sequence
    row = 2
    client_name = Trim(objsht.Cells(row, 2).Value)
    Do While IsNull(client_name) = False And client_name <> ""
        'client_name = Trim(objsht.Cells(row, 2))
        rb = Trim(objsht.Cells(row, 4).value)
        date_rvd = Trim(objsht.Cells(row, 6).value)
        LOB = "WCS"
I'm using DTS to create a procedure. What I want to do is pump data into Excel, but I need to reuse the same Excel file every time, i.e. I need to delete everything in the spreadsheet except the column names and then pump all the data in again. I know how to pump the data in, but I don't know how to simply clear the existing data in the spreadsheet. Can anyone help me? Big thanks!
I am trying to set up a linked server in SQL Server 2005 to link to an excel spreadsheet.
-I am selecting Jet 4.0 as the provider
-Product name is Excel
-Data Source is the path on our network to the Excel file: N:Devon 5403 4.0 Engineering4.01 ProcessLinelistIFCLDT Field.xls
-Provider String is Excel 8.0
-Security | Login not defined is set to "Be made using the login's current security context."
The Excel file is an Excel 2003 spreadsheet. The worksheet is titled Pages
I have a query window open in SQL Server Management Studio and the following is my select statement:
SELECT * FROM DEVON_LINE_LIST...Pages$
I get the following error message:
OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "DEVON_LINE_LIST" returned message "Cannot start your application. The workgroup information file is missing or opened exclusively by another user.".
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "DEVON_LINE_LIST" reported an error. Authentication failed.
Msg 7303, Level 16, State 1, Line 1
Cannot initialize the data source object of OLE DB provider "Microsoft.Jet.OLEDB.4.0" for linked server "DEVON_LINE_LIST".
I get similar error messages no matter which security settings I pick.
Any thoughts as to what I can try to get this to work?
I am new to SQL. I did a little management of SQL 2000 before. I need to export from a table or view to an Excel spreadsheet as a resource for the company's marketing team. Is there an easy, simple way to do it?
I am using SSIS to export data from a table to an Excel spreadsheet. This all seems to work just fine. The user would like a date in cell B1 to say when the spreadsheet was created. Is there a way to do this within SSIS? I was looking at using a .NET script, but it accesses the spreadsheet as a table, so I do not know how to insert data above the headings in row 1. I believe the OLE DB provider uses row 1 as its column names for the table. Maybe I am just going about the whole thing wrong?
Is there a limit to the number of fields that can be set in an OleDB Update Statement?
This works with 6 fields: cmd.CommandText = "Update [Sheet2$A2:AP2] Set F1 = '1', F2 = '35062', F3 = '6', F4 = '620000.0000', F5 = '200000.0000', F6 = '700000.0000'"
This Fails with 7 fields: cmd.CommandText = "Update [Sheet2$A2:AP2] Set F1 = '1', F2 = '35062', F3 = '6', F4 = '620000.0000', F5 = '200000.0000', F6 = '700000.0000', F7 = '123'"
The range should be plenty big with A2:AP2. In the end I'm trying to push 42 fields.
The complete segment is:
Dim ExcelConnection As String = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & ExcelFileName & ";" & _
    "Extended Properties=""Excel 8.0;HDR=NO"""
Dim conn As New System.Data.OleDb.OleDbConnection(ExcelConnection)
Dim cmd As New System.Data.OleDb.OleDbCommand()
Dear group,
I am using SQL Server 2000 on XP SP2. I would like to do the following:
I have a program running on a database server that generates some data which is loaded into the database. This program is used in a web application, invoked by some Java programs and JSP scripts. (I am front-end illiterate.)
The question is, is it possible to write a stored procedure that generates output as an Excel spreadsheet, so that a user could call this procedure and get spreadsheet output on the client side?
Any pointer to a solution would be immensely appreciated.
Thanks,
charia