Importing .txt Files With Separate Definition File
Apr 13, 2006
Hi all,
I receive data via FTP to our webserver nightly as .txt files and .dic (if anybody is familiar with idx realtor websites, that's what this data is).
I've learned recently that I'm not going to be able to use Access to import or link to this data, so I'm trying to get my feet wet with SQL.
I have been practicing importing text files into SQL db, but I notice that the dts imports everything as varchar 8000, and that you can edit that. I've got a .dic file that accompanies every .txt file that contains definitions of each fieldname, fieldtype & length & I was wondering how to import that data as well, without having to manually retype everything.
I would be happy to email these text files to anybody willing to take a look.
I proposed on a new server that we separate Data Files, Log Files, tempDB, Backups, etc. onto separate LUNS on a SAN with High Speed Solid State Drives.I was told that with the new technology with solid state SAN's that it would decrease performance and that it did not work the same way as it did when you had RAID 5's etc.I thought that if things were cared out correctly by a SAN Administrator they would know how to configure for optimal performance.
On the time of installation SQL Server asking me where I wont to locate the DATA files and the PROGRAM files. It’s giving to me choice to put database AND log files on one disk and program files on separate. But what about to separate LOG and DATA files. I have RAID1 especially created on F: drive for LOG files and RAID 5 on E: for DATABASE files. When I have to separate that if not on the time of installation? How I can do that?
I'm making backups of the database by first making a full backup and then differential backups. The differentials are backed up to separate files.
Restore of the full backup works fine, but I can't restore a differential backup. In Management Studio Express, I first do a full backup restore with option NO RECOVERY and then try to restore a differential backup. But this failes with the message:
"This differential backup cannot be restored because the database has not been restored to the correct earlier state."
Is it possible to restore a differential backup that is backed up to a separate file?
We just upgraded from SQL 2000 to 2005. Under 2000, I could export multiple stored procs to separate windows files. Is there a way to do this under 2005 without exporting 1 proc at a time?
I'm trying to do something which I hope can be accomplished relatively simply.
I have a report similar to bank statements let's say. When run, it currently prints out each person's statement into one file, with page breaks sepearating each person's statement. What I need to do, is when the report is run, save each person's report into a seperate file for the purpose of emailing to them later.
I could easily modify my report to just output for one particular person, but I'm not sure if there's a way to "bulk render" all the reports and have them saved to sepearate files.
I should also add that I'm using an MS Access Data Project (ADP) as the front end to my app - connected to a SQL Server 2005 DB. I currently display the reports by embedding a web browser object into an Access form and rendering the report via HTML.
SET @RowCnt = 1 SET @date = CONVERT(CHAR(10),GETDATE(),110) SET @ArchPath = '\D$EDATAWorkFoldersSendSendData' SELECT @TotalRows = count(*) FROM table1 --select @ArchPath
WHILE (@RowCnt <= @TotalRows) BEGIN SELECT @AccountNumber = AccountNumber, @output_filename FROM table1 WHERE Identity_Number = @RowCnt --PRINT @AccountNumber --test SELECT @sql = N'bcp "SELECT h.HeaderText, d.RECORD FROM table2 d INNER JOIN table3 h ON d.HeaderID = h.HeaderID WHERE d.ccountNumber = ''' + @AccountNumber+'''" queryout "'+@ArchPath+ @output_filename + '.txt" -T -c' --PRINT @sql EXEC master..xp_cmdshell @sql SELECT @RowCnt = @RowCnt + 1 END
I have a master table containing details of over 800000 surveys made up of approximately 400 distinct document names and versions. Each document can have as few as 10 questions but as many as 150. Each question represents one row.
My challenge is to create a separate spreadsheet for each of the 400 distinct document names and versions containing all the rows and columns present in the master table. The largest number of rows would be around 150 and therefore each spreadsheet will not be very big.
e.g. in my sample data below, i will need to create individual Excel files named as follows . . . "Document1Version1.xlsx" containing all the column names and 6 rows for the 6 questions relating to Document 1 version 1 "Document1Version2.xlsx" containing all the column names and 8 rows for the 8 questions relating to Document 1 version 2 "Document2Version1.xlsx" containing all the column names and 4 rows for the 4 questions relating to Document 2 version 1
I assume that one of the first things is to create a lookup of the distinct document names and versions assign some variables and then use this lookup to loop through and sequentially filter the master table data ready for creating the individual Excel files.
--CREATE TEMP TABLE FOR EXAMPLE
IF OBJECT_ID('tempdb..#excelTest') IS NOT NULL DROP TABLE #excelTest CREATE TABLE #excelTest ( [rowID] [nvarchar](10) NULL, [docName] [nvarchar](50) NULL,
We have multiple databases on a single instance in an OLTP environment. I have my data files on a separate SAN LUN from my transaction log files (and a few NDFs split out onto additional LUNs). I was wondering if there is a performance benefit to putting each LDF file on its own LUN? Or at least my few busiest LDFs?
We are currently on 2012, but I'm having to put together specs for a 2014 installation and need to answer this question without having an environment in which I can benchmark different setups. I just want to hear whether or not others have done this (why or why not?).
I€™ve created with the help of some great people an SSIS 2005 package which does the follow so far:
1) Takes an incoming txt file. Example txt file: http://www.webfound.net/split.txt
The txt file going from top to bottom is sort of grouped like this Header Row (designated by €˜HD€™) Corresponding Detail Rows for the Header Row €¦.. Next Header Row Corresponding Detail Rows
€¦and so on
http://www.webfound.net/rows.jpg
2) Header Rows are split into one table, Maintenance Detail Rows into another, and Payment Detail Rows into a third table. A uniqueID has been created for each header and it€™s related detail rows to form a PK/FK relationship as there was non prior to the import, only the relation was in order of header / related rows below it when we first started. The reason I split this out is so I can massage it later with stored proc filters, whatever€¦
Now I€™m trying to somehow bring back the data in those table together like it was initially using a query so that I can cut out each of the Header / Detail Row sections into their own txt file. So, if you look at the original txt file, each new header and it€™s related detail rows (example of a cut piece would be http://www.webfound.net/rows.jpg) need to be cut out and put into their own separate txt file.
This is where I€™m stuck. How to create a query to combine it all back into an OLE DB Souce component, then somehow read that souce and split out the sections into their own individual txt files.
The filenames of the txt files will vary and be based on one of the column values already in the header table.
I am testing out a blank database created over two physical files on two separate disks with one table called data which has one column called values nvarchar(max).
I filled the table up with a whole load of data and ran a select * against it. If I run Permon at the same time I can see that the read load has been spread over multiple disks as each of these disks is getting read from in parallel. If I create the same database on a single file and run the same select * again it takes much longer, proving that the read load has been distributed across multiple disks.
Now moving onto writes, this is where the confusion lies. I understand that SQL server fills files evenly until they need growing, after which it will then fill files individually until they are full in a round robin fashion unless you have trace 1117 turned on. What I don't understand is why the writes aren't distributed out whilst it is filling these file groups.
I ran an continual insert into my table with go 1000000 to monitor how the files are being filled up. I monitored where SQL server was physically placing the files as they were being inserted by running the following query:
;WITH CTE AS (SELECT sys.fn_PhysLocFormatter (%%physloc%%) col1, RIGHT(LEFT(sys.fn_PhysLocFormatter (%%physloc%%),2),1) AS [Physical RID], DATAID
[Code] ....
I could see that it would a thousand or so records into file 1, then a thousand or so into file 2, then a thousand or so into file 1 etc etc. In another words it would hit one disk, then another disk, then back to disk one to fill the file evenly. Is there any way to make SQL Server distribute the writes out in parallel so that both disks are writing in tandem?
By the looks of it, multiple disks only scale reads, as with writes only one disk is ever written to at once which is annoying. Any way to harness the write power of multiple disks?
What object do I reference to use SSIS from Excel. I want to generate a flat file definition based on Excel. I have a lot of fields to import and I don't feel like creating them as flat file columns. I have a few tables and I get the source file format from the vendor in an Excel format. What I would like to do is generate a flat file connection in an empty package using VBA.
I would like to know if there is a way to maintain the history of changes to the reports that have been published to the report server?
I know that the report definitions get saved onto ReportServer database. But let's say a user makes a change to the published report and then saves it back to the server. And that the latest change was incorrect and I have to revert back to the previous version of the published report. Is there a way to do that? Does the report server maintain a history of previous versions.
There is a history for each report and I think that corresponds to the history of report executions (output data). But I am talking about the history of actual report definition.
Hello,I need to import a bunch of .csv files. The problem I am having is the"non data" information in the files creating bogus rows and columndefinitions. Here is an example of the csv file.CBOT - End-of-Day Futures Bulk Download 2001.2 Year U.S. Treasury Notes FuturesDateSymbolMonth CodeYear CodeOpen20010103ZTH2001102.0937520010104ZTH2001102.0312520010105ZTH2001102.28125In this case, there are bogues rows created with the text at thebeginning of the file, and also the column names get placed into a rowas well. My question is; how do you import the file and strip out the"non-data" data? so that only the actual data gets inserted into the db?Thanks,TGru*** Sent via Developersdex http://www.developersdex.com ***Don't just participate in USENET...get rewarded for it!
I have the original copies of the .MDF and .LBF of another database that has a bunch of data in that I'm trying to important into MSSQL Express for a project I'm working on. I no longer have a running version of this database just the .MDF and .LBF files and I have no clue what version of MSSQL they were originally created with. If anyone knows a way to import this database please help.
I am trying to import a CSV file into an SQL Server table with the OleDbDataReader and SqlBulkCopy objects, like this: using (OleDbConnection dconn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\mystuff\;Extended Properties="text;HDR=No;FMT=Delimited"")) { using (OleDbCommand dcmd = new OleDbCommand("select * from mytable.csv", dconn)) { try { dconn.Open();
using (OleDbDataReader dreader = dcmd.ExecuteReader()) { try {
using (SqlConnection dconn2 = new SqlConnection(@"data source=MyDBServer;initial catalog=MyDB;user id=mydbid;password=mydbpwd")) { using (SqlBulkCopy bc = new SqlBulkCopy(dconn2)) { try { dconn2.Open(); bc.DestinationTableName = "dbo.mytable"; bc.WriteToServer(dreader); } finally { dconn2.Close(); } } } } finally { dreader.Close(); }
} } finally { dconn.Close(); } } } A couple of the columns for the destination table use a bit datatype. The CSV files uses the strings "1" and "0" to represent these.When I run this code, it throws this exception:Unhandled Exception: System.InvalidOperationException: The given value of type String from the data source cannot be converted to type bit of the specified target column. ---> System.FormatException: Failed to convert parameter value from a String to a Boolean. ---> System.FormatException: String was not recognized asa valid Boolean. at System.Boolean.Parse(String value) at System.String.System.IConvertible.ToBoolean(IFormatProvider provider) at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider) at System.Data.SqlClient.SqlParameter.CoerceValue(Object value, MetaType destinationType) --- End of inner exception stack trace --- at System.Data.SqlClient.SqlParameter.CoerceValue(Object value, MetaType destinationType) at System.Data.SqlClient.SqlBulkCopy.ConvertValue(Object value, _SqlMetaDatametadata) --- End of inner exception stack trace --- at System.Data.SqlClient.SqlBulkCopy.ConvertValue(Object value, _SqlMetaDatametadata) at System.Data.SqlClient.SqlBulkCopy.WriteToServerInternal() at System.Data.SqlClient.SqlBulkCopy.WriteRowSourceToServer(Int32 columnCount) at System.Data.SqlClient.SqlBulkCopy.WriteToServer(IDataReader reader) at MyClass.Main()It appears not to accept "1" and "0" as valid strings to convert to booleans. The System.Convert.ToBoolean method appears to work the same way. Is there any way to change this behavior? I discovered if you change the "1" to "true" and "0" to "false" in the CSV file it will accept them.
Hi, I got one more problem with importing csv files using .net. The problem is that the csv file contains double-quotation marks (""). For example, the record looks like: ...,Bearing Double "D" Flange,... And the result is: ... | Bearing Double | null (all following columns are null) The code is as following: string strCsvConn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=;Extended Properties='text;HDR=Yes;FMT=Delimited(,)';"; using (OleDbConnection cn = new OleDbConnection(strCsvConn)) { string strSQL = "SELECT * FROM " + strFileName; OleDbCommand cmd = new OleDbCommand(strSQL, cn); cn.Open(); using (OleDbDataReader dr = cmd.ExecuteReader()) {while (dr.Read()) { string str = Convert.ToString(dr[8]); } // Bulk Copy to SQL Server //using (SqlBulkCopy bulkCopy = new SqlBulkCopy(strSqlConn)) //{ // bulkCopy.DestinationTableName = strSqlTable; // bulkCopy.WriteToServer(dr); //} } } Any idea is highly appreciated. shz
I want to import multiple text files into a single table. I know I have to use BCP or DTS. But, I want import all files at once, instead of one at a time. And the file names are in sequence, viz. file1, file2, file3 etc. Can anybody tell me, How I can achieve this.
I have to import 18000 text files into a sql database. Each file contains 10 fields and around 5000 records. I am currently doing this with DTS.
What I am wondering is this: Is DTS the most efficient i.e. quickest way to import all this data. Bearing in mind there is about 90 million records to import in all.
I would appreciate the benefit of somebody elses experience when dealing with this type of thing.
I've been trying to import a TXT file into an SQL database and I'm having trouble making it work correctly. It is a ASCII text file with over 100,000 records. The fields vary by the number of characters. This can be 2 characters up to 40 (STATE would be 2 characters, CITY is 32 characters, etc.)
I can import the file with DTS. I go in and select exactly where I want the field breaks to be. Then it imports everything as Characters with column headers of Col001, Col002, Col003, etc. My problem is that I don't want everything as Characters or Col001 etc. I want different column names and columns of data to be INT, NUMERIC(x,x), etc. instead of characters every time. If I change these values to anything than the default in DTS it won't import the data correctly.
Also, I have an SQL script that I wrote for a table where I can create the field lengths, data type etc. the way I want it to look, FWIW. This seems to be going nowhere fast.
What am I doing wrong? Should I be using something else than DTS?
Hi i've 6 files with the same name part but all have a different ext. The 17 always changes to the current week number
xbouns.A17 xbouns.B17 xbouns.C17
I want to import these files into a database table. So I've create a foreach loop and select the foreach file enumerater but am not sure if this is the right way to go about it some help woth be great thanks.
I have over three hundred text files that I need to import to SQLServer.Each is in the exact same format.I want to import tham as seperate tables.Is there any way to do it in one process?Regards,Ciarán
I am a bit new to SQL Server but not to DBA or programming per se. I am having difficulties getting either an Excel or Text flat file to import properly.
I guess it would be best to ask, using either SSIS or BULK INSERT, what options need to be entered for a typical excel flat file?
I have a package with a corresponding configuration file.
If I open Sql server management studio and i connect to SSIS and right click to import a package, and then select to store the package in SQL Server (not on the file system)....
What happens with the configuration file?
Does the import take the values from the configuration file and place them in the package which then is stored in SQL Server?
Or do I need to put the configuration file someplace on the SQL Server where the package is imported so it can access it when it runs?
I'm a bit confused about what goes on there?
For example, I tried using the build command and then running the manifest file to import using the wizard and when it does that it copies the configuration files to a default location within the c:program filesmicrosoft sql server directory.
I need to write a genric CSV importer. one set of CSVs define the formats of all teh other CSVs. The format of the second set of CSVs is not know at the time of creating this project/SSIS package.
Imagine having two sets of CSV files.
Each file in the first set of CSVs define the format of files in the second set of CSVs The first set always have the same format. Each row of the CSV file defines a field with a Name and Type. The first two columns of each row of in the first set of CSVs have a FormatName and index. So a simple file may have:
The second set of CSVs contains records that have to comply with the format define in the first set. So in this aprticular case, a CSV file in the second set may have records such as:
The problem I have is creating a generic SSIS package for this. The first task, loading the first set of CSV file is fairly simple. The CSV format is known. But the second task is a bit trickier.
Assuming I have SQL tables to load the data. One 'Fields' record for each row in the CSVs from the fiirst set. One 'Rows' record for each row in the CSVs from the second set, One 'Values' record for each value in each row in the CSVs from the second set, Something like:
TABLE Fields (
FieldID int IDENTITY, FormatName varchar(100), Position int, ColumnName varchar(100), ColumnType varchar(20) )
- Dynamically create a SSIS package based on the content of each CSV fiel of the first set and execute that package for each file in the second set, selecting the correct package to execute (all records within a CSV file belong to the same format). - Write a single SSIS package that iterates through the rows of the second CSVs, does a lookup for each value of each row to find the field name and make an insert into the DB - Other SSIS method? - Don't use SSIS to parse the CSV and call a custom C# task? - Don't use SSIS at all ?
Is there a way of uploading column definitions from the Flat File Connection Manager into a SQL Server table definition. Since I have over a dozen data sources to process each with around 200 columns and of course like many BI techies I have little immediate influence over the structure of these flat files. I just know that these data sources are business critical.
Judging by looking at similar threads I can't be the only one who would greatly benefit from being able to upload column definitions from the Flat File Connection Manager into a SQL Server table definition as opposed to doing this manually.
I have a folder on an SFTP server in our internal network which gets flat files periodically (almost every minute) from another server. The file structure is like this: Flat File Format•Line 1—List of field names comma separated •Line 2—List of field type comma separated •Line 3—Data comma separated •Line 4—Data comma separated and so on... All the files will be like this without the .csv extention. There are a lot of fields and creating a column for each field manually will be a pain. Heres what I am trying to do: 1. Create a process that runs periodically on the server and imports data from files into the SQL server table. 2. Use the information in line 1 and 2 to create the schema of the table and then ignore those lines in every import (the data should be appended to the existing data). What are my options? Thank you all,Bullpit
Hi, Can anyone help? Need to upload a text file to a sql database but keep getting errors. I'm creating a page that will allow users to to bulk import and update to a MsSql database. The users provide a text file every so often with new/update information. So i want to use a DTS package to transform the infomation, and create a table in the database, then check against existing/non existing records, if the record exist, update it, if not insert it. I'm using Visual Studio.Net, ASP.Net and coding in VB.Net.
Anyone know where i can find documentation/code regarding the above? I will be greatful for any help.