Hello, I wanted to share my current process flow with the forum to see whether I am close or have more work to do. The objective is to import the data as fast as possible without losing query responsiveness for searches on the web side. Any response will be greatly appreciated.
Current Process
I receive multiple product inventory lists from multiple vendors. These inventory lists arrive in many different formats, such as .xls, .txt, .csv and .dbf. My server converts each file to raw .csv. The table is very large and will consist of millions of rows. Inventory numbers are alphanumeric.
The inventory lists are NOT all in the same layout. By this I mean that the header names in a user's inventory list differ from the column names in our database table. For example, a user may have an Excel document with the header name "Amt" while our database column is named "Amount". To automate the import, I create a single mapping file for each user. This mapping file relates the field headers in the user's inventory file to the columns of the database table.
Currently I have a process converting all files to .csv format. Every 15 minutes a routine runs that converts the .csv to XML and then performs the Bulk Insert import routine. The process deletes the user's entire inventory, then imports into a "tempinventory" table, which then updates the inventory table. During these 15 minutes we may get 20 inventory lists to import, which could amount to half a million records. I have this inventory table indexed on the primary key, which is the inventory number. The inventory number is the search criterion used on the web side. The import routine into SQL works quickly ONLY if the Full-Text Index is turned OFF. I assume I need Full-Text Indexing on so that queries against the full inventory lists get a fast response on the web side.
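One possible alternative to deleting the user's entire inventory before each load is to apply only the differences between the freshly loaded "tempinventory" rows and the live table. The sketch below is an assumption-laden illustration, not the poster's routine: the temp tables, the VendorId column and the sample rows are all hypothetical stand-ins.

```sql
-- Hypothetical stand-ins for the live inventory table and "tempinventory".
CREATE TABLE #Inventory (
    VendorId        int         NOT NULL,
    InventoryNumber varchar(50) NOT NULL,
    Amount          int         NOT NULL,
    PRIMARY KEY (VendorId, InventoryNumber)
);
CREATE TABLE #TempInventory (
    VendorId        int         NOT NULL,
    InventoryNumber varchar(50) NOT NULL,
    Amount          int         NOT NULL,
    PRIMARY KEY (VendorId, InventoryNumber)
);

-- Live rows for vendor 1.
INSERT INTO #Inventory VALUES (1, 'A100', 5);
INSERT INTO #Inventory VALUES (1, 'A200', 7);
INSERT INTO #Inventory VALUES (1, 'A300', 2);

-- Newly loaded list: A100 unchanged, A200 changed, A300 dropped, A400 new.
INSERT INTO #TempInventory VALUES (1, 'A100', 5);
INSERT INTO #TempInventory VALUES (1, 'A200', 9);
INSERT INTO #TempInventory VALUES (1, 'A400', 1);

DECLARE @VendorId int;
SET @VendorId = 1;

-- 1. Update only the rows whose amount actually changed.
UPDATE i
SET    i.Amount = t.Amount
FROM   #Inventory AS i
JOIN   #TempInventory AS t
       ON t.VendorId = i.VendorId AND t.InventoryNumber = i.InventoryNumber
WHERE  i.Amount <> t.Amount;

-- 2. Insert rows that are new in this load.
INSERT INTO #Inventory (VendorId, InventoryNumber, Amount)
SELECT t.VendorId, t.InventoryNumber, t.Amount
FROM   #TempInventory AS t
WHERE  NOT EXISTS (SELECT 1 FROM #Inventory AS i
                   WHERE i.VendorId = t.VendorId
                     AND i.InventoryNumber = t.InventoryNumber);

-- 3. Remove rows the vendor no longer lists.
DELETE i
FROM   #Inventory AS i
WHERE  i.VendorId = @VendorId
  AND  NOT EXISTS (SELECT 1 FROM #TempInventory AS t
                   WHERE t.VendorId = i.VendorId
                     AND t.InventoryNumber = i.InventoryNumber);

SELECT * FROM #Inventory ORDER BY InventoryNumber;
```

With this approach an update that changes 2 of 40,000 rows touches 2 rows in the live table, which should also mean less work for any index that has to track changes.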
Assumed Issues
Right now, if I were to turn off the indexing for each import, we would have slow queries against the database on the web side.
Having one table with millions of rows that is constantly updated and queried at the same time is not efficient.
Assumed Corrections
Right now it seems that partitioning the table into 10 numeric partitions, one for each digit, would benefit both the import routine and the web-side search routine. I would then partition the alphabetic range in one-letter increments, giving 26 partitions, one for each letter, for a total of 36 partitions.
If I have the partitions separated, it will be easier to update each separate index as well as perform maintenance on it; I assume this is correct.
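For illustration, the 36-partition idea could be expressed roughly as below. This is a sketch under assumptions: the function, scheme and table names are made up, everything is placed on the PRIMARY filegroup for simplicity, the database collation is assumed to sort digits before letters, and the edition must support table partitioning.

```sql
-- 35 boundary values with RANGE RIGHT give 36 partitions.
-- Anything sorting below '1' (including values starting with '0') lands in partition 1.
CREATE PARTITION FUNCTION pfInventoryLeadChar (varchar(50))
AS RANGE RIGHT FOR VALUES
    ('1','2','3','4','5','6','7','8','9',
     'A','B','C','D','E','F','G','H','I','J','K','L','M',
     'N','O','P','Q','R','S','T','U','V','W','X','Y','Z');
GO

-- Hypothetical scheme: every partition on PRIMARY.
CREATE PARTITION SCHEME psInventoryLeadChar
AS PARTITION pfInventoryLeadChar ALL TO ([PRIMARY]);
GO

-- Hypothetical table partitioned on the inventory number itself.
CREATE TABLE dbo.InventoryPartitioned (
    InventoryNumber varchar(50) NOT NULL PRIMARY KEY,
    Amount          int         NOT NULL
) ON psInventoryLeadChar (InventoryNumber);
GO

-- Check which partition a given inventory number maps to.
SELECT $PARTITION.pfInventoryLeadChar('0451') AS partition_for_0451,
       $PARTITION.pfInventoryLeadChar('A17')  AS partition_for_A17;
```

Whether this helps searches is a separate question: an exact lookup on the inventory number is already served by the primary key index, so the main gain from partitioning is more likely on the load and maintenance side.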
Future Plans
I am upgrading the web-side application to .NET 3.5 so we can take advantage of XLINQ and LINQ, so that our searches from the web side are faster and more efficient.
I am also looking into building a Silverlight application that the user can install locally. It would take their live database file and send updates to my server, so they would not need to send in their inventory file at all and the inventory lists would be live. This would remove the need to delete the user's full listing in order to make a complete update. Sometimes an update may involve changing the amount on only 2 of 40,000 rows.
I was also looking into db4o (http://www.db4o.com/) to see if it would be beneficial as well. Has anyone worked with it before in a similar manner?
Questions
I would like to make this process much more efficient, from the import routine to the search routine. Is setting up the partitions as discussed a sound plan for both routines?
Is BULK INSERT using XML to SQL the most efficient way of importing the data into SQL?
How would I handle the full-text indexing to allow fast import routines without slowing down the web-side searches?
After importing new data, do I need to "update" the index as well?
What is a good set of "preventative maintenance" standards I should follow when dealing with this many table updates, as well as for the catalogs and table data?
I know there are benefits to using LINQ when querying the database from the web side, but are there any other benefits that would fit into this current process?
As for the Silverlight application, would it be beneficial for both the user and me to have the application poll their database file for changes and send only the updated values to the server via XML, to be applied by SQL?
I am unsure what the best way is to make this process as easy and automated as possible while giving the user the fastest possible experience when searching from the web side.
Is this a smarter idea, so I can track just the changes made by the user to their inventory list instead of re-importing the same redundant data they already have?
Would implementing something like db4o (http://www.db4o.com/) be beneficial for this process?
Please let me know if I am way off on this process or if there are some benefits that I am not using. I have been doing a lot of research and this is what I have come up with, so I wanted to ask the community what they think, as many heads are better than one. Please feel free to rip the process apart too; I welcome constructive criticism.
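On the question of updating the full-text index after an import, one option worth testing is to switch the index to manual change tracking, so that the bulk load does not trigger index maintenance, and then ask for an update population once the load finishes. The fragment below is a sketch, assuming a table named dbo.Inventory (hypothetical) that already has a full-text index.

```sql
-- Assumes dbo.Inventory (hypothetical name) already has a full-text index.
-- With MANUAL change tracking, changes are recorded but not pushed
-- into the full-text index until a population is requested.
ALTER FULLTEXT INDEX ON dbo.Inventory SET CHANGE_TRACKING MANUAL;
GO

-- ... run the 15-minute bulk import here ...

-- Apply the tracked changes to the full-text index after the import.
ALTER FULLTEXT INDEX ON dbo.Inventory START UPDATE POPULATION;
GO
```

It is also worth checking whether full-text indexing is needed at all: if the web side only looks up rows by inventory number, that is an ordinary index seek on the primary key rather than a full-text query.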
I am having some issues bulk inserting from a flat file (CSV) into the database. I have also tried this using the Import and Export Wizard and get the following error:
I don't understand what the issue is. The table that I have created looks like this:
CREATE TABLE IderaPatchAnalyzer ( IP_Adresse varchar(64) NOT NULL, Release_ varchar(50) NOT NULL, Level_ varchar(50)NOT NULL, Edition_ varchar(50) NOT NULL,
[Code] .....
I have changed the OutputColumnWidth of Ip_Adresse to 64. The cell values are nowhere near 50 characters long, but I wanted to be sure that is not the cause. When I try to do the same in my SSIS project, I also get an error. I do get a warning: Truncation may occur due to inserting data from data flow column """"KB Available""" with a length o..... That column holds at most 5 characters; the values are "yes" and "no". """"KB Available""" is the column name in the flat file (CSV), and I have checked "Column names in the first data row".
I have used the following guide for my SSIS project:
I have an SSIS package doing a bulk insert from a file. Then later on I'm trying to delete that file (in a file delete task), but I'm getting an error: [File System Task] Error: An error occurred with the following error message: "The process cannot access the file 'xyz' because it is being used by another process.". I'm wondering if there isn't some way to 'tweak' the bulk insert syntax so that it doesn't lock the file?
I am using SQL Server Data Tools for Visual Studio 2012. I have a very simple SSIS package with a Data Flow task that exports from an OLE DB Source to a tab-delimited unicode Flat File Destination and a Bulk Insert task that loads from the file. Both the Flat File Destination and Bulk Import are using the same code page. The Bulk Insert task is using the wide char format to read from the file. The process works fine with nvarchar and int columns, but when I add a unique identifier column it fails with "type mismatch or invalid character for the specified code page".
The full version of this reporting table will have about 12 million rows for each of three years. Prior years will be in separate partitions and frozen, but the current year will be reloaded each night by source_key, probably in parallel.
I am trying to do this with a computed column but I can't slide the partition back into the main table due to an apparent problem with the Check constraint. I have tried everything I can think of and still can't get it to work.
I hope I am missing something simple. Anyone know why this does not work or how to fix it?
ALTER TABLE SWITCH statement failed. Check constraints or partition function of source table 'db_template.dbo.foo_year_source_partition_test_stage' allows values that are not allowed by check constraints or partition function on target table 'db_template.dbo.foo_year_source_partition_test'.
------------------------------------------------------------ CREATE PARTITION SCHEME zzYearSourcePScheme AS PARTITION zzYearSourceRangePFN TO ( [fg_template_0], [fg_template_0], [fg_template_0], [fg_template_0], [fg_template_0] ) go
CREATE TABLE [dbo].[foo_year_source_partition_test]( [detail_date] [datetime] NULL, [source_key] [int] NULL, [year_source] AS ((CONVERT([char](2),right(datepart(year,[detail_date]),2),0)+'-') +right('0000'+CONVERT([varchar](3),[source_key],0),(3))) PERSISTED, [ys_id] int identity (1,1) ) ON zzYearSourcePScheme(year_source) go
create unique clustered index ix_year_source_ys_id on [foo_year_source_partition_test] ([year_source], [ys_id]) ON zzYearSourcePScheme(year_source) go
insert into [foo_year_source_partition_test] values('20060131',2) insert into [foo_year_source_partition_test] values('20060131',3) insert into [foo_year_source_partition_test] values('20060131',4)
SELECT *, $PARTITION.zzYearSourceRangePFN(year_source) AS Partition from [foo_year_source_partition_test] order by detail_date go
CREATE TABLE [dbo].[foo_year_source_partition_test_stage]( [detail_date] [datetime] NULL, [source_key] [int] NULL, [year_source] AS ((CONVERT([char](2),right(datepart(year,[detail_date]),2),0)+'-') +right('0000'+CONVERT([varchar](3),[source_key],0),(3))) PERSISTED, [ys_id] int identity (1,1) ) --on same one ON YearSourcePScheme(year_source)
create unique clustered index ix_year_source_ys_id on [foo_year_source_partition_test_stage] ([year_source], [ys_id]) --ON YearSourcePScheme(year_source)
ALTER TABLE db_template.dbo.foo_year_source_partition_test SWITCH PARTITION 3 to db_template.dbo.[foo_year_source_partition_test_stage]
ALTER TABLE db_template.dbo.foo_year_source_partition_test_stage WITH CHECK ADD CONSTRAINT CK_foo_year_source_partition_test_stage_YearSource CHECK ( [year_source] = '06-003' )
insert into foo_year_source_partition_test_stage values('20060202',3) insert into foo_year_source_partition_test_stage values('20060303',3) insert into foo_year_source_partition_test_stage values('20060404',3) insert into foo_year_source_partition_test_stage values('20060505',3)
ALTER TABLE db_template.dbo.foo_year_source_partition_test_stage SWITCH TO db_template.dbo.foo_year_source_partition_test PARTITION 3
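One thing that may be worth checking in the script above: the computed partitioning column year_source is nullable (both detail_date and source_key allow NULL), and a CHECK constraint passes when its expression evaluates to unknown, so CHECK ([year_source] = '06-003') still admits NULL. NULLs belong to the leftmost partition, not partition 3, which would explain the SWITCH error. The sketch below replaces the staging constraint with one that also excludes NULL. It assumes, since the partition function definition is not shown, that '06-003' falls inside partition 3, and that the staging table sits on the same filegroup as that partition.

```sql
-- Drop the original staging constraint, which still allows NULL.
ALTER TABLE db_template.dbo.foo_year_source_partition_test_stage
    DROP CONSTRAINT CK_foo_year_source_partition_test_stage_YearSource;
GO

-- Recreate it so that NULL is explicitly ruled out.
ALTER TABLE db_template.dbo.foo_year_source_partition_test_stage WITH CHECK
    ADD CONSTRAINT CK_foo_year_source_partition_test_stage_YearSource
    CHECK ([year_source] IS NOT NULL AND [year_source] = '06-003');
GO

-- Retry the switch back into the partitioned table.
ALTER TABLE db_template.dbo.foo_year_source_partition_test_stage
    SWITCH TO db_template.dbo.foo_year_source_partition_test PARTITION 3;
GO
```

If the error persists, other differences between the two tables (filegroup, column types, index definitions) would be the next things to compare.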
I'm just learning SSIS and I've hit my first bump. I am doing a bulk import from a tab-delimited text file into an empty SQL table that has an identity column defined. How do I tell the Bulk Insert task to skip that column when inserting from the text file? If I remove the identity column it imports the data fine, but I want to have the identity column in the table too.
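One workaround that is often suggested for this situation is to bulk insert into a view that exposes only the columns present in the file, so the identity column is left for SQL Server to populate. The sketch below uses hypothetical table, view, column and file names; the file path is resolved on the SQL Server machine.

```sql
-- Hypothetical target table with an identity column the file does not contain.
CREATE TABLE dbo.ImportTarget (
    RowId int IDENTITY(1,1) PRIMARY KEY,
    Col1  varchar(50) NULL,
    Col2  varchar(50) NULL
);
GO

-- View exposing only the columns that exist in the text file.
CREATE VIEW dbo.ImportTarget_Load
AS
SELECT Col1, Col2
FROM dbo.ImportTarget;
GO

-- Load through the view; RowId is generated automatically.
BULK INSERT dbo.ImportTarget_Load
FROM 'C:\import\data.txt'          -- hypothetical path
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n');
GO
```

In an SSIS Bulk Insert task the same idea would mean pointing the destination at the view instead of the table; a format file that maps the file fields to the non-identity columns is the other common route.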
Ok being really new at using SQL server, I have a simple question.
I am trying to use the "Bulk Insert" command to dump a zip code list into my database. Here is my problem.
I found details on the command at http://sqlserver2000.databases.aspfaq.com/how-do-i-load-text-or-csv-file-data-into-sql-server.html but when I create a procedure in the stored procedures section of my database, I can't figure out how to get it to run.
I created the table, created the stored procedure, and tried to write some code in my web page to run it. But it is not executing.
Hi guys,
Consider this scenario. I have two tables.
Table1 - Users. Fields are id, name, joindate, designation, status
Table2 - People. Fields are id, name, status
The Users table has data in it, say 100 records. I have to copy it into the People table where id=id and name=name and status=status. Is there any way?
Regards, Naveen
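If the goal is simply to copy the three matching columns across, a single INSERT ... SELECT should do it. The sketch below uses temp tables with guessed column types as stand-ins for Users and People, and skips rows that are already present.

```sql
-- Temp tables stand in for Users and People; column types are assumptions.
CREATE TABLE #Users  (id int, name varchar(50), joindate datetime,
                      designation varchar(50), status varchar(20));
CREATE TABLE #People (id int, name varchar(50), status varchar(20));

INSERT INTO #Users VALUES (1, 'Naveen', '20080101', 'Developer', 'Active');
INSERT INTO #Users VALUES (2, 'Asha',   '20080215', 'Tester',    'Inactive');

-- Copy id, name and status for every user not already in People.
INSERT INTO #People (id, name, status)
SELECT u.id, u.name, u.status
FROM   #Users AS u
WHERE  NOT EXISTS (SELECT 1 FROM #People AS p WHERE p.id = u.id);

SELECT * FROM #People;
```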
I have a table that contains comma delimited text, and I am trying to convert this into another table
eg my target table looks like
Produce|Price|QuantityPerPrice
and my input table contains strings such as
"apples","7.5","10"
"pears","10","8"
"oranges","8","6"
Does anyone have any ideas on how to do this? I am after a solution that does them all at once: I am currently using charindex() to find each column, one at a time, but given the speed of BULK INSERT I would much rather do it as a table. The one solution that I don't want to resort to is to export the table with delimited strings to a data file, then BULK INSERT it...
This is in the context of an ETL process - loading large blocks of data.
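A set-based way to split such strings without exporting them is to turn each row into a small XML fragment and read the fields by position. This is only a sketch: the table and column names are hypothetical, it needs SQL Server 2005 or later, and it assumes the values contain no XML-special characters (such as & or <) and no embedded "," sequence.

```sql
-- Hypothetical input table holding the quoted, comma-delimited strings.
CREATE TABLE #RawLines (line varchar(200) NOT NULL);
INSERT INTO #RawLines VALUES ('"apples","7.5","10"');
INSERT INTO #RawLines VALUES ('"pears","10","8"');
INSERT INTO #RawLines VALUES ('"oranges","8","6"');

-- Hypothetical target table matching Produce|Price|QuantityPerPrice.
CREATE TABLE #Produce (
    Produce          varchar(50),
    Price            decimal(10,2),
    QuantityPerPrice int
);

-- Strip the outer quotes, replace each "," separator with element tags,
-- then read the three fields from the resulting XML in one pass.
INSERT INTO #Produce (Produce, Price, QuantityPerPrice)
SELECT x.doc.value('(/r/c)[1]', 'varchar(50)'),
       x.doc.value('(/r/c)[2]', 'decimal(10,2)'),
       x.doc.value('(/r/c)[3]', 'int')
FROM   #RawLines AS r
CROSS APPLY (
    SELECT CAST('<r><c>'
                + REPLACE(SUBSTRING(r.line, 2, LEN(r.line) - 2), '","', '</c><c>')
                + '</c></r>' AS xml) AS doc
) AS x;

SELECT * FROM #Produce;
```

For very large inputs the XML conversion has its own cost, so it would be worth timing this against the existing charindex() approach.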
I bulk insert a bunch of rows (could be millions, more likely 10's of thousands) into a table, perform some queries and then I need to append those rows into a second table and truncate the first table. From an efficiency standpoint, switching the load table into a partitioning scheme would be best, but I can't use partitioned tables for reasons not relevant here.
So, what's going to be the most efficient solution? I can easily do a simple insert into/select from to copy the rows, but that will be fully logged, and I'd really like a minimally logged solution. Looking at the docs for bulk insert/bulk copy, I can't see a solution that will copy data from one table to another, but I'm suspecting that I'm overlooking something. I could re-load the rows from the client using a second bulk copy, but that seems like a terrible waste (although the client is on the same box, and always will be, so it's not as bad as it might be).
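As far as I know, on SQL Server 2008 and later an INSERT ... SELECT can be minimally logged when the target is a heap, the insert takes a table lock via the TABLOCK hint, and the database uses the simple or bulk-logged recovery model. The sketch below shows the shape of that statement; the temp tables and columns are hypothetical, and whether minimal logging actually applies depends on the version, recovery model and indexes on the real target.

```sql
-- Hypothetical load and history tables (both heaps).
CREATE TABLE #LoadTable    (Id int NOT NULL, Payload varchar(100) NULL);
CREATE TABLE #HistoryTable (Id int NOT NULL, Payload varchar(100) NULL);

INSERT INTO #LoadTable VALUES (1, 'first row');
INSERT INTO #LoadTable VALUES (2, 'second row');

BEGIN TRANSACTION;

-- TABLOCK on the target is one of the prerequisites for minimal logging.
INSERT INTO #HistoryTable WITH (TABLOCK) (Id, Payload)
SELECT Id, Payload
FROM   #LoadTable;

-- Empty the load table only once the copy has succeeded.
TRUNCATE TABLE #LoadTable;

COMMIT TRANSACTION;

SELECT COUNT(*) AS rows_in_history FROM #HistoryTable;
```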
I'm trying to import data from a flat file into a table and have a few problems.
1. The field delimiter is ',' (comma). If ',' occurs in a quoted string it is still treated as a field delimiter. Is this a bug?
2. The table has a datetime field that can be null, but bulk insert reports an error if the flat file contains null or ''. It works only when a real date is specified.
Table:
create table AttachmentList (Code integer not null, ClassID integer null, Description varchar(200) null, ValidUntil datetime null, constraint PK_ATTACHMENTLIST primary key (Code))
Flat file:
1,13,'Naputak, CU 261098', ''
Thanks in advance
Davor
I have a web page that prompts a user to select a csv file. Using a Bulk Insert the data is loaded into a SQL Server 2005 table.
I have been using Bulk Insert with SQL Server 2000 with no problems, but with 2005 I am getting the error "You do not have permission to use the bulk load statement".
My web.config file has the following connection string: [code] <add key="connectionString" value="Server=(local);Database=BroadCastOne;trusted_connection=true" /> [/code]
I've given bulkAdmin role to the ASPNET user. It's still not working. What am I doing wrong?
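With trusted_connection=true the login SQL Server sees is whatever Windows account the web application actually runs under, which is not necessarily the ASPNET account. A quick way to check is to run the query below through the web application's own connection; it only reads information and changes nothing.

```sql
-- Run this over the same connection string the web page uses.
SELECT SUSER_SNAME()                  AS connected_login,   -- who SQL Server thinks you are
       IS_SRVROLEMEMBER('bulkadmin')  AS is_bulkadmin,      -- 1 = member of bulkadmin
       HAS_PERMS_BY_NAME(NULL, NULL, 'ADMINISTER BULK OPERATIONS')
                                      AS can_bulk_load;     -- 1 = has the server permission
```

If connected_login turns out to be a different account (for example a service account), that is the login that needs bulkadmin membership, along with INSERT permission on the target table.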
My current project is creating a social network for the university I work for. One of the features allows members of a group to send a message to all other group members. Currently, I run a foreach loop over each of the group members, and run a separate INSERT statement to insert a message into my messages table. Once the group has several hundred members, everybody starts getting timeout errors. What is the best way to do this? Here are two suggestions I've received:
Construct one SQL statement that would contain multiple INSERT statements. It would be a large statement like:
INSERT into [messages] (from_user, to_user, subject, body) VALUES (@from_user, @to_user, @subject, @body);
INSERT into [messages] (from_user, to_user, subject, body) VALUES (@from_user2, @to_user2, @subject2, @body2);
INSERT into [messages] (from_user, to_user, subject, body) VALUES (@from_user3, @to_user3, @subject3, @body3);
etc...
Or, do the foreach loop in a stored procedure. I know the pros and cons of sprocs versus dynamic SQL is a sticky subject, and, personally, I'd prefer to keep my logic in the C# code-behind file.
What is the best way to do this in an efficient manner? I'd be happy to share some code, if that would help. Thanks for your input!
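A third option, if group membership is already stored in the database, is to let one INSERT ... SELECT generate a message row per member, so the application issues a single parameterized statement regardless of group size. The sketch below assumes a group_members table with group_id and user_id columns, which is hypothetical; the messages columns come from the post.

```sql
-- Hypothetical membership table; #messages mirrors the columns from the post.
CREATE TABLE #group_members (group_id int NOT NULL, user_id int NOT NULL);
CREATE TABLE #messages (from_user int, to_user int,
                        subject varchar(200), body varchar(4000));

INSERT INTO #group_members VALUES (10, 1);
INSERT INTO #group_members VALUES (10, 2);
INSERT INTO #group_members VALUES (10, 3);

-- In the application these would be the command's parameters.
DECLARE @from_user int, @group_id int, @subject varchar(200), @body varchar(4000);
SET @from_user = 1;
SET @group_id  = 10;
SET @subject   = 'Meeting moved';
SET @body      = 'We now meet on Thursday.';

-- One statement creates a message for every other member of the group.
INSERT INTO #messages (from_user, to_user, subject, body)
SELECT @from_user, gm.user_id, @subject, @body
FROM   #group_members AS gm
WHERE  gm.group_id = @group_id
  AND  gm.user_id <> @from_user;

SELECT * FROM #messages;
```

This keeps the logic in the C# code-behind as a single command text with four parameters, without needing a stored procedure.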
Here is an example of bulk inserting into a SQL Server table. From the application you pass an XML string to a stored procedure, and it inserts all the data into the table using that XML. Example SP:
CREATE PROCEDURE StoredProcName ( @strXML varchar(8000) ) AS Declare @intPointer int exec sp_xml_preparedocument @intPointer output, @strXML
We use BULK INSERT to load client data into our program. One of our clients uses the character '²' (0xB2) as a field delimiter in their input files. This worked fine in SS2000 but is failing in SS2005. After some testing, it appears that any high-ASCII value has the same problem; if I set the delimiter to anything below 0x80, it works and with any value of 0x80 or higher it fails.
I've verified that the format file we're using is correct for all of the tested delimiter values. (|, , ², ~, and €). The database collation sequence is SQL_Latin1_General_CP1_CI_AS if that matters.
Is there a way I can force acceptance of high-ASCII values as delimiters in SS2005? Do I need to play with the system code pages or the collation sequence?
I used bulk insert to load a txt file into a table, and it works fine (see code below). Now I have one txt file with the column names in the first row and about 200 columns. No table has been created beforehand. How can I write code to create a destination table based on the first row of the txt file, so that bulk insert will work for that file?
BULK INSERT #MBRACCT FROM 'c:order.TXT' WITH ( FIELDTERMINATOR = '|', FIRSTROW = 2, ROWTERMINATOR = '' )
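One way to approach this is to read the header line first and build the CREATE TABLE statement from it with dynamic SQL. The sketch below assumes the header row has already been read into a variable (the sample header and the table name dbo.MBRACCT_Load are made up), treats every column as varchar(255), and will break if a column name contains a ] character. It uses varchar(max), so it needs SQL Server 2005 or later.

```sql
DECLARE @header varchar(max), @sql varchar(max);

-- Hypothetical first row of the pipe-delimited file.
SET @header = 'ORDER_ID|CUSTOMER|AMOUNT';

-- Turn each pipe into the end of one column definition and the start of the next.
SET @sql = 'CREATE TABLE dbo.MBRACCT_Load (['
         + REPLACE(@header, '|', '] varchar(255) NULL, [')
         + '] varchar(255) NULL);';

PRINT @sql;   -- inspect the generated statement before running it
EXEC (@sql);  -- creates the (hypothetical) permanent table
```

A permanent table is used here on purpose: a local #temp table created inside EXEC would disappear as soon as the dynamic batch finishes, before the BULK INSERT with FIRSTROW = 2 could use it.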
I have a file I'm trying to do some non-set-based processing with. In order to make sure I keep the order of the results, I want to BULK INSERT into a temp table with an identity column. The spec says that you should be able to use either KEEPIDENTITY or KEEPNULLS, but I can't get it to work. For once, I have full code - just add any file of your choice that doesn't have commas/tabs. :) Any suggestions, folks?
--create table ##Holding_Tank ( full_record varchar(500)) -- this works
create table ##Holding_Tank (id int identity(1,1) primary key, full_record varchar(500)) --that doesn't work
BULK INSERT ##Holding_Tank FROM "d: elnet_scriptspsaxresult.txt" WITH (TABLOCK, KEEPIDENTITY, KEEPNULLS, MAXERRORS = 0)
select * from ##Holding_tank
I have Three tables Student,Daily_Attendance_Master and Daily_Attendence_Details.
I want to run an insert or update of student attendance (absent or present) in Daily_Attendence_Details based on Daily_Attendance_Master_Id and Student_Id (from one roll number to another).
If both are present in table Daily_Attendence_Details, then I want to update the attendance from one roll number to another roll number in Daily_Attendence_Details on the basis of Daily_Attendence_Details_Id.
And if both or either one is not present, I want to insert the student attendance from one roll number to another roll number in Daily_Attendence_Details.
I give below the structure of three tables Student,Daily_Attendance_Master and Daily_Attendance_Details.
I have a bulk insert situation that would be nice to be able to pull off. I have a flat file with 46 columns that are to go into a table. I want the table to have a 47th column, to be updated later by a stored proc, saying whether the import into the system was successful or not. I have the rowterminator set as '"', thinking that would tell SQL to begin on the next row, leaving the importstatus column null, but I still receive an error.
First of all, is this idea possible within this insert statement? Secondly, if so, what would be the syntax to tell the insert statement to skip that particular column? It is the last column listed in the table, so it just needs to start on the next row after it inserts the last bit of data from the flat file.
If this is not possible, is it possible to bulk insert into a temp table?
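BULK INSERT can target a temp table, so one option is to load the file into a staging table that matches the file exactly and then copy the rows into the real table, leaving the status column NULL. The sketch below uses two columns in place of the 46 and hypothetical table, column and file names; the path is resolved on the SQL Server machine.

```sql
-- Staging table with exactly the columns found in the file (2 shown instead of 46).
CREATE TABLE #Staging (Col1 varchar(50) NULL, Col2 varchar(50) NULL);

-- Hypothetical target with the extra status column at the end.
CREATE TABLE #Target (Col1 varchar(50) NULL, Col2 varchar(50) NULL,
                      ImportStatus varchar(20) NULL);

BULK INSERT #Staging
FROM 'C:\import\data.txt'          -- hypothetical path
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- ImportStatus is not listed, so it stays NULL until the stored proc sets it.
INSERT INTO #Target (Col1, Col2)
SELECT Col1, Col2
FROM   #Staging;
```

The alternatives are a format file that maps the 46 fields to the first 46 columns, or a view over the table that omits the status column.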
I saved the result into a csv file and then truncated the table. Now, I am trying to bulk insert the data into the table. So I used:
bulk insert rdb.dbo.scd_event_tab from 'C:userssluintel.ctrdesktopeventtab.csv' with ( codepage = 'RAW', datafiletype = 'native', fieldterminator = ' ', keepidentity, keepnulls ); go
However, I get this error:
Msg 4867, Level 16, State 1, Line 1 Bulk load data conversion error (overflow) for row 1, column 1 (JOB_ID). Msg 4866, Level 16, State 5, Line 1
The bulk load failed. The column is too long in the data file for row 1, column 3. Verify that the field terminator and row terminator are specified correctly.
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
I am running a set of SQL statements on a SQL Server to insert flat file data into a SQL table. The flat file has already been FTP'ed to the SQL Server. I seem to be getting an error, which possibly points to a permissions issue.
The statements:
BULK INSERT [Jedox_prod].[dbo].[B_BP_Customer] FROM 'c:jedox_dailyjdcom4401.txt' WITH ( FIRSTROW = 2, MAXERRORS = 0, FIELDTERMINATOR = '|', ROWTERMINATOR = ' ' ) GO
The error is : Msg 4861, Level 16, State 1, Line 1 Cannot bulk load because the file "c:jedox_dailyjdcom4401.txt" could not be opened. Operating system error code 3(failed to retrieve text for this error. Reason: 1815)
If it is permissions issue, how do I overcome this?
Hi all,
We have an application through which we are bulk inserting rows into a view. The definition of the view is such that it selects columns from a table on a remote server. I have added the servers using sp_addlinkedserver on both database servers.
When I call the Commit API of OLE DB I get the following error:
Error state: 1, Severity: 19, Server: TST-PROC22, Line#: 1, msg: SqlDumpExceptionHandler: Process 66 generated fatal exception c0000005 EXCEPTION_ACCESS_VIOLATION. SQL Server is terminating this process.
I would like to know if we can bulk insert rows into a view that accesses a table on the remote server using the "bulk insert" or bcp command. I tried a small test through SQL Query Analyzer to use "bulk insert" on such a view. The test that I performed was the following:
On database server 1:
create table iqbal (var1 int, var2 int)
On database server 2 (remote server):
create view iqbal as select var1,var2 from [DBServer1].[SomeDB].[dbo].[iqbal]
set xact_abort on
bulk insert iqbal from '\MachineIqbaliqbaldata.txt'
The bulk insert operation failed with the following error message:
[Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionCheckForData (CheckforData()).
Server: Msg 11, Level 16, State 1, Line 0
General network error. Check your network documentation.
Connection Broken
The file iqbaldata.txt contents were:
112233
If the table that the view references is on the same server then we are able to bulk insert successfully.
Is there a way by which I should be able to bulk insert rows into a view that selects from a table on a remote server? If not, could anyone suggest a workaround? I would actually like to know some workaround to get the code working using OLE DB. Due to unavoidable reasons I cannot output the records to a file and then use bcp to bulk insert the records into the remote table. I need to have some way of doing it using OLE DB.
Thanks in advance
Iqbal
Overall goal: Write a Bulk Insert statement using the UNC path of a filetable directory.
Issue: When using the UNC path of the filetable directory in a Bulk Insert Statement, receiving "Operating system error code 50(The request is not supported.)" Looking for confirmation as to whether this is truly not supported.
Environment: SQL Server 2012 Standard. Windows Server 2008 R2 Standard
Hi, I have a data file which consists of data as below: 4 PPU_FFA7485E0D|| T_GLR_DET_11||
When I insert into the table using bulk insert, the pipes (||) are also inserted into the table. Here is the query I am using to insert the data with bulk insert:
BULK INSERT TABLE_NAME FROM FILE_PATH WITH (FIELDTERMINATOR = ''||'''+',KEEPNULLS,FIRSTROW=2,ROWTERMINATOR = '''')
I can't use DTS nor DTSwizard as I need to put it in a .sql and run it through a command line via .bat file (it's more for the users).
Each row ends with an EOL character, the fields are all fixed width, but I have a little problem here, some rows are empty but just with a EOL character.
I'm trying to set up a BULK INSERT Format File for some data that I've been sent, which, according to the data documentation, comes in fixed-width format fields (no delimiters except for end-of-row 0D0A) in SQL-Server 2005 Express.
The following is the first line... "7999163 09182003 56586 56477 3601942 1278 22139 1102 113 118 51450 1 1 63535647 10000 7999162 09182003 56586 56477 3601942 1279 22139 1102 113 118 51450 1 1 63535647 10000 "
Looking with a hex editor, all the above whitespace are 20's.
From the documentation, I've constructed the following table...
However... actually running this gives the following error...
Msg 4863, Level 16, State 4, Line 1 Bulk load data conversion error (truncation) for row 1, column 13 (EXISTENCE). Msg 7399, Level 16, State 1, Line 1 The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error. Msg 7330, Level 16, State 2, Line 1 Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Since this is my first time with this, I read the BOL items on Bulk Insert, Format Files, and each of the formatting attributes, and made up two-line "toy" examples for SQLCHAR and SQLINT, including two columns - all worked as expected.
It seemed that only SQLNUMERIC/SQLDECIMAL fell apart.
Even the following trivial example doesn't work for this field of data...
Msg 9803, Level 16, State 1, Line 1 Invalid data for type "numeric". The statement has been terminated.
or
9.0 1 1 SQLNUMERIC 0 10 "/r/n" 1 TRACER_ID ""
which gives this error...
Msg 4832, Level 16, State 1, Line 1 Bulk load: An unexpected end of file was encountered in the data file. Msg 7399, Level 16, State 1, Line 1 The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error. Msg 7330, Level 16, State 2, Line 1 Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
Also - the DB_CREATE_DATE and DB_UPDATED_DATE CHAR (8) fields were supposed to be dates in the format mmddyyyy, but clearly there is no Date datatype in SQL-Server. I would suppose these need to be converted, but am unsure how. What is clear is that the data was dumped from Oracle in text form.
Any thoughts on this would be greatly appreciated...
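On the date fields: one approach is to load them as CHAR(8) and convert afterwards by rearranging the mmddyyyy text into yyyymmdd, which CONVERT understands with style 112. The sketch below uses the value 09182003 from the sample row; the variable name is just for illustration.

```sql
DECLARE @raw char(8);
SET @raw = '09182003';   -- mmddyyyy, as in the sample data

-- Rearrange to yyyymmdd ('20030918') and convert using style 112 (ISO basic format).
SELECT CONVERT(datetime,
               SUBSTRING(@raw, 5, 4)    -- yyyy
             + SUBSTRING(@raw, 1, 2)    -- mm
             + SUBSTRING(@raw, 3, 2),   -- dd
               112) AS parsed_date;     -- 2003-09-18 00:00:00.000
```

The same expression can be used in an UPDATE or INSERT ... SELECT from a staging table once the raw text has been bulk loaded.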