I have read several newsgroup articles about bulk deletes, and one approach I found is (roughly sketched below):
-create a temporary table with all constraints of original table
-insert rows to be retained into that temp table
-drop constraints on original table
-drop the original table
-rename the temporary table
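In T-SQL, I understand this to look roughly like the following sketch (the table name, retention condition, and constraint/index recreation are placeholders, not a finished script):

DECLARE @PurgeCutoff datetime;
SET @PurgeCutoff = DATEADD(DAY, -90, GETDATE());   -- keep the last 90 days (assumption)

BEGIN TRAN;

SELECT *
INTO   dbo.MyTable_staging                         -- staging copy of the rows to retain
FROM   dbo.MyTable
WHERE  CreatedDate >= @PurgeCutoff;

-- recreate the primary key, constraints and indexes on dbo.MyTable_staging here

DROP TABLE dbo.MyTable;
EXEC sp_rename 'dbo.MyTable_staging', 'MyTable';

COMMIT TRAN;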
My purge is a daily job, and my question is how this works on a heavily loaded operational database. Thousands of records are written into my tables (the same tables I want to purge rows from) every second. While I am copying to the temp table and dropping the original table, what happens to that operational data?
I also came across another way of doing the bulk delete, using BCP (roughly sketched after the steps):
1) BCP out rows to be deleted to an archive file
2) BCP out rows to be retained
3) Drop indexes and truncate table
4) BCP in rows to be retained
5) Create indexes
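As commands, I believe the steps look roughly like this (server, database, file names and the date filter are placeholders; -T is a trusted connection, -n is native format; steps 3 and 5 are plain T-SQL):

rem 1) archive the rows to be deleted
bcp "SELECT * FROM MyDb.dbo.MyTable WHERE CreatedDate < '20080101'" queryout C:\archive\deleted_rows.dat -n -T -S MYSERVER

rem 2) export the rows to be retained
bcp "SELECT * FROM MyDb.dbo.MyTable WHERE CreatedDate >= '20080101'" queryout C:\archive\keep_rows.dat -n -T -S MYSERVER

rem 3) drop the indexes and TRUNCATE TABLE MyDb.dbo.MyTable

rem 4) load the retained rows back in
bcp MyDb.dbo.MyTable in C:\archive\keep_rows.dat -n -T -S MYSERVER

rem 5) recreate the indexes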
Again the same question: while the BCP is running, are inserts into my original table blocked? What happens to the rows waiting to be inserted in the meantime?
Does BCP acquire an exclusive lock on the table that prevents any other inserts?
Does anyone have experience with a BCP command querying out 2 million records, and how long does it take?
I've got a large MS SQL Server 2000 database that has 15 indexes, with roughly 180 million rows representing 240 GB worth of data. Due to the massive size of the database we are trying to purge it down to a smaller dataset, about 40 million rows, in order to speed up query performance and to be able to defrag the indexes (which are 30-50% fragmented). To complicate matters, this table is also a publisher in a transactional replication setup, with one subscriber. Also, the system needs to be up constantly, so I'm only allowed a 3-5 hour outage window each week.
So far I've tested several methods of delete following all the best practices (batch deletes, using indexes in the delete's WHERE clause), and have settled on deleting/committing 500 rows at a time. The problem is that it still takes 3-4 seconds to delete that many rows, on an 8 GB RAM, 4-processor machine that is not currently in use or replicated.
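For reference, the batched-delete pattern I've been testing looks roughly like the following sketch (the table name, purge condition and delay are placeholders; SET ROWCOUNT is used since this is SQL Server 2000, and each DELETE auto-commits its own batch):

DECLARE @BatchSize int, @RowsDeleted int
SET @BatchSize = 500
SET @RowsDeleted = 1

SET ROWCOUNT @BatchSize                  -- limit each DELETE to one batch
WHILE @RowsDeleted > 0
BEGIN
    DELETE FROM dbo.BigTable
    WHERE  CreatedDate < '20080101'      -- purge condition (assumption)

    SET @RowsDeleted = @@ROWCOUNT

    WAITFOR DELAY '00:00:01'             -- give concurrent work and replication room to breathe
END
SET ROWCOUNT 0                           -- reset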
I'm at a loss for a way to pare down the data with a delete, as the current purge script would take 7 hours a day for about 3 months. Another option I'm considering is to do a truncate and copy the data back over from the replicated database, but again this has its own set of problems, i.e. network latency and slow insert times. Yet another option would be to create a replica of the table on the production db, copy the data to it, then rename the table.
Anyone have experience with purging such a massive amount of data? Any help would be greatly appreciated.
I am a Crystal Reports developer and I am new to the SSIS environment. I have started reading the Professional SQL Server 2005 Integration Services book. I am really confused by the many tasks to choose from.
I need to develop reports from a data warehouse. But first I have to send the data from the operational database (SQL Server 2000) to the warehouse (SQL Server 2005) monthly - I have a script for retrieving the data. For my package, I chose a Data Flow Task, an Execute SQL Task, and an OLE DB Destination, and it does not work.
Could you point me to similar packages I could look at as examples? Thank you!!
I have a table containing 8 million records. I need to replace 2 million of these records with a scaled-down query that goes something like:
SELECT 1, ShareholderID, Assets1 FROM MyTable (yields approx. 200,000 records)
SELECT 2, ShareholderID, Assets2 FROM MyTable (yields approx. 200,000 records)
...
SELECT 10, ShareholderID, Assets1 + Assets2 + Assets3 + ... + Assets9 FROM MyTable (yields approx. 200,000 records)
Updates and cursors just seem to be too slow.
So far I have done the following, but was wondering if anyone could think of a better way:
- SELECT the 6 million records that don't need to be deleted into a #TempTable
- Use the statements above to select into the same #TempTable
- DROP and recreate the original table
- SELECT the 6 + 2 million records INTO the original table
This seems rather convoluted. Is there a better approach? Would it be worthwhile to dump the data to a file and use bcp / BULK INSERT?
I have received some reports and have been asked to decide whether they should be developed as operational reports or analytical reports.
Basically I want to understand what points need to be considered when deciding whether to go for analytical reporting (cubes) or operational reporting.
The requirement is: I should allow single-row deletes from a table but not bulk deletes. An audit table should be updated on any single delete or single update. So I wrote the triggers as follows, for single and bulk delete:
ALTER TRIGGER [dbo].[TRG_Delete_Bulk_tbl_attendance] ON [dbo].[tbl_attendance] AFTER DELETE AS
[code]...
When I try to run the website, the database error I am getting is: Transaction count after EXECUTE indicates that a COMMIT or ROLLBACK TRANSACTION statement is missing. Previous count = 0, current count = 1.
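For context, a minimal sketch of a delete trigger along these lines - allow a single-row delete, block a multi-row delete, and log the single delete to an audit table - might look like the following. This is illustrative only, not the actual trigger from the post, and the audit table and its columns are hypothetical:

CREATE TRIGGER TRG_Delete_Single_tbl_attendance
ON dbo.tbl_attendance
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- more than one row in the deleted pseudo-table means a bulk delete: block it
    IF (SELECT COUNT(*) FROM deleted) > 1
    BEGIN
        RAISERROR ('Bulk deletes are not allowed on tbl_attendance.', 16, 1);
        ROLLBACK TRANSACTION;
        RETURN;
    END

    -- single-row delete: record it in the (hypothetical) audit table
    INSERT INTO dbo.tbl_attendance_audit (employee_id, attendance_date, deleted_at)
    SELECT employee_id, attendance_date, GETDATE()
    FROM   deleted;
END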
We are trying to delete data from a huge 75 million record table; it takes 4 hours to prune the data.
delete from Company where recordid in (select top 10000 recordid from recordid_Fed3 where flag = 0)
We have a loop that prunes 10000 records at a time in a while loop. Let me know if there is a better way to achieve this.
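One variation worth sketching: stage each batch of keys in a temp table and delete through a join rather than an IN (SELECT TOP ...) subquery. This assumes recordid is indexed on Company, and the flag update at the end is an assumption about how the driver table is advanced between batches:

WHILE 1 = 1
BEGIN
    SELECT TOP 10000 recordid
    INTO   #batch
    FROM   recordid_Fed3
    WHERE  flag = 0;

    IF @@ROWCOUNT = 0
    BEGIN
        DROP TABLE #batch;
        BREAK;
    END

    DELETE c
    FROM   Company AS c
           INNER JOIN #batch AS b ON b.recordid = c.recordid;

    -- assumption: mark the driver rows as processed so the next batch moves forward
    UPDATE r
    SET    flag = 1
    FROM   recordid_Fed3 AS r
           INNER JOIN #batch AS b ON b.recordid = r.recordid;

    DROP TABLE #batch;
END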
I have a fundamental problem with how CDC works for bulk updates. When a CDC-enabled table is updated for a single row, my CDC system tables record it as an update (operations 3 & 4), which is perfect and what it should be. No complaints! But when I do a bulk update on the same CDC-enabled tables for the same columns, my CDC system tables record it as a delete and then an insert (operations 1 & 2). This is not correct, and this is my problem. We used triggers before CDC and did not face this problem with them; everything was fine with triggers other than performance. The way CDC handles the bulk update is a big problem for me, because based on the output of the CDC system tables we are doing some migration work to a legacy system.
It will be impossible for me to go and change my migration logic scripts because we have hundreds of procedures in them. Is this a known problem with CDC? Is there any solution in CDC so that when a bulk update happens on a table the CDC system tables record it as updates? I don't think CDC 'net changes' would help in this situation, because the net change would show as a single inserted row. If this can't be done with CDC then I have to completely abandon CDC and go back to triggers.
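For reference, the operation codes mentioned above come from the __$operation column of the change data; a hedged example of pulling them for inspection (the capture instance name dbo_MyTable is a placeholder, not from the post):

DECLARE @from_lsn binary(10), @to_lsn binary(10);
SET @from_lsn = sys.fn_cdc_get_min_lsn('dbo_MyTable');
SET @to_lsn   = sys.fn_cdc_get_max_lsn();

-- __$operation: 1 = delete, 2 = insert, 3 = update (before image), 4 = update (after image)
SELECT ct.*
FROM   cdc.fn_cdc_get_all_changes_dbo_MyTable(@from_lsn, @to_lsn, N'all update old') AS ct;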
I have a large table with 100 Million records that has around 1 million duplicate records that need to be deleted.
I am running a script that creates a staging table called DuplicateTable that collects all the duplicates, and then I want to write an efficient delete statement.
Is it possible to write something like:
delete from OrigTable O join DuplicateTable D on O.Key = D.key
Or do I have to run a loop over the DuplicateTable and run a delete statement record by record?
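For reference, T-SQL does support deleting through a join; the statement above would be written roughly as (O.Key/D.Key are the placeholder column names from the post):

DELETE O
FROM   OrigTable AS O
       INNER JOIN DuplicateTable AS D
           ON O.[Key] = D.[Key];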
I have an SSIS package doing a bulk insert from a file. Then later on I'm trying to delete that file (in a File System Task), but I'm getting an error: [File System Task] Error: An error occurred with the following error message: "The process cannot access the file 'xyz' because it is being used by another process.". I'm wondering if there isn't some way to 'tweak' the bulk insert syntax so that it doesn't lock the file?
CDC is creating additional tables under System tables.
What is the performance overhead on the database by creating these tables?
I am going to access the CDC records through an ETL tool. Once the data has been read, I am going to delete the records.
If the frequency of changes is high, a few more records may be added to CDC while the data is being read. Is CDC going to truncate the tables, or only read the records?
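For what it's worth, reading the change tables through the CDC table-valued functions does not remove anything; rows are removed by the CDC cleanup job according to its retention setting (in minutes), which can be adjusted, for example:

-- set the cleanup job to keep 3 days (4320 minutes) of change data
EXEC sys.sp_cdc_change_job
     @job_type  = N'cleanup',
     @retention = 4320;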
I'm trying to use BULK INSERT for the first time and am getting the following error. I think it might have something to do with my format file; from the error message there's a conversion error for the first column. In my database the field is nvarchar(6), so my best guess is to use SQLNCHAR for the first column. I've checked that the end of each line is CR LF, so the terminator is correct for line 7, right?
Msg 4863, Level 16, State 1, Line 1
Bulk load data conversion error (truncation) for row 1, column 1 (ASXCode).
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 1
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
BULK INSERT tbl_ASX_Data_temp FROM 'M:DataASXImportTest.txt' WITH (FORMATFILE='M:DataASXSQLFormatImport.Fmt')
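For reference: if the data file itself is plain ANSI text (not Unicode), the host-file data type in the format file should be SQLCHAR even though the table column is nvarchar; SQLNCHAR is for Unicode data files. A rough non-XML format file sketch under that assumption (only ASXCode is real; the second column, lengths and collation are placeholders, and the version line should match the SQL Server version, e.g. 8.0 for 2000 or 9.0 for 2005):

8.0
2
1   SQLCHAR   0   6    "\t"     1   ASXCode    SQL_Latin1_General_CP1_CI_AS
2   SQLCHAR   0   50   "\r\n"   2   OtherCol   SQL_Latin1_General_CP1_CI_AS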
We use timed subscriptions to do almost all of our reporting. Reports are delivered (primarily via e-mail and printer) once they are completed and users don't have to "watch the pot boil" so to speak.
Apparently SSRS has some load balancing capability whereby it lets only a limited number of threads/reports run concurrently. We often reach this max and lock ourselves up on some very long-running reports, causing other important reports to wait a long time.
We've added some operational reports (i.e. document prints) to the mix. These reports run off of OLTP data. They are very fast and very high priority; waiting on them is not an option. Is there some way we can get SSRS to work on these operational reports in preference to other types of reports (e.g. "just for kicks" reports)? I think we'd almost like to add another SSRS server and dedicate it to the operational reports. Ideally the new SSRS server would use the same Report Server database but would only work on subscriptions for certain documents.
Has anybody else tried to solve this problem? This MS document doesn't really address subscriptions or load balancing by report: http://www.microsoft.com/technet/prodtechnol/sql/2005/pspsqlrs.mspx
I have been using Master Data Services for a couple of months now. I can load, update, merge and soft delete data in MDS. Occasionally we even have to hard delete data from MDS. If we keep soft deleting records in an MDS table, eventually there will be a huge number of soft-deleted records. Is there an easy way to hard delete all the soft-deleted records from all MDS tables in a specific model?
I recently configured SQL Server 2012 AlwaysOn Availability group using two nodes - a primary and one secondary read only replica. The group is residing on a windows 2012 cluster with an smb file share as the quorum. I am able to successfully failover through SQL and through the windows 2012 cluster. When I look at the group dashboard on the primary server and view the Operational state of each node I notice an odd value. The secondary role server is listed as Unknown. I also noticed that the Availability replicas node icons in object explorer are displaying the same icon on the primary server but on the secondary server, the primary server is shown as a server with a question mark.
Am I missing a permissions setting, or is this normal behavior?
For example:
ServerA is the primary, ServerB is the secondary.
ServerA lists the servers in Object Explorer as:
ServerA (Primary)
ServerB (Secondary)
ServerB lists the servers in Object Explorer as:
ServerA
ServerB (Secondary)
The primary is never listed as primary on the secondary server. Again, failovers are working properly, but I want to be sure I am not missing a setting somewhere.
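For reference, a hedged way to compare what each node reports is to query the AlwaysOn DMVs on both servers (a secondary typically has only limited information about the other replicas, which may explain the Unknown/question-mark display):

SELECT ar.replica_server_name,
       ars.role_desc,
       ars.operational_state_desc,
       ars.connected_state_desc,
       ars.synchronization_health_desc
FROM   sys.dm_hadr_availability_replica_states AS ars
       JOIN sys.availability_replicas AS ar
           ON ar.replica_id = ars.replica_id;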
Can anyone provide an example of bulk copying XML data to a SQL table? I am also looking at using column mapping so that I can map fields and also insert a new GUID into the key column of the SQL table. Many thanks.
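If the XML can be handed to the server as a parameter, one alternative technique is to shred it with nodes() and generate the GUID with NEWID() during the insert. A minimal sketch, assuming SQL Server 2005 or later and a hypothetical target table and XML shape:

DECLARE @x xml;
SET @x = N'<rows>
  <row code="ABC" qty="5" />
  <row code="DEF" qty="7" />
</rows>';

INSERT INTO dbo.TargetTable (RowGuid, Code, Qty)    -- hypothetical table and columns
SELECT NEWID(),                                     -- new GUID for the key column
       r.value('@code', 'nvarchar(10)'),
       r.value('@qty',  'int')
FROM   @x.nodes('/rows/row') AS x(r);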
I am currently wrapping up a website upgrade for a client and I am working on a development server/database. The development server/database will become the live version. When the upgrade goes live, I will need to update that database with the latest data from specific datatables (not all of them) in the previously live database, but I don't know how to do a bulk refresh of datatables.
Problem: specific datatables (not all datatables) from Database1 need to be updated with the data from Database2. Database1 and Database2 are copies of each other with vast differences in some of the data.
Result: all of the current, up-to-date data needs to reside on Database1.
Solution: any ideas? I am using MSSQL 2000 and the databases reside on the same server.
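Since both databases are on the same server, a rough per-table sketch using three-part names would be the following (Database1/Database2 are from the post; dbo.Customers and its columns are placeholders, and this assumes the copy in Database1 can simply be replaced):

BEGIN TRAN;

DELETE FROM Database1.dbo.Customers;

-- if the table has an identity column, wrap this with SET IDENTITY_INSERT ... ON/OFF
INSERT INTO Database1.dbo.Customers (CustomerID, Name, UpdatedOn)
SELECT CustomerID, Name, UpdatedOn
FROM   Database2.dbo.Customers;

COMMIT TRAN;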
This probably has been addressed before, but I was unable to get the search to work properly on this site. I need a script/way of deleting all rows from a DB except for one record left from each set of rows that has duplicate column data. Example:
Row 1: Field1 = 12345, Field2 = xxxxx, Field3 = yyyyy, Field4 = zzzzz, etc.
Row 2: Field1 = 12345, Field2 = zzzzzz, Field3 = xxxxxx, Field4 = yyyyyy, etc.
Row 3: Field1 = 12345, Field2 = 20202, Field3 = 11111, Field4 = zzzzz, etc.
Row 4: Field1 = 54321, Field2 = xxxxx, Field3 = yyyyy, Field4 = zzzzz, etc.
Etc.
I want to be able to find the duplicates for Field1 and then delete all but one of those rows. (I don't care which one I keep, just so only one is left.) The data in the other fields may or may not be unique.
I know how to find the duplicates; it's just the deleting part I am having problems with. Any help would be much appreciated. Thanks.
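A minimal sketch of one common approach, assuming SQL Server 2005 or later (ROW_NUMBER) and that Field1 is the duplicate key; which row survives is arbitrary, as in the post:

WITH numbered AS
(
    -- number the rows within each group of identical Field1 values
    SELECT ROW_NUMBER() OVER (PARTITION BY Field1 ORDER BY Field1) AS rn
    FROM   dbo.MyTable           -- placeholder table name
)
DELETE FROM numbered
WHERE  rn > 1;                   -- keep exactly one row per Field1 value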
Hi! I'm building a web application. I need to read data from a text or Excel file, process the data, and then store the resulting records in the database. The number of records is large. I can store the records in the database (SQL Server 2005) one at a time, but I think that's slow. Is there any way to insert the data in bulk?
I am working with SQL Server 7.0, and I need to transfer a large amount of data (millions of rows) from a remote Rdb database to SQL Server. What is the best approach other than using a comma-delimited flat file? Is there a way to create a database link and then use a copy script in SQL to copy the data directly? I would appreciate any help. Thank you.
I have imported data in my table using the bulk insert command. I was supposed to fill specific columns of my table with that data so I used a view to put them in the column I wanted.
The table looks like this now:
id | id_param | val_param
---+----------+----------
 1 | no_tel   | 742062141
 2 | sex      | 1
 3 | age      | 23
 4 | no_tel   | 765234157
 5 | sex      | 1
 6 | age      | 34
When I want to select only the rows where val_param = 1 for id_param = 'sex', using this query:
select * from bd_rox where id_param='sex' and val_param='1'
it returns no rows and I don't know why. The desired result should look like this:
id | id_param | val_param
---+----------+----------
 2 | sex      | 1
 5 | sex      | 1
I manage a legacy system that dumps its data into a number of different databases (same schema) on a nightly basis using bulk insert. I need to formulate a strategy for efficiently aggregating that data into a single database right after these nightly extractions complete. Here is my current strategy (a rough sketch of steps 3a-3c follows the list):
1. Duplicate the legacy system's database schema and add an identifier column to specify which database the data was loaded from.
2. Each night, delete all records in the table.
3. Each night, for each database:
3a. Set each table's default value to a value that references the current database being loaded.
3b. Use the legacy system's flat files and format files to bulk insert into the database.
3c. Clear the default value.
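A rough sketch of steps 3a-3c for one table (all table, column, constraint and file names are placeholders):

-- 3a: point the identifier column's default at the database currently being loaded
ALTER TABLE dbo.Orders ADD CONSTRAINT DF_Orders_SourceDb DEFAULT ('LegacyDb01') FOR SourceDb;

-- 3b: load the legacy flat file with the existing format file; the format file does not
--     reference SourceDb, so the default fills it for every inserted row
BULK INSERT dbo.Orders
FROM 'C:\legacy\orders.dat'
WITH (FORMATFILE = 'C:\legacy\orders.fmt');

-- 3c: clear the default before moving on to the next source database
ALTER TABLE dbo.Orders DROP CONSTRAINT DF_Orders_SourceDb;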
What other steps would facilitate performance? Dropping and recreating the indexes? Does anyone foresee faults in this strategy?
Any help would be appreciated.

I am running a script that does the following in succession:
1 - Drop existing database and create new database
2 - Defines tables, stored procedures and functions in the database
3 - Imports data using bulk insert
4 - Analyzes data using stored procedures

I would like to improve the performance of the analysis in step 4 by creating indexes in step 2.
Question 1 - Are indexes updated when data is bulk inserted? I know they are when using normal insert, update, or delete T-SQL, but I am not sure about bulk insert of data.
Question 2 - Do I need to update the index statistics in any way, or would they be ready to use in step 4?
Thanks,
CJ
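For what it's worth, indexes that already exist on the target table are maintained during a bulk insert (at some cost to load speed), but any statistics created in step 2 will have been built on empty tables, so refreshing them before the analysis in step 4 is a reasonable precaution, e.g.:

-- refresh statistics on the loaded table (placeholder name) after the bulk insert
UPDATE STATISTICS dbo.ImportedData WITH FULLSCAN;
-- or, more broadly, for the whole database
EXEC sp_updatestats;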
I have a set of records in application memory separated by a record terminator ''. I can write the memory stream to a local disk file and call the bcp API functions to load the file into SQL Server. But how do I transfer the in-memory data directly to SQL Server, without writing to a data file, using ODBC? I am not using any .NET Framework classes in my code. The SQL Server and the application server (generating the data records) are on two different physical servers connected over the network. I am trying to figure out the fastest and most efficient way to load the data into SQL Server from a remote application server. Thanks for your help.
I need to periodically import a large amount of data into my CE database. When I tested the import over the network, it took a lot of time. That's why I decided to send the raw data in ASCII files (because of their small size) and to import the files into the CE database.
Certainly, it's not a problem to write this import myself, but it would be interesting to know if someone has already done it...
I have to update a field within a table of 60 records or so. Each record has a different field value; the field's type is varchar. I was given an Excel file with the field values and was thinking of a bulk update along the lines of BULK INSERT, but I don't recall that being possible that way.
Is the only way to create a table, bulk insert, then merge the two tables together with UPDATE?
Just wanted to see if there was an easier way to do it; otherwise I'll take the latter route. Thanks!
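For reference, a rough sketch of the staging-table route (the staging and target table/column names are hypothetical; the staging table could be filled from the Excel data with BULK INSERT, DTS, or even 60 hand-written INSERTs):

CREATE TABLE #StagingValues
(
    RecordID  int          NOT NULL,
    NewValue  varchar(100) NOT NULL
);

-- ...load #StagingValues from the Excel data here...

UPDATE t
SET    t.SomeField = s.NewValue
FROM   dbo.TargetTable AS t
       INNER JOIN #StagingValues AS s
           ON s.RecordID = t.RecordID;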