SQL 2012 :: Deleting Large Batches Of Rows - Optimum Batch Size?
Oct 16, 2015
In another forum post, a poster was deleting large numbers of rows from a table in batches of 50,000.
In the bad old days ('80s - '90s), I used to have to delete rows in batches of 500, then 1000, then 5000, due to the size of the transaction rollback segments (yes - Oracle).
I always found that increasing the number of deleted rows in a single statement/transaction improved overall process speed - up to some magic point, at which some overhead in the system began slowing the deletes down, so that deleting a single batch of 10,000 rows took more than twice as much time as deleting two batches of 5,000 rows each.
Does anyone have good rule-of-thumb numbers (or, even better, some actual statistics and/or explanations) as to how many records should be deleted in a single transaction/statement for optimum speed? 50,000? 100,000? 1,000,000? Unlimited? Are there significant differences between 2008, 2012 and 2014?
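To make the question concrete, here is a minimal sketch of the kind of batched delete loop under discussion; the table name, purge condition and 50,000 starting batch size are illustrative, not recommendations:

DECLARE @BatchSize int = 50000;        -- tune this by testing, per the question above
DECLARE @RowsDeleted int = 1;
WHILE @RowsDeleted > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.BigTable                  -- hypothetical table
    WHERE LogDate < '2015-01-01';      -- hypothetical purge condition
    SET @RowsDeleted = @@ROWCOUNT;
    -- each iteration commits on its own, so the log can be reused or backed up between batches
END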
View 9 Replies
Sep 21, 2015
I have deleted nearly 30 million rows from a table. However, when I used the sp_spaceused command to check the space occupied by the table, I don't see any difference in the table's data size. In fact the data size has increased by a few MB after the deletion, but not by much.
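One hedged suggestion, not a diagnosis: sp_spaceused can report stale figures unless usage is refreshed, and the space freed by a large delete often isn't reflected until ghost cleanup has run or the indexes are rebuilt. For example (the table name is illustrative):

EXEC sp_spaceused @objname = 'dbo.MyTable', @updateusage = 'true';   -- refresh the usage figures first

ALTER INDEX ALL ON dbo.MyTable REBUILD;   -- compacts the pages left behind by the delete

EXEC sp_spaceused @objname = 'dbo.MyTable', @updateusage = 'true';   -- compare again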
View 8 Replies
Sep 14, 2007
I have a table with about 10 million rows; it's been logging data incorrectly and I want to start again.
DELETE FROM tblLog
I just get a message about the server timing out. What can I do?
Thanks
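If nothing in tblLog needs to be kept, one option (a sketch, assuming no foreign keys reference the table) is TRUNCATE, which is minimally logged and avoids the timeout a 10-million-row DELETE can hit; otherwise the delete can be broken into batches:

-- Option 1: wipe the table (fails if other tables reference tblLog through foreign keys)
TRUNCATE TABLE tblLog;

-- Option 2: delete in batches so each transaction stays small
-- (DELETE TOP needs SQL Server 2005 or later; on 2000, use SET ROWCOUNT instead)
WHILE 1 = 1
BEGIN
    DELETE TOP (10000) FROM tblLog;
    IF @@ROWCOUNT = 0 BREAK;
END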
View 5 Replies
Jan 26, 2004
Hi,
I have some problems with our database which is growing too large, and was hoping someone might have some tips on what I can do!
I have about 100 clients, each logging about 10 000 rows of status logs a day. So after just a few days the db is growing very large.
At present it's manageable, since I don't need to "dig" into the logs more than a few times a day. The system itself is not affected by the size of the log or traffic on the server. But it will increase to about 500 clients in 2004, and 1000-1500 in 2005. So I really need a smarter solution than what I have today to be able to use the log efficiently.
98-99% of these rows are status messages which are more or less garbage during normal operation. But I still need to keep them in case an error occurs and we need to go back an hour or two (maybe a day) to see what went wrong. After 24-48 hours these 98-99% are of no use. I do however like to keep the remaining 1-2%; they are messages like startup, errors, etc. Ideally they should be logged in two separate tables by the clients, but unfortunately I cannot make the clients change their logging.
This presents problems on multiple levels. Mainly in searching, which often times out, but also with backup and storage space. At the moment I check the system for errors, and every other day I just truncate the log file. It works, but it's not exactly elegant.
The server is a 1100 MHz P3 / 512MB / Windows 2000 Server /
SQL Server 2000. Faster hardware would help, but the problem is more of a "bad design" than "slow hardware" problem.
My log is pretty simple, as follows:
LogId - int - primary key - clustered index
ClientId - int - index asc
LogTypeId - int - index asc
LogValue - nvarchar[2500] - no index
LogTimeStamp- datetime - index asc
I have deducted 3 different solutions:
Method 1:
Simply run "Delete from db_log where logtyipeid <> stuff_I_want_to_keep".
This is the simplest and the one i prefer, but it takes too long time to complete. Any tips to speed this process up?
Method 2:
Create a trigger which runs something like "Delete from db_log where logtypeid <> stuff_I_want_to_keep and date < today_minus_two_days" every hour or so. This will ensure that the db doesn't grow too large. But if I'm away from work a few days we might lose data we wanted to keep.
Method 3:
Copy what I want to keep into another table, and empty the log. Sort of like "Insert into db_log_keep stuff_to_keep; drop db_log; create table db_log; " (or truncate, but that takes a long time too)
But then I would be stuck with two log tables, "48-hour_db_log" and "db_log_keep". I could use a view to "union" them so they would appear as a single table, but that's not ideal either.
However, it seems as if this method is what will work best for my set-up, unless there are other suggestions? (A rough sketch of it follows below.)
Method 4:
...eagerly awaiting ideas!!! :-)
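Picking up Method 3, a rough sketch, assuming db_log_keep already exists with the same columns and that the IN list stands in for the real stuff_I_want_to_keep values:

BEGIN TRAN;
-- copy out the 1-2% worth saving (column names taken from the log definition above)
INSERT INTO db_log_keep (LogId, ClientId, LogTypeId, LogValue, LogTimeStamp)
SELECT LogId, ClientId, LogTypeId, LogValue, LogTimeStamp
FROM   db_log
WHERE  LogTypeId IN (1, 2, 3);          -- placeholder for stuff_I_want_to_keep
COMMIT;

-- TRUNCATE deallocates pages instead of logging each row, so it is fast,
-- but it removes everything, which is why the keep-rows are copied out first.
-- Any rows logged between the copy and the truncate would be lost, so this needs a quiet window.
TRUNCATE TABLE db_log;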
(Also, whatever tips and/or links to info on maintaining VLDBs are greatly appreciated.)
Thanks in advance for your help! :-)
Nikolai
View 4 Replies
Jun 16, 2015
I have a table with about 466 million rows. In this table there is an int column called WeeksToRetain as well as an EventDate column containing the date the row was inserted. I am trying to delete all the rows that should be deleted according to WeeksToRetain. For example, if the EventDate is 5/07/15 with a 1 in the WeeksToRetain column, the row should be removed by 5/14/15. I am not sure what days SQL considers the beginning and end of the week. However, the core issue I am having is the sheer mass of deletions I must do and the log growth.
So I am trying to do the delete in batches. More specifically, I want to load a temporary table with a million rows, then use the temporary table to load a sub temporary table with 100,000 rows and join this temporary table to the table I want to delete from, looping through 10 times to get 1 million. The Logging.EvenLog table, which is the table I'm trying to purge, has a clustered index on EventDate (ASC). I would like to run this in a scheduled job with enough time between executions for log backups to run.
DECLARE @i int
DECLARE @RowCount int
DECLARE @NextBatchDate datetime
CREATE TABLE #BatchProcess
(
EventDate datetime,
ApplicationID int,
[Code] .....
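Separately, a minimal sketch of a simpler batched purge, assuming the table name really is Logging.EvenLog as written and that a row is eligible once EventDate plus WeeksToRetain weeks is in the past (the batch size and delay are illustrative):

DECLARE @BatchSize int = 100000;
WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM Logging.EvenLog
    WHERE DATEADD(WEEK, WeeksToRetain, EventDate) < GETDATE();

    IF @@ROWCOUNT = 0 BREAK;

    WAITFOR DELAY '00:00:05';   -- pause so scheduled log backups can keep up
END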
View 9 Replies
Feb 19, 2015
We have a database that is enabled for mirroring. We need to delete the old records, around 500k rows from a table, but the table has foreign key relations. How do you do this kind of delete on production servers?
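One hedged approach (the table, column and cutoff names below are placeholders, since the real schema isn't shown): pick a batch of parent keys, delete the referencing child rows first so the foreign keys are satisfied, then delete the parents, and pause between batches so mirroring and log backups can keep up:

DECLARE @Batch int = 5000;
WHILE 1 = 1
BEGIN
    -- pick one batch of parent keys to purge
    SELECT TOP (@Batch) ParentID
    INTO   #Purge
    FROM   ParentTable
    WHERE  CreatedDate < '2014-01-01'
    ORDER BY ParentID;

    IF @@ROWCOUNT = 0
    BEGIN
        DROP TABLE #Purge;
        BREAK;
    END

    BEGIN TRAN;
    DELETE c
    FROM   ChildTable AS c
    JOIN   #Purge AS p ON p.ParentID = c.ParentID;   -- children first, to satisfy the FK

    DELETE t
    FROM   ParentTable AS t
    JOIN   #Purge AS p ON p.ParentID = t.ParentID;
    COMMIT;

    DROP TABLE #Purge;
    WAITFOR DELAY '00:00:02';   -- let mirroring and log backups catch up
END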
View 2 Replies
Feb 11, 2015
My log file size is 5 GB. I just want to reduce it to some extent without resorting to shrinking. Is there any way to do that?
View 1 Replies
Jun 19, 2015
I want to include GO statements to execute a large stored procedure in batches, how can I do that?
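For what it's worth, GO is not a T-SQL statement; it is a batch separator recognised by SSMS and sqlcmd, and it cannot appear inside a stored procedure body, so the batching has to happen in the script that calls the procedure. A sketch with illustrative object names:

-- Batch 1
UPDATE dbo.Config SET LastRun = GETDATE();   -- illustrative statement
GO

-- Batch 2: runs only after batch 1 completes
EXEC dbo.MyLargeProc;                        -- illustrative procedure
GO 5                                         -- GO with a count repeats the preceding batch, here five times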
View 9 Replies
Jul 28, 2015
Running SQL 2012 SP2
I've got this query that runs in 30 seconds and returns about 24,000 rows. The table variable returns about 145 rows (no performance issue here), and the TransactionTbl table has 14.2 million rows, a compound, clustered primary key, and 6 non-clustered indexes, none of which meet the needs of the query.
declare @CltID varchar(15) = '12345'
declare @TranDate datetime = '2015-07-25'
declare @Ballance table
(Ledger_Code varchar(4),
AssetID varchar(32),
CurrencyID varchar(3) )
[Code] ....
Actual execution plan shows SQL is doing an index seek, then a nested loop join, and then fetching the remaining data from the TransactionTbl using a Key Lookup.
I designed a new index based on the query which, when I force its usage via an index hint, reduces the run time to sub-second, but without the index hint the SQL optimiser won't use the new index, which looks like this:
CREATE INDEX IX_Test on GLSchemB.TransactionTbl (CltID, Date) include (Ledger_Code, Amount, CurrencyID, AssetID)
and I tried this:
CREATE INDEX IX_Test on GLSchemB.TransactionTbl (CltID, Date, Ledger_Code, CurrencyID, AssetID) include (Amount)
and even a full covering index!
I did some testing, including disabling all indexes but the PK, and the optimiser tells me I've got a missing index and recommends I create one EXACTLY like the one I designed, but when I put my one back it doesn't use it.
I thought this may be due to fragmentation and/or stats being out of date, so I rebuilt the PK and my index, and the optimiser started using my index, doing an index seek and running sub-second. Thinking I had solved the problem, I rebuilt all the indexes, testing after each one, and my index was used, BUT as soon as I flushed the related query plan the optimiser went back to using a less optimal index, with a seek and key lookup plan, taking 30 seconds.
For now I've resorted to using the OPTION (TABLE HINT(G, INDEX(IX_Test))) hint to force this, but it's a workaround only. Why would the optimiser select a less optimal query plan?
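One hedged guess rather than a diagnosis: @CltID and @TranDate are local variables, so the optimiser estimates row counts from average density rather than from the actual values, which can make a less selective index look cheaper. OPTION (RECOMPILE) lets it compile with the real values and is worth comparing against the index hint; the query below is a simplified stand-in for the elided one, using columns taken from the index definitions above:

DECLARE @CltID varchar(15) = '12345';
DECLARE @TranDate datetime = '2015-07-25';

SELECT Ledger_Code, AssetID, CurrencyID, Amount
FROM   GLSchemB.TransactionTbl
WHERE  CltID = @CltID
  AND  [Date] >= @TranDate
OPTION (RECOMPILE);   -- compiles with the actual variable values instead of density estimates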
View 8 Replies
Aug 18, 2014
SQL 2012
I have a source table in the staging database stg.fact and it needs to be merged into the warehouse table whs.Fact.
stg.fact is not a delta feed; it is basically an intra-day refresh.
Both tables have a last-updated date, so it's easy to see which rows have changed.
It will be new (insert) or changed (update) data that I am interested in, there are no deletions.
As this could be in the millions of rows that are inserts or updates then this needs to be efficient.
I expect whs.Fact to go to >150 million rows.
When I have done this before I started with T-SQL Merge statement and that was not performant once I got to this size.
My original option was to do this in SSIS with a lookup task that marks the inserts and updates and deals with them separately. However, when I set up the lookup transformation, the reference data set would need a package variable in the SQL command, and this does not seem possible with the lookup in 2012! I am currently looking at the Merge Join transformation and at any clever basic T-SQL that could work, as this will need to be fast, and that's where I think T-SQL may be the better route.
Both tables will have >100,000,000 rows
Both tables have the last updated date
The Tables are in different databases but on the same SQL Instance
Each table holds 5 integer columns, one varchar and one datetime
Last time I used Merge it was a wider table with lots of columns so don't know if this would be an option.
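A basic T-SQL alternative to MERGE is a two-step update-then-insert keyed on the business key and the last-updated date. A rough sketch, assuming a single key column FactKey and a LastUpdated column (placeholders for the real names; since the tables sit in different databases on the same instance, three-part names or synonyms would be needed):

-- 1) update rows that already exist and have changed
UPDATE w
SET    w.Col1 = s.Col1,
       w.Col2 = s.Col2,
       w.LastUpdated = s.LastUpdated
FROM   whs.Fact AS w
JOIN   stg.fact AS s ON s.FactKey = w.FactKey
WHERE  s.LastUpdated > w.LastUpdated;

-- 2) insert rows that are new
INSERT INTO whs.Fact (FactKey, Col1, Col2, LastUpdated)
SELECT s.FactKey, s.Col1, s.Col2, s.LastUpdated
FROM   stg.fact AS s
WHERE  NOT EXISTS (SELECT 1 FROM whs.Fact AS w WHERE w.FactKey = s.FactKey);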
View 6 Replies
Nov 6, 2014
I have a production table with 400 million rows.
I have a staging table which has 48 million rows. This data is the same as the production data, except one column has a different value.
Create Table Production
(
Id Int Identity(1,1),
Code Varchar(20),
ReferenceSequence int
)
-- Staging Table
Create Table Staging
(
Code Varchar(20),
NewSequence int
)
I need to update the production table with the newSequence value from staging to replace the ReferenceSequence. I.e:
Update Production
Set ReferenceSequence = Staging.NewSequence
From Staging
where Production.Code = Staging.Code
However, updating 48 million rows at once will generate a lot of logging!
How can I do 1 million rows at a time, commit the changes then do the next million?
I've tried some of the examples on the following page [URL], but they look to just update the tables with the same values.
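A sketch of one common batching pattern, joining on Code as in the statement above and using the WHERE clause to skip rows that already carry the new value, so each pass picks up a fresh million (the batch size is illustrative):

WHILE 1 = 1
BEGIN
    UPDATE TOP (1000000) p
    SET    p.ReferenceSequence = s.NewSequence
    FROM   Production AS p
    JOIN   Staging AS s ON s.Code = p.Code
    WHERE  p.ReferenceSequence <> s.NewSequence
       OR  p.ReferenceSequence IS NULL;       -- rows not yet updated

    IF @@ROWCOUNT = 0 BREAK;                  -- nothing left to change

    -- each million-row update is its own implicit transaction,
    -- so a log backup between passes keeps the log from ballooning
END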
View 4 Replies
Jul 18, 2014
/****** Object: StoredProcedure [dbo].[dbo.ServiceLog] Script Date: 07/18/2014 14:30:59 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [dbo].[ServiceLogPurge]
-- Purge records dbo.ServiceLog older than 3 months:
-- Purge records in small portions to avoid locking production tables
-- for a long time. The process takes longer, but can co-exist with
-- normal usage of the tables.
[Code] ...
*** Getting this error below when executing the code ***
Msg 102, Level 15, State 1, Procedure ServiceLogPurge, Line 45
Incorrect syntax near 'Failed:'.
View 9 Replies
Nov 10, 2014
I have 2 tables with this schema
CREATE TABLE tableValues(
[LASTENCRYPTIONDT] [datetime] NULL,
[ENCRYPTIONID] [int] NULL,
[NAME] [varchar](50) NULL
[Code] ....
I want to update tableToUpdate in batches of 5000 per batch, setting LASTENCRYPTIONDT to null based on the join to tableValues on the ENCRYPTIONID column, and also output the updated rows into another table in case I need to do a rollback.
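A sketch of that pattern, assuming tableToUpdate also has LASTENCRYPTIONDT and ENCRYPTIONID columns and that #Updated is a work table created for the OUTPUT rows (the temp table name is illustrative):

CREATE TABLE #Updated
(
    ENCRYPTIONID int,
    OldLastEncryptionDT datetime     -- before-image, in case a rollback is needed
);

WHILE 1 = 1
BEGIN
    UPDATE TOP (5000) t
    SET    t.LASTENCRYPTIONDT = NULL
    OUTPUT deleted.ENCRYPTIONID, deleted.LASTENCRYPTIONDT
    INTO   #Updated (ENCRYPTIONID, OldLastEncryptionDT)
    FROM   tableToUpdate AS t
    JOIN   tableValues AS v ON v.ENCRYPTIONID = t.ENCRYPTIONID
    WHERE  t.LASTENCRYPTIONDT IS NOT NULL;   -- so each pass picks up rows not yet cleared

    IF @@ROWCOUNT = 0 BREAK;
END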
View 3 Replies
Sep 20, 2006
I have asked this question before and got some great answers; I just wanted to ask it again. I can detect dups easily. Is there a way to get a total count of all dups, so that I can delete a certain number at a time, then check the total again to verify that that specific number of dups is gone? I'm still using 2000. If I delete dups, won't it also delete the one row I want to keep? And I do have a unique identity column.
thx,
Kat
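A sketch of the counting and careful deleting on SQL Server 2000, assuming a table dbo.MyTable with a duplicate-defining column DupKey and the unique identity column ID (the names are placeholders):

-- total number of surplus rows (copies beyond the one to keep in each group)
SELECT SUM(cnt - 1) AS TotalDups
FROM (SELECT DupKey, COUNT(*) AS cnt
      FROM dbo.MyTable
      GROUP BY DupKey
      HAVING COUNT(*) > 1) AS d;

-- delete at most 1000 surplus rows this pass, always keeping the lowest ID in each group
SET ROWCOUNT 1000;
DELETE t
FROM dbo.MyTable AS t
WHERE t.ID > (SELECT MIN(t2.ID)
              FROM dbo.MyTable AS t2
              WHERE t2.DupKey = t.DupKey);
SET ROWCOUNT 0;   -- reset, then re-run the count above to verify the drop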
View 3 Replies
Nov 15, 2007
I've got a table with a unique column, "id". I've got the id values of about 300,000 records. These records need to be DELETEd from this table. Is there a way to do this in batch? I can't imagine the only way to do it is:
DELETE FROM Table WHERE id = 1 OR id = 2 OR id = 3... OR id = 300000
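One common pattern instead of a giant OR list (a sketch; the staging-table name and file path are illustrative, and DELETE TOP needs SQL Server 2005 or later): load the 300,000 ids into their own table, then delete with a join, optionally in batches:

CREATE TABLE #IdsToDelete (id int PRIMARY KEY);
-- load the ids once, e.g. with bcp, BULK INSERT or SSIS:
-- BULK INSERT #IdsToDelete FROM 'C:\ids.txt' WITH (ROWTERMINATOR = '\n');

WHILE 1 = 1
BEGIN
    DELETE TOP (10000) t
    FROM  [Table] AS t
    JOIN  #IdsToDelete AS d ON d.id = t.id;

    IF @@ROWCOUNT = 0 BREAK;
END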
View 10 Replies
Aug 22, 2006
I know how to detect and delete dups (or more-than-duplicates) with a SELECT clause; this works fine in a small table, but if the table has a million rows, say, it sounds like a proc would be faster. My question is: how do I display those rows in a proc for detecting what the problem is? The PRINT statement doesn't seem to work, and I wondered if I had to go through the process of building an output stream. The proc creates okay but I'm stuck after that part.
thx
Kat -- very rough code below
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
create proc dupcount
    @count int output                -- OUTPUT, so the caller can read the total back
as
set nocount on

-- The result set itself lists each duplicated key and how many copies exist.
-- PRINT cannot reference result-set columns (which is why the original PRINT failed),
-- so the rows are returned with SELECT and only the summary count is printed.
select CategoryID,
       CategoryName,
       count(*) as Dups
from Categories
group by CategoryID, CategoryName
having count(*) > 1

set @count = @@rowcount              -- number of duplicated groups found

print 'Duplicate groups: ' + convert(varchar(30), @count)
View 18 Replies
Nov 14, 2006
I have transactional replication set up between 2 SQL Server 2005 databases on 2 different boxes. Both the log reader and distribution agent run in continuous mode. The distributor is residing on a third SQL Server box. We are having performance issues with replication when there are large batch deletes/inserts happening on the publisher. There is a batch job that runs for about 8-10 hours everyday on the publisher and deletes/inserts thousands of records as part of transactions. The amount of time to replicate all this data on the subscriber is around 13-15 hours, which is not acceptable to our user community. While monitoring I found that the distribution database (MSRepl_commands table) at times has millions of records in it, which would explain why the latency is so high. Add to it the fact that there are large transactions occurring on the publisher.
I was wondering if anyone has faced a similar problem before. Are there any configuration changes I can make to the replication infrastructure to reduce latency?
Would appreciate your help.
Thanks.
View 4 Replies
May 18, 2001
I need to delete data from a particular table which has more than half a million records. More than 200,000 records need to be deleted from the table. What is the best way to delete the data from the table, other than importing it into a temporary table and performing the same operation?
Let me know if the strategy to be followed is okay.
1. Drop all the triggers
2. Drop all the indexes
3. Write a procedure with a loop, setting ROWCOUNT to 1000, and delete the records (since if I try to delete all the rows at once it gives a timeout error); see the sketch after this list.
The above procedure will delete 1000 records per batch inside the loop until it wipes out all the data for the specified condition.
4. Recreate Indexes and Triggers.
Please let me know if there are any other optimal solutions.
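For step 3, a sketch of the SET ROWCOUNT loop that was the usual approach on SQL Server 7/2000 (the table name and condition are placeholders):

SET ROWCOUNT 1000                 -- cap every DELETE at 1000 rows

DECLARE @rows int
SET @rows = 1

WHILE @rows > 0
BEGIN
    DELETE FROM dbo.MyTable
    WHERE SomeCondition = 1       -- the "specified condition" from step 3
    SET @rows = @@ROWCOUNT
END

SET ROWCOUNT 0                    -- always reset, or later statements stay capped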
Thanx,
Zombie
View 2 Replies
Mar 31, 2015
How do you prevent this: "If you import a very large number of rows, dividing the data into batches can offer advantages. After each batch completes, the transaction is logged. If, for any reason, a bulk-import operation terminates before completion, only the current transaction (batch) is rolled back." How can I make the whole import roll back, instead of only the current batch?
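One way to get all-or-nothing behaviour (a sketch, assuming BULK INSERT; the table name and file path are illustrative): leave BATCHSIZE unspecified so the whole file loads as a single batch, and wrap the load in an explicit transaction so any failure undoes everything:

BEGIN TRY
    BEGIN TRAN;

    BULK INSERT dbo.TargetTable
    FROM 'C:\load\bigfile.dat'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');   -- no BATCHSIZE: one transaction

    COMMIT;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;   -- any failure undoes the entire import
    THROW;
END CATCH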
View 1 Replies
Jun 6, 2004
Hi
I installed " Web Wiz Forum ASP SQL 2000 DB "
It works fine, but when I added some data in the forum (for example, my db size went to 1.45 MB), after I delete that data the db size will not decrease.
Is there any code that I must enter in the SQL Server setup file?
Excuse me, I asked this question on the Web Wiz forum site but they don't answer me.
If you know what I must do, please tell me.
Thanks
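For what it's worth, deleting rows frees space inside the data file but does not shrink the file itself, so the reported db size stays the same. If a smaller file is really the goal, something like this reclaims the space (the database name is illustrative, and shrinking does cause fragmentation):

USE master;
GO
DBCC SHRINKDATABASE (WebWizForum, 10);   -- shrink the database, leaving 10 percent free space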
View 6 Replies
Jul 20, 2006
We are using SQL Server 2000 Standard ed, sp4 on Server 2000 Advanced. We have one table that is three times as large as the rest of the database. Most of the data is static after approximately 3-6 months, but we are required to keep it for 8 years. I would like to archive this table (A), but there are complications.
1. the only way to access the data is through the application (they are images produced by the application - built on PowerBuilder)
2. there are multiple tables referencing this table and vice versa
3. we restore the entire db to two other servers for testing and training regularly
4. there might be more complications that have not been thought of
Currently, our only plan is to set up a separate server with a copy of this db on it and the application, leave only the tables necessary to access the data, and if this 'archive' works, remove from production the data from table A and all references to table A from rows on the other tables.
I mentioned #3 because someone mentioned a third party tool that may be able to pull the data from the table, archive it elsewhere, and at the same time place a 'pointer' in the table to the new storage location. The tool they mentioned only works on Oracle and we have not explored beyond that yet.
I am ready to explore ideas and suggestions; I am still new to the DBA world and I am out of ideas.
Thank you!
--
Message posted via SQLMonster.com
http://www.sqlmonster.com/Uwe/Forum...eneral/200607/1
View 1 Replies
May 24, 2007
No transaction log involved, only the table itself.
Use sp_spaceused "table_name" to check the space used.
It seems the table size actually increased from the beginning to the middle of the deletion; at the end of the deletion, its size decreased.
Recovery mode set to be simple, autoshrink turned on.
The tables tested are about 50MB ~ several GB in size, all have the same behavior. The size increased about 5%~10%.
Since the deletion is called from another piece of software, I want to know whether it is possible for SQL Server to have this behavior, or whether it is entirely the 3rd party software's issue.
Thanks!
View 2 Replies
Feb 20, 2008
Hi,
I was running out of space and thus deleted some rows from a table. To my surprise the db size increased. I then shrunk it to bring it back to what it was earlier.
When I deleted some 5000 rows, some space must have been released. Where did the space go, and why did the db size increase after deleting the records?
I thought it might be the log file, but the db is set to Simple Recovery, which I thought does not utilize a log file.
Any reasons?
View 6 Replies
May 24, 2006
Hi
I've been searching this site and the Web for info on an error message I get when importing from Access 2003 into SQL Server 2000.
'Data for Source Column 3('Col3') is too large for the specified buffer size'
A memo field in Access is larger than 255.
I have followed advice about moving the field to the first column. This doesn't work - the error just returns the new column number. In fact, I've tried just importing the first column - no good.
I am wary about making Registry changes, as comments on the Web say this doesn't work either.
Does anybody have the solution for this?
Paul
View 6 Replies
May 9, 2008
I have set up transaction replication between two databases. Data from a table in the first database is replicated to the same table in another database.
The table at the publisher already has some data in it. The table at the subscriber is empty. When the replication is synchronizing, I get the following errors in the replication monitor:
*The process could not bulk copy into table "dbo"."virtualdatalocations_waitingqueues". (Source: MSSQL_REPL, Error number: MSSQL_REPL20037) Get help: http://help/MSSQL_REPL20037
*Field size too large
The table looks like this:
CREATE TABLE virtualdatalocations_waitingqueues (
dataid int ,
personid int ,
queueid int ,
CONSTRAINT FK_vw_dataid
FOREIGN KEY(dataid) REFERENCES datalocations(id) ON DELETE CASCADE ,
CONSTRAINT FK_vw_personid
FOREIGN KEY(personid) REFERENCES persons(id),
CONSTRAINT FK_vw_queueid
FOREIGN KEY(queueid)REFERENCES waitingqueues(id)
);
It used to run fine in the past. I couldn't find any help on google or on forums.
Any help or comments are greatly appreciated.
View 6 Replies
Feb 24, 2006
I developed a VB app that imports CSV data into a SQL Server db. The original text file is 36.5 MB. The db after import is 230 MB and the log file is 555 MB. Is this normal?
View 8 Replies
Jan 8, 2001
I recently started using Differential backups. They are working but are growing in size a lot quicker than I expected.
The backups are growing by 2.5GB every day although the total size of all
transaction backups is under 350MB. I would have imagined that the total transaction log backups would be a good indicator of total database changes and therefore the differential backups would approach this figure.
Please explain!
View 2 Replies
May 24, 2007
Hello,
I have an issue with the drill down. In the report there is a drill down on the Amount column. I am trying to pass the customer names in this drill down, but there are more than 100 customers for that specific case and the drill down is not able to pass all the customers.
Is there any other way to pass the large string in the drill down?
View 2 Replies
Oct 10, 2006
hi all,
my tlogs at the subscriber are growing very large,
regardless of my recovery mode. Any help?
using snapshot-push replication
thanks,
joey
View 8 Replies
Nov 26, 2004
I have a table with approx 5 million rows and 36 columns. It takes approx
4 minutes to delete 1 row. The table has 3 indexes in addition to its primary key and has twelve foreign key constraints. We are still using SQL 7.
There is a backup run every night as part of the nightly maintenance that
reorgs/reindexes and checks the database integrity. Any thoughts?
Thanks
View 8 Replies
Apr 3, 2007
Hi, I'm a learner of SQL.
I need help in deleting rows from a table that have multiple occurrences of a value.
The table has a column named product_id; I need to find out what rows have the product_id value repeated (i.e. not distinct).
How do I do that? Help please.
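A sketch of the usual way to find them (assuming the table is called dbo.Products; only product_id comes from the question):

SELECT product_id, COUNT(*) AS occurrences
FROM   dbo.Products
GROUP BY product_id
HAVING COUNT(*) > 1;   -- product_id values that appear more than once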
View 1 Replies
Apr 21, 2008
I have an issue that I don't know how to figure out.
I have a table that has a list of files in it. This table also has GUIDs associated with each file.
Some of these files were found in the temporary internet files folder on the systems scanned.
I want to remove all rows where the file path contains "temporary internet files".
The problem is that there are other tables where the file GUIDs are referenced.
So when I try this:
DELETE FROM osScanFile
WHERE (FilePath LIKE '%temporary internet files\%')
I get an error saying that "The DELETE statement conflicted with the REFERENCE constraint "blah blah blah". ...
Basically there are rows in other tables that reference the rows I'm trying to delete in this table.
So how do I get it to delete not only the rows with the search criteria in it but also all rows from all other tables where the rows are linked?
Thanks,
Terry
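One hedged way to do it, since the child tables aren't named in the post (osScanFile, FilePath and the GUID come from the post; the child table and column names below are placeholders): delete the referencing rows first, then the parent rows. Defining the foreign keys with ON DELETE CASCADE is the alternative if cascading deletes are genuinely wanted.

BEGIN TRAN;

-- remove child rows that reference the files being purged (illustrative child table/column names)
DELETE c
FROM   osScanFileDetail AS c
JOIN   osScanFile AS f ON f.FileGUID = c.FileGUID
WHERE  f.FilePath LIKE '%temporary internet files\%';

-- now the parent rows can be deleted without violating the REFERENCE constraint
DELETE FROM osScanFile
WHERE  FilePath LIKE '%temporary internet files\%';

COMMIT;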
View 3 Replies
Jun 14, 2001
HI There,
Generally speaking, is it better to use a large or small stripe size for a RAID 5 array (4 drives)? I would appreciate any specifics also.
Thanks in advance.
Charlie
View 1 Replies