I've got a table with a unique column, "id". I've got the id values of about 300,000 records. These records need to be DELETEd from this table. Is there a way to do this in batch? I can't imagine the only way to do it is:
DELETE FROM Table WHERE id = 1 OR id = 2 OR id = 3... OR id = 300000
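A minimal sketch of the usual approach, assuming the 300,000 id values can be loaded into a temp table first (via bcp, BULK INSERT, or batched inserts from the client; the temp table name is hypothetical):

CREATE TABLE #ids (id int PRIMARY KEY)
-- ... load the 300,000 id values into #ids here ...
DELETE t
FROM [Table] AS t
INNER JOIN #ids AS i ON i.id = t.id
DROP TABLE #ids

Joining against an indexed temp table lets the optimizer do one set-based delete instead of parsing a 300,000-term OR list.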
I have asked this question before and got some great answers; I just wanted to ask it again. I can detect dups easily, but is there a way to get a total count of all dups, so that I can delete a certain number at a time and then check the total again to verify that that many dups are gone? I'm still using SQL Server 2000. If I delete dups, won't it also delete the one row I want to keep? And I do have a unique identity column.
I know how to detect and delete dups (or rows duplicated more than twice) in a test with a SELECT clause. This works fine on a small table, but if the table has, say, a million rows, it sounds like a proc would be faster. My question is: how do I display those rows in a proc so I can see what the problem is? The PRINT statement doesn't seem to work, and I wondered if I had to go through the process of building an output stream. The proc creates okay, but I'm stuck after that part.
thx
Kat -- very rough code below
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO
create proc dupcount
    @count int
as
set nocount on
select CategoryID, CategoryName, Count(*) As Dups
from Categories
group by CategoryID, CategoryName
having count(*) > 1
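On the display question above: a SELECT inside a proc is returned to the caller as an ordinary result set, so no output stream needs to be built; PRINT only writes to the messages channel. A minimal sketch that also hands the dup total back through the parameter (wiring @count up as an OUTPUT parameter is an assumption, and the proc is renamed to avoid clashing with the one above):

create proc dupcount2
    @count int OUTPUT
as
set nocount on
-- This SELECT goes straight back to the client as a result set.
select CategoryID, CategoryName, Count(*) As Dups
from Categories
group by CategoryID, CategoryName
having count(*) > 1
-- Number of duplicated groups, captured for the caller.
select @count = @@ROWCOUNT

declare @c int
exec dupcount2 @c OUTPUT
select @c as TotalDupGroups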
In another forum post, a poster was deleting large numbers of rows from a table in batches of 50,000.
In the bad old days ('80s - '90s), I used to have to delete rows in batches of 500, then 1000, then 5000, due to the size of the transaction rollback segments (yes - Oracle).
I always found that increasing the number of deleted rows in a single statement/transaction improved overall process speed - up to some magic point, at which some overhead in the system began slowing the deletes down, so that deleting a single batch of 10,000 rows took more than twice as much time as deleting two batches of 5,000 rows each.
Are there good rule-of-thumb numbers (or even better, some actual statistics and/or explanations) for how many rows should be deleted in a single transaction/statement for optimum speed? 50,000? 100,000? 1,000,000? Unlimited? Are there significant differences between SQL Server 2008, 2012, and 2014?
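A minimal sketch of the experiment, assuming a BigTable with an indexed OrderDate column (both names hypothetical); varying @BatchSize is how you hunt for the magic point described above:

DECLARE @BatchSize int = 50000

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) FROM dbo.BigTable
    WHERE OrderDate < '20140101'

    IF @@ROWCOUNT = 0 BREAK  -- nothing left to delete
END

Each iteration commits as its own transaction, so the log's active portion stays bounded by the batch size rather than by the total row count.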
I have a situation where deleting old records is blocking updates of the latest records on a highly transactional table, and the application is getting timeout errors.
In detail: I have one table called Tran_table1 in an OLTP database. Tran_table1 is highly transactional; it receives inserts and updates continuously.
While archiving two-year-old records from Tran_table1 into Tran_table1_archive in batches (using the DELETE ... OUTPUT INTO clause), any UPDATEs against Tran_table1 get blocked, and the result is timeout errors in the application.
Are there any SQL Server hints to avoid the blocking?
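A minimal sketch of the usual mitigation: keep each batch under roughly 5,000 rows (the default lock-escalation threshold) so row locks never escalate to a table lock, and pause briefly between batches so waiting UPDATEs can get through. TranDate is a hypothetical column, and the archive table is assumed to match Tran_table1's layout:

WHILE 1 = 1
BEGIN
    DELETE TOP (4000) FROM dbo.Tran_table1
    OUTPUT deleted.* INTO dbo.Tran_table1_archive
    WHERE TranDate < DATEADD(YEAR, -2, GETDATE())

    IF @@ROWCOUNT = 0 BREAK

    WAITFOR DELAY '00:00:00.2'  -- yield to concurrent UPDATEs between batches
END

If the deletes only touch old rows and the updates only touch recent ones, an index leading on TranDate also keeps the two workloads out of each other's ranges.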
I have a table with 370 million rows and 50+ columns. I need to change the data type of a column from character to numeric. Here's what it contains:
40 million rows hold numbers I want to keep; the rest I just want to set to NULL:
4k with alpha characters
55 million other numbers
275 million empty strings
An alter column statement fails not just on the alpha characters but on the empty strings. So I tried a couple things on a test database to get an idea of the time it would take:
An update statement to clear out the non-numeric data is too slow (~1.5 days, batched 10000 at a time). I think I probably should create a new column anyway though, so I'm going to copy the data to a new table since it would be faster than adding a new column to the original table.
An insert ... select ... takes about 12 hours; adding WITH (TABLOCK) didn't seem to have any effect, and I'm not sure how to batch it. Recovery model is simple.
A select ... into ... only takes about 1 hour, but can't be batched.
Using a 3rd party ETL tool takes about 5 hours, batched.
I wanted to batch it to minimize impact on other queries but primarily the logs. Is there any way to do a fast batched bulk transfer within SSMS?
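A minimal sketch of a batched transfer that can stay minimally logged: insert key ranges into a heap with TABLOCK under SIMPLE recovery, checkpointing between batches so the log can recycle. The Id/CharCol names are assumptions, and TRY_CONVERT needs SQL Server 2012+:

DECLARE @BatchSize int = 1000000, @MaxId int, @StartId int = 0
SELECT @MaxId = MAX(Id) FROM dbo.Source

WHILE @StartId < @MaxId
BEGIN
    INSERT INTO dbo.Target WITH (TABLOCK) (Id, NumCol)
    SELECT Id, TRY_CONVERT(decimal(18, 2), NULLIF(CharCol, ''))  -- alpha and empty strings become NULL
    FROM dbo.Source
    WHERE Id > @StartId AND Id <= @StartId + @BatchSize

    SET @StartId = @StartId + @BatchSize
    CHECKPOINT  -- in SIMPLE recovery, lets the log truncate between batches
END

The TABLOCK hint is what makes each insert eligible for minimal logging into a heap, at the cost of blocking other writers to the target table for the duration of each batch.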
How do you prevent this: "If you import a very large number of rows, dividing the data into batches can offer advantages. After each batch completes, the transaction is logged. If, for any reason, a bulk-import operation terminates before completion, only the current transaction (batch) is rolled back." How can I make the whole import roll back, instead of only the current batch?
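Two hedged options: omit BATCHSIZE so the entire import runs as a single transaction, or, as I understand the documented behavior, run BULK INSERT inside an explicit user transaction so a failure (or explicit rollback) undoes all batches imported so far. A sketch of the latter, with a hypothetical file path and table; THROW needs SQL Server 2012+ (use RAISERROR on older versions):

BEGIN TRANSACTION
BEGIN TRY
    BULK INSERT dbo.TargetTable
    FROM 'C:\data\import.txt'
    WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', BATCHSIZE = 50000)
    COMMIT TRANSACTION
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION  -- undoes the whole import, not just the current batch
    THROW
END CATCH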
I have a stored procedure that attempts to INSERT @BatchSize number of records at a time into a table. Currently, I have @BatchSize set to load 50,000 at a time. The table I am inserting from has a little over 67,000 records.
When I execute the procedure with NOCOUNT left off, the procedure seems to run indefinitely, and the count of records returned surpasses what I have in the source table. However, only 50,000 records are inserted into the table.
Below is my code:
begin try
--error catching variables:
declare @Error_NumberLocal int
      , @Error_MessageLocal varchar(4000)
      , @Error_SeverityLocal int
      , @Error_StateLocal int
      , @Error_ProcedureLocal varchar(200)
      , @Error_LineLocal int
      , @User_NameLocal varchar(200)
[code]....
What is the problem with the looping structure that would cause this issue?
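Without the elided loop body this is a guess, but the classic causes of exactly this symptom are testing @@ROWCOUNT after another statement has already reset it, and a WHERE clause that never excludes the rows already copied, so the same batch inserts forever. A minimal sketch of a loop that avoids both (table and key names hypothetical):

DECLARE @BatchSize int = 50000, @Rows int = 1

WHILE @Rows > 0
BEGIN
    INSERT INTO dbo.Target (Id, Col1)
    SELECT TOP (@BatchSize) s.Id, s.Col1
    FROM dbo.Source AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Target AS t WHERE t.Id = s.Id)  -- skip rows already copied

    SET @Rows = @@ROWCOUNT  -- capture immediately, before anything else resets it
END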
I must admit I don't know all that much about SQL, which is why I hope someone can show me the light. I have a script almost finished; however, I have no idea how to have it trim database entries that are older than, say, 90 days. Any ideas?
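A minimal sketch, assuming a table with a datetime column CreatedDate (both names hypothetical):

DELETE FROM dbo.MyTable
WHERE CreatedDate < DATEADD(DAY, -90, GETDATE())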
I have a table with a load of orphaned records (I know... poor design). I'm trying to get rid of them, but I'm having a brain cramp.
I need to delete all the records from the table "Floor_Stock" that would be returned by this select statement:
SELECT FLOOR_STOCK.PRODUCT, FLOOR_STOCK.SITE
FROM PRODUCT_MASTER
INNER JOIN FLOOR_STOCK
    ON PRODUCT_MASTER.PRODUCT = FLOOR_STOCK.PRODUCT
LEFT OUTER JOIN BOD_HEADER
    ON FLOOR_STOCK.PRODUCT = BOD_HEADER.PRODUCT
    AND FLOOR_STOCK.SITE = BOD_HEADER.SITE
WHERE (BOD_HEADER.BOD_INDEX IS NULL)
  AND (PRODUCT_MASTER.PROD_TYPE IN ('f', 'n', 'k', 'b', 'l', 's'))
I was thinking along the lines of:
DELETE FROM FLOOR_STOCK
INNER JOIN (
    SELECT FLOOR_STOCK.PRODUCT, FLOOR_STOCK.SITE
    FROM PRODUCT_MASTER
    INNER JOIN FLOOR_STOCK
        ON PRODUCT_MASTER.PRODUCT = FLOOR_STOCK.PRODUCT
    LEFT OUTER JOIN BOD_HEADER
        ON FLOOR_STOCK.PRODUCT = BOD_HEADER.PRODUCT
        AND FLOOR_STOCK.SITE = BOD_HEADER.SITE
    WHERE (BOD_HEADER.BOD_INDEX IS NULL)
      AND (PRODUCT_MASTER.PROD_TYPE IN ('f', 'n', 'k', 'b', 'l', 's'))
) F ON FLOOR_STOCK.PRODUCT = F.PRODUCT
   AND FLOOR_STOCK.SITE = F.SITE
... but SQL Server just laughs at me: "Incorrect syntax near the keyword 'INNER'".
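A hedged rewrite that should compile: T-SQL wants the target named as an alias, with the joins spelled out in a second FROM clause rather than attached directly to DELETE FROM:

DELETE fs
FROM FLOOR_STOCK AS fs
INNER JOIN PRODUCT_MASTER AS pm
    ON pm.PRODUCT = fs.PRODUCT
LEFT OUTER JOIN BOD_HEADER AS bh
    ON fs.PRODUCT = bh.PRODUCT
    AND fs.SITE = bh.SITE
WHERE bh.BOD_INDEX IS NULL
  AND pm.PROD_TYPE IN ('f', 'n', 'k', 'b', 'l', 's')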
Here is the scenario. I'm working with two tables:
Contact1
Conthist
Contact1 contains basic contact information, and Conthist contains history records for those contacts. Conthist can hold many records related to a single Contact1 record.
The link between the two tables is a column called accountno.
I'm trying to delete any records in conthist that have an accountno that does not exist in contact1. The queries that I've tried keep returning conthist records that do actually have a matching accountno.
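A minimal sketch using NOT EXISTS, which sidesteps the surprises NOT IN has with NULL accountno values:

DELETE ch
FROM conthist AS ch
WHERE NOT EXISTS (
    SELECT 1
    FROM contact1 AS c1
    WHERE c1.accountno = ch.accountno
)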
I have a couple of SQL tables that have been appended to daily over the last two years. There are now about 50,000,000 records in the table. Does anyone know the fastest way to delete records before a certain date to shorten these tables? Delete queries and everything else I've tried are taking way too long.
Apparently, deleting 7,000,000 records from a table of about 20,000,000 is not advisable. We were able to take orders at 8:00AM, but not at 7:59.
So, what's the best way of going about deleting a large number of records? Pretty basic lookup table, no relationships with other tables, 12 or so address-type fields, 4 or 5 simple indexes. I can take it down for a weekend or night, if needed.
DTS the ones to keep to another table, drop the old and rename the new table? Bulk copy out, truncate and bring back in? DTS to text, truncate and import back? Other ways?
Never worked with such a large table and need a little experienced guidance.
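A minimal sketch of the keep-and-swap option, which is usually fastest when most of the table is going away (the table name and keep filter are hypothetical):

SELECT *
INTO dbo.Lookup_Keep
FROM dbo.Lookup
WHERE CreatedDate >= '20070101'  -- the ~13,000,000 rows to keep

TRUNCATE TABLE dbo.Lookup  -- minimally logged, near-instant

INSERT INTO dbo.Lookup WITH (TABLOCK)
SELECT * FROM dbo.Lookup_Keep

DROP TABLE dbo.Lookup_Keep

TRUNCATE preserves the table's indexes, and with no foreign keys pointing at the table (as described) it is allowed; running this in the weekend window avoids blocking other users while the swap runs.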
My web host does not provide administrative privileges on the SQL Server I have access to. I would like to delete tens of thousands of records from two of my tables without writing to the transaction log. What I'm trying to do is delete these records quickly without using up the allotted space my web host has set aside for my transaction log (they give me 50 MB, and I go way over that when I run a DELETE statement).
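There's no way to make a DELETE bypass the log entirely; every deleted row is logged. What keeps the log under the cap is deleting in small committed batches, since the log space from each batch can be reused after it commits (provided the database is in SIMPLE recovery or log backups run between batches). A sketch with hypothetical names; if a table is being emptied completely, TRUNCATE TABLE is minimally logged and far cheaper:

WHILE 1 = 1
BEGIN
    DELETE TOP (2000) FROM dbo.MyBigTable
    WHERE CreatedDate < '20070101'

    IF @@ROWCOUNT = 0 BREAK
END

-- When every row must go:
-- TRUNCATE TABLE dbo.MyBigTable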
I need a sql statement to delete duplicate records.
I have a college table with all colleges in the nation. I noticed that all of the colleges were listed twice. How do I delete all of the duplicate records?
Here is my table:

Colleges
-------------------
schoolID    smallint     NOT NULL
schoolName  varchar(60)  NULL
Can someone help me out with the sql statement??? I'm running SQL Server 6.5.
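A minimal sketch of the classic pre-ROW_NUMBER dedupe, which works on old versions: copy the distinct rows out, empty the table, and reload (on 6.5, SELECT INTO requires the 'select into/bulkcopy' database option to be enabled):

SELECT DISTINCT schoolID, schoolName
INTO #keep
FROM Colleges

TRUNCATE TABLE Colleges

INSERT INTO Colleges (schoolID, schoolName)
SELECT schoolID, schoolName FROM #keep

DROP TABLE #keep

Adding a unique constraint on schoolID afterwards would keep the duplicates from coming back.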
Hi All, I have a table named MyTable, and it contains only one column, MyCol. I have 10 records in it, and all the records are duplicates, i.e. the value is 7 for all 10 records.
It is something like this,
MyCol
7
7
7
7
7
7
7
7
7
7
Now when I try to delete the 10th record (or any record), it gives me the error "Key column information is insufficient or incorrect. Too many rows were affected by update."
What should I do if I want only 4 records instead of 10 in my table? How do I delete 6 of the records from the table?
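With no key column, the grid editor can't target a single row, but a set-based delete can remove an arbitrary 6 of the identical rows. A minimal sketch (SET ROWCOUNT works on old versions; DELETE TOP needs SQL Server 2005+):

SET ROWCOUNT 6
DELETE FROM MyTable WHERE MyCol = 7
SET ROWCOUNT 0  -- always reset afterwards

-- SQL Server 2005 and later:
-- DELETE TOP (6) FROM MyTable WHERE MyCol = 7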
I have a problem where records in the underlying tables of a dataview are being deleted (seemingly at random).
For example.
CREATE TABLE [Employee] (Id int, Name varchar(50))
CREATE TABLE [Company] (Id int, Name varchar(50))
CREATE TABLE [EmployeeCompany] (CompanyId int, EmployeeId int, Status int)  -- Status included so the trigger below compiles
GO
CREATE VIEW [dvEmployee]
AS
SELECT *
FROM [Employee]
INNER JOIN [EmployeeCompany] ON [Employee].[Id] = [EmployeeCompany].[EmployeeId]
GO
CREATE TRIGGER [dvEmployeeUpdate] ON [dbo].[dvEmployee]
INSTEAD OF UPDATE
AS
BEGIN
    UPDATE EmployeeCompany
    SET Status = INSERTED.Status
    FROM EmployeeCompany, INSERTED
    WHERE EmployeeCompany.CompanyId = INSERTED.CompanyId
      AND EmployeeCompany.EmployeeId = INSERTED.EmployeeId
END
Because the column [Status] is a T-SQL keyword, does the fact that the trigger says "SET Status = ..." rather than "SET [Status] = ..." mean that I could lose records in the EmployeeCompany table?
The reason I'm asking is that we have an already-designed database that is littered with columns named after SQL keywords (almost every table has a [Status] column, and there are many [Password] columns). The dataviews over these tables have triggers that don't put [] around these column names (the same as my dvEmployeeUpdate trigger above), and somehow we are seemingly randomly losing records. It is very rare, the rows are completely deleted, and it seems to happen on the tables that have the keyword columns and are used in dataviews with INSTEAD OF triggers that don't bracket the column names. Nowhere in any trigger or stored procedure is there a DELETE FROM on these tables, and the software running on the database uses only the dataviews; it doesn't directly access the underlying tables.
I've been going through all of the code adding the [], but my question is simply whether or not anyone has heard of this causing the deletion of any records, or whether there may be something else going on that I should be looking into?
Help me out on this one. I have 2 text boxes on my page, and the user enters any number in each. I select that many records randomly from my main table and put them into two other tables. Now the problem is how to delete those randomly selected records from the main table. For example, the main table contains:

srNo.  UserID
1      abcd
2      trtr
3      tret
4      yghg
5      jjhj
The user enters '2' in text box 1 and '1' in text box 2, so a total of 3 random records are selected and put into two other tables, say
table1
sr.no  UserID
2      trtr
and table2 contains
sr.no.  userid
3       tret
5       jjhj
Now I want to delete these records (sr.no 2, 3, and 5) from the main table. How do I do it? The user can enter any number in the text boxes, so writing multiple delete statements would not be possible. Help me with the statements, or with the logic.
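A minimal sketch: since table1 and table2 already hold exactly the rows that were moved, delete whatever the main table shares with them (the main table and column names are hypothetical):

DELETE m
FROM MainTable AS m
WHERE EXISTS (SELECT 1 FROM table1 AS t1 WHERE t1.srno = m.srNo)
   OR EXISTS (SELECT 1 FROM table2 AS t2 WHERE t2.srno = m.srNo)

No per-row statements are needed; however many rows the user asked for, the two EXISTS checks cover them all.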
Hi, I want to delete rows from a group of tables. These tables have a common column, UserID. I heard that there is something called ON DELETE CASCADE, but I don't know how to set it up and use it. Could someone tell me how to do it, or point me to a tutorial that shows how? Thanks
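A minimal sketch with hypothetical parent/child table names: declare each child table's foreign key with ON DELETE CASCADE, and deleting a parent row then removes its child rows automatically:

ALTER TABLE dbo.UserOrders
ADD CONSTRAINT FK_UserOrders_Users
    FOREIGN KEY (UserID) REFERENCES dbo.Users (UserID)
    ON DELETE CASCADE

-- Now this also deletes the matching dbo.UserOrders rows:
DELETE FROM dbo.Users WHERE UserID = 42

Each table referencing Users needs its own cascading constraint.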
I have a database that is used to store a lot of data. We load the data on a daily basis, several thousand records per day. The log file is not needed, so what's the best way to delete the records in it and reduce the size? Thanks, Derrick
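The log can't be emptied by deleting rows from it; what shrinks it is switching the database to SIMPLE recovery (so the log self-truncates at checkpoints) and shrinking the file once. A minimal sketch, with a hypothetical database and logical log file name:

ALTER DATABASE MyDb SET RECOVERY SIMPLE
DBCC SHRINKFILE (MyDb_log, 100)  -- target size in MB

SIMPLE recovery gives up point-in-time restores, so this only fits when log backups genuinely aren't needed, as described.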
Hi, I need a suggestion on a very familiar DB situation. I have a main table, and the primary key of that table is used in many other tables as a foreign key. If I delete a record in the main table, how do I make sure that all the corresponding records in the associated tables, where that foreign key is used, get deleted too? What are my options? Thanks
Hello all, I have a DTS package set up to import a text file on a daily basis. I need to dump the data in 2 tables 7 days after the last import. This is the code I have: Delete From TblTemp date(Day(-7), CurrentStamp). But for some reason it deletes the data right after importing it, and it doesn't delete anything out of the other table.
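As written that statement isn't valid T-SQL; if only the "Delete From TblTemp" part effectively ran, with no WHERE clause, that would explain everything vanishing right after import. A minimal sketch of the intended condition, assuming CurrentStamp is the import timestamp column; the second table needs its own equivalent DELETE:

DELETE FROM TblTemp
WHERE CurrentStamp < DATEADD(DAY, -7, GETDATE())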
I have a function that opens a connection to an SQL database, issues a SELECT command, and reads the records with an OleDbDataReader. As the records are read, any that match certain criteria are deleted with a DELETE command. Simplified example code is shown below:
Dim dbCmd As OleDb.OleDbCommand = New OleDb.OleDbCommand()
dbCmd.Connection = New OleDb.OleDbConnection(UserDbConnString)
dbCmd.CommandText = "SELECT * FROM [User] ORDER BY UserID"
dbCmd.Connection.Open()

Dim reader As OleDb.OleDbDataReader = dbCmd.ExecuteReader(CommandBehavior.CloseConnection)
While reader.Read()
    If reader("SomeColumn") = SomeCalculatedValue Then
        Dim dbCmd2 As OleDb.OleDbCommand = New OleDb.OleDbCommand()
        dbCmd2.Connection = New OleDb.OleDbConnection(UserDbConnString)
        dbCmd2.CommandText = "DELETE FROM [User] WHERE UserID = " + reader("UserID")
        dbCmd2.Connection.Open()
        dbCmd2.ExecuteNonQuery()
        dbCmd2.Connection.Close()
    End If
End While
reader.Close()
This code worked well with an MS Access database, but when I changed to SQL Server, I get a database timeout error when attempting to do the DELETE. I suspect the reason is that the connection the reader has open has the record locked so it cannot be deleted.
The SQL connection string I am using is something like this:
UserDbConnString = "Provider=SQLOLEDB; Server=(Local); User ID=userid; Password=password; Database=dbname"
The connection string I used for MS Access included the property "Mode=Share Deny None". I wonder if there is some similar way to tell SQL Server to allow editing of records that are open for reading with an OleDbDataReader.
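One hedged way around it is not to interleave reads and deletes at all: if SomeCalculatedValue's logic can be expressed in SQL, a single set-based statement replaces the reader loop and never collides with an open cursor:

DELETE FROM [User]
WHERE SomeColumn = @SomeCalculatedValue  -- the calculation, moved server-side (assumption)

If the calculation must stay client-side, the usual pattern is to collect the matching UserID values into a list while reading, close the reader, and then issue the DELETEs; the fix here is sequencing the work, not a connection-string mode setting.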
I wrote a script to archive and delete records from a table back in 2005 and 2009.
I can't seem to get the syntax right. Any sample script to simply archive and delete records?
This is what I have so far.
DECLARE @ArchiveDate datetime
SET @ArchiveDate = (SELECT TOP 1 DATEPART(yyyy, Call_Date) FROM tblCall ORDER BY Call_Date)
--SELECT @ArchiveDate AS ArchiveDate
DECLARE @Active bit
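A minimal sketch of the archive-and-delete itself, using a plain datetime cutoff rather than DATEPART (note that assigning a year number to a datetime variable, as above, yields an offset in days from 1900, not a year). Assumes tblCall_Archive exists with the same columns as tblCall; DELETE ... OUTPUT needs SQL Server 2005+:

DECLARE @Cutoff datetime
SET @Cutoff = '20090101'  -- archive everything before this date (hypothetical cutoff)

DELETE FROM dbo.tblCall
OUTPUT deleted.* INTO dbo.tblCall_Archive
WHERE Call_Date < @Cutoff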
Rajarajan writes: "Kindly don't ignore this as a regular case; this is peculiar. I need to delete one of the duplicate records, but only if they occur consecutively, e.g.
1. 232
2. 232
3. 345
4. 567
5. 232
Here only the first record has to be deleted. Kindly help me out."
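A minimal sketch, assuming an ordering column SeqNo and a value column Val (both hypothetical; LEAD needs SQL Server 2012+): a row is deleted when the next row in sequence carries the same value, which removes the first of each consecutive pair:

WITH Ordered AS (
    SELECT Val,
           LEAD(Val) OVER (ORDER BY SeqNo) AS NextVal
    FROM dbo.MyTable
)
DELETE FROM Ordered
WHERE Val = NextVal

Against the sample data this deletes only row 1: row 2 survives because the next value is 345, and row 5 survives because nothing follows it.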
I loaded one table via SSIS and found that it contained many duplicate records (from the input source). I can create a SQL task to delete them, but I wonder if SSIS offers a task "out of the box" to delete dups?
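The closest out-of-the-box option is the Sort transformation's "Remove rows with duplicate sort values" checkbox, which drops dups in the data flow before they land. For cleaning up after the load, an Execute SQL Task can run the standard ROW_NUMBER dedupe (key and table names hypothetical; needs SQL Server 2005+):

WITH Ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY KeyCol ORDER BY KeyCol) AS rn
    FROM dbo.LoadedTable
)
DELETE FROM Ranked
WHERE rn > 1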
I use the following function (in the BLL) to delete some records:

Public Function DeleteStep4Dashboards() As Boolean
    Try
        adpDashboards.DeleteStep4Dashboards()
    Catch ex As Exception
        Return False
    End Try
    Return True
End Function

How can I catch the SQL database errors when deleting the records goes wrong?
I have a table where the date format is dd/mm/yyyy, with the following columns:

Purchase Description   From_Date   To_Date
--------------------   ---------   ---------
Desktop                2/2/2007    2/3/2007
Mouse                  2/1/2007    28/1/2007
Laptop                 5/1/2008    15/3/2008
Speaker                4/1/2008    21/1/2008

My requirement is to create a stored procedure that looks at the From_Date and To_Date values; if the difference is more than 30 days, that record should be deleted automatically. How do I write the stored procedure? Please provide the full stored procedure. Thanks in advance
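A minimal sketch, assuming the table is named Purchases and the columns are real date/datetime types (names hypothetical):

CREATE PROCEDURE dbo.DeleteLongPurchases
AS
BEGIN
    SET NOCOUNT ON

    DELETE FROM dbo.Purchases
    WHERE DATEDIFF(DAY, From_Date, To_Date) > 30
END

EXEC dbo.DeleteLongPurchases

Against the sample data only the Laptop row (70 days) exceeds 30 days and is deleted; Desktop spans 28 days, Mouse 26, and Speaker 17.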
In my data archiving process, I would end up deleting hundreds of records from the production databases, but would that help me save some disk space immediately? Should I run some DBCC command to reclaim the disk space?
If so, what should I do after deleting the records?
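Deleting rows frees pages inside the data files, but the files themselves keep their size; the space is reused by future inserts rather than returned to the OS. If the disk space itself is needed back, a shrink does that (database and file names hypothetical):

DBCC SHRINKDATABASE (MyDb)
-- or target one file: DBCC SHRINKFILE (MyDb_data, 1024)  -- target size in MB

-- Shrinking fragments indexes, so consider rebuilding the big ones afterwards:
-- ALTER INDEX ALL ON dbo.BigTable REBUILD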