Deleting 8 Million Rows From Database
Jul 16, 2013i am deleating 8 Million rows from my database,I am wondering how to control T-Log,also I heard something about row lock and table lock
View 4 Repliesi am deleating 8 Million rows from my database,I am wondering how to control T-Log,also I heard something about row lock and table lock
View 4 RepliesI have a requirement to delete 1 Million records from a table having 10 Million data and it's being queried on 24/7 basis (don't have a downtime). how can I achieve that?
View 13 Replies View RelatedDELETING 100 million from a table weekly SQl SERVER 2000Hi AllWe have a table in SQL SERVER 2000 which has about 250 million recordsand this will be growing by 100 million every week. At a time the tableshould contain just 13 weeks of data. when the 14th week data needs tobe loaded the first week's data has to be deleted.And this deletes 100 million every week, since the delete is taking lotof transaction log space the job is not successful.Can you please help with what are the approaches we can take to fixthis problem?Performance and transaction log are the issues we are facing. We trieddeletion in steps too but that also is taking time. What are thedifferent ways we can address this quickly.Please reply at the earliest.ThanksHarish
View 4 Replies View Related/****** Object: StoredProcedure [dbo].[dbo.ServiceLog] Script Date: 07/18/2014 14:30:59 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [dbo].[ServiceLogPurge]
-- Purge records dbo.ServiceLog older than 3 months:
-- Purge records in small portions to avoid locking production tables
-- for a long time. The process takes longer, but can co-exist with
-- normal usage of the tables.
[Code] ...
*** Getting this error below when executing the code ***
Msg 102, Level 15, State 1, Procedure ServiceLogPurge, Line 45
Incorrect syntax near 'Failed:'.
I run the following statement and it will not update beyond 7 million plus rows and I have about 38 million to complete. I keep checking updated row counts and after 1/2 day it's still the same so I know something is wrong because it was rolling through no problem when I initiated it. I need to complete ASAP so it's adding to my frustration. The 'Acct_Num_CH' field is an encrypted field (fyi).
SET rowcount 10000
UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL
WHILE @@ROWCOUNT > 0
BEGIN
SET rowcount 10000
UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL
END
SET rowcount 0
I'm new to using a DB and have a few questions about what I'm trying to do. I have some historical options data and want to place it into a sql express database. (I understand I might need to use a none express version once the db gets to big.) A months worth of data is over 5.5 million rows of data. So six years worth is ~400 million rows. Is it possible to put this into a sql db and be able to search it very fast? I have a months worth in a db now and it is pretty slow. Should I use a new table for each month and then have 6 years * 12 month = 72 tables to increase the search speed? I search by date and stock_symbol and the data looks like this:
Date, Stock_Symbol, Option_Symbol, Strike, BidPrice, AskPrice, Volume, OpenInterest, (and a few others)
The select statement is simple: SELECT * FROM Options WHERE Date = @Date and StockSymbol = @Symbol
Thanks
In our database, we have a very large table that gets updated every morning, start of the day is copying 4 million rows from the fact table from previous date to today's date in the same table and then some other processing. It takes 1 1/2 to 2 hrs to do this. There is a dts package created to copy these rows into temp table and then to this fact table.
This table has more than 200 million rows
Any ideas on how to accomplish this without doing the copy twice and not running into locking problems.
Thanks for any suggestions.
i have a following table
table name : emp_master
empid efname emname elamane efathername emothername deptno edob edoj createdby updateby lastupdatedatetime lastactionperformed
empid is primarykey.
this table contains 20million of records and i want to fire following query on this to get employye all data where eployee is more than 10 year old
select empid ,efname, emname, elamane, efathername, emothername, deptno ,edob ,edoj ,createdby, updateby, lastupdatedatetime ,lastactionperformed
from emp_master
where year(doj)+10 > year(getadate())
this will return approx 10 million rows and taking 18 mins. tune this query what approaches should i take to reduce the time of execution.
Hi,
i have to load 1million rows from database( or flatfile) to the database(or flat file).
which task is used as the best solution for this?
Appreciate any assistance in this regard.
Thanks,
Das
I need to update about 1.3 million rows in a table of mine.
I am getting the data from one of the columns of the same table and
updating the new column.
I am doing this using a cursor which I have put in a stored procedure.
As this is a production table which users might be accessing.It is a
web based application and I can't slow the system down.
So I am willing to run the stored prcedure during off peak hours.
However, do I need to put this in a transaction?
If I did put it in a transaction what type of isolation level should I
opt for?
Data integrity is very important for me and I don't mind to compromise
on the performance.
I am doing this because one of the columns which has "short description"
entry is has become too small for business purposes and we want to increase it's
length from varchar(100) to varchar(150).
As this is SQL 6.5, I can't increase the lenght of the column.
So Iadded a new column and will run the stored proc.
What precautions are to be taken?
This is on a high priority basis and very important too.
Thanks in advance...
Stored procedure code:
USE DB_Registration_Dev
GO
IF EXISTS (SELECT NAME FROM SYSOBJECTS WHERE NAME='usp_update_product' AND TYPE='P')
DROP PROCEDURE usp_update_product
GO
CREATE PROC usp_update_product
AS
DECLARE @short_desc varchar(100)
DECLARE @prod_id int
DECLARE sdesc_curs CURSOR
FOR
SELECT [Product].[product_id] , [Product].[short_description]
FROM Product
OPEN sdesc_curs
FETCH NEXT FROM sdesc_curs
INTO @prod_id, @short_desc
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE Product
SET [Product].[sdesc] = @short_desc
WHERE Product_id=@prod_id
FETCH NEXT FROM sdesc_curs
INTO @prod_id, @short_desc
IF @@FETCH_STATUS <> 0
PRINT ' Finished Updating the table...go ahead and have fun ...! '
END
DEALLOCATE sdesc_curs
GO
Hi, I used the /e in my bcp code. yet did not get all the rows from the main frame into the sql talbes... here is the case I have 11 million rows in an ftp server I use this code to bcp into sql server can anyonecheck if this code is good for the process, I am missing one million row in the bcp process and do not know why??? I put the /e to see if there is any error but could not see any error file in my hard drive?
Please check it out and let me know
regards
Ali
Exec master..xp_cmdshell "bcp dbname..tablename in c:ftprootNbtorder.txt /fd:ftprootformatfileablename.fmt /Servername /Usa /Password /b250000 /a8000 /eerrfileORD"
I'm looking for some performance assistance on updating a column value in a table that contains approximately 50 million rows. I have a permanent table in another database that has the key column and value to be set. My query is listed below, but I'm afraid it will run quite awhile. Any suggestions would be appreciated.
update mytable
set column2 = b.column2
from mytable as a
join mytable1 as b
on a.column1 = b.column1
There is a one to one relationship between the two tables.
Need help, I am managing a Data Warehouse (80 G.B Database Size), I purge older than 6 months data from a table which has more than 140 Million rows on daily basis. The daily data load performance is degrading. The table has no clustered indexes (only non-clustered indexes).
Tried dropping and rebuilding the non-clustered indexes, didn't work.
One way to solve the problem is drop the non-clustered index, bcp out the data, truncate the table and bcp in the data and rebuild the non-clustered indexes. This is too risky and taking 14 hours to bcp out the data.
This was not the issue in SQL Server 6.5, because SQL 6.5 always insert new record indexes at the end of the heap link (heap = non-clustered indexes without clustered index). In contrast, SQL Server 7.0 first checks for available space in existing pages by using percent free space pages (this is where it is killing the performance ).
Thanks for your help!
I have 1+ CSV files (using a foreach loop) which I'm doing a lot of transform work on and then inserting into a SQL database table.
Each CSV file usually contains about 2 days worth of data (contains date stamps) - somewhere in the region of 60k records per day.
The destination table currently contains 3 million+ rows and will get bigger.
I need to make sure that before inserting into the destination table, the data doesn't already exist.
I've read the following article: http://www.sqlis.com/311.aspx
While the lookup method works, it takes ages and eats up memory as it caches the 3m+ records before running for each CSV. Obviously this will only get worse as the table grows in size.
To make things a little more efficient what I'd like to do, is first derive the dates I'm dealing with in the current file - essentially storing the max(date) and min(date) in variables. Then in the lookup SQL use those vars, to reduce the amount of data that needs to be brought into the transformation to check against before inserting into the destination table.
Lookup SQL eg. SELECT * FROM MyTable WHERE Date BETWEEN varMinDate AND varMaxDate.
Ideally I'd use an aggregate transformation and then use the subsequent output from that either in the lookup query or store the output in vars, but I don't think you can do that and I get the feeling I'm approaching this with the wrong mindset.
Any thoughts would be great!
1) I write a select statemnet
select a.columname1, b.columname1
from table1 a,table2 b
where a.columname2 = b. columname3
How do i compare
a.columname2 <--> int type column
while b. columname3 <--->varchar type
how should i use convert function
2) whats the best way to import 1.5 million rows ?
How about text file
I am doing a performance testing for In-memory option is sql server 2014. As a part I want to insert 500 million rows of records into a in-memory enabled test table I have created.
I need a sample script to insert 500 million records into a table ....
I have been tasked with writing an update query to update a table with more than 150 million rows of data. Here are the table structures:
Source Tables :
OC
CREATE TABLE [dbo].[OC](
[OC] [nvarchar](255) NULL,
[DATE DEBUT] [date] NULL,
[DATE FIN] [date] NULL,
[Code Article] [nvarchar](255) NULL,
[INSERTION] [nvarchar](255) NULL,
[Code] ....
The update requirement is as follows:
DECLARE @Counter INT=0 --This causes the @@rowcount to be > 0
while @@rowcount>0
BEGIN
SET rowcount 10000
update r
set Comp=t.Comp
[Code] ....
The update took more than 48h and didn't terminate , how to accelerate it ?
I have a row that is being used log track plays on our website.
Here's the table:
CREATE TABLE [dbo].[Music_BandTrackPlays](
[ListenDate] [datetime] NOT NULL DEFAULT (getdate()),
[TrackId] [int] NOT NULL,
[IPAddress] [varchar](20)
) ON [PRIMARY]
There's a CLUSTERED INDEX on ListenDate ASC and a NON CLUSTERED INDEX on the TrackId.
I have a TRIGGER on the Music_BandTrackPlays table that looks like the following:
CREATE TRIGGER [trig_Increment_Music_BandTrackPlays_PlayCount]
ON [dbo].[Music_BandTrackPlays] AFTER INSERT
AS
UPDATE
Music_BandTracks
SET
Music_BandTracks.PlayCount = Music_BandTracks.PlayCount + TP.PlayCount
FROM
(SELECT TrackId, COUNT(*) AS PlayCount
FROM inserted
GROUP BY TrackId) AS TP
WHERE
Music_BandTracks.TrackId = TP.TrackId
When a simple INSERT statement is done on the Music_BandTrackPlays table, it can take quite a long time. When I remove the TRIGGER the INSERTs are immediate. The Execution plan for the TRIGGER shows that a 'Inserted Scan' is taking up most of the resources.
How exactly is the pseudo 'inserted' table formed?
For now, I think the easiest thing to do is update my logging page so it performs 2 queries. One to UPDATE the Music_BandTracks table and increment the counter, and perform the INSERT into the Music_BandTrackPlays table seperately.
I'm ok with that solution but I would really like to understand why the TRIGGER is taking so long. The 'inserted' pseudo table will be 1 row 99% of the time. Does SQL Server perform a table scan on all 20 million rows in order to determine what's new and put it in the inserted pseudo table?
Thanks!
IF NOT EXISTS (SELECT TOP 1 1 FROM dbo.syscolumns WHERE id = OBJECT_ID(N'dbo.Employee) and name = 'DoNotCall')
BEGIN
ALTER TABLE [dbo].[Employee] ADD [DoNotCall] bit not null Constraint DoNot_Call_Default DEFAULT 0
IF ( @@ERROR <> 0 )
GOTO QuitWithRollback
END
It just takes a LOT of time in SQL Server Management studio. I have to cancel the query and cancelling takes a whole lot time. I am using SQL Server 2008.
I have a table with approx 5 million rows and 36 columns. It takes approx
4 minutes to delete 1 row. The table has 3 indexes in addition to it's primary key and has twelve foreign key constraints. We are still using sequel 7.
There is a backup run every night as part of the nightly maintenence that
reorg/reindexes and checks the database integrity. Any thoughts?
Thanks
Hi i'm a learner of sql,
i need help in deleting rows from a table that have multiple occurences of a value.
the table has a column named product_id , i need to find out what rows have the product_id value repeated ( i.e not distinct)
how do i do that help please.
I have an issue that I don't know how to figure out.
I have a table that has a list of files in it. This table also has GUIDs associated with each file.
Some of these files were found in the emporary internet files folder on the systems scanned.
I want to remove all rows where the file path contains emporary internet files
The problem is that there are other tables where the file GUIDs are referrenced.
So when I try this:
DELETE FROM osScanFile
WHERE (FilePath LIKE '% emporary internet files\%')
I get an error saying that "The DELETE statement conflicted with the REFERENCE constraint "blah blah blah". ...
Basically there are rows in other tables that reference the rows I'm trying to delete in this table.
So how do I get it to delete not only the rows with the search criteria in it but also all rows from all other tables where the rows are linked?
Thanks,
Terry
I have a csv file that I need to import daily into a SQL Server 2005 table. Much of the table contents could just be overwritten with the new csv file, however there are a set of Rows within the table that need to be appended to , rather than overwritten.
There is no Primary Key in the csv file that can be used.
I'm not sure this is the best approach, but what I have been trying to do, is append the entire csv file to the existing table, and then go back and delete the duplicates.
When I run the Delete, it does delete the majority of the records, but leaves a couple hundred behind. The number left behind varies with each run, can't seem to identify a pattern here. Running the Delete a second time does clean up the rows left behind in the first execution of the Delete, and gives the result I want.
Any thoughts as to why this needs to be run twice? Or is a better approach available?
Here is my code -
SELECT [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time], dupcount=count(*)
INTO temppkgactions
FROM pkgactions
GROUP BY [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time]HAVING count(*) > 1
DELETE TOP (SELECT COUNT(*) -1 FROM dbo.temppkgactions WHERE dupcount > 1 )
FROM dbo.pkgactions
DROP TABLE temppkgactions
Thanks
What I'm trying to do is delete a user and all their related information within the other tables. I'm not wanting to delete the table, just one column with that user and their related information. So my Primary_Key is UserID within the table [alumni] and my three Foreign_Keys are CommentID, PhotoID, and AlbumID within the tables [comments], [photos], and [albums]. Here is some of the code that I have:
<asp:SqlDataSource ID="SqlDataSource2" runat="server" ConnectionString="<%$ ConnectionStrings:SoderquistString %>"
DeleteCommand="DELETE FROM [alumni] WHERE [UserID] = @UserID"
SelectCommand="SELECT [UserID], [UserName], [FirstName], [LastName], [State] FROM [alumni] WHERE ([State] = @State)">
<DeleteParameters>
<asp:Parameter Name="UserID" Type="Int32" />
</DeleteParameters>
<SelectParameters>
<asp:ControlParameter ControlID="DropDownList1" Name="state" PropertyName="SelectedValue"
Type="String" />
</SelectParameters>
</asp:SqlDataSource>
The users are set up in GridView form. Is there some type of DELETE command that I need to be writing that is different than the one above? I have tried adding onto the following DELETE statment:
DeleteCommand="DELETE FROM [alumni] WHERE [UserID] = @UserID
DELETE FROM [photo] WHERE [UserID] = @UserID;
DELETE FROM [album] WHERE [UserID] = @UserID;
DELETE FROM [comment] WHERE [UserID] = @UserID;
...but that doesn't work...and doesn't look right. I would really appreciate anyones suggestions or help that you may be able to provide. Thank you!
Hai
I have problem in deleting duplicate rows. I have a identity column in my table, if I try to use correlatted sub query with Delete command it gives error.
The other problem I have is I have a date column in my table and update that column with current date and time. If use a query to fetch a records on a particular day , it does not return any rows
select * from rates where ch_date >='02/11/99' and ch_date<='02/11/99'
If I use convert also there is some other problems. Is there any way to force date checkings to be done excluding time.
Thanks
This is an imaginary problem while discussing ROWID in ORACLE.
Consider a table without primary key, unique key, uniuqe index.
A row has inserted into the table many times.
I want to delete all but one dulicated rows. With any 'where' clause all rows(duplicated)
will be deleted. In ORACLE i can achieve this using ROWID as follows:
Delete from Table_name
where < all column values >
and ROWID <> ( Select max(rowid) from Table_name where < all column values > )
How can this be achieved in MS SQL Server 6.5 ?
According to Dr. Codd's Golden rules for RDBMS one is that
One should be able to reach each data value in the database by using
table name, row idenfication value and column name.
Does MS SQL Server 6.5 satisfy this requirement ?
Also How many of Dr. Codd's 13 Golden Rules for RDBMS does MS SQL Server 6.5
Satisfy? Which doesn't ?
Any discussion about Codd's Rules is welcome.
- Gunvant Patil
gunvantp@yahoo.com
I have two tables. Table 1 contains a distinct ID(3,4,5). Table 2 contains multiple ID's(3,3,3,3,3,4,4,4,5,5,5). I want to be able to use a join. If an ID is found in table 2, remove all entries of it. Any ideas? Thanks...
delete b.id
from table1 a join table2 b on a.id = b.id
hello there,
what is the code to Delete all rows in a Table????
thanks in advance
In my database, I have a table "tbl_c_extract" that consists of 4 columns that look the following. I'm looking at a daily batch of around 4000 records, of which 150 are likely to be duplicates.
Emp_No varchar(255), Proprietary_ID varchar(255), LeaveDateActual datetime
123456, E123456, 2014-09-27 00:00:00.000
213832, E123456, 2099-12-31 00:00:00.000
213836, E123456, 2014-01-31 00:00:00.000
In the example above, I need to remove 2 of the entries, leaving only the one that with the maximum leave date. In this case, those without a leave date have the 2099 entry.
Using CTE works exactly as I want it to, however SQL Server Agent doesn't seem to like the use of CTE..
Code:
WITH CTE (Proprietary_ID, LeaveDate, RN)
AS
(
SELECT Proprietary_ID, LeaveDate,
ROW_NUMBER() OVER(PARTITION BY Proprietary_ID ORDER BY Proprietary_ID, LeaveDate) AS RN
FROM tbl_c_extract
)
DELETE
FROM CTE
WHERE RN > 1
Hi,
I have a table named "std_attn", where, by some bad coding, lots of duplicated rows have been created. And the table don't have any PK. So Now tell me the way to remove the duplicaies..................
thnx
I need to delete some specific rows in Table1
int1 must have the value 1
int2 must have the value 1
int3 must have the value 0
and now we get to the difficult part - first character in the field [Cost No] have to be different from the letter 'B'. [Cost No] have the datatype varchar(20)
I expected the code to look something like this:
DELETE Table1
FROM Table 1
WHERE ([Int1] = 1) AND ([Int2] = 1) AND ([Int3] = 0) AND ((LEFT([Cost No]), 1) <> 'B')
But I get this error message:
The left function requires 2 arguments
What am I doing wrong and what should the right code look like?
I don't work much with the back end of software development so there is a lot about SQL Server I do not know.
We are building a database. The database will have about 10 tables in it. 3 of these tables will probably have a huge amount of data in them. Specifically each one of the 3 tables will each have about a half a million database records in it. Each record is about 100 characters max in length.(Im am including numbers as characters and summing the individual columns/fields to come up with 100).
Will a SQL server database table with A half a million records in it be possible? We have tried to normalize the database to cut down on the size of the table but it all comes out to about a half a million records per table.
Any help is deeply appreciated.
Bill
Hi folks,
I need to delete the duplicate rows from a table. How to do that in SQL server 7.0 ? If possible write an example, so that it will be much useful for me..
Thanks for ur help..
rgds,
vJ