SQL Server 2012 :: Strategy To Delete / Move Millions Of Rows In A Database?
Apr 16, 2015
I am using SQL Server 2012 Standard Edition. I am trying to delete rows from a couple of tables (GetPersonValue has 250 million rows and I need to delete 50 million of them; GetPerson has 35 million rows and I need to delete 20 million). These tables are in transactional replication. The plan is to delete data older than 400 days.
I tried moving the most recent 400 days of data to new tables, and that took about 11 hours. If I delete data in chunks of 500,000 rows, it takes a long time to rebuild the indexes (delete plus index rebuild is about 13 hours). Since I am using Standard Edition, partitioning won't work.
Find the DDL below:
GO
CREATE TABLE [dbo].[GetPerson](
[GetPersonId] [uniqueidentifier] NOT NULL,
[LinedActivityPersonId] [uniqueidentifier] NOT NULL,
[CTName] [nvarchar](100) NULL,
[SNum] [nvarchar](50) NULL,
[PHPrimary] [nvarchar](50) NULL,
I run the following statement and it will not update beyond 7 million or so rows, and I have about 38 million to complete. I keep checking the updated row counts, and after half a day they are still the same, so I know something is wrong, because it was rolling through with no problem when I initiated it. I need to complete this ASAP, so it's adding to my frustration. The Acct_Num_CH field is an encrypted field (FYI).
SET ROWCOUNT 10000;

UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL;

WHILE @@ROWCOUNT > 0
BEGIN
    SET ROWCOUNT 10000;

    UPDATE [dbo].[CC_Info_T]
    SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
    WHERE [Acct_Num_CH] IS NOT NULL;
END

SET ROWCOUNT 0;
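One thing worth noting about the loop above: rows that have already been set to the placeholder still satisfy [Acct_Num_CH] IS NOT NULL, so every pass keeps re-updating the same data and the loop never works through the remaining rows. Below is a minimal batching sketch that excludes already-updated rows; it assumes the placeholder literal itself can be used to tell masked rows from unmasked ones.

DECLARE @batch int = 10000;

WHILE 1 = 1
BEGIN
    -- Touch only rows that still need masking, one batch at a time.
    UPDATE TOP (@batch) [dbo].[CC_Info_T]
    SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
    WHERE [Acct_Num_CH] IS NOT NULL
      AND [Acct_Num_CH] <> 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v';

    IF @@ROWCOUNT = 0 BREAK;   -- nothing left to mask
END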
I have deleted nearly 30 million rows from a table. However, when I use sp_spaceused to check the space occupied by the table, I don't see any difference in the table's data size. In fact, the data size has increased by a few MB after the deletion, though not by much.
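After a large delete, sp_spaceused often reports stale or even higher numbers until usage statistics are refreshed and the emptied pages are compacted. A minimal sketch, with a hypothetical table name:

-- Recompute the space numbers before trusting them.
EXEC sp_spaceused @objname = N'dbo.BigTable', @updateusage = N'true';

-- Rebuilding the indexes compacts the pages left behind by the delete.
ALTER INDEX ALL ON dbo.BigTable REBUILD;

EXEC sp_spaceused @objname = N'dbo.BigTable';

If the table is a heap (no clustered index), the space freed by a delete is often not reclaimed until the heap is rebuilt or a clustered index is created.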
Hello, we are currently in the process of implementing a SQL Server database in which a couple of tables will have millions of rows (about 98 million, and growing) and an ASP.NET site that will retrieve and sort the data.
What is the best practice for setting up the database in a situation like this one? Do we need a clustered server? Does the indexing need to be done in a special way? Thanks in advance.
I have one view that is based on a couple of tables; its definition is below. What options can I use to optimize this view for better performance? It is one of the views causing problems on the database.
CREATE VIEW [dbo].[V_Reqs]
WITH SCHEMABINDING
AS
SELECT  purchase.Req.RequisitionID,
        purchase.Req.StatusCode AS Expr2,
        purchase.Req.CollectionDateTime,
        purchase.Req.ReportDateTime,
        purchase.Req.ReceivedDateTime,
        purchase.Req.PatientName,
        purchase.Req.AddressOne,
        purchase.Req.AddressTwo,
        purchase.Req.City,
        purchase.Req.PostalCode,
        purchase.Req.PhoneNumber,
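Because the view is already created WITH SCHEMABINDING, one option (if the workload justifies maintaining it) is to materialize it as an indexed view. A minimal sketch, assuming RequisitionID uniquely identifies a row in the view:

-- Materialize the view; the unique clustered index is what makes it "indexed".
CREATE UNIQUE CLUSTERED INDEX IX_V_Reqs_RequisitionID
    ON [dbo].[V_Reqs] (RequisitionID);

Keep in mind that on Standard Edition the optimizer generally uses the materialized data only when queries reference the view WITH (NOEXPAND).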
I have a master table and I need to import its rows into a parent table and child tables.
The master table is Flatfile_Inventory, the parent table is INVENTORY, and the child tables are INVENTORY_AMOUNT, INVENTORY_DETAILS, and INVENTORY_VEHICLE. Error details go to LOG_INVENTORY_ERROR.
I have 4 duplicate rows in Flatfile_Inventory which I have already inserted into the parent and child tables.
When I run the query again using the stored procedure, it reports that all 4 rows are duplicates and moves them to LOG_INVENTORY_ERROR.
What I need is this: if there are duplicate rows in Flatfile_Inventory when I start inserting into the parent and child tables, and a row with the same unique ID has already been inserted, I must identify it, delete that row from both the parent and child tables, and then insert the latest row from Flatfile_Inventory into the parent and child tables.
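A minimal sketch of the delete-before-reinsert step, assuming a key column named InventoryID links the staging, parent, and child tables (the column name is hypothetical):

-- Remove the previously inserted version of any row that appears again
-- in the staging table: children first, then the parent.
DELETE ca FROM INVENTORY_AMOUNT  AS ca JOIN Flatfile_Inventory AS f ON f.InventoryID = ca.InventoryID;
DELETE cd FROM INVENTORY_DETAILS AS cd JOIN Flatfile_Inventory AS f ON f.InventoryID = cd.InventoryID;
DELETE cv FROM INVENTORY_VEHICLE AS cv JOIN Flatfile_Inventory AS f ON f.InventoryID = cv.InventoryID;
DELETE p  FROM INVENTORY         AS p  JOIN Flatfile_Inventory AS f ON f.InventoryID = p.InventoryID;

-- After the stale rows are gone, re-run the normal insert logic so the
-- latest rows from Flatfile_Inventory are loaded into the parent and children.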
Is there a way to move the distribution database (which is currently on the publisher server) to a different server without having to resync the subscriptions?
Stepping through the code with the debugger shows the dataset rows being deleted.
After executing the code and getting to the page presentation, I stop debugging and start the page creation process again (Page_Load). The database still contains the dataset rows that were deleted. Adding rows works, and updating works fine, but deleting rows does not seem to work.
The dataset is configured to send the DataSet updates to the database; I used the standard wizard to create the DataSet.
cDependChildTA.Fill(cDependChildDs._ClientDependentChild, UserId);
rowCountDb = cDependChildDs._ClientDependentChild.Count;

for (row = 0; row < rowCountDb; row++)
{
    dr_dependentChild = cDependChildDs._ClientDependentChild.Rows[0];
    dr_dependentChild.Delete();
    //cDependChildDs._ClientDependentChild.Rows.RemoveAt(0);
    //cDependChildDs._ClientDependentChild.Rows.Remove(0);
    /* update the Client Process Table Adapter */
    // cDependChildTA.Update(cDependChildDs._ClientDependentChild);
    // cDependChildTA.Update(cDependChildDs._ClientDependentChild);
}

/* zero rows in the DataSet at this point */
/* update the Child Table Adapter */
cDependChildTA.Update(cDependChildDs._ClientDependentChild);
I have a query to delete millions of records, and I want to delete them in batches of 1,000. My SELECT join statement returns millions of records, so it takes a lot of time. How do I select 1,000 records, delete everything that is not in those records, and then loop without selecting the same records again? Here is what I have:
DECLARE @i INT;

WHILE (1 = 1)
BEGIN
    BEGIN TRAN;

    DELETE TOP (1000)
    FROM dbo.ABC123
    WHERE SUBSTRING(dumbdumb, 1, 8) NOT IN
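One way to avoid re-running the expensive join on every pass is to materialize the set of keys to keep once, then loop over small deletes against that set. A minimal sketch under that assumption; #KeysToKeep and the key expression stand in for the real join criteria:

-- Run the expensive join once and keep its result.
SELECT DISTINCT SUBSTRING(dumbdumb, 1, 8) AS KeyPrefix
INTO #KeysToKeep
FROM dbo.ABC123;   -- the original join/filter criteria would go here

CREATE CLUSTERED INDEX IX_KeysToKeep ON #KeysToKeep (KeyPrefix);

-- Delete everything that is not in the kept set, 1,000 rows at a time.
WHILE 1 = 1
BEGIN
    DELETE TOP (1000) a
    FROM dbo.ABC123 AS a
    WHERE NOT EXISTS (SELECT 1
                      FROM #KeysToKeep AS k
                      WHERE k.KeyPrefix = SUBSTRING(a.dumbdumb, 1, 8));

    IF @@ROWCOUNT = 0 BREAK;
END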
The issue: SQL 2000. I have to keep the data from the last 3 months in the database. Every day I have to load 2 million records into the database, so every day I also have to export (to another database acting as a historical data container) and delete the 2 million records that were inserted 3 months + one day ago. The main problem is that the delete operation takes a while and involves the transaction log.
The questions are:
1) How can I improve this operation (export/delete)?
2) If we decide to migrate to SQL 2005, can we use some feature, such as partitioning, to solve the problem? In Oracle I can use the TRUNCATE PARTITION statement, but from what I'm reading it can't be done in SQL 2005. The idea would be to create partitions over the last three months to split the data. Can the partitioning function be dynamic, or contain a function that says "the last 3 months"? I don't think so.
Can you help us? Thank you. Mastino
I'm looking for clarity on partition switching. The idea is to run many BULK INSERT statements into tables dbo.X_n in parallel and, when the BULK INSERT for table dbo.X_n is completed, switch dbo.X_n into dbo.bigdaddy. I think this is the fastest way to upload a couple hundred GB of data.
In learning about partition switching (in part) from The Data Loading Performance Guide, under Partition SWITCH, I read instructions to copy the main table exactly to become the switch target. But in that same step (#1), I also read that we need to change the default filegroup of the target (dbo.X_n) from the default filegroup. Then it says I need to match the indexes, and it lists the filegroup as something we need to match with the main table.
As an overview of the partition-switching strategy, I think the whole point of BULK INSERT with partitioning is to have separate files (in the same filegroup) to enable concurrent uploading, where each table has its own file. Once the upload into a table (dbo.X_n) is complete, we do the partition switch into the main table (dbo.bigdaddy). The data we just uploaded doesn't actually move; only the metadata for it does.
So the guidance seems to say both: “Don’t have the same filegroup on your target as the main table” and “You must have the same filegroup on your target as the main table.”
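For what it's worth, the SWITCH itself requires the staging table and the target partition to sit on the same filegroup and to have matching structure (columns and indexes), plus a trusted check constraint proving the staging rows fall inside the target partition's boundary. A minimal sketch using the table names from the post; the partition number, date column, and boundary values are hypothetical:

-- Staging table dbo.X_1 must match dbo.bigdaddy's structure and filegroup,
-- and carry a check constraint that pins its rows to partition 5's range.
ALTER TABLE dbo.X_1
    ADD CONSTRAINT CK_X_1_Range
    CHECK (LoadDate >= '20150101' AND LoadDate < '20150201');

-- The switch is a metadata-only operation: the data does not move.
ALTER TABLE dbo.X_1 SWITCH TO dbo.bigdaddy PARTITION 5;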
I have an application that generates a lot of rows, from 1 million to 2 million, and I want to insert these records into MS SQL Server in a fast way.
Currently I loop through the records while they are loaded in a dataset, build a command text that generates an INSERT query for each row, and run it against SQL Server,
but it takes a lot of time to finish. Is there a way to bulk insert this data?
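Row-by-row INSERT statements are the slowest possible path for this volume. If the rows can be written to a flat file first (or the application can use a bulk-load API such as SqlBulkCopy on the .NET side), the server can ingest them in large batches. A minimal T-SQL sketch, with a hypothetical target table and file path:

-- Bulk-load a comma-separated file in 100,000-row batches.
BULK INSERT dbo.TargetRows
FROM 'C:\load\rows.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 100000,
    TABLOCK          -- allows minimally logged loads when the recovery model permits
);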
Hi everyone - I have an ETL package which loads about 10 million rows from SQL 2005 staging tables into new, empty tables (no indexes or constraints) in another SQL 2005 database, to be SWITCHed into the main partitioned data tables.
Both databases reside on the same SQL Server instance. It is a dev server, so the disks aren't super fast/SAN speeds, but it has plenty of RAM/CPU and SCSI disks.
The insert takes about 45 minutes - can I get this working any faster, or is this typical for 10 million rows? I've messed about with the data flow a few times but I can't seem to get any significant improvement.
Any tips, anyone?
I perform several lookups on dimensions - these are not cached.
I query the source table concurrently with different WHERE clauses and run two pipelines processing the data into two destination tables.
Would it be better to query the base table once and use a Conditional Split instead of the two separate queries?
I also multicast from each pipeline and use a UNION ALL to log some of the rows from each pipeline to another destination table.
Hope this makes sense - any ideas or tips on how I can speed up this kind of transform would be appreciated.
I'm using OLE DB connections.
Hope this makes some kind of sense! Thanks for any advice!
Hello, we are currently in the process of implementing a SQL Server database in which a couple of tables will have millions of rows (about 98 million, and growing) and a web site that will retrieve and sort the data (read only). How will an ASP.NET GridView and SqlDataReader behave in a situation like that? Will the response be very slow? Is there any alternative? Is there an example on the net? Assume the tables are well tuned and well indexed. Thank you in advance.
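Pulling millions of rows back to the page is never going to be fast, no matter how well the tables are indexed; the usual alternative is server-side paging so each request reads only one page of rows. A minimal sketch of a paging query (OFFSET/FETCH needs SQL Server 2012 or later; the table, columns, and page size here are hypothetical):

DECLARE @PageSize   int = 50,
        @PageNumber int = 3;              -- zero-based page index
DECLARE @Offset     int = @PageNumber * @PageSize;

SELECT  OrderId, CustomerName, OrderDate
FROM    dbo.Orders
ORDER BY OrderDate DESC, OrderId          -- a deterministic sort order is required
OFFSET  @Offset ROWS
FETCH NEXT @PageSize ROWS ONLY;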
I have a DataTable in memory and I want to write C# code to dump the data into a SQL Server database. Is there a faster way of dumping millions of rows into a SQL table than running INSERT INTO row by row?
Question A: I need to truncate a table; it has 21 million rows and a size of 14 GB.
1. How do I find out whether this table is referenced by a FOREIGN KEY?
2. Does it participate in an indexed view?
3. Is it being published using transactional replication or merge replication?
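A minimal set of metadata checks for those three questions; the table name is a hypothetical placeholder:

DECLARE @table sysname = N'dbo.MyBigTable';

-- 1. Foreign keys that reference the table
SELECT fk.name AS foreign_key,
       OBJECT_NAME(fk.parent_object_id) AS referencing_table
FROM sys.foreign_keys AS fk
WHERE fk.referenced_object_id = OBJECT_ID(@table);

-- 2. Schema-bound views over the table that have indexes (i.e. indexed views)
SELECT v.name AS view_name
FROM sys.views AS v
JOIN sys.sql_expression_dependencies AS d
     ON d.referencing_id = v.object_id
    AND d.referenced_id  = OBJECT_ID(@table)
WHERE EXISTS (SELECT 1 FROM sys.indexes AS i WHERE i.object_id = v.object_id);

-- 3. Replication flags on the table
SELECT t.name, t.is_published, t.is_merge_published
FROM sys.tables AS t
WHERE t.object_id = OBJECT_ID(@table);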
I have a query regarding the same issue. In my stored procedure I am fetching data from a table containing 5 to 6 million rows. I created indexes in my database, but I still cannot optimize the execution time of that stored procedure. Please help me out with this problem.
I am setting up a database which schedules production and tracks inventory of items on a daily basis. The scheduler may put in 100 identical entries (apart from the identity column) of an item with its corresponding quantity. My problem is: if there is a shipment of product (a subtraction of quantity from the database), how can I delete a specified number of rows where the inventory listing is 100,000 pcs? I think the DELETE TOP(r) command will work, but I don't know how to turn the row count into an actual variable. Maybe there is another way too. My current, not-working attempt: I look at the product to be deleted, figure out how many rows to delete, and, since it is not always an integer, figure out a quantity to add back in. The addition part works fine, but the delete command needs work. Any help is appreciated.

int InvRows = 0;
decimal RealInvRows = 0;
decimal AddQty = 0;
int preAddAmount = 0;

protected void DelInv_Click(object sender, EventArgs e)
{
    Label TotProdSum = (Label)DetailsView2.FindControl("TotProdSum");
    Label RowQty = (Label)DetailsView3.FindControl("RowQty");
    int SubQty = Convert.ToInt32(ShipQty.Text);

    InvRows = SubQty / Convert.ToInt32(RowQty.Text) + 1;
    RealInvRows = SubQty / Convert.ToDecimal(RowQty.Text);
    AddQty = (InvRows - RealInvRows) * Convert.ToInt32(RowQty.Text);

    IntLbl.Text = Convert.ToString(InvRows);
    RealLbl.Text = Convert.ToString(RealInvRows);
    preAddAmount = Convert.ToInt32(AddQty);
    AddAmount.Text = Convert.ToString(preAddAmount);

    for (int r = 0; r <= InvRows; r++)
    {
        forWhile.DeleteCommand = "DELETE TOP (r) FROM Inventory WHERE (Inventory = @Inventory)";
        forWhile.DeleteParameters.Add("Inventory", RowQty.Text);
        forWhile.Delete();
        forWhile.DeleteParameters.Clear();
    }

    forWhile.InsertCommand = "INSERT INTO Inventory(Dte, Product, Inventory) VALUES (@Dte, @Product, @Inventory)";
    forWhile.InsertParameters.Add("Inventory", AddAmount.Text);
    forWhile.InsertParameters.Add("Product", InvProdDDL.Text);
    forWhile.InsertParameters.Add("Dte", Date.Text);
    forWhile.Insert();
    forWhile.InsertParameters.Clear();
}
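The literal "DELETE TOP (r)" in the command string never sees the loop variable, so the statement cannot delete a variable number of rows. Since DELETE TOP accepts a variable, the whole loop can collapse into one parameterized statement. A minimal T-SQL sketch; the values and column names mirror the post but are otherwise assumptions:

-- Delete a variable number of matching inventory rows in one statement.
DECLARE @RowsToDelete int = 5,
        @Inventory    int = 100000;

DELETE TOP (@RowsToDelete)
FROM dbo.Inventory
WHERE Inventory = @Inventory;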
I am writing a query in which I identify different scenarios where data changes from one week to the next. I've set up my result set in the following manner:
PrimaryID   Field Changed   Previous Value   New Value
10003       SKUName         SKU12345         SKU56789
10003       LocationId      Den123           NYC987
etc...
The key here is that in the initial result set, ID 10003 is represented by one row that indicates two changes, while in the final output those two changes are represented by two distinct rows. Obviously, I will bring in the previous and new values from a source.
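One common way to fan a single wide row out into one row per changed column is CROSS APPLY with a VALUES constructor. A minimal sketch, assuming last-week and this-week snapshots joined on PrimaryID (the table and column names are placeholders for the real source):

SELECT  c.PrimaryID,
        v.FieldChanged,
        v.PreviousValue,
        v.NewValue
FROM dbo.LastWeek AS p
JOIN dbo.ThisWeek AS c
    ON c.PrimaryID = p.PrimaryID
CROSS APPLY (VALUES
                ('SKUName',    p.SKUName,    c.SKUName),
                ('LocationId', p.LocationId, c.LocationId)
            ) AS v (FieldChanged, PreviousValue, NewValue)
WHERE v.PreviousValue <> v.NewValue;   -- note: NULL-to-value changes need extra handling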
I would like to know how I can move 70-plus tables that are on SQL 7.0 to SQL 2012 via SSIS. I know it's a two-step process, but what is the best route and how should I proceed?
I have a scenario where a customer is going to be using log shipping to the DR site; however, we need to maintain the normal backup strategy on the current system (i.e., nightly full, differential every 6 hours, and hourly transaction log backups). I know how to set up transaction log shipping and fail over to DR and back, but now the local backup strategy is going to be an issue. I currently use the [URL] .... maintenance solution.
Is it even possible to do regular local backups, keeping data integrity for your backup strategy, with transaction log shipping enabled?
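Full and differential backups generally don't interfere with log shipping, but additional transaction log backups do, because they break the log chain that the log-shipping backup job depends on. COPY_ONLY backups sidestep both problems. A minimal sketch, with a hypothetical database name and paths:

-- Full backup that does not reset the differential base
BACKUP DATABASE MyDb
TO DISK = N'D:\Backups\MyDb_full_copyonly.bak'
WITH COPY_ONLY, INIT;

-- Log backup that does not break the log-shipping backup chain
BACKUP LOG MyDb
TO DISK = N'D:\Backups\MyDb_log_copyonly.trn'
WITH COPY_ONLY, INIT;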
Hello, I am working on a project using SQL Server 2000 with a database containing about 10 related tables that have a lot of columns containing text. The total current size of the database is about 2 GB. When I delete data from the database, it takes a lot of system resources and monopolizes the database, so that all other query requests are slow as mud!
Ideally, I would like to be able to issue delete commands against a primary table and get a fast response back. It doesn't matter to me how long the actual deletion operation takes, as long as its priority is low compared to the other query requests coming in. Typically, removing a single row from the primary table results in the deletion of up to 300 rows from related tables.
Questions:
1. Can I create a trigger on the primary table that will delete the rows from that table, issue a delayed/low-priority delete for all of the other tables, and return to the application quickly?
2. Can a trigger be run in an asynchronous mode (that is, issue the command, return immediately, and then go about its business on its own time)?
3. Can the priority of a SQL statement be specified?
4. Is there a Transact-SQL "sleep" command that would allow you to do some work, sleep for a little bit, do some more work, and so on?
Any help in this area would be greatly appreciated. Thanks in advance.
Bob Ganger
General Dynamics
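On the last question: T-SQL does have a sleep in the form of WAITFOR DELAY, and one common pattern for low-impact cleanup is to have the trigger only queue the key to delete, while a background job drains the queue in small batches with a pause between them. A minimal sketch with hypothetical table names, using SET ROWCOUNT since the post is on SQL Server 2000:

-- Background cleanup job: remove related rows for queued keys in small batches.
SET ROWCOUNT 500;

WHILE 1 = 1
BEGIN
    DELETE c
    FROM dbo.ChildTable AS c
    JOIN dbo.DeleteQueue AS q
        ON q.PrimaryKeyId = c.PrimaryKeyId;

    IF @@ROWCOUNT = 0 BREAK;      -- no related rows left to remove

    WAITFOR DELAY '00:00:02';     -- pause so other queries get a turn
END

SET ROWCOUNT 0;

-- Clear the queue entries once their related rows are gone.
DELETE FROM dbo.DeleteQueue;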
USE [Testing]
GO

/****** Object: Table [dbo].[Testing] Script Date: 4/25/2014 11:08:18 AM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
[Code] ....
It seems to work fine with one million records.
Each primary key is unique, but the begindate is non-unique, and I guess that even if I use datetime2 and add nanoseconds, from what I have read there is still a chance I could have a duplicate datetime, since the date is imported via XML from multiple sources.
I need to move the log segment of one database from its current position to another drive. The current log segment is used by two databases; this was created by mistake. Now we need to fix the problem and create a separate log segment for the other database. If we keep the log segment as it is, we will have problems deleting one of the databases in the future.
I am using SQL Server 2012 Standard Edition. I have a requirement to instantaneously move data from 3 tables that are dependent on each other in one database into the same tables (same structure and dependencies) in another database.
I could set up replication to manage this; however, the data, once moved over, has to be deleted from the source database immediately after the move, so replication is ruled out. Also, data is continuously being inserted into those 3 tables in the source database.
I want to create a SQL Agent job that handles the move-and-delete process and schedule it to run once every minute. What is the best strategy to handle this without causing deadlocks in the source database? Below is the DDL; all objects in the source database match the destination database. The only difference is that the destination has 100 tables, while the source has only the 3 tables shown below.
CREATE TABLE [dbo].[StackPosition](
    [StackPositionId] [uniqueidentifier] NOT NULL,
    [AccountTriggerId] [uniqueidentifier] NOT NULL,
    [StackPositionStatusId] [int] NOT NULL,
    [QueuedAt] [datetime] NOT NULL,
    [LastUpdatedAt] [datetime] NULL,
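A minimal sketch of one move-and-delete pass that the agent job could run every minute, assuming the destination database lives on the same instance (DestDb is a hypothetical name). Small batches plus READPAST keep the job from blocking the concurrent inserts for long:

DECLARE @batch TABLE (StackPositionId uniqueidentifier PRIMARY KEY);

BEGIN TRAN;

    -- Pick a small batch of rows, skipping any that concurrent inserts still hold locked.
    INSERT INTO @batch (StackPositionId)
    SELECT TOP (5000) StackPositionId
    FROM dbo.StackPosition WITH (UPDLOCK, ROWLOCK, READPAST);

    -- Copy the batch to the destination database on the same instance.
    INSERT INTO DestDb.dbo.StackPosition
        (StackPositionId, AccountTriggerId, StackPositionStatusId, QueuedAt, LastUpdatedAt)
    SELECT s.StackPositionId, s.AccountTriggerId, s.StackPositionStatusId, s.QueuedAt, s.LastUpdatedAt
    FROM dbo.StackPosition AS s
    JOIN @batch AS b ON b.StackPositionId = s.StackPositionId;

    -- Remove the batch from the source only after the copy succeeds.
    DELETE s
    FROM dbo.StackPosition AS s
    JOIN @batch AS b ON b.StackPositionId = s.StackPositionId;

COMMIT;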
I'm using SqlDataSource and an Access database. Let's say I have two tables:
user: userID, username
message: userID, messagetext
A user can register on my website and leave several messages there. I have an admin page where I can select a user and delete all of his messages just by clicking one button. What would be the best (and easiest) way to do this? Here's my suggestion: I have made a "delete query" (with userID as a parameter) in MS Access. It deletes all the messages of a user when I type in the userID and click OK. Would it be possible to do this on my ASP.NET page? If yes, what would the script look like? (Yes, it is a newbie question.)
I ran the following query in Query Analyzer on a machine running SQL Server 2000. I'm attempting to delete from a linked server running SQL Server 2005:
DELETE FROM sql2005.production.dbo.products
WHERE vendor = 'Foo'
  AND productId NOT IN (
        SELECT productId
        FROM sql2000.staging.dbo.fooProductList
      )
The status message (and @@ROWCOUNT) told me 8 rows were affected, but nothing was actually deleted; when I ran a SELECT with the same criteria as the DELETE, all 8 rows are still there. So, once more I tried the DELETE command. This time it told me 7 rows were affected; when I ran the SELECT again, 5 of the rows were still there. Finally, after running this exact same DELETE query 5 times, I was able to remove all 8 rows. Each time it would tell me that a different number of rows had been deleted, and in no case was that number accurate.
I've never seen anything like this before. Neither of the tables involved was undergoing any other changes. There's no replication going on, or anything else that should introduce any delays. And I run queries like this all day, involving every conceivable combination of 2000 and 2005 servers, without any trouble.
Does anyone have suggestions on what might cause this sort of behavior?
I have created a trigger that checks whether a table is updated and, if it is, copies the values of the updated row into a separate control table. Now I want to read the contents of the control table into BizTalk and, after reading, delete those rows. Can anyone suggest a suitable way to do this?
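One way to make the read-then-delete step atomic (so BizTalk never reads the same control row twice) is a destructive read: a DELETE with an OUTPUT clause returns the rows it removes in a single statement. A minimal sketch; the column names on the control table are hypothetical:

-- Read and delete the queued control rows in one atomic statement.
DELETE FROM dbo.control_table
OUTPUT deleted.RowId,
       deleted.UpdatedValue,
       deleted.UpdatedAt;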
Hi, could anyone tell me the backup strategy for a 1000 GB database? Thank you! Peter Wang