I have a table that has 4+ million records. I need to update those records. I am facing some performance issues. Can someone please advise?
update stage
set batch_status = 1
where update_status = 0
Update [transaction]
Set aId = s.aId,
b = s.b
from stage s
Where s.aId = [transaction].aId
and s.batch_status = 1
Update stage
Set update_status = 1,
batch_status = 2
where
batch_status = 1
When I run the above query with "set rowcount 1000", it runs in one minute. When I run the query for "set rowcount 10000", it runs in 1 hour 56 minutes. Can someone help me to optimize it?
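A pattern that often replaces SET ROWCOUNT here (it is deprecated for UPDATE/DELETE from SQL Server 2005 on) is UPDATE TOP in a loop that commits each batch. The sketch below reuses the stage/[transaction] statements from above with an arbitrary batch size; treat it as a hedged outline rather than a drop-in fix.

DECLARE @batch int
SET @batch = 10000

WHILE 1 = 1
BEGIN
    BEGIN TRANSACTION

    -- pick the next slice of unprocessed stage rows (arbitrary order, as with SET ROWCOUNT)
    UPDATE TOP (@batch) stage
    SET batch_status = 1
    WHERE update_status = 0

    IF @@ROWCOUNT = 0
    BEGIN
        COMMIT TRANSACTION
        BREAK
    END

    UPDATE t
    SET t.aId = s.aId,
        t.b = s.b
    FROM [transaction] AS t
    JOIN stage AS s ON s.aId = t.aId
    WHERE s.batch_status = 1

    UPDATE stage
    SET update_status = 1,
        batch_status = 2
    WHERE batch_status = 1

    COMMIT TRANSACTION
END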
I am trying to update a large table which consists of 45 million records. It is taking more than 2 days to complete the update. Below is my approach:
1. The table has only one clustered index and no other indexes on the table.
2. I am updating in batches of 20,000 records.
3. I changed the recovery model to bulk-logged, auto-growth is set to 300MB, and there is enough space on my disk for the transaction log.
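A hedged sketch of one way to keep each batch cheap is to walk the clustered index key instead of letting every batch re-scan the table. The table and column names below (dbo.BigTable, Id, TargetCol, SourceCol) are placeholders for your real schema, and Id is assumed to be the int clustered key.

DECLARE @minId int, @maxId int, @batch int
SELECT @minId = MIN(Id), @maxId = MAX(Id) FROM dbo.BigTable
SET @batch = 20000

WHILE @minId <= @maxId
BEGIN
    -- each batch touches one contiguous slice of the clustered key,
    -- so it is a range seek rather than a repeated scan
    UPDATE dbo.BigTable
    SET TargetCol = SourceCol        -- replace with the real assignment
    WHERE Id >= @minId
      AND Id < @minId + @batch

    SET @minId = @minId + @batch
END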
I want to update tableToUpdate in batches of 5000 per batch and set the lastenecryptionDT to null based on the join to the tableValues using the column ENCRYPTIONID, and also output the updated rows into another table in case I need to do a rollback.
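A hedged sketch of that shape: UPDATE TOP (5000) in a loop, with an OUTPUT clause capturing the pre-update values for a possible rollback. It assumes an audit table dbo.tableToUpdate_audit with matching columns already exists; any name not mentioned in the question is a placeholder.

DECLARE @rows int
SET @rows = 1

WHILE @rows > 0
BEGIN
    UPDATE TOP (5000) t
    SET t.lastenecryptionDT = NULL
    OUTPUT deleted.ENCRYPTIONID,
           deleted.lastenecryptionDT            -- the pre-update value, kept for rollback
    INTO dbo.tableToUpdate_audit (ENCRYPTIONID, lastenecryptionDT)
    FROM dbo.tableToUpdate AS t
    JOIN dbo.tableValues AS v ON v.ENCRYPTIONID = t.ENCRYPTIONID
    WHERE t.lastenecryptionDT IS NOT NULL        -- already-processed rows drop out of later batches

    SET @rows = @@ROWCOUNT
END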
I have a requirement to delete 1 million records from a table holding 10 million rows, and it is being queried 24/7 (we don't have a downtime window). How can I achieve that?
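One commonly suggested approach, offered here only as a sketch, is to delete in small batches inside short transactions so concurrent queries are never blocked for long; dbo.BigTable and the DeleteFlag filter below are placeholders for the real table and the predicate that identifies the 1 million rows.

DECLARE @rows int
SET @rows = 1

WHILE @rows > 0
BEGIN
    DELETE TOP (2000) FROM dbo.BigTable
    WHERE DeleteFlag = 1

    SET @rows = @@ROWCOUNT

    WAITFOR DELAY '00:00:01'     -- brief pause so the 24/7 readers get a chance to run
END

Keeping batches well under 5000 rows also reduces the chance of lock escalation to a table lock.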
I need to update about 1.3 million rows in a table of mine. I am getting the data from one of the columns of the same table and updating the new column. I am doing this using a cursor which I have put in a stored procedure, as this is a production table which users might be accessing. It is a web based application and I can't slow the system down, so I am willing to run the stored procedure during off-peak hours. However, do I need to put this in a transaction? If I did put it in a transaction, what type of isolation level should I opt for? Data integrity is very important for me and I don't mind compromising on performance. I am doing this because one of the columns which holds a "short description" entry has become too small for business purposes and we want to increase its length from varchar(100) to varchar(150). As this is SQL 6.5, I can't increase the length of the column, so I added a new column and will run the stored proc. What precautions are to be taken? This is on a high priority basis and very important too.
Thanks in advance...
Stored procedure code:
USE DB_Registration_Dev
GO
IF EXISTS (SELECT NAME FROM SYSOBJECTS WHERE NAME = 'usp_update_product' AND TYPE = 'P')
    DROP PROCEDURE usp_update_product
GO
CREATE PROC usp_update_product
AS
DECLARE @short_desc varchar(100)
DECLARE @prod_id int

DECLARE sdesc_curs CURSOR FOR
    SELECT [Product].[product_id], [Product].[short_description]
    FROM Product

OPEN sdesc_curs

FETCH NEXT FROM sdesc_curs INTO @prod_id, @short_desc

WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE Product
    SET [Product].[sdesc] = @short_desc
    WHERE Product_id = @prod_id

    FETCH NEXT FROM sdesc_curs INTO @prod_id, @short_desc

    IF @@FETCH_STATUS <> 0
        PRINT ' Finished Updating the table...go ahead and have fun ...! '
END

CLOSE sdesc_curs    -- the original skipped CLOSE before DEALLOCATE
DEALLOCATE sdesc_curs
GO
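For comparison, the same copy can usually be done without a cursor as a single set-based statement; a minimal sketch, assuming sdesc is the new varchar(150) column being filled from short_description:

UPDATE Product
SET sdesc = short_description
WHERE sdesc IS NULL   -- only rows not yet copied, so the statement can be rerun safely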
I'm looking for some performance assistance on updating a column value in a table that contains approximately 50 million rows. I have a permanent table in another database that has the key column and value to be set. My query is listed below, but I'm afraid it will run quite a while. Any suggestions would be appreciated.
update a set column2 = b.column2 from mytable as a join mytable1 as b on a.column1 = b.column1
There is a one to one relationship between the two tables.
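One small, hedged refinement at this scale is to skip rows whose value would not change, so the statement writes and logs less:

update a
set a.column2 = b.column2
from mytable as a
join mytable1 as b on a.column1 = b.column1
where a.column2 <> b.column2
   or (a.column2 is null and b.column2 is not null)   -- treat NULL -> value as a change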
I am currently working on a simple page to insert 1.6 million UK postcode records into a SQL Server table. The table has three columns for the postcode, longitude coordinate and latitude coordinate. The data is sourced from a pipe (|) delimited txt file and inserted into the database using a FOR loop. The problem I have is that the page will hang after inserting only 10,000 records; the page displays either an invalid View State error or a page cannot be found error. Now I assume the viewstate error stems from the fact that there is a form on the page which simply contains a button to execute the script and a few labels to show the progress. But without the form and associated viewstate the insert still fails to complete... any ideas?? Would I be better running this on a thread, or should I just do it in stages and be patient? I have now modified the page to read the database on load and pick up from where it crashes.
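Rather than looping from the page, one option is to let SQL Server load the file directly with BULK INSERT; a minimal sketch, where the staging table, column names and file path are all illustrative and the path must be visible to the SQL Server service account:

CREATE TABLE dbo.PostcodeStaging (
    Postcode varchar(10),
    Longitude float,
    Latitude float
)

BULK INSERT dbo.PostcodeStaging
FROM 'C:\data\postcodes.txt'
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR = '\n',
    TABLOCK            -- allows a minimally logged load where the recovery model permits
)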
Hey folks... So I have a table that looks like this:

CREATE TABLE [tblStation] (
[CAMPAIGN] [varchar] (8),
[LISTNUM] [varchar] (10),
[PHONE] [varchar] (10),
[EVENTTIME] [datetime],
[STATION] [int],
[OPERATOR] [varchar] (16),
[EVENTCODE] [varchar],
[CALLSPAN] [decimal](18, 0),
[FDISP] [int],
[RECORDNUM] [varchar],
[STC] [varchar],
[PROMOC] [varchar],
[EXP_CAMP] [varchar],
[PROMO3] [varchar],
[MAXATT] [char],
[LISTNAME] [varchar],
[SITENAME] [char],
[Row_id] [int] IDENTITY)

It's taking nine seconds to run the following command:

SELECT count([fdisp])
FROM [TrunkFiles_new].[dbo].[tblStation] WITH (NOLOCK)
WHERE fdisp IS NULL

Anyone familiar with a table of this size having performance like this? The [fdisp] column has a non clustered index on it. Thanks in advance...
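Two hedged observations: COUNT([fdisp]) ignores NULLs, so combined with WHERE fdisp IS NULL it will always return 0 and COUNT(*) is probably what was intended; and if the server is SQL Server 2008 or later, a filtered index that contains only the NULL rows turns the count into a scan of a much smaller structure:

CREATE NONCLUSTERED INDEX IX_tblStation_fdisp_null
ON dbo.tblStation (fdisp)
WHERE fdisp IS NULL

SELECT COUNT(*)
FROM dbo.tblStation
WHERE fdisp IS NULL    -- can be answered from the filtered index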
How well can SQL Server support 300 million records? Is anybody working on a big database like this? Can anyone give me some input on this? It's going to be a 60GB database.
I have a directory database with approx. 80 million records. I am feeding the database with bulk_insert. Indexing one of the fields took about 8 hrs. After indexing, when I run queries on the indexed field the response time is under 1 sec. However, if I run select queries with LIKE on non-indexed fields it takes more than 2 mins. So I decided to index 4 other fields in the database and it looks like the indexing process is going to run for 2 days. I am a novice in SQL database design and I am not sure if this is the best way to index the table. I am just using CREATE INDEX. Any suggestions / advice welcome.
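A couple of hedged notes: plain CREATE INDEX is fine, but options such as SORT_IN_TEMPDB can move the sort work off the data files, and a leading-wildcard LIKE ('%term%') cannot seek any ordinary index, so those queries will stay slow regardless unless you move to full-text search. The table and column names below are placeholders.

CREATE NONCLUSTERED INDEX IX_Directory_Field4
ON dbo.Directory (Field4)
WITH (SORT_IN_TEMPDB = ON)   -- 2005+ option syntax; on SQL 2000 it is WITH SORT_IN_TEMPDB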
Hello, what is the fastest way to update 20 million records in our database? I have tried to do a simple update statement like this:

update trail_log with (tablockx, holdlock)
set trail_log.entry_by = users.user_identity
from users
where trail_log.entry_by = users.user_id
but it takes 10 plus hours to run since it cannot commit the transactions until the very end. So I was thinking that I need to commit in batches, like after 50K, but that is slow as well.

Set rowcount 50000
Declare @rc int
Set @rc=50000
While @rc=50000
Begin
Begin Transaction
update trail_log With (tablockx, holdlock)
set trail_log.entry_by = users.user_identity
from users
where trail_log.entry_by = users.user_id
and trail_log.entry_by not like '%[0-9]%'
Select @rc=@@rowcount
--Commit the transaction
Commit
End
go

I have let the above statement run for 1.5 hours and it only updated 450,000 rows. Any ideas... Maybe I'm doing it wrong. Please Help!!
I have a sql script that updates records in a table with 40 million records.
There is some functionality in the script that could be put away in functions for code reuse/elegance.
Functions would cause execution overhead.
What else could I use besides functions that would allow me code reuse without compromising the execution overhead? Is there anything like includes in T-SQL that would allow me to do so?
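There is no #include in T-SQL, but an inline table-valued function is one reuse mechanism the optimizer expands into the calling query, so it usually avoids the per-row overhead of scalar functions. A hedged sketch with made-up names:

CREATE FUNCTION dbo.fn_ActiveCustomers (@Region varchar(10))
RETURNS TABLE
AS
RETURN
(
    SELECT CustomerID, CustomerName     -- the shared logic lives in one place
    FROM dbo.Customers
    WHERE Region = @Region
      AND IsActive = 1
)
GO

-- reused like a parameterised view
SELECT c.CustomerID
FROM dbo.fn_ActiveCustomers('UK') AS c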
I have a new client with an existing system that has just over 2 million business listings in one table. Each business listing is associated with one business category.
* Company Table (around 20 fields):
companyID, companyName, categoryID, state, postCode, etc.
* Category Table (5 fields)
categoryID, categoryName, etc.
We are using MSSQL 2005 Express Edition with Advanced Services
A free text search needs to be performed on the companyName and categoryName, limited by region (state and/or postcode).
1) What kind of response times should I expect for the free text search? (I have not used free text search before.)
2) How should I index the companyName and categoryName so they are both used in a joined query? i.e. Do I just configure the free text search index on each field separately and it should work?
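A hedged sketch of the usual Full-Text Search setup: one full-text index per table, each keyed on an existing unique index, with the predicates combined with an ordinary WHERE clause for the region filter. The key index names, table names, state value and search term below are all assumptions to adapt to the real schema.

CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT

CREATE FULLTEXT INDEX ON dbo.Company (companyName) KEY INDEX PK_Company
CREATE FULLTEXT INDEX ON dbo.Category (categoryName) KEY INDEX PK_Category

SELECT c.companyID, c.companyName, cat.categoryName
FROM dbo.Company AS c
JOIN dbo.Category AS cat ON cat.categoryID = c.categoryID
WHERE c.state = 'NSW'
  AND (FREETEXT(c.companyName, 'plumbing supplies') OR FREETEXT(cat.categoryName, 'plumbing supplies'))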
I want to compare ONLY 1 Column values from 2 tables having more than 4.9 million records. There is a difference of 4000 rows between the 2 tables.
SELECT ID From TABLE1 where ID not in (SELECT DISTINCT ID From TABLE2)
My above query took nearly 4.5 hours to run and I had to cancel it. Is there a better way to write the query? I just want to compare the ID column values which are missing in TABLE2.
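Two hedged alternatives that usually behave better than NOT IN at this volume (and are not tripped up by NULLs), assuming ID is indexed in both tables:

-- anti-join form
SELECT t1.ID
FROM TABLE1 AS t1
WHERE NOT EXISTS (SELECT 1 FROM TABLE2 AS t2 WHERE t2.ID = t1.ID)

-- set-difference form (SQL Server 2005 and later); the result is distinct by definition
SELECT ID FROM TABLE1
EXCEPT
SELECT ID FROM TABLE2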
I come from a web based world where loading 1.5 million records into a temp table is suicide. I’m doing more data warehouse stuff now and I was looking into optimizing a buddy’s proc and noticed he was loading 1.5 million records into a temp table. We had a discussion about it because, being from a web world, I was drastically against it. He on the other hand didn’t feel it was an issue, given it gets called once maybe twice a day. The tempdb is set to autogrow and it is on a different drive than all the other databases on the box. It has one ldf and mdf. He’s creating an index on the table after the load. Why shouldn’t we be loading 1.5 million recs into a temp table?
Hi, I have 2 tables with more than a million records in each and I have to perform a full outer join. The problem is that the join clause contains 2 different parameters (int and string), like this:
Select * From a full outer join b On a.cli = b.cli OR a.reference = b.reference
Because of the OR in the clause and the million records, the query never finishes. If I change it to one rule only then it works fine.
How can I join these 2 big tables with 2 rules? Thanks Itay
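One hedged workaround is to split the OR into two equality joins and stitch the result together, so each branch can use an ordinary index on cli or reference. The sketch uses a single payload column per table as a stand-in for the real select lists, and UNION removes exact duplicates, so adapt it if duplicate rows matter.

SELECT a.cli, a.reference, a.payload AS a_payload, b.payload AS b_payload
FROM a JOIN b ON a.cli = b.cli
UNION
SELECT a.cli, a.reference, a.payload, b.payload
FROM a JOIN b ON a.reference = b.reference
UNION
-- a rows with no partner on either rule
SELECT a.cli, a.reference, a.payload, NULL
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.cli = a.cli)
  AND NOT EXISTS (SELECT 1 FROM b WHERE b.reference = a.reference)
UNION
-- b rows with no partner on either rule
SELECT b.cli, b.reference, NULL, b.payload
FROM b
WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.cli = b.cli)
  AND NOT EXISTS (SELECT 1 FROM a WHERE a.reference = b.reference)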
I have tried to process > 3 million Fuzzy grouping records on two different servers with no success. 3 mill works but anything above 4 mill doesn't. Some background:
We are trying to de-dup our customer table on: name (.5 min), address1 (.5 min), city (.5 min), state (exact), with a .8 overall record min score. Output includes additional fields: customerid, sourceid, address2, country, phonenumber.
Without SP1 installed I couldn't even get a few hundred thousand records to process.
Two different servers - same problems. Note that SSIS and SQL Server are running locally on both.
The higher end server has 4GB RAM, the other 2.5GB RAM. Plenty of free disk space on both.
SQL Server is configured to use 2GB of RAM max. The page file is currently at 15GB.
After running a number of test on both servers trying different batch sizes etc. the one thing I noticed is that it seems to always error out when SSIS takes over and starts chewing up all the available RAM. This happens after the index is created and SSIS starts "warming caches". On both servers SQL Server uses up about 1.6GB of RAM at this point while SSIS keeps taking over RAM until all physical RAM is used up.
Some questions:
Has anyone been able to process more than 3 million records and, if so, what is your hardware configuration?
Should we try running SSIS from a different server so it has access to the full amount of physical RAM (so it doesn't have to fight for RAM with SQL Server)?
Should we install Win 2003 Enterprise Server so we can add more RAM?
Any ideas why switching to the page file might be causing errors?
CREATE TABLE [dbo].[DR_Test](
[source_item_id] [int] NOT NULL,
[source_line_no] [int] NULL,
[buyer_id] [int] NOT NULL,
[seller_member_id] [int] NULL,
[code]...
The table contains more than 80 million records, so when I fetch the data using buyer_id & timezone it takes more than an hour or so, and buyer_id is not unique. How can I fetch the data faster, or do I need to change the structure of the table?
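If the queries filter on buyer_id plus a time zone column, one hedged option is a nonclustered index keyed on those columns, with the selected columns INCLUDEd so the query does not have to go back to the clustered index for 80 million rows. The timezone column name is a guess from the description.

CREATE NONCLUSTERED INDEX IX_DR_Test_buyer_tz
ON dbo.DR_Test (buyer_id, timezone)               -- timezone assumed from the description
INCLUDE (source_item_id, source_line_no)          -- add whatever the SELECT list actually needs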
I have 1+ CSV files (using a foreach loop) which I'm doing a lot of transform work on and then inserting into a SQL database table. Each CSV file usually contains about 2 days worth of data (contains date stamps) - somewhere in the region of 60k records per day. The destination table currently contains 3 million+ rows and will get bigger. I need to make sure that before inserting into the destination table, the data doesn't already exist.
I've read the following article: http://www.sqlis.com/311.aspx While the lookup method works, it takes ages and eats up memory as it caches the 3m+ records before running for each CSV. Obviously this will only get worse as the table grows in size.
To make things a little more efficient what I'd like to do, is first derive the dates I'm dealing with in the current file - essentially storing the max(date) and min(date) in variables. Then in the lookup SQL use those vars, to reduce the amount of data that needs to be brought into the transformation to check against before inserting into the destination table. Lookup SQL eg. SELECT * FROM MyTable WHERE Date BETWEEN varMinDate AND varMaxDate.
Ideally I'd use an aggregate transformation and then use the subsequent output from that either in the lookup query or store the output in vars, but I don't think you can do that and I get the feeling I'm approaching this with the wrong mindset.
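If the lookup keeps hurting, one hedged alternative is to land each CSV in a staging table from the data flow and let the database do the existence check once per file with a set-based insert; the table and column names other than MyTable and Date are placeholders.

INSERT INTO dbo.MyTable (BusinessKey, [Date], Value)
SELECT s.BusinessKey, s.[Date], s.Value
FROM dbo.Staging AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.MyTable AS t
                  WHERE t.BusinessKey = s.BusinessKey
                    AND t.[Date] = s.[Date])

TRUNCATE TABLE dbo.Staging   -- ready for the next file in the foreach loop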
I have a table that I need to do some computations on, over all the data, but first I need to remove the duplicate records and insert the results into a destination table. Here's the example below. My table has 3.1 million rows. I have tried using DISTINCT and GROUP BY, but both ways of selecting the data take about half a minute to run. I'm wondering if there is a way to increase performance. Users are ok with this time since the process runs overnight, but improving it won't hurt. I do have a clustered index on these fields but that doesn't seem to improve anything.
I have a pretty simple SSIS package that fast loads a 100 million record table into a SQL Server 2008 table on a daily basis. This normally runs fine and completes in about 1 hour. As this is perhaps one of our largest running SSIS packages, about once every 2-3 weeks this SSIS will fail/drop connection. Once it fails, the large number of records will start rolling back. This rollback process can take 1+ hours so I cannot even restart the failed SSIS package immediately. This is a problem.
I am looking for a solution or option so I do not have to wait on that rollback to restart this particular, long running SSIS package. Is there an option/setting to leave the partial data set committed and not rollback? Then I could just restart the SSIS package immediately or set it the SSIS to auto-restart 1 time on failure. The first step in the SSIS does a truncate of the destination table.
I have a situation where deleting old records is blocking updates to the latest records on a highly transactional table, and we are getting timeout errors from the application.
In detail, I have one table called Tran_table1 in an OLTP database. This Tran_table1 is a highly transactional table; it receives data for insert/update continuously.
While archiving 2-year-old records from Tran_table1 into Tran_table1_archive in batches (using the DELETE OUTPUT INTO clause), if there are any UPDATEs on Tran_table1, these updates get blocked and the result is timeout errors in the application.
Are there any SQL Server hints to avoid the blocking?
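Hints alone rarely remove writer-versus-writer blocking, but one hedged pattern is to collect the keys to archive first and then delete in small keyed batches with a short pause between them, so each delete is a brief seek-based transaction and lock escalation to a table lock is unlikely. Id and CreatedDate below are placeholders for Tran_table1's key and date columns, and Tran_table1_archive is assumed to mirror Tran_table1's columns exactly.

SELECT Id
INTO #ToArchive
FROM dbo.Tran_table1
WHERE CreatedDate < DATEADD(YEAR, -2, GETDATE())

DECLARE @rows int
SET @rows = 1

WHILE @rows > 0
BEGIN
    DELETE TOP (1000) t
    OUTPUT deleted.* INTO dbo.Tran_table1_archive
    FROM dbo.Tran_table1 AS t
    JOIN #ToArchive AS k ON k.Id = t.Id

    SET @rows = @@ROWCOUNT

    WAITFOR DELAY '00:00:00.500'   -- yield so the continuous INSERT/UPDATE traffic is not starved
END

DROP TABLE #ToArchive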
I have a total of 5 records to update, one being a summary row and the other four being record ids 1, 2, 3, 4. Is there any way to update these 4 records without using 4 separate updates???
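If the new values differ per row, a single UPDATE with a CASE expression is one way to do it in one statement; the table, record_id column and literal values below are stand-ins for the real ones.

UPDATE dbo.MyTable
SET amount = CASE record_id
                 WHEN 1 THEN 10
                 WHEN 2 THEN 20
                 WHEN 3 THEN 30
                 WHEN 4 THEN 40
             END
WHERE record_id IN (1, 2, 3, 4)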
First off, Ive been asking a lot of questions lately and I want to sincerely thank those who have been posting VERY helpful replies. I have learned much in these past few weeks.
That being said I have hopefully one of my last questions before this project is complete.
I'm running a DTS that imports data from a flat text file and updates 6 different tables. In one of my tables, "PERSON", I only want the records to be inserted or updated if the FCN field is new, or unique.
Here is my code that updates the table (without regard for uniqueness):
INSERT INTO PERSON (FIRST_NAME, LAST_NAME, MIDDLE_NAME, FCN) SELECT "FIRST", "LAST", "MIDDLE", FCN FROM DATAGRAB WHERE RECORD_DATE = convert(varchar(8), getdate()-1, 112)
How can I convert it to only update records with a unique FCN value?
I've posted on here many times before and got the answers, so hopefully this time will be like the rest... you guys are SO good at SQL.
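On the FCN question above: one hedged way to make the insert skip FCNs that already exist in PERSON is a NOT EXISTS check against the target, for example:

INSERT INTO PERSON (FIRST_NAME, LAST_NAME, MIDDLE_NAME, FCN)
SELECT d."FIRST", d."LAST", d."MIDDLE", d.FCN
FROM DATAGRAB AS d
WHERE d.RECORD_DATE = convert(varchar(8), getdate()-1, 112)
  AND NOT EXISTS (SELECT 1 FROM PERSON AS p WHERE p.FCN = d.FCN)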
I have a database that has lots of fields. I use some specialised email marketing software that allows me to contact each record. What I have done in the past is export the contacts I wanted to email and then format the columns like this (all done in Excel):
EmailAddress Website Salutation Website1 Website2 Website3
info@eg.co.uk www.eg.co.uk David eg.co.uk Eg.Co.Uk Eg
test@red.co.uk www.red.co.uk Sue red.co.uk Red.Co.Uk Sue
After each list looks like this, I then import this data into Access and from there our email software picks up the data and emails accordingly.
Now here comes the problem...
We have decided that we now want everything done within our sql database. Which means we want clickthroughs recording and open rates etc all appearing in our database. (we have all this sorted out)
However, we don't have Website1, 2 or 3 fields in our database. We know that they need to be included in our db somewhere, but with half a million contacts we obviously don't want to do it one by one.
We want a simple process where we export all our contacts and then format them so they include Website1, 2 and 3. THEN we want to import them, simply updating the contacts with that new data. (Or is there a SQL way of creating temp tables so these fields don't even need to be included in the db?)
I hope I managed to explain myself in a way you can all understand.
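A hedged sketch of the import-and-update step: load the prepared spreadsheet into a staging table (the import wizard or BULK INSERT will do), then update the contacts in one statement keyed on the email address. The contact and staging table names and the key column are illustrative.

UPDATE c
SET c.Website1 = s.Website1,
    c.Website2 = s.Website2,
    c.Website3 = s.Website3
FROM dbo.Contacts AS c
JOIN dbo.ContactStaging AS s ON s.EmailAddress = c.EmailAddress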
I have an Excel file of 1362 rows and 30 columns. I need to load it into my database to update addresses and phone numbers. Each record has its own ID number, which is defined in the Excel file and in the database. The database is Millenium Version 7.2.1.1. I can't really find any significance between any of the records. Is there a way I can just load them into my database and update each record? Maybe by using the ID number? Dane Winkler, Database Specialist, St Vincent College
I'm having a bit of a problem putting together a query that will update records in one table from another table. I've got 2 tables; let's call them tblA and tblB. In tblA there is EID (int), QID (int), OID (int) and in tblB OID (int) and QID (int). Also there is an input parameter @EID. What I want to have happen is when someone inputs @EID then tblB gets updated from tblA. To give you a heads up, there are no PKs or FKs in either of the tables. If there is an OID in tblA, it takes the QID from tblA and places it in tblB where the OID from tblA and tblB match. Hopefully this makes sense. I thought that I could do something like this:

CREATE PROCEDURE test3(@EID int)
AS
UPDATE tblUsed
SET QuestionID = (select QuestionID from tblExamQuestions where ExamID = @EID)
WHERE OrderID = (select OrderID from tblExamQuestions where ExamID = @EID)
GO

But I get an error that says that there are too many records being returned by the subqueries. Winston
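For what it's worth, the error comes from each scalar subquery returning more than one row. A hedged rewrite with a join updates every tblUsed row from its matching tblExamQuestions row instead:

CREATE PROCEDURE test3 (@EID int)
AS
UPDATE u
SET u.QuestionID = q.QuestionID
FROM tblUsed AS u
JOIN tblExamQuestions AS q ON q.OrderID = u.OrderID
WHERE q.ExamID = @EID
GO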
I have a table with 5 columns (col1, col2, col3, col4). What I want to do is: 1) check if any two records are duplicates (if the values in col1 of Record A are identical to Record B, the two records are considered duplicates);
2) if two records are duplicate, I want to mark Record B as "Dup" in col4 ;
3) move the data of Col2 of Record B to col3 of Record A.
I have tried to use CURSOR for the job. I would appreciate if anyone can give me some hints for updating records using Cursor.
My script looks like this:

CREATE PROC Mark_duplicate
AS
DECLARE @var1_a, /* To hold the data from Record A */
@var2_a,
@var3_a,
@var4_a,
@var5_a,

@var1_b, /* To hold the data from Record B */
@var2_b,
@var3_b,
@var4_b,
@var5_b

/*** Create a CURSOR ***/
DECLARE Dup CURSOR FOR
SELECT col1, col2, col3, col4
FROM TableA
ORDER BY col1, col2
FOR UPDATE OF col1, col2, col3, col4

/**** OPEN the CURSOR ****/
OPEN Dup
FETCH NEXT FROM Dup INTO @var1_a, @var2_a, @var3_a, @var4_a
WHILE ( @@FETCH_STATUS = 0 )
BEGIN
FETCH NEXT FROM Dup INTO @var1_b, @var2_b, @var3_b, @var4_b
WHILE ( @@FETCH_STATUS = 0 )
BEGIN
IF ( @var1_a = @var1_b )
.
. Updating statements
.
ELSE
SELECT @var1_a = @var1_b, @var2_a = @var2_b, @var3_a = @var3_b, @var4_a = @var4_b
FETCH NEXT FROM Dup INTO @var1_b, @var2_b, @var3_b, @var4_b
END
END
CLOSE Dup
.
.
.
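If the server is SQL Server 2005 or later, a set-based version avoids the nested cursor entirely; a hedged sketch that assumes at most one duplicate per col1 value, as the description implies:

-- step 1: mark every row after the first in each col1 group as the duplicate (Record B)
WITH ranked AS (
    SELECT col1, col2, col3, col4,
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
    FROM TableA
)
UPDATE ranked
SET col4 = 'Dup'
WHERE rn > 1

-- step 2: copy col2 of each duplicate into col3 of its kept row (Record A)
UPDATE a
SET a.col3 = b.col2
FROM TableA AS a
JOIN TableA AS b ON b.col1 = a.col1 AND b.col4 = 'Dup'
WHERE a.col4 IS NULL OR a.col4 <> 'Dup'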
Here's a problem that I can't find anyone else has run into. I'm using Access and SQL Server, but the theory would be the same for any db. I have a large number of tables that contain linked records (intersection tables mostly). In the interest of space, I'll illustrate an example:

tblStudents (ID, Name)
tblTeachers (ID, Name)
tblClasses (ID, Name)
tblEnroll (StudentID, ClassID, TeacherID)

I have about 10 people who each use a separate copy of this database (in Access). I want them (at the end of each day) to be able to update all the records that they entered that day into a database that I have set up on a server (SQL Server 2005). Both databases have the exact same structure.

Caveat 1: All of the classes, students, and teachers are not the same on each database, but the server database should contain all of them.
Caveat 2: There is no way for the clients to automatically insert into the server; they are offsite and out of range.

Herein lies the problem: when a record is inserted into the server from a client, all of the links are lost since the ID will be different on the server than it was on the client. I don't think a simple update / insert query will work, and most db's and languages don't play nice with recordset appends. What are your thoughts??
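One hedged approach is to keep each client's original ID as an alternate key on the server and resolve the links through it. The sketch assumes the server tables gain SourceClient and ClientID columns, that each nightly upload lands in staging tables first, and that @SourceClient identifies which of the 10 copies is uploading; only tblStudents and tblEnroll are shown, the other tables follow the same pattern.

DECLARE @SourceClient int
SET @SourceClient = 1          -- one value per client copy

-- parents first: remember which client the row came from and its local ID
INSERT INTO dbo.tblStudents (Name, SourceClient, ClientID)
SELECT s.Name, @SourceClient, s.ID
FROM staging.tblStudents AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.tblStudents AS d
                  WHERE d.SourceClient = @SourceClient AND d.ClientID = s.ID)

-- then remap the intersection rows through the (SourceClient, ClientID) alternate key
INSERT INTO dbo.tblEnroll (StudentID, ClassID, TeacherID)
SELECT st.ID, c.ID, t.ID
FROM staging.tblEnroll AS e
JOIN dbo.tblStudents AS st ON st.SourceClient = @SourceClient AND st.ClientID = e.StudentID
JOIN dbo.tblClasses AS c ON c.SourceClient = @SourceClient AND c.ClientID = e.ClassID
JOIN dbo.tblTeachers AS t ON t.SourceClient = @SourceClient AND t.ClientID = e.TeacherID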
I have a really simple query I'm trying to execute. I want to replace all instances of int X (8) in Column A (LastActivityID) with int Y (27) but the following SQL returns the said error. I understand the error, but not sure how to script the SQL differently to get the intended result. Thanks in advance.
UPDATE Item SET LastActivityID = 27 WHERE LastActivityID = 8

Error: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <=, >, >= or when the subquery is used as an expression.
I have the following problem: I'm importing records from an Access table into a SQL Server table. I'm using a lookup to determine if a record already exists in the SQL Server table, and in that case I should update the record if it was modified.
I thought of using an OLE DB Destination for new records (done, works fine) and an OLE DB Command transformation to update the modified records. It all works, but the thing that bothers me is that my table has approx. 40 columns, so my OLE DB Command is very long (update table set col1 = ?, col2 = ?, col3 = ? ...). The other problem is that I'm always updating all the columns even if only one column was modified.
Is there a better way to update the existing records?
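One hedged alternative to the per-row OLE DB Command: land the Access rows in a staging table from the data flow, then run a single set-based update that only touches rows where something actually changed. The three columns shown stand in for the full list of ~40, and the table and key names are placeholders.

UPDATE t
SET t.col1 = s.col1,
    t.col2 = s.col2,
    t.col3 = s.col3
FROM dbo.TargetTable AS t
JOIN dbo.StagingTable AS s ON s.BusinessKey = t.BusinessKey
WHERE t.col1 <> s.col1
   OR t.col2 <> s.col2
   OR t.col3 <> s.col3      -- wrap in ISNULL/COALESCE if the columns are nullable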