How To Commit The Rest Of The Transactions Of An Update In Large Bulk?
Feb 8, 2004
I have to modify the structure of a table that already contains a lot of data. The log is filling up with uncommitted transactions: a lot of data is being updated in large bulks, not all of the transactions are committed, and the update task cannot complete.
However, there is no spare disk space left for the transaction to commit. Can anyone help?
Many times I write stored procedures with transaction blocks. I delete a row after BEGIN TRANSACTION, and when I then read from the table, the SELECT statement still returns the deleted row:
begin tran
delete mytable where id = @myid and seqid = 3
select sum(balance) from mytable where id = @myid
...
commit tran -- OR rollback tran
The sum(balance) result still includes the balance of row 3, the row I just deleted. I use SQL 7.0.
I have a series of questions about SSIS and transactions. The answers to these questions are probably so obvious that I can't see them, so please feel free to just point out what it is that I'm missing. My transaction-processing experience is very low-level, so I'm probably just not seeing how it's done at the high level of SSIS.
The first question is one that I may know the answer to, so please confirm:
Consider a package with TransactionOption set to Supported. It contains a single Execute SQL Task with TransactionOption set to Required. Is it true that if that Execute SQL Task succeeds, that the transaction commits, and that if the task fails, the transaction rolls back?
Consider another package with TransactionOption set to Supported. It contains a Sequence Container with TransactionOption set to Required. That container contains our same Execute SQL Task, but that is joined to a script task by a "success" precedence constraint. The script task simply returns Dts.Results.Failure. Is it the case that the transaction will roll back? That is, is it truly a simple failure result that would initiate the rollback?
If a DataFlow Task is the one that is set to Required, does that mean that every transactional operation within that task will commit in a single transaction? For instance, if I'm inserting five rows for each input record from a flat file, and if my flat file has 1000 records in it, will I see a single transaction with 5,000 rows? Thanks for your patience!
If I have a table with one or more nullable fields, and an INSERT or UPDATE leaves one or more of those fields NULL, either explicitly or implicitly, is there a way I can set them to non-NULL values without interfering with the INSERT or UPDATE as far as the other fields in the table are concerned?
EXAMPLE:
CREATE TABLE dbo.MYTABLE(
    ID NUMERIC(18,0) IDENTITY(1,1) NOT NULL,
    FirstName VARCHAR(50) NULL,
    LastName VARCHAR(50) NULL,
[Code] ....
If an INSERT looks like any of the following, what can I do to change the NULL being assigned to DateAdded to a real date, preferably the value of GETDATE() at the time of the insert? I've heard of INSTEAD OF triggers, but I'm not trying to override the entire INSERT or UPDATE, just the one (maybe two) fields that are being left null or explicitly set to NULL. The same would apply for any UPDATE where DateModified is not specified or is explicitly set to NULL: I want DateModified to be non-NULL after any UPDATE.
INSERT INTO dbo.MYTABLE( FirstName, LastName, DateAdded) VALUES('John','Smith',NULL)
INSERT INTO dbo.MYTABLE( FirstName, LastName) VALUES('John','Smith')
INSERT INTO dbo.MYTABLE( FirstName, LastName, DateAdded) SELECT FirstName, LastName, NULL FROM MYOTHERTABLE
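For what it's worth, here is a minimal sketch of one way to do this: a DEFAULT constraint (which only covers the case where the column is omitted) plus an AFTER trigger that backfills explicit NULLs. It uses the ID, DateAdded and DateModified columns mentioned above; adjust the names to the real table.
Code:
ALTER TABLE dbo.MYTABLE ADD CONSTRAINT DF_MYTABLE_DateAdded DEFAULT (GETDATE()) FOR DateAdded;
GO
CREATE TRIGGER trg_MYTABLE_Dates
ON dbo.MYTABLE
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- Backfill DateAdded when it was omitted or explicitly set to NULL,
    -- and always stamp DateModified (drop that line from the INSERT path
    -- if it should stay NULL until the first UPDATE).
    UPDATE t
       SET DateAdded    = COALESCE(t.DateAdded, GETDATE()),
           DateModified = GETDATE()
      FROM dbo.MYTABLE AS t
      JOIN inserted AS i ON i.ID = t.ID;
END;
This leaves the rest of the INSERT/UPDATE untouched, and as long as the database's RECURSIVE_TRIGGERS option is at its default of OFF, the trigger updating its own table does not re-fire itself.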
Hope this is the right forum. I'm using Transaction=Required on a page which inserts into multiple tables, two of which have a foreign key relationship. All works fine as long as I don't input erroneous data. However, I have a range check in the code, and if the range is exceeded, an exception is thrown and the transaction is failed using ContextUtil.SetAbort(). I then correct the data, try to save, and get a foreign key constraint error. I have debugged, and the primary key table seems to be carrying out the insert OK (I'm retrieving the key at that point, and can see it). But when I use the key in the child table it fails and cites the foreign key relationship. I suspect that having the same data for the primary key table the second time around means it doesn't think it has to commit? Grateful for any help. I'm using SQL Server 2005, by the way.
I have an update trigger. In this trigger I need to insert a few records into 3 tables. If an error occurs in any of these inserts, I still want the previous inserts to get committed. This trigger was originally written in Sybase, where it was possible to create and commit transactions inside the trigger.
I have a web form with a text field that needs to take in as much as the user decides to type and insert it into an nvarchar(max) field in the database behind it. I've tried using the new .WRITE() method in my UPDATE statement, but it cuts off the text after a while. Is there a way to insert/update this in SQL Server 2005 without resorting to BULK INSERT? That bloats the transaction log, and turning the logging off requires a call to sp_dboption (or a straight-up ALTER DATABASE), which I'd like to avoid if I can.
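In case it helps, here is a sketch of the chunked-append pattern with .WRITE, against a hypothetical dbo.Docs(DocID int, Body nvarchar(max)) table; the chunks would come from the page in a loop. One common cause of the truncation is the ADO.NET parameter being sized at 4000 instead of -1 (max), so that is worth checking first.
Code:
-- .WRITE with a NULL offset appends @chunk to the end of the existing value.
-- Note .WRITE fails if the column is currently NULL, so seed it with N'' first.
DECLARE @docId int, @chunk nvarchar(max);
SET @docId = 1;
SET @chunk = N'...next piece of the user''s text...';

UPDATE dbo.Docs
   SET Body.WRITE(@chunk, NULL, NULL)
 WHERE DocID = @docId;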
If application code controls all transaction processing to SQL Server, so it starts a transaction and does any commit or rollback on the application side, how does that actually work on SQL Server 2005? Meaning, if the app passes in an isolation level of Repeatable Read, and the database default is different, how can I see what is being used, as a DBA? Can I see any of that via Profiler? Can I see when those commits/rollbacks are issued from the application? They are not sending in "SQL" COMMIT/ROLLBACK TRANSACTION commands; it's built into their architecture to control all that. How can I see what's happening on the database if these are not SQL commands for transaction handling? And how does that work, to start a transaction on the app side and hold locks etc. on SQL Server if normal SQL Server commands are not being sent? If anyone can point me at decent references to read on that, thanks! Bruce
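As a rough answer to the "how can I see it" part: in SQL Server 2005 the session-level isolation and any open transactions are visible through the DMVs, whatever API started them. A sketch (needs VIEW SERVER STATE permission):
Code:
SELECT s.session_id, s.login_name, s.host_name, s.program_name,
       CASE s.transaction_isolation_level
            WHEN 0 THEN 'Unspecified'   WHEN 1 THEN 'ReadUncommitted'
            WHEN 2 THEN 'ReadCommitted' WHEN 3 THEN 'RepeatableRead'
            WHEN 4 THEN 'Serializable'  WHEN 5 THEN 'Snapshot'
       END AS isolation_level,
       t.transaction_id, a.transaction_begin_time
FROM sys.dm_exec_sessions AS s
LEFT JOIN sys.dm_tran_session_transactions AS t ON t.session_id = s.session_id
LEFT JOIN sys.dm_tran_active_transactions  AS a ON a.transaction_id = t.transaction_id
WHERE s.is_user_process = 1;
In Profiler, the "TM: Begin Tran", "TM: Commit Tran" and "TM: Rollback Tran" event classes (Transactions category) should show the API-level begin/commit/rollback calls even when no T-SQL text is sent.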
I am trying to run a query that deletes duplicate records in a table with 24 million rows. The problem is that each time I run it the log file fills up and I get an error saying the log is full, so the query never finishes.
Is there any way to turn off logging when running a query?
I think it also has to do with the disk drive running out of space, as the log file grows to over 12 GB.
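Logging can't really be turned off for a DELETE, but deleting in small batches keeps the log from ballooning, since each batch commits and the space can be reused (assuming SIMPLE recovery, or frequent log backups in FULL). A sketch on SQL Server 2005+, against a hypothetical dbo.BigTable where duplicates share col1/col2 and the lowest id per group should be kept:
Code:
SET NOCOUNT ON;
DECLARE @batch int;
SET @batch = 1;
WHILE @batch > 0
BEGIN
    -- delete rows that have a lower-id twin with the same col1/col2
    DELETE TOP (10000) t
      FROM dbo.BigTable AS t
     WHERE EXISTS (SELECT 1
                     FROM dbo.BigTable AS keep
                    WHERE keep.col1 = t.col1
                      AND keep.col2 = t.col2
                      AND keep.id   < t.id);
    SET @batch = @@ROWCOUNT;
    CHECKPOINT;   -- lets a SIMPLE-recovery log truncate between batches
END;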
A server I'm working on has a very unique situation, where user tables and production tables reside in the same database. Users update, create, and populate these tables, so it can't be a table-specific trigger. However, they give a new meaning to "kamikaze pilots", as it's not uncommon for them to "accidentally" update, insert, or delete 500,000,000+ records in a single statement. I've tried educating them to use batching, but to no avail, so now I'm forced to stop these statements BEFORE they execute, based on rowcount, as they fill up the database log so quickly that it goes into recovery mode (it has a 200GB log file - insane, I know).
I reckon the most rows allowed in a single statement should be 1,000,000. I'm looking for a database trigger to stop users from executing statements that touch more rows than that.
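Two caveats worth stating up front: an AFTER trigger can only roll back after the statement has already run (and been logged), so it caps the damage rather than preventing the log growth, and a DML trigger is table-scoped, so something like the sketch below would have to be created on every table (for example generated from a DDL trigger on CREATE_TABLE events). With those caveats, a sketch with placeholder names:
Code:
CREATE TRIGGER trg_SomeTable_RowLimit
ON dbo.SomeTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    -- inserted covers INSERT/UPDATE row counts, deleted covers DELETE/UPDATE
    IF (SELECT COUNT(*) FROM inserted) > 1000000
       OR (SELECT COUNT(*) FROM deleted) > 1000000
    BEGIN
        RAISERROR('Statement affects more than 1,000,000 rows; batch it instead.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;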
When I need to perform an update against a multi-million row table I typically use SET ROWCOUNT to keep the batches small and reduce locks, e.g.:

set rowcount 1000
while exists (select * from myTable where col2 is null)
    update myTable
    set col2 = col1 + 'blahblah'
    where col2 is null

However, my boss' script does something like this. I think it works OK, but it seems overly complicated to me. Any thoughts?

while exists (....)
begin tran
    insert into #table
    select ...

    update myTable
    set ...
    from myTable join #table ...

    -- @numberOfRows is a counter variable, tracking #rows that have been updated since the last batch
    if @numberOfRows > 1000
    begin
        commit
        begin tran
    end
end
I have to update a field in a table of 60 records or so. Each record gets a different value, and the field is varchar. I was given an Excel file with the values and was thinking of a bulk update along the lines of BULK INSERT, but I don't recall that being possible.
Is the only way to create a table, BULK INSERT into it, then merge the two tables together with an UPDATE?
Just wanted to see if there was an easier way to do it; otherwise I'll take the latter route. Thanks!
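For reference, that latter route is only a few statements; a minimal sketch with assumed table and column names (the spreadsheet could be loaded with the import wizard, BULK INSERT, or OPENROWSET):
Code:
CREATE TABLE dbo.StagingValues
(
    KeyCol   varchar(50)  NOT NULL,
    NewValue varchar(100) NOT NULL
);

-- ... load dbo.StagingValues from the Excel file here ...

UPDATE t
   SET t.TargetCol = s.NewValue
  FROM dbo.TargetTable   AS t
  JOIN dbo.StagingValues AS s ON s.KeyCol = t.KeyCol;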
We have a filtered publisher (SQL Server 2012) with several pull and push subscriptions that have been functioning fine for years.
We have just added a new pull subscriber on a remote server. We configured an alternate folder location on the subscriber (pointing to an existing path on the remote server). We generated a new snapshot and a partitioned snapshot, and moved them to the subscribing server at the alternate path. We're seeing a problem we've never seen before:
When the merge agent runs, it applies the schema and performs a few hundred bulk inserts, but then proceeds to 'download changes to subscriber' as though the snapshot is old (it is brand new and should have all transactions). We have never seen this before, i.e. all data is usually applied in bulk when creating a new or reinitializing an existing subscriber.
We marked the subscriber for re-initialization (from both the publishing and subscribing servers) with a fresh set of snapshots, and the same problem occurred. We have also tried this in our test environment (which mirrors production), and there the snapshot applies as expected (via bulk inserts).
(It might be relevant to note the publisher was recently upgraded in-place from 2005 to 2012.)
The following code will not update and commit the update to a SQL database table. Now my WHERE statement is looking for a Date field. Could this be the problem?

Dim DBConn As SqlConnection
Dim DBAdd As New SqlCommand
Dim strConnect As String = ConfigurationManager.ConnectionStrings("ProtoCostConnectionString").ConnectionString
DBConn = New SqlConnection(strConnect)
SQL Server 2000, C#, ASP.NET and ADO.NET. I have searched for 3 days trying to figure out why my ADO.NET transaction executes my stored procedures and runs the .Commit() on the server (I figured this out by using SQL Profiler), but the data doesn't show up in the database tables and I receive no error from SQL or ASP.NET. What gives? Has anyone else ever heard of such a thing? Please help, before I lose my sanity.
I have this sql stored procedure in SQL Server 2012:
ALTER PROCEDURE [dbo].[CreateBatchAndSaveExternalCodes]
    @newBatches as dbo.CreateBatchList READONLY
  , @productId int
  , @cLevelRatio int
  , @nLevelRatio int
AS
set nocount on;
Where can I look to find the user who executed this statement?
It was on 6.02.2007 or 7.02.2007. All users connect using SQL Server Authentication. It is a test database: between 1 and 3 users connect to it, and they submit no more than 10 SQL statements per day.
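If the question is who ran the ALTER PROCEDURE, the default trace (on by default since 2005) keeps object created/altered/deleted events for a while and can be read as sketched below. Who merely *executed* the procedure is not captured there; that would need a server-side trace or audit set up in advance.
Code:
DECLARE @trc nvarchar(260);
SELECT @trc = path FROM sys.traces WHERE is_default = 1;

SELECT te.name AS event_name, t.StartTime, t.LoginName, t.HostName,
       t.ApplicationName, t.DatabaseName, t.ObjectName
  FROM sys.fn_trace_gettable(@trc, DEFAULT) AS t
  JOIN sys.trace_events AS te ON te.trace_event_id = t.EventClass
 WHERE t.ObjectName = N'CreateBatchAndSaveExternalCodes'
 ORDER BY t.StartTime DESC;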
Hello All, I have a large table (8 Million Records) that I'm attempting to update with data from another table. Performance is slow. I don't need to log the updates since the table is only in a development environment. Here's an example of the syntax I'm using:
update a set a.col1 = b.col1 from b where a.col2=b.col2 and a.col3=b.col3
Is there a way that I can use the "NO_LOG" clause when I execute this?
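As far as I know there is no NO_LOG option for an UPDATE; the change is always logged. The usual workaround in a development environment is the SIMPLE recovery model plus small batches, so the log space gets reused instead of growing. A sketch on 2005+ (it assumes that a.col1 being NULL or different is a reliable way to spot rows that still need the copy, and that b.col1 itself is never NULL, otherwise the loop would not terminate):
Code:
WHILE 1 = 1
BEGIN
    UPDATE TOP (50000) a
       SET a.col1 = b.col1
      FROM a
      JOIN b ON a.col2 = b.col2
            AND a.col3 = b.col3
     WHERE a.col1 IS NULL OR a.col1 <> b.col1;   -- already-copied rows drop out

    IF @@ROWCOUNT = 0 BREAK;
    CHECKPOINT;   -- lets the log truncate between batches under SIMPLE recovery
END;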
We have separate packages loading data into 4 tables, and these 4 packages all start at the same time. Before loading the data we have to set a flag to 1 and then load, so each package runs an UPDATE statement to set that value to 1.
Now we are getting deadlocks because the packages hit the same table at the same time, even though they are updating different records.
The application will add items into a "bag"; that is, the items in one table will refer to a record in another table. This will be done gradually, with second or minute delays between adding new items, and there can be up to a thousand items per bag. One option is to wait until a full bag accumulates and set up all the references at once by using
UPDATE items SET container_ref = bag WHERE id IN [...]
The disadvantage of the all-at-once approach, as I see it, is the inability to encapsulate the functionality in a stored procedure -- the problem is passing a set of IDs. The advantage should be efficiency in terms of total SQL Server load. How much would it be?
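On the "passing a set of IDs" point, one workable sketch on 2005 is a delimited string parameter split inside the procedure (on 2008+ a table-valued parameter would be cleaner). Procedure and parameter names are made up; the items/container_ref names come from the statement above; the list is assumed to hold distinct integer IDs.
Code:
CREATE PROCEDURE dbo.AssignItemsToBag
    @bagId  int,
    @idList varchar(max)          -- e.g. '1,2,3,4'
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @ids TABLE (id int);
    DECLARE @pos int, @next int, @token varchar(20);

    -- simple comma split
    SET @idList = @idList + ',';
    SET @pos = 1;
    SET @next = CHARINDEX(',', @idList, @pos);
    WHILE @next > 0
    BEGIN
        SET @token = LTRIM(RTRIM(SUBSTRING(@idList, @pos, @next - @pos)));
        IF @token <> '' INSERT INTO @ids (id) VALUES (CAST(@token AS int));
        SET @pos = @next + 1;
        SET @next = CHARINDEX(',', @idList, @pos);
    END;

    UPDATE i
       SET i.container_ref = @bagId
      FROM items AS i
      JOIN @ids  AS x ON x.id = i.id;
END;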
I have an unusual question... I have a table that I have to select the top 300 records from, check the date on each record, and using an if... then... statement, loop through each record updating a field based on criteria.
The problem is that it takes so long to do this. Is there a way to populate an object and pass that object with the data to a SQL stored procedure to have the SQL server do all the updating as opposed to the application doing the record updating? Does that make sense? Here's a sample of the code and you'll see what I'm talking about.
Thanks in advance for any help or advice you can give.
Imports Microsoft.SqlServer.Server
Imports System.Data.SqlClient
Imports CallTracker_AgingCheck
Module CallTracker
Sub Main()
GetWebAgingData()
GetWebCallsData()
End Sub
#Region "Get Data"
Public agingStatusNumber As New ArrayList
Public agingEscalationNumber As New ArrayList
Public callStatusNumber As New ArrayList
Public CallStatus As New ArrayList
Public Sub GetWebCallsData()
Dim dt As New DataTable
Dim Criteria As String = "SELECT TOP 300 CallID, TIMEOFCALL, Status, StatusNumber " & _
"FROM WebCalls " & _
"WHERE (Status = 'OPEN') OR " & _
"(Status = 'IN PROCESS') OR " & _
"(Status = 'WORKORDER') OR " & _
"(Status = 'PRIORITY') OR " & _
"(Status = 'PENDING') " '& _
'"ORDER BY TimeOfCall DESC"
Dim Fa As String = String.Empty
Dim fromDATE As String = String.Empty
Dim toDate As String = String.Empty
Dim ds As New DataSet()
Dim tbl As String = "WebCalls"
Dim x1 As Integer = 0
Try
Using cn As New SqlClient.SqlConnection(Database.SQLConnection)
cn.Open()
Using cm As SqlClient.SqlCommand = cn.CreateCommand()
cm.CommandText = Criteria
cm.CommandType = CommandType.Text
cm.Parameters.AddWithValue("@CALLID", Fa)
cm.Parameters.AddWithValue("@TIMEOFCALL ", Fa)
cm.Parameters.AddWithValue("@STATUS", Fa)
cm.Parameters.AddWithValue("@STATUSNUMBER", Fa)
Dim Age As TimeSpan = Nothing
Dim _totalHours As Integer = Nothing
Dim EscalateTime As Integer = agingEscalationNumber.Count
Dim DAUpdateCmd As SqlCommand
Using da As New SqlDataAdapter(Criteria, cn)
da.Fill(dt)
Dim callID As Integer = Nothing
Dim statNO As Integer = Nothing
Dim toc As Integer = dt.Rows.Count
Dim x As Integer = 0
For x = 0 To toc - 1
Dim y As Date = dt.Rows(x).Item(1) 'gets the date from the first row in WebCalls for comparison with WebAging
Age = Today.Subtract(y)
_totalHours = Age.TotalHours
If _totalHours > 744 Then
Console.WriteLine("This record is older than 30 days")
Else
callID = dt.Rows(x).Item(0)
DAUpdateCmd = New SqlCommand("Update WebCalls SET STATUSNUMBER = @STATUSNUMBER where CALLID = @CALLID", da.SelectCommand.Connection)
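To the question of whether the server can do this instead of the application looping row by row: the same logic can usually be expressed as one set-based UPDATE in a stored procedure. The sketch below is only an approximation of the loop above, and it assumes things the posted code does not show: that the new StatusNumber comes from a WebAging lookup joined on Status, that WebAging has an EscalationNumber column, and that rows older than 744 hours are simply skipped, as in the loop.
Code:
UPDATE c
   SET c.StatusNumber = a.EscalationNumber           -- assumed column in WebAging
  FROM dbo.WebCalls AS c
  JOIN dbo.WebAging AS a ON a.Status = c.Status      -- assumed join key
 WHERE c.Status IN ('OPEN', 'IN PROCESS', 'WORKORDER', 'PRIORITY', 'PENDING')
   AND c.TimeOfCall >= DATEADD(hour, -744, GETDATE());   -- mirrors the 744-hour check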
I've got a table that I have to update in preparation for our environment move (2000 to 2005 SP2). The developers that designed the application created a table called schemas, which holds the contents of an XML file inside of an ntext field named Data. I need to parse through the field and do a find/replace to change all instances of www.site.com to www7.site.com; it's all over the place in the file. The problem is that the DATALENGTH() of each of the fields (there are 2 rows) is above 15,000. Normally, I'd run something like this:

update schemas set data = replace(cast(Data as varchar(max)), 'www.site.com', 'www7.site.com') where data like '%www.site.com%'

On smaller columns it works great, but it won't work on these because they're too big (the update chops anything beyond the varchar limit). I could do it manually, but this DB will be refreshed from production on a weekly basis and I'd like to script as many of the environment changes to the DB as possible. Any ideas?
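Once the database is on the 2005 side of the move, ntext can be cast to nvarchar(max) (up to 2 GB), so the REPLACE no longer truncates and the result can be assigned straight back to the ntext column. A sketch against the schemas table described above (it will not run on 2000, where UPDATETEXT would be needed instead):
Code:
UPDATE dbo.schemas
   SET Data = CAST(REPLACE(CAST(Data AS nvarchar(max)),
                           N'www.site.com', N'www7.site.com') AS ntext)
 WHERE Data LIKE N'%www.site.com%';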
I am currently wrapping up a website upgrade for a client and I am working on a development server/database, which will become the live version. When the upgrade goes live, I will need to update that database with the latest data from specific data tables (not all of them) in the previously live database, but I don't know how to do a bulk refresh of data tables. Problem: specific data tables (not all of them) from Database1 need to be updated with the data from Database2. Database1 and Database2 are copies of each other with vast differences in some of the data. Result: all of the current, up-to-date data needs to reside on Database1. Solution: any ideas? I am using MSSQL 2000 and the databases reside on the same server.
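Since both databases are on the same server, a per-table refresh can be done with three-part names. A rough sketch per table, with placeholder names; it assumes the table can simply be emptied and reloaded, and it glosses over foreign keys (TRUNCATE fails on a table referenced by an FK, so DELETE and load order matter there) and identity columns (which need SET IDENTITY_INSERT).
Code:
TRUNCATE TABLE Database1.dbo.SomeTable;

INSERT INTO Database1.dbo.SomeTable (Col1, Col2, Col3)
SELECT Col1, Col2, Col3
  FROM Database2.dbo.SomeTable;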
We are planning to add a new attribute to one of our tables to speed up data access. Once the attribute is added, we will need to populate it for each of the records in the table. Since the table in question is very large, the update statement is taking a considerable amount of time. From reading through old posts and Books Online, it looks like one of the big things slowing down the update is writing to the transaction log. I have found mention of "truncate log on checkpoint", of using SET ROWCOUNT to limit the number of rows updated at once, and of DUMP TRANSACTION databaseName WITH NO_LOG. Does anyone have any opinions on these tactics? Please let me know if you want more information about the situation in order to provide an answer!
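Of those three, the SET ROWCOUNT batching tends to be the practical one on 7.0/2000: the update is still fully logged, but in small pieces that can be truncated between batches, so the log never balloons. A sketch, where BigTable, the new column NewAttr, its source expression SomeSourceCol, and MyDatabase are all placeholders, and NewAttr is assumed to be NULL until populated:
Code:
SET ROWCOUNT 10000;
WHILE 1 = 1
BEGIN
    UPDATE dbo.BigTable
       SET NewAttr = SomeSourceCol
     WHERE NewAttr IS NULL;

    IF @@ROWCOUNT = 0 BREAK;

    BACKUP LOG MyDatabase WITH TRUNCATE_ONLY;  -- or rely on truncate-on-checkpoint / SIMPLE recovery
END;
SET ROWCOUNT 0;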
I have a table where, when certain columns are updated, a trigger needs to fire to update a next-schedule date in that same table for that record. I can write the trigger, but my question, for performance and efficiency, is which approach would be better: separate triggers for the 8 columns, or one larger trigger with an IF to check whether those columns were updated. Thanks
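The single-trigger approach is usually the simpler one to maintain: one object, and the work runs at most once per statement. A sketch with placeholder column names and a made-up next-schedule rule, just to show the shape:
Code:
CREATE TRIGGER trg_MyTable_NextSchedule
ON dbo.MyTable
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- UPDATE() is true when the column appears in the SET list of the statement,
    -- whether or not the value actually changed.
    IF UPDATE(Col1) OR UPDATE(Col2) OR UPDATE(Col3)   -- ...list all 8 watched columns
    BEGIN
        UPDATE t
           SET t.NextScheduleDate = DATEADD(day, t.IntervalDays, GETDATE())  -- placeholder rule
          FROM dbo.MyTable AS t
          JOIN inserted   AS i ON i.ID = t.ID;
    END
END;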
I'm updating the name data in a large user database with the following UPDATE statement. The staging table was bulk loaded from a flat file and contains 10 million records. The production table (Recipients) contains 15 million records. This worked correctly but this single update statement took an entire ten hours to run which is way too long. While it was running the server was clearly 100% disk bound. CPU activity was near nothing. We've just upgraded RAM from 1GB to 2GB but we expect data sizes to grow significantly and we can't keep adding RAM. Absolutely nothing else is running on this server. Any ideas how I can optimize this?
UPDATE Recipients
SET [First] = Stages.[First],
    [Last]  = Stages.[Last]
FROM Stages
INNER JOIN Recipients ON (Stages.UserName = Recipients.UserName AND Stages.DomainID = Recipients.DomainID)
WHERE (CASE WHEN Stages.[First] IS NULL THEN 1 ELSE 0 END
     + CASE WHEN Stages.[Last] IS NULL THEN 1 ELSE 0 END)
   <= (CASE WHEN Recipients.[First] IS NULL THEN 1 ELSE 0 END
     + CASE WHEN Recipients.[Last] IS NULL THEN 1 ELSE 0 END)
Text execution plan. I've made small annotations with the % information from the graphical execution plan:
|--Clustered Index Update(OBJECT:([Recipients].[dbo].[Recipients].[PK_Recipients]), SET:([Recipients].[First]=[Stages].[First], [Recipients].[Last]=[Stages].[Last]))
   |--Top(ROWCOUNT est 0)
      |--Sort(DISTINCT ORDER BY:([Bmk1000] ASC))
         14% |--Merge Join(Inner Join, MANY-TO-MANY MERGE:([Stages].[DomainID], [Stages].[UserName])=([Recipients].[DomainID], [Recipients].[UserName]), RESIDUAL:(([Recipients].[UserName]=[Stages].[UserName] AND [Recipients].[DomainID]=[Stages].[Domain
            25% |--Clustered Index Scan(OBJECT:([Recipients].[dbo].[Stages].[IX_Stages]), ORDERED FORWARD)
            61% |--Clustered Index Scan(OBJECT:([Recipients].[dbo].[Recipients].[PK_Recipients]), ORDERED FORWARD)
Everything I've heard on the subject suggests you change the index scans to index seeks. How do I do this?
Any other tuning advice is greatly appreciated.
Here are the exact statements I used to create the tables:
CREATE TABLE Recipients (
    ID INT IDENTITY (1, 1) NOT NULL,
    UserName VARCHAR (50) NOT NULL,
    DomainID INT NOT NULL,
    First VARCHAR (24) NULL,
    Last VARCHAR (24) NULL,
    StreetAddress VARCHAR (32) NULL,
    City VARCHAR (24) NULL,
    State VARCHAR (16) NULL,
    Postal VARCHAR (10) NULL,
    SourceID INT NULL,
CREATE TABLE Stages (
    ID INT NULL,
    UserName VARCHAR(50) NOT NULL,
    DomainID INT NULL,
    Domain VARCHAR(50) NOT NULL,
    First VARCHAR(24) NULL,
    Last VARCHAR(24) NULL,
    StreetAddress VARCHAR(32) NULL,
    City VARCHAR(24) NULL,
    State VARCHAR(24) NULL,
    Postal VARCHAR(10) NULL
)
CREATE CLUSTERED INDEX IX_Stages ON Stages (DomainID, UserName)
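Not a definitive fix, but one thing that often helps at this scale is breaking the statement into key-range batches, so each batch commits on its own, the sort stays small, and the log can be reused between passes, while the original predicate stays intact. A sketch slicing on Recipients.ID (it assumes ID is a reasonably dense IDENTITY; the batch size is arbitrary):
Code:
DECLARE @lo int, @hi int, @maxId int;
SELECT @maxId = MAX(ID) FROM Recipients;
SET @lo = 1;
WHILE @lo <= @maxId
BEGIN
    SET @hi = @lo + 499999;   -- ~500k rows per pass

    UPDATE Recipients
       SET [First] = Stages.[First], [Last] = Stages.[Last]
      FROM Stages
      INNER JOIN Recipients ON Stages.UserName = Recipients.UserName
                           AND Stages.DomainID = Recipients.DomainID
     WHERE Recipients.ID BETWEEN @lo AND @hi
       AND (CASE WHEN Stages.[First] IS NULL THEN 1 ELSE 0 END
          + CASE WHEN Stages.[Last]  IS NULL THEN 1 ELSE 0 END)
        <= (CASE WHEN Recipients.[First] IS NULL THEN 1 ELSE 0 END
          + CASE WHEN Recipients.[Last]  IS NULL THEN 1 ELSE 0 END);

    SET @lo = @hi + 1;
END;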
We have a simple UPDATE query, joining two tables, that takes much longer than 10 hours to run, but if we break the table in six (10 million rows in each table), it takes only fifteen minutes to run each part.
Why? And how can we tell in advance whether a query will cross the threshold into a long-running query? Or, how can we prevent it?
The system is Windows XP Pro with 4GB RAM (/3GB switch), and SQL Server Standard 2005. Log files, swap files, dbf files are on separate drives. The system is dedicated to SQL Server. No other queries are running at the same time. The database is in Simple logging mode. Each table is a few GB with 60 million rows.
An example problem query (updating fewer than 10 bytes per row) is:

UPDATE bigtable
SET bigtable.custage = scores.custage,
    bigtable.custscore = scores.custscore
FROM bigtable
JOIN scores ON bigtable.custid = scores.custid
In this case, each table has 60 million rows. 'custid' is a sequential, unique integer. SCORES table is clustered on 'custid' and is 1.5GB in size. BIGTABLE has an index on 'custid', and is 6GB in size. There is a one-to-one match between the tables on 'custid', but not enforced. The SCORES table was created by exporting a few fields (but all 60 million records) from BIGTABLE, updating the values in a separate program, then importing back in SQL Server into the SCORES table.
The first time this query was run, we stopped it after it ran 16 hours. When we broke up the bigtable into 10 million record chunks (big1, big2, big3..., big6) each update only took 15 minutes, for 90 minutes total.
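For what it's worth, the same chunking can be scripted against the single table instead of physically splitting it, for example by custid ranges. A sketch, assuming custid runs roughly from 1 to 60 million as described:
Code:
DECLARE @lo int, @hi int, @max int;
SELECT @max = MAX(custid) FROM bigtable;
SET @lo = 1;
WHILE @lo <= @max
BEGIN
    SET @hi = @lo + 9999999;          -- ~10 million ids per pass, as in the manual split

    UPDATE b
       SET b.custage   = s.custage,
           b.custscore = s.custscore
      FROM bigtable AS b
      JOIN scores   AS s ON s.custid = b.custid
     WHERE b.custid BETWEEN @lo AND @hi;

    SET @lo = @hi + 1;
END;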
* How can we tell in advance that the full run would take more than a few hours?
* Why is it taking SO MUCH LONGER than in smaller chunks?
* When a query is taking that long to run, is there any way to tell where in the plan it is?
* What should we do differently?
Thanks for any help; this is a real head scratcher for us.
INSERT INTO Users (userName, UserSalt, UserHash1, UserHash2, UT_memberID)
SELECT memberFirstName + '.' + memberLastName + '56' AS userName,
       '{AxxxxxDE-6xx6-4xxD-Bxx9-3xxxx79xxxxE}',
       '{4xxxxxx6-8xx5-6xxD-Cxx6-4xxxFxxx1xx9}',
       '{0xxx8xxE-Cxx4-6xx8-ExxB-Dxxxx4xxx2xC}',
       members.memberID
FROM members
INNER JOIN groupLeaders ON members.memberID = groupLeaders.memberID

SELECT @@Identity AS UserID
How can I modify the portion that is inserting the '56' at the end of each username to do the following:
1) check to see if the username already exists in the database (using a query with LIKE and a '%' wildcard)
2) if not, create the username "as-is" or how it should be without the number
3) if it already exists, get a count of records matching the search criteria, then make the new username as username + (count + 1).ToString();
Any thoughts... I am struggling to put these two pieces together.
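Here is a rough T-SQL sketch of that check-then-suffix logic for a single candidate name. It assumes no concurrent inserts race against the check, and note the LIKE match is loose (e.g. 'John.Smithson' would also count as a clash), which matches the LIKE-based check described above.
Code:
DECLARE @base varchar(120), @userName varchar(130), @clash int;
SET @base = 'John.Smith';   -- in practice memberFirstName + '.' + memberLastName

SELECT @clash = COUNT(*) FROM Users WHERE userName LIKE @base + '%';

IF @clash = 0
    SET @userName = @base;                                    -- no clash: use the name as-is
ELSE
    SET @userName = @base + CAST(@clash + 1 AS varchar(10));  -- e.g. John.Smith3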
Thanks,
Zoop
[EDIT - original post below this]
I have modified my method to make this a bit easier. I added a memberID field to my [Users] table so that I can update my [Members] table in a separate statement after the insert takes place.
I have the following query, and it completes successfully in Query Analyzer (though I haven't actually executed the SP, just tested the syntax...). Anyway, here is what I have:
Code:
INSERT INTO Users (userName, UserSalt, UserHash1, UserHash2, UT_memberID)
SELECT memberFirstName + '.' + memberLastName + '56' AS userName,
       '{AxxxxxDE-6xx6-4xxD-Bxx9-3xxxx79xxxxE}',
       '{4xxxxxx6-8xx5-6xxD-Cxx6-4xxxFxxx1xx9}',
       '{0xxx8xxE-Cxx4-6xx8-ExxB-Dxxxx4xxx2xC}',
       members.memberID
FROM members
INNER JOIN groupLeaders ON members.memberID = groupLeaders.memberID

SELECT @@Identity AS UserID
I am hoping this will create a user for all members whose 'memberID' can be found in the groupLeaders table... is this correct?
Also, notice the 56 being appended to the end of each username. I would like this to be a random number generated within a given range... can this be done? any advice?
Thanks,
Zoop
[Original post below - provide more background]
I have three tables involved with this insert/update:
I want to insert into the [Users] table the memberFirstName.memberLastName + randomNum into the 'UserName' column from the [Members] table. Also, I want to make all passwords the same, in this case I know the Salt, Hash1, Hash2 I will be using and would like to pass these in for the 'UserHash1' 'UserHash2' fields.
Now, I only want to make this insert where the memberID is in the GroupLeaders table. and Finally, I need to Update my Members table with a UserID where the memberID matches the one used from the groupLeaders table.
Does anyone have any ideas on how I can accomplish this, even if it requires adding a temporary field to one of my tables? Here is what I have so far, but I am receiving errors and can't quite figure this one out. (BTW, I also don't know how to generate the random number and was using the literal 23 as a placeholder.) Thanks...
Code:
INSERT INTO Users (userName, UserSalt, UserHash1, UserHash2)
SELECT a.memberFirstName + '.' + a.memberLastName + '23' AS userName,
       '{AA99FCDE-6E06-437D-B9E9-3E3D27955C3E}',
       '{7xxxxxx2-4xx6-9xx1-7xx9-4x3xx4Axxx59}',
       '{0xx8xxE-Cxx4-6xxx-xxxx-Fxx3xxxx3xxF}',
       b.memberID AS newMemID
FROM members a, groupLeaders b
WHERE a.memberID = b.memberID

SELECT @@Identity AS UserID
Update Members Set UserID = Ident_Current('Users') where memberID = newMemID
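On the random-number piece: RAND() is evaluated once per statement, so every row would get the same value, whereas ABS(CHECKSUM(NEWID())) varies per row. A sketch producing a two-digit suffix (the 10-99 range is arbitrary):
Code:
SELECT a.memberFirstName + '.' + a.memberLastName
       + CAST(10 + ABS(CHECKSUM(NEWID())) % 90 AS varchar(10)) AS userName
  FROM members AS a
  JOIN groupLeaders AS b ON a.memberID = b.memberID;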
On a daily basis I need to perform a bulk update. The table totals about 50,000 records, with approximately 5,000 changing (deletes, edits, and new records) per day. I'd like to push just the updates somehow, but VB is too slow and I haven't found a way to handle it in DTS. I don't have much experience with DTS.
I'm transferring between two SQL 2000 servers with a VB app sitting in the middle.
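One DTS-friendly pattern is to copy the whole source table into a staging table on the destination server (a straight DTS copy of 50,000 rows is quick), then apply the ~5,000 changes set-based with three statements. A sketch with placeholder names, assuming KeyCol identifies a row and the compared columns are not nullable (NULLs would need ISNULL handling in the <> tests):
Code:
-- 1) deletes: rows that no longer exist in the source
DELETE d
  FROM dbo.DestTable AS d
 WHERE NOT EXISTS (SELECT 1 FROM dbo.Staging AS s WHERE s.KeyCol = d.KeyCol);

-- 2) edits: rows whose values changed
UPDATE d
   SET d.Col1 = s.Col1, d.Col2 = s.Col2
  FROM dbo.DestTable AS d
  JOIN dbo.Staging   AS s ON s.KeyCol = d.KeyCol
 WHERE d.Col1 <> s.Col1 OR d.Col2 <> s.Col2;

-- 3) new records
INSERT INTO dbo.DestTable (KeyCol, Col1, Col2)
SELECT s.KeyCol, s.Col1, s.Col2
  FROM dbo.Staging AS s
 WHERE NOT EXISTS (SELECT 1 FROM dbo.DestTable AS d WHERE d.KeyCol = s.KeyCol);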