I come from a web-based world where loading 1.5 million records into a temp table is suicide. I'm doing more data warehouse work now, and while looking into optimizing a buddy's proc I noticed he was loading 1.5 million records into a temp table. We had a discussion about it because, coming from the web world, I was drastically against it. He, on the other hand, didn't feel it was an issue since the proc gets called once, maybe twice a day. tempdb is set to autogrow and it is on a different drive than all the other databases on the box. It has one ldf and one mdf. He's creating an index on the table after the load. Why shouldn't we be loading 1.5 million records into a temp table?
I have a requirement to delete 1 million records from a table containing 10 million rows, and the table is queried 24/7 (there is no downtime window). How can I achieve that?
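One commonly used pattern for this is to delete in small batches so each transaction stays short and locks are released between batches. A minimal sketch, with a hypothetical table name and filter and a batch size that is just a starting point to tune:

DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.BigTable      -- hypothetical table
    WHERE  PurgeFlag = 1;                    -- hypothetical filter identifying the 1 million rows
    SET @rows = @@ROWCOUNT;
    -- optionally WAITFOR DELAY '00:00:01' here to give concurrent readers some room
END

An index that covers the filter keeps each batch from scanning the whole table.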
Our system runs a SQL Server 2012 database; it has a table (table_a) which has over 10M records. Our system has to receive a data file from the previous system daily, containing approximately 3M updated or new records for table_a. My job is to update table_a with the new data.
The initial solution is:
1 Create a table (table_b) whose structure is the same as table_a
2 Use BCP to import updated records into table_b
3 Remove outdated data from table_a: DELETE a FROM table_a a INNER JOIN table_b b ON a.key_fields = b.key_fields
4 Append updated or new data into table_a: insert into table_a select * from table_b
In testing, this solution turned out to be very inefficient; step 3 alone costs several hours, for example. How can I improve it?
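A hedged sketch of one common improvement to step 3: index both sides on the key columns before the delete, and run the delete in batches so a single multi-hour transaction isn't holding locks and filling the log (key_fields stands in for the real key column list):

CREATE CLUSTERED INDEX IX_table_b_key ON table_b (key_fields);   -- index the staging key after the BCP load

DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (50000) a
    FROM   table_a AS a
    INNER JOIN table_b AS b ON a.key_fields = b.key_fields;
    SET @rows = @@ROWCOUNT;
END

table_a also needs an index on the same key columns for the join to be a seek rather than a scan.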
I am currently working on a simple page to insert 1.6 million UK postcode records into a SQL Server table. The table has three columns: the postcode, the longitude coordinate, and the latitude coordinate. The data is sourced from a pipe (|) delimited txt file and inserted into the database using a FOR loop. The problem I have is that the page hangs after inserting only 10,000 records; it displays either an invalid ViewState error or a "page cannot be found" error. I assume the ViewState error stems from the fact that there is a form on the page, which simply contains a button to execute the script and a few labels to show progress. But without the form and associated ViewState the insert still fails to complete... any ideas? Would I be better off running this on a thread, or should I just do it in stages and be patient? I have now modified the page to read the database on load and pick up from where it crashed.
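A set-based load is usually far faster than a row-by-row loop from a page and sidesteps the timeout entirely. A minimal sketch, assuming a hypothetical table name and file path for the pipe-delimited file:

BULK INSERT dbo.Postcodes                 -- hypothetical target table (postcode, longitude, latitude)
FROM 'C:\data\ukpostcodes.txt'            -- hypothetical path to the pipe-delimited file
WITH (
    FIELDTERMINATOR = '|',
    ROWTERMINATOR   = '\n',
    TABLOCK,
    BATCHSIZE = 100000
);

The same idea is available from .NET code via SqlBulkCopy if the file has to be pushed from the web server rather than read by the database server.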
Hello, what is the fastest way to update 20 million records in our database? I have tried a simple update statement like this:

UPDATE trail_log WITH (TABLOCKX, HOLDLOCK)
SET trail_log.entry_by = users.user_identity
FROM users
WHERE trail_log.entry_by = users.user_id
but it takes 10-plus hours to run since it cannot commit the transaction until the very end. So I was thinking that I need to commit in batches, say every 50K, but that is slow as well:

SET ROWCOUNT 50000
DECLARE @rc int
SET @rc = 50000
WHILE @rc = 50000
BEGIN
    BEGIN TRANSACTION
    UPDATE trail_log WITH (TABLOCKX, HOLDLOCK)
    SET trail_log.entry_by = users.user_identity
    FROM users
    WHERE trail_log.entry_by = users.user_id
      AND trail_log.entry_by NOT LIKE '%[0-9]%'
    SELECT @rc = @@ROWCOUNT
    --Commit the transaction
    COMMIT
END
GO

I have let the above run for 1.5 hours and it only updated 450,000 rows. Any ideas? Maybe I'm doing it wrong. Please help!!
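One pattern that often behaves better than SET ROWCOUNT is UPDATE TOP (n) in its own short transaction. This is only a sketch, and it assumes that once entry_by holds user_identity it no longer joins back to users.user_id (otherwise keep your NOT LIKE filter in the WHERE clause); an index on trail_log.entry_by helps each batch find its rows quickly:

DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;
    UPDATE TOP (50000) t
    SET    t.entry_by = u.user_identity
    FROM   trail_log AS t
    JOIN   users     AS u ON t.entry_by = u.user_id;   -- updated rows stop matching, so the loop advances
    SET @rows = @@ROWCOUNT;
    COMMIT;
END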
I want to update tableToUpdate in batches of 5000 per batch, setting lastenecryptionDT to NULL based on the join to tableValues using the ENCRYPTIONID column, and also output the updated rows into another table in case I need to do a rollback.
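A minimal sketch of that pattern, assuming a hypothetical audit table dbo.tableToUpdate_Rollback to hold the old values (the IS NOT NULL filter is what lets each batch pick up a fresh 5000 rows):

WHILE 1 = 1
BEGIN
    UPDATE TOP (5000) t
    SET    t.lastenecryptionDT = NULL
    OUTPUT deleted.ENCRYPTIONID,
           deleted.lastenecryptionDT                     -- the old value, kept in case of rollback
    INTO   dbo.tableToUpdate_Rollback (ENCRYPTIONID, lastenecryptionDT)
    FROM   dbo.tableToUpdate AS t
    JOIN   dbo.tableValues   AS v ON v.ENCRYPTIONID = t.ENCRYPTIONID
    WHERE  t.lastenecryptionDT IS NOT NULL;

    IF @@ROWCOUNT = 0 BREAK;
END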
I have a directory database with approx. 80 million records. I am feeding the database with BULK INSERT. Indexing one of the fields took about 8 hrs. After indexing, when I run queries against the indexed field the response time is under 1 sec. However, if I run SELECT queries with LIKE on non-indexed fields it takes more than 2 mins. So I decided to index 4 other fields in the database, and it looks like the indexing process is going to run for 2 days. I am a novice in SQL database design and I am not sure if this is the best way to index the table; I am just using CREATE INDEX. Any suggestions / advice welcome.
I run the following statement and it will not update beyond 7 million-plus rows, and I have about 38 million to complete. I keep checking updated row counts, and after half a day it's still the same, so I know something is wrong, because it was rolling through with no problem when I initiated it. I need to complete this ASAP, so it's adding to my frustration. The Acct_Num_CH field is an encrypted field (FYI).
SET ROWCOUNT 10000

UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL

WHILE @@ROWCOUNT > 0
BEGIN
    SET ROWCOUNT 10000
    UPDATE [dbo].[CC_Info_T]
    SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
    WHERE [Acct_Num_CH] IS NOT NULL
END

SET ROWCOUNT 0
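One thing worth noting about the loop above: the new value is itself NOT NULL, so the WHERE clause never shrinks and each batch can keep hitting rows that were already set, which means @@ROWCOUNT never drops to zero. A sketch of a variant whose predicate excludes rows that already hold the target value, so every batch is guaranteed to make progress:

DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) [dbo].[CC_Info_T]
    SET   [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
    WHERE [Acct_Num_CH] IS NOT NULL
      AND [Acct_Num_CH] <> 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v';   -- skip rows already converted
    SET @rows = @@ROWCOUNT;
END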
I am trying to update a large table consisting of 45 million records, and it is taking more than 2 days to complete the update. Below is my approach:
1. The table has only one clustered index and no other indexes.
2. I am updating in batches of, say, 20,000 records at a time.
3. I changed the recovery model to bulk-logged, auto-growth is set to 300MB, and there is enough space on my disk for the transaction log.
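Since the table has a clustered index, one pattern that keeps every batch cheap is to walk the clustered key in fixed ranges instead of letting each batch hunt for qualifying rows again. A sketch only, assuming a numeric clustered key; the table name, key, and update expression are all hypothetical stand-ins:

DECLARE @from bigint, @max bigint, @step bigint = 20000;
SELECT @from = MIN(Id) - 1, @max = MAX(Id) FROM dbo.BigTable;   -- Id: hypothetical clustered key

WHILE @from < @max
BEGIN
    UPDATE dbo.BigTable
    SET    TargetCol = UPPER(TargetCol)            -- stand-in for your real update expression
    WHERE  Id > @from AND Id <= @from + @step;     -- one clustered-key range per batch
    SET @from = @from + @step;
END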
I have a table on which I need to do some computations over all the data, but first I need to remove the duplicate records and insert the results into a destination table. Here's the example below. My table has 3.1 million rows. I have tried using DISTINCT and GROUP BY, but both ways of selecting the data take about half a minute to run. I'm wondering if there is a way to increase performance. Users are OK with this time since the process runs overnight, but improving it wouldn't hurt. I do have a clustered index on these fields, but that doesn't seem to help at all.
Is it possible to allow a user to insert and update data in a table but prevent them from performing deletes against that same table? For auditing purposes I need to prevent the end users from being able to delete data.
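Yes; the permission model is granular enough for this. A minimal sketch, assuming a hypothetical table dbo.Orders and a database role data_entry that the end users belong to (DENY wins over any GRANT the users might inherit elsewhere):

GRANT SELECT, INSERT, UPDATE ON dbo.Orders TO data_entry;   -- allow reads and writes
DENY  DELETE                 ON dbo.Orders TO data_entry;   -- explicitly block deletes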
I am doing performance testing of the In-Memory option in SQL Server 2014. As part of that, I want to insert 500 million rows into an in-memory-enabled test table I have created.
I need a sample script to insert 500 million records into a table ....
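A hedged sketch of one way to generate that volume of test data in chunks, assuming a hypothetical memory-optimized table dbo.InMemTest (Id uniqueidentifier primary key, Payload int); the GO 500 batch separator re-runs the 1-million-row insert 500 times and works from SSMS or sqlcmd:

INSERT INTO dbo.InMemTest (Id, Payload)
SELECT TOP (1000000)
       NEWID(),                                -- unique key per row, safe across batches
       ABS(CHECKSUM(NEWID())) % 1000           -- arbitrary test payload
FROM   sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;               -- cross join only to manufacture enough rows
GO 500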
I am currently working with C and SQL Server 2012. My requirement is to bulk fetch records and insert/update them in another table with some business logic. How do I do this?
So I know that each employee should have 2 Type 1's and 4 Type 2's. I hope that makes sense; I've had to change my data for this example because ours is very proprietary.
I need to identify employees who do not have all their stages and list the stages they are missing. The final report should only have employees and the associated missing types and stages.
I do a count by employee to see how many types they have to identify the ones that don't have all the types and stages.
My count would look something like this:
EmployeeNumber, Type, Total
100, 1, 2
100, 2, 2
200, 1, 1
200, 2, 2
So I know that employee 100 should have 2 more Type 2's, and employee 200 should have 1 more Type 1 and 2 more Type 2's, based on the required list.
The problem I'm having is taking that required list, joining it to my list of employees with missing data, and pulling from it the types and stages that are missing for each employee. I thought I could get a list of the employees that are missing information and right join it to the required list, where the missing records would be NULLs. But that doesn't work, because some employees do have the required information, so I'm not getting any NULLs returned.
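A sketch of the usual shape for this kind of gap report, with hypothetical table names: cross join every employee to the full required list, then left join the actual records and keep only the combinations that fall short of the required count:

SELECT e.EmployeeNumber,
       r.Type,
       r.Stage,
       r.RequiredCount - COUNT(a.EmployeeNumber) AS Missing
FROM   dbo.Employees AS e                             -- hypothetical list of employees
CROSS JOIN dbo.RequiredList AS r                      -- hypothetical: one row per required Type/Stage with RequiredCount
LEFT JOIN dbo.EmployeeStages AS a                     -- hypothetical actual records
       ON  a.EmployeeNumber = e.EmployeeNumber
       AND a.Type  = r.Type
       AND a.Stage = r.Stage
GROUP BY e.EmployeeNumber, r.Type, r.Stage, r.RequiredCount
HAVING COUNT(a.EmployeeNumber) < r.RequiredCount;

Because every employee is paired with every required row before the outer join, the shortfalls show up even for employees who have some of the required records.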
The following list of actions leads to a corrupt database on SQL Server 2005.
1. Create a database snapshot
2. Drop a table in the database
3. Back up the database
4. Restore from backup
5. Revert to the snapshot
I'm not entirely surprised that it results in corruption; what is surprising is that I can revert to a snapshot after restoring. There needs to be some kind of check to prevent reverting to a snapshot in a case like this. Until SQL Server prevents you from doing it, I'd recommend as a best practice deleting all snapshots before you restore a database, so that you cannot do this by accident.
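A small sketch of that practice, with hypothetical names (snapshots are dropped with DROP DATABASE):

-- find any snapshots of the database you are about to restore
SELECT name
FROM   sys.databases
WHERE  source_database_id = DB_ID('MyDatabase');

DROP DATABASE MyDatabase_Snapshot;   -- drop each snapshot found, then run the RESTORE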
I have a table that is being used to log track plays on our website.
Here's the table:
CREATE TABLE [dbo].[Music_BandTrackPlays](
    [ListenDate] [datetime] NOT NULL DEFAULT (getdate()),
    [TrackId] [int] NOT NULL,
    [IPAddress] [varchar](20)
) ON [PRIMARY]
There's a CLUSTERED INDEX on ListenDate ASC and a NON CLUSTERED INDEX on the TrackId.
I have a TRIGGER on the Music_BandTrackPlays table that looks like the following:
CREATE TRIGGER [trig_Increment_Music_BandTrackPlays_PlayCount]
ON [dbo].[Music_BandTrackPlays]
AFTER INSERT
AS
UPDATE Music_BandTracks
SET Music_BandTracks.PlayCount = Music_BandTracks.PlayCount + TP.PlayCount
FROM (SELECT TrackId, COUNT(*) AS PlayCount
      FROM inserted
      GROUP BY TrackId) AS TP
WHERE Music_BandTracks.TrackId = TP.TrackId
When a simple INSERT statement is run against the Music_BandTrackPlays table, it can take quite a long time. When I remove the TRIGGER, the INSERTs are immediate. The execution plan for the TRIGGER shows that an 'Inserted Scan' is taking up most of the resources.
How exactly is the pseudo 'inserted' table formed?
For now, I think the easiest thing to do is update my logging page so it performs 2 queries: one to UPDATE the Music_BandTracks table and increment the counter, and a separate INSERT into the Music_BandTrackPlays table.
I'm OK with that solution, but I would really like to understand why the TRIGGER is taking so long. The 'inserted' pseudo-table will be 1 row 99% of the time. Does SQL Server perform a table scan on all 20 million rows in order to determine what's new and put it in the inserted pseudo-table?
I have a master securities table which has 7 fields. As part of the daily process I am uploading flat files into database tables. The flat files contain the master (static) security data as well as the analytics (transaction) data. I need to:
1) separate the master (static) data from the flat files,
2) check whether that data is present in the master table; if not, insert that data into the master table,
3) if the data is present, move the existing record to a history table and then update the main master table.
All 7 fields need to be checked to uniquely identify a single record in the master table.
How can this be done? Can we use a combination of data flow items, or should we write a SQL procedure to do all of this?
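If the flat file is landed in a staging table first, the T-SQL side can be as simple as the following sketch; every name is hypothetical, key_col1/key_col2 stand in for the 7 identifying fields, and the history table is assumed to have the same columns as the master:

-- 1) copy the existing matches to history before they change
INSERT INTO dbo.SecurityMaster_History
SELECT m.*
FROM   dbo.SecurityMaster AS m
JOIN   dbo.Staging_Master AS s
       ON  s.key_col1 = m.key_col1
       AND s.key_col2 = m.key_col2;        -- ...repeat for all 7 identifying fields

-- 2) refresh the matched rows from the staging data
UPDATE m
SET    m.attr_col = s.attr_col             -- whatever non-key columns need refreshing
FROM   dbo.SecurityMaster AS m
JOIN   dbo.Staging_Master AS s
       ON  s.key_col1 = m.key_col1
       AND s.key_col2 = m.key_col2;

-- 3) insert the rows that are not in the master yet
INSERT INTO dbo.SecurityMaster
SELECT s.*
FROM   dbo.Staging_Master AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM dbo.SecurityMaster AS m
                   WHERE m.key_col1 = s.key_col1
                     AND m.key_col2 = s.key_col2);

In SSIS terms the same flow is usually a Lookup against the master table, with the match and no-match outputs routed to different destinations.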
I am inserting/updating a few tables from snapshot and reading the same bunch of tables from reporting using read committed. It is showing some deadlocks; I think that is to be expected in this situation, as "X" is not compatible with "S" or "IS".
I need to use a BULK INSERT statement for copying a table with 200 million rows to another table on the same server... the table has no primary key or identity column.... script for BULK INSERT ...
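One note: BULK INSERT reads from a data file rather than from another table, so for a same-server, table-to-table copy the usual tool is a minimally logged INSERT ... SELECT. A sketch with hypothetical names (minimal logging also depends on the database recovery model and the state of the target table):

INSERT INTO dbo.TargetTable WITH (TABLOCK)   -- TABLOCK is one of the conditions for minimal logging
SELECT *
FROM   dbo.SourceTable;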
I have a table that has 4+ million records. I need to update those records and I am facing some performance issues. Can someone please advise?
UPDATE stage
SET    batch_status = 1
WHERE  update_status = 0;

UPDATE t
SET    t.aId = s.aId,
       t.b   = s.b
FROM   [transaction] AS t
JOIN   stage AS s ON s.aId = t.aId
WHERE  s.batch_status = 1;

UPDATE stage
SET    update_status = 1,
       batch_status  = 2
WHERE  batch_status = 1;
When I run the above with SET ROWCOUNT 1000, it runs in one minute. When I run it with SET ROWCOUNT 10000, it runs in 1 hour 56 minutes. Can someone help me optimize it?
Hey folks... So I have a table that looks like this:

CREATE TABLE [tblStation] (
    [CAMPAIGN] [varchar](8),
    [LISTNUM] [varchar](10),
    [PHONE] [varchar](10),
    [EVENTTIME] [datetime],
    [STATION] [int],
    [OPERATOR] [varchar](16),
    [EVENTCODE] [varchar],
    [CALLSPAN] [decimal](18, 0),
    [FDISP] [int],
    [RECORDNUM] [varchar],
    [STC] [varchar],
    [PROMOC] [varchar],
    [EXP_CAMP] [varchar],
    [PROMO3] [varchar],
    [MAXATT] [char],
    [LISTNAME] [varchar],
    [SITENAME] [char],
    [Row_id] [int] IDENTITY
)

It's taking nine seconds to run the following command:

SELECT count([fdisp])
FROM [TrunkFiles_new].[dbo].[tblStation] WITH (NOLOCK)
WHERE fdisp IS NULL

Anyone familiar with a table of this size having performance like this? The [fdisp] column has a non-clustered index on it. Thanks in advance...
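Two observations, hedged as suggestions: COUNT([fdisp]) counts only non-NULL values, so combined with WHERE fdisp IS NULL it will always return 0, and COUNT(*) is probably what is intended; and for this exact shape of query a filtered index on the NULL rows lets the count touch only the rows being counted (filtered indexes require SQL Server 2008 or later):

CREATE NONCLUSTERED INDEX IX_tblStation_fdisp_null
ON dbo.tblStation (fdisp)
WHERE fdisp IS NULL;

-- the count can then be answered from the small filtered index
SELECT COUNT(*)
FROM   dbo.tblStation
WHERE  fdisp IS NULL;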
Good day to all, I am new here so I hope I am doing things correctly.
The company I work for makes coils of shaped wire and works a 6-to-6 shift pattern.
I have a database that is updated from a data collection source (MS Access) at 06:00 every morning. This seems to be working OK. My problem is that most coils fit nicely into the 6-to-6 shift pattern, but now and again some drift over into the next shift. I have written a Crystal report that picks up this data. At the moment the coils are put in the database as: [Coil Start Time], [Coil Finish Time], [Coil Start Weight], [Coil Finish Weight], etc.
I have written (been helped to write) a SQL statement that will do the following:
Step 1: If the coil finish time is greater than the shift end time, then set the coil end time to the shift end time and zero the start and finish weights. Step 2: The original coil record is duplicated and the coil start time set to the start time of the shift; all other data is left alone.
Example of code:
-->>
SELECT [Batch Name], [Batch Start], [Batch End], [Coil Start Weight], [Coil Finish Weight],
       [Product], [Shift], [Operator ID], [Works Order No]
FROM dbo.tblCoilData
WHERE (DATEPART(hour, [Batch Start]) >= 6 AND DATEPART(hour, [Batch End]) < 18)
   OR ((DATEPART(hour, [Batch Start]) < 6 OR DATEPART(hour, [Batch Start]) >= 18)
       AND (DATEPART(hour, [Batch End]) < 6 OR DATEPART(hour, [Batch End]) >= 18))

UNION ALL

SELECT [Batch Name], [Batch Start],
       DATEADD(hour, 17, DATEADD(minute, 59, CONVERT(char(10), [Batch End], 101))),
       0, 0, [Product], [Shift], [Operator ID], [Works Order No]
FROM dbo.tblCoilData
WHERE DATEPART(hour, [Batch Start]) >= 6 AND DATEPART(hour, [Batch Start]) < 18
  AND (DATEPART(hour, [Batch End]) < 6 OR DATEPART(hour, [Batch End]) >= 18)

UNION ALL

SELECT [Batch Name],
       DATEADD(hour, 18, CONVERT(char(10), [Batch Start], 101)),
       [Batch End], [Coil Start Weight], [Coil Finish Weight],
       [Product], [Shift], [Operator ID], [Works Order No]
FROM dbo.tblCoilData
WHERE DATEPART(hour, [Batch Start]) >= 6 AND DATEPART(hour, [Batch Start]) < 18
  AND (DATEPART(hour, [Batch End]) < 6 OR DATEPART(hour, [Batch End]) >= 18)

UNION ALL

SELECT [Batch Name], [Batch Start],
       DATEADD(hour, 5, DATEADD(minute, 59, CONVERT(char(10), [Batch End], 101))),
       0, 0, [Product], [Shift], [Operator ID], [Works Order No]
FROM dbo.tblCoilData
WHERE (DATEPART(hour, [Batch Start]) < 6 OR DATEPART(hour, [Batch Start]) >= 18)
  AND DATEPART(hour, [Batch End]) >= 6 AND DATEPART(hour, [Batch End]) < 18

UNION ALL

SELECT [Batch Name],
       DATEADD(hour, 6, CONVERT(char(10), [Batch Start], 101)),
       [Batch End], [Coil Start Weight], [Coil Finish Weight],
       [Product], [Shift], [Operator ID], [Works Order No]
FROM dbo.tblCoilData
WHERE (DATEPART(hour, [Batch Start]) < 6 OR DATEPART(hour, [Batch Start]) >= 18)
  AND DATEPART(hour, [Batch End]) >= 6 AND DATEPART(hour, [Batch End]) < 18
<<--
I have 2 options now
option 1: Leave this as a SQL View and report from this
option 2: Insert updated records to the tblCoilData table so that the data in the table is permanent
I would prefer option 2, but I am a bit of a nugget when it comes to writing update/insert statements. Could someone please help me with this?
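A minimal sketch of option 2, assuming the UNION ALL query above has been saved as a view with the hypothetical name dbo.vwCoilDataSplit and that the split rows are materialised into a separate permanent reporting table (also hypothetical) each morning after the 06:00 load, so the raw tblCoilData is left untouched:

TRUNCATE TABLE dbo.tblCoilDataSplit;          -- hypothetical permanent reporting table with the same columns

INSERT INTO dbo.tblCoilDataSplit
       ([Batch Name], [Batch Start], [Batch End], [Coil Start Weight], [Coil Finish Weight],
        [Product], [Shift], [Operator ID], [Works Order No])
SELECT [Batch Name], [Batch Start], [Batch End], [Coil Start Weight], [Coil Finish Weight],
       [Product], [Shift], [Operator ID], [Works Order No]
FROM   dbo.vwCoilDataSplit;

The Crystal report can then point at the permanent table instead of the view.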
How well can SQL Server support 300 million records? Is anybody working on a big database like this? Can anyone give me some input on this? It's going to be about 60GB in database size.
I have a sql script that updates records in a table with 40 million records.
There is some functionality in the script that could be factored out into functions for code reuse/elegance.
Functions would cause execution overhead.
What else could I use besides functions that would give me the code reuse without compromising on execution overhead? Is there anything like includes in T-SQL that would allow me to do so?
I'd like to do the following thing with a data flow task
Get all the records from a source (for example, customers from a text file via a flat file source). Then, for each record, check whether the customer already exists in a table, for example by customerID. If not, insert the record into the table (OLE DB destination); otherwise, copy the customer that's already in the table to another table (a history table) and update the record with the customer from the text file.
Is this possible? And what kind of data flow transformation do I need?
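Yes, it is possible; in a data flow this is typically a Lookup transformation on customerID, with the no-match rows going to the destination and the matched rows routed to the history/update path. If the text file is landed in a staging table first, the same logic can also be written as a single T-SQL MERGE whose OUTPUT feeds the history table; this is only a sketch with hypothetical names, it needs SQL Server 2008 or later, and the outer INSERT target must not have triggers or foreign keys:

INSERT INTO dbo.Customer_History (customerID, Name, Address)      -- hypothetical history table
SELECT customerID, Name, Address
FROM (
    MERGE dbo.Customer AS c                                       -- hypothetical target table
    USING dbo.Customer_Staging AS s                               -- hypothetical staging table loaded from the text file
          ON c.customerID = s.customerID
    WHEN MATCHED THEN
        UPDATE SET c.Name = s.Name, c.Address = s.Address
    WHEN NOT MATCHED THEN
        INSERT (customerID, Name, Address)
        VALUES (s.customerID, s.Name, s.Address)
    OUTPUT $action, deleted.customerID, deleted.Name, deleted.Address   -- pre-update values of existing rows
) AS changes (mergeAction, customerID, Name, Address)
WHERE changes.mergeAction = 'UPDATE';                             -- history only for customers that already existed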
I have a new client with an existing system that has just over 2 million business listings in one table. Each business listing is associated with one business category.
* Company Table (around 20 fields):
companyID companyName categoryID state postCode etc.
* Category Table (5 fields)
categoryID categoryName etc.
We are using MSSQL 2005 Express Edition with Advanced Services
A full-text search needs to be performed on companyName and categoryName, limited by region (state and/or postcode).
1) What kind of response times should I expect from the full-text search? (I have not used full-text search before.)
2) How should I index companyName and categoryName so they are both used in a joined query? i.e., do I just configure the full-text index on each field separately and it should work?
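A minimal sketch of the full-text setup on both tables; the catalog and key-index names are hypothetical, and each full-text index needs a unique single-column key index (typically the primary key):

CREATE FULLTEXT CATALOG ftListings AS DEFAULT;

CREATE FULLTEXT INDEX ON dbo.Company (companyName)
    KEY INDEX PK_Company ON ftListings;           -- PK_Company: hypothetical name of the PK index

CREATE FULLTEXT INDEX ON dbo.Category (categoryName)
    KEY INDEX PK_Category ON ftListings;

-- then query with CONTAINS on each table and join as usual
SELECT c.companyID, c.companyName, cat.categoryName
FROM   dbo.Company  AS c
JOIN   dbo.Category AS cat ON cat.categoryID = c.categoryID
WHERE (CONTAINS(c.companyName, 'plumber') OR CONTAINS(cat.categoryName, 'plumber'))
  AND  c.state = 'NSW';                           -- hypothetical region filter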
I want to compare ONLY 1 column's values from 2 tables, each having more than 4.9 million records. There is a difference of 4000 rows between the 2 tables.
SELECT ID From TABLE1 where ID not in (SELECT DISTINCT ID From TABLE2)
My above query took nearly 4.5 hours to run and I had to cancel it. Is there a better way to write the query? I just want to find the ID column values that are missing from TABLE2.
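The DISTINCT inside the subquery is unnecessary work, and NOT IN also returns no rows at all if TABLE2.ID can ever be NULL. Two rewrites that usually perform much better, assuming ID is indexed on both tables:

-- NOT EXISTS version
SELECT t1.ID
FROM   TABLE1 AS t1
WHERE  NOT EXISTS (SELECT 1 FROM TABLE2 AS t2 WHERE t2.ID = t1.ID);

-- LEFT JOIN version
SELECT t1.ID
FROM   TABLE1 AS t1
LEFT JOIN TABLE2 AS t2 ON t2.ID = t1.ID
WHERE  t2.ID IS NULL;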
Hi, I have 2 tables with more than a million records in each, and I have to perform a full outer join. The problem is that the join clause contains 2 different columns (an int and a string), like this:
Select * From a full outer join b On a.cli = b.cli OR a.reference = b.reference
Because of the OR in the clause and the millions of records, the query runs forever. If I change it to one rule only, then it works fine.
How can I join these 2 big tables with 2 rules? Thanks Itay
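One common workaround is to decompose the OR into separate single-column joins and then add back the rows from each side that match nothing at all, so every branch can use an index. A sketch showing only the key columns; note the UNION between the first two branches collapses pairs that match on both columns (as the OR join would), but it will also collapse exact duplicate key pairs, so check that against your data:

SELECT a.cli, a.reference, b.cli AS b_cli, b.reference AS b_reference
FROM   a JOIN b ON a.cli = b.cli
UNION
SELECT a.cli, a.reference, b.cli, b.reference
FROM   a JOIN b ON a.reference = b.reference

UNION ALL
SELECT a.cli, a.reference, NULL, NULL              -- rows of a with no match at all
FROM   a
WHERE  NOT EXISTS (SELECT 1 FROM b WHERE b.cli = a.cli OR b.reference = a.reference)

UNION ALL
SELECT NULL, NULL, b.cli, b.reference              -- rows of b with no match at all
FROM   b
WHERE  NOT EXISTS (SELECT 1 FROM a WHERE a.cli = b.cli OR a.reference = b.reference);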
I have tried to process more than 3 million fuzzy grouping records on two different servers with no success. 3 million works, but anything above 4 million doesn't. Some background:
We are trying to de-dup our customer table on: name (.5 min), address1 (.5 min), city (.5 min), state (exact), with a .8 overall record minimum score.
Output includes additional fields: customerid, sourceid, address2, country, phonenumber.
Without SP1 installed I couldn't even get a few hundred thousand records to process.
Two different servers, same problems. Note that SSIS and SQL Server are running locally on both.
The higher-end server has 4GB RAM, the other 2.5GB RAM. Plenty of free disk space on both.
SQL Server is configured to use 2GB of RAM max.
The page file is currently at 15GB.
After running a number of tests on both servers trying different batch sizes etc., the one thing I noticed is that it always seems to error out when SSIS takes over and starts chewing up all the available RAM. This happens after the index is created and SSIS starts "warming caches". On both servers SQL Server uses about 1.6GB of RAM at this point, while SSIS keeps taking RAM until all physical RAM is used up.
Some questions:
Has anyone been able to process more than 3 million records, and if so, what is your hardware configuration?
Should we try running SSIS from a different server so it has access to the full amount of physical RAM (and doesn't have to fight SQL Server for it)?
Should we install Windows 2003 Enterprise Server so we can add more RAM?
Any ideas why switching to the page file might be causing errors?