SQL Server 2014 :: Insert 500 Million Rows Into In-memory Table
Jul 29, 2014
I am doing a performance testing for In-memory option is sql server 2014. As a part I want to insert 500 million rows of records into a in-memory enabled test table I have created.
I need a sample script to insert 500 million records into a table ....
I have a row that is being used log track plays on our website.
Here's the table:
CREATE TABLE [dbo].[Music_BandTrackPlays]( [ListenDate] [datetime] NOT NULL DEFAULT (getdate()), [TrackId] [int] NOT NULL, [IPAddress] [varchar](20) ) ON [PRIMARY]
There's a CLUSTERED INDEX on ListenDate ASC and a NON CLUSTERED INDEX on the TrackId.
I have a TRIGGER on the Music_BandTrackPlays table that looks like the following:
CREATE TRIGGER [trig_Increment_Music_BandTrackPlays_PlayCount] ON [dbo].[Music_BandTrackPlays] AFTER INSERT AS UPDATE Music_BandTracks SET Music_BandTracks.PlayCount = Music_BandTracks.PlayCount + TP.PlayCount FROM (SELECT TrackId, COUNT(*) AS PlayCount FROM inserted GROUP BY TrackId) AS TP WHERE Music_BandTracks.TrackId = TP.TrackId
When a simple INSERT statement is done on the Music_BandTrackPlays table, it can take quite a long time. When I remove the TRIGGER the INSERTs are immediate. The Execution plan for the TRIGGER shows that a 'Inserted Scan' is taking up most of the resources.
How exactly is the pseudo 'inserted' table formed?
For now, I think the easiest thing to do is update my logging page so it performs 2 queries. One to UPDATE the Music_BandTracks table and increment the counter, and perform the INSERT into the Music_BandTrackPlays table seperately.
I'm ok with that solution but I would really like to understand why the TRIGGER is taking so long. The 'inserted' pseudo table will be 1 row 99% of the time. Does SQL Server perform a table scan on all 20 million rows in order to determine what's new and put it in the inserted pseudo table?
I need to use Bulk insert statement for copying a table with 200 million rows to another table on the same server...the table has no primary key or identity column.... script for BULK INSERT ...
I'm looking for some performance assistance on updating a column value in a table that contains approximately 50 million rows. I have a permanent table in another database that has the key column and value to be set. My query is listed below, but I'm afraid it will run quite awhile. Any suggestions would be appreciated.
update mytable set column2 = b.column2 from mytable as a join mytable1 as b on a.column1 = b.column1
There is a one to one relationship between the two tables.
CREATE TABLE [Sales].[Test_inmem] ( [c1] [int] NOT NULL, [c2] [nvarchar](20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, [ModifiedDate] [datetime2](7) NOT NULL CONSTRAINT [IMDF_Test_ModifiedDate] DEFAULT (sysdatetime()),
[Code] ....
I have to generate 1000000 random records into it. I tried various ways to insert records, but not being a developer could not do it. I hope to make the C1 as a serial number, C2 can be anything, C3 I want to be the timestamp.
I've been having some trouble getting a single-column "varchar(5)" field to reliably use a table seek instead of a table scan. The production table in this case contains 25 million rows. As impressive as it is to scan 25 million rows in 35 seconds, the query should run much faster.
Typically, this table is accessed with a query that includes:
SELECT ... FROM SummaryTable WHERE ixZIP IN (SELECT ZipCode FROM @ZipCodesForMO)
This query insists on using a table scan. I've tried WITH (FORCESEEK) for example, but that just makes the query fail.
As I've investigated this issue I also tried:
SELECT * FROM Summaries WHERE ZipCode IN ('xxxxx', 'xxxxx', 'xxxxx')
When I run this query with 64 or fewer (actual, valid) ZIP codes, the query uses a table seek.But when I give it 65 or more ZIP codes it uses a table scan.
To summarize, the production query always uses a table scan, and when I specify 65 or more ZIP codes the query also uses a table scan. I'm wondering if the data type of the indexed column (Latin1_General_100_BIN2) is somehow the problem. I'll likely try converting the ZIP codes to an integer to see what happens.
I try to load data into a memOpt table (INSERT INTO ... SELECT ... FROM ...). The source table has a size about 1 Gb and 13 Mio Rows. During this load the LDF File grows to size of 350 GB (until the space if the disk is run out of space). The Server has about 110 GB Memory for the SQL Server reserved. The tempdB doesn't grow. The Bucket Size in the create statement has a size of 262144. The Hash key as 4 fields`(2 fields have the datatype int,1 has smallint, 1 has varchar(200). ) The disk for the datafiles has still space for the datafiles (incl. the hekaton files).
How can I reduce the size of the ldf files during the load of the data ?
I have a requirement to delete 1 Million records from a table having 10 Million data and it's being queried on 24/7 basis (don't have a downtime). how can I achieve that?
IF NOT EXISTS (SELECT TOP 1 1 FROM dbo.syscolumns WHERE id = OBJECT_ID(N'dbo.Employee) and name = 'DoNotCall') BEGIN ALTER TABLE [dbo].[Employee] ADD [DoNotCall] bit not null Constraint DoNot_Call_Default DEFAULT 0 IF ( @@ERROR <> 0 ) GOTO QuitWithRollback END
It just takes a LOT of time in SQL Server Management studio. I have to cancel the query and cancelling takes a whole lot time. I am using SQL Server 2008.
Select statement joining file1 to file2. File 1 may have 0, 1, or many corresponding rows in file2. I need to count the corresponding rows in table2. Table2 also has a Boolean column and I need to count the number of rows where it is true. So I need to count the total number of matching rows and the count of those that are set to true. This is an example of what I have so far. I had to add each column being selected into a Group by to make it work, but I do not know why. Is there some other way this should be set up.
SELECT c.CarId, c.CarName, c.CarColor, COUNT(t.TrailerId) as trailerCount, (add count of boolian, say t.TrailerFull is true) FROM Car c LEFT JOIN Trailer t on t.CarId = c.CarId GROUP BY c.CarId, c.CarName, c.CarColor
I run the following statement and it will not update beyond 7 million plus rows and I have about 38 million to complete. I keep checking updated row counts and after 1/2 day it's still the same so I know something is wrong because it was rolling through no problem when I initiated it. I need to complete ASAP so it's adding to my frustration. The 'Acct_Num_CH' field is an encrypted field (fyi).
SET rowcount 10000 UPDATE [dbo].[CC_Info_T] SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v' WHERE [Acct_Num_CH] IS NOT NULL WHILE @@ROWCOUNT > 0 BEGIN SET rowcount 10000 UPDATE [dbo].[CC_Info_T] SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v' WHERE [Acct_Num_CH] IS NOT NULL END SET rowcount 0
I need to update about 1.3 million rows in a table of mine. I am getting the data from one of the columns of the same table and updating the new column. I am doing this using a cursor which I have put in a stored procedure. As this is a production table which users might be accessing.It is a web based application and I can't slow the system down. So I am willing to run the stored prcedure during off peak hours. However, do I need to put this in a transaction? If I did put it in a transaction what type of isolation level should I opt for? Data integrity is very important for me and I don't mind to compromise on the performance. I am doing this because one of the columns which has "short description" entry is has become too small for business purposes and we want to increase it's length from varchar(100) to varchar(150). As this is SQL 6.5, I can't increase the lenght of the column. So Iadded a new column and will run the stored proc. What precautions are to be taken? This is on a high priority basis and very important too.
Thanks in advance...
Stored procedure code:
USE DB_Registration_Dev GO IF EXISTS (SELECT NAME FROM SYSOBJECTS WHERE NAME='usp_update_product' AND TYPE='P') DROP PROCEDURE usp_update_product GO CREATE PROC usp_update_product AS DECLARE @short_desc varchar(100) DECLARE @prod_id int
DECLARE sdesc_curs CURSOR FOR SELECT [Product].[product_id] , [Product].[short_description] FROM Product
OPEN sdesc_curs
FETCH NEXT FROM sdesc_curs INTO @prod_id, @short_desc
WHILE @@FETCH_STATUS = 0 BEGIN UPDATE Product SET [Product].[sdesc] = @short_desc WHERE Product_id=@prod_id FETCH NEXT FROM sdesc_curs INTO @prod_id, @short_desc IF @@FETCH_STATUS <> 0 PRINT ' Finished Updating the table...go ahead and have fun ...! ' END DEALLOCATE sdesc_curs GO
My question is: How can I insert a row for each unique TemplateId. So let's say I have templateIds like, 2,5,6,7... For each unique templateId, how can I insert one more row?
I have a CTE query against a table with 32K rows that runs fine in 2008R2. I am running it in 2014 Std Ed. against the same data and it runs very slowly. Looking at the execution plan I think I see what's contributing to the slowness.
Note that the "actual number of rows" is some 351M...how is this possible?
the query:
declare @amts table (claim int,allowed decimal(12,2),copay decimal(12,2),deductible decimal(12,2),coins decimal(12,2)); ;with unpaid (claimID) as (select claimID from claim where amt+copay + disct+mm + ded=0) insert @amts select lineID, sum(rc), sum(copay), sum(deduct), case when sum(mm)>0 and (sum(mm)<sum(mmamt)) then sum(mm) else 0 end from claimln where status is null and lineID not in (select claimID from unpaid) group by lineID
it's like there's some massively recursive process going on?
I tried with the following and result is coming for one month i.e. JUL but not with the second Month i.e Jun
SELECT 'Jul1' AS MON, [BNQ], [FNB], [RS] FROM (SELECT REVENUECODE, SUM(ROUND(((Jul/31)*30),0)) AS JUL FROM RM_USERBUDGETTBL WHERE USERNAME='rahul' AND FY=2015 GROUP BY REVENUECODE, USERNAME ) AS SourceTable PIVOT (SUM(JUL) FOR REVENUECODE IN ([BNQ], [FNB], [RS])) AS PivotTable
I have two tables for insertion in one transaction scope. Table one have 10 rows. After first table insert statement (not yet committed) if I run select on first table from other session, it holds table until my insert is committed or rolled back and from (SSMS), it display 10 rows and then wait for transaction scope till finished. My question is do I need to use no lock hint in this situation. Or there is something wrong with isolation level. One saying that in this situation table should not hols select while insert is in transaction scope.
I have a master table with after insert trigger on it.. When record is inserted into master table, the trigger fires and is captured in the backoffice table. In case the trigger fails, my record is neither in the master table nor in the back office table..
Is there anyway to capture the record either in the master table or in a separate table.
I have a need to insert stored procedure output a table and in addition to that add a datetimestamp column.. For example, Below is the process to get sp_who output into Table_Test table. But I want to add one additional column in Table_test table with datetimestamp when the procedure was executed.
I do not insert/update/delete on the view directly.
For every insert/update in table A /B the values should get insert/update in the view respectively. This insert/update on view should invoke the trigger.
And I am unable to see this trigger work on the view if any insert/update occurs on base table level.
Trigger is working only if any operation is done directly on the view.
In a t-sql 2012 sql script, I have the following script, that only works for a few records since the value of TST.dbo.LockCombination.seq only contains the value or 1 in most cases. Basically for every join listed below, there should be 5 records where each record has a distinct seq value of 1, 2, 3, 4, and 5. Thus my goal is to determine how to add the missing rows to the TST.dbo.LockCombination where there are no rows for seq values of between 2 to 5. I would like to know how to insert the missing rows and then do the following update statement. Thus can you show me the sql on how to add the rows for at least one of the missing sequence numbers?
UDATE LKC SET LKC.combo = lockCombo2 FROM [LockerPopulation] A JOIN TST.dbo.School SCH ON A.schoolnumber = SCH.type JOIN TST.dbo.Locker LKR ON SCH.schoolID = LKR.schoolID AND A.lockerNumber = LKR.number JOIN TST.dbo.Lock LK ON LKR.lockID = LK.lockID JOIN TST.dbo.LockCombination LKC ON LK.lockID = LKC.lockID WHERE LKC.seq = 2
A normal select statement looks like the following:
select * from TST.dbo.Locker LKR JOIN TST.dbo.Lock LK ON LKR.lockID = LK.lockID JOIN TST.dbo.LockCombination LKC ON LK.lockID = LKC.lockID where LKR.number in (000,001,1237)
In case you need the ddl statements for the tables affected here are the ddl statements:
CREATE TABLE [dbo].[Locker]( [lockerID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL, [schoolID] [int] NOT NULL, [number] [varchar](10) NOT NULL, [serialNumber] [varchar](20) NULL, [type] [varchar](3) NULL, [locationID] [int] NULL,
I am writing a query to return some production data. Basically i need to insert either 1 or 2 rows into a Table variable based on a decision as to does the production part make 1 or 2 items ( The Raw data does not allow for this it comes from a look up in my database)
I can retrieve all the source data i need easily but when i come to insert it into the table variable i need to insert 1 record if its a single part or 2 records if its a twin part. I know could use a cursor but im sure there has to be an easier way !
Below is the code i have at the moment
declare @startdate as datetime declare @enddate as datetime declare @Line as Integer DECLARE @count INT
set @startdate = '2015-01-01' set @enddate = '2015-01-31'
I am finding it difficult to find an example that allows for insertion of additional rows into a table, without dropping the table I'm inserting into. Or inserting specific values. Like this example..
[URL] ....
I have 6 table I am formatting the data to conform to the final table as I'm inserting it into, but none of these examples gives me the example needed. I am using SQL 2012.
<code> SELECT CONVERT(VARCHAR(50),[FName]) + ' ' + CONVERT(VARCHAR(50),[LName]) AS [CustName] ,CAST('ALARMCOM' as nvarchar(8)) as VendorName ,CONVERT(VARCHAR(25),[CUSTOMER_CS_ACCOUNT_NUMBER]) AS [Cust_ID] ,CONVERT(VARCHAR(40),[Charge_Description])as [ChargeType] ,CASE
I'm new to using a DB and have a few questions about what I'm trying to do. I have some historical options data and want to place it into a sql express database. (I understand I might need to use a none express version once the db gets to big.) A months worth of data is over 5.5 million rows of data. So six years worth is ~400 million rows. Is it possible to put this into a sql db and be able to search it very fast? I have a months worth in a db now and it is pretty slow. Should I use a new table for each month and then have 6 years * 12 month = 72 tables to increase the search speed? I search by date and stock_symbol and the data looks like this: Date, Stock_Symbol, Option_Symbol, Strike, BidPrice, AskPrice, Volume, OpenInterest, (and a few others) The select statement is simple: SELECT * FROM Options WHERE Date = @Date and StockSymbol = @Symbol Thanks
DELETING 100 million from a table weekly SQl SERVER 2000Hi AllWe have a table in SQL SERVER 2000 which has about 250 million recordsand this will be growing by 100 million every week. At a time the tableshould contain just 13 weeks of data. when the 14th week data needs tobe loaded the first week's data has to be deleted.And this deletes 100 million every week, since the delete is taking lotof transaction log space the job is not successful.Can you please help with what are the approaches we can take to fixthis problem?Performance and transaction log are the issues we are facing. We trieddeletion in steps too but that also is taking time. What are thedifferent ways we can address this quickly.Please reply at the earliest.ThanksHarish
In our database, we have a very large table that gets updated every morning, start of the day is copying 4 million rows from the fact table from previous date to today's date in the same table and then some other processing. It takes 1 1/2 to 2 hrs to do this. There is a dts package created to copy these rows into temp table and then to this fact table.
This table has more than 200 million rows
Any ideas on how to accomplish this without doing the copy twice and not running into locking problems.