SQL Server: Installing With Millions of Rows in Tables
Oct 11, 2007
Hello,
Currently we are in the process of implementing a SQL Server database where a couple of tables will have millions of rows (about 98 million, and growing) and an ASP.NET site that will retrieve and sort the data.
What is the best practice for installing the database in a situation like this one?
Do we need a clustered server? Does indexing need to be done in a special way?
Thanks in advance.
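On the indexing side of the question, the sketch below is only an illustration of the usual pattern for large, read-mostly tables (the table and column names are hypothetical, not from the post): a narrow clustered index on the key, plus a covering nonclustered index for the filter and sort the site issues most often.

-- Hypothetical example only: dbo.Orders stands in for one of the 98-million-row tables.
CREATE TABLE dbo.Orders
(
    OrderId     BIGINT        NOT NULL,
    CustomerId  INT           NOT NULL,
    OrderDate   DATETIME      NOT NULL,
    Amount      DECIMAL(18,2) NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId)
);

-- Covering index for the most common "retrieve and sort" query,
-- so a full scan of the 98 million rows is avoided.
CREATE NONCLUSTERED INDEX IX_Orders_Customer_Date
    ON dbo.Orders (CustomerId, OrderDate DESC)
    INCLUDE (Amount);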
I ran the following statement and it will not update beyond 7-million-plus rows, and I have about 38 million to complete. I keep checking the updated row counts, and after half a day they are still the same, so I know something is wrong, because it was rolling through with no problem when I initiated it. I need to complete this ASAP, so that is adding to my frustration. The 'Acct_Num_CH' field is an encrypted field (FYI).
SET ROWCOUNT 10000

UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL

WHILE @@ROWCOUNT > 0
BEGIN
    SET ROWCOUNT 10000
    UPDATE [dbo].[CC_Info_T]
    SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
    WHERE [Acct_Num_CH] IS NOT NULL
END

SET ROWCOUNT 0
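One likely reason the loop never finishes is that the WHERE clause still matches every row that has already been set to the constant, so @@ROWCOUNT never drops to zero. A minimal sketch of a terminating version, assuming the placeholder value above is the intended final value for every non-NULL account number:

SET ROWCOUNT 10000

UPDATE [dbo].[CC_Info_T]
SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
WHERE [Acct_Num_CH] IS NOT NULL
  AND [Acct_Num_CH] <> 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'   -- skip rows already updated

WHILE @@ROWCOUNT > 0
BEGIN
    SET ROWCOUNT 10000
    UPDATE [dbo].[CC_Info_T]
    SET [Acct_Num_CH] = 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
    WHERE [Acct_Num_CH] IS NOT NULL
      AND [Acct_Num_CH] <> 'ayIWt6C8sgimC6t61EJ9d8BB3+bfIZ8v'
END

SET ROWCOUNT 0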
I have one view which is based on a couple of tables. Here is the definition of the view. What options can I use to optimize the view for better performance? This is one of the views causing issues on the database.
CREATE VIEW [dbo].[V_Reqs]
WITH SCHEMABINDING
AS
SELECT  purchase.Req.RequisitionID,
        purchase.Req.StatusCode AS Expr2,
        purchase.Req.CollectionDateTime,
        purchase.Req.ReportDateTime,
        purchase.Req.ReceivedDateTime,
        purchase.Req.PatientName,
        purchase.Req.AddressOne,
        purchase.Req.AddressTwo,
        purchase.Req.City,
        purchase.Req.PostalCode,
        purchase.Req.PhoneNumber,
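Since the view is already created WITH SCHEMABINDING, one option to consider (a sketch only; whether it is allowed and worthwhile depends on the full definition, which is truncated above) is to materialize it as an indexed view by adding a unique clustered index on its key column, assuming RequisitionID is unique within the view:

-- Materialize the schema-bound view; reads can then hit the stored index
-- instead of re-joining the base tables (Enterprise Edition can use it
-- automatically; other editions need the NOEXPAND hint).
CREATE UNIQUE CLUSTERED INDEX IX_V_Reqs_RequisitionID
    ON [dbo].[V_Reqs] (RequisitionID);

-- Example query using the indexed view explicitly.
SELECT RequisitionID, PatientName, City
FROM [dbo].[V_Reqs] WITH (NOEXPAND)
WHERE City = 'Seattle';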
I am using SQL Server 2012 SE. I am trying to delete rows from a couple of tables (GetPersonValue has 250 million rows and I am trying to delete 50 million; GetPerson has 35 million rows and I am trying to delete 20 million). These tables are in transactional replication. The plan is to delete data older than 400 days.
I tried moving the last 400 days of data to new tables, and it took about 11 hours. If I delete the data in chunks of 500,000, then it takes a long time to rebuild the indexes (delete plus index rebuild is 13 hours). Since I am using Standard Edition, partitioning won't work.
Find the DDL below:
GO
CREATE TABLE [dbo].[GetPerson](
    [GetPersonId] [uniqueidentifier] NOT NULL,
    [LinedActivityPersonId] [uniqueidentifier] NOT NULL,
    [CTName] [nvarchar](100) NULL,
    [SNum] [nvarchar](50) NULL,
    [PHPrimary] [nvarchar](50) NULL,
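A minimal sketch of a batched purge that avoids one giant delete. The date column used for the 400-day cutoff is not in the DDL above, so CreatedDate is an assumption, and it should be indexed for this to perform:

DECLARE @cutoff DATETIME = DATEADD(DAY, -400, GETDATE());
DECLARE @rows   INT      = 1;

WHILE @rows > 0
BEGIN
    -- Small batches keep the transaction log and replication latency manageable.
    DELETE TOP (50000)
    FROM dbo.GetPerson
    WHERE CreatedDate < @cutoff;

    SET @rows = @@ROWCOUNT;

    -- Optional pause so the log reader / distribution agent can catch up.
    WAITFOR DELAY '00:00:02';
END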
I have an application that generates a lot of rows, from 1 million to 2 million, and I want to insert these records into MS SQL Server in a fast way.
I am currently looping through the records while they are loaded in a DataSet, building command text that generates an INSERT query for each row, and running it against SQL Server,
but it takes a lot of time to finish. Is there a way to bulk insert this data?
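Since the rows already sit in a DataSet on the client, the usual options are System.Data.SqlClient.SqlBulkCopy from the application, or writing the rows to a flat file and loading it with BULK INSERT. A minimal T-SQL sketch of the BULK INSERT route, assuming the application first writes the rows to a delimited file (the file path, table, and options shown are hypothetical):

-- Load a client-generated delimited file in one bulk operation
-- instead of one INSERT per row.
BULK INSERT dbo.ImportedRecords
FROM 'C:\exports\records.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 50000,   -- commit in batches to keep the log small
    TABLOCK                    -- allows a minimally logged load into an empty heap
);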
Hi everyone - I have an ETL package which loads about 10 million rows from SQL 2005 staging tables into new, empty tables (no indexes or constraints) in another SQL 2005 database, to be SWITCHed into the main partitioned data tables.
Both databases reside on the same SQL Server instance - it is a dev server, so the disks aren't super fast/SAN speeds, but it has plenty of RAM/CPU and SCSI disks.
The insert takes about 45 minutes - can I get this working any faster, or is this typical for 10 million rows? I've messed about with the data flow a few times but I can't seem to get any significant improvements.
Any tips, anyone?
I perform several lookups on dimensions - these are not cached.
I query the source table concurrently with different WHERE clauses and run two pipelines processing the data into two destination tables.
Would it be better to query the base table once and use a conditional split instead of the two separate queries?
I also multicast from each pipeline and use a UNION ALL to log some of the rows from each pipeline to another destination table.
Hope this makes sense. Any ideas or tips on how I can speed up this kind of transform would be appreciated.
I'm using OLE DB connections.
Hope this makes some kind of sense! Thanks for any advice!
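On the "query the base table once" idea, a hedged sketch of what the single OLE DB source query might look like (the table, column, and flag names are hypothetical): the query adds a derived column that a Conditional Split can route on, so the staging table is scanned once instead of twice.

-- One pass over the staging table; PipelineFlag drives the Conditional Split
-- into the two destination pipelines.
SELECT  s.*,
        CASE WHEN s.RecordType = 'A' THEN 1 ELSE 2 END AS PipelineFlag
FROM    dbo.StagingFact AS s;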
Hello, currently we are in the process of implementing a SQL Server database where a couple of tables will have millions of rows (about 98 million, and growing) and a web site that will retrieve and sort the data (read only). How do an ASP.NET GridView and SqlDataReader behave in a situation like that? Will the response be very slow? Is there any alternative? Is there any example on the net? Assume the tables are well tuned and well indexed. Thank you in advance.
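The usual way to keep a GridView responsive against a table this size is to page on the server rather than binding millions of rows. A minimal sketch (table and column names are hypothetical) of a ROW_NUMBER-based page query the SqlDataReader would execute per page; this works on SQL Server 2005 and later:

DECLARE @PageNumber INT, @PageSize INT;
SET @PageNumber = 1;
SET @PageSize   = 50;

-- Return only one page of rows, sorted on an indexed column.
;WITH Numbered AS
(
    SELECT  OrderId, CustomerName, OrderDate, Amount,
            ROW_NUMBER() OVER (ORDER BY OrderDate DESC, OrderId) AS rn
    FROM    dbo.Orders
)
SELECT OrderId, CustomerName, OrderDate, Amount
FROM   Numbered
WHERE  rn BETWEEN (@PageNumber - 1) * @PageSize + 1
              AND  @PageNumber * @PageSize
ORDER BY rn;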
I have a DataTable in memory and I want to write C# code to dump the data into a SQL database. Is there a faster way of dumping millions of rows into a SQL table besides running INSERT INTO row by row?
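From C#, System.Data.SqlClient.SqlBulkCopy can write a DataTable straight to a table in bulk. If the load has to go through a stored procedure instead, a table-valued parameter (SQL Server 2008 or later) is another option; below is a hedged T-SQL sketch of the server-side pieces, with hypothetical type, table, and procedure names, that a DataTable can be passed into as a single parameter:

-- Table type matching the DataTable's columns.
CREATE TYPE dbo.RecordTableType AS TABLE
(
    RecordId INT           NOT NULL,
    Payload  NVARCHAR(200) NULL
);
GO

-- One set-based insert instead of millions of single-row INSERTs.
CREATE PROCEDURE dbo.BulkInsertRecords
    @Rows dbo.RecordTableType READONLY
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.Records (RecordId, Payload)
    SELECT RecordId, Payload
    FROM   @Rows;
END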
Question A: I need to truncate a table; it has 21 million rows and a size of 14 GB.
1. How do I find out whether this table is referenced by a FOREIGN KEY?
2. Does it participate in an indexed view?
3. Is it being published by transactional replication or merge replication?
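A sketch of catalog queries that answer the three checks before a TRUNCATE (the table name dbo.BigTable is a placeholder; the dependency view in the second query exists in SQL Server 2008 and later):

-- 1. Foreign keys that reference the table (TRUNCATE TABLE fails if any exist).
SELECT fk.name AS referencing_fk, OBJECT_NAME(fk.parent_object_id) AS referencing_table
FROM   sys.foreign_keys AS fk
WHERE  fk.referenced_object_id = OBJECT_ID('dbo.BigTable');

-- 2. Schema-bound views (for example indexed views) that depend on the table.
SELECT OBJECT_NAME(d.referencing_id) AS referencing_view
FROM   sys.sql_expression_dependencies AS d
WHERE  d.referenced_id = OBJECT_ID('dbo.BigTable')
  AND  d.is_schema_bound_reference = 1
  AND  OBJECTPROPERTY(d.referencing_id, 'IsView') = 1;

-- 3. Replication flags on the table itself.
SELECT name, is_published, is_merge_published, is_schema_published
FROM   sys.tables
WHERE  object_id = OBJECT_ID('dbo.BigTable');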
I have a query along the same lines. In my stored procedure I am fetching data from a table containing 5 to 6 million rows. I made use of an index in my database, but I still cannot optimize the execution time of that stored procedure. Please help me out with this problem.
I have deleted nearly 30 million rows from a table. However, when I use the sp_spaceused command to check the space occupied by the table, I don't see any difference in the data size. In fact, the data size has increased by a few MB after the deletion, but not by much.
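Two things commonly explain this: sp_spaceused reports from cached usage statistics that can be stale, and deleting rows does not shrink the allocated pages until the indexes are rebuilt or reorganized. A sketch of the usual follow-up steps (dbo.MyTable is a placeholder name):

-- Refresh the usage figures that sp_spaceused reports from.
EXEC sp_spaceused @objname = 'dbo.MyTable', @updateusage = 'true';

-- Rebuild the table's indexes so empty pages are deallocated
-- and the space shows up as free inside the database.
ALTER INDEX ALL ON dbo.MyTable REBUILD;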
) AS
BEGIN
    SET NOCOUNT ON

    --Exception Handling Variable Declaration
    DECLARE @ErrorMessage   NVARCHAR(200),
            @ErrorNumber    INT,
            @ErrorSeverity  INT,
            @ErrorState     INT,
            @ErrorProcedure NVARCHAR(50),
            @ErrorLine      INT,
            @ErrorDesc      NVARCHAR(100)

    DECLARE @XMLPayment INT

    BEGIN TRY
        IF @XMLParams IS NOT NULL
        BEGIN --BEGIN IF
            SET @ErrorDesc = 'Error Occured While Inserting into TIX_PAYMENT_SCHEDULE FROM XML'

            INSERT INTO TIX_PAYMENT_SCHEDULE
            (
                OwedAmountId, ProposalId, BrandId, DueDate, OverdueDate,
                CreatedDateTime, LastUpdatedDateTime, ExpectedAmount, ActualAmountReceived,
                ScheduleBatchJournalId, RuleId, TransactionStatusId, ActionId, IsLate,
                IsPaymentReceived, IsValidSchedule,
                --Added by DC : 119
                IsCatchupBalanced, CatchupBalanceIdentifier, HasModified
            )
            SELECT
                Main.ELEMENT.value('(OwedAmountId)[1]', 'int')                                           AS OwedAmountId,
                Main.ELEMENT.value('(ProposalId)[1]', 'int')                                             AS ProposalId,
                Main.ELEMENT.value('(BrandId)[1]', 'int')                                                AS BrandId,
                CONVERT(datetime, Main.ELEMENT.value('(DueDate)[1]', 'varchar(100)'))                    AS DueDate,
                CONVERT(datetime, Main.ELEMENT.value('(OverdueDate)[1]', 'varchar(100)'))                AS OverdueDate,
                @ToDate                                                                                  AS CreatedDateTime,
                @ToDate                                                                                  AS LastUpdatedDateTime,
                CONVERT(decimal(18,2), Main.ELEMENT.value('(ExpectedAmount)[1]', 'varchar(100)'))        AS ExpectedAmount,
                CONVERT(decimal(18,2), Main.ELEMENT.value('(ActualAmountReceived)[1]', 'varchar(100)'))  AS ActualAmountReceived,
                Main.ELEMENT.value('(ScheduleBatchJournalId)[1]', 'bigint')                              AS ScheduleBatchJournalId,
                Main.ELEMENT.value('(RuleId)[1]', 'int')                                                 AS RuleId,
                Main.ELEMENT.value('(TransactionStatusId)[1]', 'int')                                    AS TransactionStatusId,
                Main.ELEMENT.value('(ActionId)[1]', 'int')                                               AS ActionId,
                Main.ELEMENT.value('(IsLate)[1]', 'char(1)')                                             AS IsLate,
                Main.ELEMENT.value('(IsPaymentReceived)[1]', 'char(1)')                                  AS IsPaymentReceived,
                Main.ELEMENT.value('(IsValidSchedule)[1]', 'char(1)')                                    AS IsValidSchedule,
                --Added by DC for 119
                Main.ELEMENT.value('(IsCatchupBalanced)[1]', 'char(1)')                                  AS IsCatchupBalanced,
                Main.ELEMENT.value('(CatchupBalanceIdentifier)[1]', 'nvarchar(1000)')                    AS CatchupBalanceIdentifier,
                @HasModified
            FROM @XMLParams.nodes('(/ROOT/DATA)') AS Main(ELEMENT)
        END --END IF
    END TRY --Main END TRY
    BEGIN CATCH --Main BEGIN CATCH
        SELECT  @ErrorMessage   = @ErrorDesc + CHAR(13) + ERROR_MESSAGE(),
                @ErrorSeverity  = ERROR_SEVERITY(),
                @ErrorState     = ERROR_STATE(),
                @ErrorNumber    = ERROR_NUMBER(),
                @ErrorProcedure = ERROR_PROCEDURE(),
                @ErrorLine      = ERROR_LINE()

        RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState, @ErrorNumber, @ErrorProcedure, @ErrorLine)
    END CATCH --Main END CATCH
END --Main END

BEGIN TRY --Exception Handling
    SET @ErrorDesc = 'Error Occured while fetching records from TIX_PAYMENT_SCHEDULE'

    SELECT  PaymentScheduleId, OwedAmountId, ProposalId, DueDate, OverdueDate, ExpectedAmount,
            TransactionStatusId, IsPaymentReceived, IsLate, ActionId, ActualAmountReceived,
            IsValidSchedule, BrandId, CaseScheduleId, ReasonId, Comments, NoOfDays, ActionDate,
            IsCatchupBalanced, CatchupBalanceIdentifier, HasModified
    FROM    TIX_PAYMENT_SCHEDULE WITH (NOLOCK)
    WHERE   DueDate <= @ToDate
      AND   IsValidSchedule = @IsValidSchedule

    SELECT DISTINCT OwedAmountId, ProposalId, BrandId
    FROM    TIX_PAYMENT_SCHEDULE WITH (NOLOCK)
    WHERE   DueDate <= @ToDate
      AND   IsValidSchedule = @IsValidSchedule
    ORDER BY OwedAmountId, ProposalId, BrandId ASC

    SELECT DISTINCT ProposalId
    FROM    TIX_PAYMENT_SCHEDULE WITH (NOLOCK)
    WHERE   DueDate <= @ToDate
      AND   IsValidSchedule = @IsValidSchedule
    ORDER BY ProposalId ASC
END TRY
BEGIN CATCH
    SELECT  @ErrorMessage   = @ErrorDesc + CHAR(13) + ERROR_MESSAGE(),
            @ErrorNumber    = ERROR_NUMBER(),
            @ErrorState     = ERROR_STATE(),
            @ErrorProcedure = ERROR_PROCEDURE(),
            @ErrorLine      = ERROR_LINE(),
            @ErrorSeverity  = ERROR_SEVERITY()
I run the script below once a day to keep track of row counts over time. I would like to compare the results from today and yesterday to see if anyone deleted more than 20% of the data from any given table. How would I do this? I don't need to keep the data for more than a day, just long enough to compare the results.
Mon - run the script to collect row counts.
Tue - run the script to collect current row counts into a temp table, compare all row counts in both tables, purge Monday's records, and insert the current counts.
Wed - run the script to collect current row counts into a temp table and compare all row counts in both tables.
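The original script isn't shown, so the sketch below assumes a simple snapshot table (all names are placeholders); it keeps one day of history, compares today's counts against yesterday's, and flags tables whose row count dropped by more than 20%:

-- Hypothetical snapshot table: one row per user table per day.
CREATE TABLE dbo.RowCountSnapshot
(
    CaptureDate DATE          NOT NULL,
    TableName   NVARCHAR(300) NOT NULL,
    RowCnt      BIGINT        NOT NULL,
    CONSTRAINT PK_RowCountSnapshot PRIMARY KEY (CaptureDate, TableName)
);

-- Daily capture from the partition catalog.
INSERT INTO dbo.RowCountSnapshot (CaptureDate, TableName, RowCnt)
SELECT  CAST(GETDATE() AS DATE),
        QUOTENAME(SCHEMA_NAME(t.schema_id)) + '.' + QUOTENAME(t.name),
        SUM(p.rows)
FROM    sys.tables AS t
JOIN    sys.partitions AS p
  ON    p.object_id = t.object_id
 AND    p.index_id IN (0, 1)          -- heap or clustered index only
GROUP BY t.schema_id, t.name;

-- Flag tables that lost more than 20% of their rows since yesterday.
SELECT  y.TableName, y.RowCnt AS YesterdayRows, t.RowCnt AS TodayRows
FROM    dbo.RowCountSnapshot AS y
JOIN    dbo.RowCountSnapshot AS t
  ON    t.TableName   = y.TableName
 AND    t.CaptureDate = CAST(GETDATE() AS DATE)
WHERE   y.CaptureDate = DATEADD(DAY, -1, CAST(GETDATE() AS DATE))
  AND   t.RowCnt < y.RowCnt * 0.8;

-- Purge anything older than yesterday so only one day of history is kept.
DELETE FROM dbo.RowCountSnapshot
WHERE CaptureDate < DATEADD(DAY, -1, CAST(GETDATE() AS DATE));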
I have a source table in the staging database stg.fact and it needs to be merged into the warehouse table whs.Fact.
stg.fact is not a delta feed; it is basically an intra-day refresh.
Both tables have a last-updated date, so it's easy to see which rows have changed.
It is the new (insert) or changed (update) data that I am interested in; there are no deletions.
As there could be millions of rows that are inserts or updates, this needs to be efficient.
I expect whs.Fact to go to >150 million rows.
When I have done this before, I started with the T-SQL MERGE statement, and that was not performant once I got to this size.
My original option was to do this in SSIS with a Lookup task that marks the inserts and updates and deals with them separately. However, when I set up the Lookup transformation, the reference data set needed a package variable in the SQL command, and that does not seem possible with the Lookup in 2012! I am currently looking at the Merge Join transformation, and at any reasonably simple T-SQL that could work, as this will need to be fast - and that's where I think T-SQL may be the better route.
Both tables will have more than 100,000,000 rows. Both tables have the last-updated date. The tables are in different databases but on the same SQL instance. Each table holds five integer columns, one varchar, and one datetime.
Last time I used MERGE it was a wider table with lots of columns, so I don't know if this would be an option.
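A hedged T-SQL alternative to MERGE for this pattern: do the update and the insert as two separate set-based statements driven by the last-updated date. Column names are placeholders, a key column FactKey and an index on it in whs.Fact are assumed, and since the tables sit in different databases on the same instance the names would really be three-part:

-- 1. Update rows that already exist in the warehouse and have changed.
UPDATE  w
SET     w.Col1 = s.Col1,
        w.Col2 = s.Col2,
        w.LastUpdated = s.LastUpdated
FROM    whs.Fact AS w
JOIN    stg.fact AS s
  ON    s.FactKey = w.FactKey
WHERE   s.LastUpdated > w.LastUpdated;

-- 2. Insert rows that do not exist in the warehouse yet.
INSERT INTO whs.Fact (FactKey, Col1, Col2, LastUpdated)
SELECT  s.FactKey, s.Col1, s.Col2, s.LastUpdated
FROM    stg.fact AS s
WHERE   NOT EXISTS (SELECT 1 FROM whs.Fact AS w WHERE w.FactKey = s.FactKey);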
If F.parent_id (101) = T.team_id (101) and T.team_id (101) = T.parent_folder_id (101), then the output should come out as 'Mobile/c' (this is for F.parent_id = 101).
If F.parent_id = T.team_id and T.team_id != T.parent_folder_id, then the parent_folder_id has to start a search on the team_id column where it finds a match, and pick the Team_name from that corresponding id.
Example: F.parent_id = 202 matches T.team_id (202), but this T.team_id (202) does not match T.parent_folder_id (200), so this T.parent_folder_id (200) has to be searched against T.id (200); if T.id (200) now matches T.parent_folder_id (200), then it has to give the names from the start of the hierarchy.
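The exact schema isn't shown, so the sketch below only illustrates the general technique: a recursive CTE that walks parent_folder_id up to the root and concatenates the team names into a path. All table and column names, and the rule that a row is a root when team_id = parent_folder_id, are assumptions:

-- Assumed shape: dbo.T (id, team_id, parent_folder_id, team_name), dbo.F (folder_id, parent_id).
;WITH TeamPath AS
(
    -- Anchor: root teams (team_id equals parent_folder_id).
    SELECT  t.team_id,
            CAST(t.team_name AS NVARCHAR(4000)) AS full_path
    FROM    dbo.T AS t
    WHERE   t.team_id = t.parent_folder_id

    UNION ALL

    -- Recurse: child teams append their name to the parent's path.
    SELECT  c.team_id,
            CAST(p.full_path + '/' + c.team_name AS NVARCHAR(4000))
    FROM    dbo.T AS c
    JOIN    TeamPath AS p
      ON    p.team_id = c.parent_folder_id
    WHERE   c.team_id <> c.parent_folder_id
)
SELECT  f.parent_id, tp.full_path
FROM    dbo.F AS f
JOIN    TeamPath AS tp
  ON    tp.team_id = f.parent_id;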
I am trying to join two tables, and it looks like the data is messed up. I want to split the rows apart, as there is more than one value in a row, but somehow I don't see a pattern here to split the rows on.
This is how the data looks:
CREATE TABLE #Sample (Numbers VARCHAR(MAX))

INSERT INTO #Sample VALUES ('1000')
INSERT INTO #Sample VALUES ('1024 AND 1025')
INSERT INTO #Sample VALUES ('109 ,110,111')
INSERT INTO #Sample VALUES ('Old # 1033 replaced with new Invoice # 1544')
INSERT INTO #Sample VALUES ('1355 Cancelled and Invoice 1922 added')

SELECT * FROM #Sample
This is what is expected...
CREATE TABLE #Result (Numbers VARCHAR(MAX))

INSERT INTO #Result VALUES ('1000')
INSERT INTO #Result VALUES ('1024')
INSERT INTO #Result VALUES ('1025')
INSERT INTO #Result VALUES ('109')
INSERT INTO #Result VALUES ('110')
[Code] ....
How can I implement this? I believe that wherever there are numbers in a row, I need to split them out.
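A hedged sketch of one way to pull every run of digits out of each string, using an inline tally of character positions. It matches the #Sample data above; whether every extracted number is actually wanted (for example both the old and the replacement invoice numbers) is a business-rule decision:

;WITH Tally AS
(
    -- Enough character positions to cover the longest string.
    SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects
)
SELECT  s.Numbers AS OriginalValue,
        SUBSTRING(s.Numbers, t.n,
                  PATINDEX('%[^0-9]%',
                           SUBSTRING(s.Numbers, t.n, LEN(s.Numbers)) + 'X') - 1) AS ExtractedNumber
FROM    #Sample AS s
JOIN    Tally   AS t
  ON    t.n <= LEN(s.Numbers)
 AND    SUBSTRING(s.Numbers, t.n, 1) LIKE '[0-9]'                        -- position is a digit...
 AND    (t.n = 1 OR SUBSTRING(s.Numbers, t.n - 1, 1) NOT LIKE '[0-9]');  -- ...and starts a digit run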
We have the query below, which pulls in sales and revenue information. Since the sale is recorded in just one month and the revenue is recorded every month, we need the results of this query to list the sales amount only once, while still showing all the revenue amounts for each month. In this example the sale is recorded in year 2014, month 10, but there are revenues in every month for the rest of 2014 and the start of 2015, and we only want the sales amount to appear once in the result set.
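The query itself isn't included here, so this is only a sketch of one common approach, with hypothetical names: use ROW_NUMBER to keep the sales amount on the first month of each sale and blank it out on the rest.

SELECT  r.SaleId,
        r.[Year],
        r.[Month],
        CASE WHEN ROW_NUMBER() OVER (PARTITION BY r.SaleId
                                     ORDER BY r.[Year], r.[Month]) = 1
             THEN r.SalesAmount
             ELSE NULL
        END AS SalesAmount,          -- shown only once per sale
        r.RevenueAmount              -- shown for every month
FROM    dbo.SalesRevenue AS r
ORDER BY r.SaleId, r.[Year], r.[Month];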
Is it possible to install and configure the MSDTC resource in a SQL Server 2000 cluster after SQL is installed and running?
When I recently went through a rebuild of my cluster, I forgot to install the resource before installing SQL Server. Now, if I install and bring the MSDTC resource online, the SQL disk groups will not fail over correctly and the SQL Server resource will not come online.
Thanks in advance for any help. I would really like to avoid rebuilding again.
Hi all, I was given a task to create householding logic on a table that has millions of records. First let me explain what householding is: let's say I have two records that have the same phone number; that means both records are under the same household, but this can get more complicated. This article explains it: http://www.teradata.com/t/page/115924/index.html. Anyone who has worked with householding knows that you need to scan the table many times to get all the households; I used a DTS package to do it. I tested the DTS on 11 records like the article did and that worked great, but once I went to millions of records each loop takes 2 hours or so, and I have no idea how many loops I will have to do. If anyone out there has worked with household queries and used SQL, your input would help me a lot. Thanks.
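A hedged, set-based sketch of the simple case (table and column names are assumptions, not from the post): seed every record as its own household, then repeatedly collapse households that share a phone number or an address until a full pass changes nothing. Each pass is one set-based UPDATE instead of a row-by-row loop.

-- Assumed shape: dbo.Customer (CustomerId INT, Phone VARCHAR(20), Addr VARCHAR(200), HouseholdId INT).
UPDATE dbo.Customer SET HouseholdId = CustomerId;   -- seed: each record is its own household

DECLARE @rows INT;
SET @rows = 1;

WHILE @rows > 0
BEGIN
    -- Pass 1: collapse households that share a phone number.
    UPDATE  c
    SET     HouseholdId = m.MinHouseholdId
    FROM    dbo.Customer AS c
    JOIN   (SELECT Phone, MIN(HouseholdId) AS MinHouseholdId
            FROM   dbo.Customer GROUP BY Phone) AS m
      ON    m.Phone = c.Phone
    WHERE   c.HouseholdId > m.MinHouseholdId;

    SET @rows = @@ROWCOUNT;

    -- Pass 2: collapse households that share an address; repeat until neither pass changes anything.
    UPDATE  c
    SET     HouseholdId = m.MinHouseholdId
    FROM    dbo.Customer AS c
    JOIN   (SELECT Addr, MIN(HouseholdId) AS MinHouseholdId
            FROM   dbo.Customer GROUP BY Addr) AS m
      ON    m.Addr = c.Addr
    WHERE   c.HouseholdId > m.MinHouseholdId;

    SET @rows = @rows + @@ROWCOUNT;
END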
While doing a migration using cursors over the sample data given below, it is taking many hours to complete the process. I want to know whether there is any way I can do this with a simple query.
I have only the Amount value; the Balance has to be a calculated value based on CalType. At the same time, the Balance has to be reset to 0 when the AcNo changes.
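The sample data isn't included here, so this is only a sketch of the set-based pattern, with assumed column names (AcNo, TranId, TranDate, CalType, Amount): a windowed running sum, partitioned by AcNo so the balance resets to 0 whenever the account number changes, with CalType deciding whether an amount adds to or subtracts from the balance. The ROWS clause needs SQL Server 2012 or later.

SELECT  t.AcNo,
        t.TranDate,
        t.CalType,
        t.Amount,
        SUM(CASE WHEN t.CalType = 'D' THEN t.Amount ELSE -t.Amount END)
            OVER (PARTITION BY t.AcNo
                  ORDER BY t.TranDate, t.TranId
                  ROWS UNBOUNDED PRECEDING) AS Balance
FROM    dbo.Transactions AS t
ORDER BY t.AcNo, t.TranDate, t.TranId;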
I have been trying to store millions of files (layer tiles) in NTFS for a while now with little success, as the read/write speed drops off the chart or NTFS itself gets corrupted. My questions are:
1) Is there any way to store millions of files successfully in NTFS?
2) What does VE use to store its tiles (I am guessing SQL Server)?
3) If VE does not use SQL Server to store tile images, has anyone tried it, and what are the pros/cons?
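On the SQL Server option, a minimal sketch of a tile table (the names and key choice are assumptions): storing each tile as varbinary(max), keyed by zoom level and x/y coordinates, keeps each lookup to a single clustered-index seek and avoids the many-small-files problem entirely.

CREATE TABLE dbo.MapTile
(
    ZoomLevel TINYINT        NOT NULL,
    TileX     INT            NOT NULL,
    TileY     INT            NOT NULL,
    TileImage VARBINARY(MAX) NOT NULL,   -- the PNG/JPEG bytes
    CONSTRAINT PK_MapTile PRIMARY KEY CLUSTERED (ZoomLevel, TileX, TileY)
);

-- Fetching one tile is a single index seek.
SELECT TileImage
FROM   dbo.MapTile
WHERE  ZoomLevel = 12 AND TileX = 655 AND TileY = 1583;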
Hi all, I'm new here, so I'll start with a little introduction of myself. My name is Arjan, I'm 19 years old, from Holland, and I work for a company to complete my ICT education.
My situation:
My boss gave me a server with Server 2003 Standard, SQL Server 2005, and Visual Studio 2005 already installed. He asked me if I could figure out how the 'new' Reporting Services works. I'm pretty new to SQL and Reporting Services, but I figured out I had to install ASP.NET, the .NET Framework, and IIS.
So right now I want to start the Reporting Services Configuration Manager, and I get an error that says 'Invalid namespace'; when I try to reach it through my browser I get 'page not found', so obviously a lot is wrong. I asked my boss if I could reinstall everything and do it in the correct order (IIS / ASP.NET / the .NET Framework before installing SQL Server 2005), but that was not an option because we don't seem to have the CDs anymore.
The server is not connected to any network or the internet.
My Question:
Is there any way to fix this? And if so, could anyone tell me where to start?
Thanks in Advance!
PS: if there is any information or logs that you need in order to help me, just say so :-)
Hi, there are about 30 million records on my MSSQL server and I want to access 2 million of them at one time. However, when I try to access them with a SQL command I get a timeout error. I want to select the first 100 records, then the next 100, and so on. How can I do this? For example: select * from tbl_Customer where name = @name_ -> timeout error. Someone said that you can solve this problem with cursors, but I can't find enough articles about it. Thanks...
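Rather than a cursor, one sketch of keyset ("seek") paging, assuming tbl_Customer has an indexed key column CustomerId and an index on name (those names and the extra City column are assumptions): each call returns the next 100 rows after the last key the client saw, so no batch has to scan all 2 million matching rows.

DECLARE @name   VARCHAR(100);
DECLARE @lastId INT;

SET @name   = 'Smith';
SET @lastId = 0;          -- pass the highest CustomerId from the previous batch

SELECT TOP (100) CustomerId, name, City
FROM   tbl_Customer
WHERE  name = @name
  AND  CustomerId > @lastId
ORDER BY CustomerId;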
I have a query to delete millions of records. I want to delete in batches of 1,000. My SELECT join statement will return millions of records, so this takes a lot of time. How do I select 1,000 records, delete everything that is not in those records, and loop without selecting the same records again? Here is what I have:
DECLARE @i INT

WHILE (1=1)
BEGIN
    BEGIN TRAN

    DELETE TOP (1000)
    FROM dbo.ABC123
    WHERE SUBSTRING(dumbdumb, 1, 8) NOT IN
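The snippet above is cut off after NOT IN, so this is only a hedged sketch of the general loop pattern: commit each 1,000-row batch and stop when a batch deletes nothing, so the same rows are never re-selected. The keep-list subquery is a placeholder for whatever the original join returns.

WHILE (1 = 1)
BEGIN
    BEGIN TRAN;

    DELETE TOP (1000)
    FROM dbo.ABC123
    WHERE SUBSTRING(dumbdumb, 1, 8) NOT IN
          (SELECT k.KeepKey FROM dbo.KeysToKeep AS k);   -- placeholder for the real join/select

    IF @@ROWCOUNT = 0
    BEGIN
        COMMIT TRAN;   -- nothing left to delete
        BREAK;
    END

    COMMIT TRAN;       -- commit the batch so the log space can be reused
END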
The issue: SQL 2K. I have to keep the data from the last 3 months in the database. Every day I have to load 2 million records into the database, so every day I also have to export (into another database used as a historical data container) and delete the 2 million records inserted 3 months and one day ago. The main problem is that the delete operation takes a while and involves the transaction log. The questions are: 1) How can I improve this operation (export/delete)? 2) If we decide to migrate to SQL 2005, can we use some feature, such as partitioning, to solve the problem? In Oracle I can use the "truncate partition" statement, but in SQL 2005, from what I'm reading, that can't be done. The idea would be to create partitions on the last three months to split the data. Can the partitioning function be dynamic, or contain a function that says "the last 3 months"? I don't think so. Can you help us? Thank you, Mastino
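On the SQL 2005 question: there is no TRUNCATE PARTITION, but switching the oldest partition out to an empty staging table and truncating that is the usual metadata-only equivalent. A hedged sketch follows; the table, function, scheme, partition number, and boundary dates are all assumptions, and the staging table must match the partitioned table's structure and sit on the same filegroup.

-- Move the oldest partition's rows out of the main table in a metadata-only operation.
ALTER TABLE dbo.DailyLoad
    SWITCH PARTITION 1 TO dbo.DailyLoad_Staging;

-- Archive the rows from the staging table if needed, then throw them away cheaply.
TRUNCATE TABLE dbo.DailyLoad_Staging;

-- Merge the now-empty boundary and create a new one for the next period.
ALTER PARTITION FUNCTION pf_DailyLoad()
    MERGE RANGE ('20070701');

ALTER PARTITION SCHEME ps_DailyLoad
    NEXT USED [PRIMARY];

ALTER PARTITION FUNCTION pf_DailyLoad()
    SPLIT RANGE ('20071101');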
I'm trying to compare about 28 million records (record length 270) from tables A and B using the Lookup task as described in this forum. The process works fine with about two million records or so on my desktop (P4 3.39 GHz, 1.5 GB RAM), but hangs with the amount of data I'm trying to process. I tried using full and partial caching, but to no avail. I'm thinking this is a hardware resource problem. So, does anyone have any recommendations on the hardware needed for this kind of operation, and/or any suggestions? Thanks in advance...
Hello all, quick question - I'm looking for the most efficient way to extract data daily from a table with some 9.5 million records and growing. These are transaction records, and ideally I would like to bring over the last day's transactions and add them to my existing table. I cannot use the transaction date, as sometimes we have to operate in an "offline" mode where the records are brought over some time later - this could be days or, unfortunately, a week or more. There are some 30 fields in the transaction table, so is there a more efficient way to do this than simply creating a concatenated key? Would it be more efficient to drop and recreate the table daily? That sounds extreme, so I wanted to get a few ideas.
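One hedged sketch that avoids both the concatenated key and the daily drop/recreate, assuming the source table can be given a rowversion column (called timestamp in older versions) and that the table, column, and watermark names below are placeholders: rowversion changes on every insert or update regardless of the transaction date, so the daily extract just pulls everything past the last value it copied.

-- One-time setup on the source (a rowversion value is assigned automatically).
ALTER TABLE dbo.Transactions ADD RowVer ROWVERSION;

-- Daily extract: copy everything changed since the last run.
DECLARE @LastExtracted BINARY(8);
SELECT @LastExtracted = LastRowVer FROM dbo.ExtractWatermark;   -- stored from the previous run

INSERT INTO dbo.Transactions_Copy (TranId, Col1, Col2)          -- plus the remaining ~30 columns
SELECT  s.TranId, s.Col1, s.Col2
FROM    dbo.Transactions AS s
WHERE   s.RowVer > @LastExtracted;

-- Remember the new high-water mark for tomorrow's run.
UPDATE dbo.ExtractWatermark
SET    LastRowVer = (SELECT TOP (1) RowVer
                     FROM dbo.Transactions
                     ORDER BY RowVer DESC);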