I have some problems with our database which is growing too large, and was hoping someone might have some tips on what I can do!
I have about 100 clients, each logging about 10 000 rows of status logs a day. So after just a few days the db is growing very large.
At present it's manageable, since I don't need to "dig" into the logs more than a few times a day. The system itself is not affected by the size of the log or traffic on the server. But it will increase to about 500 clients in 2004, and 1000-1500 in 2005. So I really need a smarter solution than what I have today to be able to use the log efficiently.
98-99% of these rows are status messages which are more or less garbage during normal operation. But I still need to keep them in case an error occurs and we need to go back an hour or two (maybe a day) to see what went wrong. After 24-48 hours these 98-99% are of no use. I would, however, like to keep the remaining 1-2%; they are messages like startup, errors, etc. Ideally they should be logged in two separate tables by the clients, but unfortunately I cannot make the clients change their logging.
This presents problems on multiple levels. Mainly in searching, which often times out, but also with backups and storage space. At the moment I check the system for errors, and every other day I just truncate the log. It works, but it's not exactly elegant...
The server is a 1100 MHz P3 / 512 MB / Windows 2000 Server / SQL Server 2000. Faster hardware would help, but this is more of a "bad design" problem than a "slow hardware" problem.
My log is pretty simple, as follows:
LogId - int - primary key - clustered index
ClientId - int - index asc
LogTypeId - int - index asc
LogValue - nvarchar(2500) - no index
LogTimeStamp - datetime - index asc
I have come up with 3 different solutions:
Method 1:
Simply run "Delete from db_log where logtyipeid <> stuff_I_want_to_keep".
This is the simplest and the one i prefer, but it takes too long time to complete. Any tips to speed this process up?
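For Method 1, one way to keep the statement from running as a single huge transaction is to delete in batches; on SQL Server 2000 that usually means SET ROWCOUNT. A minimal sketch, assuming the table and columns above, with placeholder LogTypeId values for the rows to keep:

SET ROWCOUNT 5000                        -- limit each DELETE to 5,000 rows

WHILE 1 = 1
BEGIN
    DELETE FROM db_log
    WHERE  LogTypeId NOT IN (2, 5)       -- placeholders for "stuff I want to keep"

    IF @@ROWCOUNT = 0 BREAK              -- nothing left to purge
END

SET ROWCOUNT 0                           -- restore the default

Each batch commits on its own, so locks are released and the transaction log can be reused between batches instead of growing for the whole delete.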
Method 2:
Create a scheduled job (rather than a trigger, since a trigger can't fire on a timer) that runs something like "DELETE FROM db_log WHERE logtypeid <> stuff_I_want_to_keep AND date < today_minus_two_days" every hour or so. This will ensure that the db doesn't grow too large. But if I'm away from work for a few days we might lose data we wanted to keep.
Method 3:
Copy what I want to keep into another table, and empty the log. Sort of like "Insert into db_log_keep stuff_to_keep; drop db_log; create table db_log; " (or truncate, but that takes a long time too)
But then I would be stuck with two log tables, "48-hour_db_log" and "db_log_keep". I could use a view to "union" them so they would appear as a single table, but that's not ideal either.
However, it seems this method is what will work best for my setup, unless there are other suggestions?
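If Method 3 wins out, a rough outline could look like the following; the column list comes from the schema above and the LogTypeId values are placeholders for the types worth keeping:

BEGIN TRAN

    INSERT INTO db_log_keep (LogId, ClientId, LogTypeId, LogValue, LogTimeStamp)
    SELECT LogId, ClientId, LogTypeId, LogValue, LogTimeStamp
    FROM   db_log
    WHERE  LogTypeId IN (2, 5)           -- placeholders for the types worth keeping

    TRUNCATE TABLE db_log                -- fast: only page deallocations are logged

COMMIT

Wrapping both statements in one transaction means clients that try to log during the swap are briefly blocked rather than having rows fall between the copy and the truncate; note that TRUNCATE fails if db_log is referenced by a foreign key.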
Method 4:
...eagerly awaiting ideas!!! :-)
(Also, any tips and/or links to info on maintaining VLDBs are greatly appreciated.)
We have a database that is enabled for mirroring. We need to delete old records, around 500k rows, from a table, but it has foreign key relationships. How should this kind of delete be done on production servers?
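One common pattern for a mirrored production database is to purge in small batches, removing the referencing (child) rows first so the foreign key is never violated. A sketch with hypothetical table and column names and an assumed cutoff:

DECLARE @cutoff datetime
SET @cutoff = DATEADD(YEAR, -1, GETDATE())      -- assumed retention cutoff

-- child rows first, so the foreign key is never violated
WHILE 1 = 1
BEGIN
    DELETE TOP (10000) c
    FROM   dbo.ChildTable AS c
           INNER JOIN dbo.ParentTable AS p
               ON p.ParentId = c.ParentId
    WHERE  p.CreatedDate < @cutoff

    IF @@ROWCOUNT = 0 BREAK
END

-- then the parent rows
WHILE 1 = 1
BEGIN
    DELETE TOP (10000)
    FROM   dbo.ParentTable
    WHERE  CreatedDate < @cutoff

    IF @@ROWCOUNT = 0 BREAK
END

Running this during a quiet window and taking frequent log backups should help keep the mirroring send queue from ballooning.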
In another forum post, a poster was deleting large numbers of rows from a table in batches of 50,000.
In the bad old days ('80s - '90s), I used to have to delete rows in batches of 500, then 1000, then 5000, due to the size of the transaction rollback segments (yes - Oracle).
I always found that increasing the number of deleted rows in a single statement/transaction improved overall process speed - up to some magic point, at which some overhead in the system began slowing the deletes down, so that deleting a single batch of 10,000 rows took more than twice as much time as deleting two batches of 5,000 rows each.
Are there good rule-of-thumb numbers (or, even better, actual statistics and/or explanations) for how many records should be deleted in a single transaction/statement for optimum speed? 50,000 - 100,000 - 1,000,000 or unlimited? Are there significant differences between 2008, 2012, and 2014?
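The sweet spot tends to be workload-specific, so one practical approach is to measure it: parameterize the batch size, run the same purge at a few sizes, and compare rows deleted per second. A rough harness with a placeholder table and filter (SYSDATETIME needs 2008 or later):

DECLARE @BatchSize int, @Rows int, @Total bigint, @Start datetime2
SET @BatchSize = 50000                  -- try 5,000 / 50,000 / 500,000 and compare
SET @Total = 0
SET @Start = SYSDATETIME()

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) FROM dbo.BigTable        -- placeholder table
    WHERE  CreatedDate < '20140101'                  -- placeholder filter

    SET @Rows = @@ROWCOUNT
    IF @Rows = 0 BREAK
    SET @Total = @Total + @Rows
END

SELECT @Total AS RowsDeleted,
       DATEDIFF(SECOND, @Start, SYSDATETIME()) AS ElapsedSeconds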
We are busy designing a generic analytical system at work that will hold multiple analytic types over time. This system is being developed in SQL 2000.
Example of table:
IDENTITY int
ItemId int [PK]
AnalyticType int [PK]
AnalyticDate DateTime [PK]
Value numeric(28,15)
ItemId - the item for which the analytic is being stored
AnalyticType - an arbitrary type
The [PK] tag indicates the composite primary key.
Our scenario is the following:
* For this time series data, we expect around 250 days per year (working days) and the dataset could extend to over 20 years
* Up to 50 analytic types
* Up to 20,000 items
Looking at the combined calculation, this comes to roughly something like 25 * 20,000 * 50 * 250, or around 5 billion rows.
We will be inserting around 50 * 20,000, or around 1 million, rows each day (the inserts will take place in the middle of the night, outside the main query time); this could be done through something like BCP or BULK INSERT.
Our real problem is we have not previously worked with such large tables before and are nervous that our system is going to grind to a halt. Our biggest tables are around 20 million rows at the moment.
Scanning through Google and Microsoft's own site we have found a partitioning method that is available: http://www.microsoft.com/resources/...art5/c1861.mspx
Having experimented with the above system it seems rather quirky, and looking at the available literature it seems that this is not more effective than a clustered index as far as queries go.
It needs to be optimized for queries like: given the ItemID and the AnalyticType, search for a specific date or a specific range of dates.
If anyone has any experience or helpful suggestions I would really appreciate it.
Thanks,
A
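For the query pattern described (equality on ItemId and AnalyticType, a date or date range on AnalyticDate), a composite clustered primary key in that column order keeps each range scan on contiguous pages. A sketch; the table and IDENTITY column names are assumed:

CREATE TABLE dbo.Analytic                       -- table name assumed
(
    AnalyticId    int IDENTITY(1, 1) NOT NULL,  -- assumed name for the IDENTITY column
    ItemId        int            NOT NULL,
    AnalyticType  int            NOT NULL,
    AnalyticDate  datetime       NOT NULL,
    Value         numeric(28,15) NOT NULL,
    CONSTRAINT PK_Analytic
        PRIMARY KEY CLUSTERED (ItemId, AnalyticType, AnalyticDate)
)

-- the kind of query the clustered key is meant to serve
SELECT Value
FROM   dbo.Analytic
WHERE  ItemId = 42
  AND  AnalyticType = 7
  AND  AnalyticDate BETWEEN '20040101' AND '20040331'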
A SELECT query returns around 1 million rows. The column in the WHERE condition is indexed. The query takes nearly 1 minute to return all the records. Is this normal?
Does the number of records returned affect performance despite the indexing?
I need to insert a very large number of rows into a table (in SQL Server 7.0) using ADO. Could you please tell me if there is a way to do a FAST insert, something similar to BCP, or any other way of inserting a large number of rows efficiently?
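If the rows can be written to a flat file first, the BULK INSERT statement (available since SQL Server 7.0) can be issued through ADO like any other command and gets close to bcp speed. A sketch with an assumed file path, table name and delimiters:

BULK INSERT dbo.TargetTable                  -- placeholder table
FROM 'C:\load\rows.txt'                      -- placeholder file path
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK                                  -- table lock for minimally logged load
)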
Table structure: col1 IDENTITY (seed=1, increment=1) + a few other columns (col2...col7) + one text column (col8). I have around 50,000,000 rows per day inserted into the table T1. At the end of the day 40,000,000 rows are deleted. I have to keep the records for 12 months and then archive them. The database is a 24/7 web-serving system and no downtime is allowed. The IDENTITY column will go out of range (overflow) after less than two years unless the identity seed is reset to the start value (seed=1, increment=1). At the end of the 12th month the data is archived into another table and only the last month is kept in table T1, so T1 enters the new year with data from the last month of the previous year. There are a few other tables that refer to this table using their own field with values from T1's IDENTITY column (referential integrity is not enforced). The identity column in T1 is needed as a unique id for some search actions. Performance is an issue, therefore the bigint data type is used for this identity column rather than decimal.
Another problem I have is how to do a table update on one column (1 million rows to be updated out of 2 million rows) with minimum impact on the users who are querying this table heavily. Needless to say, it is a 24/7 web app with no downtime.
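One low-impact pattern for the bulk update is to walk the table in ranges of the identity key, so each transaction stays short and locks are released between slices. A sketch; the column names and the new-value expression are placeholders:

DECLARE @FromId bigint, @MaxId bigint, @Step bigint
SET @Step = 50000
SELECT @FromId = MIN(Id), @MaxId = MAX(Id) FROM dbo.T1     -- Id = assumed identity column name

WHILE @FromId <= @MaxId
BEGIN
    UPDATE dbo.T1
    SET    Col2 = 'new value'                      -- placeholder expression
    WHERE  Id >= @FromId AND Id < @FromId + @Step
      AND  Col2 <> 'new value'                     -- skip rows that already have the value

    SET @FromId = @FromId + @Step

    WAITFOR DELAY '00:00:01'                       -- give concurrent readers a breather
END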
I need to add a datetime column to an existing table that has about 1.2 million records and is being accessed frequently, but I can't afford to stop the db at all.
Whenever I run: alter table mytable add Updated_date datetime
it just takes too long and I have to stop executing the query after a couple of minutes. I am running SQL Express 2005 SP2; the db size is over 3 GB but still under the 4 GB limit.
Can you please advise on how to add this column? It's urgent!
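For what it's worth, adding the column as NULLable with no default should be a metadata-only change on SQL Server 2005, so the statement itself is quick; it mostly just needs a brief schema-modification lock, which can sit waiting behind long-running queries on a busy table. A minimal sketch:

ALTER TABLE dbo.mytable
    ADD Updated_date datetime NULL       -- NULL, no default: metadata-only change

Running it at a quiet moment (or in a short maintenance window) keeps it from queuing behind active readers.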
/****** Object: StoredProcedure [dbo].[dbo.ServiceLog] Script Date: 07/18/2014 14:30:59 ******/ SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO ALTER proc [dbo].[ServiceLogPurge]
-- Purge records dbo.ServiceLog older than 3 months: -- Purge records in small portions to avoid locking production tables -- for a long time. The process takes longer, but can co-exist with -- normal usage of the tables.
[Code] ...
*** Getting this error below when executing the code ***
Msg 102, Level 15, State 1, Procedure ServiceLogPurge, Line 45 Incorrect syntax near 'Failed:'.
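Without seeing the hidden body it is hard to pin down the stray token near 'Failed:' (often an unquoted string literal in a PRINT or RAISERROR call), but for comparison, a minimal skeleton of the batched purge the header comments describe might look like this; the date column name and batch size are assumptions:

ALTER PROC [dbo].[ServiceLogPurge]
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @BatchSize int
    SET @BatchSize = 5000                       -- assumption: tune as needed

    WHILE 1 = 1
    BEGIN
        DELETE TOP (@BatchSize)
        FROM   dbo.ServiceLog
        WHERE  LogDate < DATEADD(MONTH, -3, GETDATE())   -- assumed date column

        IF @@ROWCOUNT = 0 BREAK
    END
END
GO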
New monthly data is loaded, checked, and finally approved after 6 or 7 iterations. Because of this iteration the monthly data set is added, then deleted, then added, then deleted a few times. Because the table is big, this process takes time; any thoughts on how to make the delete/insert process faster? Keep in mind that I cannot do much, because it is a production table and is being accessed by other users for other analysis.
Delete is done based on trx_date which is a year/month combo, like 201508.
The table has monthly sales by customer aggregated.
The table structure is:
CREATE TABLE [dbo].[Sales](
    [batch_key] [int] NOT NULL,
    [Company_key] [int] NOT NULL,
    [customer_key] [char](22) NOT NULL,
    [Trx_Date] [int] NOT NULL,
    [account] [nvarchar](35) NOT NULL,
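One way to shorten the blocking during each reload is to delete the month in small batches keyed on Trx_Date. A sketch (the batch size is an assumption); an index leading on Trx_Date keeps each batch from scanning the whole table:

DECLARE @month int
SET @month = 201508                      -- the trx_date year/month being reloaded

WHILE 1 = 1
BEGIN
    DELETE TOP (20000)                   -- assumed batch size
    FROM   dbo.Sales
    WHERE  Trx_Date = @month

    IF @@ROWCOUNT = 0 BREAK
END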
We are running SQL Server 2005 Enterprise Edition with SP2 on a Windows 2003 Enterprise Server SP2 with an Intel E6600 dual-core CPU and 4 GB of RAM. We have a C# application which performs a large number of calculations in a loop. The application first loads the transactions that need to be updated, then goes through each row, queries another table to get some values, and updates the transaction.
I have set a limit of 2 GB of RAM for SQL Server, and when I run the application it performs 5 record updates (the process described above) per second. After roughly 10,000 records, the application slows down to about 1 record per second. I have tried to examine Activity Monitor, but I can't find anything that might indicate what's causing this.
I have read that there are some known issues with hyper-threaded CPUs; however, since my CPU is dual-core, I do not know if the issue applies to those CPUs too, and I have no option to disable one core in the BIOS.
The only thing I have noticed is that if I change the Max Degree of Parallelism when the server slows down (i.e. from 0 to 1 and then back to 0), the server speeds up for another 10,000 record updates and then slows down again. Does anyone have an idea of what's causing this? What does the property change do that makes the server speed up again?
If there is no solution to this problem, does anyone know if there is a stored procedure or anything else that can be used programmatically to speed up the server when it slows down? (This is not the optimal solution, but I would use it as a workaround.)
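For the workaround angle: the Max Degree of Parallelism setting being toggled in the UI is a server-wide sp_configure option, so the same change can be scripted and run from a job when the slowdown is detected. A sketch of the equivalent commands (this only reproduces the workaround; it doesn't explain the underlying slowdown):

EXEC sp_configure 'show advanced options', 1
RECONFIGURE

EXEC sp_configure 'max degree of parallelism', 1     -- restrict queries to one scheduler
RECONFIGURE

-- ...and back again:
EXEC sp_configure 'max degree of parallelism', 0     -- let SQL Server decide
RECONFIGURE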
I am trying to enable database mirroring for 100 databases. It goes error-free up to 59 databases (sometimes 60) with the status (Principal, Synchronized) on the principal. On the 60th or 61st database it gives the status (Principal, Disconnected). The mirror also starts acting abnormally: connections to the mirror start timing out, and it will not enable database mirroring on any more databases. I have SQL Server 2005 Enterprise with SP1 on the servers; a witness is not included yet.
These are my test servers... I have more than 500 databases on my production servers.
Principal and mirror are both using port 5022 for endpoint communication.
All of the databases are critical and all must be included in the database mirroring, so after that I tried to implement database mirroring again... The system has 3 GB of RAM, and SQL Server on the mirror is using 85 MB of RAM, but it still gives this error while trying to enable database mirroring for the 37th database:
"There is insufficient system Memory to run this query"
Hi, we are using the SQL Server 2005 Full-Text Service. The data is not huge, but it is the kind of data where each record is small and there are a large number of records: 35 million records now, with 11 GB of data and about 1.6 GB of full-text catalog on the table. This is expected to grow to at least 10 times this size. The issue is that FTS takes a long time to return results when the number of hits (rows) being returned is large; for some searches it takes a very long time. With the same data and catalog, full-text queries for less common words return in a timely manner. The nature of the problem doesn't allow us to return only the top results; we need all the results. So it's not about the size of the data but the number of results getting returned from FT (as the catalog is inverted). The machine is a dual processor with 4 GB RAM.
I am considering splitting the table, and hence the catalog, and using multiple servers to do full-text searches in smaller catalogs. Is there any other way this issue can be solved?
If splitting is the only way, can you give me an idea as to what is a statistical/standard limit on the number of search results/catalog size at which FTS still gives good results?
I have a 2-node cluster with 4 cores each, running 3 instances of SQL 2008 R2 Enterprise comprising 60 databases, 20 on each instance. I need to set up mirroring for each of the databases to a secondary server that has 4 cores and 3 instances. What I understand is that in this case the mirror server will provide a maximum of 512 worker threads and the 60 mirrored databases would consume 240 threads. What needs to be checked to assess the feasibility of going ahead with an async mirror setup as described above?
I've installed the MDW (Management Data Warehouse) database on our central monitoring SQL Server. I've then added a number of servers to be monitored. The data is collected on the servers that are being monitored and uploaded to the central MDW monitoring server.
On the servers that are being monitored, I'm seeing a large number (over 1000) of SPIDs being generated by 'SQL Server Data Collector'.
Is this normal behaviour? I've seen more blocking as a result of this.
Is there any way to reduce the number of SPIDs generated?
I am importing a text file with a column (serial numbers) containing alphanumeric data, some mixed and some purely numeric. The very large values that are all numeric are being converted to exponential notation when I run the file through an import package in SQL Server Integration Services (2005).
Ex. 4110041233214321 --> 4110040000000000 (displays as 4.11E+15)
In the past I dealt with this by importing the text file into Excel and changing the format of the column to Number. This works even when many of the values contain alpha characters. I am not sure how to accomplish the same thing without going through Excel. If you have any ideas on this I would be happy to hear from you.
I need to delete data from a particular table which has more than half a million records. The data that needs to be deleted amounts to more than 200,000 records. What is the best way to delete the data from the table, other than importing it into a temporary table and performing the same operation there?
Let me know if the strategy to be followed is okay.
1. Drop all the triggers.
2. Drop all the indexes.
3. Write a procedure with a loop, setting ROWCOUNT to 1000, and delete the records (since if I try to delete all the rows at once it gives a timeout error). The procedure will delete 1000 records per batch inside the loop until it wipes out all the data for the specified condition. (A sketch of this step is shown below.)
4. Recreate the indexes and triggers.
Please let me know if there are any other optimal solutions.
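For step 3 above, a sketch of the procedure on SQL Server 7.0, using SET ROWCOUNT and committing each batch separately so the log stays small; the table name and WHERE condition are placeholders:

CREATE PROCEDURE dbo.Purge_OldRows
AS
BEGIN
    DECLARE @rows int
    SET ROWCOUNT 1000                  -- step 3: delete 1,000 rows per batch

    WHILE 1 = 1
    BEGIN
        BEGIN TRAN
            DELETE FROM dbo.BigTable           -- placeholder table
            WHERE  status = 'expired'          -- placeholder: the condition selecting the ~200,000 rows
            SET @rows = @@ROWCOUNT
        COMMIT

        IF @rows = 0 BREAK
    END

    SET ROWCOUNT 0
END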
I have a table of 300+ GB. It holds 10 years of data. I need to delete 5 years of data and move it to another server so I can free up space.
If I delete 5 years of data, the transaction log gets huge, and the size of the database on disk actually grows because the .ldf file keeps growing. I think I can then shrink the log file and the data file. Is this the best way to do it?
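A variation that keeps the .ldf under control is to delete in batches and back up (or checkpoint) the log between batches so the space is reused rather than grown. A sketch with placeholder names (DELETE TOP needs SQL Server 2005 or later):

DECLARE @cutoff datetime
SET @cutoff = DATEADD(YEAR, -5, GETDATE())

WHILE 1 = 1
BEGIN
    DELETE TOP (50000)
    FROM   dbo.History                 -- placeholder table
    WHERE  EventDate < @cutoff         -- placeholder date column

    IF @@ROWCOUNT = 0 BREAK

    BACKUP LOG MyDb TO DISK = 'E:\backup\MyDb_log.trn'   -- full recovery model
    -- CHECKPOINT                                        -- use this instead under simple recovery
END

Shrinking the data and log files afterwards will reclaim disk, but shrinking the data file fragments indexes, so plan an index rebuild if you go that route.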
I have a source table in the staging database stg.fact and it needs to be merged into the warehouse table whs.Fact.
stg.fact is not a delta feed it is basically an intra-day refresh.
Both tables have a last updated date, so it's easy to see which rows have changed.
It will be new (insert) or changed (update) data that I am interested in, there are no deletions.
As this could be in the millions of rows that are inserts or updates then this needs to be efficient.
I expect whs.Fact to go to >150 million rows.
When I have done this before I started with T-SQL Merge statement and that was not performant once I got to this size.
My original option was to do this in SSIS with a lookup task that marks the inserts and updates and deals with them separately. However, when I set up the lookup transformation, the reference data set needs a package variable in the SQL command, and that does not seem possible with the lookup in 2012! I am currently looking at the Merge Join transformation, and at any clever basic T-SQL that could work, as this will need to be fast; that's where I think T-SQL may be the better route.
Both tables will have >100,000,000 rows. Both tables have the last updated date. The tables are in different databases but on the same SQL instance. Each table holds 5 integer columns, one varchar, and one datetime.
Last time I used MERGE it was on a wider table with lots of columns, so I don't know if this would be an option.
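If MERGE stops scaling, splitting the work into a set-based UPDATE of changed rows followed by an INSERT of missing rows is often faster at this size. A sketch; the key and column names (FactKey, Col1, Col2, LastUpdated) are placeholders, and since the tables live in different databases the real statements may need three-part names:

-- 1) update rows that already exist in the warehouse and have changed
UPDATE w
SET    w.Col1 = s.Col1,
       w.Col2 = s.Col2,
       w.LastUpdated = s.LastUpdated
FROM   whs.Fact AS w
       INNER JOIN stg.Fact AS s
           ON s.FactKey = w.FactKey
WHERE  s.LastUpdated > w.LastUpdated

-- 2) insert rows that are not in the warehouse at all
INSERT INTO whs.Fact (FactKey, Col1, Col2, LastUpdated)
SELECT s.FactKey, s.Col1, s.Col2, s.LastUpdated
FROM   stg.Fact AS s
WHERE  NOT EXISTS (SELECT 1 FROM whs.Fact AS w WHERE w.FactKey = s.FactKey)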
DELETING 100 million rows from a table weekly (SQL Server 2000)
Hi all,
We have a table in SQL Server 2000 which has about 250 million records, and this will be growing by 100 million every week. At any time the table should contain just 13 weeks of data; when the 14th week's data needs to be loaded, the first week's data has to be deleted.
That means deleting 100 million rows every week, and since the delete takes a lot of transaction log space, the job is not successful.
Can you please help with what approaches we can take to fix this problem? Performance and the transaction log are the issues we are facing. We tried deletion in steps too, but that also takes time. What are the different ways we can address this quickly?
Please reply at the earliest.
Thanks,
Harish
I have stumbled on a problem with running a large number of SSIS packages in parallel, using the "dtexec" command from inside an SQL Server job.
I've described the environment, the goal and the problem below. Sorry if it's a bit long, but I tried to be as clear as possible.
The environment: Windows Server 2003 Enterprise x64 Edition, SQL Server 2005 32bit Enterprise Edition SP2.
The goal: We have a large number of text files that we're loading into a staging area of a data warehouse (based on SQL Server 2k5, as said above).
We have one "main" SSIS package that takes a list of files to load from an XML file, loops through that list and for each file in the list starts an SSIS package by using the "dtexec" command. The command is started asynchronously by using the system.diagnostics.process.start() method. This means that a large number of SSIS packages are started in parallel. These packages perform the actual loading (with BULK insert).
I have successfully run the loading process from the command prompt (using the dtexec command to start the main package) a number of times.
In order to move the loading to a production environment and schedule it, we have set up an SQL Server Agent job. We've created a proxy user with the necessary rights (the same user that runs the job from the command prompt) and created the SQL Agent job (there is one step of type "cmdexec" that runs the "main" SSIS package with the "dtexec" command).
If the input XML file for the main package contains a small number of files (for example 10), the SQL Server Agent job works fine: the SSIS packages are started in parallel and they finish work successfully.
The problem: When the number of the concurrently started SSIS packages gets too big, the packages start to fail. When a large number of SSIS package executions are already taking place, the new dtexec commands fail after 0 seconds of work with an empty error message.
Please bear in mind that the same loading still works perfectly from command prompt on the same server with the same user. It only fails when run from the SQL Agent Job.
I've tried to understand the limit at which the packages start to fail, and I believe that the threshold is 80 parallel executions (I understand that it might not be desirable to start so many SSIS packages at once, but I'd like to do it despite this).
Additional information:
The dtexec utility provides an error message where the package variables are shown along with the fact that the package ran 0 seconds, but the "Message" is empty ("Message: "). Turning logging on in all the packages does not provide an error message either, just a lot of run-time information. The try-catch block around the process.start() script in the main package's script task also does not reveal any errors. I've increased the "max worker threads" number for the cmdexec subsystem in the msdb.dbo.syssubsystems table to a safely high number and restarted the SQL Server, but this had no effect either.
The request:
Can anyone give ideas what could be the cause of the problem? If you have any ideas about how to further debug the problem, they are also very welcome. Thanks in advance!
Hi.. I want to store an RMVB file in SQL Server 2000 and read it back, so I can play the RMVB file on the web. The size of the RMVB file is more than 300 MB and less than 1 GB, so the SQL image field can hold it. My question is: how can I store and read the RMVB file from SQL Server 2000? I used SqlInsertCommand.ExecuteNonQuery() in my program, but it is too slow and raises an unknown error. Thank you for your help.
I'm in the process of migrating a lot of data (millions of rows, 4 GB+ of data) from an older SQL Server 7.0 database to a new SQL Server 2000 machine.
Time is not of the essence; my main concern during the migration is that when I copy in the new data, the new database isn't paralyzed by the amount of bulk copying being done. For this reason, I'm splitting the data into one-month chunks (the data is all timestamped and goes back about 3 years), exporting as CSV, compressing the files, and then importing them on the target server. The reason I'm using CSV is that we may want to also copy this data to other non-SQL Server systems later, and CSV is pretty universal. I'm also copying in this format because the target server is remotely hosted and is not accessible by any method except FTP and Remote Desktop; no database-to-database copying is allowed for security reasons.
My questions:
1) Given all of this, what would be the least intrusive way to copy over all this data? The target server has to remain running and be relatively uninterrupted. One of the issues that goes hand-in-hand with this is indexes: should I copy over all the data first and then create indexes, or allow SQL Server to rebuild indexes as I go?
2) Another option is to make a SQL Server backup of the database from the old server, upload it, mount it, and then copy over the data. I'm worried that this would slow operations down to a crawl, though, which is why I'm taking the piecemeal approach.
Comments, suggestions, raw fish?
I'm running a resource-intensive stored procedure which reads a file with about 50,000 lines with a BULK INSERT into a temp table, then goes through it and inserts a record for each line into another table. While this procedure is running, SQL Server stops accepting any other requests coming from the website.
Question: is there a way to make SQL Server "listen", or emulate an "interrupt" to other requests, while in the middle of a long intensive process?
I really appreciate your replies.
Thank you,
Oleg.
Hi all,
I am doing some large updates that may update 10,000-plus rows. This works fine when I execute the SQL directly in Query Analyzer. If I set the timeout on my VB connection to 0 (zero), the connection should not time out? But it does. If I set the timeout to a high value, say 1200, I get the same problem well within 1200 seconds. Also, I am getting the problem that the log fills up, but it is set to auto-grow?
Any ideas would be appreciated.
Thanks,
Greg
I have a table with approximately 5 million rows and 36 columns. It takes approximately 4 minutes to delete 1 row. The table has 3 indexes in addition to its primary key and has twelve foreign key constraints. We are still using SQL Server 7. There is a backup run every night as part of the nightly maintenance that reorganizes/reindexes and checks the database integrity. Any thoughts?
Hi, good morning to all.
My table: User_Group_Map (UserID UNIQUEIDENTIFIER, GroupID UNIQUEIDENTIFIER)
Now, I want to write one stored procedure that can insert rows into the above table, but many rows at once. That is, the program should allow multiple insertions without needing to call the stored procedure from the front end multiple times. Can anyone please help me with this? Thanks in advance...
Ashok Kumar.
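One option on SQL Server 2005 and later is to pass all the pairs in a single xml parameter and shred it with nodes(), so the procedure is called once regardless of how many rows are inserted. A rough sketch; the XML shape and procedure name are assumptions:

CREATE PROCEDURE dbo.User_Group_Map_InsertMany
    @Rows xml                          -- e.g. <r u="guid" g="guid"/><r u="guid" g="guid"/>
AS
BEGIN
    SET NOCOUNT ON

    INSERT INTO dbo.User_Group_Map (UserID, GroupID)
    SELECT x.r.value('@u', 'uniqueidentifier'),
           x.r.value('@g', 'uniqueidentifier')
    FROM   @Rows.nodes('/r') AS x(r)
END
GO

-- usage: one call inserts any number of rows
EXEC dbo.User_Group_Map_InsertMany
     @Rows = N'<r u="5F9619FF-8B86-D011-B42D-00C04FC964FF" g="6F9619FF-8B86-D011-B42D-00C04FC964FF"/>
               <r u="7F9619FF-8B86-D011-B42D-00C04FC964FF" g="6F9619FF-8B86-D011-B42D-00C04FC964FF"/>'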