I have a problem with our database, which is growing too large, and I was hoping someone might have some tips on what I can do!
I have about 100 clients, each logging about 10 000 rows of status logs a day. So after just a few days the db is growing very large.
At present it's manageable, since I don't need to "dig" into the logs more than a few times a day. The system itself is not affected by the size of the log or the traffic on the server. But it will increase to about 500 clients in 2004, and 1000-1500 in 2005. So I really need a smarter solution than what I have today to be able to use the log efficiently.
98-99% of these rows are status messages which are more or less garbage during normal operation. But I still need to keep them in case an error occurs and we need to go back an hour or two (maybe a day) to see what went wrong. After 24-48 hours these 98-99% are of no use. I would, however, like to keep the remaining 1-2%; they are messages like startup, errors, etc. Ideally they should be logged in two separate tables by the clients, but unfortunately I cannot make the clients change their logging.
This presents problems on multiple levels. Mainly in searching, which often times out, but also with backups and storage space. At the moment I check the system for errors, and every other day I just truncate the log. It works, but it's not exactly elegant...
The server is a 1100 MHz P3 / 512MB / Windows 2000 Server / SQL Server 2000. Faster hardware would help, but the problem is more of a "bad design" than "slow hardware" problem.
My log is pretty simple, as follows:
LogId - int - primary key - clustered index
ClientId - int - index asc
LogTypeId - int - index asc
LogValue - nvarchar(2500) - no index
LogTimeStamp - datetime - index asc
I have come up with 3 different solutions:
Method 1: Simply run "Delete from db_log where LogTypeId <> stuff_I_want_to_keep".
This is the simplest and the one I prefer, but it takes too long to complete. Any tips to speed this process up?
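One thing that usually helps when a single big DELETE runs too long is doing it in small batches, so each transaction stays short and the log doesn't balloon. A minimal sketch for SQL Server 2000, using the columns above (the LogTypeId list is a placeholder for whatever types you actually want to keep):

-- Delete throwaway rows in batches; SET ROWCOUNT caps how many rows each DELETE touches.
SET ROWCOUNT 5000

WHILE 1 = 1
BEGIN
    DELETE FROM db_log
    WHERE LogTypeId NOT IN (1, 2, 3)                 -- placeholder: the types you want to KEEP
      AND LogTimeStamp < DATEADD(day, -2, GETDATE()) -- keep the last 48 hours just in case

    IF @@ROWCOUNT = 0 BREAK                          -- nothing left to delete
END

SET ROWCOUNT 0                                       -- restore normal behaviour

Run from a scheduled job in quiet hours, each batch stays small enough that searches can still get in between batches.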
Method 2: Create a scheduled job which runs something like "Delete from db_log where LogTypeId <> stuff_I_want_to_keep and date < today_minus_two_days" every hour or so. This will ensure that the db doesn't grow too large. But if I'm away from work for a few days we might lose data we wanted to keep.
Method 3:
Copy what I want to keep into another table, and empty the log. Sort of like "Insert into db_log_keep stuff_to_keep; drop db_log; create table db_log; " (or truncate, but that takes a long time too)
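A rough sketch of Method 3 (column names taken from the schema above, and the LogTypeId list is again a placeholder), assuming nothing is written to the log between the copy and the truncate, or that the two steps are wrapped in a transaction:

-- Copy the 1-2% worth keeping, then clear the table cheaply.
INSERT INTO db_log_keep (ClientId, LogTypeId, LogValue, LogTimeStamp)
SELECT ClientId, LogTypeId, LogValue, LogTimeStamp
FROM db_log
WHERE LogTypeId IN (1, 2, 3)        -- placeholder: startup, errors, etc.

-- TRUNCATE only deallocates pages, so it is normally far faster than a large DELETE.
TRUNCATE TABLE db_log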
But then I would be stuck with two log tables, "48-hour_db_log" and "db_log_keep". I could use a view to "union" them so they would appear as a single table, but that's not ideal either.
However, it seems this method is what will work best for my setup, unless there are other suggestions?
Method 4:
...eagerly awaiting ideas!!! :-)
(Also, any tips and/or links to info on maintaining VLDBs are greatly appreciated.)
We have a database that is enabled for mirroring. We need to delete old records, around 500k of them, from a table, but it has foreign key relationships. How should these kinds of deletes be done on production servers?
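A common pattern, sketched here with hypothetical table and column names, is to delete the child rows first and then the parents, in small batches so the transaction log (and the mirror's send queue) can keep up:

DECLARE @BatchSize int
SET @BatchSize = 10000

-- 1) Child rows first (they reference the parent via the foreign key).
WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize) c
    FROM dbo.ChildTable AS c
    INNER JOIN dbo.ParentTable AS p ON p.ParentId = c.ParentId
    WHERE p.CreatedDate < DATEADD(month, -12, GETDATE())

    IF @@ROWCOUNT = 0 BREAK
END

-- 2) Then the parent rows themselves.
WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.ParentTable
    WHERE CreatedDate < DATEADD(month, -12, GETDATE())

    IF @@ROWCOUNT = 0 BREAK
END

Pausing for a second or two between batches (WAITFOR DELAY) gives mirroring and the normal workload room to breathe.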
In another forum post, a poster was deleting large numbers of rows from a table in batches of 50,000.
In the bad old days ('80s - '90s), I used to have to delete rows in batches of 500, then 1000, then 5000, due to the size of the transaction rollback segments (yes - Oracle).
I always found that increasing the number of deleted rows in a single statement/transaction improved overall process speed - up to some magic point, at which some overhead in the system began slowing the deletes down, so that deleting a single batch of 10,000 rows took more than twice as much time as deleting two batches of 5,000 rows each.
Are there good rule-of-thumb numbers (or, even better, some actual statistics and/or explanations) for how many records should be deleted in a single transaction/statement for optimum speed? 50,000? 100,000? 1,000,000? Unlimited? Are there significant differences between 2008, 2012, and 2014?
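I don't know of a universal number; it depends on row size, indexes, triggers, and the log. One way to find your own sweet spot is to instrument a batched delete and watch how long each batch takes as you vary the batch size (all names here are placeholders):

DECLARE @BatchSize int, @Start datetime, @ms int
SET @BatchSize = 50000       -- try 5,000 / 50,000 / 500,000 and compare

WHILE 1 = 1
BEGIN
    SET @Start = GETDATE()

    DELETE TOP (@BatchSize)
    FROM dbo.BigTable
    WHERE CreatedDate < '2014-01-01'     -- placeholder purge condition

    IF @@ROWCOUNT = 0 BREAK

    SET @ms = DATEDIFF(millisecond, @Start, GETDATE())
    PRINT 'Batch of ' + CAST(@BatchSize AS varchar(12)) + ' rows took ' + CAST(@ms AS varchar(12)) + ' ms'
END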
We are busy designing a generic analytical system at work that will hold multiple analytic types over time. This system is being developed in SQL 2000.

Example table:

IDENTITY - int
ItemId - int [PK]
AnalyticType - int [PK]
AnalyticDate - DateTime [PK]
Value - numeric(28,15)

ItemId is the item for which the analytic is being stored, AnalyticType is an arbitrary type, and the [PK] tag indicates the composite primary key.

Our scenario is the following:
* For this time series data, we expect around 250 days per year (working days) and the dataset could extend to over 20 years
* Up to 50 analytic types
* Up to 20,000 items

Looking at the combined calculation, this comes to roughly 25 * 20,000 * 50 * 250, or somewhere over 5 billion rows. We will be inserting around 50 * 20,000, or around 1 million, rows each day (the inserts will take place in the middle of the night, outside the main query time); this could be done through something like BCP or BULK INSERT.

Our real problem is that we have not previously worked with such large tables and are nervous that our system is going to grind to a halt. Our biggest tables are around 20 million rows at the moment. Scanning through Google and Microsoft's own site we have found a partitioning method that is available: http://www.microsoft.com/resources/...art5/c1861.mspx

Having experimented with the above system, it seems rather quirky, and looking at the available literature it seems that it is no more effective than a clustered index as far as queries go. It needs to be optimized for queries like: given the ItemId and the AnalyticType, search for a specific date or a specific range of dates.

If anyone has any experience or helpful suggestions I would really appreciate it. Thanks, A
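For what it's worth, the query shape described (equality on ItemId and AnalyticType, then a date or date range) maps naturally onto a clustered composite key in that same order, so each lookup becomes one contiguous range scan. A sketch, leaving the IDENTITY column out for brevity:

CREATE TABLE dbo.Analytic
(
    ItemId        int            NOT NULL,
    AnalyticType  int            NOT NULL,
    AnalyticDate  datetime       NOT NULL,
    Value         numeric(28,15) NOT NULL,
    CONSTRAINT PK_Analytic PRIMARY KEY CLUSTERED (ItemId, AnalyticType, AnalyticDate)
)

-- The kind of query the key order is designed for:
SELECT AnalyticDate, Value
FROM dbo.Analytic
WHERE ItemId = 42
  AND AnalyticType = 7
  AND AnalyticDate BETWEEN '2004-01-01' AND '2004-03-31'

When bulk loading, sorting the nightly file in the same key order (and telling BULK INSERT so with its ORDER hint) usually helps the load as well.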
A select query returns around 1 million rows. The column in the WHERE condition is indexed. This query takes nearly 1 minute to return all the records. Is this normal?
Does the number of records returned affect the performance in spite of the indexing?
I need to insert a very large number of rows into a table (in SQL Server 7.0) using ADO. Could you please tell me if there is a way to do a fast insert, something similar to BCP, or any other way of inserting a large number of rows efficiently?
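If the rows can be staged in a flat file first, one option that gets close to bcp speed without leaving the application is the BULK INSERT statement, which SQL Server 7.0 already supports and which ADO can execute like any other command. A sketch (file path, terminators, and table name are placeholders):

BULK INSERT dbo.TargetTable
FROM 'C:\staging\rows.txt'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    TABLOCK       -- a table lock during the load generally improves bulk-load throughput
)

Inserting row by row through a Recordset will always be far slower than a bulk path like this.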
I need a script that asks the user for a value; the script will search for all records that match the value. Then it will display the number of records found and ask the user to enter a different value. The rest of the script will use this new value and increment it by 1 as many times as the number of records found. I started the script where it asks for "HANDLE" and displays the number of records found with that "HANDLE":
declare @HANDLE as varchar(30)
declare @COUNT as varchar(10)
declare @STARTINV as varchar(20)

set @HANDLE = ?C     -- This is the parameter to search for records with this value
set @STARTINV = ?C   -- User will input the starting invoice number

SELECT COUNT(*) AS OrderCount FROM SHIPHIST WHERE HANDLE = @HANDLE
I just can't figure out how to proceed: use the entered invoice # and increment it by 1 until it reaches the number of records found.
This will be the end result:
Count = 5         -- result from the query
STARTINV = 00010  -- value entered by the user
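A sketch of the loop part, assuming the invoice number is a zero-padded string like 00010 and that each generated number just needs to be used once per matching record (replace the PRINT with whatever you actually do with the number; the literal values stand in for the ?C parameters):

DECLARE @HANDLE varchar(30), @STARTINV varchar(20)
DECLARE @COUNT int, @i int, @inv int

SET @HANDLE = 'ABC123'     -- would come from ?C
SET @STARTINV = '00010'    -- would come from ?C

SELECT @COUNT = COUNT(*) FROM SHIPHIST WHERE HANDLE = @HANDLE

SET @i = 0
SET @inv = CAST(@STARTINV AS int)

WHILE @i < @COUNT
BEGIN
    -- Zero-pad back to the original width of @STARTINV.
    PRINT RIGHT(REPLICATE('0', 20) + CAST(@inv + @i AS varchar(20)), LEN(@STARTINV))
    SET @i = @i + 1
END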
Table structure: col1 IDENTITY (seed=1, increment=1) + a few other columns (col2...col7) + one text column (col8). I have around 50,000,000 rows per day inserted into the table T1. At the end of the day 40,000,000 rows are deleted. I have to keep the records for 12 months and then archive them. The database serves the web 24/7 and no downtime is allowed.

The IDENTITY column will go out of range (overflow) after less than two years, unless the identity seed is reset to the start value (seed=1, increment=1). At the end of the 12th month, data is archived into another table and only the last month is kept in table T1, so T1 enters the new year with data from the last month of the previous year. A few other tables refer to this table by using their own field with values from T1's IDENTITY column (referential integrity is not enforced). The identity column in T1 is needed as a unique id for some search actions. Performance is an issue, therefore the bigint data type is used for this identity column rather than decimal.
Another problem I have is how to do a table update on one column (1 million rows to be updated out of 2 million) with minimum impact on the users who are querying this table heavily. Needless to say, it is a 24/7 web app with no downtime.
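For the update, one approach that keeps each transaction short is to walk the table in ranges of the identity key, pausing between chunks; readers are then only ever blocked for the duration of one small chunk. A sketch using the column names above (col2 and the new value are placeholders):

DECLARE @MinId bigint, @MaxId bigint, @ChunkSize bigint
SET @ChunkSize = 10000

SELECT @MinId = MIN(col1), @MaxId = MAX(col1) FROM dbo.T1

WHILE @MinId <= @MaxId
BEGIN
    UPDATE dbo.T1
    SET col2 = 'new value'                             -- placeholder for the real change
    WHERE col1 BETWEEN @MinId AND @MinId + @ChunkSize - 1
      AND col2 <> 'new value'                          -- skip rows already done, makes it restartable

    SET @MinId = @MinId + @ChunkSize

    WAITFOR DELAY '00:00:01'                           -- let readers in between chunks
END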
I need to add a datetime column to an existing table that has about 1.2 million records and is being accessed frequently, but I can't afford to stop the db at all.

Whenever I do: alter table mytable add Updated_date datetime

it just takes too long and I have to stop executing the query after a couple of minutes. I am running SQL Express 2005 SP2. The db size is over 3 GB but still under the 4 GB limit.

Can you please advise on how to add this column? It's urgent!
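One thing worth knowing: in SQL Server 2005, adding a NULLable column with no default is a metadata-only change, so the ALTER itself should finish in seconds. What it does need is a brief schema-modification lock, and that lock has to wait behind every query already running against the table, which is probably why it appears to hang. A sketch:

-- Run this in a quiet moment; it only changes metadata, no rows are rewritten.
ALTER TABLE mytable ADD Updated_date datetime NULL

-- If existing rows need a value, backfill afterwards in small batches
-- rather than adding the column WITH a default, which would touch every row.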
/****** Object: StoredProcedure [dbo].[dbo.ServiceLog] Script Date: 07/18/2014 14:30:59 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER proc [dbo].[ServiceLogPurge]
-- Purge records from dbo.ServiceLog older than 3 months:
-- Purge records in small portions to avoid locking production tables
-- for a long time. The process takes longer, but can co-exist with
-- normal usage of the tables.
[Code] ...
*** Getting this error below when executing the code ***
Msg 102, Level 15, State 1, Procedure ServiceLogPurge, Line 45 Incorrect syntax near 'Failed:'.
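Msg 102 near 'Failed:' usually means a string literal somewhere in the body (for example a RAISERROR or PRINT message containing the word Failed:) is missing its opening quote. Since the body isn't shown, here is only a minimal sketch of the batched purge the header comments describe; the date column name is a placeholder:

ALTER PROC [dbo].[ServiceLogPurge]
AS
BEGIN
    SET NOCOUNT ON

    DECLARE @rows int
    SET @rows = 1

    WHILE @rows > 0
    BEGIN
        -- Small batches keep locks short so the purge can co-exist with normal usage.
        DELETE TOP (5000)
        FROM dbo.ServiceLog
        WHERE LogDate < DATEADD(month, -3, GETDATE())   -- LogDate is a placeholder column name

        SET @rows = @@ROWCOUNT
    END
END
GO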
We have an application in which one particular stored procedure goes from .3 second response times to 20-40 second response times, suddenly, at seemingly random times of the day. Recompiling that particular stored proc fixes the problem temporarily.
The 3 main tables have between 500K and 1.5M records, are defragged daily, and contain the past 42 days of work. Old records are deleted once daily. The slowdowns do not coincide with the deletions, occurring without a clear pattern. Nothing else is going on on that server; it is dedicated to this one app. Nothing shows in the event log. It is clustered Windows 2003, with SQL Server 2005 RTM; SP2 of SQL 2005 is due for an install next Monday. This behavior was also seen when the app was on a different server running SQL2K SP3, so I think the underlying problem is some sort of design issue with the app, not a SQL Server bug.

Has anyone seen something like this, and what suggestions do you have for a permanent fix? I think a recompile of the sp causes a new execution plan, but why would that be necessary daily, or even several times a day? 30-50 users are banging away at it. The app is an order entry system; the tables contain what are basically order histories and label data. Am running out of good ideas. Thanks for any help.
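For what it's worth, that pattern (fast, then suddenly slow, then fast again after a recompile) is the classic signature of parameter sniffing: the plan in cache was compiled for an atypical parameter value and then reused for everything else. Two common mitigations, sketched with hypothetical object and parameter names:

CREATE PROCEDURE dbo.GetOrderHistory
    @CustomerId int
AS
BEGIN
    -- Option 1: recompile this statement on every call so the plan always fits
    -- the current parameter value (costs some CPU per execution).
    SELECT OrderId, OrderDate
    FROM dbo.OrderHistory
    WHERE CustomerId = @CustomerId
    OPTION (RECOMPILE)

    -- Option 2 (alternative): always optimize for a representative value instead
    -- of whatever value happened to trigger the compile.
    -- SELECT OrderId, OrderDate
    -- FROM dbo.OrderHistory
    -- WHERE CustomerId = @CustomerId
    -- OPTION (OPTIMIZE FOR (@CustomerId = 12345))
END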
I need to delete data from a particular table which has more than half a million records. The data to be deleted is more than 200,000 records. What is the best way to delete the data from the table, other than importing it into a temporary table and performing the same operation?
Let me know if the strategy to be followed is okay.
1. Drop all the triggers
2. Drop all the indexes
3. Write a procedure with a loop setting ROWCOUNT to 1000 and delete the records (since if I try to delete all the rows at once it will give a timeout error). The procedure will delete 1000 records per batch inside the loop until it wipes out all the data for the specified condition.
4. Recreate the indexes and triggers.
Please let me know if there are any other optimal solutions.
I need to write SQL to find the number of hours between a begin time and an end time. The fields are varchar. There are several date functions in SQL, but I am not able to figure out how to get the hours between two times that are not in a date format.
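Assuming the varchar values actually parse as times (for example '08:30' and '17:45'), one approach is to convert them to datetime and let DATEDIFF do the work; the table and column names here are hypothetical:

-- CONVERT gives the times a date part of 1900-01-01, which cancels out in the difference.
SELECT BeginTime,
       EndTime,
       DATEDIFF(minute, CONVERT(datetime, BeginTime), CONVERT(datetime, EndTime)) / 60.0 AS HoursBetween
FROM dbo.MyShifts

Working in minutes and dividing by 60.0 keeps the fractional part that DATEDIFF(hour, ...) would drop.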
Is there a way to get the number of occurrences of a word within a specific row?
I want to query a dataset and have a column act as the number of occurrences of a word within that row. I also need to use full-text searching... Is this possible?
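As far as I know, full-text will find the rows but doesn't return a per-row occurrence count, so the counting part is usually done with the REPLACE/LEN trick. A sketch with hypothetical table and column names (the CONTAINS line assumes a full-text index exists and just narrows the candidate rows first):

DECLARE @word varchar(100)
SET @word = 'example'

SELECT Id,
       (LEN(Body) - LEN(REPLACE(Body, @word, ''))) / LEN(@word) AS Occurrences
FROM dbo.Documents
WHERE CONTAINS(Body, 'example')    -- optional: let full-text narrow the rows first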
What is the best way to make a function for summing an arbitrary number of time values (a table parameter?)? I've read it's necessary to convert to seconds, sum, then convert back, but I'm wondering if there's an alternative.
Here's the example I want to sum:
00:02:01:30
00:01:28:10
00:01:01:50
00:06:50:30
00:00:01:50
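Converting to a number, summing, and converting back really is the usual route. A sketch, assuming the four parts are hours:minutes:seconds:frames and treating the last part as hundredths of a second (adjust the multipliers if the parts mean something else):

DECLARE @t TABLE (val varchar(11))
INSERT INTO @t VALUES ('00:02:01:30'), ('00:01:28:10'), ('00:01:01:50'), ('00:06:50:30'), ('00:00:01:50')

SELECT SUM(  CAST(PARSENAME(REPLACE(val, ':', '.'), 4) AS int) * 3600
           + CAST(PARSENAME(REPLACE(val, ':', '.'), 3) AS int) * 60
           + CAST(PARSENAME(REPLACE(val, ':', '.'), 2) AS int)
           + CAST(PARSENAME(REPLACE(val, ':', '.'), 1) AS int) / 100.0 ) AS TotalSeconds
FROM @t

A table-valued parameter (SQL 2008+) would indeed be a tidy way to feed an arbitrary number of these values into such a function.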
I'm trying to quantify the number of times folks use SQL Server Management Studio to change client data in one of our production databases. Does SQL Server keep this statistic? How do I get to this data?
In report builder 2.0, I cannot determine how to display the number of times a field has been filled out when a one to many relationship is involved. I get an error every time I use two aggregates.
I am building a query. I have a table with 4 columns and need to try to put the times together. There are some inconsistencies with this, and I'm hoping to exclude them. Here is a sample table:
Function 1 = Clock In Type 1 Function 2 = Clock Out Type 1 Function 3 = Clock In Type 2 Function 4 = Clock Out Type 2
Basically what I need to do is take the time from rows with Function 1 and match it with Function 2 so I can get a total time for the clock-in. Function 3 rows need to match up with Function 4 rows so I can get another set of total times. There may be more clock-in rows than clock-out rows, or more clock-out rows than clock-in rows, and there may be multiple clock-ins and outs per day per employee. I'm basically trying to get totals for each Clock In/Out type.
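Without seeing the real table it's hard to be exact, but the usual trick is to number the clock-ins and clock-outs per employee per day and join them by position, so the first "in" pairs with the first "out", and so on. A sketch with hypothetical column names (EmployeeId, FunctionCode, EventTime), assuming SQL 2008+ for the date cast:

WITH Ins AS (
    SELECT EmployeeId, EventTime,
           ROW_NUMBER() OVER (PARTITION BY EmployeeId, CAST(EventTime AS date) ORDER BY EventTime) AS rn
    FROM dbo.ClockEvents
    WHERE FunctionCode = 1       -- Clock In Type 1
),
Outs AS (
    SELECT EmployeeId, EventTime,
           ROW_NUMBER() OVER (PARTITION BY EmployeeId, CAST(EventTime AS date) ORDER BY EventTime) AS rn
    FROM dbo.ClockEvents
    WHERE FunctionCode = 2       -- Clock Out Type 1
)
SELECT i.EmployeeId,
       SUM(DATEDIFF(minute, i.EventTime, o.EventTime)) / 60.0 AS TotalHoursType1
FROM Ins AS i
JOIN Outs AS o
  ON o.EmployeeId = i.EmployeeId
 AND CAST(o.EventTime AS date) = CAST(i.EventTime AS date)
 AND o.rn = i.rn
GROUP BY i.EmployeeId

Unmatched rows (an "in" with no "out", or vice versa) simply fall out of the inner join; switching to a FULL OUTER JOIN and looking for NULLs is one way to find those inconsistencies. The same query with FunctionCode 3 and 4 gives the Type 2 totals.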
How can I write this in SQL? I have a table where column A has two values, Product and Cost. Where the value of A is Product, I have to add 6 more rows and fill column B with values from 1 to 7 for those rows (a counter). Where the value of A is Cost, I have to add 4 more rows and fill column B with values from 1 to 5 for those rows (a counter).
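One way, assuming a table shaped roughly like SourceTable(ColA, ...): cross it against a small numbers set and keep 7 numbers for Product rows and 5 for Cost rows (all names here are hypothetical):

SELECT s.ColA, n.b
FROM dbo.SourceTable AS s
JOIN (SELECT 1 AS b UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
      UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7) AS n
  ON (s.ColA = 'Product' AND n.b <= 7)
  OR (s.ColA = 'Cost'    AND n.b <= 5)
ORDER BY s.ColA, n.b

Each Product row comes back 7 times with b = 1..7 and each Cost row 5 times with b = 1..5; wrap it in an INSERT ... SELECT if the extra rows need to be stored.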
I have a table with approx 5 million rows and 36 columns. It takes approx 4 minutes to delete 1 row. The table has 3 indexes in addition to its primary key and has twelve foreign key constraints. We are still using SQL Server 7. There is a backup run every night as part of the nightly maintenance that reorgs/reindexes and checks the database integrity. Any thoughts?
I'm currently in the middle of building quite a large CMS using ASP.NET and MSSQL2K and have begun to question whether the number of queries I am using to build one page is too many.
For one page (View Forum) I am getting all of the templates and checking access, then pulling a list of threads, getting the first and last posts, then user info for the first and last posts... anyway, to view 10 threads on the page the number of queries comes to about 54 and the page takes 0.064 seconds to load.
My question is: is this too many queries to be running for a single page load? All queries use stored procedures.
Hi, does a large number of connections to a SQL Server result in sluggish queries? By a large number of connections I mean in the range of 2000 to 2500 connections to various databases on the same server. This happens on the production database; if I download a copy of the database and run the specific queries on my local box they hardly take 1 second to execute, whereas on the production box they take about 45 to 60 secs. I'm working on the indexes part, but I was not sure whether the number of connections to a server affects the performance of the queries, or whether it is just that the indexes need defragmentation.
I have an Execute SQL Task, which gets a result set from SQL Server using an OLE DB connection.
The result set is mapped to an object variable , say @[User::FilePath]
There are 33 rows in the above result set.
Then I have a For-Each Loop, inside which I have a Script Task.
I am trying to put the above @[User::FilePath] recordset into a DataTable using the DataAdapter.Fill() function, and I perform some read operations on its rows.
The problem is that in the first iteration of the For-Each Loop, the number of rows in the data table shows 33.
But from the next iteration onwards, it comes out to be 0 (zero!).
I have a csv file that I need to import daily into a SQL Server 2005 table. Much of the table contents could just be overwritten with the new csv file, however there is a set of rows within the table that need to be appended to, rather than overwritten. There is no primary key in the csv file that can be used.

I'm not sure this is the best approach, but what I have been trying to do is append the entire csv file to the existing table, and then go back and delete the duplicates. When I run the Delete, it does delete the majority of the records, but leaves a couple hundred behind. The number left behind varies with each run; I can't seem to identify a pattern here. Running the Delete a second time does clean up the rows left behind in the first execution, and gives the result I want. Any thoughts as to why this needs to be run twice? Or is a better approach available? Here is my code:

SELECT [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time], dupcount = COUNT(*)
INTO temppkgactions
FROM pkgactions
GROUP BY [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time]
HAVING COUNT(*) > 1
DELETE TOP (SELECT COUNT(*) - 1 FROM dbo.temppkgactions WHERE dupcount > 1) FROM dbo.pkgactions
DROP TABLE temppkgactions
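As an alternative that avoids the count-based double pass entirely, a single ROW_NUMBER pass (SQL 2005+) can delete every copy after the first in one statement; the PARTITION BY columns are the same ones as in the GROUP BY above:

WITH ranked AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY [Pkg ID], [Elm (s)], [Type Name (s)], [End Exec Date], [End Exec Time]
               ORDER BY (SELECT 0)) AS rn
    FROM dbo.pkgactions
)
DELETE FROM ranked
WHERE rn > 1     -- keeps exactly one row from every duplicate group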
What I'm trying to do is delete a user and all their related information in the other tables. I'm not wanting to delete the tables, just the rows for that user and their related information. My primary key is UserID in the table [alumni], and my three foreign keys are CommentID, PhotoID, and AlbumID in the tables [comments], [photos], and [albums]. Here is some of the code that I have:

<asp:SqlDataSource ID="SqlDataSource2" runat="server"
    ConnectionString="<%$ ConnectionStrings:SoderquistString %>"
    DeleteCommand="DELETE FROM [alumni] WHERE [UserID] = @UserID"
    SelectCommand="SELECT [UserID], [UserName], [FirstName], [LastName], [State] FROM [alumni] WHERE ([State] = @State)">
    <DeleteParameters>
        <asp:Parameter Name="UserID" Type="Int32" />
    </DeleteParameters>
    <SelectParameters>
        <asp:ControlParameter ControlID="DropDownList1" Name="state" PropertyName="SelectedValue" Type="String" />
    </SelectParameters>
</asp:SqlDataSource>

The users are set up in a GridView. Is there some type of DELETE command that I need to be writing that is different from the one above? I have tried extending the DELETE statement as follows:

DeleteCommand="DELETE FROM [alumni] WHERE [UserID] = @UserID
               DELETE FROM [photo] WHERE [UserID] = @UserID;
               DELETE FROM [album] WHERE [UserID] = @UserID;
               DELETE FROM [comment] WHERE [UserID] = @UserID;

...but that doesn't work, and doesn't look right. I would really appreciate anyone's suggestions or help. Thank you!
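Assuming the child tables each carry the UserID of the alumni row they belong to (the post mentions [comments], [photos], and [albums]), the usual approach is to delete the child rows first and the [alumni] row last, inside one transaction so either everything for the user goes or nothing does. A sketch:

BEGIN TRANSACTION

DELETE FROM [comments] WHERE [UserID] = @UserID
DELETE FROM [photos]   WHERE [UserID] = @UserID
DELETE FROM [albums]   WHERE [UserID] = @UserID
DELETE FROM [alumni]   WHERE [UserID] = @UserID

COMMIT TRANSACTION

That whole batch can go into the SqlDataSource DeleteCommand (statements separated by semicolons) or, more tidily, into a stored procedure that takes @UserID; another option is ON DELETE CASCADE on the foreign keys, which would make the single DELETE FROM [alumni] enough.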