We're building a company-wide network monitoring system
in Java, and need some advice on the database design and
tuning.
The application will need to concurrently INSERT,
DELETE, and SELECT from our EVENT table as efficiently as
possible. We plan to implement an INSERT thread, a DELETE
thread, and a SELECT thread within our Java program.
The EVENT table will have several hundred million records
in it at any given time. We will prune, using DELETE, about
every five seconds to keep the active record set down to
a user-controlled size. And one of the three queries will
be executed about every twenty seconds. Finally, we'll
INSERT as fast as we can in the INSERT thread.
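The pruning step described above can be planned so that no single DELETE removes an unbounded number of rows. The following is a minimal sketch of that bookkeeping only, assuming IDs are assigned in increasing order without large gaps. The class name `PrunePlanner` and the parameters `keep` (the user-controlled size) and `chunk` (rows per DELETE) are hypothetical and not part of our program; each returned value would be bound in turn to the `?` of the pruning DELETE.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: plans the cutoffs for a chunked prune of EVENT. */
final class PrunePlanner {

    /**
     * Returns the successive values to bind to "DELETE FROM EVENT WHERE ID < ?".
     *
     * minId, maxId: smallest and largest ID currently in the table.
     * keep:  number of newest rows to retain (the user-controlled size).
     * chunk: upper bound on rows removed by one DELETE.
     * Assumes IDs are dense and increasing; this is an assumption, not checked here.
     */
    static List<Long> cutoffs(long minId, long maxId, long keep, long chunk) {
        if (keep < 0 || chunk <= 0) {
            throw new IllegalArgumentException("keep must be >= 0 and chunk must be > 0");
        }
        List<Long> result = new ArrayList<>();
        // Every row with ID < finalCutoff is surplus.
        long finalCutoff = maxId - keep + 1;
        // Intermediate cutoffs, each at most 'chunk' rows past the previous one.
        for (long c = minId + chunk; c < finalCutoff; c += chunk) {
            result.add(c);
        }
        // Last, possibly smaller, step; skipped when nothing needs pruning.
        if (finalCutoff > minId) {
            result.add(finalCutoff);
        }
        return result;
    }
}
```

With IDs 1 to 100, `keep = 40` and `chunk = 25`, the plan is three DELETEs with cutoffs 26, 51 and 61, leaving IDs 61 to 100. `long` is used for the arithmetic so the sketch does not overflow near the top of the INT range.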
Being new to MSSQL, we need advice on:
1) Server Tuning - Memory allocations, etc.
2) Table Tuning - Field types
3) Index Tuning - Are the indexes right?
4) Query Tuning - Hints, etc.
5) Process Tuning - Better ways to INSERT and DELETE, etc.
Thanks, in advance, for any suggestions you can make :-)
The table is
// CREATE TABLE EVENT (
// ID INT PRIMARY KEY NOT NULL,
// IPSOURCE INT NOT NULL,
// IPDEST INT NOT NULL,
// UNIXTIME BIGINT NOT NULL,
// TYPE TINYINT NOT NULL,
// DEVICEID SMALLINT NOT NULL,
// PROTOCOL TINYINT NOT NULL
// )
//
// CREATE INDEX INDEX_SRC_DEST_TYPE
// ON EVENT (
// IPSOURCE,IPDEST,TYPE
// )
The SELECTS are
private static String QueryString1 =
"SELECT ID,IPSOURCE,IPDEST,TYPE "+
"FROM EVENT "+
"WHERE ID >= ? "+
" AND ID <= ?";
private static String QueryString2 =
"SELECT COUNT(*),IPSOURCE "+
"FROM EVENT "+
"GROUP BY IPSOURCE "+
"ORDER BY 1 DESC";
private static String QueryString3 =
"SELECT COUNT(*),IPDEST "+
"FROM EVENT "+
"WHERE IPSOURCE = ? "+
" AND TYPE = ? "+
"GROUP BY IPDEST "+
"ORDER BY 1 DESC";
The DELETE is
private static String DeleteIDString =
"DELETE FROM EVENT "+
"WHERE ID < ?";
I have a small, tricky problem here and need help from all you experts.
Let me explain in detail. I have three tables:
1. Emp table: columns EMPID and DeptID
2. Dept table: columns DeptName and DeptID
3. Team table: columns Date, EmpID1, EmpID2, DeptNo
There is a stored procedure which runs every day and, for every DeptID that exists in the Dept table, selects two employees from the Emp table and puts them in the Team table. Assuming there are several thousand departments in the Dept table, the amount of data entered in the Team table every day is tremendous.
If I continue to run the stored procedure for one month, the Team table will have a lot of rows in it, and I have to retain all the records.
The real problem comes when I want to retrieve data for an employee (EmpID1 or EmpID2) from the Team table and view the related details such as Date, DeptNo, and EmpID1 or EmpID2 from the Emp table. How do we optimize data retrieval and storage for the Team table? I cannot use partitions as I have SQL Server 2005 Standard Edition.
Please help me optimize the query and the data retrieval time from the Team table.
Dear all,
I use one stored procedure to retrieve three different result sets and separate them in the code-behind: from the DataSet I take three different DataTables and then show the data as needed. The main problem is that after retrieving the data from the database, I have to use a foreach loop to bind the column data to my custom classes. Example:
foreach (DataRow oDrow in MyDataTable.Rows)
{
    oClass = new Class();
    oClass.Name1 = oDrow["Name1"].ToString();
    oClass.Name2 = oDrow["Name2"].ToString();
    ....
}
1. Is there any optimization possible?
2. My result set is very long, so should I keep to a single hit on the database, or hit it more than once?
I am currently optimizing my web application. In the previous version I had to hit the database 3 or 4 times for different purposes. Now it hits only once, but it takes time in the code-behind to perform the different operations. Any suggestions?
I have an SP that calls about 10 stored procedures sequentially. The 10 SPs are basically complex update statements, each one separate. Is there any way to optimize this? I know putting the 10 into one SP would make it compile faster, but that's about it. Are there any execution tricks for stored procedures firing off sequentially, or anything else I should know?
Hello all, what is the best way to optimize this code or rewrite it using ISNULL?
CREATE PROCEDURE get_employees (@dept char(8), @class char(5))
AS
IF (@dept IS NULL AND @class IS NOT NULL)
    SELECT * FROM employee
    WHERE employee.dept IS NULL AND employee.class=@class
ELSE IF (@dept IS NULL AND @class IS NULL)
    SELECT * FROM employee
    WHERE employee.dept IS NULL AND employee.class IS NULL
ELSE IF (@dept IS NOT NULL AND @class IS NULL)
    SELECT * FROM employee
    WHERE employee.dept=@dept AND employee.class IS NULL
ELSE
    SELECT * FROM employee
    WHERE employee.dept=@dept AND employee.class=@class
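The four branches differ only in whether each parameter is compared with `=` or with `IS NULL`, which amounts to null-safe equality on each column. The Java sketch below models that logic outside the database to show that the branches collapse into one predicate per column. The class `EmployeeFilter` and both method names are hypothetical, and the model ignores the trailing-space padding of `char` columns.

```java
import java.util.Objects;

/** Hypothetical model of the get_employees filter, outside the database. */
final class EmployeeFilter {

    /** The four-branch logic, transcribed from the procedure as posted. */
    static boolean matchesBranched(String rowDept, String rowClass,
                                   String dept, String cls) {
        if (dept == null && cls != null) {
            return rowDept == null && cls.equals(rowClass);
        } else if (dept == null && cls == null) {
            return rowDept == null && rowClass == null;
        } else if (dept != null && cls == null) {
            return dept.equals(rowDept) && rowClass == null;
        } else {
            return dept.equals(rowDept) && cls.equals(rowClass);
        }
    }

    /** Collapsed form: null-safe equality on each column. */
    static boolean matches(String rowDept, String rowClass,
                           String dept, String cls) {
        return Objects.equals(rowDept, dept) && Objects.equals(rowClass, cls);
    }
}
```

In T-SQL the same single predicate can be written per column as `(employee.dept = @dept OR (employee.dept IS NULL AND @dept IS NULL))`. Whether that form or the branched form gives the better plan would need checking against the real table and indexes.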
I am wondering whether the size of the data file makes a difference when running inserts and/or fetches. Our DB was 11 GB in size; I ran DBCC SHRINKDATABASE and it shrank to 5.5 GB. Now that it is smaller, will it run a SELECT query faster, as opposed to when we run large inserts and the file has to grow automatically to accommodate them? I am trying to figure out whether I should leave my .mdf file large or keep it small, or whether it even makes a difference. I am only doing large inserts while loading data to get ready for production; after that the inserts will be hourly but much smaller. However, our queries against the DB once it is in production will be much more intensive.
There are two main tables in my app. To optimize searches via scope conditions, I set up many indexes on these two tables.
However, the same two tables are also used by my ETL app. Every day thousands of rows need to be updated or inserted, and indexes are not well suited to heavy modification. Any ideas on how to handle this?
Hello, what does <MissingIndexGroup Impact="99.9521"> mean in the query plan? Should I create a grouped index? And what does Impact="99.9521" mean?
If Impact = 100 you get 100% better performance, and if Impact = 20 you get 20% better performance; is that the meaning?
Hi,
Can anyone help me optimize the SELECT statement in the 3rd step? I am writing a monthly report. For each employee (500 employees), one row displays his attendance totals for all days in a month. The problem is that the 3rd step actually consists of 31 SELECT statements, which are assigned to 31 variables. After I assign these variables, I insert them into a table (4th step) and display it. The troublesome part is the 3rd step: as there are 500 employees, the variables are assigned 500x31 times and inserted into the table. This takes more than 4 minutes, which I know should not be necessary :). Can anyone help me optimize the SELECT statements in the 3rd step, or give a better suggestion?
DECLARE @EmpID, @DateFrom, @Total1 .... // Declaring different variables
SELECT @DateFrom = // Set to start of any month e.g. 2007-06-01 ...... 1st
Loop (condition -- Get all employees, working fine)
BEGIN
    SELECT @EmpID = // Get EmployeeID ...... 2nd
    SELECT @Total1 = SUM (Abences) ...... 3rd
    FROM Attendance
    WHERE employee_id_fk = @EmpID (from 2nd step)
    AND Date_Absent = DATEADD ("day", 0, Convert (varchar, @DateFrom)) (from 1st step)
    SELECT @Total2 ........................... same as above
    SELECT @Total3 ........................... same as above
    INSERT IN @TABLE (@EmpID, @Total1, ...... @Total31) ...... 4th
    Iterate (condition) to next employee ...... 5th
END
It's only the loop which consumes the 4 minutes. If I can somehow optimize this part, I will be most satisfied. Thanks to anyone helping me.
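The loop described above performs 31 lookups per employee, roughly 15,500 for 500 employees. A common alternative is a single pass that groups by employee and day. The Java sketch below models that shape on in-memory rows purely as an illustration; the class `MonthlyReport`, the `AbsenceRow` fields and `monthlyTotals` are hypothetical names, and in the database the same idea would be one grouped query rather than a loop.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of the monthly attendance pivot. */
final class MonthlyReport {

    /** One attendance record: who, which day of the month, how many absences. */
    static final class AbsenceRow {
        final int employeeId;
        final int dayOfMonth;   // 1..31
        final int absences;

        AbsenceRow(int employeeId, int dayOfMonth, int absences) {
            this.employeeId = employeeId;
            this.dayOfMonth = dayOfMonth;
            this.absences = absences;
        }
    }

    /**
     * Single pass over the rows. Returns employee id -> 31 totals,
     * where index 0 holds day 1 and index 30 holds day 31.
     */
    static Map<Integer, int[]> monthlyTotals(List<AbsenceRow> rows) {
        Map<Integer, int[]> totals = new TreeMap<>();
        for (AbsenceRow r : rows) {
            if (r.dayOfMonth < 1 || r.dayOfMonth > 31) {
                throw new IllegalArgumentException("dayOfMonth out of range: " + r.dayOfMonth);
            }
            // Create the 31-slot array on first sight of an employee, then accumulate.
            totals.computeIfAbsent(r.employeeId, id -> new int[31])[r.dayOfMonth - 1] += r.absences;
        }
        return totals;
    }
}
```

Each input row is touched once, so the work grows with the number of attendance rows instead of with employees times days.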
Could anyone tell me the best way to declare a connection from ASP.NET to a SQL database so that SQL Server can support the maximum number of users? It seems the way I'm doing it is not correct, because when I make some transactions from my website to the database, the database returns an error message saying that there are no more free connections.
This may sound a little silly, but does anyone have any words of wisdom on how to optimize a server/database for minimum rollback? We have some multimillion-row tables we were trying to do updates against, and after several days they increased the size of the transaction log to the point that it filled up the drive the database files/logs were on. We've now been running a rollback for about five days. I'd like to make sure this doesn't happen again.
I am using the database maintenance plan on a database that is about 4 GB. The database optimization runs for about an hour. Does this job only do an update stats? If I run the stored procedure sp_updatestats on the database, it only takes a couple of minutes. Are these two processes doing the same thing? Do I need them if the create and update statistics options are turned on?
Trying to optimize a query, and having problems interpreting the data. We have a query that queries 5 tables with 4 INNER JOINS. When I use INNER HASH JOIN, this is the result:
(Using SQL Programmer)
SQL Server Execution Times: CPU time = 40 ms, elapsed time = 80 ms.
Now, when timing the code execution on my ASP page, it's "faster" not using the HASH. Using HASH, there are a few Hash Match/Inner Joins reported in the Execution Plan. Not using HASH, there are Bookmark Lookups/Nested Loops.
My question is which is better to "see" for the CPU/server: Bookmark Lookups/Nested Loops or Hash Match/Inner Joins?
Is there any way to rewrite this query in an optimized way?
SELECT dbo.Table1.EmpId E
FROM dbo.Table1
WHERE EmpId IN (
    SELECT dbo.Table1.EmpId
    FROM (SELECT DISTINCT PersonID, MAX(dtmStatusDate) AS dtmStatusDate
          FROM dbo.Table1
          GROUP BY PersonID) derived_table
    INNER JOIN dbo.Table1
        ON derived_table.PersonID = dbo.Table1.PersonID
        AND derived_table.dtmStatusDate = dbo.Table1.dtmStatusDate)
How can I optimize the following query?
(SELECT e.SID
FROM Students s
JOIN Table1 e ON e.SID = s.SID
JOIN Table2 ed ON ed.Enrollment = e.Enrollment
JOIN Table3 t ON t.TNum = e.TNum
JOIN Table4 bt ON bt.TNum = t.TNum
JOIN Table5 b ON b.Batch = bt.Batch
JOIN IPlans i ON i.IPlan = ed.IPlan
JOIN PGroups g ON g.PGroup = i.PGroup
WHERE t.TStatus = 'ACP'
AND ed.EStatus = 'APR'
AND e.SID = (select distinct SID from Table1 where Enrollment = @DpEnrollment))
AND (ed.EffectiveDate = (SELECT EffectiveDate FROM Table2 ed JOIN Table1 e ON e.enrollment = ed.enrollment
    WHERE IPlan = @DpIPlan AND TCoord = @DpTCoord AND AGCoord = @DpAGCoord AND DCoord = @DpDCoord ) AND DSeq = @DpDSeq)
AND e.SID = (select distinct SID from Table1 where Enrollment = @DpEnrollment)) )
AND ed.TerminationDate = (SELECT TerminationDate FROM Table2 ed JOIN Table1 e ON e.enrollment = ed.enrollment
    WHERE IPlan = @DpIPlan AND TCoord = @DpTCoord AND AGCoord = @DpAGCoord AND DCoord = @DpDCoord ) AND DSeq = @DpDSeq)
AND e.SID = (select distinct SID from Table1 where Enrollment = @DpEnrollment)) ) ))
DECLARE @PTEffDate_tmp AS SMALLDATETIME
SELECT @PTEffDate_tmp = DateAdd(day, -1, PDate)
FROM PDates pd
WHERE iplan = @DIPlan and pd.TCoord = @DTCoord and DType = 'EF'

DECLARE @PTCoord_tmp as char(3)
SELECT @PTCoord_tmp = tc.TCoord
FROM PDates pd
JOIN TCoords tc ON (pd.TCoord = tc.TCoord)
WHERE pd.Iplan = @DIPlan and tc.TGroup = @TGroup_tmp and PDate = @PTEffDate_tmp and DateType = 'TR1'

DECLARE @EStatus_tmp as char(3)
SELECT @EStatus_tmp = EDStatus
From EDetails ed
JOIN ENR e ON (ed.enr = e.enr)
JOIN Trans t ON (e.transID = t.TransID)
WHERE iplan = @DIPlan and ed.TCoord = @PTCoord_tmp and t.TransS= 'ACP' and DCoord = @DCoord and CEnr is null
How can I optimize my query? Since my DB has more than 1 million records, it takes a while to do all those joins.
select *
FROM EEMaster eem
JOIN NHistory nh
    ON eem.SNumber = nh.SNumber
    OR eem.OldNumber = nh.SNumber
    OR eem.CID = (Replicate ('0',12-len( nh.SNumber))+ nh.SNumber )
Well, I wanted to prove to some guys that cursors are not really that important. So this code is supposed to remove duplicate tuples from a table without temporary tables or cursors. Except it needs some optimization (and a lot of system downtime; not sure about that). I would like it if someone could find an instance of the table for which the code below fails, or some way to optimize the code, or anything.
--trash table for real data
create table abc (col1 tinyint, col2 tinyint, col3 tinyint)

--trash values for trash table
insert into abc values (1,1,1)
insert into abc values (1,1,1)
insert into abc values (1,1,1)
insert into abc values (1,1,1)
insert into abc values (2,2,2)
insert into abc values (2,2,2)
insert into abc values (2,2,2)
insert into abc values (3,2,1)
insert into abc values (2,2,3)
insert into abc values (3,2,4)

--check that there are ten rows
select * from abc
--check that there are only five distinct rows
select distinct * from abc

--run code : next 15 lines as a batch
declare @lp tinyint
declare @col1 tinyint,@col2 tinyint,@col3 tinyint
set @lp=1
while @lp>0
begin
    if not exists (select top 1 * from abc
                   group by col1,col2,col3 having count(col1)>1)
        set @lp=0
    else
    begin
        select top 1 @col1 = col1,@col2 = col2,@col3 = col3 from abc group by col1,col2,col3 having count(col1)>1
        delete from abc where col1=@col1 and col2=@col2 and col3=@col3
        insert into abc values(@col1,@col2,@col3)
    end
end

--only distinct values left in trash table
select * from abc

--think code can be optimized
--just wanted to prove: can be done without cursors or temporary tables
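To hunt for inputs where the loop misbehaves, it can help to model it outside the database. The Java sketch below mimics the batch on an in-memory list: while some row value occurs more than once, remove every copy and insert one back. The class `DedupModel`, its method names and the use of `List<Integer>` for a three-column row are hypothetical. The model does not cover NULLs, which SQL compares differently (under default settings `col1 = @col1` is not true when either side is NULL), so that case would need checking against the real table.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the cursor-free duplicate removal loop. */
final class DedupModel {

    /** Returns the first row value that occurs more than once, or null if none. */
    static List<Integer> firstDuplicate(List<List<Integer>> table) {
        Map<List<Integer>, Integer> counts = new LinkedHashMap<>();
        for (List<Integer> row : table) {
            counts.merge(row, 1, Integer::sum);
        }
        for (Map.Entry<List<Integer>, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                return e.getKey();
            }
        }
        return null;
    }

    /** Mirrors the WHILE loop: delete all copies of a duplicate, insert one back. */
    static List<List<Integer>> dedupe(List<List<Integer>> input) {
        List<List<Integer>> table = new ArrayList<>(input);
        List<Integer> dup;
        while ((dup = firstDuplicate(table)) != null) {
            final List<Integer> target = dup;
            table.removeIf(row -> row.equals(target));  // delete from abc where ...
            table.add(target);                          // insert into abc values(...)
        }
        return table;
    }
}
```

Run on the ten sample rows from the script, the model ends with five rows, all distinct. Each pass removes at least one surplus row, so the model always terminates for non-NULL data.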
Hi all, Where I work we have a standalone system that writes information to an event log. Currently this event log is in .mdb (MS Access) format. The problem is that the .mdb seems to get very slow to access after 100,000 rows or so, so it needs to be cleared out regularly. We have long discussed logging the events to a SQL server instead of an .mdb file.
I have written a VB program to test the two DB formats, and I expected MS SQL Server 2005 to be faster at reading/writing than the .mdb. Both the server and the .mdb are local to the system (it's a standalone system), so we know it's not the network that is making SQL Server slower. So here is my question: does anyone know of any good tips/tricks in the server configuration options to speed it up or generally improve performance?
The table definitions are the same in both SQL Server and the .mdb file:
Table: event_log_0000_000000
Module - Text
Event_date - Text
Event_Time - Text
Event - Text
Record_Number - int, primary key
I know it would probably be better to have Event_date and Event_Time as datetime types, but I'm not in charge of that decision. The data/table doesn't matter too much; I just need to prove that SQL Server is better (and faster) than an .mdb file.
The VB program uses DAO to access the .mdb DB and ADODB to access the SQL server. This is the only difference in how the DBs are accessed, and I don't think it would account for the slowness of the SQL server.
This is my first post here, so I’ve probably missed out some vital information, so please ask.
Also sorry if this is the wrong place to post this question, it sort of covers Access/SQL Server 2005/Database programming areas, so wasn't sure.
Generally speaking, when you want to optimise an application that relies on a database, in what order should the following optimization techniques be applied?
a) optimizing the spread of the physical elements of the database across different disks of the server
b) optimizing the use of the RAM
c) optimizing the SQL
d) optimizing the OS
My company is undertaking a database optimization project: optimizing the schema, the code, etc. I would like to ask, if you guys could help out, the following:
1. What risks are there? What are the pitfalls?
2. My company is hesitant to do a database freeze and stop all new development until our vendor (who's restructuring tables and changing database objects) has a stable database for us to obtain, then, and only then can we continue development on this newer copy. My question to this: how can we either reduce the database code freeze or work in parallel?
3. Can anyone point me to other sources of information? Another thread? A book? A URL?
I have a problem with my optimization job, which seems to fail all the time. I have it set up as a SQL maintenance plan that runs once every week. I have checked for things that could conflict with it, but there is nothing. Here is the error I am getting from the job history step:
Executed as user: SAPCORPadminsg. sqlmaint.exe failed. [SQLSTATE 42000] (Error 22029). The step failed.
I work on tables containing 10 million plus records. What are the general steps needed to ensure that my queries run faster? I know a few:
- The join fields should be indexed
- Selecting only needed fields
- Using CTEs or derived tables as much as I can
- Using good table references, e.g. select a.x, b.y from TableA a inner join TableB b on a.id = b.id
I would be happy if somebody could share or add more to my list.
Dear all, the query below takes 7 minutes to execute, so I want to optimize it. Any suggestions, please?
SELECT DISTINCT
    VC.O_Id C_Id, VC.Name C_Name,
    VB.Org_Id B_Id, VB.code S_Code, VB.Name S_Name,
    mt12.COLUMN003 M_D_Code, mt12.COLUMN004 M_D_Name,
    CQ.COLUMN004 R_Code, CQ.COLUMN005 R_Date, CQ.COLUMN006 Ser,
    CQ.COLUMN008 R_Nature, CQ.COLUMN011 E_Date,
    mt26.COLUMN003 W_Code, mt26.COLUMN004 W_Name,
    mt17.COLUMN005 V_Code, mt17.COLUMN006 V_Name,
    mt19.column002 I_Code, mt19.column003 I_Name, mt19.COLUMN0001 R_I_No,
    mt92.COLUMN001 B_Id, mt92.COLUMN005 B_No,
    CASE mt92.COLUMN006
        WHEN '0' THEN 'Ser'
        WHEN '1' THEN 'Un-Ser'
        WHEN '2' THEN 'Ret'
        WHEN '3' THEN 'Retd'
        WHEN '4' THEN 'Rep'
        WHEN '5' THEN 'Repd'
        WHEN '6' THEN 'Con'
        WHEN '7' THEN 'Cond'
        ELSE mt92.COLUMN006
    END S_C_Type,
    mt20.COLUMN003 T_G_Code, mt20.COLUMN004 T_G_Name,
    V.U_Code, V.U_Name,
    mt19.column005 I_Quantity,
    mt20.COLUMN003 T_Code, mt20.COLUMN004 T_Name,
    mt59.COLUMN005 T_Price,
    VR.code C_L_Code, VR.Name C_L_Name
FROM tab90 CQ
INNER JOIN tab91 mt19 ON mt19.COLUMN002 = CQ.COLUMN001
LEFT JOIN tab92 mt92 ON mt92.COLUMN002 = CQ.COLUMN001
LEFT JOIN tab93 mt93 ON mt93.COLUMN004 = CQ.COLUMN001
INNER JOIN tab12 mt12 ON mt12.COLUMN001 = CQ.COLUMN003
LEFT JOIN tab26 mt26 ON mt26.COLUMN001 = CQ.COLUMN009
LEFT JOIN tab20 mt20 ON mt20.COLUMN001 = mt93.COLUMN005
LEFT JOIN tab59 mt59 ON mt59.COLUMN002=mt20.COLUMN001
LEFT JOIN tab17 mt17 ON mt17.COLUMN001 = CQ.COLUMN010
INNER JOIN VM V ON V.UOM_ID = mt19.COLUMN004
INNER JOIN tab19 mt19 ON mt19.COLUMN001 = mt19.COLUMN003
INNER JOIN vOrg VR ON CQ.COLUMN007 = VR.Org_Id
INNER JOIN vOr VB ON CQ.COLUMN002 = VB.Org_Id
INNER JOIN vOr VC ON VB.Top_Parent = VC.Org_Id
WHERE CQ.COLUMN005 Between '02/01/2007' and '08/25/2008'
And VC.O_Id in ('fb243e92-ee74-4278-a2fe-8395214ed54b')
I have recently taken up performance optimization work for our database. Can anyone suggest a really good source of articles, tutorials, guides, etc. on performance optimization for SQL Server 2005?
LATEST column value changes for Row 1 since there is a repetition of value 124, meaning this row is no longer the latest.
The NEW column value changes for Row 2 since it is no longer new; we already have an occurrence of 124 in the first row.
I'm not sure if I can solve this query using any option other than a cursor. It would be like taking the first row, comparing it with all the other rows, and then moving on.
Please suggest a better approach for doing this, if there is one.
Hi, I have search criteria that use the LIKE operator to get records. This is a very simple search, just using LIKE on two columns of the table.
Example: SELECT ID from tblName WHERE ID LIKE '%PID_01%' AND LID LIKE '%CR_03%'
This works fine, and performance is good when we have hundreds or thousands of records. But when the records number in the lakhs, I feel that using LIKE will reduce the performance of our search query.
So how can we get good performance in search? I need to optimize my search so that it performs well when we have lakhs of records.