I have a TableA with around 10 columns with varchar and numeric
datatypes
It has 500 million records and its size is 999999999 KB. i believe it
is kb
i got this data after running sp_spaceused on it. The index_size was
also pretty big in 6 digits.
On looking at the tableA
it didnot have any pks and hence no clustered index.
It had other indices
IX_1 on ColA
IX_2 on ColB
IX_3 on ColC
IX_4 on ColA, ColB and ColC put together.
Queries performed in this table are very slow. I have been asked to
tune up this table.
I know as much info as you.
Data prior to 2004 can be archived into another table. I need to run a
query to find out how many records that is.
I am thinking the following, but dont know if i am correct ?
I need to add a new PK column (which will increase the size of the
tableA) which will add a clustered index.
Right now there are no clustered indices
2. I would like help in understanding should i remove IX_1, IX_2, IX_3
as they are all used in IX_4 anyway .
3. I forget what the textbox is called on the index page. it is set to
0 and can be set from 0 to 100. what would be a good value for it ?
I have a legacy prod db of 102Gb which contains 101Gb of free space. The db comprises 2 Primary (.mdf's) on one drive and over 200 (.ndf's) residing on one logical drive, (4 physical drives with RAID 5). ie. Each user table object has its own (.ndf). Some of these tables have been allocated up to 15Gb.
Limited documentation suggests that performance benefits can be achieved by allocating a maximum size to certain key tables rather than setting the table properties to autogrow.
No mention is made of grouping these files into filegroups to potentially achieve the same performance gains.
Transportability of this db is now a problem since no other test/dev/prod boxes contain enough free space.
I propose to re-create the db and have the user data tables reside within the (.mdf) - possibly even group highly active tables within filegroups.
Are there any benchmark tests or KB articles to indicate/support the original strategy of using multiple (.ndf's) over data tables within one (.mdf)??
I apologize since this seems to be a fundamental question, but I did try to search and there seems to be something wrong - after clicking on the "search" button, the page just will not update... I even tried "advanced search" but no luck there, either.
Anyway, my question is about writing multiple data rows to tables in a SQL Server database. Currently I'm using embedded SQL from a VC++ 2005 .Net program to write data to a SQL 2005 Express server (either on the same PC or on another, via TCP). As data becomes available, I issue INSERT statements, one by one. One row is about a dozen columns, typically 8 bytes each, for each of the few tables in the database. In most cases, I have several rows of data available at the same time. Those rows come at the rate of around 200-1000 per second for the different tables. With that setup, my SQL server is not able to keep up - the data ends up getting buffered in the program, waiting for the server, the server process uses just about all the CPU cycles it can get, etc.
I'm reading the "SQL in a nutshell" book from O'Reilly but it's mostly language oriented, it doesn't say much about performance improvements. It would seem natural that one could improve the performance. For example, when I do a SELECT query, the data comes very quickly, which makes me think that it is the overhead of making those individual calls with small amount of data. I would think that I should be able to request that multiple rows are written at the same time with a single SQL statement, and that this would improve the performance. However, I have no idea how to do that.
I did try one thing - enlisting the sequence of INSERTs into a single transaction, thinking it will all get buffered and only get written to the server after the Commit command is issued. I did that but it doesn't seem to help. What should I try to alleviate this problem? One thing to consider: although I use SQL Server for testing, I am trying to keep compatibility with other databases, e.g., ODBC to any available data source, including MS Access over Jet. I would love it if the solution to this were compatible (i.e., not involve any Transact SQL or other vendor-specific tricks). I thank you in advance for your assistance!
ENVIRONMENT: I have SQL Server6.5 running under a dedicated NT Server. NT configuration includes dual pentium 200Mhz processors, 256MB RAM and RAID system. The Database size is 1GB with actual data size about 500MB.
PROBLEM: I have an application which uses lots of joins to get the results. My select query is running too slow even when I run it on the server.
I updated the statistics and rebuild all the indexes on the tables used by the query.
Any suggestions on using SQL Trace and tuning the server/database are welcome.
a problem occurs when i run a large database, the running speed is very slow.
for example, if i want to seek the record of 1500 employees with 15 per person(average) within 1 year(12 months), that mean i have to find record for 1500 * 15 * 12 time.
so, could i find a way to solve this problem? is this call tune the sql server/index/view?
what different of tune the sql server with index and view?
I have written a query which fetches data from a table with huge amount of data. This query is actually being used in a stored-procedure and it takes a lot of time, hence slows down my SP.
Here is the query that I'm builiding in the stored-procedure: ------------------------------------------------------------- SET @selectLeadsInPeriodString = ' SELECT LM_Dealer, LM_Brand, SUM(LM_ImpressionCount) FROM LM_ImpressionCount_Dealer' IF(@timeCriterion IS NULL OR LTRIM(RTRIM(@timeCriterion)) = '' OR LTRIM(RTRIM(@timeCriterion)) = 'NULL') SET @selectLeadsInPeriodString = @selectLeadsInPeriodString + ' WHERE (CONVERT(varchar,[LM_ImpressionCount_Dealer].[LM_ImpressionDate],102) >= ''' + CONVERT(varchar,@startDateTime,102) + ''' AND CONVERT(varchar,[LM_ImpressionCount_Dealer].[LM_ImpressionDate],102) <= ''' + CONVERT(varchar,@endDateTime,102) + ''')' ELSE IF(@timeCriterion = 'CurrentMonth') SET @selectLeadsInPeriodString = @selectLeadsInPeriodString + ' WHERE MONTH([LM_ImpressionCount_Dealer].[LM_ImpressionDate]) = MONTH(GETDATE())' ELSE IF(@timeCriterion = 'PreviousMonth') SET @selectLeadsInPeriodString = @selectLeadsInPeriodString + ' WHERE MONTH([LM_ImpressionCount_Dealer].[LM_ImpressionDate]) = MONTH(GETDATE()) - 1' ELSE IF(@timeCriterion = 'YearToDate') SET @selectLeadsInPeriodString = @selectLeadsInPeriodString + ' WHERE ([LM_ImpressionCount_Dealer].[LM_ImpressionDate] >= cast((''1/1/''+cast(year(getdate()) AS varchar(4))) AS datetime) AND CONVERT(varchar,[LM_ImpressionCount_Dealer].[LM_ImpressionDate],102) <= CONVERT(varchar,GETDATE(),102))'
IF(@brand IS NOT NULL AND LTRIM(RTRIM(@brand)) <> '') SET @selectLeadsInPeriodString = @selectLeadsInPeriodString + ' AND [LM_ImpressionCount_Dealer].[LM_Brand] = ''' + @brand + ''''
SET @selectLeadsInPeriodString = @selectLeadsInPeriodString + ' GROUP BY LM_Dealer, LM_Brand'
The variables used in the query formation above are passed as input parameters to the SP. The table being queried has columns 'LM_Dealer', 'LM_Brand', 'LM_ImpressionCount' and 'LM_ImpressionDate'. Also, the table has a non-clustered index on the column LM_ImpressionDate.
With all this information, can anyone suggest as to how I can optimize the query above.
I have SQL desktop version installed and for the last few days it has really slowed down. I have ran many anti-virus etc. and all is okay on that front.
Any tips regarding how I can tune things up? What should I look for and how do I go about it. Deleting LOG etc. etc???
If anyone is able to provide advice for tuning the below query in sqlserver, it is much appreciated. In addition any index suggestions arealso appreciated as I have access to the tables. Thank you.select a.id, isnull(b.advisement_satisfaction_yes, 0) asadvisement_satisfaction_yes,isnull(c.advisement_satisfaction_no, 0) as advisement_satisfaction_no,casewhen isnull(b.advisement_satisfaction_yes, 0) >isnull(c.advisement_satisfaction_no, 0) then 'YES'when isnull(b.advisement_satisfaction_yes, 0) <isnull(c.advisement_satisfaction_no, 0) then 'NO'when isnull(b.advisement_satisfaction_yes, 0) =isnull(c.advisement_satisfaction_no, 0) then 'TIE'end as Satisfied_With_Advisementfrom aleft Join(select id, count(answer_text) as Advisement_Satisfaction_yes from awhere question = 'The level of Academic Advisement I received from theUniversity staff during this course was appropriate.'and answer_text = 'yes'GROUP BY id) bon a.id = b.idleft join(select id, count(answer_text) as Advisement_Satisfaction_NO from awhere question = 'The level of Academic Advisement I received from theUniversity staff during this course was appropriate.'and answer_text = 'NO'GROUP BY id) con a.id = b.idwhere question = 'The level of Academic Advisement I received from theUniversity staff during this course was appropriate.'
How can l trim the code and make the procedure run faster ???????????????
CREATE Procedure Disbursements_Cats (@startdate datetime,@enddate datetime) As Begin SELECT Transaction_Record.loan_No AS loan_no, Transaction_Record.transaction_Date AS Transaction_Date, Transaction_Record.transaction_type AS Transaction_type, Transaction_Record.transaction_Amount AS Transaction_Amount,
Product.product AS Product, Product_Type.product_Type AS product_type, Product_Type.loan_Type AS Loan_type,
Customer.first_name AS first_name, Customer.initials AS initials, Customer.second_name AS Second_Name, Customer.surname AS surname, Customer_identification.idno AS ID_No,
Bank.Bank_name AS Bank_Name, Bank_detail.Account_no AS Account, Bank_detail.Branch AS Branch
FROM Transaction_Record CROSS JOIN Bank_detail CROSS JOIN Bank CROSS JOIN Customer CROSS JOIN Product CROSS JOIN Loan_Type CROSS JOIN Product_Type CROSS JOIN Customer_identification End; GO
hi,I have a problem asked by one of my senior person and finding theanswer .What is the step by step procedure for tune a large sql query.OR how do we tune a large SQL query with somany joins
HELP!!!I am trying to fine tune or rewrite my SELECT statement which has acombination of SUM and CASE statements. The values are accurate, butthe query is slow.BUSINESS RULE=============1. Add up Count1 when FIELD_1 has a value and FIELD_2 is NULL, or bothhave a value.2. Add up Count2 when FIELD_2 has a value and FIELD_1 is NULL.4. TotalCount = Count1 + Count2 -- (Below, basically had to reuse theSQL from both Count1 and Count2)3. Add a NoneCount when both FIELD_1 and FIELD_2 are NULL.SQL Code========SELECTSUM(CASEWHEN ((FIELD_1 IS NOT NULL AND FIELD_2 IS NULL) OR (FIELD_1 IS NOTNULL AND FIELD_2 IS NOT NULL))THEN 1ELSE 0END) AS Count1 ,SUM(CASEWHEN (FIELD_1 IS NULL AND FIELD_2 IS NOT NULL)THEN 1ELSE 0END) AS Count2,SUM(CASEWHEN (FIELD_1 IS NULL AND FIELD_2 IS NOT NULL)THEN 1ELSE (CASE WHEN ((FIELD_1 IS NOT NULL AND FIELD_2 IS NULL) OR FIELD_1IS NOT NULL AND FIELD_2 IS NOT NULL) THEN 1 ELSE 0 END)END) AS Total_Count,SUM(CASEWHEN ( FIELD_1 IS NULL AND FIELD_2 IS NULL)THEN 1ELSE 0END) AS None_Count,FROMTABLE_1
I've create a bunch of views to expose a logical model of the underlying database of an application server.
To enforce the security control, I've also created a CLR UDF to call the application server's API for security check and audit log.
For example, we have a table, tblSecret, and the view, vwSecret, is,
SELECT
Id, ParentId, Description, SecretData FROM tblSecret WHERE udfExpensiveApiCall(Id) = 1
The udfExpensiveApiCall will return 1 if the current user is allowed to access the SecretData else 0. The CLR UDF call is very expensive in terms of execution time and resources required.
Currently, there are millions rows in the tblSecret.
My objective is to tune the view such that when the view is JOINed, the udfExpensiveApiCall will be called the least number of time.
SELECT
ParentId, SecertData FROM vwParent LEFT JOIN vwSecret ON vwSecret.ParentId = vwParent.ParentId WHERE vwParent.StartDate > '1/1/2008'
AND vwSecret.Description LIKE '%WHATEVER%'
Is there any way to specify the execution cost of the CLR UDF, udfExpensiveApiCall, such that the execution plan will call the UDF while it is absolutely necessary?
I want to tune the indexes on my database and I am trying to use the SQL Server Profiler to collect data for the Index tuning wizard to analyze. My question is what do I need to trace with the profiler so that the Index tuning wizard can work? I am looking at the trace properties in Profiler at the Events, Data Columns, and Filters tabs but I have no idea of what I need to capture.
i have a report that runs on a huge table rpt.AgentMeasures , it has 10 months worth of data (150 million records as of today and will keep increasing). i have pasted my proc below , the other tables that are joined to this huge table do not have more then 3k records.This report will be accessed by multiple users (expecting 20 ppl). as of now this reports runs for 5 mins if i pull for 1 month worth of data. if it is wise to use temp tables.
Hi,I have a small theoretical issue.I have one table, which is prettyu large. There is lot of evaluationsrunning on this table, that's why, each process need to wait foranother to be finished. Sometimes, for some critical functions, ittakes to long time.I don't think that I can speed up processes, by changing the indexes onthe tables (to increase scan time for example), because this issomething what I was experimenting with already, and it was not enoughtgood.My question is, will it improve performance, if I will create secondtable, exactly like this one, and I will split some evaluations, thatthe one, which defenately need to run on the source table will run onthe first one, and the second evaluations, will run on the other one.To keep data consistance between this two tables, I was thinking baouttrigger on insert on the mother table, which will transport the data toanother one.Second part is: to improve selects on the table, should I set indexeswith option of Fill factor as possible close to 100% or as possibleclose to 0%. Or maybe should I set the pad index option?What about clustered indexes. Is it better to use them if I would liketo increase performace for selects?Thanks in advanceMateusz
I've read that table variables give better performance than temporary tables as they are kept in memory and don't need to be recorded in transaction logs ect however I have a stored proceedure which takes 0.183 seconds to execute but when I change the one temporary table used in the proceedure to a table variable the execution time increases to 0.223.
Not much of an increase I admit but it just seems contrary to what I've read.
I want to get the best performance possible so can someone explain to me what is going on ?
I find that joining to the same table twice is OK, but as soon as you do it 3 or more times you get a massive performance hit. Does anyone know the reason for this? Whats special about 3? What's the best approach to do this sort of thing? (I've used the SQL Server 2005 Tuning Advisor to add indexes for the query).
Rather than: Select ..., sum(a1.<column>), sum(a2.<column>), sum(a3.<column>) from master_table left join table_1 a1 on ... left join table_1 a2 on ... left join table_1 a3 on ... group by ...
I have to select all the table and filter it using case: Select ..., sum(case when table_1.<column> = '...') as a1, sum(case when table_1.<column> = '...') as a2, sum(case when table_1.<column> = '...') as a3, from master_table left join table_1 group by ...
I know we are not allowed to benchmark SQL Server but..... It would be nice to have material to present which demonstrates the performance gains using a queue compared to insert/delete in a SQL table.
Logically it seems faster to use a queue due to the conversation grouping locking and the service broker itself. But there seems to be some overhead involved just to manage these queues that the service broker has to perform.
I am sure we are not unique with the choice to figure out if we will get a boost in performance using SQL a queue between services rather than a table to queue data. What is available to help understand the performance gains of using a queue?
Hi, I have a denormalized table (done so with reason) with around 40 columns. I would never have to retrieve data for all of those columns together. I haven't done any performance measurements yet but just wondering if anyone has ready answer to this: Will there be a performance degradation if I retrieve data from a table with many columns, even if not all columns are referred in the query? (for making it simple, lets assume that all or varchar type of columns, I just want to find out if performance degrades if there are too many columns in table)
I will be receiving data from a company called Loan Performance, that has one file/table that will hold 1 billion rows. They send data by period, and I plan to load the data via BCP via NT/DOS scripts. The 1 billion rows represents data for 200+ periods.Are the following design plans feasible1. Partition table by period value, I'm not sure of the max number of partitions per table in 2005, but I think we have periods data back to 1992 and a new one gets created every month, so the possibility of having > 1000 partitions exists. I plan on just pre-creating partitions for future data, instead of dynamically creating when a new period is sent.2. Load data via BCP in DOS shell scripts that will drop index (by partition), BCP in data, and they re-create indexes by partition, is this possible ? and will I see a performance increase as opposed to one huge table (I'm pretty much sure I will). There is usually one periods data present per day, but sometimes the vendor resends all data (would get loaded on weekend).I'm a bit unsure of where to start being I never worked with this amount of data. I worked with partitioning in Oracle a long time ago.I plan on having an 2XQuadCore 2.66Ghz CPU with 32GB of RAM and SQL2005EE 64Bit connected to 1 Terabyte SAN Disk.Thanks all,PMA
I created two tables one is based on partition structure and one is non-partition structure.
File Groups= Jan,Feb.....Dec Partition Functions='20060101','20060201'......'20061201' I am using RIGHT Range in Partition function. Then I defined partition scheme on partition function.
I have more than 7,00,000 data in my database. I checked filegroups and count rows. It works fine.
But When I check the estimation plan time out for query it is same for both partition table and non partition table.
I created two tables one is based on partition structure and one is non-partition structure.
File Groups= Jan,Feb.....Dec Partition Functions='20060101','20060201'......'20061201' I am using RIGHT Range in Partition function. Then I defined partition scheme on partition function.
I have more than 7,00,000 data in my database. I checked filegroups and count rows. It works fine.
But When I check the estimation plan time out for query it is same for both partition table and non partition table.
Hello -- thank you for taking the time to read this.
I have a very large table that is used both for archives and new information. To get the current information, the table is queried by many different users at various polling periods. The SELECT required includes about fifteen JOINS, and only returns about 200 rows at any given time.
So I got to thinking if it might be faster to periodically run the big query as a SELECT INTO into a smaller table and letting the polling clients query the smaller table with SELECT *. Periodically, the smaller table would be DROPPED and refereshed with another SELECT INTO.
Trouble is, the data would have to be updated once every 30 seconds, and there are inbound polls coming at the rate of about 200 per minute. It got me to thinking what might happen if a client attemtped to query the smaller table when it was in the process of being dropped and refilled.
So my question is three-part:
1) assuming a larger table of about 500,000 records and only 500 pertinent at any given time, is there any real potential of performance enhancement by switching to a SELECT INTO table?
2) if so, is there a chance of a client failing a query if the inbound query somehow collides with the DROP/SELECT INTO procedure?
3) if so, is there any way to prevent it or a better way of doing this?
Thanks again for reading, and in advance for any help you can provide. I apologize if I sound like a dummy - it's hard to fake intelligence!
Working with SQL Server 2000, I have a table with the following structure: ID (INT) userID (INT, foriegn key) productID (INT) productQTY (DECIMAL(5,2)) purchaseDate(smalldatetime)
I have about a 1000 users, entering about 20-30 rows per day each, i.e ~20,000 - 30,000 new rows per day. The table might be queried with a simple "SELECT" for the products a user ordered per day or per time frame (purchaseDate column). My question (finally) is - when should I expect to see performance degradation? Is there anything I can do to prevent it (i.e splitting this table somehow to several tables)?
Monthly, I copy a table from one database to another database. I delete the original table and copy the table back speed the performance of the query on the order of 10 to 1. Why does this work?
Detail: I have a legacy table that a small application queries about once a month. The table was poorly designed and the query runs a date range comparison on one field and has a sub query that runs string comparison against six fields. I cannot change the calling app or table design. When the app calls the query, the call times out due to the inordinate length of time. To fix this until next months query, I copy the table out, delete the original and copy back. What changes when the table is copied to another database and then copied back? The performance of the query changes from 10sec to 1.
Hello all,I've following problem. Please forgive me not posting script, but Ithink it won't help anyway.I've a table, which is quite big (over 5 milions records). Now, thistable contains one field (varchar[100]), which contains some data inthe chain.Now, there is a view on this table, to present the data to user. Theproblem is, in this view need to be displayed some data from this onelarge field (using substring function or inline function returningvalue).User in the application, is able to filter and sort threw this fields.Now, the whole situation starts to be more complicated, if I would likecombine this table, with another one, where is one additional much morlarger field, from which I need to select data in the same way.Problem is: it takes TO LONG to select the data according to userrequest (user access view, not table direct)Now the question:- using this substring (as in example) is agood solution, or beter todo a inline function which will return me the part of this dataset(probably there is no difference)- will it be much faster, if i could add some fields in toSource_Table, containing also varchar data, but only this part whichI'm interested in and binde these fields in view instead off usingsubstring function?Small example:CREATE TABLE [dbo].[Source_Table] ([CID] [numeric](18, 0) IDENTITY (1, 1) NOT NULL ,[MSrepl_tran_version] uniqueidentifier ROWGUIDCOL NULL ,[Date_Event] [datetime] NOT NULL ,[mama_id] [varchar] (6) COLLATE Latin1_General_CI_AS NOT NULL ,[mama_type] [varchar] (4) COLLATE Latin1_General_CI_AS NULL ,[tata_id] [varchar] (4) COLLATE Latin1_General_CI_AS NOT NULL ,[tata_type] [varchar] (2) COLLATE Latin1_General_CI_AS NULL ,[loc_id] [nvarchar] (64) COLLATE Latin1_General_CI_AS NOT NULL ,[sn_no] [smallint] NOT NULL ,[tel_type] [smallint] NULL ,[loc_status] [smallint] NULL ,[sq_break] [bit] NULL ,[cmpl_data] [varchar] (100) COLLATE Latin1_General_CI_AS NOT NULL ,[fk_cmpl_erp_data] [numeric](18, 0) NULL ,[erp_dynia] [bigint] NULL) ON [PRIMARY]GOcreate view VIEW_AllDataasselect top 100 percentisnull(substring(RODZ.cmpl_data,27,10),'-') as ASO_NO,(RODZ.mama_type + RODZ.mama_Id) as MAMA,isnull(substring(RODZ.cmpl_data,45,5),'-') as MI,isnull(substring(RODZ.cmpl_data,57,3),'-') as ctl_EC,isnull(substring(RODZ.cmpl_data,60,3),'-') as ctl_IC,RODZ.Date_Event as time_time,RODZ.sn_no as SNFROMSource_Table RODZ with (nolock)goThanks in advanceMateusz
Right now, a client of mine has a T-SQL statement that does thefollowing:1) Create a temp table.2) Populate temp table with data from one table using an INSERTstatement.3) Populate temp table with data from another table using an INSERTstatement.4) SELECT from temp table.Would it be more efficient to simply SELECT from table1 then UNIONtable 2? The simply wants to see the result set and does not need tore-SELECT from the temp table.
We change from SQL Server CE 2.0 to SQL Server CE 3.0, and we got our customer complaining about the performance lost in the process that makes 50 updates in the Pocket PC with windows mobile 5.0.
Our customer says that when they used the SQL Server CE 2.0 the process was quicker.
Process:
We need to do an update to 50 rows every time we lose the focus. Those 50 update are creating a deterioration of performance.
There is any Patch to add to SQL Server CE 3.0? There is any problem updating 50 times on a row losing performance?