SQL Server 2008 :: MAXDOP And Cost Threshold For Parallelism Settings?
Jul 2, 2015
Referencing an article regarding MAXDOP and cost threshold for parallelism from Brent Ozar's website: [URL] .....
We have 2 physical CPUs with 4 cores each and hyper-threading enabled. In Task Manager, under the Performance tab, I see 16 CPU threads. We have set MAXDOP to 4.
Reading further, cost threshold for parallelism setting is recommended at 50 to start with.
Am I right in thinking that if the estimated subtree cost is higher than the cost threshold for parallelism then it will use a parallel plan? If so, I've read the cost threshold is measured in minutes, but is the subtree cost measured in something else, the mysterious cost number? And if so, how are the two compared?
I have got a question on max degree of parallelism and CPU cores.
If max degree of parallelism = 1, this signifies that SQL Server will use a serial execution plan (unless you change it at the query level with a MAXDOP hint). With a serial plan, will the query use all CPU cores (say my server has 16 cores)?
If only one thread works in a serial execution plan, then what are the other threads doing? Are they idle? (I may have max worker threads defined as 32767, by default.)
I am unable to work out the relationship between these parameters.
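As a starting point for the questions above, here is a minimal sketch of how the two server settings can be inspected and changed. The values 50 and 4 are simply the ones mentioned in the post, not a recommendation. As far as I know, the threshold is compared against the optimizer's estimated cost for the serial plan, so both numbers are in the same abstract cost units rather than wall-clock time.

```sql
-- Show the configured and running values of both settings.
SELECT name, value, value_in_use
FROM   sys.configurations
WHERE  name IN (N'max degree of parallelism',
                N'cost threshold for parallelism');

-- Both are advanced options, so they must be made visible first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Example values taken from the post above; adjust for your own workload.
EXEC sp_configure 'cost threshold for parallelism', 50;
EXEC sp_configure 'max degree of parallelism', 4;
RECONFIGURE;
```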
Script to Reverse Engineer / Script out your EXISTING database mail settings?
I set up a profile to use Gmail, and it seems logical for me to export the settings to a script, then run the script on my laptop, other servers, etc.
There's no built-in option, so I figured I'd ping the forum before I do it myself.
There are example scripts where you fill in the blanks, and examples of how to set up dbmail, but I did not find anything that scripts out existing settings.
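A possible first step, sketched under the assumption that the standard Database Mail tables in msdb are readable by your login: list the existing profiles, accounts and SMTP servers in one result set. It only reads the settings; turning each row into sysmail_add_account_sp / sysmail_add_profile_sp calls is left out, and account passwords are stored as credentials, so they would not be recoverable this way.

```sql
-- Read-only listing of existing Database Mail settings from msdb.
SELECT  p.name              AS profile_name,
        pa.sequence_number  AS account_priority,
        a.name              AS account_name,
        a.email_address,
        a.display_name,
        a.replyto_address,
        s.servername        AS smtp_server,
        s.port,
        s.enable_ssl,
        s.username
FROM    msdb.dbo.sysmail_profile        AS p
JOIN    msdb.dbo.sysmail_profileaccount AS pa ON pa.profile_id = p.profile_id
JOIN    msdb.dbo.sysmail_account        AS a  ON a.account_id  = pa.account_id
JOIN    msdb.dbo.sysmail_server         AS s  ON s.account_id  = a.account_id
ORDER BY p.name, pa.sequence_number;
```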
I need to build a T-SQL query to return the last unit cost from my table of goods movements, SL (in a CTE), but MAX(datalc) must be less than or equal to the date on my invoice header.
This is my script:
With MaxDates as ( SELECT ref, MAX(epcpond)[Unitcostprice], MAX(datalc) MaxDate FROM sl
The problem I have right now is that the Unitcostprice from my table of goods movements has a max date greater than the date of my invoice.
invoice date: 29.01.2015, unit cost on invoice line = 13,599722
MaxDate (CTE): 19.03.2015, unit cost from my table of goods movements = 14,075
That's not correct, because MaxDate > invoice date and the unit cost of 14,075 is the cost on 19.03.2015, not the cost just before my invoice date.
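One way to get the cost "as of" the invoice date is to look up, per invoice line, the single most recent SL row dated on or before the invoice. The sketch below assumes that approach. Only sl, ref, epcpond and datalc come from the post; the invoice tables and their columns are hypothetical stand-ins. Note that it takes the cost from the latest qualifying row rather than MAX(epcpond), which would return the highest cost instead of the last one.

```sql
-- sl(ref, epcpond, datalc) is from the post.
-- dbo.invoice_header / dbo.invoice_line and their columns are hypothetical.
SELECT  il.ref,
        h.invoice_date,
        lastcost.epcpond AS Unitcostprice,
        lastcost.datalc  AS MaxDate
FROM    dbo.invoice_header AS h
JOIN    dbo.invoice_line   AS il ON il.invoice_id = h.invoice_id
OUTER APPLY (
        -- Latest goods movement for this item not later than the invoice date.
        SELECT TOP (1) s.epcpond, s.datalc
        FROM   dbo.sl AS s
        WHERE  s.ref = il.ref
          AND  s.datalc <= h.invoice_date
        ORDER BY s.datalc DESC
) AS lastcost;
```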
I have written ETL software that runs on SQL Server. We are running it for the first time on a 4cpu (2 x dual core) machine on sql server 2005.
One of the things this software does is perform a 'select * from tablename' to validate that the tables passed to it as parameters exist. This has worked fine on previous releases and on single cpu machines because what the optimiser decides to do is to return just the first page of data and then fetch more. I guess it even works in 2005 standard edition.
However, 2005 enterprise edition allows parallelism. And what the optimiser is deciding to do with such a query is to parallelise it and fetch all rows and then give the result back to the program. So, instead of seeing a fraction of a second to return the first page of data we are seeing up to 90 seconds and the database goes and fetches 15M rows in parallel.
Obviously, what we would like to do is to somehow tell the optimiser that this set of programs should not perform any parallel queries. Or, we would like to turn parallelism off on the specific tables we are dealing with for the period of running these ETL programs....they have no need of parallel processing at the database level for virtually all the calls that are performed.
Would someone please be so kind as to advise us if we can do something like pass a parameter to ODBC to stop parallelism or if we can issue commands against specific tables to stop parallelism for a period and then turn it back on?
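If the only purpose of the 'select * from tablename' is to confirm that the table exists, one option is to ask the catalog instead of reading rows, so that no scan (parallel or otherwise) is involved. This is a sketch of that alternative, not a way to switch parallelism off; the table name is a placeholder.

```sql
-- Placeholder name; in the ETL program this would be the parameter value.
DECLARE @table_name sysname;
SET @table_name = N'dbo.tablename';

-- OBJECT_ID returns NULL when no user table ('U') with that name exists.
IF OBJECT_ID(@table_name, N'U') IS NULL
    RAISERROR(N'Table %s does not exist.', 16, 1, @table_name);
```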
When you execute the query below with MAXDOP limited to 1 (serial execution), the CPU_Utilized_in_Seconds reported by sys.dm_exec_query_stats is accurate: it will nearly match your wall clock execution time.
>> in theory sys.dm_exec_query_stats always works:
--example provided by www.sqlworkshops.com
--reset cache to collect fresh set of statistics
dbcc freeproccache
go
--execute a sample query serially that takes x amount of seconds
select max(t1.c2 + t2.c2) from tab7 t1 cross join tab7 t2 option (maxdop 1)
go
--now query sys.dm_exec_query_stats to find CPU Utilized by the above query
select (total_worker_time * 1.0) / 1000000 as CPU_Utilized_in_Seconds, *
from sys.dm_exec_query_stats cross apply sys.dm_exec_sql_text(sql_handle)
where text like '%select max(t1.c2 + t2.c2) from tab7 t1 cross join tab7 t2%'
and text not like '%sys.dm_exec_query_stats%' --to eliminate our probe
go
>> CPU_Utilized_in_Seconds will be around 6 to 18 seconds based on your CPU speed - which is what you expect
But when you execute the query without limiting MAXDOP to 1, say with 0 (parallel execution), the CPU_Utilized_in_Seconds reported by sys.dm_exec_query_stats is inaccurate: it will not match your wall clock execution time.
>> in practice sys.dm_exec_query_stats does not always work:
--example provided by www.sqlworkshops.com
--reset cache to collect fresh set of statistics
dbcc freeproccache
go
--execute a sample query in parallel that takes x amount of seconds
select max(t1.c2 + t2.c2) from tab7 t1 cross join tab7 t2
go
--now query sys.dm_exec_query_stats to find CPU Utilized by the above query
select (total_worker_time * 1.0) / 1000000 as CPU_Utilized_in_Seconds, *
from sys.dm_exec_query_stats cross apply sys.dm_exec_sql_text(sql_handle)
where text like '%select max(t1.c2 + t2.c2) from tab7 t1 cross join tab7 t2%'
and text not like '%sys.dm_exec_query_stats%' --to eliminate our probe
go
>> CPU_Utilized_in_Seconds will be around 0.00xxxx seconds - which you do not expect!
You can read my full article at www.sqlworkshops.com/dm_exec_query_stats.htm
sqlworkshops www.sqlworkshops.com
Usually CPU-intensive queries execute in parallel. Most customers use the default configuration, where 'max degree of parallelism' is set to '0', so it is common for CPU-intensive queries to execute in parallel.
A customer tells you they have high CPU utilization on their server and asks you to identify the issue. Without knowing that sys.dm_exec_query_stats reports incorrect CPU utilization when a query executes in parallel, you might query sys.dm_exec_query_stats and tell your customer that there is no query that is CPU intensive. Sooner or later the customer might find the query that you failed to point out.
Now you see the theoretical explanation and practical usage!!
We are just finishing our migration to SQL 2012. In our old environment, the instance which held our SharePoint databases also served other applications. We did not experience any performance related issues in the past due to this.
SharePoint basically requires MAXDOP to be 1, which is how the old server is configured. Since this configuration may not be ideal for other applications that may be put within our environment, we are entertaining the idea of isolating SharePoint into its own instance, probably on the same box.
My manager wants me to come up with performance trace data to better prove that we need to go this route since we apparently have had issues in the past by blindly following Microsoft's best practices.
1. MAXDOP configuration - I understand this may be a two-pronged approach that would require looking at various execution plans and CPU-related counters in Perfmon. SharePoint likely requires a MAXDOP of 1 due to the nature of the application (lots of concurrent processes). What is the best way to show this need graphically?
2. Memory configuration for multiple instances - Does the Total Server Memory reveal all the memory that a given SQL instance is utilizing? Should I use this counter to identify appropriate min/max memory configurations for multiple instances on a single cluster?
The problem with the perfmon approach is that its scope is limited to just the server. Since our SharePoint environment is currently being shared with other applications, I understand that I may have to utilize DMV statistics to narrow down my analysis.
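For the memory question in point 2, the same counters Perfmon shows can also be read per instance from a DMV. This is a rough sketch; whether Total Server Memory covers every allocation the instance makes depends on the SQL Server version, so treat the figure as a lower bound when sizing min/max memory for several instances.

```sql
-- Run on each instance; values are reported in KB and converted to MB here.
SELECT  RTRIM(counter_name) AS counter_name,
        cntr_value / 1024   AS value_mb
FROM    sys.dm_os_performance_counters
WHERE   object_name LIKE N'%Memory Manager%'
  AND   counter_name IN (N'Total Server Memory (KB)',
                         N'Target Server Memory (KB)');
```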
I am collecting cost and other information for my manager who is willing to set up an "in house" web dev dept. Will SQL server express always be free? What does it lack from the regular version of SQL server? How would it compare with other DB's available for web use. This isn't a retooled Access thing, is it?
I am trying to find it; the only thing the MS site shows is enterprise licensing. Is the developer edition still going to be like $100 or whatever it was?
Is there any threshold manager in MS SQL Server, like for the transaction log, where you can add a stored proc that will dump the tran log every time a threshold is hit?
I have a table with 188376 rows; the data size = 3012 KB and index size = 5884 KB. LE threshold max is set to 2000 and LE threshold percent to 20%. I have an index on that table and observed that it is not getting used. I would like to know whether the SQL optimizer uses the index based on the cost of the query plan, or does the table scan once the LE threshold limit is reached, overriding the optimized plan.
Guys, I am really stuck on this one. Any help or suggestions would be appreciated.
We have a large table which seemed to just hit some kind of threshold. The query is somewhat responsive when there are NO indexes on the table. However, when we index email the query takes forever.
FACTS
- The problem is very "data specific". I cannot recreate the problem using different data.
- There is only a problem when I index email on the base table.
- The problem goes away when I add "AND b.email IS NOT NULL" to the inner join condition. It does not help when I add the logic to the "WHERE" clause.
DDL
CREATE TABLE base (bk char(25), email varchar(100))
create clustered index icx on base(bk)
create index ix_email on base(email)
CREATE TABLE filter (bk char(25), email varchar(100))
create clustered index icx on filter (bk)
create index ix_email on filter (email)
Query
SELECT b.bk, b.email
FROM base b WITH(NOLOCK)
INNER JOIN filter f ON f.email = b.email
--and f.email is not null
Data Profile
--35120500, 35120491, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
--16796199, 16796192, 14221553
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM base
WHERE email IS NOT NULL
--250552, 250552, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
--250208, 250208, 250205
SELECT COUNT(*) ,COUNT(DISTINCT bk), COUNT(DISTINCT email)
FROM filter
WHERE email IS NOT NULL
Hi all
I have a large data set of points situated in 3d space. I have a simple primary key and an x, y and z value.
What I would like is an efficient method for finding the group of points within a threshold.
So far I have tested the following however it is very slow.
---------------
select *
from locations a full outer join locations b
on a.ID < b.ID and a.X-b.X<2 and a.Y-b.Y<2 and a.Z-b.Z<2
where a.ID is not null and b.ID is not null
---------------
If anyone knows of a more efficient method to arrive at this result it would be most appreciated.
Thanks in advance
Bevan
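A possible rewrite, assuming "within a threshold" means the two points differ by less than 2 on every axis in either direction (the posted condition a.X-b.X<2 is also true for any large negative difference). The full outer join plus the two IS NOT NULL filters behaves like an inner join, so it is written as one here. The range predicates are left bare on b.X, b.Y and b.Z so that an index on those columns (a hypothetical one; the post does not mention any) could be used.

```sql
-- Pairs of points whose coordinates differ by less than 2 on each axis.
SELECT  a.ID AS id_a,
        b.ID AS id_b
FROM    locations AS a
JOIN    locations AS b
  ON    a.ID < b.ID                          -- each pair reported once
 AND    b.X > a.X - 2 AND b.X < a.X + 2
 AND    b.Y > a.Y - 2 AND b.Y < a.Y + 2
 AND    b.Z > a.Z - 2 AND b.Z < a.Z + 2;
```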
I am doing workload analysis on SSAS Tabular (2012). I have perfmon logs captured and want to run them through PAL. I am looking for a threshold file for SSAS Tabular 2012/2014.
The below stored procedure is used to create a vertical benchmark line on the X-Axis which has a hour scale. I use the stored procedure to find out which temperature crosses or equals the threshold temperature (340), then plot the vertical benchmark line at the hour the first temperature is equal to or greater than 340 degrees and less than 1000 degrees.
The logic below works if the temperature is equal to or greater than 340 degrees and less than 1000 degrees. THE ISSUE is that I have 8 temperatures, and if none of them crosses the threshold of 340 degrees I need to set a default value for my vertical line. In other words, if the highest temperature is 180 and my threshold is 340, then set my vertical line at the highest temperature, the one closest to 340.
I tried removing my Where clause (but then it breaks the logic for those temperatures that are equal to or greater than 340). I tried using Case When but this didn't give me what I want either. I tried UNION as well. All giving me results I don't want.
Here is what I am looking for:
This first example is one where there was a temperature that was equal to or greater than the threshold of 340 degrees. This is CORRECT
If 8 temperatures did not equal or cross the threshold then give me the hour of the highest temperature close to the threshold but do not return 0.
For Example:
temp1 92
temp2 108
temp3 0
temp4 284 <<< this is the closest to the threshold so give me the hour when this occurred.
temp5 2192 *Remember I can only count temperatures less than 1000 degrees. Anything above 1000 degrees means there is nothing in the oven, so it is a false positive.
temp6 102
temp7 0
temp8 12
Code: CREATE PROCEDURE [dbo].[AgeScoreCardThreshold_JJ_12232013] -- Add the parameters for the stored procedure here @LicenseNumber int = NULL, @Lot varchar(50) = NULL
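The fallback rule described above can be expressed as a single ordered TOP (1) rather than a WHERE clause plus a UNION. This is a minimal sketch with a hypothetical table variable of (hour, temperature) readings, since the real procedure body is not shown; the sample temperatures are the eight values from the example.

```sql
DECLARE @threshold int;
SET @threshold = 340;

-- Hypothetical shape for the readings; hours 1..8 are made up for illustration.
DECLARE @readings TABLE (reading_hour int, temp int);
INSERT INTO @readings (reading_hour, temp)
SELECT 1, 92   UNION ALL SELECT 2, 108 UNION ALL
SELECT 3, 0    UNION ALL SELECT 4, 284 UNION ALL
SELECT 5, 2192 UNION ALL SELECT 6, 102 UNION ALL
SELECT 7, 0    UNION ALL SELECT 8, 12;

SELECT TOP (1) reading_hour, temp
FROM   @readings
WHERE  temp > 0 AND temp < 1000            -- ignore zeros and empty-oven readings
ORDER BY
       CASE WHEN temp >= @threshold THEN 0 ELSE 1 END,       -- crossings first
       CASE WHEN temp >= @threshold THEN reading_hour END,   -- earliest crossing
       temp DESC;                           -- otherwise the highest temperature
```

With the sample values no reading crosses 340, so the row for temp4 (284, hour 4) is returned.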
According to BOL you can configure an Alert to notify you when the blocked process threshold has been exceeded:
SQL Server 2005 Books Online
blocked process threshold Option
Use the blocked process threshold option to specify the threshold, in seconds, at which blocked process reports are generated. The threshold can be set from 0 to 86,400. By default, no blocked process reports are produced. This event is not generated for system tasks or for tasks that are waiting on resources that do not generate detectable deadlocks. For more information about deadlock detection, see Detecting and Ending Deadlocks.
You can define an alert to be executed when this event is generated. So for example, you can choose to page the administrator to take appropriate action to handle the blocking situation.
Can someone provide some direction on exactly how this is done? Does it require a Service Broker and queue?
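A rough sketch of one way this can be wired up, assuming a default instance (hence MSSQLSERVER in the namespace) and a 20-second threshold chosen only as an example. The alert is a SQL Server Agent WMI event alert; as far as I know WMI event alerts rely on Service Broker being enabled in msdb, but you do not create a queue yourself. Attaching an operator or job to the alert is not shown.

```sql
-- Turn on blocked process reports (advanced option; value is in seconds).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'blocked process threshold', 20;   -- example value
RECONFIGURE;

-- Agent alert that fires whenever a blocked process report event is raised.
-- The alert name is arbitrary; the namespace assumes a default instance.
EXEC msdb.dbo.sp_add_alert
     @name          = N'Blocked process report',
     @wmi_namespace = N'\\.\root\Microsoft\SqlServer\ServerEvents\MSSQLSERVER',
     @wmi_query     = N'SELECT * FROM BLOCKED_PROCESS_REPORT';
```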
I know you can change the max degree parallelism server wide, but can you do it on the fly for one query? I know... trust the query processor but when I turn it off for this one sp, my query goes from 3 seconds to 0 and I got this ex-MS guy in here telling me there is a way, but he does not remember how.
I want him to simplify the sp or have his project's DBA do it, and I even offered to take a hack but.... you know.
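For a single statement, the MAXDOP query hint overrides the server-wide setting for that statement only. A small sketch against a catalog view, so it runs anywhere; inside the stored procedure the OPTION clause would go at the end of the slow statement.

```sql
-- OPTION (MAXDOP 1) forces a serial plan for this statement only.
SELECT  o.type_desc,
        COUNT(*) AS object_count
FROM    sys.objects AS o
GROUP BY o.type_desc
ORDER BY object_count DESC
OPTION (MAXDOP 1);
```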
Does anyone know about SQL Server's parallelism? A query without parallelism takes much less time than the one with parallelism; in my case it's 6 times faster without parallelism. If that's true, what do we need parallelism for? Any ideas? Thanks
I have a function that returns a table of information about residential properties. The main input is a property type and a location in grid coordinates. Because I want to get only a certain number of properties, ordered by distance from the location, I get the properties from a cursor ordered by distance, and stop when the number is reached. (Not really possible to determine the distance analytically in advance.) The cursor also involves joins to a table of grid coordinates vs. postcodes (the properties are identified mainly by postcode), and to a table that maps the input property type into what types to search for.
Opening the cursor typically results in the creation of six to eight parallel threads, and takes approx 1 second, which is about half of the total time for the function.
Recently the main property table grew from 4 million to 6.5 million records, and suddenly the parallelism is lost. Taking the identical code and executing it as a script gives parallelism. Turning it into a SP that inserts into a #temp table and then selects * from that table as the last statement also gives parallelism. But when it's in the form of a function, there is only one thread -- and the execution time has gone from ~2 sec to ~8 sec. I updated the statistics on the table, but still no parallelism.
I could turn it into a SP easily enough, but that would involve a change to the C++ program that calls it, which takes a while to get through the pipeline. In the meantime, is there some way to induce the optimizer to use parallelism? It used to.
Hi,
I've set 'max degree of parallelism' to 1 because some SQL requests hung. Now when I connect, how can I set the parallelism to 4 for a session? Is there a command like this: 'alter session set max degree of parallelism 4'?
Thanks
Paul
If SQL Server is designed for multi processor systems, how can running a query in parallel make such a dramatic difference to performance?
We have a reasonably simple query which brings in data from a few non-complex views. If we run it on our 2x2.4Ghz Xeon server it takes 6 minutes plus to run. If we run this on the same server with OPTION(MAXDOP 1) at the end of the same query it takes less than a second.
Examining the execution plan, the only difference I have been able to see is that parallelism is taking up 96% of the run time when using two processors. This drops when using the one so a sort takes up the vast majority of the time for the query to run.
OK, so running in parallel should mean that it's run in various parts and then 'joined up' later for performance gains, but how can it get it so wrong (timewise)?
If this is the case, will I see a significant difference changing our server to use a single processor, which seems completely the wrong approach (or should I do this on each query in each app - eek)?
Do we have a problem that we don't know about that causes it to take this long?
What can we do? Ideally, using both processors would seem to be preferable.
I would just like to confirm something with you guys...
Am I correct in saying that you don't need multiple connections to the same DB in an SSIS package in order to achieve parallel processing across multiple SQL tasks? In other words, I have 2 SQL tasks executing different stored procedures on the same DB that I want to run in parallel. They should be able to share one connection and still process in parallel, correct?
With that in mind, would the processing be faster if they each had their own connection?
Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64) Jun 28 2012 08:36:30 Copyright (c) Microsoft Corporation Express Edition with Advanced Services (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1) (Hypervisor)
This is just a UAT server, with the OS and hardware details below:
OS: Windows Server 2008 R2 Standard
SP: SP1
Processor: Intel(R) Xeon(R) CPU X5650 @2.67GHz 2.66 GHz
RAM: 4 GB
Bit: 64 bit
I want to set a value for max degree of parallelism. What value should I configure?
Below is the snap property of SQL instance >> Processor
My site works fine in VWD2008 express, but I get this error when I try to use it on my live website. An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections. According to this article: http://support.microsoft.com/kb/914277 I am supposed to:
1. Click Start, point to Programs, point to Microsoft SQL Server 2005, point to Configuration Tools, and then click SQL Server Surface Area Configuration.
OK, there is no such program in this folder. The only thing in there is "SQL Server Error and Usage Reporting"...
The other thing I am greatly concerned with is this: all I want is for my webpages to be able to access my database for user authentication. I DO NOT want to grant the internet rights to connect remotely to my database.
I have a table that is made up of medical, mental health and pharmacy claims. I would like to query it to find instances where the sum of the three claim types is greater than a predetermined threshold.
For example: Patient 1 Medical = 10,000 (could be 10 records at 1,000 each) Patient 1 Mental Health = 5,000 Patient 1 Pharmacy = 15,000 Patient 2 Medical = 1,000 Patient 2 Mental Health = 0 Patient 2 Pharmacy = 500
Threshold is 25,000
If I queried the above sample table I would get one record: Patient 1 30,000 - because 10,000+5,000+15,000 = 30,000 and is greater than the threshold.
I am not sure that a having clause would work though.
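A HAVING clause does fit this case, because the filter applies to the per-patient total rather than to individual rows. Here is a minimal sketch using a hypothetical table variable loaded with the sample figures from the post (the 10,000 medical total is entered as a single row for brevity).

```sql
DECLARE @threshold money;
SET @threshold = 25000;

-- Hypothetical layout: one row per claim, tagged with its type.
DECLARE @claims TABLE (patient_id int, claim_type varchar(20), amount money);
INSERT INTO @claims (patient_id, claim_type, amount)
SELECT 1, 'Medical',       10000 UNION ALL
SELECT 1, 'Mental Health',  5000 UNION ALL
SELECT 1, 'Pharmacy',      15000 UNION ALL
SELECT 2, 'Medical',        1000 UNION ALL
SELECT 2, 'Mental Health',     0 UNION ALL
SELECT 2, 'Pharmacy',        500;

-- Total all claim types per patient, then keep only totals above the threshold.
SELECT  patient_id,
        SUM(amount) AS total_claims
FROM    @claims
GROUP BY patient_id
HAVING  SUM(amount) > @threshold;
```

With the sample rows this returns one record: patient 1 with a total of 30,000.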