I understand that the number of records allowed in a SQL server is limited by storage capacity, but at what point will there be a noticable decline in performance? Assuming each row has, say, 10 fields, and the only operation being performed on the table are selects (no inserts, updates, etc).Will performance begin to drag at 50,000 records? 100,000? 1,000,000? 5,000,000?I'm trying to wonder just how scalable SQL server is, and perhaps explore some alternatives if tremendous amounts of data will bring down performance. Currently using 8.0, not 2005.Thanks in advance.
Is there any way to run a stored procedure in parallel to another one? i.e. I have a stored procedure that sends an email. I then scan a table and send any unsent emails. I do not want the second part to slow the response to the user.
I have another post here regarding SQL 2005 running a query 50% slower than on 2000. It was discovered that 2005 runs the query in series whereas 2000 runs it in parallel.
Even with "Cost Threshold For Parallelism" set to a default value – 0, 2005 still executes my query in series. Does anyone know how to force a query to run in parallel in SQL 2005. I specifically want to set it at the database level.
I have a SQL 7 db with a union query (view), and I'm getting the error, "Thequery processor could not start the necessary thread resources for parallelquery execution." This union query has been in place for about two years nowwith no problems until just now, though I haven't changed anything. Also, Ihave a local copy of the database on my machine, and the query runs fine.As noted, I haven't changed anything in the query, nor in the SQL settings.There is a network administrator, so it's possible that he may have changeda setting, but I don't know what. The query is reproduced below. Any ideasas to what's going on would be appreciated.NeilMain query:SELECT Tmp.INVCUST, Tmp.SDNBR, Tmp.SDBOOK, Tmp.SDIVCLN,Tmp.SDPAID, Tmp.SDPRICE, Tmp.SDCOPIES, Tmp.Location,INVTRY.AUTHILL1, Tmp.INVDATE, INVTRY.SaleSrc,INVTRY.HoldInitFROM (SELECT INVDATE, INVCUST, SDNBR, SDBOOK, SDIVCLN,SDPAID, SDPRICE, SDCOPIES, 'P' AS LocationFROM vwInvoiceDetUNION ALLSELECT INVDATE, INVCUST, SDNBR, SDBOOK, SDIVCLN,SDPAID, SDPRICE, SDCOPIES, 'N' AS LocationFROM vwInvoiceDetNUNION ALLSELECT INVDATE, INVCUST, SDNBR, SDBOOK, SDIVCLN,SDPAID, SDPRICE, SDCOPIES, 'M' AS LocationFROM vwInvoiceDetM) Tmp INNER JOINdbo.INVTRY ON Tmp.SDBOOK = dbo.INVTRY.[Index]vwInvoiceDet:SELECT tabInvoice.INVDATE, tabInvoice.INVCUST,SALEDET.SDNBR, SALEDET.SDBOOK, SALEDET.SDINVNUM,SALEDET.SDPRICE, SALEDET.SDPAID, SALEDET.SDCOPIES,SALEDET.SDIVCLN, tabInvoice.INVNBR, SALEDET.SDIDFROM dbo.tabInvoice INNER JOINdbo.SALEDET ONdbo.tabInvoice.INVNBR = dbo.SALEDET.SDNBR(vwInvoiceDetN and vwInvoiceDetM are similar to vwInvoiceDet.)
Is is possible to get the iterations in a foreach loop to run in parallel? What I need to do is to spawn an arbitrary number of parallel execution paths that all look exactly the same. The number is equal to the number of input files, which varies from time to time. Any help is appreciated!
we're accessing a SQL Server as a source for some SSIS packages using quite complex SQL commands. We have dataflows getting data from up to 10 queries. The problem is that SSIS starts all these queries in parallel using up all the memory of the server (the source SQL server, not the server SSIS is running on). So the queries are very slow. Is there any way to force SSIS to start the queries after each other?
I already browsed the web for some answers on that and I'm not very optimistic... Maybe the only solution is really to feed the result of the query in raw files and process them later...
I have been trying to the query optimizer to generate a parallel execution plan but no matter the MaxDOP (0) or Cost Threshold (5) settings I use it will only execute in serial.
UPDATE [dbo].[Targus_201412_V7_B] SET [URBAN] =( CASE WHEN [METRO_STATUS] = 'Urban' THEN 1 ELSE 0 END)
I've got an SSRS report that is set up using a data-driven subscription to supply input parameters to the stored procedure that is called to generate the report results.
I was wondering if there is any way to specify the execution processing method (running the reports in parallel or serially). The subscription that we have set up appears to be running all of the reports in parallel which is causing massive load on our servers.
I've been working with SQL Server 2005 for a while now and I've noticed some odd behavior that I want to bounce of other members of the community. I should preface that I've been a forum viewer (and occasional contributer) here at SQL Team for a while and I've naturally developed a keen sense for optimizations.
Fundamentally, longer stored procedures with perfectly fine/optimized execution plans are inconsistent with real world performance. In some of these cases, a low subtree cost on a 4 core machine with 16gb of ram and 2 15 drive SAS arrays with little load takes excessively long to run or in some cases doesn't complete.
This isn't due to blocking or resource bottlenecks as I'm quite familiar with built in tools to troubleshoot and resolve those issues. In all cases, I am able to rearchitect the stored procedure into a higher subtree cost variant and get reasonable performance, but it's frustrating to have to redo work and there seems to be no common theme other than longer multi-statement procedures.
I've used SQL Server 2000 extensively and did not notice this level of inconsistency in performance with that product version. Just wondering if others in the community have experiences similar or if I'm just crazy.
I am using SQL2005 EE with SP1. The server OS is windows 2K3 sp2
I have a table-valued function (E.g. findAllCustomer(Name varchar(100), gender varchar(1)) to join some tables and find out the result set base the the input parameters.
I have created indexes for the related joinning tables.
I would like to check the performance of a table-valued function and optimize the indexing columns by the execution plan.
I found the graphic explanation only show 1 icon to represent the function performance. I cannot find any further detail of the function. (E.g. using which index in joinning)
If I change the function to stored procedure, I can know whether the T-SQL is using index seek or table scan. I also found the stored procedure version subtree cost is much grether that the table-valued function
I would like to know any configureation in management studio can give more inform for the function performance?
For example in a Select Statement we have many tables and we have Where Clause with many conditions with AND operations. Do the SQL SERVER would apply the Where clause after all fetch or can dynamically decide about to include the related Tables from Select Statement Orderly with respect to where clause predicates? (SQL SERVER would not fetch data of those tables for its Select, where the AND condition in Where clause fails or by logic would be fruitless/not-related.)
after moving off VS debugger and into management studio to exercise our SQLCLR sp, we notice that the 2nd execution gets an error suggesting that our static SqlCommand object is getting reused from the 1st execution (of the sp under mgt studio). If this is expected behavior, we have no problem limiting our statics to only completely reusable objects but would first like to know if this is expected? Is the fact that debugger doesnt show this behavior also expected?
Has anyone encountered cases in which a proc executed by DTS has the following behavior: 1) underperforms the same proc when executed in DTS as opposed to SQL Server Managemet Studio 2) underperforms an ad-hoc version of the same query (UPDATE) executed in SQL Server Managemet Studio
What could explain this?
Obviously,
All three scenarios are executed against the same database and hit the exact same tables and indices.
Query plans show that one step, a Clustered Index Seek, consumes most of the resources (57%) and for that the estimated rows = 1 and actual rows is 10 of 1000's time higher. (~ 23000).
The DTS execution effectively never finishes even after many hours (10+) The Stored procedure execution will finish in 6 minutes (executed after the update ad-hoc query) The Update ad-hoc query will finish in 2 minutes
Hi I am slowly getting to grips with SQL Server. As a part of this, I have been attempting to work on producing more efficient queries. This post is regarding what appears to be a discrepancy between the SQL Server execution plan and the actual time taken by a query to run. My brief is to produce an attendance system for an education establishment (I presume you know I'm not an A-Level student completing a project :p ). Circa 1.5m rows per annum, testing with ~3m rows currently. College_Year could strictly be inferred from the AttDateTime however it is included as a field because it a part of just about every PK this table is ever likely to be linked to. Indexes are not fully optimised yet. Table:CREATE TABLE [dbo].[AttendanceDets] ([College_Year] [smallint] NOT NULL ,[Group_Code] [char] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,[Student_ID] [char] (8) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,[Session_Date] [datetime] NOT NULL ,[Start_Time] [datetime] NOT NULL ,[Att_Code] [char] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ) ON [PRIMARY]GO CREATE CLUSTERED INDEX [IX_AltPK_Clust_AttendanceDets] ON [dbo].[AttendanceDets]([College_Year], [Group_Code], [Student_ID], [Session_Date], [Att_Code]) ON [PRIMARY]GO CREATE INDEX [All] ON [dbo].[AttendanceDets]([College_Year], [Group_Code], [Student_ID], [Session_Date], [Start_Time], [Att_Code]) ON [PRIMARY]GO CREATE INDEX [IX_AttendanceDets] ON [dbo].[AttendanceDets]([Att_Code]) ON [PRIMARY]GOALL inserts are via an overnight sproc - data comes from a third party system. Group_Code is 12 chars (no more no less), student_ID 8 chars (no more no less). I have created a simple sproc. I am using this as a benchmark against which I am testing my options. I appreciate that this sproc is an inefficient jack of all trades - it has been designed as such so I can compare its performance to more specific sprocs and possibly some dynamic SQL. Sproc:CREATE PROCEDURE [dbo].[CAMsp_Att] @College_Year AS SmallInt,@Student_ID AS VarChar(8) = '________', @Group_Code AS VarChar(12) = '____________', @Start_Date AS DateTime = '1950/01/01', @End_Date as DateTime = '2020/01/01', @Att_Code AS VarChar(1) = '_' AS IF @Start_Date = '1950/01/01'SET @Start_Date = CAST(CAST(@College_Year AS Char(4)) + '/08/31' AS DateTime) IF @End_Date = '2020/01/01'SET @End_Date = CAST(CAST(@College_Year +1 AS Char(4)) + '/07/31' AS DateTime) SELECT College_Year, Group_Code, Student_ID, Session_Date, Start_Time, Att_Code FROM dbo.AttendanceDets WHERE College_Year = @College_YearAND Group_Code LIKE @Group_CodeAND Student_ID LIKE @Student_IDAND Session_Date <= @End_DateAND Session_Date >=@Start_DateAND Att_Code LIKE @Att_CodeGOMy confusion lies with running the below script with Show Execution Plan:--SET SHOWPLAN_TEXT ON--Go DECLARE @Time as DateTime Set @Time = GetDate() select College_Year, group_code, Student_ID, Session_Date, Start_Time, Att_Code from attendanceDetswhere College_Year = 2005 AND group_code LIKE '____________' AND Student_ID LIKE '________'AND Session_Date <= '2005-11-16' AND Session_Date >= '2005-11-16' AND Att_Code LIKE '_' Print 'First query took: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VarCHar(5)) + ' milli-Seconds' Set @Time = GetDate() EXEC CAMsp_Att @College_Year = 2005, @Start_Date = '2005-11-16', @End_Date = '2005-11-16' Print 'Second query took: ' + CAST(DATEDIFF(ms, @Time, GETDATE()) AS VarCHar(5)) + ' milli-Seconds'GO --SET SHOWPLAN_TEXT OFF--GOThe execution plan for the first query appears miles more costly than the sproc yet it is effectively the same query with no parameters. However, my understanding is the cached plan substitutes literals for parameters anyway. In any case - the first query cost is listed as 99.52% of the batch, the sproc 0.48% (comparing the IO, cpu costs etc support this). BUT the text output is:(10639 row(s) affected) First query took: 596 milli-Seconds (10639 row(s) affected) Second query took: 2856 milli-SecondsI appreciate that logical and physical performance are not one and the same but can why is there such a huge discrepancy between the two? They are tested on a dedicated test server, and repeated running and switching the order of the queries elicits the same results. Sample data can be provided if requested but I assumed it would not shed much light. BTW - I know that additional indexes can bring the plans and execution time closer together - my question is more about the concept. If you've made it this far - many thanks.If you can enlighten me - infinite thanks.
I am working on SQL Server 7.0. Every weekend we go for reindexing of some tables. I want to know if it is possible to run the re-indexing of tables in parallel so that I can save time.
Our database is of size 80GB and one table is around 22GB. Rebuilding of index on this table takes a lot of time and we are unable to index the other tables.
hi, we currently use the Database Maintenance Plan to do backups for our SQL Server 2000 databases. I notice that the database are backed up one after the other.
I would like to know how to run the backups in parallel rather than sequentially. To do this, is there any dependency on the number of CPUs?
I created the package to download 4 ftp files at once. I set the MaxConcurrentExecutables for the SSIS package to 4. So in BIDS in downloads 4 files at the same time.
However, when I started the job I noticed that only 3 files were downloaded at the time (looking at temp files in download directory)
Solution: Sure enough after digging around for awhile - in Step properties for SSIS package - there is execution tab - and "Maximum Concurrent Executables" was -1 (which for some reason defaults to 3 concurrent processes even on our dual CPU server) - so after chanign that value to 4 - tada - all 4 files in parallel
Assuming I have a line, is there a function I can call to create a parallel line at a given distance away.i.e - with the below I would want to draw a parallel line to the one output.
Hi ,I need to place the results of two different queries in the same resulttable parallel to each other.So if the result of the first query is1 122 343 45and the second query is1 342 443 98the results should be displayed as1 12 342 34 443 45 98If a union is done for both the queries , we get the results in rows.How can the above be done.Thanks in advance,vivekian
Hi,All. I'm writing test cases on C# for a few methods that make changes in database.To prevent making changes I used BeginTransaction-Rollback,everything was good.But this doesn't work if tested method has BeginTransaction-Rollback code itself.An error appears in NUnit: System.InvalidOperationException : SqlConnection does not support parallel transactions. Do smb know how to solve the problem?
I have several packages within secuence containers and into one main dtsx package with a checkpoint configuration and when I run it some succeed and some don´t. The problem is that when I rerun it checkpoint doesn´t seem to work ´cause some of the successful packages are rerun as well (and not skipped as it should be...) In other words, the process does not begin on the point of failure..
Seems to be that packages that finish after the failure point (and succeed) are not registered in the checkpoint file, then when I rerun the main package these succeeded packages are rerun too....
Here's my case, I have written a stored procedure which will perform the following: 1. Grab data from a table using cursor, 2. Process data, 3. Write the result into another table
If I execute the stored procedure directly (thru VS.NET, or Query Analyser), it will run, but when I tried to execute it via a scheduled job, it fails.
I used the same record, same parameters, and the same statements to call the stored procedure.
what the ideal CPU count and Max Degree of Parallelism are for a 3rd party database server.The server has 12 CPUs, 32GB RAM and all database sizes add up to < 30GB so they can all fit in memory (I tried to force this by doing a select * from every table). On certain payroll days, the CPU gets maxed out to 100% for a few seconds.
MAXDOP was originally set to the default 0. We later changed it to 8 based on several 'best-practices' articles. However the vendor suggests to change it to 1 (no parallelism), while others suggest changing it to 4, so that one run-away query doesn't hog most of the CPUs.
I'd like to find out how many CPUs are actually being used by queries. There is a Degree of Parallelism event in URL.... The BinaryData column says :
0x00000000, indicates a serial plan running in serial. 0x01000000, indicates a parallel plan running in serial. >= 0x02000000 indicates a parallel plan running in parallel.- What does "parallel plan running in serial" mean ?
I see a lot of 0x01000000, and a few 0x08000000's in my trace.How can i determine whether one query is hogging CPUs and if reducing it to 4 will work?
SQL Server 2000 SP3ALast week one of our processes starting issuing or suffering deadlockdetected errors every 15 minutes or so.I have read several articles at MS on the subject. I set a couple ofstartup parameters related to producing deadlock detection informationand ran SQL Profiler. I found the SQL statements being issued by thedeadlocked statements. In every deadlock the same UPDATE statementappears however the data values being searched on are different. Thebest I can tell from trying to query the actual data each update hitsonly one or very few rows. No indexed column is updated so the indexesshould not be the source of conflict.Looking at the query I noticed that the query does not have anavailable index and Query Analyzer shows that the full table scan isbeing done in parallel.My question: Does SQL Server change or modify its locking rules whenqueries are converted to be ran using parallel processing? If so, doyou have a reference?Here is the deadlock entries posted to the error log:SPID=167ResType:LockOwner Stype:'OR' Mode: IX SPID:63 ECID:0 Ec:(0x65971510)Value:0x3c577e60 Cost:(0/0)Input Buf: Language Event: UPDATE Station_Upload setStation_Accept_Status = 'ACC',HeadStatus ='ACC',LastProcessedSta='110',HeadPartType='1' WHERE Part_Serial_No ='SCH1119323' AND Station = 'H110'SPID=63ResType:LockOwner Stype:'OR' Mode: IX SPID:167 ECID:0 Ec:(0x65801510)Value:0x3c27d060 Cost:(0/0)Input Buf: Language Event: UPDATE Station_Upload setStation_Accept_Status = 'ACC',HeadStatus ='ACC',LastProcessedSta='70',HeadPartType='1' WHERE Part_Serial_No ='SCH1119060' AND Station = 'H070'I have suggested adding an index to support the query.Any ideas?Thanks -- Mark D Powell --
I have seen a number of posts regarding parallel development of SSIS packages and need some further information.
So far we have been developing SSIS packages along a single development stream and therefore have managed to avoid parallel development of our packages.
However, due to business pressures we will soon have multiple project streams running in parallel, and therefore multiple code branches, as part of that we will definitely need to redevelop the same SSIS packages in parallel. Judging from your post above and some testing we have done this is going to be a nightmare as we cannot merge the code. We can put in place processes to try and mitigate this but there are bound to be issues along the way.
Do you know whether this problem is going to be fixed? We are now using Team Foundation Server but presumably the merge algorythm used is same/similar to that of VSS and therefore very flaky?
However, not only are we having problems with the merging of the XML files, but we also use script tasks within the packages which are precompiled, as the DTSX files contain the binary objects associated with the script source code, if two developers change the same script task in isolated branches the binary is not recompiled as the merge software does not recognise this object.
Do you know whether these issues have been identified and are going to be fixed to be in line with the rest of Microsoft Configuration Managment principles of parallel development?
I have done a search and have read some of the posts, but am left more confused than before. I am fairly new to SSIS. Here is my situation and what i am trying to accomplish.
I have a package that has a sequence container, in which there are multiple SQL tasks (about 20) running in parallel. I have checkpoints enabled, and FailPackageOnFailure enabled as well. If the package fails, when i re-run the package it will run the last task as well as all the other tasks. What I am looking to accomplish is when the package is re-run, have the SQL tasks that failed ran and not the previous successful tasks.
I think the best way would be via disabling tasks on successful completion of a task, where it writes the name of the SQL task to a temp table, but I am skeptical.
Can anyone point me in a direction to help me accomplish what I am looking for please.