How To Process A Partitioned Cube In Multiple Threads
Dec 21, 2004
Our company is in the retail business, so during the Christmas season the window for processing cubes is very small (only 4 hours each day).
To speed things up, we have partitioned our cube at the monthly level so that, potentially, 12 threads can run simultaneously. However, when I looked at DTS, I am not sure whether or how it can accomplish that task. Has anyone tried this before, or is anyone aware of another third-party tool that can do the trick?
Do I need to use specially synchronized code if I have multiple threads inserting, updating, and reading rows to and from the same database? In this case, I know that no two threads will try to insert or update the exact same row in the DB; however, multiple threads might try to read the same row from the database.
Hi, I'm trying to stress test my web application, but under high load the queries that used to take 10-20 ms start taking 500-2000+ ms. To put it another way, when I run them single-threaded I can do about 43,000 a minute; when they are run in parallel it drops to about 2,500 a minute.

What can I do about this?

There are several queries that are affected, but here is one example:

UPDATE [user] WITH (ROWLOCK, XLOCK) SET timestamp = GETDATE() WHERE userid = 1

BTW: I'm running SQL Server 2005 SP1. The stress test is run on 3 machines in total (web, SQL, and client); the client simulates 400 users, each clicking a page as soon as the last one loads, i.e. there will always be 400 outstanding page requests.
I have been asked by developers if there is any advantage in processing multiple clustering models simultaneously by using AMO and multiple threads, as opposed to processing them one after another.
I have limited experience with Analysis Services but based on my reading I don't see this method providing any advantage.
Does anyone have any recommendations or advice? The system is Enterprise Edition running on an x86 server with 2 dual-core processors and 4GB of RAM. Would the answer change if the server were running the x64 versions of SQL Server and Windows?
I found a peculiar thing today while working with SQL Mobile in a multithreaded application (VS2005, application for Pocket PC 2003).
I created a class which has one SqlCeConnection object. Every time I call a function to insert/select/delete something from the local DB, I open the connection, execute the query, and then close the connection again.

But when I call a function from the DB class in thread 1 and in the meantime call a different function (from the same DB class, of course) in thread 2, things go wrong: when function 1 wants to close the connection, function 2 is still using it, and my application crashes with a native exception (0xC0000005: access violation).

I can see why the error is happening, but shouldn't there be a nice .NET handled exception instead of a native exception that grinds my app to a halt?
(A workaround I use now is to use multiple connection objects instead of one, but I thought I'd give this feedback anyway)
I am searching for information on achieving a performance improvement by spawning multiple threads in a single stored procedure, or rather within a single database connection. We have a batch process that updates around 200 tables, and each table update takes around 2 minutes. I am trying to optimize this by running the updates in parallel rather than sequentially; the tables are all mutually exclusive. I have written a stored procedure which updates these tables in a loop, but the concern is that every update statement waits for the previous one to finish. I am calling this SP from a Java application. One crude way would be opening multiple connections to the database, each running a separate T-SQL statement, but that comes with a lot of overhead in opening connections. Is there any way I can explicitly force a T-SQL stored procedure to spawn a new thread for every UPDATE statement? And if I try to do the same from Java, is there a way to open multiple threads, one per statement, under the same database connection?
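For what it's worth, T-SQL has no way to spawn a thread inside a stored procedure; one common workaround on SQL Server 2005 is to start SQL Agent jobs, which run asynchronously on their own connections inside the engine. A minimal sketch, assuming jobs named update_batch_01 through update_batch_03 have been created beforehand, each running one group of the table updates:

-- Hedged sketch: fire pre-created SQL Agent jobs so each group of
-- table updates runs on its own connection/thread inside the engine.
-- Assumes jobs 'update_batch_01'..'update_batch_03' already exist.
EXEC msdb.dbo.sp_start_job @job_name = N'update_batch_01';
EXEC msdb.dbo.sp_start_job @job_name = N'update_batch_02';
EXEC msdb.dbo.sp_start_job @job_name = N'update_batch_03';
-- sp_start_job returns immediately; poll msdb.dbo.sysjobactivity
-- (or have each job write to a status table) to detect completion.

Service Broker activation is the other common route and keeps everything inside the database, at the cost of more setup.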
We are querying a stored procedure multiple times at the same time from our application. In this case, a few processes execute successfully and a few fail with the error "50000: error executing the stored procedure"; if we run the same process again, it executes successfully. Can MySQL not handle multiple threads at the same time?
We are using SQL 2005, Visual Studio 2005, SSIS, and SSAS. We built our dimensional model in SQL 2005 and built packages to do a full refresh of the dimensional model using SSIS. We built the SSAS cube using VS 2005: source, data source view, cube, and dimensions using auto build. We processed the cube in VS 2005 by right-clicking the solution and clicking Process, and the cube was built in Analysis Services.

We then made some schema changes to the model and data changes in the SSIS packages, pulled the cube up in VS 2005, right-clicked the solution, and processed again. The cube was re-built, but after completion we checked it using ProClarity and Excel 2007 and noticed the schema changes and data changes did not take. We dropped the cube, deleted the data source, dimensions, and cube, re-created the data source view and cube with auto build, processed, and only then got the new schema and data changes.

Why is Process not picking up schema and data changes when we have Process Full selected? We even tried rebuild, deploy, and process. What is it we are missing or not doing correctly?
I am using distributed transactions, wherein I start a TransactionScope in the BLL, receive data from a Service Broker queue in the DAL, perform various actions in the BLL and DAL, and if everything is OK call TransactionScope.Commit().

I have a problem wherein, if I run multiple instances of the same app (each app creates one thread), the threads pop the same message and I get a deadlock upon commit.
My dequeue SP is as follows:
CREATE PROC [dbo].[queue_dequeue]
    @entryId int OUTPUT
AS
BEGIN
    DECLARE @conversationHandle UNIQUEIDENTIFIER;
    DECLARE @messageTypeName SYSNAME;
    DECLARE @conversationGroupId UNIQUEIDENTIFIER;

    -- Lock the next available conversation group on the queue.
    GET CONVERSATION GROUP @conversationGroupId FROM ProcessingQueue;

    IF (@conversationGroupId IS NOT NULL)
    BEGIN
        RECEIVE TOP(1)
            @entryId = CONVERT(INT, [message_body]),
            @conversationHandle = [conversation_handle],
            @messageTypeName = [message_type_name]
        FROM ProcessingQueue
        WHERE conversation_group_id = @conversationGroupId;
    END

    -- Close the dialog when the other side has ended it or errored.
    IF @messageTypeName IN (
        'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog',
        'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
    BEGIN
        END CONVERSATION @conversationHandle;
    END
END
Can anyone explain to me why the threads are able to pop the same message? I thought Service Broker made sure this cannot happen?
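For what it's worth, GET CONVERSATION GROUP locks the group only for the lifetime of the transaction it runs in; if the proc's two statements don't actually share one transaction, the lock is released before the RECEIVE and two threads can pop the same message. A minimal sketch of the usual pattern, keeping the receive inside one explicit transaction and blocking with WAITFOR (the 5000 ms timeout is an arbitrary choice):

DECLARE @entryId INT, @conversationHandle UNIQUEIDENTIFIER, @messageTypeName SYSNAME;

BEGIN TRANSACTION;

-- WAITFOR(RECEIVE ...) blocks until a message arrives (or times out),
-- and the receive lock is held until COMMIT, so no other thread can
-- pop the same message in the meantime.
WAITFOR (
    RECEIVE TOP(1)
        @entryId = CONVERT(INT, [message_body]),
        @conversationHandle = [conversation_handle],
        @messageTypeName = [message_type_name]
    FROM ProcessingQueue
), TIMEOUT 5000;

IF @messageTypeName IN (
    'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog',
    'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
    END CONVERSATION @conversationHandle;

COMMIT TRANSACTION;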
There is a feature called "proactive caching" in Analysis Services. It promises automatic synchronization with the relational database and no more explicit cube processing.

But I cannot get the latest data into the cube even when I set the proactive mode to "real time".

Do I need SSIS to process the cube in this case?
Following is the procedure I followed:

1. Test the data
1.1 Use BI Dev Studio to browse the cube; ensure no new data are there
1.2 Process the cube and browse the data; ensure the new data are there
1.3 Delete the new data from the source database and reprocess the cube; ensure no new data are there
1.4 Add the new data again

2. Configure the proactive caching settings of the cube
2.1 Use SQL Server Management Studio to open the cube and open the Properties window
2.2 Under the "proactive caching" option select "low-latency MOLAP" (and even real-time ROLAP later), then click OK

3. Configure the proactive caching settings of the partition
3.1 Open the partition's Properties window
3.2 Under the "proactive caching" option select "low-latency MOLAP" (and even real-time ROLAP later), then click OK
3.3 On the Notification tab, select "SQL Server" and specify the tracking table as the "fact table", which is a view that gets data from the real fact table
4. Wait a period of time...

5. Test the data again
5.1 Use BI Dev Studio to browse the cube, but no new data are there (even when I selected real-time ROLAP later). I even tried the Reconnect and Refresh options in the toolbar.

So my questions are:
1. Did I do the right thing to achieve the goal of "automatic synchronization with the relational database"?
2. Can I monitor the synchronization process, for example by monitoring the processing log or viewing the schedule settings and the status of the process?
I am unable to call a package with a cube processing task... it will not execute. I have even tried simply calling a package to process FoodMart on my own machine, and it will not run. The package executes fine when run manually.
I want to make a package in SSIS to automatically process my data cube, providing some log information (two INSERT statements to my log table with the actual date and the result of the operation: successful/unsuccessful). I tried setting the data source to Analysis Services and found my cube, but I don't know where to add my cube to the project or how to design the package. Can anybody tell me how? Thanks
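For the logging part, the usual shape is an Execute SQL Task before the Analysis Services Processing Task and one wired to each of its success/failure precedence constraints. A minimal sketch of the two INSERT statements; the table dbo.CubeProcessLog and its columns are hypothetical:

-- Hedged sketch; dbo.CubeProcessLog and its columns are made up here.
-- Run from an Execute SQL Task before the Analysis Services Processing Task:
INSERT INTO dbo.CubeProcessLog (LogDate, Operation, Result)
VALUES (GETDATE(), 'Cube process started', NULL);

-- Run from Execute SQL Tasks wired to the processing task's
-- success and failure precedence constraints:
INSERT INTO dbo.CubeProcessLog (LogDate, Operation, Result)
VALUES (GETDATE(), 'Cube process finished', 'successful');  -- or 'unsuccessful'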
I'd like to get a simple and clear explanation of the cube in data mining, and of 3 notions we encounter a lot: Build, Deploy, and Process.

(1) What is the cube that is created when we deploy a mining solution/project? I wonder what type of cube it is, because although the deploy/process dialogs show that cube, after successful deployment we still don't see it in the Cubes folder of the project.

(2) Why does SQL Server create that cube, even though we process only one table and only use a case table (without a nested table)?
(3) Can someone explain these 3 concepts with CLEAR differences between them? (A) Build (B) Deploy (C) Process
As far as I know, the stages are like this: build, then deploy, then process. Also, it seems to me that those operations do not create objects inside the relational database, but rather create objects (binary and text, with the text files usually in XMLA) in the related project's folders and subfolders. Any good explanation is appreciated.
I want to process my cube using Process Data and Process Index instead of Process Full. However, after configuring the 2 Analysis Services Processing Tasks (one for Process Data and the other for Process Index) and executing them sequentially (Process Data first, then Process Index), I got this error:
Errors in the metadata manager. The process type specified for the CASES cube is not valid since it is not processed
Have I done the right thing?
The reason I prefer using Process Data and then Process Index is that it is much faster than Process Full.
In the ECASE table there is a trigger that gets the max value of the case_id column in ECASE for the given project, increments it by one, and inserts the result into the ECASE table.

When we insert a new record into the ECASE table, this trigger fires and fills in the case_id column value.

When I run with multiple threads, the transaction is rolled back because of the trigger. The reason is that a lock is taken on the project table while getting the max value of the case_id column for the project.
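The usual fix for this kind of race is to serialize the MAX read so two threads cannot see the same value. A minimal sketch of the locking-hint approach inside the trigger; the project_id column name is an assumption based on the description above:

-- Hedged sketch: column names follow the description above.
-- UPDLOCK + HOLDLOCK keeps the read range locked until the transaction
-- commits, so concurrent inserts for the same project serialize instead
-- of both reading the same MAX and colliding.
DECLARE @projectId INT;   -- in a trigger, taken from the inserted pseudo-table
DECLARE @newCaseId INT;

SELECT @projectId = project_id FROM inserted;  -- single-row assumption

SELECT @newCaseId = ISNULL(MAX(case_id), 0) + 1
FROM dbo.ECASE WITH (UPDLOCK, HOLDLOCK)
WHERE project_id = @projectId;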
I made a cube with a time dimension with the hierarchy year/month/date/hour. The problem is that the dimension is growing too fast. In the older version of MSSQL (2000), the same dimension didn't grow so much. Any ideas? The table is big (maybe around 1,500,000 rows per month); it now contains around 4,500,000 rows.
I want to find a way to get partition info for all the tables in all the databases on a server, showing: database name, table name, schema name, partition type (maybe year, month, day, number, alpha), the column used for partitioning, the current active partition, and the last partition (for date partitions I want to know whether the partitions go until 2007, so I can add 2008).

All I've come up with so far is:
Code Block
SELECT DISTINCT o.name
FROM sys.partitions p
INNER JOIN sys.objects o ON o.object_id = p.object_id
WHERE o.type_desc = 'USER_TABLE'
  AND p.partition_number > 1
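A sketch of a fuller query for one database, walking from the table through its partition scheme to the function and its boundary values; the highest boundary_id row answers the "does it go until 2007" question, and it would have to be run per database (e.g. with the undocumented sp_MSforeachdb):

-- Hedged sketch: one row per partition boundary in the current database.
SELECT  s.name  AS schema_name,
        t.name  AS table_name,
        pf.name AS partition_function,
        c.name  AS partition_column,
        prv.boundary_id,
        prv.value AS boundary_value     -- e.g. the year/date each range ends at
FROM sys.tables t
JOIN sys.schemas s              ON s.schema_id = t.schema_id
JOIN sys.indexes i              ON i.object_id = t.object_id
                               AND i.index_id IN (0, 1)          -- heap or clustered
JOIN sys.partition_schemes ps   ON ps.data_space_id = i.data_space_id
JOIN sys.partition_functions pf ON pf.function_id = ps.function_id
JOIN sys.index_columns ic       ON ic.object_id = i.object_id
                               AND ic.index_id = i.index_id
                               AND ic.partition_ordinal = 1      -- the partitioning column
JOIN sys.columns c              ON c.object_id = t.object_id
                               AND c.column_id = ic.column_id
LEFT JOIN sys.partition_range_values prv ON prv.function_id = pf.function_id
ORDER BY s.name, t.name, prv.boundary_id;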
My database's design is set out here. In summary, I'm trying to model a stock exchange for a technical analysis application written in Visual C++. In order to create the hierarchy I'm using a Nested Set Model. I'm now trying to write code to add and delete equities (or, more generically, nodes) in the database using a form presented to the user in my application. I have example SQL code to create the necessary add and delete procedures that calculate the changes to the values in the lft and rgt columns, but these examples revolve around a single table, whereas my design aggregates rows from multiple tables using UNION ALL:
Code Snippet

CREATE VIEW vw_NSM_DBHierarchy -- Nested Set Model Database Hierarchy
AS
SELECT clmStockExchange, clmLeft, clmRight FROM tblStockExchange_
UNION ALL
SELECT clmMarkets, clmLeft, clmRight FROM tblMarkets_
UNION ALL
SELECT clmSectors, clmLeft, clmRight FROM tblSectors_
UNION ALL
SELECT clmEPIC, clmLeft, clmRight FROM tblEquities_
Essentially, I'm trying to create an updatable view, but I receive the error "UNION ALL view is not updatable because a partitioning column was not found". I suspect that my design is wrong or lacking, and this problem is highlighting the design flaws, so any suggestions would be greatly appreciated.
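For the view to be updatable as a partitioned view, each member table needs a partitioning column whose CHECK constraints are mutually exclusive, and that column must be part of the primary key. A minimal sketch of the pattern on two simplified stand-ins for the member tables; the NodeType column and its codes are invented for illustration:

-- Hedged sketch: two member tables distinguished by a NodeType column
-- whose CHECK constraints do not overlap; NodeType must be in the PK.
CREATE TABLE tblMarkets_ (
    NodeType  CHAR(1) NOT NULL CHECK (NodeType = 'M'),
    clmID     INT     NOT NULL,
    clmLeft   INT     NOT NULL,
    clmRight  INT     NOT NULL,
    PRIMARY KEY (NodeType, clmID)
);

CREATE TABLE tblSectors_ (
    NodeType  CHAR(1) NOT NULL CHECK (NodeType = 'S'),
    clmID     INT     NOT NULL,
    clmLeft   INT     NOT NULL,
    clmRight  INT     NOT NULL,
    PRIMARY KEY (NodeType, clmID)
);
GO

-- With disjoint CHECK constraints on the PK column, the UNION ALL
-- view becomes an updatable partitioned view:
CREATE VIEW vw_NSM_DBHierarchy AS
SELECT NodeType, clmID, clmLeft, clmRight FROM tblMarkets_
UNION ALL
SELECT NodeType, clmID, clmLeft, clmRight FROM tblSectors_;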
Now I have a different constellation: Integration Services runs on one server, in version 2014; the Analysis Services instance with the cube database to process runs on another server, version 2012. I tried several different combinations of SSIS version and Analysis Management Objects version, and got several errors while running the process package (e.g. "object reference not set to an instance of an object", "cannot find AnalysisServices.dll").

Is this 2014/2012 combination possible at all? I assume the BIDS version has to be for SQL Server 2014, as I want to run SSIS packages on a 2014 server; is that correct? Does it matter at all, and can I also deploy 2012 packages? Which version of Analysis Management Objects do I have to use? I assumed I have to use version 11.0 here, because I want to process a 2012 cube. If it is possible to use the "old" 11.0 version of AMO, do I have to do anything so that it can be found by the SSIS package running on the server (it was built on my local computer, where I have all SQL Server versions from 2005 to 2014 installed in parallel), or do I just have to copy it to the appropriate SQL Server folder?
I have recently been encountering a problem with an SSAS cube. Several times a day the cube goes offline and cannot be browsed, and after some time it automatically comes back online. I am unable to figure out what is happening.

FYI: the cube gets a Process Full every 15 minutes.
In DTS designer I need to solve the following problem:
Problem: import some unknown number (x) of flat files into SQL Server, one at a time.
Issues:
- I can't call the package recursively to process the files, because I don't want to eat up computer resources (there could be hundreds of files to process, and I wouldn't want hundreds of package instances running).

- The file names cannot be read into a list and then processed, because while the list is being processed more files could be "dumped" into the folder by the mainframe, and all files must be processed.

I thought about scheduling a job at the end of the package to run the package again; however, I haven't figured out how to do that and am still looking. DTS is new to me.
We have a scenario in a batch job. There are 3 SPs which are executed for every record in a table. After the execution of the first SP, the second SP executes depending upon the result of the first; similarly for the 2nd and 3rd SPs.

Now, if any SP execution fails, the whole transaction for that record in the table has to be rolled back.

Can this whole process of executing multiple SPs inside a single transaction be accomplished using Service Broker, with either a single queue or multiple queues?
The fan-out looks like this: sp1 runs for 1 record; sp2 runs 3 records for that 1 sp1 record; and similarly, for each record in sp2, sp3 executes for multiple records.
Or, in other words, if processing of any message in a queue fails, should all the messages that have already been processed be rolled back, with no further execution happening?

Also, I would like to know whether a conversation group can be rolled back if processing of any message in the conversation group fails. I am asking this because we could club sp2 and sp3 together to get the results directly and then try for parallel processing.
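One common shape for this is a single queue with an activation procedure that runs all three SPs inside one transaction, so a failure anywhere rolls the whole record back and returns the message to the queue. A minimal sketch; WorkQueue, sp1/sp2/sp3, and the INT payload are all hypothetical:

-- Hedged sketch: WorkQueue, sp1/sp2/sp3 and the INT payload are made up.
DECLARE @recordId INT, @handle UNIQUEIDENTIFIER;

BEGIN TRY
    BEGIN TRANSACTION;

    WAITFOR (
        RECEIVE TOP(1)
            @recordId = CONVERT(INT, message_body),
            @handle   = conversation_handle
        FROM WorkQueue
    ), TIMEOUT 5000;

    IF @handle IS NOT NULL
    BEGIN
        EXEC dbo.sp1 @recordId;   -- each step sees the previous step's work
        EXEC dbo.sp2 @recordId;
        EXEC dbo.sp3 @recordId;
        END CONVERSATION @handle;
    END

    COMMIT TRANSACTION;           -- all three steps succeed or none do
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;     -- the message goes back onto the queue
END CATCH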
I have a table with rows of file names and paths. What I'm trying to do is process each file and store it in my SQL database. I want to store the files as binary (they are Word, Excel, and PDF files). Does anyone know a way to do this? It would be especially useful if I could do it with a console application so I can schedule it.
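If the files are visible from the server, one way to do this without any client code is OPENROWSET(BULK ..., SINGLE_BLOB), which reads a whole file as varbinary(max) on SQL Server 2005 and later. A sketch with a hypothetical table and path:

-- Hedged sketch: dbo.FileStore and the path are hypothetical; the file
-- must be readable by the SQL Server service account.
CREATE TABLE dbo.FileStore (
    FileName NVARCHAR(260) NOT NULL,
    Contents VARBINARY(MAX) NOT NULL
);

INSERT INTO dbo.FileStore (FileName, Contents)
SELECT N'report.pdf', blob.BulkColumn
FROM OPENROWSET(BULK N'C:\files\report.pdf', SINGLE_BLOB) AS blob;

Because the BULK path must be a string literal, driving this from the file-name table means building each statement with dynamic SQL; otherwise, read the bytes in the console application and pass them in as a varbinary parameter.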
I have two process steps in a package and one failure step; both process steps have On Failure workflows to the failure step. The problem I'm having is that if two or more workflows go to the same failure step, then neither goes to it, but execution does go to the failure step if only one process step is attached.
I know I can create duplicate steps for each process step, but I was hoping to be able to do it this way FMI.
I am just starting out with CUBEMEMBER/CUBEVALUE formulas in Excel linked to a SQL Server OLAP DB, using this method for some custom reports where pivot tables are not suitable. The time dimension values include Months, Quarters, and Years, and CUBEMEMBER formulas like
=CUBEMEMBER("OLAPCUBE","[Time].[Time].[Year].&[2015].&[1].&[1]") work fine - 1st quarter 1st month etc.
Is there a straightforward notation to aggregate months, or do I need to use a plus sign to add a number of CUBEMEMBER formulas together? In other words, is there an easier way to get, say, Jan to July 2015 totals than
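One approach that avoids chains of plus signs is CUBESET with an MDX range (a colon between the first and last month member), then CUBEVALUE over that set. Something like the following, where the member keys are guesses patterned on the formula above (copy the real ones from a working CUBEMEMBER formula) and the measure name is hypothetical, assuming the CUBESET formula sits in cell A1:

=CUBESET("OLAPCUBE","[Time].[Time].[Month].&[2015].&[1].&[1]:[Time].[Time].[Month].&[2015].&[3].&[1]","Jan-Jul 2015")
=CUBEVALUE("OLAPCUBE",$A$1,"[Measures].[Sales Amount]")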
We've had a problem for a few months now that has completely stumped us. We are running a heavily cursored, massive data-manipulation process on a 32-bit SQL Server instance running in a virtual machine on top of VMware, with the following specs:
Processors: 2x2674MHz processors Memory: 4GB RAID 10 disk config
When we run our process on this machine, in total it runs in 30 hours.
When this process is run on another 32 bit server with the following specs
Processors: 8x3658MHz processors Memory: 8 GB SAN w/ RAID 5 disk config

It runs 25% slower.
But here is the real kicker. When this process is run on a 64 bit server with the following specs
Processors: 8x3658MHz processors Memory: 8 GB SAN w/ RAID 5 disk config
It runs 75% slower.
This process consists solely of stored procedures written in T-SQL. The weird thing is that on our smaller server, CPU utilization is evenly balanced across processors (at 20-30%) when this large data-manipulation process is running. However, on the bigger servers, SQL Server latches onto a single processor and doesn't balance the load across the others: only one processor out of the eight is utilized, throttled at 90%, while the other 7 sit at zero.

The configuration settings are at their defaults in all three places.

Has anyone ever seen behavior like this, where only one processor gets used by SQL Server during processing? Granted, our processes are single-threaded because they use cursors, but it seems the single thread shouldn't be restricted to one processor.
When I make a call to GetSchemaDataSet with a restriction of a cube name that has a space in it, the call fails. Following is a sample of the code:

adoRestriction = new AdomdRestriction("CATALOG_NAME", "Contoso Telecom_Contoso");
adoRestrictions.Add(adoRestriction);
dataSet = conn.GetSchemaDataSet("MDSCHEMA_CUBES", adoRestrictions);

I am running SQL Server 2005 Analysis Services SP2. Is there some way to qualify the cube name in the restriction, or is this just a bug? Thanks.
I'm trying to troubleshoot a SQL problem that we are having, and I'm having difficulty identifying the guilty process.

Using NT Performance Monitor I am monitoring all active threads on the system, and I have noticed that one particular SQLSERVR thread (the number obviously changes with each server restart) is hogging 100% CPU.
Is it possible to find out what process a particular thread number relates to?

As far as I can tell, the SQL SPID (from Enterprise Manager) does not correlate to a SQL thread.
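For what it's worth, the kpid column in sysprocesses holds the Windows thread ID for each session, which is usually enough to map a Perfmon thread back to a SPID. A sketch (1234 stands in for the thread ID observed in Performance Monitor):

-- kpid is the OS thread ID; spid is the SQL Server session ID.
SELECT spid, kpid, status, hostname, program_name, cmd
FROM master.dbo.sysprocesses
WHERE kpid = 1234;   -- thread ID observed in Performance Monitor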
I have an app that is critical to our business. It handles and synchronises several SQL Servers, checks integrity, etc. I need to make the app able to run a few things at once. Does anyone have any experience with this? Currently we use Delphi and ADO. I have been fiddling with DMO to get more performance; I am not sure ADO is very quick for some of the tasks I need to do.

I suppose my main question *really* is: does ADO/DMO multi-thread, and has anyone tried it? If not, how do people do it?