I have a dimension table that uses slowly changing dimension Type 1 and Type 2 attributes.
Source Table
CD   Category       Object_key
-1   Not Provided   NULL
1    CADA           1002
2    DFG            1889
3    GHJ            456
4    CACA           54635
I am using Object_key as the business key.
Destination (Dimension)
Key  Category       business_key
-1   Not Provided   NULL
1    CADA           1002
2    DFG            1889
3    GHJ            456
4    CACA           54635
5    Not provided   NULL
The problem is that when a business_key is NULL, the Slowly Changing Dimension wizard doesn't recognize whether it's a new or an existing value, so it keeps trying to add a new row.
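A minimal sketch of what is probably going on, assuming the wizard matches rows with a plain equality comparison on the business key (I have not confirmed its internals). In SQL, NULL = NULL is not true, so a NULL key never finds its existing row. The sketch uses Python with SQLite rather than SSIS; the table name dim_category, the helper names and the sentinel value -1 are all made up for illustration, and the sentinel must be a value that never occurs as a real Object_key.

```python
import sqlite3

# In-memory stand-in for the dimension table (hypothetical name).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (
    dim_key      INTEGER PRIMARY KEY,
    category     TEXT,
    business_key INTEGER
);
INSERT INTO dim_category VALUES (-1, 'Not Provided', NULL), (1, 'CADA', 1002);
""")

SENTINEL = -1  # assumption: never used as a real business key


def lookup_plain(business_key):
    """Equality lookup: a NULL key never matches, because NULL = NULL is not true."""
    row = conn.execute(
        "SELECT dim_key FROM dim_category WHERE business_key = ?",
        (business_key,),
    ).fetchone()
    return row[0] if row else None


def lookup_coalesced(business_key):
    """Replace NULL with a sentinel on both sides before comparing."""
    row = conn.execute(
        "SELECT dim_key FROM dim_category "
        "WHERE COALESCE(business_key, ?) = COALESCE(?, ?)",
        (SENTINEL, business_key, SENTINEL),
    ).fetchone()
    return row[0] if row else None
```

With the plain lookup a NULL key looks "new" on every run; with the sentinel it resolves to the existing "Not Provided" row. In a package, the same idea would mean replacing NULL with the sentinel before the rows reach the SCD step and storing the sentinel in the dimension's business_key column.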
It has been suggested that when loading dimensions, the first step should be to clear the current dimension table and reload in total from the data source. I use integer identity PK fields in all my dimension tables.
When loading my fact table I'm only loading rows that have been added or changed since the last SSIS run.
Wouldn't this introduce the possibility of mis-matched key relationships if, for some reason, the order of the rows in the dimension table changed and a fact row that doesn't get modified in this SSIS run pointed to a surrogate key in a dimension table that was changed?
I suppose in most cases a change in the OLTP that caused a reordering of the dimension table (say a customer was dropped) would show up as a modification in the fact load and cause the row to be updated. However, if someone changed the indexing in the OLTP, it might cause the customer order to change, causing the customers to be loaded in a different sequence with different surrogate keys in the dimension table.
It just concerns me that a possibility for corruption seems to exist in this type of scenario. It seems like one should load the dimensions and facts in the same manner.
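To make the concern concrete, here is a small Python simulation. It assumes a truncate-and-reload hands out identity values in arrival order starting again from 1; whether the seed is actually reset depends on how the table is cleared. The customer names and function names are invented for the example.

```python
def reload_dimension(source_rows):
    """Simulate truncate-and-reload: keys are reassigned in arrival order."""
    return {name: key for key, name in enumerate(source_rows, start=1)}


def merge_dimension(existing, source_rows):
    """Simulate an incremental load: existing keys are kept, new members appended."""
    merged = dict(existing)
    next_key = max(merged.values(), default=0) + 1
    for name in source_rows:
        if name not in merged:
            merged[name] = next_key
            next_key += 1
    return merged


# First run: dimension and fact are loaded together.
first_load = reload_dimension(["Acme", "Baker", "Cole"])
fact_row = {"customer_key": first_load["Baker"], "amount": 100}

# Second run: "Acme" was dropped in the source; the fact row is unchanged,
# so the incremental fact load never touches it.
second_source = ["Baker", "Cole"]
reloaded = reload_dimension(second_source)
merged = merge_dimension(first_load, second_source)
```

After the full reload, the untouched fact row's key (2) belongs to "Cole" instead of "Baker", which is exactly the mismatch described above. With the incremental merge, "Baker" keeps key 2.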
I am trying to perform an incremental (ProcessAdd) load on a dimension using SSIS and the Dimension Processing data flow component. Whenever I try the load it fails with the following error:
The PrimeOutput method on component "OLE DB Source" (1) returned error code 0xC02020C4
OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1055719414,0x,Parser: An error occurred during pipeline processing. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1055719414,0x,Internal error: The operation terminated unsuccessfully. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1055719414,0x,Errors in the OLAP storage engine: An error occurred while the 'ORDER LINE STATUS' attribute of the 'DIM ORDER LINE' dimension from the 'ACRDvel' database was being processed. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1055719414,0x,Errors in the OLAP storage engine: An error occurred while the dimension, with the ID of 'Q DIM ORDER LINE', Name of 'DIM ORDER LINE' was being processed. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1055719414,0x,Errors in the high-level relational engine. The data source view does not contain a definition for the 'Q_DIM_ORDER_LINE' table or view. The Source property may not have been set. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1055719414,0x,Internal error: The operation terminated unsuccessfully. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1073450974,0x,SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "Dimension Processing" (112) failed with error code 0x80004005. 
The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:44 PM,12/14/2007 7:44:44 PM,-1073450975,0x,SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread0" has exited with error code 0x80004005. There may be error messages posted before this with more information on why the thread has exited. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:46 PM,12/14/2007 7:44:46 PM,-1071636284,0x,The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:46 PM,12/14/2007 7:44:46 PM,-1073450952,0x,SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "OLE DB Source" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure. OnError,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:46 PM,12/14/2007 7:44:46 PM,-1073450975,0x,SSIS Error Code DTS_E_THREADFAILED. Thread "SourceThread0" has exited with error code 0xC0047038. There may be error messages posted before this with more information on why the thread has exited. 
OnInformation,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:46 PM,12/14/2007 7:44:46 PM,1074016264,0x,Post Execute phase is beginning. OnInformation,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:46 PM,12/14/2007 7:44:46 PM,1074016265,0x,Cleanup phase is beginning. OnInformation,*******,Package,{62F748F0-EFC6-4D7A-9873-B31144C54873},{6AA40F9D-7E89-4D68-B2CD-31A43EA6B80D},12/14/2007 7:44:46 PM,12/14/2007 7:44:46 PM,1074016267,0x,"component "Dimension Processing" (112)" wrote 756860 rows.
Apparently this translates to the error DTS_E_ADDROWTOBUFFERFAILED.
This incremental load is trying to add several hundred thousand rows and it appears to get to around 10K or so and then fail. I can't seem to find any KB articles related to this problem. Does anyone have a clue what may be happening here?
I do have an incremental load of the same type that is working but I am continually getting the following warning when the load is running:
"The buffer manager detected that the system was low on virtual memory, but was unable to swap out any buffers. 6 buffers were considered and 6 were locked. Either not enough memory is available to the pipeline because not enough is installed, other processes are using it, or too many buffers are locked."
My server has 8 GB of memory and is running Windows Server 2003 x64 and SSAS 2005 x64.
I'm loading a fact table that has several geographic attributes: some are at the state level, some are at the county level, and some are drilled down further than that. I understand the basic concept of a dimension with a ragged hierarchy, but I'm unsure how to load the fact table using lookups based on these geographic units. For example, if my geographic dimension contains 200 records for the state of Wyoming, basically a record for each fine-grain place (i.e. city/town), then how do I go about doing a county lookup? Wyoming only has 23 counties, but because the dimension attributes that are not at the finest grain repeat, I'll get more records in the lookup than I need. This repeats, of course, as I move up the geographic scale to state, then country. How do I configure/fill my dimension to handle these differing scales of data?
When I add a dimension to the cube without relating it to any measure group in Dimension Usage, my units go down. However, when I remove the dimension from the cube I get the correct values.
I need some help building a query that does the following:
I have 2 time dimensions: Time (Transdate) and ClosedDate (ClosedDate).
In my report/query, if [Time].CurrentMember = [Time].[YMD].[YMD].[2006].[200610].[20061031] I want to FILTER out all ClosedDate < [ClosedDate].[YMD].[YMD].[2006].[200610].[20061031]
Both Time Dimensions are Year -> Month -> Day and have the same Members.
I am open to any option, including calculated members and/or measures, to do this.
The report I'm creating is an Aging of Receivables: Balance / 30 days / 60 days / etc. But for the aging, I need to filter as explained above.
As I mentioned in my previous post, my "real" case is a cube with 79 dimensions, most of which are virtual and have been added for convenience.
Think for instance about a time dimension... Wouldn't it be nice to get a matrix with years horizontally, months vertically, and displaying, say, the number of orders you had for each cell in the resulting grid? OK, maybe you can do this with MDX, but not in Excel, unless you create virtual dimensions for the Year and Month levels.
That's all good, but as it is, in my real case, I end up with four date dimensions for which I have to provide:
- YQMD (Year, Quarter, Month & Day) hierarchized dimension
- YWD (Year, Week, Day)
- YQMD (Fiscal calendar)
- Year
- Quarter
- Month
- Week
- Day (as 1, .. 31)
- Week Day (for periodicity analysis over a week's time)
- Date (as individual day - this is the backbone for the virtual dimensions)
It turns out this makes 40 dimensions by itself. For the sake of it, I grouped them by 4 hierarchies, although I've seen no specific functionality off of this in the data browser or Excel, so it really seems to be only for "show".
Now in my previous post I explained how I "spread" my session count to calculate a conversion rate. Given the number of dimensions I have (very high segmentation at the order level, very limited segmentation at the session/visit level), this means my calculated cell formula looks like this (hold your breath, it's ugly):
If you read all this, you can already see the cryptic dimension names like "FRD", "FSD" and so forth... That's because with the real names ("First Refund Date", "First Ship Date") the query processor errored out... Evidently there is a limit on the size of the formulas you can post!
Is there no other way to achieve this result? Basically I mean to say: if the session count is not defined at your level along this dimension, go to the root of the dimension to get the value there, and do this along a slew of dimensions, many of which are inherently dependent because of the usage of virtual dimensions (therefore if I wish to go to the root of my "First Refund Date", for instance, I wish to do so along all sub-dimensions). Heck, as far as I'm concerned this is conceptually only ONE dimension, just with various views upon it...
Using hierarchies, I was sort of hoping for the ability to have something like: [FRD].[All Hierarchies].[All]
Am I just asking for too much or do I just not know (quite probable) the magic keyword that can do this?
Where this is becoming quite critical is that I actually have a calculated cell that goes as follows (abridged):
SubCube: similar as before
Calculation: [Measures].[Order Count]*CalculationPassValue((...set of all un-tied dimension roots..., [Measures].[Distributed Marketing Cost])/CalculationPassValue((...set of all un-tied dimension roots..., [Measures].[Order Count])
Now the purpose of this is to distribute external costs at the order level. In short, say that you know you spent $10,000 globally promoting a specific group of websites in commission money (you pay $1 for each order). This formula allows me to get that a specific website, with 20 orders incurred an additional marketing cost of $20. That's actually the object of my next thread's question (spreading a multiplication through the aggregations)
For the purpose of this thread I am just concerned about the size of my formulas. Renaming the dimensions seemed to "buy" me some margin, and I was actually quite surprised to find that the formula still fit and works, but it is only a matter of time until I have to add more dimensions and the whole thing blows up in my face. Additionally, this is obviously not pleasant to look at and maintain.
I am trying to make a time dimension in Analysis Services. Is it possible to include the hours of a day in the dimension? Does the day level necessarily have to be the lowest level of the hierarchy?
We deal with multiple vendors who provide us information via text/XML files. Vendor A may provide financial data, vendor B provides litigation data, and vendor C provides ratings data. Our current structure has a database for each vendor with its own company table, which basically makes all this data disconnected. Of course, each vendor has its own proprietary company ID to make records unique.
All of the data is based on companies so the grain of data would be at a company level. I would like to be able to link this information together by creating a dimensional model that has a single company table (DimCompany) and has facts populated based on the type of data we receive. Would this be the right sequence of events?
1. My initial load (historical) would have to look at all these data sources and create one company record in my DimCompany table. This table would then link to all other fact tables to provide a single view of company info. I would imagine this would have to be a fuzzy lookup since one company will be in all sources.
2. On subsequent loads (incremental) I would probably have to do a lookup of companies in the dimension via the proprietary code and add if the company wasn't there.
Any advice on tackling this issue would be greatly appreciated especially if SSIS was used in the process.
I have created dimension security by adding a new role that restricts each user to seeing only his own information when he logs in to the system. It works fine when I test the cube by switching roles through the Change User option, but when the cube is accessed from any Analysis Services reporting tool (Excel, ProClarity), it displays the records of all users in spite of the roles defined.
Can anyone please help me out with this situation?
I am new to data mining and have a question about OLAP dimensions built from models. Do you know if you can use a dimension that was created by the mining model wizard in the same cube that is being used as the source for the mining model itself? I keep getting an error about a dependency loop and just want to make sure that I am not trying to do something impossible.
If this is illegal, do you know of a way of doing this without essentially having one cube dependent on another via a mining model and dimension? I tried to use the wizard to create a mining model off of a table, but it did not give me the option of creating a dimension from it.
How do I join a dimension on itself? I'm pretty sure this isn't really how MDX is supposed to work, but I've run into a case where I'm hoping this can be done in MDX. I'll try to simplify as much as possible. If my Customer dimension contains these attributes:
- [FullName]
- [ID]
- [ParentID]
Is it possible to have a query return a customer's FullName, ID, and the FullName of the ParentID?
Example with values:
FullName, ID, ParentID
Jim, 1, 1
Jane, 2, 1
A query on ID=2 getting (FullName, ID, FullName of ParentID) should return: (Jane, 2, Jim)
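For reference, this is what the same request looks like as a relational self-join, sketched in Python with SQLite. It is not MDX and does not answer whether the cube can do it; it only pins down the expected result. The table and column names are assumptions mirroring the attributes above. On the cube side, a parent-child hierarchy built on ID/ParentID is probably the closest equivalent to look at.

```python
import sqlite3

# Hypothetical relational source for the Customer dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    id        INTEGER PRIMARY KEY,
    full_name TEXT,
    parent_id INTEGER
);
INSERT INTO customer VALUES (1, 'Jim', 1), (2, 'Jane', 1);
""")


def customer_with_parent(customer_id):
    """Return (FullName, ID, FullName of ParentID) via a self-join."""
    return conn.execute(
        """
        SELECT c.full_name, c.id, p.full_name
        FROM customer AS c
        LEFT JOIN customer AS p ON p.id = c.parent_id
        WHERE c.id = ?
        """,
        (customer_id,),
    ).fetchone()
```

Calling customer_with_parent(2) gives ('Jane', 2, 'Jim'), matching the example above. The LEFT JOIN keeps customers whose ParentID points nowhere.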
What are the pros and cons of using SSAS to create time dimensions based on a date field in the fact table, as opposed to a standalone time dimension table?
I can see many problems with loading a time dimension table. The date is from the same table as most of my fact data. I have a column in my OLTP that sets last change date so I can tell if my fact data is an insert or update but it wouldn't tell me if the date column had been changed, just some column in the table. I'm going to have several thousand sales on any given day so I'll be reading a lot of rows just to put one row into the dimension.
From an SSIS point of view, I'd think leaving the date in the fact table would be better.
I want to pass a one-dimensional array, such as int A[5] with A[0]=10, A[1]=20, ..., to a stored procedure. Can I do that in SQL Server 2005? If so, please give me a sample. Thanks!
I am a newbie with Analysis Services and am desperately trying to find a way to include a calculation that divides one of my measures (Teus) by the vessel capacity, where vessel is one of my dimensions (and the capacity therefore does not depend on other dimensions...).
Any ideas how I could implement that? This would help a lot. Thanks for your help.
I am new to Analysis Services, and I need to figure out the following things:
I have a cube with multiple dimensions. I want to display all the options in the cube along with the count of the rows for each combination. I am unable to understand how to count the dimensions and apply the count to a column.
For example, say I have dim1, dim2, dim3, dim4 and name as dimensions, and cnt as the count of the rows. Now I want to write an MDX expression that returns dim1, dim2, dim3, dim4 with count(name fields) for each combination, so the output will be as follows:
name field1 name field2 dim1 dim2 dim3 dim4 200 100
How do I do it?
Also, how do we accept input parameters in an MDX expression? For example, in the case above, if I have to accept an input for dim1 and display the output values for that dim1, how do I do it?
Analysis Manager won't let me create a dimension unless the fact table field has a value in it. In other words, how do I create an empty dimension (a dimension without members)?
I have a fact table which contains the transaction date, ProductID, QTySold, TotalSaleAmount, etc...
Since I am new to OLAP therefore I need help to now create the table for TIME on which I will be basing my time dimension... I have read a few articles and have gathered that at the end of the exercise my fact table should have a 'timeid' column which will be linked to the same column in the table being used for the time dimension...
I have gone through the MS Analysis Services tutorial and the FoodMart example, and have some idea about what the structure of this table will be.
My questions are:
1. I need guidance on how to create the table for time. One option is to just copy the table used in the FoodMart example; that might work, but then the concept will not be clear to me.
2. The structure of the table to be used for the time dimension is quite clear (I think this part is easy). What I want to understand is how to POPULATE this table with data. Can someone provide me with scripts, SPs, or whatever to do this? This is the area where I am lost...
3. How will I enter the "TheDate" column in my fact table and link it with my table for time dimension...
Looking forward to someone's help.
BTW, I would like to share a very good article which I recently found in one of the newsgroups. Some of you might appreciate it too: http://www.sqljunkies.com/Article/D1E44392-592C-40DB-B80D-F20D60951395.scuk
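On question 2 (populating the table), the usual approach is to generate one row per calendar day for the range your facts cover. Here is a rough sketch in Python; the column names (time_id, the_date and so on) and the YYYYMMDD integer key are assumptions for illustration, not the FoodMart layout. The generated rows would then be inserted into the time table by whatever loader you use.

```python
from datetime import date, timedelta


def build_time_rows(start, end):
    """Return one dict per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "time_id": int(current.strftime("%Y%m%d")),  # e.g. 20070101
            "the_date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day_of_month": current.day,
            "day_of_week": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        })
        current += timedelta(days=1)
    return rows


rows_2007 = build_time_rows(date(2007, 1, 1), date(2007, 12, 31))
```

For question 3, the fact load would then derive the same time_id from each transaction date (or look it up by the_date) and store it in the fact table's timeid column.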
1) We have numerous fact tables with surrogate keys which reference just one dimensional surrogate key. How does this work?
2) Are the ‘facts’ feeding data TO the ‘dimensions’ (back end warehousing)? Or are the ‘Dimensions’ feeding facts to the ‘facts’ tables for lookups!?
NB: I'm very inexperienced at database design.
I'm really also using this thread to get contacts for future, harder questions!
Hello: Very soon my company will be moving to a 4-4-5 reporting schedule. Basically, this means that the first month of the quarter will have 4 weeks, the second will have 4 weeks, and the third will have 5 weeks. Therefore, for 2007 the dates for Jan, Feb and Mar will be as follows:
Jan: Jan 1 - Jan 27
Feb: Jan 28 - Feb 24
Mar: Feb 25 - Mar 31
Currently, I have an SSIS package creating a record for each day in the Time Dimension. Is there any script out there that will help me build a Fiscal calendar such as the one described above?
I realize that this is not a direct SSIS question but I figured that some of you might have encountered this situation and hence my post.
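A rough sketch of the 4-4-5 assignment in Python, in case it helps as a starting point. It assumes weeks start on Sunday and that the fiscal year is anchored on Sunday 2006-12-31, which is what the Jan 28 and Feb 25 boundaries above imply; both are guesses about your calendar. It also assumes a plain 52-week year with no 53rd week. The function and field names are made up.

```python
from datetime import date

# Weeks per fiscal period: 4-4-5 repeated for four quarters (52 weeks).
WEEKS_PER_PERIOD = [4, 4, 5] * 4


def fiscal_period(day, fiscal_year_start):
    """Map a calendar date to its 4-4-5 fiscal week, period and quarter."""
    offset = (day - fiscal_year_start).days
    if not 0 <= offset < 52 * 7:
        raise ValueError("date falls outside this 52-week fiscal year")
    week = offset // 7 + 1  # fiscal week number, 1..52
    remaining = week
    for period, weeks in enumerate(WEEKS_PER_PERIOD, start=1):
        if remaining <= weeks:
            return {
                "fiscal_week": week,
                "fiscal_period": period,
                "fiscal_quarter": (period - 1) // 3 + 1,
            }
        remaining -= weeks


FY2007_START = date(2006, 12, 31)  # assumed anchor: a Sunday
```

With that anchor, Jan 27 falls in period 1, Jan 28 and Feb 24 in period 2, and Feb 25 through Mar 31 in period 3. The same function could be applied to each daily row the package creates to fill the fiscal columns.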
Hi, I am working as a data warehouse architect with a large concern, and I designed a data mart with a fact table surrounded by 5 dimension tables. My PM, who does not know data warehousing, has abruptly changed my design and increased the 5 dimension tables to 17 tables with just 1 column each. This is a hell of a design, because a single-column dimension table cannot have any hierarchy; it would be better to push such a column into the fact table. What should I do now? Moreover, he has asked me not to use SSIS and to code everything in stored procedures. Please guide. Jigjan
I have 1 report with 2 charts, and each chart has its own dataset. The two datasets are MDX queries on 2 different cubes, but some dimensions have the same name.
Now I want to have 2 different selectable parameters for the [dim time] dimension: one for the first query on the first cube and the second for the other query on the other cube.
So in the MDX query builder I check both dimensions as parameters, but because both dimensions have the same name, I get only one selectable [dim time] parameter in my report.
While populating the dimension tables, the requirement is that the 1st record in every dimension should be "Not Available",
and if the fact table has a NULL value for that dimension, it should be tied to the "Not Available" record in the dimension. Please help with how to do both things. JJ
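One way to do both, sketched in Python with SQLite so it can be run anywhere. It assumes the "Not Available" row is seeded with a fixed surrogate key of -1 before any real members are loaded, and that the fact load resolves keys with a left join. The table names, column names and the -1 key are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_code TEXT,
    product_name TEXT
);
-- Step 1: seed the reserved member before loading real members.
INSERT INTO dim_product VALUES (-1, NULL, 'Not Available');
INSERT INTO dim_product VALUES (1, 'P100', 'Widget');

CREATE TABLE staging_sales (sale_id INTEGER, product_code TEXT, amount REAL);
INSERT INTO staging_sales VALUES (10, 'P100', 25.0), (11, NULL, 40.0), (12, 'P999', 5.0);
""")

# Step 2: resolve surrogate keys; NULL or unmatched codes fall back to -1.
fact_rows = conn.execute("""
    SELECT s.sale_id,
           COALESCE(d.product_key, -1) AS product_key,
           s.amount
    FROM staging_sales AS s
    LEFT JOIN dim_product AS d ON d.product_code = s.product_code
    ORDER BY s.sale_id
""").fetchall()
```

Sale 11 (NULL code) and sale 12 (a code missing from the dimension) both end up pointing at the "Not Available" row. If unknown codes should be handled differently from NULLs, that would need a second reserved member.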
I am unable to see my DSV when I try to edit my dimension. I have added some new fields in my dimension table and now want to use them in my cube dimension.
Some of our drill-downs have many values and we would like to display only the top 10. I know how to create a calculated member for the cube, but that becomes a measure and will NOT be in the drill-down, so:
How do I create a Top 10 for Raw Url?
Drill Down Groups: Top 10 Urls (this has 10000 values; can we just display the top 10 here?). Right now the report never comes back because of the count. If there is a better way, let me know.
I'm developing a report with Samsung products sales compared to other producers' sales.
I have the following dimensions:
- Dim Period (date-time)
- Dim Region (region where the products are sold)
- Dim Product Group (notebooks, CRT,...)
- Dim Producer (Samsung, LG, ..., Other producers)
Report should contain Samsung sales on the first row, other companies (LG, Dell,...) sales on subsequent rows and "Other producers" sales on the last row, i.e.:
Please note that Dim Producer may be enlarged, so I can't just enumerate all Dim Producer members in the query.
I tried to use field ProducerType, which I set to 1 for Samsung, 2 for other companies (LG, Dell, ...) and 3 for "Other producers", after that I ordered by this field, however it didn't position producers as I expected.
I have a dimension table with effective dating; I'm loading historical transactional data and want to associate the correct surrogate key from the dimension with each fact table transaction. My dimension table has a start date but no end date; the end date is implied by the start date of the next record with the same id. So I want the surrogate key of the most recent record whose start date is not after the transaction date. I know I need a subquery but am not too sure how to write it.
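A minimal sketch of such a subquery, run here through Python's sqlite3 so it is self-contained. It reads the requirement as "the row with the latest start date on or before the transaction date", and the table and column names are invented. In T-SQL the inner query would use TOP 1 with ORDER BY instead of LIMIT 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,   -- surrogate key
    customer_id  INTEGER,               -- business id
    start_date   TEXT                   -- ISO dates sort correctly as text
);
INSERT INTO dim_customer VALUES
    (1, 100, '2006-01-01'),
    (2, 100, '2006-06-01'),
    (3, 200, '2006-03-15');

CREATE TABLE stg_transaction (txn_id INTEGER, customer_id INTEGER, txn_date TEXT);
INSERT INTO stg_transaction VALUES
    (1, 100, '2006-03-10'),
    (2, 100, '2006-06-01'),
    (3, 200, '2006-01-01');
""")

# Correlated subquery: latest dimension row that started on or before the transaction.
matches = conn.execute("""
    SELECT t.txn_id,
           (SELECT d.customer_key
            FROM dim_customer AS d
            WHERE d.customer_id = t.customer_id
              AND d.start_date <= t.txn_date
            ORDER BY d.start_date DESC
            LIMIT 1) AS customer_key
    FROM stg_transaction AS t
    ORDER BY t.txn_id
""").fetchall()
```

Transaction 3 predates the only row for customer 200, so its key comes back as NULL (None in Python); the load would need a rule for that case, such as an "unknown" member.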