I am currently looking at the capabilities in SSIS from the point of view of an ETL developer who has worked with other products eg. Informatica, Cognos DecisionStream and one of things I note is a lack of support for dimensional hierarchies.
It appears that MS have assumed that SSIS users will automatically use SSAS.
We use Hyperion Essbase. Other sites have Cognos or Business Objects for their OLAP/BI.
I would like to be able build multi-level dimension hierarchies directly from within SSIS.
As I mentioned in my previous post, my "real" case is a cube with 79 dimensions, most of which virtual, have been added for convenience.
Think for instance about a time dimension... Wouldn't it be nice to get a matrix with years horizontally, months vertically and displaying say the number of order you had for each cell in the resulting grid. Ok, maybe you can do this with MDX but not in Excel, unless you create virtual dimensions for the Year and Month levels.
That's all good, but as it is, in my real case, I end up with four date dimensions for which I have to provide:
YQMD (Year, Quarter, Month & Day) hirarchized dimension YWD (Year, Week, Day) YQMD (Fiscal calendar) Year Quarter Month Week Day (as 1, .. 31) Week Day (for periodicity analysis over a week's time) Date (as individual day - this is the backbone for the virtual dimensions)
It turns out this makes 40 dimensions by itself. For the sake of it, I grouped them by 4 hierarchies, although I've seen no specific functionality off of this in the data browser or Excel, so it really seems to be only for "show".
Now in my previous post I explained how I "spread" my session count to calculate a conversion rate. Given the number of dimensions I have (very high segmentation at the order level, very limited segmentation at the session/visit level), this means my calculated cell formula looks like this (hold your breath, it's ugly):
If you read all this, you can see already the cryptic dimension names like "FRD", "FSD" and so forth... that's because with the real names ("First Refund Date", "First Ship Date", the query processor errored out... visibly there is a limit in the size of the formulas you can post!
Is there no other way to achieve this result? Basically I mean to say: if the session count is not defined at your level along this dimension, go to the root of the dimension to get the value there, this along a slew of dimensions, many of which are inherently dependent because of the usage of virtual dimensions (therefore if I wish to go to the root of my "First Refund Date" for instance, I wish to do so along all sub-dimensions... Heck, as far as I'm concerned this is conceptually only ONE dimension, just with various views upon it...
Using hierachies I was sort of hoping for the ability to have something like: [FRD].[All Hierachies].[All]
Am I just asking for too much or do I just not know (quite probable) the magic keyword that can do this?
Where this is becoming quite critical is that I actually have a calculated cell that goes as follows (abridged):
SubCube: similar as before
Calculation: [Measures].[Order Count]*CalculationPassValue((...set of all un-tied dimension roots..., [Measures].[Distributed Marketing Cost])/CalculationPassValue((...set of all un-tied dimension roots..., [Measures].[Order Count])
Now the purpose of this is to distribute external costs at the order level. In short, say that you know you spent $10,000 globally promoting a specific group of websites in commission money (you pay $1 for each order). This formula allows me to get that a specific website, with 20 orders incurred an additional marketing cost of $20. That's actually the object of my next thread's question (spreading a multiplication through the aggregations)
For the purpose of this thread I am just concerned about the size of my formulas. Renaming the dimensions seemed to "buy" me some margin and I was actually quite surprised to find that the formula still fitted in, and works, but it is only a matter of time until I have to add more dimensions and the whole thing blows up in my face. Adittionally this is obviously not pleasant to look at and maintain.
I am currently trying to develop an application which would help in retrieving data from cubes (Microsoft Analysis Services)! The user would not be accessing the Business Intelligence Studio, etc. but would be viewing the data from a custom made application developed in VB.Net2005.
While implementing this, I want to populate the drop-down-list in the VB(.Net) Form, by retrieving the various hierarchies in the dimensions of the cubes (along with dimensions if possible). This should be done dynamically and in real-time!
Can you please help in implementing this? Any code/method, etc would be highly appreciated!
Thanks in advance.
Best wishes!
(Software : SQL Server 2005 Enterprise Edition (with Analysis Services and BI Studio), VS.Net 2005 Enterprise Edition, ADOMD.Net)
When i add a dimension to the cube dimension without any relation in my dimension usage to any measure group my units are going down.However when i remove the dimension from the cube am getting the correct values.
I'm still fairly new to cubes, so bear with me. I'm trying to figure out if I should include aggregate data (e.g. total employees per facility) in a dimension table or if I should use the finished cube to get the counts (MDX?). Thanks for any help that you can provide.
What is the best way to move data from Online system tro data warehouse? I have created 3 dimension tables(product,date and customer tables) and I wanna create fact table and get foreign keys from dimension tables. What is the best method to do that in SSIS?
I am new at SSIS and I am trying to create a Datawarehouse using SSIS. I have the data files as flat files I have the Dimensional Model ready on Paper and Now I need to use the SSIS for the ETL process.
I am trying to figure out how to make dimension tables in SSIS? I mean I want to create the 5 Dimension tables and then create a Fact table out of it but I cant understand where to start? Can any one tell me how we create Dimesion tables in SSIS. Like one of the dimesion tables I need to create uses 2 flat files and is like a flattened dimension, How would I create this in SSIS?
Even if there is any tutorial which shows this step by step do let me know. I would really appreciate any guidance on this.
I have a cube with a Period Dimension, in which the hierarchy is as follows:
Year Quarter Month Day
Users access the data via a pivot table.
The problem is that whenever a new month comes in, the pivot table doesn't automatically display data regarding the new month. Can someone please help me here.
Hi there, my question is really simple. I want to setup an automatic task in SSIS that drops the tables in the target database and substitutes them with tables from the source database. We are talking about two or three dimension tables and one fact table. The dimension tables are pretty small. The fact table will contain, at maximum, 300,000 rows and 12 columns. I do not use delta or flag historisation btw. What tasks in SSIS would you suggest to use?
I have situation where I get data from SRC Flat file and have to load Dimensional table and also fact table, using same data flow(have no other choice since I have to unpivot some src data). Since I have to load both tables in same data flow, I have to have a way to put load ordering constraint (I know informatica allows that). Does any one have any idea on how this can be done in SSIS?
Apologies if this has been raised in the past, but 6 hours of web searching today hasn't turned up anything!
I'd like to use the Slowly Changing Dimension (SCD) Wizard to keep track of tables in my relational database. This means 200+ tables. I don't want to step through the UI Wizard for each table. Ideally I'd like to be able to create the SCD transformation in code, but I can find no good examples for doing this. The MSDN examples here are too brief and don't allow me to expand out to the level I need.
As in any database, columns come and (very rarely, go), and having a programmatic solution to this would mean that I could be flexible and cope with these situations.
So, my question is: Has anyone implemented SCD functionality in code, or have any code examples that do this, that I might learn from. Or, any tips/pointers if I'm barking up totally the wrong tree.
I have an Integration Services package that loads new data into tables that are dimension tables wi my cube. The same situation exists for my fact table. If I perform an "Analysis Services Processing Task" for the dimensions ,cube and measures, will that move the new data into my cube or do I need to perform the "Dimension Processing Destination" data flow task prior to this? Is the initial processing task good enough?
I am modelling cube in SSAS. Cube has around 20 dimensions and 6 fact tables. Some of the dimensions are common among the fact tables. e.g. Time dimension. Fact_PNL has 3 date columns for those we have 3 role playing dimensions in the dimension usages.
Another fact table has 5 date columns for them as well we have separate role playing dimensions in dimension usage tab. We have a common dimension Company which is foreign key in all fact tables. We might need to combine the data from multiple facts to get final output.
Should i create 6 role playing dimension for each of the fact table or use the same dimension for all fact tables?
Role playing dimensions should be created when we have multiple columns pointing to the same dimension ?
I created a Fact Table with 3 Keys from dimension tables, like Customer Key, property key and territory key. Since I can ONLY have one Identity key on a table, what do I need to do to avoid populating NULLs on these columns..
Hi, Nwebie question: is it possible to have a fact that will behave diffeerntly as the user is viewing it from different hierarchies? say i got creatin measures of a product inventory. i got a location hierarchy, as as user is 'moving up' in it (say,up by city, region, state, country...)i want to do a sum. but i also got time hierarchy, and as user begins to group by weeks, months, years... a sum makes no sense..i want LastNonEmpty or something like that. Is it possible?
Hey all. I have a query where I am basically querying an organizational chart. The table storing this information is basically a two column table of parent/child pairs. So you might have:
Querying this table should return all child columns that flow up to the top, so querying 1 would return 1, 2, 3, 4, 6, 7, 8, 10, 11.
I used a sample from MSDN (http://www.sqljunkies.com/Article/D7CAED46-CCAC-4FF7-B528-B2E9A274B71C.scuk) that does just this, except that when querying a value that returns more than 1000 records - I am experiencing way too long response times. The MSDN sample uses temp tables and inserts values into a temp table as it moves through the records and finds a match.
Anyone have an ideas on another way to accomplish the same thing? This is an important part of my security model as users that login should only have access to data that falls within their parent/child heirarchy.
i guess really i'm looking for opinions and / or experiences of modelling organisational hierarchies.... anyone?
i'm in the process of creating a logical model (very early stages) of a new datawarehouse. The current OLTP schema stores it as a self referencing key (i.e. parent_group_id). The performance problems involved in aggregating to different levels of the hierarchy causes no end of complaints from customers, as i'm sure you can imagine. So, now we are modelling a datawarehouse for their reporting requirements, i obviously want it to be better
so far i've considered... 1) keeping it as it is 2) allocating a 'node' id and storing the left / right nodes in the hierarchy tree. we've used before when reporting over the OLTP system with report developers building it 'on the fly' as part of their report. (better explanation than mine... http://www.dbpd.com/vault/9811/kamfn.shtml ) 3) an intermediate table storing its 'position' in the hierarchy? not really played with this yet but mr kimball seems to like it see... http://www.dbmsmag.com/9809d05.html 4) denormalising it completely. i.e. level_1, level_2, level_3, Level4.... for every fact? the advantage of this is that it makes aggregation at any level really easy. the problem is that each of our customers has a different organisation hierarchy, with a different number of levels, and given the OLTP schema there is no limit to the number of levels in a hierarchy
Is there a good approach to modelling many heterogeneous entity typeswith that have some attributes in common?Say I have entities "employees" which share some attibutes (e.g.firstname, lastname, dateofbirth) but some subsets of employees (e.g.physicians, janitors, nurses, ambulance drivers) may have additionalattributes that do not apply to all employees. Physicians may haveattributes specialty and date of board certification, ambulancedrivers may have a drivers license id, janitors may havepreferredbroomtype and so on.There are many employee subtypes and more can be dynamically addedafter the application is deployed so it's obviously no good to keepadding attributes to the employees table because most attributes willbe NULL (since janitors are never doctors at the same time).The only solution I found for this is a generalization hiearchy whereyou have the employee table with all generic attributes and then youadd tables for each new employee subtype as necessary. The subtypetables share the primary key of the employee table. The employee tablehas a "discriminator" field that allows you to figure out whichsubtype table to load for a particular entity.This solution does not seem to scale since for each value of"discriminator" I need to perform a join with a different table. Whatif I need to retrieve 1,000 employees at once?Is that possible to obtain a single ResultSet with one SQL statementSQL?Or do you I need to iterate look at the discriminator and thenperform the appropriate join? If this kind of iteration is necessarythen obviously this generalization hierarchy approach does not work inpracticesince it would be painfully slow.Is there a better approach to modelling these kind of heterogeneousentities with shared attributes that does not involve creating a tablefor each new employee type or having sparce tables (mostly filled withNULLS)I guess another approach would be to use name/value pairs but thatwould make reporting really ugly.Seems like a very common problem. Any ideas? Is this a fundamentallimitation of SQL?Thanks!- robert
I have a date dimension and Fiscal Year is a Hierarchy in it, like April, August and October for different fiscal years. I need to populate this in a report (SSRS) in a combo box so that the user can pick the fiscal year he/she wants. When done, I want to know what is the Fiscal year start month is, like April for April Fiscal year etc.
I have an existing MDX query returning the correct resultset. However, I want to add a filter so that for a combination of value in Hierarchy1 and Hierarchy2, the data is filtered out.
For example I have data like
H1 H2 Amount 1 1 100 2 1 50 3 1 45
I am getting a value of 100+50+45=195.
I want to filter the data for the combination H1=3 and H2=1. Expected result would be 100...
I want to be able to implement infinite levels of Hierarchies in SQL Server (2012), in addition to being able to address issues like same child having more than one parent (eg. An Employee could end up having 2 different managers - eg. Project Manager, Delivery Manager).
One way is to have self referencing table (where each row has a parent id , referencing to a parent record - but this would not work in cases where a child has more than 1 parent).
Are there other more efficient ways? What is the best way to implement this?
I've built a robust looking Calendar dimension but that's easy because its all so well structured. Every date belongs to a particular week and every week belongs to a particular month and every month belongs to a particular quarter etc. January 1st is always going to be a member of week 1, month 1, quarter 1.But here's the rub, real life isn't as predictable and stable as a calendar.
Lets say you have an table containing personal information. You might have gender and marital status in there and even though they both relate to one person, they're unrelated to each other so how do you put those in a hierarchy? Which one goes first and which second because you can get combinations of either.
I am unable to crosstab different alternate hierarchies of an MSAS cube from a single dimension, I'm using Cognos Powerplay (7.3 web and client) to browse the data. v 7.1 displays the same behaviour.
When I try to create the crosstab the display replaces both rows and columns to the alternate path, it overwrites the original path. Can anyone tell me if this is a Cognos issue or an Analysis services one and if there is any way around it. The only solution I have at the moment is to bodge it by creating seperate dimensions for each alternate hierarchy however this is ugly and difficult to use.
I am stuck in a situation where I want to use YTD for three different calendars of our company and don't want to create three different YTD calculations. However I want to make this work for any measure not for a particular measure
If I create one YTD and try to use in context of three calendars in SCOPE statements then it does not give my right results. Following is my syntax but It does not work.
I have some confusion on crossjoin function within MDx.while I try to crossjoin the different level sets of same Hierarchy. It shows error as
For example. ‘The Customer Geography hierarchy is used more than once in the Crossjoin function.’ select { {[Customer].[Customer Geography].[Country].&[United States]}* {[Customer].[Customer Geography].[State-Province].members}} on 0 FROM [Adventure Works] WHERE Measures.[Internet Sales Amount]
Cannot we Cross joins across user defined hierarchies ,or they aren't supported .?Coz I really need to implement as above MDx within my real Cube.I try to implement by making as another Hierarchy Member but it doesn’t gives the value result as what we want/need.with
member [Customer].[Country].[United States ]as [Customer].[Customer Geography].[Country].&[United States] select { {[Customer].[Country].[United States ]}* {[Customer].[Customer Geography].[State-Province].members}} on 0 FROM [Adventure Works] WHERE Measures.[Internet Sales Amount]
Need some help building a query that does the following :
I have 2 Time Dimensions ; Time (Transdate) and ClosedDate (ClosedDate)
In my report/query, if [Time].CurrentMember = [Time].[YMD].[YMD].[2006].[200610].[20061031] I want to FILTER out all ClosedDate < [ClosedDate].[YMD].[YMD].[2006].[200610].[20061031]
Both Time Dimensions are Year -> Month -> Day and have the same Members.
I have every option available, using calculated Members and/or Measures to do this.
The report I'm creating is Aging of Receivables : Balance / 30 days / 60 days / etc.. But for the Aging, I need to filter like explained above.
Does anyone know if it's possible to grant a user the ability to manage jobs in server agent besides giving them SA rights? I none of the server roles beside SA seem to be able to work.
I am currently working on a large project that will be developed using MS SQL. I have been considering two options when it comes the creation of the database. The first option is to create a sepereate table for each of my clients. The tables would be labeled by their Id number then the tables reqular name. Fore example : 781_Users.
The other option is to create a different database for each of the clients.
I just need the data to be seperated between clients, it cant mix. Please just give me your opinion on which option you would consider if you were operating a large project.