This is my first task in attempting to populate a fact and dimension table from SSIS.
I have a fact table, Sales, and two dimension tables, Customer and Location. The data to fill this structure arrives in one file, where each record contains the sales information as well as the customer information and location details on the same row.
I am using SSIS to fill this structure, with the Slowly Changing Dimension (SCD) transform for the Customer dimension.
If I have 2 records with the same BusinessKey but a different first name in each, where first name is set as a changing attribute, the customer is created twice in the table. Shouldn't it create one record with the most recent first name? Or am I misusing the SCD?
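One possible explanation, stated here as an assumption rather than a confirmed diagnosis: if both rows arrive in the same batch, each one is compared against the dimension table as it stood before the load, so both look like new customers. A minimal Python sketch of collapsing the batch to one row per business key before it reaches the dimension load. The column names `business_key`, `first_name` and `loaded_at` are hypothetical.

```python
def latest_per_key(rows, key="business_key", order="loaded_at"):
    """Keep only the most recent row for each business key."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # A later (or equal) order value replaces the row kept so far.
        if current is None or row[order] >= current[order]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical batch: customer C1 appears twice with different first names.
batch = [
    {"business_key": "C1", "first_name": "Jon", "loaded_at": 1},
    {"business_key": "C1", "first_name": "John", "loaded_at": 2},
    {"business_key": "C2", "first_name": "Ann", "loaded_at": 1},
]
deduped = latest_per_key(batch)
```

In an SSIS data flow the same idea would sit upstream of the SCD transform, so that it sees each business key only once per run.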
I have another, more conceptual question: I am not sure which is the best way to fill my fact and dimension tables. Should I fill the Customer and Location dimensions first, through 2 separate passes over the data, and then fill the fact table and map it to the corresponding dimensions? Or should I run a for-each loop over the records and fill the dimensions and the fact table simultaneously for each record?
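The "dimensions first, then facts" option can be sketched in Python as follows. This is only an illustration of the ordering, not an SSIS package: the field names (`customer_id`, `city`, `amount`) and the in-memory key maps are assumptions made for the example.

```python
def load_dimension(rows, natural_key):
    """Assign a surrogate key to each distinct natural key, in arrival order."""
    key_map = {}
    for row in rows:
        key_map.setdefault(row[natural_key], len(key_map) + 1)
    return key_map


# Hypothetical flat file: sales, customer and location details on one row.
source = [
    {"customer_id": "C1", "city": "Oslo", "amount": 10.0},
    {"customer_id": "C2", "city": "Rome", "amount": 5.0},
    {"customer_id": "C1", "city": "Rome", "amount": 7.5},
]

# Pass 1 and 2: load each dimension and remember its surrogate keys.
customer_keys = load_dimension(source, "customer_id")
location_keys = load_dimension(source, "city")

# Pass 3: load the fact rows, looking up the surrogate keys.
fact_sales = [
    {
        "customer_key": customer_keys[r["customer_id"]],
        "location_key": location_keys[r["city"]],
        "amount": r["amount"],
    }
    for r in source
]
```

Loading the dimensions first means every fact row can find its keys with a plain lookup, which is why this ordering is usually easier to reason about than a row-by-row loop.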
I made an illustration that describes the scenario. There is one master table, one dimension table and a second fact table. The master table holds all objects. Sometimes I have additional information for some objects in the master table. This is stored in extra tables, such as the table Optional_facts_for_villas.
My question: How can I combine these two fact tables in Cube Designer without doing a Full outer Join in my Database?
The reason I want it this way is that not all optional_fact_tables are necessary for OLAP now, but they may be later. So I thought the easiest way would be to add them to the cube later, without changing the database. Sorry for my English, it's not the best :)
I'm defining a mining structure against an OLAP dimension. The continuous value that I'm using both as input and for forecasting represents the time to complete a certain process.
There's something that strikes me as a possible problem, but I'm not sure. Our fact table has multiple columns (with multiple corresponding measures in the cube). The "time-to-complete" measure is only populated on some of the fact rows: the rows that represent completion information. Other rows represent other information, and the "time-to-complete" value is set to 0. This works fine for cumulative time-to-complete and average time-to-complete, but it seems like it could mess up data mining. Will those 0-value facts skew the mining results? I'm not seeing a way to filter out those entries and only include the non-zero facts in the mining processing.
Or perhaps I'm totally misunderstanding something, which is quite possible. :)
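To make the concern concrete, here is a small Python sketch with made-up numbers. It is not tied to any mining algorithm; it only shows how much a plain average moves when the placeholder zeros are treated as real observations.

```python
from statistics import mean

# Hypothetical fact rows: 0 means "this row is not a completion row".
time_to_complete = [0, 12.0, 0, 0, 8.0, 10.0]

# Treating every row as an observation drags the average toward zero.
all_rows_mean = mean(time_to_complete)

# Keeping only completion rows gives the average of real measurements.
completed_only = [t for t in time_to_complete if t != 0]
completed_mean = mean(completed_only)
```

With these numbers the first average is 5.0 and the second is 10.0, so a model trained on all rows would be learning from values that are not real completion times.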
I have some source tables: Customer, Order, ship, item and invoice. From these I have to create 5 dimension tables and 1 fact table called orderFact, using SQL Server queries, just to test data. So I created the 5 dimensions, pulled the dimension keys from each one and loaded them into the fact table using joins. For the measures, I joined the 5 source tables to create a Rawfact table that holds all the measures.
To load the fact table, I join Rawfact to all the dimensions to get the keys, and pull the measures directly from Rawfact. Is this process right, or can it be done some other way?
I also want to avoid any Cartesian product in the queries below. What can I do to avoid this?
Dimension: DimCustomer, DimOrder, DimShip,DimItem, DimInvoice and Fact is FactOrder:
Loading Rawfact:
select o.ord_id, o.full_order_value, o.open_order_value, o.div_code, o.order_type_code, o.order_status, o.order_date,
  s.num_of_pallets, s.num_of_cartons, s.shipment_value, s.ppd_coll, s.ship_status,
  i.invoice_amt,
  it.net_weight, it.gross_weight, it.warranty_days, it.item_type, it.item_num,
  c.terr_code, c.largest_bal, c.last_amt_pay, c.last_inv_amt, c.num_invoice_paid, c.cust_num
from order o
Hi, I have noticed that the cubes we have here use shared dimensions. Almost all of the cubes (5-6) have at least 4-5 dimensions in common. From what I have been taught so far, shared dimensions exist so that you can reuse them. That is not what is practised here. For example, cube1 has somedim1, dim2_c1, dim3_c1... and cube2 has xyzdim1_c2, xyzdim2_c2, dim3_c2...
dim3_c1 and dim3_c2 are the same dimension, defined once for each cube. I don't know if I am missing something. Shouldn't they use the same dimensions? Could there be any reason for this? Please advise.
I am new to SSIS and I am investigating using the Slowly Changing Dimension transform.
The data source that I receive is a daily snapshot of the external source system table. I need to store the history of the entity attributes (Type 2 SCD) and I am using the Start / End Date mechanism.
When an entity (identified by the business key) is no longer received in the source snapshot, I would like the data flow to update the End Date of the current row to show that the entity has now expired.
Does anyone have any suggestions for a good way to achieve this ?
NB: Changing the source system extract to include and flag expired entities is not an option for me.
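The usual idea is to compare the set of business keys in today's snapshot with the current rows of the dimension and close any current row whose key is missing. A minimal Python sketch of that comparison, assuming hypothetical column names (`business_key`, `start_date`, `end_date`) and that an open row is marked by an empty end date:

```python
from datetime import date


def expire_missing(dimension_rows, snapshot_keys, load_date):
    """Close the current row of every business key absent from the snapshot."""
    expired = []
    for row in dimension_rows:
        is_current = row["end_date"] is None
        if is_current and row["business_key"] not in snapshot_keys:
            row["end_date"] = load_date
            expired.append(row["business_key"])
    return expired


# Hypothetical dimension with two current rows; only "A" is in today's snapshot.
dimension = [
    {"business_key": "A", "start_date": date(2024, 1, 1), "end_date": None},
    {"business_key": "B", "start_date": date(2024, 1, 1), "end_date": None},
]
expired = expire_missing(dimension, snapshot_keys={"A"}, load_date=date(2024, 2, 1))
```

In a database this comparison is typically a single set-based update run after the normal SCD processing, rather than a row-by-row loop.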
I have a question regarding a problem with two dimensions I built.
The first is named Account and contains approx 40k records. The second dimension is named Contact and contains the emps from the Account dim and contains approx 58k records. In the cube I also have two measures, One is a count of courses a Contact has taken. The second measure is a count of certifications a Contact may have earned.
The Account dim table has an AccountKey primary key and the Contact dim table has a ContactKey primary key and an AccountKey foreign key to the Account table. The key fields are not operational keys. They are surrogates. Both Contact and Account dim tables have as the first 10 or so records values that are used as parent groupings in the Cube dimension. For instance.
key = 1, name value = 'A-C'
Each subsequent value has the parent grouping's key value set as its parentkey.
The fact table contains both the AccountKeys and ContactKeys and an ItemId that corresponds to a specific course or certification. This itemid is used for the measures in the Cube
That's the background... here's my problem.
Using BIDS or Mgmt Studio, whenever I add the root dimAccount level (actual account names) as a row, then add the root Contact level (Contact names) as another row and drill down to a specific Account's contacts, everything locks up. I have only one measure in the data pane. The fact table only has about 20k records in it. I would think this should return data instantly. If I browse the cube with any other combination of dimensions besides the Contact and Account dimensions, the cube runs fine. It is just the combination of Account and Contact. I am getting really frustrated as I cannot figure this out.
I am rusty at SSAS so forgive me if I left out any pertinent info.
I am relatively new to SSIS/SSAS. I have searched the forums but cannot find an answer to my question.
I created a cube in SSAS and have deployed it. Now I am trying to use SSIS to populate the cube. I have set up a DS that points to the SSAS instance; it uses the OLE DB Provider for Analysis Services 9.0.
When I try to use a data flow task OLE DB source to truncate the dimension/cubes I do not see the DS in the list to select?
I am finding it hard to get into the SSIS way of organizing the processing.
I have a problem where I have three measures in a virtual cube: "Actual", "Budget" and "Full Year Budget".
The dimensions I have are:
- Account No_ / Name
- Cost Code
- Sub Cost Code
- Time/Dates
- Budget Name
Both the "Actual" and "Budget" measures need to be filtered/dimensioned by:
- Account No_ / Name
- Cost Code
- Sub Cost Code
- Time/Dates (exclusive to "Actual" and "Budget")
Thus I have put these in one cube.
"Full Year Budget" needs to be filtered/dimensioned by:
- Account No_ / Name
- Cost Code
- Sub Cost Code
- Budget Name (exclusive to "Full Year Budget")
Thus I have put this in a second cube.
I then created a virtual cube from the 2 cubes, thinking that the dimensions I created in each original cube would only filter that cube's measures in the virtual cube. But all dimension filters in the virtual cube filter all measures in the virtual cube, irrespective of which original cube the dimensions were created with.
I am building a health care application that marries transaction-level data (health care services provided) with person-level characteristics that have a time-dimension. The person-level characteristics are diseases that the person has (these disease all have a start and some have an end date). The diseases are stored in a table in which the foreign keys are a person-identifier, a time identifier (month/year) and a surrogate for the disease. Persons can have more than one disease at a time (the diseases are NOT mutually-exclusive). There are no measures in this table. The transaction table has a foreign key for person and time (month/day/year), a procedure code (the type of service rendered) and money (the cost of the services).
How do I answer the following questions:
What is the total cost of care (the sum of all service costs) last year for persons with "disease A"?
What is the total cost of care last year for persons with "disease A" AND "disease B"?
What is the total cost of care last year for persons with "disease A" OR "disease B"?
I've tried a factless fact table but can't get it to work. If anyone has the right solution and can communicate to me before I slit my wrists, I would be greatly appreciative!!!
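Independent of how the cube is modelled, the three questions reduce to set logic on persons: build the set of persons with each disease first, then sum costs for the persons in the intersection or union. A minimal Python sketch with made-up data follows. The date filtering ("last year", disease start and end dates) is deliberately left out to keep the set logic visible, and all names are hypothetical.

```python
# Hypothetical factless rows: (person, disease).
diagnoses = [("p1", "A"), ("p1", "B"), ("p2", "A"), ("p3", "B")]
# Hypothetical transaction rows: (person, cost).
services = [("p1", 100.0), ("p2", 40.0), ("p3", 25.0), ("p4", 60.0)]


def persons_with(disease):
    """Set of persons who have the given disease."""
    return {person for person, d in diagnoses if d == disease}


def total_cost(persons):
    """Sum service costs once per transaction for the given persons."""
    return sum(cost for person, cost in services if person in persons)


cost_a = total_cost(persons_with("A"))
cost_a_and_b = total_cost(persons_with("A") & persons_with("B"))
cost_a_or_b = total_cost(persons_with("A") | persons_with("B"))
```

Working from a set of persons is what prevents double counting: p1 has both diseases, but p1's costs are added only once in the OR case.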
When I open my Cube the default dimension on Rows is the dimension name that is first on the alphabetic list and on the columns it defaults to the time dimension.
I need to specify a specific dimension to be shown on rows and columns when the cube is viewed.
Hi all, any advice or help is greatly appreciated. I need to process dimensions and cubes overnight. What is the best and most reliable way of achieving this?
Is there a way to define a measure group on the fact1 details table using TimeDim? Date info is only in the fact1 table and not in the details table.
Is a named query joining the 2 fact tables the best solution in this case? Using one fact table would introduce a lot of redundancy, so the underlying SQL database uses 2 tables.
I am trying to calculate the median value using one of the measures and a dimension value.
Time is a measure in my cube and OpId is one of the dimensions. The result is as follows:
opid time median
1 55
2 23
3 23
Total 23
The Time shown here for OpId 1 is the aggregation of all the rows whose OpId is 1. I want the median of the values whose OpId is 1, which is not showing at the moment.
What I am getting here is the median across all of the OpIds, but what I really want is the median for each individual OpId as well.
I am using a calculated field Median with the following expression.
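For reference, this is the result being asked for, sketched in Python on made-up detail rows (the original row-level values are not shown in the post, so the numbers below are hypothetical): one median per OpId computed from that OpId's own rows, plus one overall median.

```python
from collections import defaultdict
from statistics import median

# Hypothetical detail rows: (op_id, time).
rows = [(1, 20), (1, 35), (1, 55), (2, 23), (3, 10), (3, 23), (3, 40)]

# Group the individual time values by OpId.
times_by_op = defaultdict(list)
for op_id, t in rows:
    times_by_op[op_id].append(t)

# Median within each OpId, computed from that OpId's rows only.
median_by_op = {op_id: median(ts) for op_id, ts in times_by_op.items()}

# Median over all rows, which is what the Total line should show.
overall_median = median(t for _, t in rows)
```

The point of the sketch is that the median has to be taken over the detail rows inside each OpId, not over the already aggregated Time value per OpId.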
I'm new to MDX, and most of the time I customize existing queries rather than writing new ones. I currently have a MDX query like this
SELECT [Measures].[Fees Billed] on 0,
except([Age].[Day Buckets].members, {[Age].[Day Buckets].[All], [Age].[Day Buckets].&[Unknown]}) on 1
FROM MyCube
WHERE ([Fiscal Period].[Fiscal Year].&[2015], [Customer].[City].&[Auckland])
Which brings the fees billed by age buckets where the customer's city is Auckland. I also have another dimension called [Sales Agent] with a member [City] in it, and there is a member in [Customer] called [Customer].[Sales Agent]
I am trying to retrieve the same information where the customer's sales agent's city is Auckland rather than the customer's city.
If this were SQL, I would join Customer and SalesAgent on Customer.SalesAgentUno = SalesAgent.SalesAgentUno and bring in the desired data. Is there any way to do this in MDX?
I have a fact table with a create time dimension and an expiration time dimension. I'd like to have a calculated member that would compare the (count for create time) / (count with that expiration time). I already have these counts as measures.
I would be able to put the create time dimension in the "row fields" area, and see the ratio (calculated above) over the different create time periods.
Can someone point me in the right direction on how I would create that kind of calculated member? What would the MDX look like?
I have questions about Slowly Changing Dimensions. I am quite confused about when we should use Type 1 (changing), Type 2 (historical) or Type 3 (fixed) for the dimensions in each table. Are there any good suggestions on that?
Thank you in advance and I am looking forward to hearing from you.
I need to create a package that updates the dimensions and cube data from a data warehouse on daily basis. I was going to create a Data Flow that takes the data from the DW source then put it as input to a Process Dimension destination to update the dimensions and use a Process Partition destination in the same manner to update the cube, but then I came across the Analysis Services Processing Task which seems to do the job as well. I am kinda confused which way to go. Any recommendations?
I created a report model based on the cube. The problem is that it retrieves all dimensions from the cube, regardless of whether or not they are checked in the perspective.
It does work with Measure but not with Dimensions.
I would appreciate it if someone could drop a line here and help me out.
I need to show the dimensions of my model like columns in the result. I have this query
with
member [Measures].[Customer] as [Customers].[Customer].CURRENTMEMBER.Name
member [Measures].[UCs] as [UCs].[UC].CURRENTMEMBER.Name
member [Measures].[Order Type] as [Order Types].[Order Type].CURRENTMEMBER.Name
member [Measures].[UC Dates] as [UC Dates].[UC Date].CURRENTMEMBER.Name
I have an archive of an Analysis Services database that was created on a server that is not accessible to me. I also have a copy of the source SQL Server database that it uses as a data source. I have restored both of these to my server. I have figured out how to change the data source to point to my server for the fact tables referenced, but I can't figure out how change the data source for the shared dimensions. I would like to be able to do work on this version of the database, but I get errors when I try to browse the dimension data because it can't connect to the original data source. Any ideas?
I want to do a distinct sum on a measure group. Please see the sample table below.
XL measure group:
LK  OK  Amount
1   10  100
1   11  100
3   30  250
3   31  250
3   32  250
Two dimensions have relationships with the above measure group: the L dimension is related to XL on LK, and the O dimension is related to XL on OK. If I drag L dimension attributes, it should show results as below:
LK  LName  Amount
1   A      100
3   C      250
But the results actually come out as below:
LK  LName  Amount
1   A      200
3   C      750
If I drag O dimension attributes along with the L dimension, it should show results as below:
LK  LName  OK  OKName  Amount
1   A      10  XYZ     100
1   A      11  UVW     100
3   C      30  PQR     250
3   C      31  KLM     250
3   C      32  TUV     250
I used the formula Measures.Amount/Measures.Count, but it does not show correct results when I don't drag any dimensions: it shows 425 for the All member, but it should show 350.
So I changed it to ([L].[LK].Currentmember, Measures.Amount)/([L].[LK].Currentmember,Measures.Count). This worked fine, but performance was very poor, so I stopped working on it.
At last I built the measure group like this:
LK  OK  LAmount  OAmount
1   10  100      100
1   11  0        100
3   30  300      300
3   31  0        300
I want to show Measures.LAmount when only the L dimension is queried, and Measures.OAmount when both the L dimension and the O dimension are queried. Is this possible?
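The target numbers can be checked outside the cube with a short Python sketch over the first sample table (LK, OK, Amount). It assumes, as the sample suggests, that Amount is an attribute of LK that repeats on every OK row; that assumption is what makes "take it once per LK" valid.

```python
# Rows from the sample XL measure group: (LK, OK, Amount).
rows = [(1, 10, 100), (1, 11, 100), (3, 30, 250), (3, 31, 250), (3, 32, 250)]

# A plain sum repeats the amount once per OK row.
naive_by_lk = {}
for lk, _ok, amount in rows:
    naive_by_lk[lk] = naive_by_lk.get(lk, 0) + amount

# Distinct sum: keep the amount once per LK (assumes it is constant per LK).
distinct_by_lk = {lk: amount for lk, _ok, amount in rows}

# Grand total across all LK values, counted once each.
grand_total = sum(distinct_by_lk.values())
```

This reproduces the wrong figures (200 and 750), the wanted per-LK figures (100 and 250) and the wanted All total of 350.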
Is it possible to filter out a measure only at the intersection of two dimension members? I have a date dimension, a Hospital dimension and a wait time measure.
For Example, is it possible to filter out Wait time for Bayside Hospital for the Month of June 2015?
I want Wait time to continue to be displayed for all other months and roll up into the totals without the filtered value.
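The desired behaviour, sketched in Python on made-up rows: one (hospital, month) cell is excluded, every other cell is kept, and the totals are rebuilt from the kept cells only. The wait-time values and the month format are hypothetical.

```python
# Hypothetical rows: (hospital, month, wait_minutes).
wait_times = [
    ("Bayside", "2015-05", 30),
    ("Bayside", "2015-06", 45),
    ("Central", "2015-06", 20),
]

# The single intersection to suppress.
excluded = {("Bayside", "2015-06")}


def totals_by_hospital(rows, excluded_cells):
    """Sum wait time per hospital, skipping the excluded (hospital, month) cells."""
    totals = {}
    for hospital, month, minutes in rows:
        if (hospital, month) in excluded_cells:
            continue
        totals[hospital] = totals.get(hospital, 0) + minutes
    return totals


totals = totals_by_hospital(wait_times, excluded)
grand_total = sum(totals.values())
```

Here Bayside keeps its May value, Central keeps its June value, and the grand total no longer includes Bayside's June figure.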
I have made a calculated member for the previous period of a given date range. The previous period is the same date range in the previous year, and I have managed to achieve that with the calculated member:
Create member currentcube.[Measures].[PrevPeriod] as (ParallelPeriod( [Start Date].[Cal Hierarchy].[Year], 1, [Start Date].[CAL Hierarchy].CurrentMember), [Measures].[Count]);
This member returns the correct result as long as my query uses the time dimension, which makes sense... but I also need to show results sliced by other dimensions in bar charts that do not display the time dimension. For example, I have a dimension with only 3 members called [Region].[Area].[AreaName].
The result set for the bar chart needs to look like this:
[AreaName] | [Count] | [PrevPeriod]
East       |   43    |   56
West       |   53    |   95
But the [PrevPeriod] only returns values if I include the time dimension. I essentially need to sum the results of the time dimension/AreaName/[PrevPeriod] tuple down to just Areaname/[PrevPeriod] for whatever date range may be involved.
I don't know if this is significant to the issue, but the client tool that generates the bar charts builds the query with the date range as a subcube in the FROM statement. If the [PrevPeriod] is outside of the subcube that is still OK, as long as the time dimension is included in an Axis on the final select statement, so at least I know I am not suffering from the members inside the subcube. I've also found in SSMS that it makes no difference if I make the query a subcube, or put the date range in a where clause instead; I still get NULL for [PrevPeriod] without the dates.
I can't imagine that this is an unusual situation, so I hope I've explained it adequately! What is the recommended technique for summarizing a Parallelperiod by dimensions without displaying the time/dates ?
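The calculation being asked for can be stated outside MDX: shift the selected date range back one year, then sum the counts by area over the shifted range. A minimal Python sketch follows. The area totals match the example table above, but the individual dates are made up, and the one-year shift assumes the range does not touch 29 February.

```python
from datetime import date

# Hypothetical daily facts: (day, area, count).
counts = [
    (date(2014, 3, 10), "East", 56),
    (date(2014, 3, 20), "West", 95),
    (date(2015, 3, 10), "East", 43),
    (date(2015, 3, 20), "West", 53),
]


def shift_year(day, years=-1):
    """Same calendar day in another year (would fail on 29 February)."""
    return day.replace(year=day.year + years)


def totals_by_area(start, end):
    """Sum counts per area for days inside the inclusive date range."""
    totals = {}
    for day, area, n in counts:
        if start <= day <= end:
            totals[area] = totals.get(area, 0) + n
    return totals


start, end = date(2015, 3, 1), date(2015, 3, 31)
current = totals_by_area(start, end)
previous = totals_by_area(shift_year(start), shift_year(end))
```

The sketch shows why the time members are needed even when they are not displayed: the previous-period value is an aggregate over the shifted members of the selected range, not a property of the area alone.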
I have a cube with 2 many-to-many dimensions where a special mdx query needs about 5 seconds. When I resolve the many to many relationships by multiplying the data in the fact table the query needs 21 seconds.
In general do many-to-many dimensions slow down query performance of a cube?
Without the many-to-many dimensions of course the fact table has much more rows. Could this be the reason for the performance loss?
How can the query performance of a cube be tuned in general?
-- Error 1 OLE DB error: OLE DB or ODBC error: The query has been canceled because the estimated cost of this query (628) exceeds the configured threshold of 300. Contact the system administrator.; 42000. 0 0
Error 2 Errors in the OLAP storage engine: An error occurred while the dimension, with the ID of 'Revenue Labeled Prod ~MC-LPROD', Name of 'Revenue Labeled Prod ~MC-LPROD' was being processed. 0 0 --
I don't understand why this happens. Previously I had created one cube and 4 dimensions, and I got the same error. I thought the cube was the culprit, so I removed the cube and the dimensions. After removing them, I built the project successfully, but processing the mining model failed again. The mining model is fairly simple, with only 3 columns (one key time, one key, and one column used as both input and predicted column, using the Microsoft Time Series algorithm).
Why is the estimated cost even higher when I create another project using only one table (Revenue, the same fact table)?
Error 1 OLE DB error: OLE DB or ODBC error: The query has been canceled because the estimated cost of this query (1493) exceeds the configured threshold of 300. Contact the system administrator.; 42000. 0 0
I am working on a local machine, so there should be no network-related issue when querying. The machine is a 2-processor Xeon 2.4 GHz with 3 GB of memory.
How can I solve this problem? I have checked the Properties of Analysis Services. I have set a higher value for the timeout in ODBC Administrator.
I would like to know if there is an easy way to load Hierarchical Dimensions with Type 2 using SSIS ( or in general).
Here is my example:
There is a Hierarchy Product Group <- Product Class <- Product. (In words, Product rolls up to Product Class and Product Class Rolls up to Product Group). Say Product Group is Type 1 and the other two are Type 2. Every night, I receive a feed for each of these tables only if the record is new or a change to an existing record.
Now, when a Product Class Record is changed to assign it to a different group then I receive the feed only for Product Class (but not for Product). To load this record, I expire the old record and create a new entry with a new Surrogate Key. Then how do I automatically cascade this change to Product and make a Type 2 change to use the new Surrogate Key from Product Class?
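One way to think about the cascade, sketched in Python rather than SSIS: after the Product Class row gets its new surrogate key, every current Product row that points at the old class key is expired and re-inserted with the new class key. All column names (`product_key`, `product_id`, `class_key`, `is_current`) and the key values are hypothetical.

```python
def cascade_class_change(products, old_class_key, new_class_key, next_key):
    """Expire current products under the old class row and re-insert them under the new one."""
    new_rows = []
    for row in products:
        if row["is_current"] and row["class_key"] == old_class_key:
            row["is_current"] = False  # Type 2: close the old version
            new_rows.append({
                "product_key": next_key,  # new surrogate key for the new version
                "product_id": row["product_id"],
                "class_key": new_class_key,
                "is_current": True,
            })
            next_key += 1
    products.extend(new_rows)
    return new_rows


# Hypothetical Product dimension; class key 10 has just been replaced by 11.
products = [
    {"product_key": 1, "product_id": "P1", "class_key": 10, "is_current": True},
    {"product_key": 2, "product_id": "P2", "class_key": 20, "is_current": True},
]
added = cascade_class_change(products, old_class_key=10, new_class_key=11, next_key=3)
```

Whether this cascade is wanted at all is a design choice: it keeps history exact at the Product level, at the cost of a new Product row for every child each time a parent changes.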
I have created a cube in Analysis Services with a time dimension named Time. The data is only needed at the month level, so the fields in the table that the dimension is based on are [DTE MM] and [DTE YR]. In Visual Studio 2005 or in SQL Server Mgmt Studio I can browse the dimension, and I see that Time has the following attributes: Year and Month (both Regular attributes) and Time (Key attribute). There is also a hierarchy named [Year - Month], and if I browse that it looks good: I can drill down from All to Year to Month.
However, when I point my Reporting Services Datasource to the cube, and start Query Builder, I see two dimensions, [DTE MM] and [DTE YR] and no Time dimension.
This is causing me huge problems in my report. I need to use a date range in the query. To do this, I created Query Parameters and refer to Report Parameters that will be passed in. For example, I use ="[DTE YR].[Year - Month].[Year].&[" + Parameters!StartYear.Value + "].&[" + Parameters!StartMonth.Value +"]" (thanks to Simon Philips) as the query parameter for the Start Year Month, but if I use the [DTE YR].[Year - Month] in my SELECT, the data is not sliced in my query results. That is to say that the Year and Month show correctly, according to the date range, but the Measure is equal for each month and is equal to the total for the cube. To slice the data, I have to use [DTE YR].[Year].[Year] and [DTE MM].[Month].[Month] in my select, but then the data is shown for every month, i.e. the date range is ignored.
What am I doing wrong, or is this a quirk of RS that I can work around?