Tricky Schema Question - Dimension Can Split And Combine Over Time
Jul 20, 2005
Hi all,
I'm working on the schema for a database that must represent data about stock
& bond funds over time. My connundrum is that, for any of several dimension
fields, including the fund name itself, the dimension may be represented in
different ways over time, and may split or combine from one period to the
next.
When querying from the database for an arbitrary time period, I need the data
to be rolled up to the smallest extent possible so that apples can be compared
to apples. For instance, if the North America region becomes 2 regions, USA
and Canada, and I query a time period that spans the period in which this
split occurred, I should roll up USA and Canada and for records in the period
that has both, and I should call the result something like "(North
America)(USA/Canada)" in the output. The client specifies that the dimension
output must represent all dimensions that went into the input.
Of course, I have to account for more complex possibilities as well, e.g.
Fund-A splits into Fund-B and Fund-C, then Fund-C merges into Fund-D producing
(Fund-A/Fund-D)(Fund-B/Fund-C/Fund-D)(Fund-B/Fund-D)
I can think of several ways to handle this issue, and they're all
extraordinarily complex and ugly. Any suggestions?
Need some help building a query that does the following :
I have 2 Time Dimensions ; Time (Transdate) and ClosedDate (ClosedDate)
In my report/query, if [Time].CurrentMember = [Time].[YMD].[YMD].[2006].[200610].[20061031] I want to FILTER out all ClosedDate < [ClosedDate].[YMD].[YMD].[2006].[200610].[20061031]
Both Time Dimensions are Year -> Month -> Day and have the same Members.
I have every option available, using calculated Members and/or Measures to do this.
The report I'm creating is Aging of Receivables : Balance / 30 days / 60 days / etc.. But for the Aging, I need to filter like explained above.
Argent Need to split olap dimension members into three different dimension members example account dimension member = 11111200900 account_num.member = 11111 item_num.member = 200 sub_item = 900
I€™ve created with the help of some great people an SSIS 2005 package which does the follow so far:
1) Takes an incoming txt file. Example txt file: http://www.webfound.net/split.txt
The txt file going from top to bottom is sort of grouped like this Header Row (designated by €˜HD€™) Corresponding Detail Rows for the Header Row €¦.. Next Header Row Corresponding Detail Rows
€¦and so on
http://www.webfound.net/rows.jpg
2) Header Rows are split into one table, Maintenance Detail Rows into another, and Payment Detail Rows into a third table. A uniqueID has been created for each header and it€™s related detail rows to form a PK/FK relationship as there was non prior to the import, only the relation was in order of header / related rows below it when we first started. The reason I split this out is so I can massage it later with stored proc filters, whatever€¦
Now I€™m trying to somehow bring back the data in those table together like it was initially using a query so that I can cut out each of the Header / Detail Row sections into their own txt file. So, if you look at the original txt file, each new header and it€™s related detail rows (example of a cut piece would be http://www.webfound.net/rows.jpg) need to be cut out and put into their own separate txt file.
This is where I€™m stuck. How to create a query to combine it all back into an OLE DB Souce component, then somehow read that souce and split out the sections into their own individual txt files.
The filenames of the txt files will vary and be based on one of the column values already in the header table.
Needs to b converted to this (ignore the underscore, used for spacing):
---M-------|-------M & Rx----|---Rx--
The time spans can slide either way.
Data example:
MemberID Eff_Date Term_Date Med_COB Rx_COB 1 20050101 20050912 Y N 1 20050310 20051120 N Y 1 20060101 <null> Y N 1 20060101 <null> N Y
Resulting Records need to be in this format:
MemberID Eff_Date Term_Date Med_COB Rx_COB 1 20050101 20050310 Y N 1 20050311 20050912 Y Y 1 20050913 20051120 N Y 1 20060101 <null> Y Y
Any help with this problem would be greatly appreciated. We are running SQL2K. I like most people,would like to stay away from cursors and loops if possible.
I was working with Microsoft Time Series model (MTS) with some data, when in the mining model viewer, decision tree tab, I realized that the key time variable that I define, it was acting like a split variable.
So, I ask you, this is possible?, because, for me, this should not happen€¦.
After, I review the Data Mining Tutorial by Seth Paul, Jamie MacLennan, Zhaohui Tang and Scott Oveson, and I found, in the Forecasting part, that the key time variable (Time Index) it was acting like a split variable too, in for example, M200 pacific:Quantity and R250 Europe:Quantity.
So people, it€™s possible that a key time variable act like a split variable in a MTS model?
We have a piece of software and database for student registers.
One of the biggest problems is the database has been difficult to work with, as I have had to work out the date using dateadd functions and combine fields from different tables.
I've now got the date correctly, but I want to add the time to the time part of a datetime field.
The time field is already stored in a datetime field, but the date in this field is always 1899-12-30.
I am trying to make a time dimension in analysis services. Is it possible to include the hours of a day in the dimension ? Is necessarily the day level the lowest of the hierarchy?
What are the pros and cons to using SSAS to create time dimensions based on a date field in the fact table as opposed to a stand alone time dimension table.
I can see many problems with loading a time dimension table. The date is from the same table as most of my fact data. I have a column in my OLTP that sets last change date so I can tell if my fact data is an insert or update but it wouldn't tell me if the date column had been changed, just some column in the table. I'm going to have several thousand sales on any given day so I'll be reading a lot of rows just to put one row into the dimension.
From a SSIS point of view I'd think leaving the date in the fact table would be better.
I have extracted the data from Point of Sales. it has two columns one for date and one for time in the database. I need to combine both column into single column with the format "mm/dd/yyyy hh:mm:ss" before i import the data into the SQL server for my BI project.
Example: Data extract from Point of Sales FDATEFTIME 20060114063616 20060115070743 20060116071020
How can i combine those two when i import into SQL server like below: FDATE 01/14/2006 06:36:16 01/15/2006 07:07:43 01/16/2006 07:10:20
I have a fact table which contains the transaction date, ProductID, QTySold, TotalSaleAmount, etc...
Since I am new to OLAP therefore I need help to now create the table for TIME on which I will be basing my time dimension... I have read a few articles and have gathered that at the end of the exercise my fact table should have a 'timeid' column which will be linked to the same column in the table being used for the time dimension...
I have gone through the tutorial of MS-Analysis Service and FoodMart example have some idea about what he structure of this table will be.
My questions are:
1. I need guidance on how to create the table for time. One option is to just copy the table used in the Foodmart example but thought that might work but my concept will not be clear
2. The structure of the table to be used for time dimension is quite clear (i think this part is easy). What I want to understand is that how do i POPULATE this table which data? Can some one provide me with scripts, SPs, or whatever to do this.... This is the area where I am lost...
3. How will I enter the "TheDate" column in my fact table and link it with my table for time dimension...
Looking forward to someone's help.
BTW, I would like to share a very good article which i recently found in one of the newsgroups. Some of you might like appreciate it too: http://www.sqljunkies.com/Article/D1E44392-592C-40DB-B80D-F20D60951395.scuk
Hello: Very soon my company will be moving to a 4-4-5 reporting schedule. Basically, what this means is that the first month of the quarter will have 4 weeks, the second will have 4 weeks, and the third will have 5 weeks. Therefore, for the 2007 the dates for Jan, Feb and Mar will be as follows: Jan - 1 - 27 Feb - 28 - 24 Mar - 25 - 31
Currently, I have an SSIS package creating a record for each day in the Time Dimension. Is there any script out there that will help me build a Fiscal calendar such as the one described above?
I realize that this is not a direct SSIS question but I figured that some of you might have encountered this situation and hence my post.
Hi, I got inventory fact table. For the past two weeks, I got on a daily level; beyond that, weekly level, and beyond that monthly. I need to tie it all in to the time dimension of course €“ and the problem is, how do I do it on different granularity? As far as time dimension, tn the datamart, I got tables dim_date with key column date_id (int) , and correspondingly dim_week with week_id(int) and dim_month with month_id(int).
What I€™ve done so far, is created a time dimension from dim_date table (meaning granularity=daily) and simply tied in all the inventory €“ daily, weekly, monthly, on the day level (it all has date_id field in it, even the weekly and the monthly. Its simply the day of the end of week or end of month) I didn€™t tie anything to dim_week and dim_month. Does that makes sense? The result is kind of strange. I cant upload an image here, but€¦ well it seems ok, I got year, week (€˜GL Week€™) and then €¦ this is the annoying thing: why am I getting a €˜date€™ column, when I only want it by week or by month? I can€™t make that column disappear (e.g when in time hierarchy I only group by month, still a €˜day€™ column will be there, and will show 4 days€¦ the 4 €˜end of week€™ that comes from €˜date_id€™.) How do I make it go away??
When i add a dimension to the cube dimension without any relation in my dimension usage to any measure group my units are going down.However when i remove the dimension from the cube am getting the correct values.
Im trying to design my time dimension and need to add a field tohandle null dates in the fact. So if at the time of ETL the date isntknown, referential integrity will be preserved. Kimball suggestsinsterting a record in the time dimension to handle this with adescription of 'Date not available' or something like that. Howeverif users are doing inner joins on the dimension they will obvisouslybe pulling the datetime field..what should be in the datetime fieldfor this particular record?
Hi All, I've looked around but can't seem to find an answer for this. I have a cube that has a fairly large time dimension (going back to 1949) as the data demands this. When a user is browsing the cube and applies a filter or adds the a time heirarchy as a value it's always sorted from oldest to newest. Whilst the need is there to look at data from 1949 most people want to look at the last few years. The problem is they have to scroll right down the list to get to this. Is there a way of having the most recent years of the time dimension appear first in these lists to make them more accessable?
Good morning.I am importing an XLS file into one of my tables. The fields are:Date Id Time IO12/22/2006 2 12:48:45 PM 912/22/2006 16 5:40:55 AM 112/22/2006 16 12:03:59 PM 2When I do the import, I get the following:Date Id Time IO12/22/2006 12:00:00AM 2 12/30/1899 12:48:45 PM 212/22/2006 12:00:00AM 16 12/30/1899 5:40:55 AM 112/22/2006 12:00:00AM 16 12/30/1899 12:03:59 PM 2Here are my doubts:1. Would it be better to combine the Date & Time fields into onecolumn? If so, how?2. What issues or problems might I have when I program SQL reports, ifI leave the fields as they are?Any comments or suggestions will be very much welcomed.Cheers mates.
I am importing an XLS file into one of my tables. The fields are:
Date Id Time IO
12/22/2006 2 12:48:45 PM 9
12/22/2006 16 5:40:55 AM 1
12/22/2006 16 12:03:59 PM 2
When I do the import, I get the following:
Date Id Time IO 12/22/2006 12:00:00AM 2 12/30/1899 12:48:45 PM 2 12/22/2006 12:00:00AM 16 12/30/1899 5:40:55 AM 1 12/22/2006 12:00:00AM 16 12/30/1899 12:03:59 PM 2
Here are my doubts:
1. Is it be better to combine the Date & Time fields into one column? Advantages/Disadvantages? 2. If I don't combine them, should I use varchar or datetime data type? 2. What issues or problems might I have when I program SQL reports, if I leave the fields as they are?
Any comments or suggestions will be very much welcomed.
I have a table which contains all the transaction details for which I am trying to create a CUBE... The explanation below in brackets is only for clarity about each field. Kindly note that I am using the following table as my fact table. Let's call it tblFact
This table contains fields like transaction date, Product (which is sold), Product Family (to which the Product Belongs), Brand (of the product), Qty sold, Unit Price (at which each unit was sold), and some other fields.
I have created a Product dimension based on tblFact. I don't know if this is a good idea or not :confused: Is it okay and am I on the right track or should I base my Product Dimension on some other table (say tblProducts and then in the Cube editor link tblProducts with tblFact based on the ProductID field? Please guide.
Now coming to my last question: Currently I am also creating my Time Dimension based on tblFact. Is this also a wrong approach? 1. Should I instead create the Time Dimension based on a different table (say tblTime) and again link up tblTime and tblFact in the Cube editor?
2. if yes, then how do I create tblTime from tblFact in a manner that it only contains all the transaction dates.
3. Assuming that I should take the tblTime approach, then should this table (tblTime) also contain the ProductID - representing the Product which was sold for each date in tblTime?
I realize that this is a lenghty post but reply and more importantly guidance from the experienced users will be greatly appreciated becuase I have recently started learning/playing around on the OLAP side of things and I know that this is the time that I get my foundations correct otherwise I'll end up getting used to some bad practice and will find it difficult to change my approach to cube designing later down the road.
So many thanks in advance and I eagerly look forward to reply from someone.
I made a cube with time dimension with hieracly year/month/date/hour the problem is that dimension is growin to fast. In older version of MSSQL (2000) the same dimension doesn't grew so much. Any ideas? The table is big (may be around 1 500 000 rows per month) now it contains around 4 500 000 rows.
I have an Analysis Services Cube that I would like to report on. However, the Time Dimension currently only has four columns, Day of Month, Month(name) , Year, and DateKey (DateTime representation at midnight for every day). Thus when I drag the month attribute onto the report, it is sorted April - August - December - etc. instead of Jan - Feb - Mar. How do I fix this? I remember reading something in the MSDN Library about it but I can't find it again now.
We have a set of cubes and dimensions, and we're experimenting with data mining against the cubes (primarily for forecasting applications). We have a custom time dimension (which we call calendar), not generated by the BIStudio wizard. The dimension has year/month/day/hour/... attributes. But when I try to add this Calendar dimension to the mining structure as a nested table using BI studio, it only shows the Year attribute, not the others. Other dimensions seem to show all the attributes.
Is there something we've done wrong in defining our time dimension? What determines which attributes show up as available for selection in BI studio?
There are some 55 members in the arrival year level of the time dimension (1995 - 2055). I am trying to find a way to restrict the number of years returned by this mdx query to the current year - 5. Any help will be appreciated.
WITH MEMBER [Measures].[ParameterCaption] AS '[TIME DIMENSION].[ARRIVAL YEAR].CURRENTMEMBER.MEMBER_CAPTION' MEMBER [Measures].[ParameterValue] AS '[TIME DIMENSION].[ARRIVAL YEAR].CURRENTMEMBER.UNIQUENAME' MEMBER [Measures].[ParameterLevel] AS '[TIME DIMENSION].[ARRIVAL YEAR].CURRENTMEMBER.LEVEL.ORDINAL' SELECT { [Measures].[ParameterCaption], [Measures].[ParameterValue], [Measures].[ParameterLevel] } ON COLUMNS, [TIME DIMENSION].[ARRIVAL YEAR].allmembers ON ROWS FROM [TOURISM CUBE]
This is a cross-post from the Office PPS Planning site: http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=3379691&SiteID=17 I was hoping there may be some additional MDX resources here.
I'm trying to determine if the currentmember of the time dimension is > or < the value of today. I want to use this to change the the value of a member that I display on columns from actual (for prior to today) to forecast (after today). I have this MDX:
with
member [measures].[PYCY] as IIf(VBA!CInt(Format(VBA!Now(),"MM"))<10, VBA!CStr(VBA!CInt(VBA!Year(VBA!Now())-1)), VBA!CStr(VBA!CInt(VBA!Year(VBA!Now())+0)))
member [measures].[CYNY] as IIf(VBA!CInt(Format(VBA!Now(),"MM"))<10, VBA!CStr(VBA!CInt(VBA!Year(VBA!Now())+0)), VBA!CStr(VBA!CInt(VBA!Year(VBA!Now())+1)))
member [measures].[NYNY2] as IIf(VBA!CInt(Format(VBA!Now(),"MM"))<10, VBA!CStr(VBA!CInt(VBA!Year(VBA!Now())+1)), VBA!CStr(VBA!CInt(VBA!Year(VBA!Now())+2)))
member [measures].[test2] as Format(Now(), "yyyyMM")
member [measures].[test4] as [Time].[Day View].CurrentMember.Member_Key
member [measures].[diff] as DateDiff('m', Format(Now(), "yyyyMM"),[Time].[Day View].CurrentMember.Member_Key)
The error is: #Error Execution of the managed stored procedure DateDiff failed with the following error: Exception has been thrown by the target of an invocation.Argument 'Date1' cannot be converted to type 'Date'..
Why can't DateDiff calculate the difference between 200805 & 200610?
The following question might sound a bit stupid but I'm not a database expert so hopefully nobody minds me asking it.
Here's what I did:
1. I created an SSIS package that is supposed to import new data into my data warehouse as it becomes available.
2. Since I need to maintain some of the history I use the Slow Moving Dimensions part (set the history flag on input fields) but run into an error condition while running the package. The message basically says that I'm about to create a duplicate record which is not allowed.
Now on some records the package is supposed to archive history by populating the (new) fields. In order to keep the record unique (primary key constraint) thought, do I need to make the (new) fields primary keys as well?
So I guess I'm struggling with a more basic concept;)
I would appreciate if somebody could shed some light on this.
I have created a cube in Analysis Services with a time dimension named Time. The data is only needed at the month level, so the fields in the table that the dimension is based on are [DTE MM] and [DTE YR]. In Visual Studio 2005 or in SQL Server Mgmt Studio I can browse the dimension and I see that Time has the following attributes: Year and Month both Regular attributes and Time (Key attribute). There is also a hierarchy named [Year - Month] and if I browse that it looks good dril down from All to Year to Month.
However, when I point my Reporting Services Datasource to the cube, and start Query Builder, I see two dimensions, [DTE MM] and [DTE YR] and no Time dimension.
This is causing me huge problems in my report. I need to use a date range in the query. To do this, I created Query Parameters and refer to Report Parameters that will be passed in. For example, I use ="[DTE YR].[Year - Month].[Year].&[" + Parameters!StartYear.Value + "].&[" + Parameters!StartMonth.Value +"]" (thanks to Simon Philips) as the query parameter for the Start Year Month, but if I use the [DTE YR].[Year - Month] in my SELECT, the data is not sliced in my query results. That is to say that the Year and Month show correctly, according to the date range, but the Measure is equal for each month and is equal to the total for the cube. To slice the data, I have to use [DTE YR].[Year].[Year] and [DTE MM].[Month].[Month] in my select, but then the data is shown for every month, i.e. the date range is ignored.
What am I doing wrong, or is this a quirk of RS that I can work around?
I have a monthly time period dimension representing average number of students for each month. At the yearly aggregate level I don't want it to sum up the avg number of students from every month because that number is incorrect. I would like it to use the number of students from the most recent month as a roll up. Is that possible to configure in SSAS?
The scenario is the data comes from various sources and its staged into staging database. From this staging database it goes into data warehouse database. Everyday this staging database is truncated and repopulated from various sources. I've a dimension table called DimCustomers which consists of around 300,000 rows and has lots of different types of SCD columns. It takes around 4-5 hours to load data from staging to this dimension table. Currently I'm using a For Loop container which uses a store proc to extract 15000 rows each time and populate my dimension tables. First couple of loops it goes off quickly but as and when the number reaches half of the count it slows down and hence it takes around 4-5 hours to load data.
What would be the best approach to populate this kind of dimension table.
I created a Time table using BIS. I found that the default naming of time members is too long and redundant.
For example, the wizard generated "Fiscal Calendar 2015", "Fiscal Quarter 1, 2015", etc. However, shorter expression like "FY2015", "FQ1 2015", etc would be enough for me.Â
Is it possible to change the default naming rule, or does SSAS works correctly if I update the Time table values using SQL?
if I pass 2014 and 2015 in sub select 171 data is not coming in result. i i pass only 2014 in sub select i get value of only 2014. if I pass 2015 in sub select i didn't get any value.                                                                        Â
I have 2 tables, each with one ID field, a separate Date and Time fields and a number of other fields. The tables contain duplicates on the ID field. I want to do a UNION keeping only the record with the latest Date and Time.
This would work: SELECT MyTab.myKeyField, Max(MyTab.myDate) AS myDate FROM (SELECT myKeyField, myDate from Table1 union SELECT myKeyField, myDate from Table2) AS MyTab GROUP BY MyTab.myKeyField
But is only taking care of Date, not Time (some records have the same date but different times) The other problem is, when I add more fields, I have to include them in the GROUP BY clause, and this way I end up with duplicates (because some other fields have different values)