We have a set of cubes and dimensions, and we're experimenting with data mining against the cubes (primarily for forecasting applications). We have a custom time dimension (which we call calendar), not generated by the BIStudio wizard. The dimension has year/month/day/hour/... attributes. But when I try to add this Calendar dimension to the mining structure as a nested table using BI studio, it only shows the Year attribute, not the others. Other dimensions seem to show all the attributes.
Is there something we've done wrong in defining our time dimension? What determines which attributes show up as available for selection in BI studio?
I am working with several tables, but for now I'll mention just four: one fact table (named Usage) and three dimension tables (Periods, Products, and Regions). The fact table contains references to the dimension tables. The Periods table contains two other columns, month and year.
I created a cube containing columns from those 4 tables. Deployment was successful. Trouble comes when I want to create a mining structure using Time Series containing these columns :
- Period
- Amount (of money)
- Product name
- Region name
When I choose to use cube (instead of table) as source for mining structure, I'm forced to choose only one dimension (among the Periods, Products, and Regions). Whatever dimension I choose I end up being unable to use the column period as the Time-Key column. Effectively I cannot use Time Series method since I cannot use the column period.
(1) Why does Visual Studio force us to use only one dimension from the cube?
(2) Why does Visual Studio eliminate the period column, the column that is related to the time dimension?
(3) What use is the cube to mining anyway? Is there still any use for it?
(4) What is the solution to the kind of problem I face?
I made a cube with a time dimension with the hierarchy year/month/date/hour. The problem is that the dimension is growing too fast. In an older version of MSSQL (2000) the same dimension didn't grow so much. Any ideas? The table is big (maybe around 1 500 000 rows per month); it now contains around 4 500 000 rows.
I have an Analysis Services Cube that I would like to report on. However, the Time Dimension currently only has four columns, Day of Month, Month(name) , Year, and DateKey (DateTime representation at midnight for every day). Thus when I drag the month attribute onto the report, it is sorted April - August - December - etc. instead of Jan - Feb - Mar. How do I fix this? I remember reading something in the MSDN Library about it but I can't find it again now.
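One approach worth checking, offered here as a sketch and not as a reply from the original thread, is to add a month-number column to the time dimension and sort the Month attribute by that key instead of by its name (in Analysis Services this is governed by the attribute's OrderBy property). The Python below only illustrates why sorting by name fails and sorting by a numeric key works; the `month_number` mapping is a hypothetical stand-in for such a key column.

```python
# Month names as they would appear in the dimension's Month(name) column.
month_names = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Sorting by name reproduces the problem: April, August, December, ...
by_name = sorted(month_names)

# Hypothetical numeric key column (1-12) used as the sort key instead.
month_number = {name: i for i, name in enumerate(month_names, start=1)}
by_key = sorted(month_names, key=month_number.__getitem__)

print(by_name[:3])  # ['April', 'August', 'December']
print(by_key[:3])   # ['January', 'February', 'March']
```

The same idea applies to day-of-week names or any other attribute whose display name does not sort in its natural order.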
When I add a dimension to the cube without relating it to any measure group in Dimension Usage, my units go down. However, when I remove the dimension from the cube, I get the correct values.
In an SSAS database I have created a data mining structure using the Time Series algorithm. When I process the SSAS database, the mining structure takes a long time to process. How can we reduce the processing time?
I'm defining a mining structure against an OLAP dimension. The continuous value that I'm using both as input and for forecasting represents the time to complete a certain process.
There's something that strikes me as if it could be a problem, but I'm not sure. Our fact table has multiple columns (with multiple corresponding measures in the cube). The "time-to-complete" measure is only populated on some of the fact rows: the rows that represent completion information. Other rows represent other information, and the "time-to-complete" value is set to 0. This works fine for cumulative time-to-complete and average time-to-complete, but it seems like it could mess up data mining. Will those 0-value facts skew the mining results? I'm not seeing a way to filter out those entries and only include the non-zero facts in the mining processing.
Or perhaps I'm totally misunderstanding something, which is quite possible. :)
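To make the concern about the 0-value facts concrete, here is a small sketch with made-up numbers (not taken from the cube in question). If every fact row is treated as a case, the 0 placeholders are counted as real observations and pull the mean of the continuous attribute well below the mean of the rows that actually carry completion data.

```python
from statistics import mean

# Made-up time-to-complete values; 0 marks fact rows with no completion info.
time_to_complete = [12.0, 0.0, 0.0, 8.0, 0.0, 10.0, 0.0, 0.0]

# What a model sees if every fact row is treated as a case.
mean_all_rows = mean(time_to_complete)

# What it sees if only the completion rows are included.
completed_only = [t for t in time_to_complete if t > 0]
mean_completed = mean(completed_only)

print(mean_all_rows)   # 3.75
print(mean_completed)  # 10.0
```

Whether the mining structure can be restricted to the non-zero rows (for example by sourcing it from a filtered view) is something to verify against your own setup; the sketch only shows the size of the distortion.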
I am new to SQL Server 2005 Analysis Services and would like to use OLAP cubes as a data source to build a mining model. However, I would like to use a particular view of the OLAP cube that I have generated as the data source for the mining model. I find that I am not able to save the cube view while browsing the OLAP cube in Business Intelligence Studio. Is there a way I can achieve this?
Any ideas regarding this will be really appreciated.
Need some help building a query that does the following :
I have 2 Time Dimensions ; Time (Transdate) and ClosedDate (ClosedDate)
In my report/query, if [Time].CurrentMember = [Time].[YMD].[YMD].[2006].[200610].[20061031] I want to FILTER out all ClosedDate < [ClosedDate].[YMD].[YMD].[2006].[200610].[20061031]
Both Time Dimensions are Year -> Month -> Day and have the same Members.
I have every option available, using calculated Members and/or Measures to do this.
The report I'm creating is Aging of Receivables : Balance / 30 days / 60 days / etc.. But for the Aging, I need to filter like explained above.
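The filtering rule described above can be sketched outside MDX to pin down the intended logic. This is only an illustration in Python under assumed data shapes: each receivable is a hypothetical `(trans_date, closed_date, amount)` tuple, and the bucket boundaries (30/60/90 days) are assumptions based on the report description.

```python
from datetime import date

def aging_buckets(receivables, as_of):
    """Sum balances still open on `as_of` into 30-day age buckets.

    `receivables` is a list of (trans_date, closed_date_or_None, amount)
    tuples; this shape is an assumption made for the sketch.
    """
    buckets = {"0-30": 0.0, "31-60": 0.0, "61-90": 0.0, "90+": 0.0}
    for trans_date, closed_date, amount in receivables:
        if trans_date > as_of:
            continue  # not yet posted on the as-of date
        if closed_date is not None and closed_date < as_of:
            continue  # the filter from the post: drop ClosedDate < as-of date
        age = (as_of - trans_date).days
        if age <= 30:
            buckets["0-30"] += amount
        elif age <= 60:
            buckets["31-60"] += amount
        elif age <= 90:
            buckets["61-90"] += amount
        else:
            buckets["90+"] += amount
    return buckets

sample = [
    (date(2006, 10, 15), None, 100.0),              # open, 16 days old
    (date(2006, 9, 1), date(2006, 10, 20), 250.0),  # closed before as-of
    (date(2006, 8, 20), date(2006, 11, 5), 75.0),   # closed later, 72 days
    (date(2006, 6, 1), None, 40.0),                 # open, 152 days old
]
result = aging_buckets(sample, date(2006, 10, 31))
print(result)
```

In the cube, the same two conditions would have to be expressed against the Time and ClosedDate hierarchies relative to the current Time member.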
I've created a clustering data mining model, and it seems to return data correctly. However, I've also created a data mining dimension and cube, but when I query the cube it doesn't return any data if I select any members of the data mining dimension. Any suggestions on where I can look to resolve this?
Is it possible to group many time series into clusters by using the clustering algorithm of SQL Server 2005? The same question applies to the "association rules" technique. Any examples?
I am wondering where can I store my mining results in data mining engine? For example, I got mining results like accuracy chart, decision trees, and other formats of results based on different mining algorithms I used for my data mining, so where can I actually store the results for reporting service use later? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
I am about to prepare a paper concerning the field of real-time data mining. Real-time here means the process of incremental training of an existing model as soon as the data arrives.
There are a number of papers introducing algorithms for incremental association analysis, incremental clustering, etc. Stream mining is a field which is closely related to that. The main reasons for implementing incremental algorithms are a) the large amount of data to be mined and b) the high rate at which new data arrives every day.
Using classical batch mining algorithms, models that are outdated for some reason would have to be re-trained, which could be very time-consuming for billions of records. And once the training is completed, it would have to be restarted once again because a bulk of new data has arrived.
The question that I would like to discuss now is: for what real-world applications would it be meaningful or even essential to use real-time training of models?
Two main reasons could determine the answer to that question:
- You just want to incorporate new data into existing models in order to increase the prediction accuracy of your model, or
- Your underlying data is subject to more or less massive changes (also referred to as concept drift) and you want to adapt your mining model continuously to that reality.
I'm looking for some examples or ideas where one of these cases apply and it would be a good idea to have incremental mining algorithms involved.
I'm looking forward to inspiring some discussion on that issue.
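As a toy illustration of the batch-versus-incremental distinction discussed above (it is not one of the published incremental mining algorithms), the sketch below keeps a running mean as its "model". Each new record updates the model in constant time, and the result matches what a full batch recomputation over all records would give. The class name `RunningMean` and the sample values are made up for the example.

```python
class RunningMean:
    """Toy 'model': a mean that is updated one record at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        # Incremental update: O(1) per record, no pass over the history.
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean

history = [4.0, 6.0, 8.0]    # data the model was originally trained on
new_batch = [10.0, 12.0]     # records that arrive later

model = RunningMean()
for x in history:
    model.update(x)
for x in new_batch:
    model.update(x)          # no retraining on `history` needed

# Batch alternative: recompute from scratch over everything.
all_records = history + new_batch
batch_mean = sum(all_records) / len(all_records)

print(model.mean, batch_mean)
```

Real mining models rarely have such an exact incremental form, which is part of what makes the choice of applications interesting.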
I am just starting out using CUBEMEMBER/CUBEVALUE formulas in excel linked into a sql olap db - using this method for some custom reports where pivot tables are not suitable. The time dimension values include Months, Quarters and Years and the CUBEMEMBER formulas like
=CUBEMEMBER("OLAPCUBE","[Time].[Time].[Year].&[2015].&[1].&[1]") work fine - 1st quarter 1st month etc.
Is there a straightforward notation to aggregate months, or do I need to use a plus sign to add a number of CUBEMEMBER formulas together? In other words, is there an easier way of getting, say, Jan to July 2015 totals than
I have 1 report with 2 charts, both charts have their own dataset. The two datasets are mdx queries on 2 different cubes, but some dimensions have the same name.
Now I want to have 2 different selectable parameters for the [dim time] dimension: one for the first query on the first cube and the second for the other query on the other cube.
So in the MDX query builder I check both dimensions as parameters, but because both dimensions have the same name, I have only one selectable [dim time] parameter in my report.
I have a fact table with a simple integer lookup key into a basic dimension table. However, some of the fact lookup key fields are NULL. I would like the Analysis Services reports to show this NULL category. Instead, Analysis Services discards any NULL entries and the records are completely absent from the reports. What is the best way to achieve this?
I'm using SQL Server 2008 and Visual Studio 2013. I created a Linked Object (Linked Measure) in Cube2 from Cube1. Everything was fine, but then I edited the measure in Cube1. According to the documentation I found, there is no way to refresh Linked Objects, so I deleted and recreated the Linked Measure in Cube2. After that I can't process Cube2; I receive the following errors:
MdxScript(Cube2) (10, 24) The dimension '[Dim]' was not found in the cube when the string, [Dim], was parsed.The END SCOPE statement does not match the opening SCOPE statement.
Sorry, I'm new to Reporting Services and even more inexperienced with cubes.
My situation is as follows. I perform dynamic grouping (the user selects the view via a parameter). Depending on the view selected, I need to change the dimension filter in the dataset. Is this possible?
I'm trying to build an association model in the Standard Edition based on an existing cube. I keep getting the error:
Error (Data mining): The 'Product Recommendations' mining model has 60385 attributes. This number of attributes exceeds the attribute limit of 5000 allowed by the current version of the algorithm associated with the mining model.
I created Cube Slice filters and those limit the Customer and Product dimensions (Product is Nested) to well under 5000. The error message also does not change. The number of attributes is equal to the number of rows in the Product dimension, but I expected the cube slice to reduce the number. I tested all the SQL used while it processes and with the MDXFilters the number of rows returned is well under 5000.
So, in short, the final question is: is it possible to create a mining model in Standard Edition based on an existing cube where the nested dimension in the model has more than 5000 rows? Is there some other way to filter the query?
I guess my only choice on this if there isn't a way is to extract the data into relational table with only the rows I want to analyze....that's a huge pain and doesn't really make sense when the filters should limit the model size.
What is annoying on this is I can't find one reference anywhere on the microsoft site that this limit even exists within the product...
I would like to know if there is any way to use third-party data mining packages together with the SQL Server 2005 data mining algorithms, so that we can compare all of them and get the best results for training models.
Hoping someone will have a solution for this error
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem that the data type of the column used in the mining structure is Long while the underlying field in the cube has a type of BigInt, or am I barking up the wrong tree?
I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project:
"Error 6 Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid. 0 0 " I am finding it hard to locate the content that is not valid. I've been trying to find a answer for this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
- a data mining structure with about 80 columns.
- a data mining model using Microsoft_Decision_Trees with 2 prediction columns.
I thought I would then explore the possibility of having more than 2 prediction columns, in this case 20.
I get an error message and I can't work out: a) whether this is because there's a limit on the maximum number of prediction columns, and where that maximum is stated; b) whether something else has become corrupted; c) whether there's a known bug, and whether the error message is meaningful or not.
Either way, I'm unable to complete the data mining wizard
The error message is :Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.
For example: I have a dimension named "Name", and under it there are "FirstName" and "LastName" attributes. When I drag the "Name" dimension, "FirstName" is dragged by default, but I want "LastName" to be dragged instead.
I have a dimension called Districts with 2 attributes under it, District ID and Districts. When I drag the "Districts" dimension into the OLAP grid, District ID comes first, but I want Districts to come first. How can we order the attributes (District ID and Districts) of a dimension?
Hi,
This is probably a classic scenario with a shared dimension that we need to use in different cubes, where all fact tables do not offer the same level of detail. Dimension is snow-flaked.
The cube that's causing me troubles was designed by marking the lowest dimension level Disabled and not Visible. This allows me to get rid of one of the snow-flake tables (the one with the lowest level), thus allowing an INNER JOIN with the remaining table which has a level of detail corresponding to the fact table.
When processing the cube, I get a 'member with key '[blah]' was found in the fact table but was not found in the level '[blah]' of the dimension '[blah]'' that seems to indicate that none of my fact foreign keys exist as primary keys in the dimension table. However if I then attempt to query the cube, all data seems to be there.
Would anybody be in a position (and willing ;-)) to share his/her own experience working around a similar issue?
Thanks,
SRL
Errors and Warnings from Response Internal error: The operation terminated unsuccessfully. The following system error occurred: Errors in the high-level relational engine. A connection could not be made to the data source with the DataSourceID of 'DB LAB2', Name of 'DB LAB2'.
I am using Microsoft_Time_Series and have set HISTORIC_MODEL_GAP to various values (from 1 to 21). I always get this error: Error (Data mining): The 'HISTORIC_MODEL_GAP' data mining parameter is not valid for the 'My Time Series' model.
In the Algorithm Parameters window, this parameter is not there by default, so I have to add it.
Implementing data mining Add-in in an academic setting? We need to handle over 150 new students a semester and have their connection to Analysis Services survive for their four years at the college. We are introducing data mining to every freshman business student as a unit within their Intro to Excel class (close to a month of work to give them a sense of what is possible). Other courses later in their curriculum will expand on that introduction.
Once implemented, we would have as many as 900 connections to manage (four years from now). It is possible that multiple sections will be running at the same time, so 40 students may be accessing the data mining tools concurrently.
Is there a way to "bulk establish" the access credentials and establish those databases?