SSIS Data Mining Model Training Transform (Nested Tables)
Oct 26, 2006
I can't figure out how to put nested tables into the Data Mining Model Training Transform (SSIS). I can do a simple case table, but how do you get those nested tables with DM Training Transformation? Any ideas? Samples?
Hi, I can't figure out how to put nested tables into the Data Mining Model Training Transform (SSIS). Can anybody help me? An example, please? Diego B.
I don't think we need to sample nested tables separately for data mining model training, since nested tables are bound to the case table. Whenever we sample the case table, aren't the nested tables retrieved as inputs along with the other input attributes of the case table?
Thank you very much for any guidance to clear my confusion.
Dear all, I have a simple mining structure created by the DMX statement below. I then tried to insert data into it with an MDX query that extracts data from an OLAP cube, but I got the following error when I executed the INSERT statement.
Errors in the high-level relational engine. The 'Customer ID' column in the RELATES clause was not found in the results of the OPENROWSET query.
It seems that the APPEND clause can't recognize the name of the column, which should be Customer ID.
How can I fix this problem?
Thanks
Tony Chun Tung Siu
The source code for create and insert is as below.
create mining model customerMiner
(
    customerID long key,
    age long continuous,
    orders table
    (
        orderID long key,
        goodsID long discrete predict_only
    )
)
using [Microsoft_decision_trees] with drillthrough;

insert into mining model customerMiner
(
    customerID, age,
    orders ( skip, orderID, goodsID )
)
shape
{
    openQuery([Simple SSAS],
        'select {[Measures].[Customer ID], [Measures].[Age]} on columns,
         {[Customer].[Customer].[Customer].members} on rows
         from fi')
}
append
(
    {
        openQuery([Simple SSAS],
            'select {[Measures].[Customer ID],[Measures].[Order ID2],[Measures].[Goods ID]} on columns,
             [Goods].[Order ID2].[Order ID2].members on rows
             from fi')
    }
    relate [Customer ID] to [Customer ID]
) as orders
Actor train nested table:
ID   MovieID   Gender
1    1         F
2    1         M
3    1         F
4    1         F
5    2         M
6    2         M
7    2         F
8    3         F
9    3         F
10   4         M
11   4         M
12   4         F
13   4         F
14   5         F
15   5         M
We want to build a classifier model to predict the Class of a movie based on the Gender of the movie's actors. To deal with the nested table, Analysis Services maps each record of the nested table to an attribute of the case table. These attributes are named Actor(n).Gender with n = 1..15, so they depend on the nested table record numbers. Both the Microsoft Decision Trees and Microsoft Naive Bayes algorithms use these attributes without any modification.
We are implementing a Relational Naive Bayes algorithm and we are planning to aggregate such attributes in order to make them independent of the nested table record numbers.
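One way to picture the aggregation described above is the following minimal Python sketch. It is not the poster's implementation and does not use any Analysis Services API: the function name `aggregate_gender_counts` is hypothetical, and the rows are copied from the Actor train nested table above. It collapses the per-record attributes into per-movie gender counts, which no longer depend on actor record numbers.

```python
from collections import Counter, defaultdict

# (actor ID, movie ID, gender) rows copied from the Actor train nested table.
ACTOR_ROWS = [
    (1, 1, "F"), (2, 1, "M"), (3, 1, "F"), (4, 1, "F"),
    (5, 2, "M"), (6, 2, "M"), (7, 2, "F"),
    (8, 3, "F"), (9, 3, "F"),
    (10, 4, "M"), (11, 4, "M"), (12, 4, "F"), (13, 4, "F"),
    (14, 5, "F"), (15, 5, "M"),
]


def aggregate_gender_counts(rows):
    """Return {movie_id: {gender: count}}, ignoring the actor record IDs."""
    counts = defaultdict(Counter)
    for _actor_id, movie_id, gender in rows:
        counts[movie_id][gender] += 1
    return {movie_id: dict(c) for movie_id, c in counts.items()}


if __name__ == "__main__":
    for movie_id, genders in sorted(aggregate_gender_counts(ACTOR_ROWS).items()):
        print(movie_id, genders)
```

Because the resulting features are keyed by gender rather than by actor ID, a movie whose actors never appeared in training still produces attributes of the same shape.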
As the next step, we tried to predict some unseen cases, and here we faced a very big problem.
Let's take two more tables of unseen cases:
Movie test table:
ID   Class
6    +
7    NULL
8    NULL

Actor test nested table:
ID   MovieID   Gender
1    6         F
2    6         M
3    6         F
4    6         F
16   7         F
17   7         M
18   7         F
19   7         F
20   7         F
21   8         M
22   8         M
23   8         F
Predicting the Class of movie 6 is not a problem, since its actors were included in the training dataset, so when the records are mapped to attributes, those attributes already exist in the model. But when you try to predict movies with unseen actors (7 and 8), all the new attributes are simply ignored in the ALGORITHM::Predict call (in_ulCaseValues is zero!) because they do not exist in the model!
I am in the process of creating an Integration Services package to automate the process of training mining models and getting predictions. Until recently, I have been processing the models directly from Business Intelligence Studio without any problems. However, when I try to use the exact same training set as an input to the Data Mining Model Training destination, I get several errors. Here is the output:
[Mining Models [1]] Error: Parser: An error occurred during pipeline processing.
[Mining Models [1]] Error: Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.
[Mining Models [1]] Error: Errors in the OLAP storage engine: An error occurred while the 'CPT MODIFIER' attribute of the 'BCCA DMS ~MC-CLAIM LIN~5' dimension from the 'BCCA LRG DMS TEST' database was being processed.
[Mining Models [1]] Error: File system error: The record ID is incorrect. Physical file: . Logical file: .
[Mining Models [1]] Error: Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.
[Mining Models [1]] Error: Errors in the OLAP storage engine: An error occurred while the 'BILL TYPE' attribute of the 'BCCA DMS ~MC-CLAIM LIN~5' dimension from the 'BCCA LRG DMS TEST' database was being processed.
[Mining Models [1]] Error: File system error: The record ID is incorrect. Physical file: . Logical file: .
[DTS.Pipeline] Error: The ProcessInput method on component "Mining Models" (1) failed with error code 0x80004005. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
I have not been able to find an answer as to why this is happening. I found a post regarding a similar problem with processing an OLAP cube in SSIS, but it seems that the author of that post never found an answer. Has anyone else here seen similar errors when processing mining models from Integration Services?
Also, if I process the mining models manually then try to run only predictions in SSIS, I get many of the same errors. I'll keep looking into the problem myself, but I would be very grateful if someone in this forum could shed some light on this issue.
I am wondering whether it is possible to use SSIS to split a data set into a training set and a test set and feed them directly to my data mining models, without saving them anywhere, since saving them takes up too much space. I really need guidance on that.
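The idea of splitting rows on the fly, without materializing the two sets in a database first, can be sketched in Python as below. This only illustrates the logic and is not SSIS code; the function name `split_rows` and the 70/30 default are assumptions made for the example. Each row is routed to one of two outputs as it streams past, and a fixed seed makes the split repeatable.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=42):
    """Route each incoming row to the training or the test output at random."""
    rng = random.Random(seed)  # fixed seed so the same split can be reproduced
    train, test = [], []
    for row in rows:
        if rng.random() < train_fraction:
            train.append(row)
        else:
            test.append(row)
    return train, test


if __name__ == "__main__":
    train, test = split_rows(range(20))
    print("train:", train)
    print("test:", test)
```

In a real pipeline the two lists would be replaced by two downstream consumers, so neither set needs to be stored.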
I am trying to model data in Analysis Services with the Advanced Create Mining Model function in the Excel add-in. I am having trouble creating an association model that works like the Associate button above the Advanced button.
The format of my data is like this
OrderID Product
100 Bike
100 Helmet
100 Shoes
200 Helmet
200 basketball
200 Bat
300 Shoes
300 Socks
The associate button works perfectly since it asks me which column is the transaction id (orderid) and which column I am trying to predict (product). The advanced create mining model asks me to determine what the columns are...
OrderID=key Product=Input+Predict?
When I run the Advanced Create Mining Model function with the association algorithm, I get a browser that shows no rules and only single-item itemsets (each product on its own, but no combinations of products).
Does anyone know what I have to do to get it to work like the associate button?
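A possible explanation, offered here only as an assumption, is that the flat layout makes each row its own case, so no case ever contains two products. The Python sketch below (not add-in code; `to_transactions` and `pair_support` are hypothetical names) uses the sample rows above to show how grouping by OrderID produces multi-item itemsets, while treating each row as a separate case produces none.

```python
from collections import Counter, defaultdict
from itertools import combinations

# (OrderID, Product) rows copied from the sample data above.
ROWS = [
    (100, "Bike"), (100, "Helmet"), (100, "Shoes"),
    (200, "Helmet"), (200, "basketball"), (200, "Bat"),
    (300, "Shoes"), (300, "Socks"),
]


def to_transactions(rows):
    """Group products by order, so each order becomes one case with a nested item set."""
    baskets = defaultdict(set)
    for order_id, product in rows:
        baskets[order_id].add(product)
    return dict(baskets)


def pair_support(transactions):
    """Count in how many cases each pair of products occurs together."""
    support = Counter()
    for items in transactions.values():
        for pair in combinations(sorted(items), 2):
            support[pair] += 1
    return support


if __name__ == "__main__":
    grouped = pair_support(to_transactions(ROWS))
    print("pairs when grouped by order:", len(grouped))
    # Treating every row as its own case leaves nothing to combine.
    per_row = pair_support({i: {product} for i, (_, product) in enumerate(ROWS)})
    print("pairs when each row is a case:", len(per_row))
```

If this is the cause, the model would need OrderID as the case key and Product in a nested table keyed by product, rather than both as flat columns.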
I have been trying to use SQL 2005 data mining for about 8 weeks. I am becoming frustrated because I am not able to make progress nor am I able to exploit the power of the system.
I need a training course! I have asked Microsoft in the UK for recommendations, but they have been unable to help. I have searched for courses in the UK and US without success.
I am coming to the Microsoft BI event in Seattle. Will there be any opportunities there to get help or find help? (In Seattle I intend to concentrate on the Excel add-ins.)
I perform data mining on all products and a specific product category. Do I need to create 2 data source views, one for all products and the other one for the specific product category? Afterward, I run the Data Mining Wizard 2 times to create 2 mining structures. I also need to add the same mining model (e.g. Bayes, Cluster) to each of these mining structures. Is there any simple way to do it?
I am wondering whether there is any way to select only a portion of a data set to train the mining model. I mean that we don't need to split the dataset in advance; what I want is to be able to select a random portion of a chosen dataset to train a mining model. Any advice?
I am looking forward to hearing from you, and thanks a lot in advance for your advice and help.
I get the following error when I try to load the mining model in the mining model viewer:
Query (1, 6) The '[System].[Microsoft].[AnalysisServices].[System].[DataMining].[NeuralNet].[GetAttributeValues]' function does not exist.
I get a similar error when I try to load the Mining Accuracy Chart:
Failed to execute the query due to the following error:
Query (1, 6) The '[System].[Microsoft].[AnalysisServices].[System].[DataMining].[AllOther].[GenerateLiftTableUsingDatasource]' function does not exist.
In the Mining Accuracy Chart, the predictable columns of nested tables do not show up in the "Select predictable mining model columns to show in the lift chart" table. The "Predictable column name" is empty.
Predictable columns in the case table show up, but not the predictable columns in the nested table. What am I missing?
I am trying to write some DMX queries to create and populate a mining model as a SSMS analysis services project. I followed the following steps:
1. Create the mining model using a CREATE MINING MODEL ... query.
2. Run an INSERT INTO MINING MODEL ... query, which fetches prediction data from another mining model to populate the new mining model.
3. I now want to use this new model for prediction, which requires processing the mining model first. When I process the model, it throws the following error:
Error (Data mining): The 'XYZ_Structure' structure does not contain bindings to data (or contains bindings that are not valid) and cannot be processed.
Please suggest if I am making a mistake in the above procedure. I will appreciate all help in overcoming this issue.
Auxiliary question: How do I process the mining model programmatically?
I tried to find the graphs I saved from the Data Mining Model Viewer, but where are they saved? In the mining model viewer we can save a graph by clicking the 'save graph' button, but where does the graph go?
Really need help for that.
Thank you very much in advance for any guidance and help for that.
I am new to SQL Server 2005 Analysis Services and would like to use OLAP cubes as a data source to build a mining model. However, I would like to use a particular view of the OLAP cube that I have generated as the data source for the mining model. I find that I am not able to save the cube view while browsing the OLAP cube in Business Intelligence Studio. Is there a way I can achieve this?
Any ideas regarding this will be really appreciated.
I would like to develop an application in VB.NET that can create data mining structures and a mining model in SQL Server 2005. I tried the code from chapter 14 of the book Data Mining with SQL Server 2005, but it did not work. Any good ideas?
Thank you very much for your help. The errors that I can see in the code you gave in your answer are the following, and they are more or less the same as the ones I had previously.
I tried the code, but I initially encountered the following problems.
1. Any line with a declaration As Server or As Database, such as Public Function CreateDatabase(ByVal srv As Server, ByVal databaseName As String) As Database, gives me the error that type Database is not declared; likewise, type Server is not declared, and it does not offer me any options.
2. In addition, As DataSource, As RelationalDataSource, As RelationalDataSourceView, As ScalarMiningStructureColumn, and As DataSourceViewBinding all give me the error that the type is not declared.
3. Finally, mc = New MiningModelColumn("Yearly income", Utils.GetSyntacticallyValidID("Yearly income", Type.GetType(MiningModelColumn))) is reported as not accessible in this context because it is 'Private'. I have some more problems, but I think that solving the ones above will solve the rest.
We've successfully processed a large decision tree model in SQL Server 2005. When I try to view the tree in the mining model viewer, I get the following error:
TITLE: Microsoft Visual Studio ------------------------------
The tree graph cannot be created because of the following error:
'Exception of type 'System.OutOfMemoryException' was thrown.'.
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.42&EvtSrc=Microsoft.AnalysisServices.Viewers.SR&EvtID=ErrorCreateGraphFailed&LinkId=20476
The link provides no other documentation on the error.
We're using 64-bit SQL on a Dell Workstation running XP-64 with 16GB of memory. From my view of things, we aren't close to running out of memory. Since the model processed and the error occurs only when viewing the model, is this a problem with Visual Studio and not necessarily Analysis Services?
I am trying to get the training error of a processed model, which reveals how well the model fits the cases: it tells how many cases from the training set are classified incorrectly. The lower the training error, the better the model fits the training set (although it may be overfitted). But I found it hard to get. I saw the lift chart in AS 2005, which I do not quite understand, and I don't know how to reproduce it in my program.
Is there some way to get the training error or the prediction error?
I am now using this awful way to get the training error:
select t.*,CollegeTree.CollegePlans as pred
from collegetree prediction join openquery(DSource,'select * from CollegePlans') as t on CollegeTree.StudentID = t.StudentID and ... where t.CollegePlans = CollegeTree.CollegePlans;
and then use datareader.ItemCount to get the count of cases that were classified correctly.
Keyword: train error,predict error, data mining, analysis service
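Once the actual and predicted values are available side by side (for example, from a prediction join like the one above), the training error is just the share of mismatches. The Python sketch below illustrates that arithmetic only; `training_error` is a hypothetical helper, and the label values are made up for the example.

```python
def training_error(actual, predicted):
    """Return the fraction of cases whose predicted label differs from the actual one."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


if __name__ == "__main__":
    # Made-up labels: one of four cases is misclassified.
    actual = ["Yes", "No", "Yes", "No"]
    predicted = ["Yes", "No", "No", "No"]
    print(training_error(actual, predicted))
```

Running the same calculation on held-out cases instead of the training set gives the prediction error, which is the better guard against overfitting.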
I am attempting to create the "Classification - Children at Home" Data Mining Model as described in Larson's book. Each time that I create it.. ONLY the ALL LEVEL is shown and it is impossible to expand the model to look at the Decision Tree, Neural Network, or the Clustering model etc. Drill down is enabled (tried it with and without enabling the drill down). The Children at Home field has been populated with values from 0 - 4. Any ideas would be greatly appreciated. regards Steve
Hi, I don't understand the Mining Accuracy Chart and Mining Model Prediction. Please tell me how to use them, and how to use "Filter the input data used to generate the lift chart" and "Select predictable mining model columns to show in the lift chart".
I am wondering where I can store my mining results in the data mining engine. For example, I have mining results such as accuracy charts, decision trees, and other formats of results, depending on the mining algorithms I used. Where can I actually store these results for later use in Reporting Services? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
I have an MS Time Series model using a database of over a thousand products, each of which has hundreds of cases. Amazingly, it takes only a few minutes to finish processing the model, but when I click Mining Model Viewer to view the models, it takes many hours to show up. Once the window is open, I can choose the model for different products almost instantly. Is this normal?
I have a very simple SSIS package that is moving data from a DB2 database to a Teradata box. I've run it around 10 times, twice it pushed data over, the balance of the time, it executes with no error, but moves nothing over. In the "incomplete" runs, a command line box pops up for half a second, then the package ends.
Does anyone have ideas as to why this behavior is occurring?
I have an OLE DB Source and I want to transform the data types of the table's fields before I export the table to an OLE DB Destination. Is there a way to transform a numeric value to float, and numeric to nvarchar?