Question On Column Mappings Between Mining Structure And Case Table For Lift Chart
May 3, 2007
Hi, all experts here,
I am a bit confused about model evaluation (the lift chart): should we map all the columns between the mining structure and the case table? For predictive models we have a predictable column; shouldn't we leave that column unmapped between the mining structure and the case table? But it seems we are not allowed to omit the predictable column mapping between the mining structure and the case table.
Why is that? Could any experts here give me some explanation?
I hope my question is clear.
Thanks a lot and I am looking forward to hearing from you shortly.
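A minimal sketch of why the actual values are needed, not the Analysis Services implementation: any accuracy measure compares what the model predicts with what the test case table actually records, so the predictable column has to be available on both sides. The function name and the sample values below are hypothetical.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Count (actual, predicted) pairs; both sequences are required."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


# Hypothetical test cases: the actual values come from the case table,
# the predicted values come from the model.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]

matrix = classification_matrix(actual, predicted)
# Cases where the prediction matches the recorded value.
correct = sum(count for (a, p), count in matrix.items() if a == p)
accuracy = correct / len(actual)
```

Without the `actual` list there is nothing to compare the predictions against, which is consistent with the designer refusing to drop that mapping.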
Hi, I am not able to get the Mining Accuracy Chart and Mining Model Prediction working. Please tell me how to do it, and how to use "Filter the input data used to generate the lift chart" and "Select predictable mining model columns to show in the lift chart".
Hi, I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output). The Lift Chart/Mining Legend in Analysis Services shows three results: Score, Population Correct (%), and Predict Probability (%).
Population Correct, I believe, is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?
However, I can't work out how the other two are derived, in particular the Score. To give a live example, the scores were as follows:
Decision Trees: Score 0.83, Pop Correct 76.59%, Pred Probability 54.28%
Neural Network: Score 0.75, Pop Correct 67.63%, Pred Probability 50.05%
Ideal Model: 100.00%
Can anyone help with this and give a detailed explanation?
I am having trouble understanding what makes a model accurate and effective at predicting some attribute. I can't seem to find any clear documentation about the mining legend of the lift chart on the Mining Accuracy Chart tab when working with the Data Mining Structure designer in VS 2005. Specifically, I would like to know more about what the numbers in the Score, Population Correct and Predict Probability columns mean, and why they change when you move the vertical gray bar on the Lift Chart. Also, what is generally a good score to aim for, given that it is very difficult to get 100% accuracy with the kind of data that I am using?
Any more information on this subject is much appreciated. Thank you for your time,
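A minimal sketch of the general idea behind a lift (cumulative gains) curve. It is not the Analysis Services calculation of Score, Population Correct or Predict Probability, which this thread does not document. Cases are ranked by the predicted probability of the target value, and the curve reports what share of all target cases falls within the top part of the ranking. The function name, the sample data and the rounding of the cutoff are assumptions made for illustration.

```python
def cumulative_gain(actual, probability, target, population_fraction):
    """Share of all target cases found in the top `population_fraction`
    of cases, ranked by predicted probability (highest first)."""
    ranked = sorted(zip(probability, actual), key=lambda pair: pair[0], reverse=True)
    cutoff = round(len(ranked) * population_fraction)
    total_targets = sum(1 for _, value in ranked if value == target)
    if total_targets == 0:
        return 0.0
    found = sum(1 for _, value in ranked[:cutoff] if value == target)
    return found / total_targets


# Hypothetical scored test set: actual value and predicted probability of value 1.
actual = [1, 0, 1, 0, 1, 0, 0, 0, 1, 0]
probability = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.1]

gain_at_30 = cumulative_gain(actual, probability, target=1, population_fraction=0.3)
gain_at_50 = cumulative_gain(actual, probability, target=1, population_fraction=0.5)
```

Changing `population_fraction` from 0.3 to 0.5 changes the result, which may be comparable to the legend values changing as the gray bar is moved along the horizontal axis.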
Hello. Because of my graduation project, I am interested in the data mining application AdventureWorks DW on MS VS 2005. I opened File -> Open -> Project/Solution -> Enterprise -> AdventureworksDW, then successfully deployed the Decision Trees and Clustering algorithms. Then I opened the Mining Accuracy Chart tab and selected the input table "testing", which I had created earlier from vTargetMail. After that, the mining structure table and the target mail table were automatically linked to each other. Next, I selected 1 as the predict value in the predictable row "BikeBuyer". But when I clicked "Lift Chart", I only got a 45 degree line, every time. How can I fix it?
Hi, I am studying the data mining features of SSAS, and for a workshop I've created 2 views derived from the vTargetMail view of AdventureWorksDW. The training data consists of every record except those in Pacific, and the test view consists only of records from the Pacific area.
1. I've created a mining structure based on Decision Trees and selected BikeBuyer as the predictable column.
2. Following the input column suggestions, I've selected Age, Eng.Education, NumberCarsOwned, YearlyIncome, CommuteDistance, NumberChildsatHome and TotalChildren as input columns.
3. I've modified no other setting, and deployed the project.
I can get training results in the decision tree browser and the dependency network (and both seem to give rather logical results). However, when I try to browse the lift chart or the classification matrix, I get an empty classification matrix and a lift chart with a single 45 degree line.
Am I missing a step, or must I do some fine-tuning on (what) parameters?
Hoping someone will have a solution for this error:
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem that the data type of the column used in the mining structure is Long while the underlying field in the cube has a type of BigInt, or am I barking up the wrong tree?
Is there a lift chart viewer, like the model viewers, that can be embedded within your Windows apps? If not, can one be easily created via ADOMD.NET API calls? Has anyone done this easily in custom DM applications?
How can I change the language of the lift chart's vertical and horizontal axis descriptions, and how can I see the script of this chart for the time series algorithm?
In the Mining Accuracy Chart, the predictable columns of nested tables do not show up in the "Select predictable mining model columns to show in the lift chart" table. The "Predictable column name" is empty.
Predictable columns in the case table show up, but not the predictable columns in the nested table. What am I missing?
We're running into an issue where analysts are having problems obtaining lift charts (via the Mining Accuracy Chart UI available in the Visual Studio Analysis Services project) and performing prediction (via the Mining Model Prediction UI).
The issue seems to be related to the underlying analyst security model. Note that this post is related to:
Analysts that work on the same problem will only have access to:
- A sandbox relational database (which contains views into the same source database). The analyst is db_owner of the sandbox database, so she/he can create data transformations required, etc. The sandbox database contains views to the source database, but the analyst only has read-access to the specific data elements needed from the source DB. So, they are very restricted w.r.t. the source database, but are db_owners of their sandbox relational databases. Note that the analyst will connect to the database via Windows Authentication.
- An Analysis Services sandbox database to use for their modeling, etc. In this AS sandbox db, we've created a role called "Administrator" and checked the permissions: Full control (Administrator), Process database, and Read definition. The analyst's windows account is the "user" associated with this role.
Also, in this situation, the SQL Server 2005 Relational Engine and Analysis Services are running on a single machine. The goal of this security model is to provide analysts with the ability to work in their "workspaces" (both SQL and AS), but not to see other analysts work, etc.
Under this model, Analysts are able to deploy mining models when the Data Source object that points to their relational "sandbox" DB is set-up with "Impersonation Information" = "Use a specific user name and password", where the Analyst provides their domain account information.
But, when trying to build a lift chart using the same data source view objects that were used to successfully train the model, the following error is occurring consistently:
Window Title: "Loading Mining Accuracy Chart" Window Text: "Failed to execute the query due to the following error: Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of invocation. Either '<domain><login>' user does not have permission to access the '' object, or the object does not exist. Errors in the high-level relational engine. A connection could not be made to the data source specified in the query. Errors in the high-level relational engine. A connection could not be made to the data source specified in the query.."
Since the Analyst was able to build the model with her/his given '<domain><login>' credentials, it is puzzling why the lift chart is failing.
Why does the lift chart for my mining model evaluation not have a random guess model line? Normally there should be lines for the trained models, the ideal model, and the random guess model. What did I miss? Could any experts here shed some light on that?
Thanks a lot in advance for your advice and help. I am looking forward to hearing from you shortly.
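For reference, a minimal sketch of how the two reference lines in a lift chart for a specific target value are usually defined, assuming the standard cumulative gains definitions and not the Analysis Services source code. The function name and the 40% target rate are hypothetical.

```python
def reference_lines(target_rate, population_fraction):
    """Return (random_guess, ideal) share of target cases captured.

    target_rate: fraction of test cases that have the target value.
    population_fraction: fraction of the population examined so far.
    """
    if not 0 < target_rate <= 1:
        raise ValueError("target_rate must be in (0, 1]")
    # Random selection captures targets in proportion to the population: the diagonal.
    random_guess = population_fraction
    # A perfect ranking puts every target case first, so it saturates at 100%.
    ideal = min(1.0, population_fraction / target_rate)
    return random_guess, ideal


# Hypothetical test set in which 40% of the cases have the target value.
random_20, ideal_20 = reference_lines(0.4, 0.2)
random_40, ideal_40 = reference_lines(0.4, 0.4)
random_80, ideal_80 = reference_lines(0.4, 0.8)
```

Both lines depend on a target value being chosen, since `target_rate` is only defined for one specific state of the predictable column.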
I have a mining structure that I am using to perform a text-mining classification task. The mining structure contains three models: a decision tree, a naive bayes and a neural network.
Both the decision tree and the naive bayes models process without any problems, but I am having significant difficulties with the neural network model.
Initially when I processed the model, processing would fail altogether with the following error message:
"Memory error: Allocation failure : Not enough storage is available to process this command"
This was remedied by taking the steps prescribed in http://support.microsoft.com/kb/917885 (I upgraded to SQL 2005 SP1 and applied all available hotfixes listed in http://support.microsoft.com/kb/918222/). This got me to the point where the model (seemingly) processed correctly after restricting the MAXIMUM_INPUT_ATTRIBUTES to a relatively low number. So after processing, I went to try and browse the neural network model and view the lift chart...
<error>
"Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Input string was not in a correct format.."
</error>
(see http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=935340&SiteID=17)
Also when I would attempt to view the lift chart and the classification matrix the queries would time out with the following error message:
<error>
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
</error>
Now, my poking around on TechNet led me to believe that this issue could finally be resolved by upgrading to the CTP release of SQL Server 2005 SP2. However, I am still encountering problems. When I go to browse the model in the Neural Network Viewer, I see the correct drop-down menus to select attributes and attribute values in the "Input" and "Output" panes, but I see no data displayed in the "Variables" pane at the bottom.
Interestingly, while I cannot view the model contents in the graphical viewer, the mining model contents viewer reveals model contents that look to be pretty normal for a trained neural network.
Attempts to view the lift chart time out with the error message:
<error>
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
</error>
and when I run predictions against the trained NN model in the "Mining Model Prediction" pane it predicts the same value for every case in the testing set.
I would like to develop an application in VB.NET that can create data mining structures and a mining model in SQL Server 2005. I tried the code from chapter 14 of the book Data Mining with SQL Server 2005, but it did not work. Any good ideas?
Thank you very much for your help. The errors that I see in the code that you gave in your answer are the following, and they are more or less the same as the ones I had previously.
I tried the code, but I initially encountered the following problems.
1. Any line that has a declaration As Server or As Database, as in Public Function CreateDatabase(ByVal srv As Server, ByVal databaseName As String) As Database, gives me the error that type Database is not declared, and likewise that type Server is not declared, and it does not offer me any option.
2. In addition, As DataSource, As RelationalDataSource, As RelationalDataSourceView, As ScalarMiningStructureColumn, and As DataSourceViewBinding give me the error that the type is not declared.
3. Finally, in mc = New MiningModelColumn("Yearly income", Utils.GetSyntacticallyValidID("Yearly income", Type.GetType(MiningModelColumn))), it is not accessible in this context because it is 'Private'. I have some more problems, but I think that by solving the ones above I will solve the rest.
I perform data mining on all products and a specific product category. Do I need to create 2 data source views, one for all products and the other one for the specific product category? Afterward, I run the Data Mining Wizard 2 times to create 2 mining structures. I also need to add the same mining model (e.g. Bayes, Cluster) to each of these mining structures. Is there any simple way to do it?
I just found that I am not able to view the accuracy chart for my mining model. The error message is: no mining models are selected for comparison. This is quite strange.
How do I choose the right key column in a "Mining Structure" for Microsoft Analysis Services?
I have a table:
"Incoming goods"
Create table Income (
    ID int not null identity(1, 1),
    [Date] datetime not null,
    GoodID int not null,
    PriceDeliver decimal(18, 2) not null,
    PriceSalse decimal(18, 2) not null,
    CONSTRAINT PK_Income PRIMARY KEY CLUSTERED (ID),
    CONSTRAINT FK_IncomeGood foreign key (GoodID) references dbo.Goods (ID)
)
I'm trying to build a relationship (regression) between "Price Sale" from Good and "Price Deliver". But I do not know which column is better to choose as the "key column": ID or GoodID?
I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project:
"Error 6 Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid. 0 0 "
I am finding it hard to locate the content that is not valid. I've been trying to find an answer to this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
I've tried those two operations in Management Studio. Although we can create a mining structure and a mining model in Management Studio, we cannot process the Analysis Services database.
(1) I created only a mining structure through CREATE MINING STRUCTURE. No error was reported. But if I process the Analysis Services database in Management Studio, I always get the error
'Error : The '<mining_structure__name>' structure does not contain bindings to data (or contain bindings that are not valid) and cannot be processed.'
I then tried to create it by creating and running an XMLA script. It was successful. However, it's much harder to learn XMLA.
If any of you created an analysis-service database in Mgt Studio, and create a mining structure in the same place using DMX script, can you process the database?
(2) Is there any use for the CREATE MINING STRUCTURE operation without binding to any table? The examples I have seen so far did not show how to do that. In my experience, processing the Analysis Services database with that mining structure is doomed to fail.
(3) Is there any way we can create mining structure through CREATE MINING STRUCTURE operation in Management Studio and use RELATED TO clause to bind it to any Relationship to an attribute column (mandatory only if it applies), indicated by the RELATED TO clause?
(4) If this is the case, is there any use for the CREATE MINING STRUCTURE operation? If we use BI Dev Studio, it's much easier to use the wizard.
(5) I found I cannot create a mining model inside a mining structure through the CREATE MINING MODEL operation. If you call that operation, you end up with a mining model and a mining structure with the same name. I found that in order to create a mining model inside a mining structure, you have to call ALTER MINING STRUCTURE ADD MINING MODEL. Is it true that this is the only way?
I am working with several tables, but for now I will just mention 4: one is a fact table (named Usage), and the other 3 are dimension tables: Periods, Products, and Regions. The fact table contains references to the dimension tables. The Periods table contains two other columns, month and year.
I created a cube containing columns from those 4 tables. Deployment was successful. Trouble comes when I want to create a mining structure using Time Series containing these columns :
- Period
- Amount (of money)
- Product name
- Region name
When I choose to use cube (instead of table) as source for mining structure, I'm forced to choose only one dimension (among the Periods, Products, and Regions). Whatever dimension I choose I end up being unable to use the column period as the Time-Key column. Effectively I cannot use Time Series method since I cannot use the column period.
(1) Why is this so (why does Visual Studio force us to use only one dimension from the cube)?
(2) Why does Visual Studio eliminate the period column, the column that has a relationship with the time dimension?
(3) What is the use of a cube for mining anyway? Is there still any use for it?
(4) What is the solution to the kind of problem I face?
I ran a decision tree, clustering and neural network mining model across a dataset of about 200,000 records. I am trying to evaluate the accuracy of each of my models but I can't view the results.
I get the following error:
Failed to execute the query due to the following error:
XML for Analysis parser: The XML for Analysis request timed out before it was completed. Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
Is it possible to save column mapping definitions from a Transform Data Task? The practical use is I have four tables with very similar layouts of which 200 or so columns are identical. I have various front and back office applications that require local copies of this data in various formats. It is EXTREMELY tedious to remap all of the columns for each Transform Data Task required on these applications.
Is there a way to store all of the column mapping definitions and import them into a new Transform Data Task?
I know this is a common OLAP error, but I'm actually having trouble with it while trying to process a DM mining structure. I'm currently working on a website that gets data from its users and analyzes it using SSAS. The thing is, each time we add a new "analysis criterion" (sorry, I'm trying to translate our French BI language into English...), we have to build a new mining structure, which needs data about users who have actually answered the question associated with this criterion. Sometimes there are thousands, and other times only dozens, which is the case for the structure I'm having trouble with.
I have only two hundred tuples in the learning set. So lots of the common criteria weren't filled in: I removed them using a stored procedure before feeding the structure, so that I have no column with only "null" values.
Of course, I know that 200 learning cases is really not enough to build an accurate model, but the purpose was just a proof of concept for machine-driven mining structure building, and that was supposed to work even with so few cases. When I process the mining structure, it raises the following error (translated from French): Errors in the OLAP storage engine: The attribute key cannot be found: Table: _x0032_0_EtudeIphone_Apprentissage, Column: EtudeIphone, Value: le nouvel iPhone (téléphone tactile et musical dApple). Errors in the OLAP storage engine: The attribute key was converted to an unknown member because the attribute key was not found. Attribute Id of Dimension: 20_EtudeIPhone ~MC-Id from Database: ClassificationVDCE, Record: 2.
In other words, the table is a strangely named table, the column is my predictable column, and the value is one of the interesting values.
Why? Too few cases? I have structures based on the same template but associated with other criteria and they work perfectly.
I'm ready to answer any question, and give any detail. Thanks in advance.
I just found that we are not able to define multiple key columns for a mining structure in the SQL Server 2005 Data Mining engine. Is there any other way to define multiple key columns for a mining structure? In many cases the table we are mining has a composite key consisting of different foreign keys, e.g. a fact table with transaction information and other foreign keys. If I am not able to define this composite key for the fact table, will I have to add a named calculation in the data source view to get a key column based on the original composite key? Is this the better way to solve the problem, or are there other alternatives?
I hope my question is clear. I am looking forward to hearing from you shortly, and thanks a lot in advance for your kind advice and help.
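A minimal sketch of the named-calculation idea mentioned above, shown in Python only to illustrate the logic. In the data source view it would be a SQL expression instead. The function name, the "|" separator and the sample key columns are assumptions, not anything prescribed by Analysis Services.

```python
def surrogate_key(*parts, separator="|"):
    """Combine composite-key parts into one string key.

    The separator keeps (1, 23) and (12, 3) from both collapsing to "123".
    """
    texts = [str(part) for part in parts]
    if any(separator in text for text in texts):
        raise ValueError("key part contains the separator")
    return separator.join(texts)


# Hypothetical fact rows keyed by (CustomerKey, ProductKey, DateKey).
rows = [(1, 23, 20070503), (12, 3, 20070503)]
keys = [surrogate_key(*row) for row in rows]
```

The single derived column can then play the role of the key column, provided the combination of the original key parts is unique per case.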
Can anyone spot what I am missing here? The problem is that I am getting a null object for e.TextData in the t_OnEvent(object sender, TraceEventArgs e) function below. I am trying to get event notifications while processing the data mining structure.
Say you have an existing populated SQL 2005 database with 700+ tables, and you want to change the order of the columns inside every table. Short of manually building conversion scripts, does anyone know an automated way to do this? I was thinking through ways to do them all in one shot, and I have tools like Erwin and DbGhost that could also be used. Basically, I am moving some standard audit columns from the end of the tables to just after the PK columns.
Hi, I am totally new to SSIS and SQL 2005. I have a DTS task to recreate in SSIS. I have done most of it and muddled my way through, but this basic problem has got me stuck. When mapping columns from my file to my OLE DB output table, I want to map one input column onto two output columns, but it only seems to let me select one destination column for each input. I have tried Shift/Alt/Ctrl etc. to try to get it to map to both columns, but it won't have it. How do I do it?
Also, somehow my Data Flow Sources tab has gone from the toolbox, and I can't seem to get it back. I switched on everything I could see, including all components, but it is not in there as an option. How do I get it back in the toolbox?
I have quite a strange problem with a mining structure based on an OLAP data source. I am not able to edit the mining structure in the mining structure editor: the whole data source view within the editor is greyed out. Could anyone here please give me any advice on that?
Thank you very much in advance for any help.
I've tried to use the EXPORT command to export a mining model from Management Studio, but it reports that the export statement is not supported for OLAP mining structures. I've checked the EXPORT (DMX) reference and I can't see any note saying that it is not applicable to OLAP structures.