Broken Lift Chart And Empty Classification Matrix In Data Mining
Sep 5, 2007
Hi,
I am studying data mining features of SSAS and for a workshop I've
created 2 views derived from vTargetMail view of AdventureWorksDW.
Train data consists every record except those in Pacific, and test
view consists only records from Pacific area.
1. I've created a mining structure based on Decision Tree and selected
BikeBuyer as predictable column.
2. According to input column suggestions, I've selected Age,
Eng.Education, NumberCarsOwned, YearlyIncome, CommuteDistance,
NumberChildsatHome and TotalChildren as input columns,
3. I've modified no other setting, and deployed project.
I can get training results in decision tree browser and dependency
network (and both seem to give rather logical results) however, when I
try to browse lift chart or classification matrix I get an empty
class.matr. and a lift chart of a single 45 degree line.
Am I missing a step, or must I do some fine-tuning on (what)
parameters?
Hi, I am studying data mining features of SSAS and for a workshop I've created 2 views derived from vTargetMail view of AdventureWorksDW. Train data consists every record except those in Pacific, and test view consists only records from Pacific area.
1. I've created a mining structure based on Decision Tree and selected BikeBuyer as predictable column. 2. According to input column suggestions, I've selected Age, Eng.Education, NumberCarsOwned, YearlyIncome, CommuteDistance, NumberChildsatHome and TotalChildren as input columns, 3. I've modified no other setting, and deployed project.
I can get training results in decision tree browser and dependency network (and both seem to give rather logical results) however, when I try to browse lift chart or classification matrix I get an empty class.matr. and a lift chart of a single 45 degree line.
Am I missing a step, or must I do some fine-tuning on (what) parameters?
Hi, I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output) The Lift Chart/Mining Legend in Analysis Services shows three results €“ Score, Population Correct (%), and Predict Probability (%)
Population Correct I beleive is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?
However, I can€™t work out how the other two are derived in particular the 'SCORE'. To give a live example the scores were as follows:
Model Score Pop Correct Pred Probability Decision Trees 0.83 76.59% 54.28% Neural Network 0.75 67.63% 50.05% Ideal Model 100.00%
Can anyone help with this and give a detailed explanation?
Hi, I am not getting Mining Accuracy Chart and Min ing Model Prediction Plz tel me how to do.And how to use the filter input data used to generate the lift chart and select predictable mining model columns to show in the lift chart
I am having trouble really understanding what makes a model accurate and effective at predicting some attribute. I can't seem to find any clear documentation about the mining legend of the lift chart on the Mining Accuracy Chart tab when working with the Data Mining Structure designer in VS 2005. Specifically, I would like to know more about what numbers in the Score, Population Correct and Predict Probability columns mean, and why they change when you move the vertical gray bar on the Lift Chart. Also, what is generally a good score to be aiming for, provided that it is highly difficult to get 100% accuracy with the kind of data that I am using.
Any more information on this subject is much appreciated. Thank you for your time,
Hello . Because of my graduation project , I interested in data mining application , Adventureworks DW on MS VS 2005 . I opened File->Open->project/solution ->Enterprise -> AdventureworksDW .then I successfully deployed the algorithms decision tree and Clustering . Then I opened tab Mining Accuracy Chart then selected input table "testing" , which I had created before , from vTargetMail . After that , mining structure table and target mail table has automaticaly linked each other .Next , I selected predictive input as 1 , of the predictable row "BikeBuyer" . But , when I clicked "Lift Chart ", I only got a 45 degree line , everytime .. How can I fix it ?
I am a bit confused for the model evaluation (lift chart), should we map all the columns for both the mining structure and the case table? I mean for those predictive models, we have a predict column, shouldnt we ignore the mapping of the predictive column between the mining structure and the case table? But it seemes we are not allowed to miss the predictive column mapping between the mining structure and the case table.
Why is that? Could any experts here give me some explanation on that?
Hope my question is clear for your help.
Thanks a lot and I am looking forward to hearing from you shortly.
managed plug-in framework that's available for download here: http://www.microsoft.com/downloads/details.aspx?familyid=DF0BA5AA-B4BD-4705-AA0A-B477BA72A9CB&displaylang=en#DMAPI.
This package includes the source code for a sample plug-in algorithm written in C#.
in this source code all .cs files are modified for clustering algorithm
if my plugin algorithm is of association or classification type then what modifications are requried in source code???
I'm consistently getting >10 minute render times in a matrix based report whose underlying query only takes 1-2 seconds to run in both RS's data tab and mgt studio. The bad render times occur in preview tab, Report Viewer and when requested from Report Manager. I believe most of the time being taken is RS's attempt to pivot and present the data.
I can run a homegrown query that pivots (with unknown number of cols) the data itself in under 4 seconds but would be concerned about mapping an unknown number of columns to a table region in RS.
The matrix I'm testing with is 67 columns across (just over 2 months of daily columns) and is supposed to be 207 rows long. There is no aggregation required by RS when it is pivoting the data because each intersection occurs at most one time in the query's result set. The number of data points actually being "bucketed" is around 13,800 values. Row headers consist of 3 columns.
RS is also cutting off the number of rows it is supposed to be showing. The number of rows cutoff seems to be 100% repeatable based on the date range chosen. RS doesnt tell me when it cuts off rows that should have been displayed. From what I can tell so far, it never cuts off columns.
The number of rows cut off seems consistent across designer's preview, report viewer and when requested thru report manager. The "run" command (exclamation icon in data tab) seems to always return the correct number of unpivoted rows, just as in mgt studio.
Are there known bugs and or limitations in RS's matrix data region feature? Are there some settings or workarounds that might make it work more to my liking?
Is there a Lift chart viewer like the model viewers that can be embedded within your windows apps ? If not, then can this be easily created via Adomd.net api calls ? Has anyone done this easily in custom DM applications ?
how can i change the language of lift chart for description of vertical axis and horizontal axis and howcan i see the script of this chart for time series algorithm
We're running into an issue where analysts are having problems obtaining lift charts (via the Mining Accuracy Chart UI available in the Visual Studio Analysis Services project) and performing prediction (via the Mining Model Prediction UI).
The issue seems to be related to the underlying analyst security model. Note that this post is related to:
Analysts that work on the same problem will only have access to:
- A sandbox relational database (which contains views into the same source database). The analyst is db_owner of the sandbox database, so she/he can create data transformations required, etc. The sandbox database contains views to the source database, but the analyst only has read-access to the specific data elements needed from the source DB. So, they are very restricted w.r.t. the source database, but are db_owners of their sandbox relational databases. Note that the analyst will connect to he database via Windows Authentication.
- An Analysis Services sandbox database to use for their modeling, etc. In this AS sandbox db, we've created a role called "Administrator" and checked the permissions: Full control (Administrator), Process database, and Read definition. The analyst's windows account is the "user" associated with this role.
Also, in this situation, the SQL Server 2005 Relational Engine and Analysis Services are running on a single machine. The goal of this security model is to provide analysts with the ability to work in their "workspaces" (both SQL and AS), but not to see other analysts work, etc.
Under this model, Analysts are able to deploy mining models when the Data Source object that points to their relational "sandbox" DB is set-up with "Impersonation Information" = "Use a specific user name and password", where the Analyst provides their domain account information.
But, when trying to build a lift chart using the same data source view objects that were used to successfully train the model, the following error is occurring consistently:
Window Title: "Loading Mining Accuracy Chart" Window Text: "Failed to execute the query due to the following error: Execution of the managed stored rocedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of invocation. Either '<domain><login>' user does not have permission to access the '' object, or the object does not exist. Errors in the high-level relational engine. A connection could not be made to the data source specified in the query. Errors in the high-level relational engine. A connection could not be made to the data source specified in the query.."
Since the Analyst was able to build the model with her/his given '<domain><login>' credentials, it is puzzling why the lift chart is failing.
Why the lift chart for my mining models evaluation does not have a random guess model line there? As normal, there should be lines like trained models, ideal model, and random guess model? Why is that? What did I miss? Could any experts here shed me any light on that.
Thanks a lot in advance for your advices and help and I am looking forward to hearing from you shortly.
I have a mining structure that I am using to perform a text-mining classification task. The mining structure contains three models: a decision tree, a naive bayes and a neural network.
Both the decision tree and the naive bayes models process without any problems, but I am having significant difficulties with the neural network model.
Initially when I processed the model, processing would fail altogether with the following error message:
"Memory error: Allocation failure : Not enough storage is available to process this command"
This was remedied by taking the steps prescribed in http://support.microsoft.com/kb/917885 (I upgraded to SQL 2005 SP1 and applied all available hotfixes listed in http://support.microsoft.com/kb/918222/). This got me to the point where the model (seemingly) processed correctly after restricting the MAXIMUM_INPUT_ATTRIBUTES to a relatively low number. So after processing, I went to try and browse the neural network model and view the lift chart...
<error>
"Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Input string was not in a correct format.."
</error>
(see http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=935340&SiteID=17)
Also when I would attempt to view the lift chart and the classification matrix the queries would time out with the following error message:
<error>
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
</error>
Now, my poking around on Technet lead me to believe that this issue could finally be resolved by uprading to the CTP release of SQL server 2005 SP2. Now I am still encountering problems. When I go to browse the model in the Neural Network Viewer, I see the correct drop down menus to select attributes and attribute values in the "Input" and "Output" panes but I see no data displayed in the "Variables" pane at the bottom.
Interestingly, while I cannot view the model contents in the graphical viewer, the mining model contents viewer reveals model contents that look to be pretty normal for a trained neural network.
Attempts to view the lift chart time out with the error message:
<error>
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
</error>
and when I run predictions against the trained NN model in the "Mining Model Prediction" pane it predicts the same value for every case in the testing set.
I am using a matrix to display current and historical financial data. To do this, I am using 5 different datasets. Each dataset contains 1 row of data for a specific year for a specific account.
Now that problem I am having is displaying this data in a chart. When I create a new chart, it asks which dataset I would like to use. I do not want to use any specific dataset, I want to get chart's data from the matrix. Is this possible?
I have made a chart that displays data for my company. However, when I have the report.rdl on the "live" serverand use the "live" data source, I can't get the report manager to display the chart, only the famuos x of a broken image link. If I place the report.rdl on the test server and use the "live" data source the chart is displayed.
What is the difference between having the report on the test server versus the "live" server?
Report manager -> report on test server -> live data source = working
Report manager -> report on "live" server -> live data source = no image
I dont think the problem is sequrity related, but I dont know for sure...
I was wondering what the best way was to deal with subscriptions breaking due to an empty table. I have subscriptions that people have scheduled to go out daily, but on certain days the table may be empty, in this case the subscription doesn't read the parameters for the report and then the subscription breaks.
My original solution involved creating a #temp table with the same columns as the original table and inserting one row into it which I'd union with the original table, this row in the temp table had all its values set to 0. The solution worked when I ran it in SQL Server Management Studio but it seems SRS doesn't like the INSERT INTO statement, which is the error I get, but I've read on these forums that it doesn't like #temp tables either. I proceeded to use a stored procedure with all the code in it, but I might have trouble filtering on multi value parameters, because at times these parameter lists get real big, plus I have to do this for multiple reports and don't want to get into creating stored procedures for each report.
Following is what the code I used look something like that executes and does the job in Management Studio but not SRS. I'm mainly just looking for the easiest and cleanest way to do this, since it'll have to be done across multiple reports, so disregard the code if there's an easier way to do it. Thanks in advance.
create table #dummytable ( name varchar(35), country varchar(35), idnumber (int) ) GO
insert into #dummytable (name),values('0');
select name, country, idnumber from originaltable where name in (@name)
union select name = 0, country = 0, idnumber = 0 from #dummytable
I have a Matrix table that expands to the right when choosing an amount of months to be shown. Under this matrix I have to Charts. The two charts are situated together, that is no space between them, and to the left of the report.
Now, if I choose a lot of months, say three years the matrix diagram will be huge to the right. The problem I have is that the second diagram, the one on the right, moves to the right depending on how big the report gets, and this is not good at all. The two charts are supposed to be all the way to the left.
I am wondering where can I store my mining results in data mining engine? For example, I got mining results like accuracy chart, decision trees, and other formats of results based on different mining algorithms I used for my data mining, so where can I actually store the results for reporting service use later? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
I ran a decision tree, clustering and neural network mining model across a dataset of about 200,000 records. I am trying to evaluate the accuracy of each of my models but I can't view the results.
I get the following error:
Failed to execute the query due to the following error:
XML for Analysis parser: The XML for Analysis request timed out before it was completed. Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
I would like to know if there is any way to migrate third-party data mining packages with SQL Server 2005 data mining algorithms together then we can have a comparison among all of them to get the best results for training models.
Hoping someone will have a solution for this error
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem due to the data type of the column used in the mining structure is Long, and the underlying field in the cube has a type of BigInt,or am I barking up the wrong tree?
I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project:Â
"Error 6 Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid. 0 0 " I am finding it hard to locate the content that is not valid. I've been trying to find a answer for this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
- a data mining structure with about 80 columns. - a data mining model using Microsoft_Decision_Trees with 2 prediction columns.Â
I thought I would then explore the possibility of have more than 2 prediction columns, in this case 20.
I get an error message and I can't work out : a) if this is because there's a limit to the maximum number of prediction columns and where that maximum is stated. b) if something else has become corrupted c) there's a know bug and if the error message is either meaningful or not.
Either way, I'm unable to complete the data mining wizardÂ
The error message is :Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.
In the Mining Accuracy Chart, the predictable columns of nested tables does not show up in the "Select predictable mining model columns to show in the lift chart" table. The "Predictable column name" is empty.
Predictable columns in the case table shows up, but not the predictable columns in the nested table. What am I missing?
I have this issues with empty points in a Line Chart. How to make it work with dynamic values in the series and some dates in the category, how to join the "data" values for a "series"where there are no category value. So far I cannot see any fix that would make it work by joining the data points with the "average" data on a empty point (no category value) and with the same color on the line.
I am using Microsoft_Time_Series and have set HISTORIC_MODEL_GAP to various values (from 1 to 21). I always get this error: Error (Data mining): The 'HISTORIC_MODEL_GAP' data mining parameter is not valid for the 'My Time Series' model.
In Algorithm Parameters window, this parameters is not there by default, so I have to add it.
Implementing data mining Add-in in an academic setting? We need to handle over 150 new students a semester and have their connection to Analysis Services survive for their four years at the college. We are introducing data mining to every freshman business student as a unit within their Intro to Excel class (close to a month of work to give them a sense of what is possible). Other courses later in their curriculum will expand on that introduction.Â
Once implemented, we would have as many as 900 connections to manage (four years from now). It is possible that multiple sections will be running at the same time, so 40 students may be accessing the data mining tools concurrently. Â
Is there a way to "bulk establish" the access credentials and establish those databases?
Hi everyone, I have created a matrix report which takes 3 parameters being store, year and quarter which then displays budget information in a matrix. This all works fine but I would also like to have a bar chart below the matrix to visually show the results.
However when I add the bar chart and add the relevent dataset details etc the chart does not appear when I preview the report. I wondered if the chart wasn't showing because the parameter values are not being passed to the chart.
Has anybody any idea of adding charts to reports in this way or knows of any instructions to do so.
With SASS Database i have created Data mining Structure Using Time series algorithm, while processing the SSAS db, Data mining  taking long time to process, so how we can  reduce processing time ???