Problems With Neural Net Viewer, Lift Chart And Predictions Still Occurring In SP2
Nov 24, 2006
I have a mining structure that I am using to perform a text-mining
classification task. The mining structure contains three models: a
decision tree, a naive bayes and a neural network.
Both the decision tree and the naive bayes models process without
any problems, but I am having significant difficulties with the neural
network model.
Initially when I processed the model, processing would fail altogether with the following error message:
"Memory error: Allocation failure : Not enough storage is available to process this command"
This was remedied by taking the steps prescribed in (I upgraded to SQL 2005 SP1 and
applied all available hotfixes listed in This got me to the point
where the model (seemingly) processed correctly after restricting the
MAXIMUM_INPUT_ATTRIBUTES to a relatively low number. So after
processing, I went to try and browse the neural network model and view
the lift chart...
"Execution of the managed stored procedure GetAttributeScores
failed with the following error: Exception has been thrown by the
target of an invocation.Input string was not in a correct format.."
Also when I would attempt to view the lift chart and the
classification matrix the queries would time out with the following
error message:
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure
GenerateLiftTableUsingDatasource failed with the following error:
Exception has been thrown by the target of an
Now, my poking around on TechNet led me to believe that this issue
could finally be resolved by upgrading to the CTP release of SQL Server
2005 SP2. However, I am still encountering problems. When I go to browse the
model in the Neural Network Viewer, I see the correct drop down menus
to select attributes and attribute values in the "Input" and "Output"
panes but I see no data displayed in the "Variables" pane at the
Interestingly, while I cannot view the model contents in the
graphical viewer, the mining model contents viewer reveals model
contents that look to be pretty normal for a trained neural network.
Attempts to view the lift chart time out with the error message:
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure
GenerateLiftTableUsingDatasource failed with the following error:
Exception has been thrown by the target of an
and when I run predictions against the trained NN model in the
"Mining Model Prediction" pane it predicts the same value for every
case in the testing set.
What can we tell from the lift value of an attribute value in the Neural Network Viewer? Is there a threshold for this lift value that indicates whether an attribute value is important to the selected output attribute value? In other words, when we use the Neural Network Viewer to describe the characteristics of a segment, what does the lift of a particular attribute value actually tell us?
I just don't know how to describe a segment based on what the Neural Network Viewer shows.
I am looking forward to hearing from you shortly, and thanks a lot.
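For readers unfamiliar with the term, lift is conventionally defined as the ratio between the probability of an output value given an input state and the overall probability of that output value. The sketch below illustrates only that general definition with made-up counts; whether the Neural Network Viewer computes its lift column in exactly this way is an assumption, not something confirmed in this thread.

```python
def lift(p_output_given_input: float, p_output: float) -> float:
    """Conventional lift: P(output | input state) / P(output)."""
    if p_output <= 0:
        raise ValueError("marginal probability must be positive")
    return p_output_given_input / p_output


# Hypothetical counts, for illustration only.
total_cases = 1000
cases_with_output = 200        # cases showing the selected output value
cases_with_input = 100         # cases having the input attribute value
cases_with_both = 50           # cases having both

p_output = cases_with_output / total_cases
p_output_given_input = cases_with_both / cases_with_input
value_lift = lift(p_output_given_input, p_output)
print(value_lift)
```

Under this definition, a lift above 1 means the output value is more likely than average when the input state is present, and a lift of 1 means the input state carries no information about it.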
Hi, I am not able to get the Mining Accuracy Chart and Mining Model Prediction tabs working. Please tell me how to do this, and how to filter the input data used to generate the lift chart and select the predictable mining model columns to show in the lift chart.
Is there a lift chart viewer, like the model viewers, that can be embedded within your Windows apps? If not, can one be created easily via API calls? Has anyone done this in a custom DM application?
How can I change the language of the lift chart's vertical and horizontal axis descriptions, and how can I see the script behind this chart for the time series algorithm?
I have read the threads regarding the Neural Network Viewer and I think I have a similar problem. I do have Service Pack 2 installed and I'm running the x64 version of SQL 2005.
I'm building a model from a single relational CASE table. Granted, the table has many columns summarized at the customer level, but it is well formed and has no NULL values (plenty of zero or blank values though). The only time I can get the NN Viewer to work is when I accept the attribute recommendations. It seems that once I stray from these recommendations, even if there is still correlation with an attribute, I cannot view the model using the NN Viewer. My latest error message says:
"The provider could not determine the String value. For example, the row was just created, the default for the String column was not available, and the consumer had not yet set a new String value."
I get this message even when all input attributes are Continuous so I'm not sure what String column it is referring to.
Any help is greatly appreciated. I'm in a time crunch and I have sold the client on SQL Server 2005 capabilities. It's a bit embarrassing if I can't get this resolved.
-- Steve
P.S.: I don't recall having any issues with the NN Viewer prior to Service Pack 2 (although others have). Have you done regression testing to test this issue?
I am confused about the values of "Probability of Value 1" and "Probability of Value 2" (for a particular attribute value) in the Neural Network Viewer. For example, the probability of value 1 is very low (as is the probability of value 2), yet the bar showing the strength of these two probabilities is still long, even longer than the bars for other attribute values that have a much higher probability of value 1 or 2. Why is that?
And, by the way, how does the algorithm calculate the probability of an attribute value in the neural network?
Hope my question is clear.
I am looking forward to hearing from you shortly and thanks a lot in advance.
I am having trouble really understanding what makes a model accurate and effective at predicting some attribute. I can't seem to find any clear documentation about the mining legend of the lift chart on the Mining Accuracy Chart tab when working with the Data Mining Structure designer in VS 2005. Specifically, I would like to know more about what the numbers in the Score, Population Correct and Predict Probability columns mean, and why they change when you move the vertical gray bar on the Lift Chart. Also, what is generally a good score to aim for, given that it is very difficult to get 100% accuracy with the kind of data that I am using?
Any more information on this subject is much appreciated. Thank you for your time,
I've successfully created and processed a very simple neural network mining model (defined against a cube). However, when I go to the model viewer in BI studio, it displays the following error:
"Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Input string was not in a correct format.."
Any ideas about what's going wrong? This is with SQL Server 2005 SP1.
We're running into an issue where analysts are having problems obtaining lift charts (via the Mining Accuracy Chart UI available in the Visual Studio Analysis Services project) and performing prediction (via the Mining Model Prediction UI).
The issue seems to be related to the underlying analyst security model. Note that this post is related to:
Analysts that work on the same problem will only have access to:
- A sandbox relational database (which contains views into the same source database). The analyst is db_owner of the sandbox database, so she/he can create data transformations required, etc. The sandbox database contains views to the source database, but the analyst only has read-access to the specific data elements needed from the source DB. So, they are very restricted w.r.t. the source database, but are db_owners of their sandbox relational databases. Note that the analyst will connect to the database via Windows Authentication.
- An Analysis Services sandbox database to use for their modeling, etc. In this AS sandbox db, we've created a role called "Administrator" and checked the permissions: Full control (Administrator), Process database, and Read definition. The analyst's Windows account is the "user" associated with this role.
Also, in this situation, the SQL Server 2005 Relational Engine and Analysis Services are running on a single machine. The goal of this security model is to provide analysts with the ability to work in their "workspaces" (both SQL and AS), but not to see other analysts' work, etc.
Under this model, Analysts are able to deploy mining models when the Data Source object that points to their relational "sandbox" DB is set-up with "Impersonation Information" = "Use a specific user name and password", where the Analyst provides their domain account information.
But, when trying to build a lift chart using the same data source view objects that were used to successfully train the model, the following error is occurring consistently:
Window Title: "Loading Mining Accuracy Chart" Window Text: "Failed to execute the query due to the following error: Execution of the managed stored procedure GenerateLiftTableUsingDatasource failed with the following error: Exception has been thrown by the target of invocation. Either '<domain><login>' user does not have permission to access the '' object, or the object does not exist. Errors in the high-level relational engine. A connection could not be made to the data source specified in the query. Errors in the high-level relational engine. A connection could not be made to the data source specified in the query.."
Since the Analyst was able to build the model with her/his given '<domain><login>' credentials, it is puzzling why the lift chart is failing.
Why does the lift chart for my mining model evaluation not have a random guess model line? Normally there should be lines for the trained models, the ideal model, and the random guess model. What did I miss? Could any experts here shed some light on this?
Thanks a lot in advance for your advice and help. I am looking forward to hearing from you shortly.
Hello. For my graduation project, I am interested in the data mining application for AdventureWorks DW in MS VS 2005. I opened File -> Open -> Project/Solution -> Enterprise -> AdventureWorksDW, then successfully deployed the decision tree and clustering algorithms. Then I opened the Mining Accuracy Chart tab and selected the input table "testing", which I had created earlier from vTargetMail. After that, the mining structure table and the target mail table were linked to each other automatically. Next, I selected 1 as the predict value for the predictable column "BikeBuyer". But when I click "Lift Chart", I only get a 45-degree line every time. How can I fix it?
Hi, I am studying the data mining features of SSAS, and for a workshop I've created two views derived from the vTargetMail view of AdventureWorksDW. The training data consists of every record except those in Pacific, and the test view consists only of records from the Pacific area.
1. I've created a mining structure based on Decision Tree and selected BikeBuyer as the predictable column.
2. According to the input column suggestions, I've selected Age, Eng.Education, NumberCarsOwned, YearlyIncome, CommuteDistance, NumberChildsatHome and TotalChildren as input columns.
3. I've modified no other settings, and deployed the project.
I can get training results in the decision tree browser and dependency network (and both seem to give rather logical results). However, when I try to browse the lift chart or classification matrix, I get an empty classification matrix and a lift chart consisting of a single 45-degree line.
Am I missing a step, or must I do some fine-tuning on (what) parameters?
I am a bit confused about model evaluation (the lift chart). Should we map all the columns between the mining structure and the case table? For predictive models we have a predictable column; shouldn't we leave that column unmapped between the mining structure and the case table? But it seems we are not allowed to omit the mapping for the predictable column.
Why is that? Could any experts here give me some explanation?
I hope my question is clear.
Thanks a lot, and I am looking forward to hearing from you shortly.
I bought the book "Data Mining with SQL Server 2005", but I can't find the solution to a problem I have.
I want to retrieve from C# the Attribute Value (AV) scores for the Logistic Regression algorithm. I can see the scores in the Microsoft Logistic Regression Viewer (the same as the Neural Network Viewer), but I cannot retrieve them via DMX, OLEDB or similar.
Otherwise, is there a formula that I can use to compute that score from the coefficient, support, or probability values of the attribute-value pair? I can access these values via DMX:
with a query like
Hi, I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output). The Lift Chart/Mining Legend in Analysis Services shows three results: Score, Population Correct (%), and Predict Probability (%).
I believe Population Correct is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?
However, I can't work out how the other two are derived, in particular the 'SCORE'. To give a live example, the scores were as follows:
Model            Score   Pop Correct   Pred Probability
Decision Trees   0.83    76.59%        54.28%
Neural Network   0.75    67.63%        50.05%
Ideal Model              100.00%
Can anyone help with this and give a detailed explanation?
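As background for the legend columns, a lift chart of this kind is usually built by ranking the test cases by predicted probability of the target state and plotting how much of the target population has been captured at each population percentage. The sketch below shows that general construction on made-up data; it is an illustration of the standard cumulative gains idea, and the exact formulas Analysis Services uses for Score and Predict Probability are not reproduced here.

```python
def cumulative_gains(scored_cases, target_state):
    """Return (population_fraction, target_fraction_captured) points.

    scored_cases: list of (actual_state, predicted_probability_of_target).
    Cases are ranked by predicted probability, highest first.
    """
    ranked = sorted(scored_cases, key=lambda case: case[1], reverse=True)
    total_targets = sum(1 for actual, _ in ranked if actual == target_state)
    if total_targets == 0:
        raise ValueError("test set contains no cases with the target state")

    points = []
    captured = 0
    for position, (actual, _) in enumerate(ranked, start=1):
        if actual == target_state:
            captured += 1
        points.append((position / len(ranked), captured / total_targets))
    return points


# Hypothetical test set: (actual state, predicted probability of state 1).
cases = [
    (1, 0.90), (0, 0.80), (1, 0.70), (0, 0.40), (1, 0.35),
    (0, 0.30), (0, 0.20), (0, 0.10), (0, 0.10), (0, 0.05),
]
points = cumulative_gains(cases, target_state=1)
print(points[2])   # share of targets captured in the top 30% of cases
```

In this picture the random guess line is the diagonal (30% of the population captures 30% of the targets), and an ideal model would capture every target before touching any non-target. A model whose ranking is no better than random therefore traces the 45-degree line.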
set @v_dbquery = char(39)+'SELECT [ProspectAlternateKey], [FirstName], [LastName], [MaritalStatus], [Gender], [YearlyIncome], [TotalChildren], [NumberChildrenAtHome], [HouseOwnerFlag], [NumberCarsOwned] FROM [dbo].[ProspectiveBuyer]'+char(39);
set @v_query = 'SELECT [TM_Cluster].[Bike Buyer], t.[ProspectAlternateKey], PredictProbability([TM_Cluster].[Bike Buyer]) From [TM_Cluster] PREDICTION JOIN OPENQUERY([Adventure Works DW],@v_dbquery) AS t ON [TM_Cluster].[Marital Status] = t.[MaritalStatus] AND [TM_Cluster].[Gender] = t.[Gender] AND [TM_Cluster].[Yearly Income] = t.[YearlyIncome] AND [TM_Cluster].[Total Children] = t.[TotalChildren] AND [TM_Cluster].[Number Children At Home] = t.[NumberChildrenAtHome] AND [TM_Cluster].[House Owner Flag] = t.[HouseOwnerFlag] AND [TM_Cluster].[Number Cars Owned] = t.[NumberCarsOwned]' -- print @v_query
set @full_query = 'select * from openquery (DMserver,'+char(39)+ @v_query +char(39)+')' ;
I've been playing around with the association mining model in SQL Server 2005 and built a market-basket analysis of my data that I'm pretty happy with. The next task for me is figuring out how to run DMX queries against the data that I've just mined, so we may possibly use it in a web based application. This wouldn't necessarily be a difficult problem (and still may not be), but every example I've seen for the Mining Model Prediction Designer uses relational databases and I built my mining model off OLAP. Therefore, my predictable attribute is nested, and relating the mining model structure to the relational database that the cube was built from always gives me an error:
"Errors in the high-level relational engine. The 'CompanyName' column could not be found in the top-level clause of the SHAPE statement."
What I would like to do, and I'm not really even sure how I should structure any of my queries, is feed the model a product and have it return a listing of all the products it predicts. Currently, I've only been able to get the designer mode to process a singleton query, and even that didn't return any useful data. I know that this probably can be done pretty easily so any advice you may be able to offer would be greatly appreciated!!
So that you may better understand my question, my association mining structure hierarchy looks like this:
[Model] ProductRecommend
With that in mind, I'm trying to perform a query similar to this:
PredictProbability([ProductRecommend].[Product].[PRODUCTCLASSID]), <---- Throws Error for PredictProbability syntax no matter what I try to get to [PRODUCTCLASSID]
(SELECT [PRODUCT] FROM [ProductRecommend].[Product])
I would like to use analysis services to analyze stock prices.
I want to find conditional probabilities: P (YpriceChg >= 10% s.t. Ydate between A and B| X Price Chg >= 20%)?
For example, given a price change of X percent or greater, predict the probability of a price change of Y percent or greater within a specified time window (such as 2 days, 3 months, etc.).
I also want to add a support filter, like:
N > 30 cases (i.e., there have been at least 10 instances of a 10% or greater price change, for the chosen time window)
I have a database of prices, monthly, daily, etc. I also have a number of columns that compute statistics such as pChg1M, pChg-1M, vChg1d (that is, price change 1 month forward, price change 1 month backward, and volume change 1 day forward). Ideally, I would like to minimize the column flags necessary for the experiment. Can you offer some hints on setting up appropriate columns/flags and choosing an algorithm (maybe decision trees, association rules, or NB)?
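The quantity being asked for can be written down directly as a counting estimate, which may help when deciding which columns or flags are needed. The sketch below is a plain empirical estimate in Python, not an Analysis Services algorithm; the row layout (a backward change paired with a forward change for the chosen window) and all numbers are hypothetical.

```python
def conditional_probability(rows, x_threshold, y_threshold, min_support):
    """Estimate P(y_change >= y_threshold | x_change >= x_threshold).

    rows: iterable of (x_change, y_change) pairs, one per observation.
    Returns (probability, support); probability is None when fewer than
    min_support observations satisfy the condition on x_change.
    """
    conditioned = [y for x, y in rows if x >= x_threshold]
    support = len(conditioned)
    if support < min_support:
        return None, support
    hits = sum(1 for y in conditioned if y >= y_threshold)
    return hits / support, support


# Hypothetical observations: (price change before, price change after).
rows = [
    (0.25, 0.12), (0.30, 0.02), (0.05, 0.15),
    (0.22, 0.11), (0.21, -0.04), (0.10, 0.00),
]
estimate, support = conditional_probability(rows, 0.20, 0.10, min_support=3)
print(estimate, support)
```

With a stricter support filter such as `min_support=30`, the same call returns `None` for the probability, signalling that there are too few qualifying cases to trust the estimate.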
After having built a decision tree model to predict a boolean output attribute using 64-bit SQL Server 2005 (build 9.0.3054), we have observed that predictions for some cases are being done at non-leaf nodes in the tree.
Specifically, after executing a prediction join which returns:
and comparing the values of PredictNodeID(MiningModel.OutputAttribute) with the mining model content column [NODE_UNIQUE_NAME] to determine the actual "rule" used to make the case-level prediction.
We have observed that for a subset of cases, predictions are being made at nodes in the tree that are not leaf nodes. Specifically, predictions are being made at a node that is 3 levels deep. The leaf nodes below this inner-tree node are 2 levels further down the tree.
Also supporting the fact that predictions are being made at this non-leaf node is that the PredictProbability corresponds exactly with the output attribute distribution at this non-leaf node.
In this particular application, we would have obtained better results if the predictions were made at the leaf-nodes.
A few questions:
1. Why are predictions with decision trees made at non-leaf nodes?
2. Is there a way to "force" predictions to occur at leaf nodes via DMX?
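The comparison described above (matching each predicted node ID against the model content to see whether it is a leaf) can be automated once both result sets are exported. The sketch below assumes the content rows have been reduced to (NODE_UNIQUE_NAME, PARENT_UNIQUE_NAME) pairs; the node names and predictions are made up for illustration.

```python
def non_leaf_nodes(content_rows):
    """Any node that appears as another node's parent is not a leaf."""
    return {parent for _, parent in content_rows if parent is not None}


def predictions_at_non_leaves(content_rows, predicted_node_ids):
    """Return the predicted node IDs that refer to inner (non-leaf) nodes."""
    inner = non_leaf_nodes(content_rows)
    return [node_id for node_id in predicted_node_ids if node_id in inner]


# Hypothetical content: (node unique name, parent unique name).
content_rows = [
    ("root", None),
    ("a", "root"), ("b", "root"),
    ("a1", "a"), ("a2", "a"),
]
# Hypothetical node IDs returned per case by the prediction query.
predicted = ["a1", "a", "b", "a2", "a"]

flagged = predictions_at_non_leaves(content_rows, predicted)
print(flagged)
```

Counting the flagged cases per node gives a quick view of how often, and where in the tree, predictions stop short of a leaf.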
When I export the report in Excel format, the chart is displayed as a picture. I want it to be displayed as an editable chart. Does Office Writer work in this situation, and has anyone used Office Writer to solve this type of problem? Is there any other method or product we can use instead of Office Writer?
I need to create a chart with the following features
1) A bar chart that has data for 3 years (3 series)
2) A line chart that has the same data as the bar chart above, but as a running total (3 series)
3) These data points are for the 12 months
4) There should be a secondary axis for the cumulative one
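To make the relationship between the two sets of series concrete, the cumulative line values are simply the running totals of the monthly bar values, computed separately for each year. The sketch below shows that calculation outside of any reporting tool; the years and monthly figures are invented for illustration.

```python
from itertools import accumulate

# Hypothetical monthly values (January to December) for three years.
monthly = {
    2004: [10, 12, 9, 14, 11, 13, 15, 12, 10, 16, 18, 20],
    2005: [11, 13, 10, 15, 12, 14, 16, 13, 11, 17, 19, 21],
    2006: [12, 14, 11, 16, 13, 15, 17, 14, 12, 18, 20, 22],
}

# Bar series use the monthly values as-is; line series use running totals.
running = {year: list(accumulate(values)) for year, values in monthly.items()}

for year in sorted(running):
    print(year, running[year])
```

Because the cumulative values grow to roughly twelve times the size of a typical monthly value, plotting them against a secondary axis keeps the bars readable.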
I don't know whether this is possible or not, but can anyone tell me how I can refresh the chart values? Here is what's happening:
I have two charts in a report: one for Main Category and the other for SubCategory. In the Main Category chart, I want the sum of the quantities of the subcategory values to appear as a bar (or in any other format), and when I click any bar on the main chart, I want it to refresh the other chart and return the detailed subcategory results under that main category.
I don't know if this is possible. I am very new to reporting; in fact, this is my first report. If anyone can point me to an article or provide any help, I would appreciate it.
I've generated some mining models against an OLAP data source (dimension). However, when I go to generate a lift chart, it seems that the only data source that can be used for input is the data source view (the relational database). Is that right? Or is there something I'm missing here.
I was figuring I'd use one slice to train the model, then another slice to test the accuracy of the results. But right now it's looking like I can't do that.
I have two problems while trying to train a neural network. My network has 10 continuous inputs and 1 discrete output (3 states).
The parameters I chose are:
- Hidden node ratio: 10
- Holdout percentage: 10
The others are default.
First, when I train it in BI Dev Studio, the training is very fast (less than 5 seconds) and the results compared with the training set are bad (at least 30% errors). Is there a way to improve the training? (I don't care about the time required to train, as long as it works.)
General data mining books talk about NN taking inputs which are between -1 and 1. Even Jamie's book says that's what it generally receives. I don't think this is a requirement for the Microsoft algorithm, but I wanted to ask if it was a best practice. If you're feeding it something like home values where 99% of homes are under $1 million you can use some normalization trick so that mansions don't skew the data. But if your data doesn't need such normalization, is there any need to normalize it to the -1 to 1 range?
Also, is the Microsoft algorithm sensitive to the relative size of different inputs? For instance, if InputA is home size (500-50,000 square feet) and InputB is months unoccupied (0-24 months), does that cause the Microsoft NN to weigh home size more heavily?
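For reference, the normalization that the general books describe is a simple linear rescaling of each input to the -1 to 1 range. The sketch below shows that rescaling applied to the two example inputs; it makes no claim about whether the Microsoft algorithm needs or already performs such a step, and the sample values are hypothetical.

```python
def scale_to_range(values, low=-1.0, high=1.0):
    """Linearly rescale values so min maps to low and max maps to high."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # A constant column carries no information; map it to the midpoint.
        return [(low + high) / 2 for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]


# Hypothetical samples of the two inputs mentioned above.
home_size_sqft = [500, 25250, 50000]
months_unoccupied = [0, 12, 24]

print(scale_to_range(home_size_sqft))
print(scale_to_range(months_unoccupied))
```

After rescaling, both inputs occupy the same numeric range, so any difference in their influence comes from the data rather than from the units they were measured in.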
I created a test table (named "Nset") with the columns id (int), n1 (float), n2 (float), n3 (float) and c1 (varchar). Then I filled the table with the following data:
id  n1   n2   n3   c1
1   0,1  0,1  0,6  one
2   0,2  0,1  0,5  one
3   0,7  0,5  0,1  two
4   0,4  0,9  0,3  two
5   0,5  0,1  0,5  three
Then I created a neural network with default settings. The "id" field is the key; n1, n2 and n3 are inputs; c1 is the predictable column.
Then I tried a prediction query, like:
(SELECT 0,5 AS [n1], 0,1 AS [n2], 0,5 AS [n3]) AS t
The result is "three". This is correct. And some other tests appeared correct.
But when I filled the column c1 with numerical values (one = 1, two = 2, three = 3) and changed its type to int, the prediction query stopped working correctly.
The previous query now returns 4.
Other tests also showed that the returned value is larger than expected by one.