Data Mining Model Viewer Of Decision Tree Out Of Memory Error
May 18, 2007
We've successfully processed a large decision tree model in SQL Server 2005. When I try to view the tree in the mining model viewer, I get the following error:
TITLE: Microsoft Visual Studio
------------------------------
The tree graph cannot be created because of the following error:
'Exception of type 'System.OutOfMemoryException' was thrown.'.
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.42&EvtSrc=Microsoft.AnalysisServices.Viewers.SR&EvtID=ErrorCreateGraphFailed&LinkId=20476
The link provides no other documentation on the error.
We're using 64-bit SQL on a Dell Workstation running XP-64 with 16GB of memory. As far as I can tell, we aren't close to running out of memory. Since the model processed and the error occurs only when viewing the model, is this a problem with Visual Studio and not necessarily Analysis Services?
I get the following error when I try to load the mining model in the mining model viewer:
Query (1, 6) The '[System].[Microsoft].[AnalysisServices].[System].[DataMining].[NeuralNet].[GetAttributeValues]' function does not exist.
I get a similar error when I try to load the Mining Accuracy Chart:
Failed to execute the query due to the following error:
Query (1, 6) The '[System].[Microsoft].[AnalysisServices].[System].[DataMining].[AllOther].[GenerateLiftTableUsingDatasource]' function does not exist.
Hello, I have a table (in Access) with about 30 fields and 1,700,000 records. I created a mining model in AS2005 with a single key (the autonumber column called ID) and the other attributes marked as Input and/or Predict. Processing the model ends (after 15 minutes) with an error: 3183 "Not enough space in temporal disk". After some searching, I found that this is closely related to the memory assigned to tempdb. I tried to increase the size of tempdb, but that is impossible; moreover, it starts at 8MB and is autosized when needed.
I don't know how to solve this issue, or whether it is a question of memory/disk space management (I have 100GB of free disk space).
I tried the same model with a different key (I assigned StudyID as the key). With the same data but only 60,000 StudyIDs, processing succeeds, so the mining model itself is fine (no nested tables, no case, too simple to cause a memory error)...
Please, can anyone recommend a possible solution for this issue? Many thanks.
I am going through the data mining web control viewer tutorial and it's going great. I have been able to build and set up the viewer. The problem is that when I publish the site to my web server, it gives me the following error:
Error: Either the user, <domain><computerName>$, does not have access to the prospectDataMining database, or the database does not exist.
When I debug this on my local machine via Visual Studio 2005, it works GREAT! It is just when I publish the site to the web server.
I have a dedicated SSAS server along w/ a dedicated web server. To test, I published the site to the SSAS server to see if it was a connection issue. I received the same error w/ a different <domain><computerName>$.
I tried to fill in the optional connection info in the data mining HTML tree viewer properties, but apparently I don't know how to do that properly. I also checked the IIS directory security and enabled Integrated security.
What am I doing wrong? ANY help is much appreciated.
Can anyone tell me the steps involved in retrieving the PMML of a (decision tree) model and using the model content to develop a web-based interface? I am using SQL Server 2005.
I've successfully created and processed a very simple neural network mining model (defined against a cube). However, when I go to the model viewer in BI studio, it displays the following error:
"Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Input string was not in a correct format.."
Any ideas about what's going wrong? This is with SQL Server 2005 SP1.
I tried to find the graphs I saved from the Data Mining Model Viewer, but where are they saved? The mining model viewer lets us save a graph by clicking the 'save graph' button, but where does the graph go?
I really need help with this.
Thank you very much in advance for any guidance.
I used a decision-tree mining model to describe and predict fraud. The table contains 1039 records with 775 distinct values of A-number (the calling party). I used 9 columns in the model. SQL Server reports that only 3 columns are significant in predicting fraud:
- BPN_is_too_short (called party-number is too short)
- Duration_is_zero
- Invalid_area_code
The key column is A-number, and the predicted column is Is_Fraud, whose only values are 0 and 1. There is no record with NULL (missing value) in the column Is_Fraud.
Mining Legend shows in the first split
[-] 625 cases of fraud
[-] 150 cases of non-fraud
[-] 0 cases of missing
In addition to that, Mining Legend shows
[-] 79.69% of fraud
[-] 19.64% of non-fraud
[-] 0.67% Missing
Now when I compare those values, they don't match.
(A) 625/775 is 80.645%, not 79.69%
(B) 150/775 is 19.355%, not 19.64%
(C) 0 cases of NULL (missing value) should imply 0% of missing, not 0.67% of missing
Furthermore, in one node (with the split on duration_is_zero), there are 541 cases of fraud and 0 cases of non-fraud. This implies the node is a leaf node. However, Mining Legend shows
[D] 514 cases of fraud, 99.35%
[E] 0 cases of non-fraud, 0.33%
[F] 0 cases of missing, 0.33%
My questions: (1) Why don't the values match in cases A through C? (2) Why don't the values match even in cases D through F, where there is no subtree at all?
I've looked for an explanation in the mathematical background (entropy, Gini index), but it does not account for the discrepancies between the counts and the percentages in the Mining Legend.
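The mismatch between counts and percentages can be reproduced numerically. The sketch below first computes the raw ratios from the counts quoted above, then applies additive (pseudocount) smoothing across the three states. The pseudocount of 5.3 is an assumption, back-solved from the displayed percentages rather than taken from any documented setting, and nothing in the post confirms that the viewer computes its figures this way.

```python
def raw_percentages(counts):
    """Plain ratio of each state's count to the total count."""
    total = sum(counts.values())
    return {state: 100.0 * n / total for state, n in counts.items()}


def smoothed_percentages(counts, pseudocount):
    """Add the same pseudocount to every state before normalising.

    The pseudocount is a hypothetical parameter used for illustration only.
    """
    total = sum(counts.values()) + pseudocount * len(counts)
    return {
        state: 100.0 * (n + pseudocount) / total
        for state, n in counts.items()
    }


# Counts reported by the Mining Legend for the first split.
root = {"fraud": 625, "non_fraud": 150, "missing": 0}

raw = raw_percentages(root)
smoothed = smoothed_percentages(root, pseudocount=5.3)  # assumed value

for state in root:
    print(f"{state:10s} raw={raw[state]:7.3f}%  smoothed={smoothed[state]:6.2f}%")
```

With these inputs the raw ratios are 80.645% and 19.355%, while the smoothed figures round to 79.69%, 19.64% and 0.67%. A non-zero share for a state with zero cases is what this kind of smoothing produces, which is one possible reading of cases C, E and F.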
Well... As I said in other topics, I'm doing a clustering plugin for text mining. I'm facing many problems and, with your help, solving them one by one.
First of all, thanks a lot again.
Well... I've made a clustering function that is actually working very well. But I'm exporting its results to a log file I use as an algorithm trace for debugging.
My clustering method returns a vector indicating which cluster each register belongs to. For instance:
vector[0] = 1 -> The register of index 0 belongs to cluster 1.
vector[1] = 9 -> The register of index 1 belongs to cluster 9.
vector[2] = 2 -> The register of index 2 belongs to cluster 2.
...
And so on.
But... I know that none of the Navigation methods accepts a structure like the one described above. I only use it to log the results while debugging the algorithm.
So how do I pass this information (which register, or test case, belongs to which cluster) to the Navigation?
Thanks a lot again; any help will be much appreciated.
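One way to think about the gap is that the vector is indexed by case, whereas a navigator is typically asked about one cluster at a time. The sketch below reshapes the per-case vector into per-cluster membership lists and supports. It is plain Python for illustration and does not use the plug-in API; the function names are hypothetical, and the idea that navigation wants per-cluster summaries is an assumption.

```python
from collections import defaultdict


def members_by_cluster(assignments):
    """Turn vector[case_index] = cluster_id into cluster_id -> [case_index, ...]."""
    clusters = defaultdict(list)
    for case_index, cluster_id in enumerate(assignments):
        clusters[cluster_id].append(case_index)
    return dict(clusters)


def cluster_supports(assignments):
    """Number of cases assigned to each cluster."""
    return {
        cluster_id: len(cases)
        for cluster_id, cases in members_by_cluster(assignments).items()
    }


# Same shape as the example above, with one extra case for cluster 1.
vector = [1, 9, 2, 1]
print(members_by_cluster(vector))
print(cluster_supports(vector))
```

Keeping the per-cluster structure as a member of the model, rather than only writing it to the log, would let each node lookup read its own cluster's entry.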
I have an MS Time Series model using a database of over a thousand products, each of which has hundreds of cases. Amazingly, it takes only a few minutes to finish processing the model, but when I click Mining Model Viewer to view the models, it takes many hours to show up. Once the window is open, I can choose the model for different products almost instantly. Is this normal?
In order to setup my forecasting mining model I have created a special view that runs against my fact table and creates time series on the level I need.
Select DFUKEY, DATE, QTY from Dim_FACT where DFKEY like '020%'
So I get the following input for my model:
time series key (e.g. DFUKEY)
date (time key)
QTY (to be predicted)
For testing purposes I created a small view (similar to AdventureWorks) that only contained a few time series. The model was created and processed in ~2 minutes or less. The viewer came up almost immediately and I was able to see results.
Now my real view has about 25000 time series I need a forecast for and that I also like to review in the viewer. If I create a mining model against that bigger view the processing takes ~15m or so and the viewer is likely to time out.
The worst part, though, is that when I try to get the forecast for a time series (see query below), it takes minutes before the answers come back.
I'm using SQL Server 2005 Standard Edition, and when I try to process a Decision Tree with more or less 50 input variables I get the following warning:
"Informational (Data mining): Automatic feature selection has been applied to model, TREE_2 due to the large number of attributes. Set MAXIMUM_INPUT_ATTRIBUTES and/or MAXIMUM_OUTPUT_ATTRIBUTES to increase the number of attributes considered by the algorithm."
I've tried to set MAXIMUM_INPUT_ATTRIBUTES to 10 and then there's an error saying: "The 'MAXIMUM_INPUT_ATTRIBUTES' data mining parameter is not valid for the 'TREE_2' model."
I have a framework 2.0 winforms application that uses the data mining viewer controls. I upgraded the project to visual studio 2008 and compiled it under framework 2.0. It compiles fine, but when the form with the TimeSeriesViewer control loads, the application throws the following exception:
System.Reflection.TargetInvocationException was unhandled by user code
  Message="Unable to get the window handle for the 'AxChartSpace' control. Windowless ActiveX controls are not supported."
  Source="System.Windows.Forms"
  StackTrace:
    at System.Windows.Forms.AxHost.InPlaceActivate()
    at System.Windows.Forms.AxHost.TransitionUpTo(Int32 state)
    at System.Windows.Forms.AxHost.CreateHandle()
    at System.Windows.Forms.Control.CreateControl(Boolean fIgnoreVisible)
    at System.Windows.Forms.Control.CreateControl(Boolean fIgnoreVisible)
    at System.Windows.Forms.AxHost.EndInit()
    at Microsoft.AnalysisServices.Viewers.TimeSeriesViewer.InitializeComponent()
    at Microsoft.AnalysisServices.Viewers.TimeSeriesViewer..ctor()
    at RMS2.UI.DecisionSupport.ShowModel(String modelName, Int32 tabIndex) in C:UsersDougDocumentsRmsIIRmsIIRMS2.UIDecisionSupport.cs:line 72
    at RMS2.UI.DecisionSupport.DecisionSupport_Load(Object sender, EventArgs e) in C:UsersDougDocumentsRmsIIRmsIIRMS2.UIDecisionSupport.cs:line 42
    at System.Windows.Forms.Form.OnLoad(EventArgs e)
    at System.Windows.Forms.Form.OnCreateControl()
    at System.Windows.Forms.Control.CreateControl(Boolean fIgnoreVisible)
    at System.Windows.Forms.Control.CreateControl()
    at System.Windows.Forms.Control.WmShowWindow(Message& m)
    at System.Windows.Forms.Control.WndProc(Message& m)
    at System.Windows.Forms.ScrollableControl.WndProc(Message& m)
    at System.Windows.Forms.ContainerControl.WndProc(Message& m)
    at System.Windows.Forms.Form.WmShowWindow(Message& m)
    at System.Windows.Forms.Form.WndProc(Message& m)
    at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
    at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
    at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
  InnerException: System.AccessViolationException
    Message="Attempted to read or write protected memory. This is often an indication that other memory is corrupt."
    Source="System.Windows.Forms"
    StackTrace:
      at System.Windows.Forms.UnsafeNativeMethods.IOleObject.DoVerb(Int32 iVerb, IntPtr lpmsg, IOleClientSite pActiveSite, Int32 lindex, IntPtr hwndParent, COMRECT lprcPosRect)
      at System.Windows.Forms.AxHost.DoVerb(Int32 verb)
      at System.Windows.Forms.AxHost.InPlaceActivate()
    InnerException:
The control is being added programmatically, as in the WinForms samples. Can anyone suggest a workaround? Thank you in advance.
I am trying to write some DMX queries to create and populate a mining model in an SSMS Analysis Services project. I followed these steps:
1. Create the mining model using a CREATE MINING MODEL ... query.
2. Run an INSERT INTO MINING MODEL ... query, which fetches prediction data from another mining model to populate the new mining model.
3. I now want to use this new model for prediction, which requires processing the mining model first. When I process the model, it throws the following error:
Error (Data mining): The 'XYZ_Structure' structure does not contain bindings to data (or contains bindings that are not valid) and cannot be processed.
Please suggest if I am making a mistake in the above procedure. I will appreciate all help in overcoming this issue.
Auxiliary question: How do I process the mining model programmatically?
I have successfully created and processed a mining structure (decision tree). However, when I try to generate the tree graph, I am getting the following error:
TITLE: Microsoft Visual Studio
------------------------------
The tree graph cannot be created because of the following error:
'Object reference not set to an instance of an object.'.
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.762&EvtSrc=Microsoft.AnalysisServices.Viewers.SR&EvtID=ErrorCreateGraphFailed&LinkId=20476
------------------------------
ADDITIONAL INFORMATION:
Object reference not set to an instance of an object. (Microsoft.AnalysisServices.Viewers)
------------------------------
BUTTONS:
OK
------------------------------
...with the following technical details
===================================
The tree graph cannot be created because of the following error:
'Object reference not set to an instance of an object.'. (Microsoft Visual Studio)
------------------------------
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%u00ae+Visual+Studio%u00ae+2005&ProdVer=8.0.50727.762&EvtSrc=Microsoft.AnalysisServices.Viewers.SR&EvtID=ErrorCreateGraphFailed&LinkId=20476
===================================
Object reference not set to an instance of an object. (Microsoft.AnalysisServices.Viewers)
------------------------------
Program Location:
at Microsoft.AnalysisServices.Viewers.TreeViewerBase.GetTooltipText(TreeGraphNode treeNode)
at Microsoft.AnalysisServices.Viewers.TreeViewerBase.predictionTreeComboBox_SelectedIndexAction()
Has anyone encountered this same error?
I am trying to model data in Analysis Services with the Advanced Create Mining Model function in the Excel add-in. I am having trouble creating an association model that works like the Associate button above the Advanced button.
The format of my data is like this
OrderID Product
100 Bike
100 Helmet
100 Shoes
200 Helmet
200 basketball
200 Bat
300 Shoes
300 Socks
The Associate button works perfectly, since it asks me which column is the transaction ID (OrderID) and which column I am trying to predict (Product). The Advanced Create Mining Model function asks me to determine what the columns are...
OrderID=key Product=Input+Predict?
When I run the Advanced Create Mining Model function with the association algorithm, I get a browser that shows no rules and only one-item itemsets (each product on its own, but no combinations of products).
Does anyone know what I have to do to get it to work like the Associate button?
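The difference between the two behaviours can be illustrated outside the add-in. The sketch below uses the sample rows above and contrasts two ways of reading them: grouping the rows by OrderID into baskets, versus treating every row as its own case. It is plain Python, not the add-in's algorithm, and the suggestion that the Advanced path treats each row as a separate case is an assumption consistent with the symptom, not something confirmed in the post.

```python
from collections import Counter, defaultdict
from itertools import combinations

# The rows exactly as listed in the post: (OrderID, Product).
rows = [
    (100, "Bike"), (100, "Helmet"), (100, "Shoes"),
    (200, "Helmet"), (200, "basketball"), (200, "Bat"),
    (300, "Shoes"), (300, "Socks"),
]


def build_baskets(rows):
    """Group products by transaction id so each order is one case."""
    baskets = defaultdict(set)
    for order_id, product in rows:
        baskets[order_id].add(product)
    return dict(baskets)


def itemset_support(baskets, size):
    """Count in how many cases each itemset of the given size appears."""
    support = Counter()
    for items in baskets.values():
        for itemset in combinations(sorted(items), size):
            support[itemset] += 1
    return support


grouped = build_baskets(rows)
# Assumed alternative reading: every row is its own case with a single item.
row_per_case = {i: {product} for i, (_, product) in enumerate(rows)}

print(itemset_support(grouped, 2))       # pairs exist once rows are grouped
print(itemset_support(row_per_case, 2))  # no pairs when each row stands alone
```

When every case holds a single product, no two-item itemset can ever be counted, which matches the empty rule list described above.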
I perform data mining on all products and a specific product category. Do I need to create 2 data source views, one for all products and the other one for the specific product category? Afterward, I run the Data Mining Wizard 2 times to create 2 mining structures. I also need to add the same mining model (e.g. Bayes, Cluster) to each of these mining structures. Is there any simple way to do it?
I am trying to run one of the mining models from the book "Delivering BI using SQL Server 2005", but I am running into "Decision Trees found no splits for model". The mining structure has 4 columns, the fourth one being marked as "Predict Only". The cube slice for the model has sufficient data in the cube. I am lost... Help!
While recently working with several mining models, I came across something that struck me as pretty odd - and I'm hoping to find an explanation for the behavior.
Consider the following setup:
- A single table in the relational database represents the only case table
- A single, continuous column is the predictable
- A mining structure has been created
- The mining structure contains a single model, based on the MS Decision Trees algorithm
- Input columns were selected for the model via the BI Studio wizard (i.e., those provided via the "Suggest" button)
- The structure has been fully processed
Now, the interesting parts:
1. I view the scatterplot for the mining model, under the Mining Accuracy Chart tab
2. Back on the Mining Structure tab, I delete one of the input columns
3. I add the same column back into the structure
4. The structure is fully processed again
5. When I view the scatterplot for the mining model, under the Mining Accuracy Chart tab, a different set of data points is presented for the model predictions
6. A different set of decision trees under the Mining Model Viewer tab confirms this
How could different patterns have been found this second time around, even though all of the input columns were the same (as well as the training cases)?
(Note: I encountered this situation while creating a new mining model that was identical to an existing one. Even though the models received the exact same inputs and training cases, they yielded different results. I was able to reproduce the behavior by using steps 1-6 above, though.)
Can someone provide some insight on this behavior, or some kind of explanation of what may be happening?
I want to use a decision tree to show a result. After I configure the mining structure and set all the inputs, my decision tree only goes down to level 2. I have 3 input columns and one PredictOnly column. Where are the other inputs?
Say I have House Owner, Marital Status, Num Cars Owned and Number Of Children (PredictOnly).
My tree only shows All ----> Marital Status when I input all 3 together; the other 2 don't seem to show.
What should I do? My database is in SQL Server, the keys are all correct, and it deploys fine. Why is this happening?
I'm a newbie with this software, so could any expert here please help me if there's something I might have missed along the way?
Can we represent the Decision Tree programmatically in a .NET application? I understand that the outcome of a Decision Tree model can be integrated into a .NET application, but I am not sure whether we can also visualize it. Does MS SQL Server provide any API to render such a tree?
I have got a lot of results like the following two nodes:
All Existing Cases: 1035298 Missing Cases: 1604 Y = 3,214,966,177,062,520,000,000.000
a >= -0.9822378254 and < -0.7867621803 Existing Cases: 45291 Missing Cases: 17 Y = 9,491,528,329,086,450,000,000.000
Every node of the tree is as odd as this. I checked the training data and found 5 bad points with extraordinarily high values of Y. There are over a million points, so how can these five points screw up the entire analysis?
I do get good results for other predicted parameters, even though they also have bad points.
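A handful of extreme values can dominate an average even among a million points. The sketch below uses made-up numbers (a million ordinary values of 1.0 and five outliers of 1e27, both hypothetical) and assumes that the Y shown in a node behaves like a mean of the training values; the post does not state how Y is computed.

```python
import statistics


def summarize(values):
    """Return the mean and the median of a list of numbers."""
    return {
        "mean": sum(values) / len(values),
        "median": statistics.median(values),
    }


# Hypothetical data: one million ordinary points plus five extreme ones.
typical = [1.0] * 1_000_000
outliers = [1e27] * 5

stats = summarize(typical + outliers)
print(f"mean   = {stats['mean']:.3e}")
print(f"median = {stats['median']}")
```

The mean lands in the 1e21 range, the same order of magnitude as the node values quoted above, while the median stays at 1.0. Filtering or capping the five bad points before training is the usual way to check whether they are the cause.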
My question is how to build a tree from the case above; that is, what method should be used to split the tree (counting manually)? I hope someone can help by explaining in detail, because I want to analyze exactly how the Microsoft decision tree works. Please explain the complete process of building the tree with that method.
Small problem here. I have successfully installed the Data Mining Web Controls on my PC, and I can use them to display the result that I want. The problem is that the expand and collapse images do not show up in the column. I can still click the blank column to expand or collapse a tree node, and it works completely. How can I display the expand and collapse images?
I'm working on a minor project for my undergrad course. I have no earlier experience working with SQL; I'm the biggest noob there ever was.
For part of my project I have to design a page using PHP and SQL that queries selected details (Rank, Sex, Branch) from a big student database, calculates the industrial placement chances, and constructs a multiway decision search tree in SQL (I'm using WAMP server).
This page is supposed to help new students joining the college decide on an ideal branch based on past performance and placement records. A new student will enter his rank and relevant details, and from the decision tree one or more ideal branches with a high placement history will be suggested.
My project assignment reads: "Now from the above prepared data construct a decision search tree, implement it either using association rules or persistent Objects and store it in secondary storage as shown
Further studies can be done to improve existing decision trees ... data mining bayesian classifier blah blah blah ... "
What i have done till now is create a table in this format:
But this is hardly a tree. Rather, I have flattened each path of the tree and made it into a table like: [node] -> [node] -> [node] -> [leaf]
I have tried to read some texts on how to do this, but they are not making sense, and most importantly I'm not sure that what I'm reading is actually going to help me achieve my project goals. Right now I'm stranded reading random articles. I have to do this within 5 days. I have asked people around here, including some professionals and teachers, and no one seems to have done this before. A little help with direction would be greatly appreciated.
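A table of flattened paths still contains the whole tree, because paths that share a prefix share a branch. The sketch below merges such rows into a nested structure, serialises it, and follows it for a new student. It is written in Python rather than PHP/SQL purely for illustration, and the tests and branch names in `paths` are hypothetical placeholders, not data from the project.

```python
import json


def build_tree(paths):
    """Merge flattened root-to-leaf paths into one nested dict.

    Each path is (test, test, ..., leaf) and must contain at least one test.
    """
    root = {}
    for *tests, leaf in paths:
        node = root
        for test in tests[:-1]:
            node = node.setdefault(test, {})
        node[tests[-1]] = leaf
    return root


def lookup(tree, answers):
    """Follow one answer per level until a leaf (a branch name) is reached."""
    node = tree
    for answer in answers:
        node = node[answer]
    return node


# Hypothetical rows in the [node] -> [node] -> [leaf] shape described above.
paths = [
    ("rank<=1000", "sex=F", "Branch A"),
    ("rank<=1000", "sex=M", "Branch B"),
    ("rank>1000", "sex=F", "Branch B"),
    ("rank>1000", "sex=M", "Branch C"),
]

tree = build_tree(paths)
serialized = json.dumps(tree, indent=2)  # text form that can be saved to disk
print(serialized)
print(lookup(tree, ["rank<=1000", "sex=M"]))
```

The same idea carries over to a relational layout by storing one row per node with a parent reference, which keeps the tree shape instead of repeating every path.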
I have read some sources about the Microsoft decision tree algorithm, such as Claude Seidman's book, a paper about scalable classification over SQL databases, and a paper about learning Bayesian networks. But I still don't understand exactly how the Microsoft decision tree algorithm works when splitting on an attribute. I have read that the Microsoft decision tree uses a Bayesian score as its split criterion; is that true?
Could anyone help me understand the Microsoft decision tree algorithm? Please give a detailed explanation with some example cases.
I've also read in Claude Seidman's book about data mining with Microsoft decision trees that the statistical techniques employed to build the decision trees include:
CART, CHAID and C4.5. Could anyone explain CART, CHAID and C4.5 to me, and how these three statistical techniques influence the decision tree?
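As general background, CART usually scores candidate splits with Gini impurity, C4.5 with an entropy-based gain ratio, and CHAID with chi-square tests. The sketch below shows only the first two impurity measures on a toy label list. It is a textbook illustration with hypothetical data and does not claim to reproduce what the Microsoft algorithm does internally.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum(
        (count / n) * log2(count / n) for count in Counter(labels).values()
    )


def weighted_impurity(groups, measure):
    """Impurity after a split: size-weighted average over the child groups."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * measure(group) for group in groups)


parent = [0, 0, 1, 1]            # hypothetical class labels at a node
perfect_split = [[0, 0], [1, 1]]
useless_split = [[0, 1], [0, 1]]

for name, split in [("perfect", perfect_split), ("useless", useless_split)]:
    gain = entropy(parent) - weighted_impurity(split, entropy)
    drop = gini(parent) - weighted_impurity(split, gini)
    print(f"{name}: information gain={gain:.3f}, gini decrease={drop:.3f}")
```

Whichever measure is used, the split chosen at each node is the one that lowers the weighted impurity the most; the techniques differ mainly in the scoring function and in how they handle multiway splits and pruning.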