Question About Decision Trees And Neural Networks
May 9, 2006
I have some accounting data, with some transaction attributes and amounts.
I'm using Decision Trees to try and predict the next month's amount for certain combinations of attributes.
I've tried two different structures for the model:
A: one with 9 discrete text input attributes.
B: another with the same 9 attributes plus an average Amount over all combinations of the nine attributes for every transaction.
When I've processed them and look in the dependency network, it says that the strongest link for structure A is attribute "1".
And for the second it's the average-Amount attribute.
Okay, that seems fine, but the second strongest link in structure B is attribute "2".
Shouldn't it be attribute 1 like in structure A?
Second question: if I run the same data in a Neural Network model, the predictions become much worse than the decision tree's.
I get many predictions that are negative values even though all training data contains positive values.
The StDev also becomes the same for every row.
What am I doing wrong with that one? I have a lot of transactions, and I read somewhere that a Neural Network should work better than a decision tree in a case similar to mine.
The score in the "Lift chart" for the Neural Network model becomes 0,00, and for Decision Trees with the same data I get around 110.
Feb 6, 2007
I created a test table (name - "Nset") with the columns:
id (int), n1 (float), n2 (float), n3 (float) and c1 (varchar).
Then I filled the table with the following information:
id n1 n2 n3 c1
1 0,1 0,1 0,6 one
2 0,2 0,1 0,5 one
3 0,7 0,5 0,1 two
4 0,4 0,9 0,3 two
5 0,5 0,1 0,5 three
And I created a neural network with default settings. The "id" field is the key; n1, n2 and n3 are inputs; c1 is the predictable column.
Then I tried a prediction query, like:
SELECT
PREDICT([Nset].[c1])
FROM
[Nset]
NATURAL PREDICTION JOIN
(SELECT 0.5 AS [n1], 0.1 AS [n2], 0.5 AS [n3]) AS t
The result is "three". This is correct. And some other tests appeared correct.
But when I filled column c1 with numerical values (one = 1, two = 2, three = 3) and changed its type to int, the prediction query stopped working correctly.
The previous query returns 4.
And other tests showed that the returned value is too large by one.
Is this correct?
Thanks.
Jan 17, 2007
1) Scaling of Inputs
Is the standardization of the inputs done automatically when running the Microsoft Neural Network algorithm, or should I transform the variables before running the algorithm?
2) Predicted Probabilities
How do I create a table with the actual predicted probabilities of the model for each observation? In the Mining Model Prediction tab the output is either 0 or 1. My question is how I can obtain the actual value of the estimated probability.
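For the scaling question, here is a minimal sketch of what manual standardization (z-scores) would look like if you chose to transform the inputs yourself. Nothing in this thread confirms whether the algorithm already does this internally, and the function name `standardize` is only an illustration.

```python
import statistics

def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - mean) / stdev for x in values]

# Hypothetical input column
scaled = standardize([2.0, 4.0, 6.0])
```

After this transform every input column has mean 0 and standard deviation 1, so no column dominates just because of its units.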
Jun 29, 2006
I am getting negative predictions (continuous) from a neural network model that has been trained on data that only contains positive values or zeros (no nulls).
Is there a setting that can limit the lower end of the output range to zero?
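I don't know of a model setting that does this, so the sketch below only shows two generic workarounds applied outside the model: clamping the predictions after the fact, or training on the logarithm of the target so that the back-transformed prediction is always positive. The function names are made up for illustration, and the log variant assumes strictly positive targets (zeros would need separate handling).

```python
import math

def clamp_non_negative(predictions):
    """Post-process predictions so none falls below zero."""
    return [max(0.0, p) for p in predictions]

def to_log_target(amounts):
    """Transform strictly positive targets before training."""
    return [math.log(a) for a in amounts]

def from_log_prediction(log_predictions):
    """Back-transform model output; exp() is always positive."""
    return [math.exp(p) for p in log_predictions]

print(clamp_non_negative([-3.2, 0.0, 5.5]))
```

Clamping is the simpler option but hides how far off the model was; the log transform changes what the model actually learns.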
Jun 13, 2007
I am trying to get familiar with Microsoft neural networks to predict property prices. The results are better but I wanted to amend the default parameters passed to the neural network.
On the MINING MODEL tab, when I right-click and go into SET ALGORITHM PARAMETERS, I can't see any parameters there. If I try to enter a parameter, for example MAXIMUM_STATES, and process the model, I get the following error message:
"The 'maximum_states' data mining parameter is not valid for the 'My Model' model"
I also added a decision tree model to the same structure, and when I go into the SET ALGORITHM PARAMETERS pop-up menu it comes with many pre-populated parameters with default values.
My question is why I am unable to add parameters to the NEURAL NETWORK model, and why it does not come with pre-populated parameters like DECISION TREES.
Your help will be much appreciated.
Nov 6, 2006
I am studying the behavior of 200.000 clients. Using decision trees, I would like to know whether my clients will abandon our service or not. I use a training set of 21.822 clients and a predictable variable "aband", which is discrete and can be 0 or 1. In my training set I have 21.597 cases in which aband is 0 and 225 cases in which aband is 1. Looking at the classification matrix obtained using a testing set (unselected data) as the input table, I can see that my decision tree doesn't recognize the cases in which aband is 1. Here is the Classification Matrix:
Counts for Dati Training on [Aband]
Predicted   0 (Actual)   1 (Actual)
0           21597        225
1           0            0
What should I do?
Chiara
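The matrix above is a typical symptom of class imbalance: a model that always answers 0 still looks very accurate. A small sketch of the arithmetic, using only the four counts from the matrix (the function names are illustrative):

```python
def accuracy(tn, fp, fn, tp):
    """Share of all cases that were classified correctly."""
    return (tn + tp) / (tn + fp + fn + tp)

def recall(fn, tp):
    """Share of actual positives (aband = 1) that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Counts read from the classification matrix:
# predicted 0 / actual 0 = 21597, predicted 0 / actual 1 = 225,
# predicted 1 / actual 0 = 0,     predicted 1 / actual 1 = 0
acc = accuracy(tn=21597, fp=0, fn=225, tp=0)
rec = recall(fn=225, tp=0)
print(f"accuracy={acc:.4f} recall={rec:.4f}")
```

Accuracy comes out near 99% while recall for aband = 1 is zero, which is why accuracy alone is a poor measure here. Common generic remedies are oversampling the rare class or undersampling the frequent one before training.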
May 18, 2006
I would appreciate answers to the following doubts I have regarding Decision trees, CONTAINS and using CONTAINS in a DMX query:
1. Does MS decision tree work only off equality/inequality conditions for the nodes? Is it possible to use a predicate as the branch criteria for a node?
2. Can the T-SQL predicate CONTAINS(...) be used in a DMX query? I need to check if a column-value is a substring of another column and create an intermediate column that will enable me to construct a decision tree with the phrase-present/absent branch.
3. Can CONTAINS(...) be used in a select clause? Like -
SELECT CONTAINS(JAT.column1, '"Good day"')
FROM JustAnotherTable;
4. Does CONTAINS(...) support both arguments to be column references? Or, is it mandatory that the pattern (argument #2) has to be a literal string or a variable? E.g.: I need to know the validity of the following expression -
SELECT * FROM JustAnotherTable JAT
WHERE CONTAINS(JAT.column1, JAT.column3);
Aug 3, 2007
Hi,
I'm new to data mining, and have created an MS decision trees model. The model has the columns age, call outcome, call reason, country name, employee name and gender - all as inputs.
In the mining model viewer, I only get nodes for the age, despite having data for all the other columns.
Can anyone help?
Thanks
Jeremy
Dec 8, 2007
Hi,
I'm interested in understanding how the parameters work in the MS Decision Trees algorithm.
As far as I can tell, the MINIMUM_SUPPORT and COMPLEXITY_PENALTY parameters both control the number of splits and hence the depth of the tree.
Unfortunately the BOL descriptions are very brief - so can anyone tell me the difference between these 2 parameters?
Thanks
Jeremy
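One way to picture the difference is as a hard limit versus a soft cost. The toy sketch below is not Microsoft's scoring method; it only assumes that a minimum-support rule rejects any split that leaves too few cases in a child, while a complexity penalty raises the bar that a split's score gain must clear. All names and numbers are hypothetical.

```python
def accept_split(left_count, right_count, score_gain,
                 minimum_support, complexity_penalty):
    """Toy split test: a hard case-count limit plus a soft gain threshold."""
    # Hard limit: every child must keep at least minimum_support cases.
    if min(left_count, right_count) < minimum_support:
        return False
    # Soft cost: the split must improve the score by more than the penalty.
    return score_gain > complexity_penalty

# Rejected by the support rule, no matter how good the gain is
print(accept_split(5, 500, score_gain=0.9, minimum_support=10, complexity_penalty=0.1))
# Rejected by the penalty, even though both children are large
print(accept_split(200, 300, score_gain=0.05, minimum_support=10, complexity_penalty=0.1))
```

In this picture, raising either value produces a smaller tree, but for different reasons: one protects against tiny leaves, the other against splits that barely help.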
Mar 19, 2007
Hello.
I am trying to build a decision tree to predict prices. I have created the tree and looked at the lift charts, but I have not seen any of the traditional statistics I am used to from other programs (R-Squared, F statistics, etc.).
Does anyone have an example of how they calculated R-Squared for a decision tree on a continuous variable?
Thanks,
Brian
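If the tool does not report it, R-squared can be computed outside the model from exported actual and predicted values. A minimal sketch using the standard definition, 1 - SS_res / SS_tot (the sample numbers are made up):

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R-squared is undefined")
    return 1.0 - ss_res / ss_tot

# Hypothetical prices and tree predictions
print(r_squared([100.0, 150.0, 200.0], [110.0, 140.0, 205.0]))
```

A value of 1 means perfect predictions, and 0 means the model does no better than always predicting the mean price.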
Dec 13, 2006
Hello,
I installed the bike buyer example and I am learning the DMX language. I wrote the following query (using MS Decision Trees):
SELECT
T.[Last Name],
[Bike Buyer],
PredictProbability(Predict([Bike Buyer])) AS [Probability]
From
[v Target Mail]
PREDICTION JOIN
OPENQUERY
(....... And so on..)
The result is surprising to me. In the result table all the probabilities are equal.
Bike Buyer Probability
1 0.99994590500919611
0 0.99994590500919611
0 0.99994590500919611
0 0.99994590500919611
0 0.99994590500919611
1 0.99994590500919611
and so on.
Now I am wondering what PredictProbability means. I thought that PredictProbability meant the probability that the prediction is correct. But all the probabilities are the same even though the input is different. Can somebody tell me what PredictProbability means, or am I using it wrong?
Thanks in advance,
Joris Valkonet
Sep 12, 2007
In a decision tree algorithm, is there a known way to force a branch at the top level? For example, I have 30 known decision patterns that are going to be completely different and I don't want them to intermingle. I wanted to force a branch at the top node on one of the 30 patterns so I wouldn't have to create 30 mining models per client.
Brian
Jul 26, 2007
How is the value of Prediction Probability calculated in the context of decision trees?
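In the textbook version of a decision tree, the probability of a prediction is the share of training cases with that state in the leaf the case lands in. The sketch below shows only that textbook calculation; the actual product may adjust these figures (for example with priors), so treat the numbers as approximate. The leaf counts are invented.

```python
def leaf_probabilities(state_counts):
    """Turn per-state case counts in a leaf into probabilities."""
    total = sum(state_counts.values())
    return {state: count / total for state, count in state_counts.items()}

def predict_with_probability(state_counts):
    """Return the most frequent state in the leaf and its probability."""
    probabilities = leaf_probabilities(state_counts)
    best_state = max(probabilities, key=probabilities.get)
    return best_state, probabilities[best_state]

# Hypothetical leaf holding 30 cases of state 0 and 10 cases of state 1
print(predict_with_probability({0: 30, 1: 10}))
```

Every case that falls into the same leaf gets the same probability, so a tree with few leaves returns few distinct probability values.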
Dec 7, 2006
Hi,
I am using the MS Decision Trees algorithm and for a specific model I get the above warning. As a result I don't get any splits in my tree. Is there anything I can do to avoid this?
Thank you for reading
Jan 5, 2007
Hi,
I am trying to run one of the mining models from the book "Delivering BI using SQL Server 2005" but I am running into "Decision Trees found no splits for model". The mining structure has 4 columns, the fourth one being marked as "Predict Only". My cube slice for the model has sufficient data in the cube. I am lost. Help!
Regards
Dec 12, 2007
While recently working with several mining models, I came across something that struck me as pretty odd - and I'm hoping to find an explanation for the behavior.
Consider the following setup:
A single table in the relational database represents the only case table
A single, continuous column is the predictable
A mining structure has been created
The mining structure contains a single model, based on the MS Decision Trees algorithm
Input columns were selected for the model via the BI Studio wizard (i.e., those provided via the "Suggest" button)
The structure has been fully processed
Now, the interesting parts:
1. I view the scatterplot for the mining model, under the Mining Accuracy Chart tab
2. Back on the Mining Structure tab, I delete one of the input columns
3. I add the same column back into the structure
4. The structure is fully processed again
5. When I view the scatterplot for the mining model, under the Mining Accuracy Chart tab, a different set of data points is presented for the model predictions
6. A different set of decision trees under the Mining Model Viewer tab confirms this
How could different patterns have been found this second time around, even though all of the input columns were the same (as well as the training cases)?
(Note: I encountered this situation while creating a new mining model that was identical to an existing one. Even though the models received the exact same inputs and training cases, they yielded different results. I was able to reproduce the behavior by using steps 1-6 above, though.)
Can someone provide some insight on this behavior, or some kind of explanation of what may be happening?
Thanks,
Joe Miller
Sep 29, 2015
I followed the tutorial posted at [URL] ...
Everything was ok until the last step where I had to process the mining structure which resulted in a warning
"Informational (Data mining): Decision Trees found no splits for model, Tbl Decision Tree Example."
What does this error mean? How do I resolve it? Also, I only see the first level in the Mining Model Viewer, I don't see the levels 2 and 3.
Oct 10, 2007
I am wondering the best way to go about a task I have been assigned. We have two similar websites but each is located on a different network. One network is secure so it cannot be accessed on the normal WWW. The secure network will contain the master database. I need to write a program or do something with SQL server to retrieve all records from the WWW site and get them onto the secure database. I also in the future will need to update records from the WWW site if they have been updated. What is the easiest way to move data from one network to the other when I cannot connect to both databases simultaneously?
Thanks,
Matt
Apr 8, 2001
How can I copy a database from one server to another when they are not on the same network? I have tried copying the backup file across from one and attempting a restore, but I keep getting an error message (Abnormal execution).
Is there a way to do this? HELP!
Sep 6, 2001
Does anyone know if SQL7 will work with Storage Area Networks(SAN's)? I've read that SQL2000 implements something called a Virtual Interface System Area Network (VI SAN) that allows communication with devices connected via a SAN.
My site is installing a SAN and I need to know if SQL7 can utilize those resources (Storage,etc) and how reliable if so.
Randy
Jan 4, 2006
How to establish connection string for sql server on non-domain network?
May 28, 2008
Hi,
My database looks like:
CategoryID ParentID Title Sort
1 -1 Cars 1
2 1 Honda 1
3 -1 Bikes 2
4 1 Ford 2
5 1 Toyota 3
6 3 Kawasaki 1
How can I retrieve the values in the following order:
1, 2, 4, 5, 3, 6
I have:
WITH MYCTE(categoryID, parentID, Title, Sort)
(
SELECT TOP 1 categoryID, parentID, Title, Sort
FROM Categories
WHERE parentID = -1
ORDER BY Sort ASC
UNION ALL
SELECT c.categoryID, c.parentID, c.title, c.sort
FROM Categories c
INNER JOIN MYCTE cte ON (cte.categoryID = c.parentID)
)
SELECT *
FROM MYCTE
It doesn't seem to work though? Help! hehe
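One common way to get a parent-then-children ordering is to carry a sortable path through the recursion and order by it at the end. The sketch below demonstrates the idea with SQLite from Python so it can be run anywhere; it is not T-SQL. On SQL Server the syntax differs (there is no RECURSIVE keyword, the CTE needs AS after its column list, and the path would be built with CAST rather than printf), so treat this only as an illustration of the technique. The `path` column is an addition that is not in the original table.

```python
import sqlite3

# Rows from the question: (categoryID, parentID, Title, Sort)
ROWS = [
    (1, -1, "Cars", 1), (2, 1, "Honda", 1), (3, -1, "Bikes", 2),
    (4, 1, "Ford", 2), (5, 1, "Toyota", 3), (6, 3, "Kawasaki", 1),
]

QUERY = """
WITH RECURSIVE tree(categoryID, parentID, Title, Sort, path) AS (
    -- Anchor: top-level categories; the path starts with their own Sort
    SELECT categoryID, parentID, Title, Sort, printf('%05d', Sort)
    FROM Categories
    WHERE parentID = -1
    UNION ALL
    -- Recursive step: append each child's Sort to its parent's path
    SELECT c.categoryID, c.parentID, c.Title, c.Sort,
           tree.path || '/' || printf('%05d', c.Sort)
    FROM Categories AS c
    JOIN tree ON tree.categoryID = c.parentID
)
SELECT categoryID FROM tree ORDER BY path
"""

def ordered_category_ids():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Categories ("
        "categoryID INTEGER PRIMARY KEY, parentID INTEGER, Title TEXT, Sort INTEGER)"
    )
    conn.executemany("INSERT INTO Categories VALUES (?, ?, ?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids

print(ordered_category_ids())  # [1, 2, 4, 5, 3, 6]
```

The zero-padded path segments make a plain text sort behave like a numeric, level-by-level sort, which is what produces the order 1, 2, 4, 5, 3, 6.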
Jun 14, 2006
Hello,
In SQL Servern Books is written:
SQL Server Authentication is provided for backward compatibility only. When possible, use Windows Authentication.
How can Windows Authentication be realized in small networks from two ore more computers running Windows XP?
Having all a application written in VB Net 2005 for example, which connects to a central database on one computer. Where the cost and afford for a domain controller running Windows Server is not necessary.
If Windows Authentication can't be realized or can't be realized easy in such a scenario, and SQL Server Authentication is not supported any more, then SQL Server can't be taken as database server for this scenario, where the focus is at simplicity and low cost.
Regards,
Markus
Dec 6, 2007
Hello there,
I'm currently building a SQL 2005 Active/Standby cluster in a DMZ. I have three NICs in each server.
Each NIC is connected to a different network:
192.168.100.1 is the public NIC
10.0.0.1 is the NIC used for communication between the cluster nodes (heartbeat)
192.168.200.1 is the admin NIC
I have installed my cluster using the 192.168.100.0 network for public access. This means that my SQL virtual ip is 192.168.100.10
Each server can be administered over the 192.168.200.0 network (admin), and the cluster/SQL Server IP is available from the 192.168.100.0 (public) network.
Now for my question: how can I assign an IP address from my admin network (e.g. 192.168.200.10) to the existing SQL Server cluster to make it available from my admin network while keeping the public IP?
Thanx in advance!
Chris
Oct 26, 2005
Hi, for a new project I'm trying to build a tree structure in SQL using one table with 'Node' and 'ParentNode' fields along with 'title', etc.
Table = Tree
Node  ParentNode  Title  Show_Record
1     0           Root   1
2     1           Child  1
Then i'm trying to get SQL to return that in XML to my Tree Control 'oBout ASP TreeView'.
Now the tree control can accept XML fine as long as it's in a set format, which shouldn't be difficult and should cut my code from 200 lines to one.
However getting SQL to return the table records in XML is proving to be a total nightmare.
I've hunted the web but not getting very far, I've even got a couple of O'Reilly guides but still no luck, so any help would be excellent with this.
I wrote a sql query (basic 'select * from tree for xml raw') which returns the results in RAW XML, but when I run this in Query Analyser it returns the results as one long string broken up with '<' & '>' but gets to the third record and cuts off halfway.
<row node="1" parentnode="0" title="Root" type_image="book.gif" type_expanded="True"/><row node="2" parentnode="1" title="Service Delivery" type_image="page.gif" type_expanded="False"/><row node="3" parentnode="1" title="Business Support" type_image="page.
Anyone know why Query Analyser does that?
Any help with this would be much appreciated; as you can imagine, I'm at my wits' end.
Jun 30, 2006
Hi! I have created a DMM using Trees. But when I go to the Mining Model Prediction tab and select a Predict function, I get this in the criteria column: <Scalar column reference>[, EXCLUDE_NULL|INCLUDE_NULL][, INCLUDE_NODE_ID]. When I select Result, I get this error: "An incorrect number of arguments are used in the function at line 3, column 3." I'm predicting a continuous variable.
But when I delete everything except <Scalar column reference> I get this error: "Parser: The syntax for '<' is incorrect."
When I delete everything in the criteria column, I get this: "Query execution failed."
If I change the criteria to "<Scalar column reference>,INCLUDE_NULL, INCLUDE_NODE_ID" I get the error again that the query execution failed.
I'm working from a data set I created. I had no problems with predictions using clustering, but can't seem to get Trees to work.
Mar 6, 2007
Hi,
I am using the Time Series algorithm. I just want to know about the autoregression tree. I have data like this:
Studid Date Perf
001 01/01/2007 90
001 02/01/2007 95
001 03/01/2007 89
002 01/01/2007 79
002 02/01/2007 90
002 03/01/2007 95
and so on. When I open the Model Viewer --> Decision Tree, it shows:
Perf = 90.0084 + 1.02 * Perf(-2) + 0.25 * Perf(-2).
What is this value, and how is it calculated?
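An equation of that shape reads as an autoregressive formula: a constant plus coefficients multiplied by earlier values of the same series, where Perf(-k) means the value k periods back. The sketch below only evaluates such a formula; how the tool fits the coefficients is not shown here. The function name is illustrative, and the lags are taken exactly as posted.

```python
def ar_predict(history, intercept, lag_coefficients):
    """Evaluate intercept + sum(coefficient * value `lag` steps back).

    lag_coefficients is a list of (lag, coefficient) pairs;
    lag 1 is the most recent value in history.
    """
    prediction = intercept
    for lag, coefficient in lag_coefficients:
        prediction += coefficient * history[-lag]
    return prediction

# Student 001's series from the post, with the coefficients as posted
next_perf = ar_predict([90, 95, 89], 90.0084, [(2, 1.02), (2, 0.25)])
print(next_perf)
```

Each leaf of an autoregression tree would hold one such formula, so different ranges of past values can use different coefficients.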
Sep 15, 2006
Hi,
Would anyone be able to provide a reference paper on the neural net algorithm implemented in SQL Server 2005 to better understand how it works?
Thanxs for any info.
Jul 9, 2007
I am having trouble setting up my Pull Subscription and I am new to replication.
I have several servers hosting a database-driven website that will be the same on each, except for user input and traffic. Quite simply, I need to copy most tables, SPs and data from network to network. I can't use FTP/Web synch; as I mentioned, the networks do not touch each other or the internet.
On server Web1, it was easy to create a Publication called Pub via the wizard for my database: TheDB. Then on Web1, again, I added a Subscription to the Publication, indicating my second server, Web2, and the same database name: TheDB (I have already backed up and restored TheDB to all my servers). Here's one of the sp's I ran on Web1:
use [TheDB]
exec sp_addsubscription @publication = N'Pub', @subscriber = N'Web2', @destination_db = N'TheDB', @sync_type = N'Automatic', @subscription_type = N'pull', @update_mode = N'read only'
GO
This is where I feel stuck. Using the wizard on Web2 doesn't allow me to see Web1. So I tried the following on Web2:
use [TheDB]
exec sp_addpullsubscription @publisher = N'Web1', @publication = N'Pub', @publisher_db = N'TheDB', @independent_agent = N'True', @subscription_type = N'pull', @description = N'', @update_mode = N'read only', @immediate_sync = 1
exec sp_addpullsubscription_agent @publisher = N'Web1', @publisher_db = N'TheDB', @publication = N'Pub', @distributor = N'Web1', @distributor_security_mode = 1, @distributor_login = N'', @distributor_password = null, @enabled_for_syncmgr = N'False', @frequency_type = 1, @frequency_interval = 0, @frequency_relative_interval = 0, @frequency_recurrence_factor = 0, @frequency_subday = 0, @frequency_subday_interval = 0, @active_start_time_of_day = 0, @active_end_time_of_day = 0, @active_start_date = 0, @active_end_date = 19950101, @alt_snapshot_folder = N'', @working_directory = N'', @use_ftp = N'True', @job_login = null, @job_password = null, @publication_type = 0
GO
I copied the snapshot folder, i.e. 20070709134423, onto CD and moved it into Web2's default replication folder, but I always receive: cannot connect to Distributor. I've tried using an alias as well, but I don't understand exactly how I should point that either. I checked the publication's PAL, and my Web2 user has rights and is an owner of the Web2 TheDB database.
Any help is appreciated.
Nate
Aug 10, 2006
I have two problems while trying to train a neural network.
My network has 10 continuous inputs and 1 discrete output (3 states).
The parameters I chose are :
-Hidden node ratio 10
-Holdout percentage 10
The others are default.
First, when I train it using BI Dev Studio, the training is very fast (less than 5 seconds) and the results compared with the training set are bad (at least 30% errors). Is there a way to improve the training (I don't care about the time required to train if it works)?
Second, I decided to train the network using SQL Server Management Studio and I get this error, which I can't understand: "Les connexions ad hoc telles que spécifiées dans des clauses OPENROWSET ne peuvent pas être utilisées sur ce serveur". In English: "Ad hoc connections such as those specified in OPENROWSET clauses cannot be used on this server".
My query is :
INSERT INTO MINING MODEL [Associations Learn2]([From Requete1],[From Requete2],
[Keywords1],[Keywords2],[Nb Apparition1],[Nb Apparition2],[Nombre Requete Distincte],[Probabilite],[Titre1],[Titre2],[Type],[Uid])
OPENROWSET
('SQLNCLI.1','Data Source=STAG-XP-EDITION;user=sa;password=***;Initial Catalog=OpenFind_StockagePreNeurone',
'SELECT [From Requete1],[From Requete2],
[Keywords1],[Keywords2],[Nb Apparition1],[Nb Apparition2],[Nombre Requete Distincte],
[Probabilite],[Titre1],[Titre2],[Type],[Uid] FROM associationsLearn2'
)
Could someone explain the error to me?
Jun 10, 2007
General data mining books talk about NN taking inputs which are between -1 and 1. Even Jamie's book says that's what it generally receives. I don't think this is a requirement for the Microsoft algorithm, but I wanted to ask if it was a best practice. If you're feeding it something like home values where 99% of homes are under $1 million you can use some normalization trick so that mansions don't skew the data. But if your data doesn't need such normalization, is there any need to normalize it to the -1 to 1 range?
Also, is the Microsoft algorithm sensitive to the relative size of different inputs? For instance, if InputA is home size (500-50,000 square feet) and InputB is months unoccupied (0-24 months), does that cause the Microsoft NN to weigh home size more heavily?
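For reference, the -1 to 1 scaling that the general books describe is a simple min-max transform. The sketch below applies it to the two inputs from the example; whether the Microsoft algorithm needs it is exactly the open question, so this only shows what the manual step would be. The function name and sample values are illustrative.

```python
def scale_to_range(values, low=-1.0, high=1.0):
    """Min-max scale values linearly into [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_min == v_max:
        # A constant column has no spread; place it at the midpoint.
        return [(low + high) / 2 for _ in values]
    span = v_max - v_min
    return [low + (v - v_min) * (high - low) / span for v in values]

home_size = scale_to_range([500, 25250, 50000])   # square feet
months_empty = scale_to_range([0, 12, 24])        # months unoccupied
print(home_size, months_empty)
```

After scaling, both inputs cover the same range, so neither can outweigh the other merely because of its units. Note that min-max scaling is sensitive to outliers such as a single mansion.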
Jun 9, 2006
Hello. Using MS Visual Studio 2005, I deployed a SQL table with the NN algorithm, and it deployed successfully. But when I switched to the "Mining Model Viewer" tab it gave me the following error:
The following system error occurred: Invalid procedure call or argument.
Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
What can I do?
Aug 1, 2006
Hello there,
I'm working with Analysis Services 2005 Developer Edition. Looking through the documentation, it becomes apparent that the NN algorithm takes 255 input attributes by default. This can be changed to any integer value, OK...
My problem is that I want to feed the network with 40000 input variables. In order to do so, I will have to do a select:
SELECT fld1, fld2, ...... fld39999, fld40000
FROM tblSometable
However, this is not possible: as Books Online describes, a select statement can return at most 4096 columns.
Question: how do I populate a NN in AS2005 with more than 4096 inputs?
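One possible direction, offered as an assumption rather than a confirmed answer, is to store the inputs in a long format (one row per case and attribute) instead of one very wide row, which is the shape a nested table would consume. The sketch below only shows that reshaping step; the function name and the fld names are hypothetical.

```python
def unpivot(case_id, wide_row):
    """Turn one wide record into (case_id, attribute, value) rows.

    Missing values (None) are skipped so sparse cases stay small.
    """
    return [
        (case_id, name, value)
        for name, value in wide_row.items()
        if value is not None
    ]

# Hypothetical case with three of the 40000 fields
rows = unpivot(7, {"fld1": 0.5, "fld2": None, "fld3": 1.2})
print(rows)
```

The long table needs only three columns no matter how many attributes exist, so the 4096-column limit on a select statement no longer applies to the source query.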