Neural Net Algorithm
Sep 15, 2006
Hi,
Would anyone be able to provide a reference paper on the neural net algorithm implemented in SQL Server 2005 to better understand how it works?
Thanks for any info.
I have two problems while trying to train a neural network.
My network has 10 continuous inputs and 1 discrete output (3 states).
The parameters I chose are:
- Hidden node ratio: 10
- Holdout percentage: 10
The others are default.
First, when I train it in BI Dev Studio, the training is very fast (less than 5 seconds) and the results on the training set are bad (at least 30% errors). Is there a way to improve the training? I don't care how long training takes as long as it works.
Second, I decided to train the network using SQL Server Management Studio, and I get this error, which I can't understand: "Les connexions ad hoc telles que spécifiées dans des clauses OPENROWSET ne peuvent pas être utilisées sur ce serveur". In English, this is roughly "Ad hoc connections such as those specified in OPENROWSET clauses cannot be used on this server".
My query is:
INSERT INTO MINING MODEL [Associations Learn2] (
    [From Requete1], [From Requete2], [Keywords1], [Keywords2],
    [Nb Apparition1], [Nb Apparition2], [Nombre Requete Distincte],
    [Probabilite], [Titre1], [Titre2], [Type], [Uid]
)
OPENROWSET(
    'SQLNCLI.1',
    'Data Source=STAG-XP-EDITION;user=sa;password=***;Initial Catalog=OpenFind_StockagePreNeurone',
    'SELECT [From Requete1], [From Requete2],
     [Keywords1], [Keywords2], [Nb Apparition1], [Nb Apparition2], [Nombre Requete Distincte],
     [Probabilite], [Titre1], [Titre2], [Type], [Uid] FROM associationsLearn2'
)
Could someone explain the error to me?
General data mining books talk about NN taking inputs which are between -1 and 1. Even Jamie's book says that's what it generally receives. I don't think this is a requirement for the Microsoft algorithm, but I wanted to ask if it was a best practice. If you're feeding it something like home values where 99% of homes are under $1 million you can use some normalization trick so that mansions don't skew the data. But if your data doesn't need such normalization, is there any need to normalize it to the -1 to 1 range?
Also, is the Microsoft algorithm sensitive to the relative size of different inputs? For instance, if InputA is home size (500-50,000 square feet) and InputB is months unoccupied (0-24 months), does that cause the Microsoft NN to weigh home size more heavily?
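To make the two scaling options in the question above concrete, here is a minimal Python sketch. It does not describe what the Microsoft algorithm does internally, which the post leaves open. The function names `min_max_scale` and `z_score` and the sample values are hypothetical, chosen only to match the ranges mentioned in the question.

```python
from statistics import mean, pstdev

def min_max_scale(values, low=-1.0, high=1.0):
    """Linearly map values onto [low, high]; a constant column maps to the midpoint."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [(low + high) / 2.0 for _ in values]
    span = largest - smallest
    return [low + (v - smallest) * (high - low) / span for v in values]

def z_score(values):
    """Center on the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical sample data in the ranges from the question
home_size = [500, 1500, 2500, 50000]      # square feet
months_unoccupied = [0, 6, 12, 24]        # months

scaled_size = min_max_scale(home_size)
scaled_months = min_max_scale(months_unoccupied)
standardized_size = z_score(home_size)
```

After either transformation both inputs occupy comparable numeric ranges, so neither dominates purely because of its units. Note that min-max scaling still lets one very large home compress all the other values toward -1, which is the skew the question mentions.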
I created a test table named "Nset" with the columns
id (int), n1 (float), n2 (float), n3 (float), and c1 (varchar).
Then I filled the table with the following data:
id  n1   n2   n3   c1
1   0.1  0.1  0.6  one
2   0.2  0.1  0.5  one
3   0.7  0.5  0.1  two
4   0.4  0.9  0.3  two
5   0.5  0.1  0.5  three
I created a neural network with the default settings. The "id" field is the key; n1, n2, and n3 are inputs; c1 is the predictable column.
Then I tried a prediction query like:
SELECT
  PREDICT([Nset].[c1])
FROM
  [Nset]
NATURAL PREDICTION JOIN
(SELECT 0.5 AS [n1], 0.1 AS [n2], 0.5 AS [n3]) AS t
The result is "three", which is correct. Some other tests also gave correct results.
But when I replaced the values in column c1 with numbers (one = 1, two = 2, three = 3) and changed its type to int, the prediction query stopped working correctly.
The previous query now returns 4.
Other tests also returned a value one greater than expected.
Is this correct?
Thanks.
Hello, using MS Visual Studio 2005, I deployed a mining model that uses the NN algorithm on a SQL table, and it deployed successfully. But when I switched to the "Mining Model Viewer" tab, it gave me the following error:
The following system error occurred: Invalid procedure call or argument.
Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
What can I do?
Hello there,
I'm working with Analysis Services 2005 Developer Edition. Looking through the documentation, it becomes apparent that the NN algorithm takes 255 input attributes by default. This can be changed to any integer value, OK...
My problem is that I want to feed the network with 40000 input variables. In order to do so, I will have to do a select:
SELECT fld1, fld2, ...... fld39999, fld40000
FROM tblSometable
However, this is not possible: according to Books Online, a SELECT statement can return at most 4096 columns.
Question: How do I populate an NN in AS2005 with more than 4096 inputs?
1) Scaling of Inputs
Is the standardization of the inputs done automatically when running the Microsoft Neural Network algorithm, or should I transform the variables before running the algorithm?
2) Predicted Probabilities
How do I create a table with the actual predicted probabilities of the model for each observation? In the Mining Model Prediction tab the output is either 0 or 1; how can I obtain the actual value of the estimated probability?
I have read the threads regarding the Neural Network Viewer and I think I have a similar problem. I do have Service Pack 2 installed and I'm running the x64 version of SQL 2005.
I'm building a model from a single relational case table. Granted, the table has many columns summarized at the customer level, but it is well formed and has no NULL values (plenty of zero or blank values, though). The only time I can get the NN Viewer to work is when I accept the attribute recommendations. It seems that once I stray from these recommendations, even if there is still correlation with an attribute, I cannot view the model using the NN Viewer. My latest error message says:
"The provider could not determine the String value. For example, the row was just created, the default for the String column was not available, and the consumer had not yet set a new String value."
I get this message even when all input attributes are Continuous so I'm not sure what String column it is referring to.
Any help is greatly appreciated. I'm in a time crunch and I have sold the client on SQL Server 2005 capabilities. It's a bit embarrassing if I can't get this resolved.
-- Steve
P.S.: I don't recall having any issues with the NN Viewer prior to Service Pack 2 (although others have). Have you done regression testing to test this issue?
I am getting negative predictions (continuous) from a neural network model that has been trained on data that only contains positive values or zeros (no nulls).
Is there a setting that can limit the lower end of the output range to zero?
I am trying to get familiar with Microsoft Neural Networks to predict property prices. The results are better, but I wanted to change the default parameters passed to the neural network.
On the Mining Models tab, when I right-click and go into Set Algorithm Parameters, I can't see any parameters there. If I try to enter a parameter, for example MAXIMUM_STATES, and process the model, I get the following error message:
"The 'maximum_states' data mining parameter is not valid for the 'My Model' model"
I also added a decision tree model to the same structure, and when I go into the Set Algorithm Parameters dialog for it, it comes with many pre-populated parameters with default values.
My question is: why am I unable to add parameters to the neural network, and why does it not come with pre-populated parameters like decision trees?
Your help will be much appreciated.
In a Neural Network model, is there a way to use a table column as a predictable column?
Mary
I have some accounting data, with some transaction attributes and amounts.
I'm using Decision Trees to try to predict the next month's amount for certain combinations of attributes.
I've tried two different structures for the model:
A: one with 9 discrete text input attributes.
B: another with the same 9 attributes plus an average amount, computed over all combinations of the nine attributes, for every transaction.
When I process them and look at the dependency network, the strongest link for structure A is attribute "1".
For structure B it is the average-amount attribute.
OK, that seems fine, but the second strongest link in structure B is attribute "2".
Shouldn't it be attribute "1", as in structure A?
Second question: if I run the same data through a Neural Network model, the predictions become much worse than with the decision tree.
I get many predictions that are negative, even though all the training data contains positive values.
The StDev is also the same for every row.
What am I doing wrong with that one? I have a lot of transactions, and I read somewhere that a Neural Network should work better than a decision tree in a case similar to mine.
The score in the lift chart for the Neural Network model is 0.00, while for Decision Trees with the same data I get around 110.
Hi, all,
What can we tell from the lift value of an attribute value in the Neural Network viewer? Is there a threshold for the lift value that indicates whether an attribute value is important to the selected output attribute value? In other words, when we use the Neural Network viewer to describe the characteristics of a segment, what does the lift value of a particular attribute value let us say about it?
I am looking forward to hearing from you shortly, and thanks a lot.
With best regards,
Yours sincerely,
Hi, all,
I am confused about the "Probability of Value 1" and "Probability of Value 2" figures shown for a particular attribute value in the Neural Network viewer. For example, the probability of value 1 is very low (as is the probability of value 2), yet the bars showing the strength of these two probabilities are still long, even longer than the bars for other attribute values that have a much higher probability of value 1 or 2. Why is that?
Also, how does the algorithm calculate the probability for an attribute value in the neural network?
I hope my question is clear.
I am looking forward to hearing from you shortly, and thanks a lot in advance.
With best regards,
Yours sincerely,
I am in the process of training a neural network, which could take a significant number of iterations. With other tools I can visually see the convergence (in terms of error for the model). Is there a way to see progress during Neural Network training in Analysis Services? It would be useful to see the accuracy, iteration number, timeout, etc. while training is in progress.
Thanks
Rajeev Gupta
Wanted some candid feedback on this idea. Everyone knows that neural nets are a black box in terms of the weights and such it uses. The best you can do is to get an idea how "sensitive" the NN is to each input. I don't know of any example code that's out there to help you do this in an automated manner (correct me if I'm wrong) so I'm thinking about writing a sproc that would help with this task.
Basically, the sproc would take in the mining model's name and the key of one case. Let's say the case and its attributes are:
ZipCode=93901 (the case key)
MedianIncome=99,098
PopulationDensity=1234
AvgTemperature=74.3
Predict likelihood to respond to an offer
For that case, it would go through an iteration per input. First it would test the first input, MedianIncome. It would run 100 predictions using the input values listed above. All the inputs would remain the same throughout except for MedianIncome where it would try out the complete range of that input. Based upon how much the prediction changes, you would have an idea how sensitive the model is to MedianIncome. Then it would move on to the next input and do the same.
When it's done testing each input, it could spit out a dataset listing all the inputs in order of how sensitive the model is to them and a few other stats like min and max prediction.
Thoughts? Improvements? Alternatives?
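The one-input-at-a-time sweep described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a stored procedure: `sensitivity_report`, `toy_predict`, and the input ranges are hypothetical, and `toy_predict` stands in for whatever call would actually run a prediction query against the mining model.

```python
def sensitivity_report(predict, case, input_ranges, steps=100):
    """Vary one input at a time across its range and record how far the prediction moves."""
    report = []
    for name, (low, high) in input_ranges.items():
        predictions = []
        for i in range(steps):
            probe = dict(case)  # copy so every other input keeps its original value
            probe[name] = low + (high - low) * i / (steps - 1)
            predictions.append(predict(probe))
        report.append({
            "input": name,
            "min_prediction": min(predictions),
            "max_prediction": max(predictions),
            "spread": max(predictions) - min(predictions),
        })
    # Most sensitive input first
    report.sort(key=lambda row: row["spread"], reverse=True)
    return report

def toy_predict(inputs):
    """Hypothetical stand-in for a prediction query against the model."""
    return 0.00001 * inputs["MedianIncome"] + 0.0001 * inputs["PopulationDensity"]

# The case from the post; the ranges below are made up for illustration
case = {"MedianIncome": 99098, "PopulationDensity": 1234, "AvgTemperature": 74.3}
ranges = {
    "MedianIncome": (20000, 200000),
    "PopulationDensity": (10, 5000),
    "AvgTemperature": (40.0, 100.0),
}
report = sensitivity_report(toy_predict, case, ranges)
```

One limitation of this design is that it measures sensitivity only around the single case supplied, so interactions between inputs are invisible. Repeating the sweep for several representative cases and averaging the spreads would give a less local picture.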
Hello,
Say that I have 100,000 attributes (features) for my SQL Server Neural Network algorithm.
Customer  Attr1  Attr2  Attr3  .....   Attr100000
==============================
Jack      1      0      1      .....   1
Sam       0      1      1      ......  0
Mary      1      1      0      ......  1
Given that I can't fit that information in a single table, and that SQL Server's Neural Network does not support table prediction, what is an alternative way to use a Neural Network in SQL Server 2005 to solve my problem?
Please assist!
Mary
I've successfully created and processed a very simple neural network mining model (defined against a cube). However, when I go to the model viewer in BI studio, it displays the following error:
"Execution of the managed stored procedure GetAttributeScores failed with the following error: Exception has been thrown by the target of an invocation.Input string was not in a correct format.."
Any ideas about what's going wrong? This is with SQL Server 2005 SP1.
Greetings,
I have a mining structure that I am using to perform a text-mining
classification task. The mining structure contains three models: a
decision tree, a naive bayes and a neural network.
Both the decision tree and the naive bayes models process without
any problems, but I am having significant difficulties with the neural
network model.
Initially when I processed the model, processing would fail altogether with the following error message:
"Memory error: Allocation failure : Not enough storage is available to process this command"
This was remedied by taking the steps prescribed in
http://support.microsoft.com/kb/917885 (I upgraded to SQL 2005 SP1 and
applied all available hotfixes listed in
http://support.microsoft.com/kb/918222/). This got me to the point
where the model (seemingly) processed correctly after restricting the
MAXIMUM_INPUT_ATTRIBUTES to a relatively low number. So after
processing, I went to try and browse the neural network model and view
the lift chart...
<error>
"Execution of the managed stored procedure GetAttributeScores
failed with the following error: Exception has been thrown by the
target of an invocation.Input string was not in a correct format.."
</error>
(see http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=935340&SiteID=17)
Also when I would attempt to view the lift chart and the
classification matrix the queries would time out with the following
error message:
<error>
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure
GenerateLiftTableUsingDatasource failed with the following error:
Exception has been thrown by the target of an
invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
</error>
Now, my poking around on TechNet led me to believe that this issue
could finally be resolved by upgrading to the CTP release of SQL Server
2005 SP2. However, I am still encountering problems. When I go to browse the
model in the Neural Network Viewer, I see the correct drop down menus
to select attributes and attribute values in the "Input" and "Output"
panes but I see no data displayed in the "Variables" pane at the
bottom.
Interestingly, while I cannot view the model contents in the
graphical viewer, the mining model contents viewer reveals model
contents that look to be pretty normal for a trained neural network.
Attempts to view the lift chart time out with the error message:
<error>
XML for Analysis parser: The XML for Analysis request timed out before it was completed.
Execution of the managed stored procedure
GenerateLiftTableUsingDatasource failed with the following error:
Exception has been thrown by the target of an
invocation.Microsoft::AnalysisServices::AdomdServer::AdomdException.
</error>
and when I run predictions against the trained NN model in the
"Mining Model Prediction" pane it predicts the same value for every
case in the testing set.
Any thoughts?
Hi!
I bought the book "Data Mining with SQL Server 2005", but I can't find the solution to a problem I have.
I want to retrieve from C# the Attribute Value (AV) scores for the Logistic Regression algorithm. I can see the scores in the Microsoft Logistic Regression Viewer (the same as the Neural Network Viewer), but I cannot retrieve them via DMX, OLE DB, or similar.
Otherwise, is there a formula that I can use to compute that score from the coefficient, support, or probability values of the attribute-value pair (I can read these values from DMX)?
I can access them via DMX:
NODE_DISTRIBUTION -> SUPPORT and PROBABILITY ATTRIBUTE_VALUE...
with a query like
SELECT FLATTENED (SELECT ATTRIBUTE_NAME, ATTRIBUTE_VALUE FROM NODE_DISTRIBUTION WHERE VALUETYPE = ... ) FROM [MyModel].CONTENT WHERE NODE_TYPE ....
Thanks in advance
Regards,
Marco
Does anyone have an algorithm that can divide A into B without using the divide
sign (/) or the multiplication sign (*)?
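One common way to do this is binary long division, which needs only shifts, additions, subtractions, and comparisons. The Python sketch below assumes integer operands and truncates the quotient toward zero; the function name `divide` and the choice to return the remainder as well are illustrative, not part of the question.

```python
def divide(a, b):
    """Return (quotient, remainder) of a divided by b using shifts, adds, and subtracts only."""
    if b == 0:
        raise ZeroDivisionError("division by zero")
    negative = (a < 0) != (b < 0)
    dividend, divisor = abs(a), abs(b)

    # Find the largest shift for which the shifted divisor still fits in the dividend
    shift = 0
    while (divisor << (shift + 1)) <= dividend:
        shift += 1

    # Subtract shifted copies of the divisor, from the largest down to the smallest
    quotient = 0
    while shift >= 0:
        if (divisor << shift) <= dividend:
            dividend -= divisor << shift
            quotient += 1 << shift
        shift -= 1

    if negative:
        quotient = -quotient
    remainder = -dividend if a < 0 else dividend
    return quotient, remainder

result = divide(100, 7)
```

The loop runs once per bit of the quotient, so it is much faster than subtracting the divisor repeatedly when the quotient is large.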
I am new to DM and I am not sure which algorithm would be best to use.
I am trying to build a custom comparator application that companies can use to compare themselves against other companies based on certain pieces of information. I need to group a company with 10 other companies based on 6 attributes. I need to be able to apply weightings to each of the 6 attributes and have those taken into consideration when determining which 10 other companies each company is grouped with. Each group must contain 11 members: the company of the logged-in user and 10 other companies that it will be compared against.
At first I thought that clustering would be a good fit for this, but I cannot see a way to mandate that each cluster contain exactly 11 members, I cannot see a way to weight the inputs, and I think each company can only be in one cluster at a time, which does not meet my requirements.
Any help will be greatly appreciated!
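The three requirements listed above (a fixed group size, per-attribute weights, and a company appearing in more than one group) are the shape of a weighted nearest-neighbor lookup rather than a clustering problem. Below is a minimal Python sketch of that alternative. It is a swapped-in technique, not one of the SQL Server mining algorithms, and `peer_group`, the sample companies, and the weights are all hypothetical.

```python
import math

def peer_group(companies, target, weights, size=10):
    """Return the `size` companies closest to `target` under a weighted Euclidean distance."""
    target_attrs = companies[target]

    def weighted_distance(attrs):
        return math.sqrt(
            sum(w * (a - t) ** 2 for w, a, t in zip(weights, attrs, target_attrs))
        )

    others = [name for name in companies if name != target]
    others.sort(key=lambda name: weighted_distance(companies[name]))
    return others[:size]

# Hypothetical data: 15 companies, each described by 6 numeric attributes
companies = {f"Company{i}": [float(i)] * 6 for i in range(1, 16)}
weights = [3.0, 2.0, 1.0, 1.0, 1.0, 0.5]  # larger weight = attribute matters more

peers = peer_group(companies, "Company8", weights)
group = ["Company8"] + peers  # always exactly 11 members
```

Because the lookup is run separately for each company, the same company can appear in many groups. The attributes should be scaled to comparable ranges first, otherwise the attribute with the largest units overrides the weights.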
I have read in Claude Seidman's book about data mining that the Microsoft Decision Trees algorithm includes the CART, CHAID, and C4.5 algorithms. Could anyone explain these tree algorithms to me, and how they are used together in one case?
Thank you so much.
Use this to check whether a number has a valid Luhn check digit:
CREATE FUNCTION dbo.fnIsLuhnValid
(
    @Luhn VARCHAR(8000)
)
RETURNS BIT
AS
BEGIN
    -- Reject any input that contains a non-digit character
    IF @Luhn LIKE '%[^0-9]%'
        RETURN 0

    DECLARE @Index SMALLINT,
            @Multiplier TINYINT,
            @Sum INT,
            @Plus TINYINT

    SELECT  @Index = LEN(@Luhn),
            @Multiplier = 1,
            @Sum = 0

    -- Walk the digits from right to left, doubling every second digit
    WHILE @Index >= 1
        SELECT  @Plus = @Multiplier * CAST(SUBSTRING(@Luhn, @Index, 1) AS TINYINT),
                @Multiplier = 3 - @Multiplier,          -- alternates 1, 2, 1, 2, ...
                @Sum = @Sum + @Plus / 10 + @Plus % 10,  -- add the digits of the product
                @Index = @Index - 1

    -- Valid when the total is a multiple of 10
    RETURN CASE WHEN @Sum % 10 = 0 THEN 1 ELSE 0 END
END
Peter Larsson
Helsingborg, Sweden
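For cross-checking results outside the database, the same check can be written in Python. This is a sketch that follows the logic of the T-SQL function above (rightmost digit taken as is, every second digit doubled, digits of each product summed). The name `is_luhn_valid` is hypothetical, and unlike the T-SQL version it rejects an empty string.

```python
def is_luhn_valid(number):
    """Return True when the digit string passes the Luhn check."""
    # Reject empty input and anything that is not a plain ASCII digit
    if not number or any(ch not in "0123456789" for ch in number):
        return False

    total = 0
    multiplier = 1  # alternates 1, 2, 1, 2, ... starting from the rightmost digit
    for ch in reversed(number):
        plus = multiplier * int(ch)
        total += plus // 10 + plus % 10  # add the digits of the product
        multiplier = 3 - multiplier
    return total % 10 == 0

ok = is_luhn_valid("79927398713")
bad = is_luhn_valid("79927398710")
```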
Hello,
Do you know if the algorithm for the BINARY_CHECKSUM function is documented somewhere?
I would like to use it to avoid returning some string fields from the server.
By returning only the checksum, I could look up the string in a hash table, and I think this could make the code more efficient on slow connections.
Thanks in advance and kind regards,
Orly Junior
What kind of algorithm does the MAX command use? I have a table where I need to get the last value of the transaction ID and increment it by 1, so that I can use it as the next TransID every time I insert a new record into the table. I use the MAX command to obtain the last TransID in the table. However, someone suggested that there is a problem with this: if multiple users try to insert a record into the same table and processing is slow, they might end up with the same next TransID. He suggested having a separate table that contains only the TransID and using it to determine the next TransID. Will this really make a difference as far as processing speed is concerned, or is using MAX on the same table to come up with the next TransID enough? Do you have a better suggestion?
Thanks
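The separate counter table mentioned above only helps if the increment and the read happen in one transaction, so that two sessions cannot both see the same value. The Python sketch below shows that pattern using the standard-library `sqlite3` module as a stand-in for SQL Server; the table name `trans_counter` and the function `next_trans_id` are hypothetical. In SQL Server itself, an IDENTITY column is the usual built-in way to get the same guarantee without a hand-rolled counter.

```python
import sqlite3

def next_trans_id(conn):
    """Increment the one-row counter and return the new value inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        # The UPDATE takes the write lock first, so concurrent callers are serialized
        conn.execute("UPDATE trans_counter SET last_id = last_id + 1")
        (last_id,) = conn.execute("SELECT last_id FROM trans_counter").fetchone()
    return last_id

# In-memory database used only for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans_counter (last_id INTEGER NOT NULL)")
conn.execute("INSERT INTO trans_counter (last_id) VALUES (0)")
conn.commit()

ids = [next_trans_id(conn) for _ in range(3)]
```

By contrast, reading MAX(TransID) and inserting MAX + 1 as two separate steps leaves a window in which another session can read the same maximum.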
Hi All!
I have a few questions regarding the Clustering algorithm.
If I process the clustering model with K (the number of clusters) from 2 to n, how can I find a measure of variation and loss of information for each model (any kind of measure)? The purpose would be to decide which K to use.
Which clustering method is better for segmenting data: K-means or EM?
Thanks in advance!
Hi.
Does anyone know of, or know where I can find, C# implementations or class libraries of these algorithms:
a) RLS - Recursive Least Squares algorithm
b) MWAR - Multi-resolution Wavelet Auto-regressive algorithm
c) AR - Autoregressive moving average algorithm
d) EWMA - Exponentially Weighted Moving Average
The .NET Framework System.Math class does not seem to include these algorithms.
Regards
Shorin
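Of the four, EWMA is simple enough to write by hand: each smoothed value is `alpha` times the new observation plus `(1 - alpha)` times the previous smoothed value. The sketch below is in Python rather than C#, as an illustration of the recurrence only. The function name `ewma`, the choice to seed the series with the first observation, and the sample data are assumptions.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average, seeded with the first observation."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # first point has no history to blend with
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

series = ewma([1, 2, 3], alpha=0.5)
```

A larger `alpha` tracks recent observations more closely, while a smaller one smooths more heavily.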
Hi
I want to predict which products are likely to be sold together. Please help me decide which algorithm is best (association, clustering, or decision trees), and please let me know how to use a case table and a nested table. My table structure is:
Cust_ID
Age
Product
Location
Income
Thanks
Rajesh Ladda
Hi,
I am using SQL Server 2005 as the back end for my project.
We are developing a standalone web application for a client, so we need to host this application on his server. He is not willing to install SQL Server 2005 on his server, so we are placing the .mdf file in the data directory of the project.
When I developed against SQL Server 2005, I used the AES_256 algorithm to encrypt and decrypt the pwd column using symmetric keys, and it worked fine.
But when I took the .mdf file and added it to my project, it threw this error when creating the symmetric key:
"Either no algorithm has been specified or the bitlength and the algorithm specified for the key are not available in this installation of Windows."
Please suggest a solution.
Hi,
I'm writing my master's thesis on a new plug-in algorithm based on the LVQ algorithm.
I worked through the tutorial with the pair_wise_linear_regression algorithm and I have some questions. I searched for the code of the algorithm in the tutorial files and didn't find it. I have my new algorithm programmed in C++ and ready to attach, but I don't know where to put it: in which file should I put it to start defining the COM interfaces? And which file in the SRC folder of the tutorial contains the code of the pair_wise_linear_regression algorithm?
Thanks
Hello friends,
Can you give me some idea about the algorithms for clustering in data mining?
Please reply.
I am trying to predict the revenue generated by each person.
My input looks like this:
Month     Person   Revenue
-----------------------------------------
20050101  Person1  $1000
20050101  Person1  $2000
20050201  Person1  $1000
20050101  Person2  $5000
20050201  Person2  $2000
20050201  Person2  $3000
Obviously, for Person1 and 200501 I expect to see $3000 in the MS Time Series Viewer, correct?
Instead I see REVENUE(actual) - 200501 VALUE = XXX,
where XXX is a completely different number.
There are also negative numbers in the forecast area, which is not correct from a business point of view.
Person1, who is a tough guy, tried to shoot me.
What am I doing wrong? Could you please give me an idea of how to extract correct
historical and predicted information?
Thank you,
Tim.
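The expected $3000 for Person1 in 200501 is the sum of that person's rows for the month. A minimal Python sketch of that aggregation follows, in case reducing the source data to one row per month and person before training is useful. Whether the repeated time points are actually the cause of the discrepancy is an assumption, and the function name `monthly_revenue` is hypothetical.

```python
from collections import defaultdict

def monthly_revenue(rows):
    """Sum revenue per (YYYYMM, person) so each series has one value per month."""
    totals = defaultdict(int)
    for day, person, revenue in rows:
        totals[(day[:6], person)] += revenue  # first six characters are YYYYMM
    return dict(totals)

# The rows from the post above
rows = [
    ("20050101", "Person1", 1000),
    ("20050101", "Person1", 2000),
    ("20050201", "Person1", 1000),
    ("20050101", "Person2", 5000),
    ("20050201", "Person2", 2000),
    ("20050201", "Person2", 3000),
]
totals = monthly_revenue(rows)
```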