Question On Sequence Clustering Algorithm
May 1, 2007
Hi, all experts here,
Thank you very much for your kind attention.
I have a question about the Sequence Clustering algorithm. It is generally used for sequence analysis, especially for analyzing web navigation paths. Besides that, in what other scenarios could we apply this algorithm?
Thanks a lot in advance and I am looking forward to hearing from you shortly.
With best regards,
Yours sincerely,
Oct 29, 2007
Hi All!
I have a few questions regarding the Clustering algorithm.
If I process the clustering model with K (the number of clusters) ranging from 2 to n, how can I find a measure of variation and loss of information for each model (any kind of measure)? The purpose is to decide which K to use.
Which clustering method is better for segmenting data: K-means or EM?
Thanks in advance!
Dec 5, 2006
Hi, all here,
Since we are not able to use the accuracy chart for clustering models, how can we verify the accuracy of clustering models in terms of their classification and regression tasks?
Thank you very much in advance for your guidance and advice.
With best regards,
Yours sincerely,
Mar 2, 2007
Hi,
I have data like this:
Studid  Date        Perf
001     01/01/2008  90
001     02/01/2008  89    Cluster 1
001     03/02/2008  91
002     01/01/2008  75
002     02/01/2008  79    Cluster 2
002     03/02/2008  69
I want to create two clusters: cluster 1 for Studid 001 and cluster 2 for Studid 002.
How do I write a prediction query using the Clustering algorithm?
Nov 24, 2006
Hi, all experts here,
Thank you very much for your kind attention.
I have a question about NODE_DISTRIBUTION.PROBABILITY. Some attribute values have only a small amount of support for a specific node, so why do they have a large NODE_DISTRIBUTION.PROBABILITY, sometimes even greater than 1? How can NODE_DISTRIBUTION.PROBABILITY be greater than 1? How does the SQL Server 2005 data mining engine calculate NODE_DISTRIBUTION.PROBABILITY for its Clustering algorithm? I am really confused and need guidance on this.
Thank you very much for your help.
With best regards,
Yours sincerely,
Nov 10, 2006
Hi!
I've read a lot of information about the Microsoft Sequence Clustering algorithm, but the more I read, the more confused I get. Here's my question:
Can it discover sequences of "tokens" and then group them? Or does it only compare sequences in order to group them?
Thanks in advance.
Aug 14, 2007
I'm a college student in my 10th semester at the Universidad de los Andes, Colombia, and I'm working on a data mining project. I need to use the sequence clustering approach, so I need to understand completely how it works. To do that, I need to know which inputs it uses, how the algorithm works, and what type of output it produces. Do you have any idea where I can find this type of information, along with examples?
Any help would be appreciated.
Thank you for your time.
Oct 10, 2006
I get this message when I deploy my sequence clustering model:
Error 1 Error (Data mining): Duplicate Key Sequence values in an input case for SeqCluster. Ambiguous case(s) may lead to unreliable results. Disambiguate the data (recommended) or increase ErrorLog KeyErrorLimit server parameter. 0 0
I don't have any duplicate keys in my case table. I'm using a date for my key sequence column in my nested table. Is that the problem?
Nov 1, 2006
How do I limit (or partition) a sequence clustering model by year? I would like to do it within the sequence clustering mining structure instead of partitioning at the OLAP cube.
Jul 13, 2006
Hi,
I read the paper on sequence clustering. It seems that the major application of the algorithm is web site analysis. I was wondering whether I can apply this algorithm to purchase sequences in credit card data.
If so, please also tell me the difference between sequence clustering and association rules when applied to credit card data. I realize that sequence clustering is a fully probabilistic model with predictive capability, but association rules also give the probabilities of purchasing other products.
Thanks in advance.
To Wong
Aug 24, 2007
Hi,
Where can I find some sample databases to see how sequence clustering works?
Thanks a lot.
Mar 4, 2007
Can somebody help me?
Dec 18, 2007
I am using the Sequence Clustering algorithm. (I've built several models with the Clustering and Decision Trees algorithms for this client, and they work fine.)
Background: Sequence data must be stored in a nested table, which can have only one non-key attribute.
I specify a mining model structure with the nested table key as the datetime, and the nested table discrete prediction column as [sort name]. This builds the model fine.
When I try to process this data mining model, I get Process failed: "Errors in the OLAP storage engine: The sort order specified for distinct count records is incorrect".
It may be that OLAP distinct count requires a numeric data type, although the examples I've seen don't suggest that. I tried it anyway, and it doesn't work with a numeric type either; I get the same problem.
Any suggestions?
Nov 13, 2007
We have two environments, Testing and Production, both running Windows 2003 Enterprise Server with SQL Server 2005. The difference is that Testing is NOT running on a Windows cluster but Production is. What is the best way to transfer a database from Testing to Production?
We have other systems where both Testing and Production are non-clustered, and there we use backup/restore to transfer the database. Can that approach be applied in this case?
I also found that there is a tool called DTC, which can transfer all DB objects from one DB to another. Is it the best way to transfer between non-clustered and clustered environments?
May 22, 2002
Does anyone have an algorithm that can divide A into B without using the division sign (/) or the multiplication sign (*)?
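One possible approach to the question above is shift-and-subtract long division, which needs only subtraction, comparison, and bit shifts. The sketch below is in Python rather than T-SQL, the function name `divide` is just an illustrative choice, and it assumes integer operands with the quotient truncated toward zero.

```python
def divide(a: int, b: int) -> tuple[int, int]:
    """Return (quotient, remainder) of a divided by b, truncating toward zero.

    Uses shift-and-subtract long division: only subtraction, comparison,
    and bit shifts are used, with no division or multiplication operator.
    """
    if b == 0:
        raise ZeroDivisionError("division by zero")

    negative = (a < 0) != (b < 0)  # the quotient is negative if the signs differ
    n, d = abs(a), abs(b)

    # Find the largest shifted copy of d that still fits into n.
    shift = 0
    while (d << (shift + 1)) <= n:
        shift += 1

    # Subtract shifted copies of d from largest to smallest,
    # setting the matching bit of the quotient each time one fits.
    quotient = 0
    while shift >= 0:
        if (d << shift) <= n:
            n -= d << shift
            quotient |= 1 << shift
        shift -= 1

    remainder = -n if a < 0 else n  # the remainder takes the sign of a
    return (-quotient if negative else quotient), remainder
```

For example, `divide(17, 5)` gives a quotient of 3 and a remainder of 2. Plain repeated subtraction would also work, but it takes a number of steps proportional to the quotient, while the shifted version takes a number of steps proportional to the quotient's bit length.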
Nov 24, 2006
I am new to DM and I am not sure which algorithm would be best to use.
I am trying to build a custom comparator application that companies can use to compare themselves against other companies based on certain pieces of information. I need to group a company with 10 other companies based on 6 attributes. I need the ability to apply weightings to each of the 6 attributes and have those taken into consideration when determining which 10 other companies each company is grouped with. Each group must contain 11 members: the company of the logged-in user and the 10 other companies that it will be compared against.
At first I thought that clustering would be a good fit for this, but I cannot see a way to mandate that each cluster contain exactly 11 members, I cannot see a way to weight the inputs, and I think each company can only be in one cluster at a time, which does not meet my requirements.
Any help will be greatly appreciated!
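The requirements above (a fixed group size, weighted attributes, and a company appearing in more than one group) resemble a weighted nearest-neighbour search more than clustering. The following is only a rough Python sketch of that idea, not a feature of the SQL Server mining algorithms. The function name `peer_group`, the weighted Euclidean distance, and the assumption that all attributes are numeric and already scaled to comparable ranges are mine.

```python
import math

def peer_group(companies, target, weights, size=10):
    """Return the `size` companies closest to `target`.

    companies: dict mapping company name -> tuple of numeric attributes
    weights:   one non-negative weight per attribute
    Distance is a weighted Euclidean distance (an illustrative choice).
    """
    base = companies[target]

    def distance(attrs):
        return math.sqrt(
            sum(w * (a - b) ** 2 for w, a, b in zip(weights, base, attrs))
        )

    # Rank every other company by distance; ties are broken by name.
    ranked = sorted(
        (distance(attrs), name)
        for name, attrs in companies.items()
        if name != target
    )
    return [name for _, name in ranked[:size]]
```

Calling `peer_group(companies, "Acme", weights)` for a hypothetical company "Acme" would return its 10 nearest peers, so each group has 11 members including the company itself, and the same company can appear in many other companies' groups.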
Jun 8, 2006
I have read in Claude Seidman's book about data mining that the algorithms inside Microsoft Decision Trees include CART, CHAID, and C4.5. Could anyone explain these tree algorithms to me, and explain how they are used together in one case?
Thank you so much.
Dec 11, 2006
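Use this to check whether a number has a valid Luhn check digit:
CREATE FUNCTION dbo.fnIsLuhnValid
(
    @Luhn VARCHAR(8000)
)
RETURNS BIT
AS
BEGIN
    -- Reject input containing any non-digit character
    IF @Luhn LIKE '%[^0-9]%'
        RETURN 0

    DECLARE @Index SMALLINT,
            @Multiplier TINYINT,
            @Sum INT,
            @Plus TINYINT

    -- Start at the rightmost digit with a multiplier of 1
    SELECT  @Index = LEN(@Luhn),
            @Multiplier = 1,
            @Sum = 0

    -- Walk right to left, alternating the multiplier between 1 and 2
    -- and adding the digits of each product to the running sum
    WHILE @Index >= 1
        SELECT  @Plus = @Multiplier * CAST(SUBSTRING(@Luhn, @Index, 1) AS TINYINT),
                @Multiplier = 3 - @Multiplier,
                @Sum = @Sum + @Plus / 10 + @Plus % 10,
                @Index = @Index - 1

    -- The number is valid when the sum is a multiple of 10
    RETURN CASE WHEN @Sum % 10 = 0 THEN 1 ELSE 0 END
END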
Peter Larsson
Helsingborg, Sweden
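For cross-checking results outside the database, the same Luhn logic can be sketched in Python. This is a rough equivalent rather than a port of the function above: the name `is_luhn_valid` is illustrative, and unlike the T-SQL version it treats an empty string as invalid.

```python
def is_luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn check."""
    # Reject anything that is not a plain ASCII digit string.
    # Assumption: an empty string is treated as invalid here.
    if not number or any(ch not in "0123456789" for ch in number):
        return False

    total = 0
    multiplier = 1  # the rightmost digit is taken as-is
    for ch in reversed(number):
        product = multiplier * int(ch)
        total += product // 10 + product % 10  # add the digits of the product
        multiplier = 3 - multiplier            # alternate between 1 and 2
    return total % 10 == 0
```

The commonly used test number 79927398713 passes this check, while changing its last digit makes it fail.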
Jul 23, 2005
Hello,
Do you know if the algorithm for the BINARY_CHECKSUM function is documented somewhere?
I would like to use it to avoid returning some string fields from the server. By returning only the checksum, I could look up the string in a hash table, and I think this could make the code more efficient on slow connections.
Thanks in advance and kind regards,
Orly Junior
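The lookup idea described above can be sketched on the client side as follows. Since the BINARY_CHECKSUM algorithm is the open question here, this Python sketch uses CRC-32 from `zlib` purely as a stand-in, so its values will not match what SQL Server returns. The names `checksum` and `build_lookup` are illustrative.

```python
import zlib

def checksum(text: str) -> int:
    # Stand-in checksum (CRC-32). This is NOT the BINARY_CHECKSUM algorithm.
    return zlib.crc32(text.encode("utf-8"))

def build_lookup(strings):
    """Map checksum -> string, refusing to continue if two strings collide."""
    lookup = {}
    for s in strings:
        key = checksum(s)
        if key in lookup and lookup[key] != s:
            raise ValueError(f"checksum collision: {lookup[key]!r} vs {s!r}")
        lookup[key] = s
    return lookup
```

Whatever checksum is used, two different strings can produce the same value, so the approach is only safe when the set of possible strings is known in advance and has been checked for collisions, as `build_lookup` does.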
Dec 7, 2007
What kind of algorithm does the MAX function use? I have a table where I need to get the last value of the Transaction ID and increment it by 1, so that I can use it as the next TransID every time I insert a new record into the table. I use MAX to obtain the last TransID in the table. However, someone suggested that there is a problem with this: if multiple users try to insert a record into the same table and processing is slow, they might end up with the same next TransID. He came up with the idea of having a separate table that contains only the TransID and using this table to determine the next TransID. Will this really make a difference as far as processing speed is concerned, or is using MAX on the same table to come up with the next TransID enough? Do you have a better suggestion?
Thanks
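The separate-table idea described above can be sketched as follows. This is only an illustration in Python using SQLite as a stand-in database, not SQL Server code, and the table name `trans_counter` and the function names are hypothetical. The key point is that the increment and the read happen inside one transaction, so two callers cannot both see the same value. In SQL Server itself, an IDENTITY column is often the simpler way to get the same guarantee.

```python
import sqlite3

def create_counter(conn: sqlite3.Connection) -> None:
    """Create the single-row counter table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trans_counter (last_id INTEGER NOT NULL)"
    )
    if conn.execute("SELECT COUNT(*) FROM trans_counter").fetchone()[0] == 0:
        conn.execute("INSERT INTO trans_counter (last_id) VALUES (0)")
    conn.commit()

def next_trans_id(conn: sqlite3.Connection) -> int:
    """Increment the counter and return the new value in one transaction."""
    with conn:  # commits on success, rolls back on error
        # The UPDATE takes the write lock, so concurrent writers are serialized.
        conn.execute("UPDATE trans_counter SET last_id = last_id + 1")
        return conn.execute("SELECT last_id FROM trans_counter").fetchone()[0]
```

By contrast, reading MAX(TransID) and inserting in two separate steps leaves a window in which another session can read the same maximum, which is the problem raised in the question.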
Sep 15, 2006
Hi,
Would anyone be able to provide a reference paper on the neural network algorithm implemented in SQL Server 2005, to better understand how it works?
Thanks for any info.
Jan 10, 2006
Hi.
Does anyone know where I can find C# implementations or class libraries for these algorithms?
a) RLS - Recursive Least Squares algorithm
b) MWAR - Multi-resolution Wavelet Autoregressive algorithm
c) AR - Autoregressive moving average algorithm
d) EWMA - Exponentially Weighted Moving Average
The .NET Framework System.Math class does not seem to include these.
Regards
Shorin
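Of the four, EWMA is simple enough to write by hand. The sketch below is in Python rather than C#, and it assumes the common recursive form s_t = alpha * x_t + (1 - alpha) * s_(t-1), seeded with the first observation. The name `ewma` and the parameter `alpha` are illustrative, and the same loop translates almost line for line to C#.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average of a sequence of numbers.

    alpha is the smoothing factor in (0, 1]; larger values give more
    weight to recent observations. The first output equals the first input.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the range (0, 1]")

    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(float(x))  # seed with the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed
```

For example, `ewma([10, 20, 30], 0.5)` yields 10.0, 15.0, and 22.5.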
Jul 12, 2006
Hi,
I want to predict which products can be sold together. Please help me decide which algorithm is best: association, clustering, or decision trees. Please also let me know how to use the case table and nested table. My table structure is:
Cust_ID
Age
Product
Location
Income
Thanks
Rajesh Ladda
Feb 14, 2008
Hi,
I am using SQL Server 2005 as the back end for my project.
We are developing a standalone web application for a client, so we need to host this application on his server. He is not willing to install SQL Server 2005 on his server, so we are placing the .mdf file in the data directory of the project.
When I developed against SQL Server 2005, I used the AES_256 algorithm with symmetric keys to encrypt and decrypt the pwd column, and it worked fine.
But when I took the project's .mdf file and added it to my project, it threw this error when creating the symmetric key:
"Either no algorithm has been specified or the bitlength and the algorithm specified for the key are not available in this installation of Windows."
Please suggest a solution.
Feb 7, 2008
Hi,
I'm writing my master's thesis on a new plug-in algorithm based on the LVQ algorithm.
I worked through the tutorial with the pair_wise_linear_regression algorithm and I have some questions. I searched for the algorithm's code in the tutorial files and couldn't find it. I have my new algorithm programmed in C++ and ready to plug in, but I don't know where to put it. In which file should I put it to start defining the COM interfaces? And which file in the tutorial's SRC folder contains the code of the pair_wise_linear_regression algorithm?
Thanks
Feb 26, 2007
Hello friends,
Can you give me some idea about the algorithms used for clustering in data mining?
Please reply.
Aug 17, 2006
I am trying to predict the revenue generated by each person.
My input looks like this:
Month     Person   Revenue
-----------------------------------------
20050101  Person1  $1000
20050101  Person1  $2000
20050201  Person1  $1000
20050101  Person2  $5000
20050201  Person2  $2000
20050201  Person2  $3000
Obviously, for Person1 and 200501 I expect to see $3000 in the MS Time Series Viewer, correct?
Instead I see REVENUE(actual) - 200501 VALUE = XXX,
where XXX is a completely different number.
There are also negative numbers in the forecast area, which is not correct from a business point of view.
Person1, who is a tough guy, tried to shoot me.
What am I doing wrong? Could you please give me an idea of how to extract correct historical and predicted information?
Thank you,
Tim.
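The totals expected above correspond to summing revenue per month and person before the data reaches the model. The Python sketch below shows that pre-aggregation step on the sample rows from the post. Whether duplicate rows per time slice are what causes the unexpected values in the viewer is only an assumption here, and the function name `monthly_totals` is illustrative.

```python
from collections import defaultdict

# Sample rows from the post: (month, person, revenue in dollars)
rows = [
    ("20050101", "Person1", 1000),
    ("20050101", "Person1", 2000),
    ("20050201", "Person1", 1000),
    ("20050101", "Person2", 5000),
    ("20050201", "Person2", 2000),
    ("20050201", "Person2", 3000),
]

def monthly_totals(rows):
    """Sum revenue so that each (month, person) pair appears exactly once."""
    totals = defaultdict(int)
    for month, person, revenue in rows:
        totals[(month, person)] += revenue
    return dict(totals)
```

With the sample data, `monthly_totals(rows)` gives 3000 for Person1 in 20050101, which is the figure expected in the question. In the database, the same result would come from a view that groups by month and person and sums the revenue.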
Mar 28, 2006
Hi,
I want to create a symmetric key that will be encrypted by a certificate. Can you advise me which of the following algorithms is best?
DES, TRIPLE_DES, RC2, RC4, RC4_128, DESX, AES_128, AES_192, AES_256.
I tried using AES_128, AES_192, and AES_256, but it says 'the algorithm specified for the key are not available in this installation of Windows.'
Please tell me which other algorithm is best to use, and why.
Thanks
Gaurav
Jul 25, 2006
Hi
Can anyone please tell me which algorithm is better for customer retention using SQL Server 2005 Analysis Services?
It would be great if someone could also give an example data model, showing the key column and the rest.
Thanks in advance,
Rajesh Ladda
Jul 2, 2007
Currently I want to run a vanilla multivariate regression and get some statistics back about the regression that is built. For instance, besides the coefficients, I also want the two-sided p-values on the coefficients and the R2 of the model.
I've tried playing with the Microsoft_Linear_Regression algorithm and have run into two issues. I'm doing all this programmatically using DMX queries rather than through the BI studio.
(a) I can never get the coefficients from the regression to match with results I would get from running R or Excel. The results are close but still significantly off. I suspect this is because the Linear Regression is just a subset of the Decision/Regression Trees functionality, in which case some kind of Bayesian prior is being incorporated here. Is that the issue? And if so, is there some way to turn off the Bayesian scoring and get a vanilla multivariate regression? I don't see anything in the inputs to the linear regression that would let me do this, and even running Microsoft_Decision_Trees with a few different settings, I can't get the output I'm looking for. If there's no way to turn off the Bayesian scoring, can someone explain to me what the prior being used here is and how Bayesian learning is being applied to the regression?
(b) Using the Generic Tree Viewer, I see that there are a few "statistics" values in the Node_Distribution, but I'm not sure what they're referring to. One of them looks like it might be the MSE. I could play with this some more to find out, but I'm hoping someone here can save me that work and tell me what these numbers are. Hopefully they will constitute enough information for me to rebuild the p-values and the R2.
Thanks!
Wilfred
Oct 18, 2006
I have code for a nearest neighbour algorithm, and I want to build a data mining algorithm using that code.
I have the following link, which includes the source code for a sample plug-in algorithm written in C#
(the managed plug-in framework, available for download here):
http://www.microsoft.com/downloads/details.aspx?familyid=DF0BA5AA-B4BD-4705-AA0A-B477BA72A9CB&displaylang=en#DMAPI
But I am confused about where to insert my algorithm logic.
Jan 20, 2007
What is the algorithm that generates the itemsets in the Association model? I'm looking to possibly use this part of the Association algorithm (i.e. the grouping into itemsets) in a separate plug-in algorithm.
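For readers who want to experiment with itemset generation on its own, here is a rough Python sketch of the classic Apriori approach to finding frequent itemsets. It is a generic textbook version, not Microsoft's implementation, and it makes no claim about how the Association model works internally. The function name `frequent_itemsets` and the use of an absolute support count are illustrative choices.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets found in at least
    `min_support` transactions (classic level-wise Apriori search)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}

    result = dict(current)
    k = 2
    while current:
        # Build size-k candidates from pairs of frequent (k-1)-itemsets and
        # keep only those whose (k-1)-subsets are all frequent (pruning step).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count the support of each surviving candidate.
        current = {}
        for cand in candidates:
            n = sum(1 for t in transactions if cand <= t)
            if n >= min_support:
                current[cand] = n
        result.update(current)
        k += 1
    return result
```

Given the baskets {bread, milk}, {bread, butter}, {bread, milk, butter}, and {milk} with a minimum support of 2, this returns the three single items plus {bread, milk} and {bread, butter}, while {milk, butter} is dropped because it appears only once.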
Jan 18, 2007
Hi Jamie:
I am building data mining models to predict the amount of data storage in GB we will need in the future, based on what we have used in the past. I have a table for each device with the amount of storage on that device for each day going back one year. I am using the Time Series algorithm to build these mining models. In many cases, where the storage size does not change abruptly, the model is able to predict several periods forward. However, when there are abrupt changes in storage size (due to factors such as truncating transaction logs on the database), the mining model will not predict more than two periods. Is there something I can change in the parameters the Time Series algorithm uses so that it can predict farther forward in time, or is this the wrong algorithm for data that has a sawtooth pattern with a negative linear component?
Thanks,