Process Association Algorithm Using ISS
Feb 20, 2008
Hi!
I need to deploy several Association algorithms, and I want to do it using ISS. Can anyone tell me which task I should use for this?
Thanks!
Ezequiel
What is the algorithm that generates the itemsets in the Association model? I'm looking to possibly use this part of the Association algorithm (i.e. the grouping into itemsets) in a separate plug-in algorithm.
I need to create a set of cases for a project that uses the Microsoft Association Rules algorithm to recommend products to customers. My question is: must the case set include all customer transactions for training, or is some percentage of the total transactions sufficient? If I do not use all customer transactions, could the algorithm leave some products out of its itemsets or rules and therefore be unable to make recommendations about them?
thanx
Diego B.
Can anyone tell me how the Business Intelligence Studio calculates the importance of a rule? I can't find the formula. I know some formulas, but the result in SQL Server is completely different.
Thanks!
MS uses the Apriori algorithm in Association Rules, while other DM software has gone to the Novel Algorithm. Can you tell us why MS decided to stay with Apriori? Did you overcome the limitations that it's accused of having? Thanks!
In association rules each rule has a [support, confidence] part. In Microsoft Association Rules there is a [probability, importance] measure for each rule, and importance can be greater than 1.
I found the following in MSDN but I'm not sure if I understood it correctly.
MINIMUM_PROBABILITY: Specifies the minimum probability that a rule is true. For example, setting this value to 0.5 specifies that no rule with less than fifty percent probability is generated.
The default is 0.4.
MAXIMUM_SUPPORT: Specifies the maximum number of cases in which an itemset can have support. If this value is less than 1, the value represents a percentage of the total cases. Values greater than 1 represent the absolute number of cases that can contain the itemset.
The default is 1.
My questions are
1) Can I explain [probability, importance] in terms of [support, confidence]? If yes, how?
2) What does importance > 1 mean?
Thank you in advance.
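To make the two pairs of measures concrete, here is a minimal sketch in Python. It assumes that the rule "probability" corresponds to classic confidence, P(rhs | lhs), and that rule importance is a base-10 log ratio of P(rhs | lhs) to P(rhs | not lhs). That reading is an assumption about how the product scores rules, the product may also apply smoothing, and the baskets and item names below are made up for illustration.

```python
import math

def rule_measures(baskets, lhs, rhs):
    """Return (support, confidence, importance) for the rule lhs -> rhs.

    importance is log10(P(rhs | lhs) / P(rhs | not lhs)); this formula is
    an assumption for illustration. It is None when the ratio is undefined.
    """
    n = len(baskets)
    with_lhs = [b for b in baskets if lhs in b]
    without_lhs = [b for b in baskets if lhs not in b]
    both = sum(1 for b in with_lhs if rhs in b)
    rhs_without_lhs = sum(1 for b in without_lhs if rhs in b)

    support = both / n                                   # fraction of all cases
    confidence = both / len(with_lhs) if with_lhs else None
    importance = None
    if confidence and without_lhs and rhs_without_lhs:
        p_rhs_given_not_lhs = rhs_without_lhs / len(without_lhs)
        importance = math.log10(confidence / p_rhs_given_not_lhs)
    return support, confidence, importance

# Hypothetical baskets, not taken from any real data set.
baskets = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"bread"},
    {"butter"},
    {"milk"},
]
support, confidence, importance = rule_measures(baskets, "bread", "butter")
```

Under this assumed formula, an importance above 1 would mean the right-hand side is more than ten times as likely when the left-hand side is present as when it is absent, and a negative value would mean the left-hand side makes it less likely.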
hi
I'm trying to build a Microsoft association model using the Microsoft Association algorithm. I have:
1) patient table(patientid, name, city)
2) diseases(diseaseid, dieseasename)
There is an M:N [many-to-many] relationship between the above tables, so
3) Patient_diseases(patientid, disease_id). [RELATIONSHIP TABLE]
I am trying to associate city in the patient table --> disease in the diseases table. I want to build an association data mining model and use it on a web form, in such a way that when the user enters a city, the associated diseases are displayed.
Should I select all 3 tables to build the model? Could you help me decide which tables I should select as Case and which as Nested? Which attributes from the tables should I select as key, input, and predictable?
I am using the data mining tutorials on sqlserverdatamining.com to build this model. If I get confused further during model building, please suggest where I can find a complete resource, or tell me here.
I appreciate Mr. Jamie's guidance so far in my academic project. I do have the book 'Data Mining with SQL Server 2005'. I am left with just one day to do this and document it.
Hoping someone can make a suggestion. Your help is much appreciated.
regards
raju
managed plug-in framework that's available for download here: http://www.microsoft.com/downloads/details.aspx?familyid=DF0BA5AA-B4BD-4705-AA0A-B477BA72A9CB&displaylang=en#DMAPI.
This package includes the source code for a sample plug-in algorithm written in C#.
In this source code all the .cs files are written for a clustering algorithm.
If my plug-in algorithm is of the association or classification type, what modifications are required in the source code?
Hi,
I was trying to extract data from the source server using an OLE DB Source and a SQL Server Destination when I encountered this error:
"Transaction (Process ID 135) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.".
What must be done so that, even if the table being queried is locked, I don't experience a deadlock?
cherriesh
Hello all,
I am running into an interesting scenario on my desktop. I'm running developer edition on Windows XP Professional (9.00.3042.00 SP2 Developer Edition). OS is autopatched via corporate policy and I saw some patches go in last week. This machine is also a hand-me-down so I don't have a clean install of the databases on the machine but I am local admin.
So, starting last week after a forced remote reboot (also a policy) I noticed a few of the databases didn't start back up. I chalked it up to the hard shutdown and went along my merry way. Friday however I know I shut my machine down nicely and this morning when I booted up, I was in the same state I was last Wednesday. 7 of the 18 databases on my machine came up with
FCB::Open: Operating system error 32(The process cannot access the file because it is being used by another process.) occurred while creating or opening file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Test.mdf'. Diagnose and correct the operating system error, and retry the operation.
and it also logs
FCB::Open failed: Could not open file C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\Test.mdf for file number 1. OS error: 32(The process cannot access the file because it is being used by another process.).
I've caught references to the auto close feature being a possible culprit, no dice as the databases in question are set to False. Recovery mode varies on the databases from Simple to Full. If I cycle the SQL Server service, whatever transient issue it was having with those files is gone.
As much as I'd love to disable the virus scanner, network security would not be amused. The data and log files appear to have the same permissions as unaffected database files. Nothing's set to read only or archive as I've caught on other forums as possible gremlins. I have sufficient disk space and the databases are set for unrestricted growth.
Any thoughts on what I could look at? If everything were coming up in RECOVERY_PENDING it would make more sense to me than the hit-or-miss type of thing I'm experiencing now.
Dear list
I'm designing a package that uses Microsoft's preplog.exe to prepare web log files to be imported into SQL Server.
What I'm trying to do is convert this command, which works, into an Execute Process Task:
D:SSIS ProcessPrepweblogProcessLoad>preplog ex.log > out.log
The above DOS command works 100%.
However, when I use the Execute Process Task I get this error:
[Execute Process Task] Error: In Executing "D:SSIS ProcessPrepweblogProcessLoadpreplog.exe" "" at "D:SSIS ProcessPrepweblogProcessLoad", The process exit code was "-1" while the expected was "0".
There are two package variables:
User::gsPreplogInput = ex.log
User::gsPreplogOutput = out.log
Here are the task properties
RequireFullFileName = True
Executable = D:SSIS ProcessPrepweblogProcessLoadpreplog.exe
Arguments =
WorkingDirectory = D:SSIS ProcessPrepweblogProcessLoad
StandardInputVariable = User::gsPreplogInput
StandardOutputVariable = User::gsPreplogOutput
StandardErrorVariable =
FailTaskIfReturnCodeIsNotSuccessValue = True
SuccessValue = 0
TimeOut = 0
thanks in advance
Dave
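One thing worth noting about the command above: at a command prompt, the `>` redirection is carried out by the shell, not by the program being run. A process launched directly, with empty arguments, has no shell to open out.log for it. The sketch below, in Python, shows what the shell does on the program's behalf. It uses a child Python process as a stand-in for preplog, so the command and file names are assumptions for illustration only.

```python
import os
import subprocess
import sys
import tempfile

def run_with_redirect(command, output_path):
    """Run `command` (a list of arguments) and write its stdout to output_path.

    This mimics `program args > out.log` at a command prompt, where the
    shell, not the program, opens the output file.
    """
    with open(output_path, "wb") as out:
        completed = subprocess.run(command, stdout=out, check=False)
    return completed.returncode

def demo():
    # Stand-in for the real tool: a child Python process that prints one line.
    command = [sys.executable, "-c", "print('hello')"]
    with tempfile.TemporaryDirectory() as folder:
        output_path = os.path.join(folder, "out.log")
        code = run_with_redirect(command, output_path)
        with open(output_path, "rb") as handle:
            text = handle.read().decode().strip()
    return code, text
```

A workaround often suggested for this kind of setup is to make cmd.exe the executable and pass the whole command line, including the redirection, after /C so that a shell performs the redirect. That has not been verified against preplog here.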
How do I use the execute process task? I am trying to unzip the file using the freeware PZUnzip.exe and I tried to place the entire command in a batch file and specified the working directory as the location of the batch file, but the task fails with the error:
SSIS package "IngramWeeklyPOS.dtsx" starting.
Error: 0xC0029151 at Unzip download file, Execute Process Task: In Executing "C:ETLPOSDataIngramWeeklyUnzip.bat" "" at "C:ETLPOSDataIngramWeekly", The process exit code was "1" while the expected was "0".
Task failed: Unzip download file
SSIS package "IngramWeeklyPOS.dtsx" finished: Success.
Then I tried to specify the exe directly in the Executable property, with the arguments being the location of the zip file and the directory to unzip the files into, but this time it fails with the following message:
SSIS package "IngramWeeklyPOS.dtsx" starting.
Error: 0xC002F304 at Unzip download file, Execute Process Task: An error occurred with the following error message: "%1 is not a valid Win32 application".
Task failed: Unzip download file
SSIS package "IngramWeeklyPOS.dtsx" finished: Success.
The command in the batch file works perfectly and unzips the file when run from the command line, so there is absolutely no problem with the command. I believe it is just the setup of the variables in the Execute Process Task editor under Process. Any input on resolving this will be much appreciated.
Thanks,
Monisha
I am designing a utility which will keep two similar databases in sync. In other words, copying the new data from db1 to db2 and updating the old data from db1 to db2.
For this I am making use of the 'Tablediff' utility which when provided with server name, database, table info will generate .sql file which can be used to keep the target table in sync with the source table.
I am using the Execute Process Task and the process parameters I am providing are:
WorkingDirectory : C:Program Files (x86)Microsoft SQL Server90COM
Executable : C:SQL_bat_FilesSQL5TC_CTIcustomer.bat
The customer.bat file will have the following code:
tablediff -sourceserver "LV-SQL5" -sourcedatabase "TC_CTI" -sourcetable "CUSTOMER_1" -destinationserver "LV-SQL2" -destinationdatabase "TC_CTI" -destinationtable "CUSTOMER" -f "c:SQL_bat_Filessql5TC_CTIsql_filescustomer1"
the .sql file will be generated at: C:SQL_bat_Filessql5TC_CTIsql_filescustomer1.
The Problem:
The Execute Process Task is working fine, i.e., the tables are being compared correctly and the .sql file is being generated as desired. But the task itself is reporting failure with the following error:
[Execute Process Task] Error: In Executing "C:SQL_bat_FilesSQL5TC_CTIpackage_occurrence.bat" "" at "C:Program Files (x86)Microsoft SQL Server90COM", The process exit code was "2" while the expected was "0". ]
Some of you may suggest just setting ForceExecutionResult = Success (in fact this is what I am doing now just to get the program working), but this is not what I want.
Can anyone help ?
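A possible explanation for the exit code is that tablediff, as far as its documentation goes, returns 0 for success, 1 for a critical error, and 2 when the tables differ, so a run that finds differences and writes the .sql file would end with 2. The Python sketch below shows one way a wrapper could tell "differences found" apart from a real failure. The function name and the returned labels are hypothetical.

```python
# Exit codes as documented for tablediff (treat as an assumption and
# confirm against the documentation for your version).
TABLEDIFF_IDENTICAL = 0
TABLEDIFF_CRITICAL_ERROR = 1
TABLEDIFF_DIFFERENCES = 2

def interpret_tablediff_exit(code):
    """Map a tablediff exit code to a label, raising only on real failures."""
    if code == TABLEDIFF_IDENTICAL:
        return "identical"
    if code == TABLEDIFF_DIFFERENCES:
        return "differences"
    raise RuntimeError(f"tablediff failed with exit code {code}")
```

The same idea could be applied inside the batch file by checking the error level after tablediff and exiting with 0 when it is 0 or 2, which would leave the task's expected success value untouched.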
I'm pulling data from an Oracle database and loading it into MS SQL Server 2008. For data type checks during the load, what are my options to ensure that the data being processed won't fail? I'd like to verify each value against the target data type beforehand and, if the format is valid, load it into the destination table; otherwise mark the row with an error flag and push it into an errors table. All of this at the row level. One way I can think of is to load into a staging table, then get the source and destination column data types, compare them, and proceed.
Or should I just try loading the data directly and troubleshoot if it fails (which could be difficult, as I wouldn't know what caused the error)?
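The row-level check described above can be sketched in Python as follows. The column names, the target types, and the date format are hypothetical; the point is only that each row is converted column by column, and a row that fails is routed to an error list together with the name of the column that failed.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical target schema: column name -> converter to the target type.
CONVERTERS = {
    "id": int,
    "amount": Decimal,
    "created": lambda text: datetime.strptime(text, "%Y-%m-%d"),
}

def split_rows(rows):
    """Return (good, bad): converted rows and rejected rows with an error note."""
    good, bad = [], []
    for row in rows:
        converted, error = {}, None
        for column, convert in CONVERTERS.items():
            try:
                converted[column] = convert(row[column])
            except (ValueError, InvalidOperation, KeyError, TypeError) as exc:
                # Record which column failed and why, then stop checking this row.
                error = f"{column}: {type(exc).__name__}"
                break
        if error is None:
            good.append(converted)
        else:
            bad.append({**row, "error": error})
    return good, bad

# Made-up sample rows as they might arrive from the source as text.
sample = [
    {"id": "1", "amount": "10.50", "created": "2008-02-20"},
    {"id": "x", "amount": "1", "created": "2008-02-20"},
    {"id": "3", "amount": "abc", "created": "2008-02-20"},
]
good_rows, bad_rows = split_rows(sample)
```

Keeping the failing column name with each rejected row addresses the worry about not knowing what caused an error.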
Hi Folks,
I am having a table locking issue that I need to start paying attention to, as it's getting more frequent.
The problem is that the data in the tables is live finance data that needs to be changed and viewed almost in real time, so what I have picked up so far is that using table hints may not be a good idea.
I have a guy at work telling me that introducing a data access layer is the only way to solve this. I am not convinced, but I haven't enough knowledge to back up my own feeling (it is an ASP system, not .NET).
Thanks in advance
We are facing a deadlock issue in our web application. The message below is shown:
> Session ID: pwdagc55bdps0q45q0j4ux55
> Location: xxx.xxx.xxx.xxx
> Error in: http://xxx.xxx.xxx.xxx:xxxx/Manhatta...Bar=&Mode=Edit
> Notes:
> Parameters:
> __EVENTTARGET:
> __EVENTARGUMENT:
[code].....
Hi,
I'm trying to upload the ASPNETDB.MDF file to a hosting server via FTP, and every time, when it was uploaded about halfway (40% or 50%),
I would get an error message saying:
"550 ASPNETDB.MDF: The process cannot access the file because it is being used by another process"
and then the upload failed.
I'm using SQL Express. Does anybody know what's the cause?
Thanks a lot
Hi. When I try to start a package manually clicking the Start Debugging button I get this after a little while:
Cannot process request because the process (3880) has exited. (Microsoft.DataTransformationServices.VsIntegration)
How can I prevent this from happening? This happens every time I want to start the package and
every time the process id is different. Here it is 3880.
Darek
Hello,
Let's say (for simplicity), in my site you can do one of two things - look at products and buy products.
I want to build an association structure between my products based on those two actions, but(!) when a user looks at two products it should create a less important association than when the user actually buys those two products.
So basically, I want to apply a different weight depending on which action occurred on my products.
How do I build my structure? How do I query it?
I'm trying to figure out how to build a personalization engine.
If my structure is built with users as the case and products as nested, I'd like to predict the best products per user (rather than associated products) and, if possible, ignore products he has already bought.
How do I do it?
Hi,
I am working on a table that has the following fields: transaction_id, product_name, product_brand, product_size, product_quantity.
FYI, if a customer purchases 3 items, all have the same transaction_id.
I need to use this table (in BIDS) to find associations between different products, but I am unable to do so.
Can anyone tell me which fields should be used as input so that I can predict the associations?
Thanks a lot.
Aashutosh Magdum
When I use MS Association Rules, I don't know how it works in the background. I studied the FP-Growth algorithm, but I have some questions. I don't know what "transaction database" means. Can someone give me an example? Thanks. I know we can store the data in a relational database, but in basket analysis, how is a transaction stored in a relational database?
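As a small illustration of the question above, a "transaction database" in basket analysis is usually just a set of baskets, each listing the items bought together. In a relational table it is commonly stored as one row per (transaction id, item) pair. The Python sketch below groups such rows back into baskets; the rows and item names are made up.

```python
from collections import defaultdict

# Hypothetical relational rows: one row per (transaction_id, item).
rows = [
    (1, "bread"), (1, "butter"), (1, "milk"),
    (2, "bread"), (2, "butter"),
    (3, "milk"),
]

def to_baskets(rows):
    """Group (transaction_id, item) rows into one item set per transaction."""
    baskets = defaultdict(set)
    for transaction_id, item in rows:
        baskets[transaction_id].add(item)
    return dict(baskets)

baskets = to_baskets(rows)
```

In the nested-table layout discussed in these threads, the transaction id would play the role of the case key and the item rows would form the nested table.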
I am taking over administration of SS2005 SP1 (Windows 2003), with the mdf, ndf, and ldf files in
C:Program FilesMicrosoft sql serverMSSQLData
This dir also has two *.cer files.
Apparently no encryption is used
How can I find out what these *.cer files are for?
Hi,
I have a product basket scenario in which I have to recommend contracts to a customer based on the product and the quantity he/she buys. Product quantity is an important factor that influences the customer's purchase of a particular contract.
I have the following tables with me.
Customer product transaction table, Customer Contract transaction table but there is no direct relationship between contract and product in the database. The only way the two can be linked is through the customer.
If I create a mining structure with Customer-Product information as one nested table and Customer-Contract information as another nested table, with the customer being the link between the two, the model shows some irrelevant contract recommendations.
What is the solution to the above problem? Is it because there is no direct relationship between the product and the contract?
How can I overcome this problem?
1) I have the transaction identifier and the attribute in one table.
Can I build an association rules structure without using nested tables?
I tried, but it did not work out...
2) Since it is necessary to use a main and a child table, I cannot build a prediction query.
When I try to add the predict column as a criteria/argument (Field=PredictSupport), I am given the message:
"Nested table column cannot be used as an argument in a data mining function."
I cannot use other columns, because they are not predictable.
I'm wondering if anyone can give me some help with an association model I'd like to setup. It's a typical market-basket analysis, but rather than grouping by individual customers, I'd like to group by customer grouping. (In our database, customers are grouped into categories like: large, small, medium) If this is possible, I'd like to generate the most popular items (so just querying the most probable itemsets), for each customer grouping (I'll refer to this as 'segments' from here on out), and then create a listing of customers in each segment which do not have the most popular items for their segment. I know for this last part I can use reporting services to tackle that problem, however, I'm not really sure how I can really do the rest of this with an association model in SSAS.
Our table structure looks like this:
CustomerTable          PurchasesTable
-------------          --------------
CustomerName (key)     CustomerName
CustomerGroup          PurchasedProduct
And the data is arranged in this fashion:
Customer Table:
CustomerName   CustomerGroup
------------   -------------
A              large
B              large
C              small
Purchases Table:
CustomerName   PurchasedProduct
------------   ----------------
A              ProductA
A              ProductB
B              ProductA
C              ProductC
C              ProductD
I know this is a lot of information but any help you guys may be able to offer would be great! Thanks!
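To pin down the desired output, here is a minimal Python sketch using the sample data from the post. Note that it swaps in plain frequency counting per customer group rather than an association model, so it only illustrates the "most popular items per segment" and "customers missing them" steps. The function names and the top_n parameter are hypothetical.

```python
from collections import Counter, defaultdict

# Sample data copied from the tables in the post.
customers = {"A": "large", "B": "large", "C": "small"}
purchases = [
    ("A", "ProductA"), ("A", "ProductB"),
    ("B", "ProductA"),
    ("C", "ProductC"), ("C", "ProductD"),
]

def popular_by_segment(customers, purchases, top_n=2):
    """Return the top_n most purchased products for each customer group."""
    counts = defaultdict(Counter)
    for customer, product in purchases:
        counts[customers[customer]][product] += 1
    return {
        segment: [product for product, _ in counter.most_common(top_n)]
        for segment, counter in counts.items()
    }

def missing_popular(customers, purchases, top_n=2):
    """Return, per customer, the popular products of their group they lack."""
    owned = defaultdict(set)
    for customer, product in purchases:
        owned[customer].add(product)
    popular = popular_by_segment(customers, purchases, top_n)
    gaps = {}
    for customer, segment in customers.items():
        lacking = [p for p in popular.get(segment, []) if p not in owned[customer]]
        if lacking:
            gaps[customer] = lacking
    return gaps

popular = popular_by_segment(customers, purchases)
gaps = missing_popular(customers, purchases)
```

With this sample data, customer B is the only one missing a popular product of their group (ProductB in the large group).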
Hi there,
I have been trying for a long time to run Microsoft Association Rules on my database.
I have solved the memory leak problem now, but I still can't understand the output rules.
The database contains all the Italian students who took a degree last year. Here in Italy, they have to fill in a survey about their university experience; e.g., they rate their experience with teachers (score from 1 to 5), they say whether or not they want to continue in the university field, and so on.
Most of the rules say:
Int_Stud=1-2, RapDoc>4
Int_Stud is the column where I store the student's intention to continue university: 1 means they want to go on, 2 means they do not want to continue studying. So this rule makes no sense, because (to my mind) it covers all the students: both those who want to continue university and those who do not.
I think the problem is that Visual Studio 2005 and Analysis Services have no understanding of the Int_Stud domain; they have no idea that Int_Stud can have just 2 values and that they are opposites of each other. Is there a solution to this problem? Can I discretize this column?
Even though I know my English is not perfect, I hope I am understandable.
Hello Developers,
I used "add mining model to mining structure" to modify a model so that maximum itemset = 2, min prob = .01, min support = 2.
When I set maximum rows to anything higher than 2000 (the default) I get duplicate rules.
The maximum number of rules returned is exactly 16000 even though I set it higher than that.
Any ideas on the causes?
Thanks
Davy
Why do association itemsets have probabilities associated with them when it's rules that generate probabilities? Any queries I run against my model use these itemset probabilities rather than the probabilities that the rules generate. Moreover, the probabilities generated for these itemsets are far lower than the MINIMUM_PROBABILITY setting in the algorithm properties menu.
I note that there exist three web viewers for data mining algorithms, namely, DMNaiveBayesViewer, DMDecisionTreeViewer and DMClusterViewer. How come there are no viewers for association rules (itemsets, rules, dependency network)? Can you suggest any alternative way of showing such valuable information in a web application?
Hello,
How do I get n items, and only items, that are predicted by specific item(s), either directly or indirectly, as shown in the dependency network diagram?
For instance, the predict function won't work for me, because running this query on AdventureWorks:
SELECT PREDICT([Association].[Products], 5)
From [Association]
NATURAL PREDICTION JOIN
(SELECT (SELECT 'Touring Tire Tube' AS [Model]) AS [Products]) AS t
returns Sport-100 as a second result, although it is not predicted by any means by Touring Tire Tube, as shown in the dependency network diagram.
My query should have returned just one row - Touring Tire.
I understand Mr. MacLennan's explanation provided at http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=282651&SiteID=1 and appreciate the time he took to explain how importance works. However, like the user with username "sang", I also ran the data in BI 2005 and got the same results listed by the aforementioned user. I did this using the following data:
donut  muffin
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
y      y
n      y
n      y
n      y
n      y
n      y
etc.
The rule muffin -> donut has an importance of -0.105302438, which is not the same as Mr. MacLennan's results. I tried switching the roles of a and b in a -> b and using different bases for the logarithms. I don't get the result of -0.105302438 with any of these. I also tried to calculate importance with a small data set I have and can't reproduce the results using Mr. MacLennan's explanation with that data set either. Any thoughts on the discrepancy?
Hi
I am doing the Market basket analysis for a retailer using association rule. The whole data set is huge which contains grocery, clothes and books etc. If I want to check out the relationship between several different clothes brands, (e.g. LEVI'S and adidas), should I just remove all the grocery and books transactions, use the subset which only contains clothes transactions to re-run the association rules? Is this gonna work?
Thanks in advance!