To What Extent Can We Believe The Patterns Discovered By Mining Models?
Jul 16, 2007
Hi, all,
Again, I am unsure how far the results of mining models can be trusted. We can validate the models with the accuracy chart, but to what extent can we trust that? (You never get 100% correct mining models.) If we don't trust the results of the models, then the patterns the models discover are meaningless.
I just need some advice from the experts here to help me convince people of what I got from my mining models.
I am looking forward to hearing from you shortly and thank you very much.
With best regards,
Yours sincerely,
Apr 26, 2006
Hi, all here,
I am wondering where I can store my mining results in the data mining engine. For example, I have results such as accuracy charts, decision trees, and other output from the different mining algorithms I used, so where can I store them for later use in Reporting Services? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
Sep 26, 2006
I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output)
The Lift Chart/Mining Legend in Analysis Services shows three results: Score, Population Correct (%), and Predict Probability (%).
Population Correct, I believe, is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?
However, I can't work out how the other two are derived, in particular the Score. To give a live example, the scores were as follows:
Model Score Pop Correct Pred Probability
Decision Trees 0.83 76.59% 54.28%
Neural Network 0.75 67.63% 50.05%
Ideal Model 100.00%
Can anyone help with this and give a detailed explanation?
Many thanks,
S Rajput
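To make the reading of Population Correct described above concrete, here is a minimal Python sketch of that interpretation only: correct predictions divided by total predictions. It is not taken from Analysis Services, it does not explain Score or Predict Probability, and the function name and sample values are invented for illustration.

```python
def percent_correct(actual, predicted):
    """Return the share of cases where predicted equals actual, in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one case")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * hits / len(actual)


# Hypothetical binary outcomes; these are not the poster's data.
actual = [True, False, True, True, False, False, True, False]
predicted = [True, False, False, True, False, True, True, False]

print(f"{percent_correct(actual, predicted):.2f}%")
```

If the chart is built for one specific predicted state, the figure may instead describe how much of that state is captured at a given population percentage, so this should be treated as one possible reading rather than the definition.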
Nov 27, 2006
Hi, all here,
Thank you very much for your kind attention.
I just found that I am not able to view the accuracy chart for my mining model. The error message is "No mining models are selected for comparison", which is quite strange.
Any guidance? Thank you very much.
With best regards,
Yours sincerely,
May 10, 2007
I've created models with the Decision Trees and Neural Network algorithms that predict a continuous target, but I don't know how to interpret the scores that appear under the scatter accuracy plot. How should I interpret them?
How can I estimate the accuracy of a model created with Time Series? And how can I compare its accuracy with that of models created with the Decision Trees and Neural Network algorithms?
Thanks in advance.
Jul 16, 2007
Hi, guys,
Thanks for your kind attention.
I just want to get things working properly and make the most of the SQL Server 2005 Data Mining engine. Can any of you give me some advice on validating mining models? As we always see, the three measures reported for a mining model are Score, Population Correct, and Predict Probability. So the question is: how can we combine these three measures to judge which model is the best one? And to what extent can we really trust these mining models?
These points matter before we can put the models to work and convince other people who have no idea what is going on inside them. We want to convince them with the results of these models and help them get the most out of their business operations.
By the way, could you explain each of these measures in a bit more detail? Thanks again.
I am looking forward to hearing from you shortly, and thanks a bunch for your help.
With best regards,
Yours sincerely,
Jun 12, 2006
Hi, all here,
I have a question about automating the management of data mining models. In many businesses, patterns change very frequently, so mining models need to be recreated as new rules appear in the data. Can we automate this whole process, for example by automatically assessing the accuracy of a mining model and automatically recreating it based on predefined specifications? Could anyone here give me some ideas about that?
Thanks a lot for any guidance and help for that.
With best regards,
Jul 2, 2007
I am in the process of creating an Integration Services package to automate the process of training mining models and getting predictions. Until recently, I have been processing the models directly from Business Intelligence Studio without any problems. However, when I try to use the exact same training set as an input to the Data Mining Model Training destination, I get several errors. Here is the output:
[Mining Models [1]] Error: Parser: An error occurred during pipeline processing.
[Mining Models [1]] Error: Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.
[Mining Models [1]] Error: Errors in the OLAP storage engine: An error occurred while the 'CPT MODIFIER' attribute of the 'BCCA DMS ~MC-CLAIM LIN~5' dimension from the 'BCCA LRG DMS TEST' database was being processed.
[Mining Models [1]] Error: File system error: The record ID is incorrect. Physical file: . Logical file: .
[Mining Models [1]] Error: Errors in the OLAP storage engine: The process operation ended because the number of errors encountered during processing reached the defined limit of allowable errors for the operation.
[Mining Models [1]] Error: Errors in the OLAP storage engine: An error occurred while the 'BILL TYPE' attribute of the 'BCCA DMS ~MC-CLAIM LIN~5' dimension from the 'BCCA LRG DMS TEST' database was being processed.
[Mining Models [1]] Error: File system error: The record ID is incorrect. Physical file: . Logical file: .
[DTS.Pipeline] Error: The ProcessInput method on component "Mining Models" (1) failed with error code 0x80004005. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
I have not been able to find an answer as to why this is happening. I found a post regarding a similar problem with processing an OLAP cube in SSIS, but it seems that the author of that post never found an answer. Has anyone else here seen similar errors when processing mining models from Integration Services?
Also, if I process the mining models manually then try to run only predictions in SSIS, I get many of the same errors. I'll keep looking into the problem myself, but I would be very grateful if someone in this forum could shed some light on this issue.
Nov 20, 2006
Hi, all experts here,
Thank you very much for your kind attention.
I have a strange problem with SQL Server 2005 data mining models. I selected the input columns for my mining model (which differ from the input columns of its mining structure, since I ignored some of the columns for this model). But the mining model still used all of the input columns from the mining structure rather than the ones I chose for the model.
Could anyone here give me some guidance and advice on that? I really need help with it.
Thanks a lot in advance for any help.
With best regards,
Yours sincerely,
Dec 18, 2007
If you make a view, you can script it in SSMS and get the DDL.
How can I do something similar with mining models/structures and get the DMX?
I see you can easily get XMLA, but I would prefer the "dmx-ddl"...
Dec 12, 2006
Still new to DM and SSIS... any and all help is greatly appreciated!
In SSIS they say that you can use the Analysis Services Processing Task to process a mining model/mining structure; however, I do not see where you can give it a relational table to work from. I know that I can use a data flow to do this, but I wanted to take a different route to process my models if I could, since I don't really need the data flow and what I am trying to do is pretty simple.
That brings me to a more general question: what is the best method for training your models using SSIS? I am building a new model every time the package runs using some variables and the DDL task, running a query on it, and destroying it at the end of the package, but I am having logistical problems training it outside of the data flow. I tried using the DM Query task, but it requires that you output a result set, and I am not sure whether I can use it to create and process models.
I would think that they would just give you a DMX task similar to the SQL task but that does not seem to be the case. Also, when I browse the AS objects via the processing task I can only see the mining structures and not the mining models.
Am I just missing something here?
Dan Meyers
Jul 25, 2007
I'll try to explain what I'd like to do. On my SQL Server 2005, SSAS contains a mining model (a clustering model, in fact).
I'd like to show a detailed diagram built from this model on a web site. This diagram (and this is why I need automation) would depend on the user who is viewing it.
For example, a firm A will see the number of its customers in each cluster, and this information will be different for another firm B
So, I thought about several steps to perform:
1) Feed the model with the data for each firm
2) Build a Visio diagram from the previous data using the DM addins for Visio 2007
3) Generate HTML using the Visio export wizard
4) Publish HTML
And (important) this should be done automatically.
I made some experiments:
Step 1 is easy to perform with SSIS
Step 3 is also easy to do with the Visio SDK (using, among others, the exportAsHtml control) and Step 4 is trivial
I failed to perform step 2, even with the SDK, since creating a diagram from a DM template requires the user to fill in a wizard. In code, I'm able to create a document from a DM template and drop a DM stencil, but when the wizard appears I'm unable to get a handle on it. And even if I could, I don't know whether it would be a "clean" way of doing it.
So my question is, first of all: isn't there an easier way to generate HTML from mining models automatically? And if my approach to this problem is the best (or the "least bad"), how can I generate data mining Visio diagrams from code automatically?
Jul 16, 2007
Hi, all,
Another thing that confuses me: for many of the native algorithms in SQL Server 2005 Data Mining, changing the algorithm settings does not seem to improve the results of the mining models significantly. (The exception is the cluster count setting of the clustering algorithm: when the number of clusters is set to 0, the system chooses the number of clusters automatically, which I assume is the best way of mining the model with this method.)
Any good advice on this will be much appreciated.
I am looking forward to hearing from you shortly, and thanks a lot in advance.
With best regards,
Yours sincerely,
Nov 29, 2006
I am using BI Dev Studio for SS2005 in a research (as opposed to a production) environment. Often I want to compare the results of multiple models using the same attributes. If I switch to a different model, the Design view completely resets. Is there any way to retain the same field names with different models in the Design view?
My current workaround is to give my models similar names with AR, DT, CL, LOG, NN suffixes and make global changes in the DMX.
I have consulted the following without finding an answer:
Thanks for your help,
May 27, 2007
Let's take the following example:
Movie train table:
ID Class
1 +
2 +
3 -
4 +
5 -
Actor train nested table:
ID MovieID Gender
1 1 F
2 1 M
3 1 F
4 1 F
5 2 M
6 2 M
7 2 F
8 3 F
9 3 F
10 4 M
11 4 M
12 4 F
13 4 F
14 5 F
15 5 M
We want to build a classifier model in order to predict the Class of a Movie based on the Gender of the movie's actors. To deal with the nested table, Analysis Services maps each record of the nested table to an attribute of the case table. These attributes are named Actor(n).Gender with n = 1..15, and so they depend on the nested table record numbers. Both the Microsoft Decision Trees and Microsoft Naive Bayes algorithms use these attributes without any modification.
We are implementing a Relational Naive Bayes algorithm and we are planning to aggregate such attributes in order to make them independent of the nested table record numbers.
As a next step we tried to predict some unseen cases, and here we face a very big problem.
Let's take two more tables of unseen cases:
Movie test table:
ID Class
6 +
Actor test nested table:
ID MovieID Gender
1 6 F
2 6 M
3 6 F
4 6 F
16 7 F
17 7 M
18 7 F
19 7 F
20 7 F
21 8 M
22 8 M
23 8 F
Predicting the Class of movie 6 is not a problem, since its actors were included in the training dataset: when the records are mapped to attributes, those attributes already exist in the model. But when you try to predict movies (7 and 8) with unseen actors, all new attributes are simply ignored in the ALGORITHM::Predict call (in_ulCaseValues is zero!) because they do not exist in the model!
What is the solution?
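The aggregation idea mentioned above (making the attributes independent of the nested table record numbers) can be sketched in Python as follows. This is only an illustration of the idea, assuming the aggregate of interest is a per-movie count of each Gender; it is not Analysis Services plug-in code, and the function name is made up. The rows are the test actors for movies 7 and 8 from the post.

```python
from collections import Counter


def gender_counts(actor_rows):
    """Aggregate (actor_id, movie_id, gender) rows into per-movie gender counts.

    The result depends only on how many actors of each gender a movie has,
    not on which actor record numbers appear.
    """
    counts = {}
    for _actor_id, movie_id, gender in actor_rows:
        counts.setdefault(movie_id, Counter())[gender] += 1
    # Convert to plain dicts so the result is easy to compare and print.
    return {movie_id: dict(c) for movie_id, c in counts.items()}


# Test actors for the unseen movies 7 and 8, copied from the tables above.
test_actors = [
    (16, 7, "F"), (17, 7, "M"), (18, 7, "F"), (19, 7, "F"), (20, 7, "F"),
    (21, 8, "M"), (22, 8, "M"), (23, 8, "F"),
]

features = gender_counts(test_actors)
print(features)
```

With attributes of this kind (for example "number of F actors"), an unseen actor ID no longer creates an attribute the model has never seen, which is the property the post is after.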
Feb 19, 2008
Hi all,
I have a very simple time series model that processes without any problem. However, when I run the following query
SELECT PredictTimeSeries(PriceChange, -3, 2)
FROM [TimeSeries]
WHERE [TimeSeries].[Symbol] = 'x'
I get the following error:
TITLE: Microsoft SQL Server 2005 Analysis Services
Error (Data mining): A time series prediction was requested with a start time further in the past than the internal models of the mining model, TimeSeries, specified in the HISTORIC_MODEL_GAP and HISTORIC_MODEL_COUNT parameters can process.
The following is the excerpt of the mining model script related to the two parameters:
<Value xsi:type="xsd:string">Previous</Value>
<Value xsi:type="xsd:int">1</Value>
<Value xsi:type="xsd:int">10</Value>
These HISTORIC_MODEL_GAP (1) and HISTORIC_MODEL_COUNT (10) should accommodate PredictTimeSeries(PriceChange, -3, 2). Could anyone shed some light on this?
Nov 24, 2006
Hi, all here,
Thank you very much for your kind attention.
I am wondering whether it is possible to use SSIS to split a data set into a training set and a test set and feed them directly to my data mining models, without saving them somewhere where they would occupy too much space. I really need guidance on that.
Thank you very much in advance for any help.
With best regards,
Yours sincerely,
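One general way to split rows into training and test sets without storing either set is to decide per row, from a hash of its key, which side it belongs to. The Python sketch below illustrates only that idea; it is not SSIS code, and the function name, the 30% default, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib


def assign_split(case_key, test_percent=30):
    """Deterministically assign a case to 'train' or 'test' from its key.

    The same key always lands in the same set, so the split can be
    recomputed while streaming rows instead of being saved.
    """
    digest = hashlib.sha256(str(case_key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return "test" if bucket < test_percent else "train"


# Hypothetical case keys; in practice this would be the table's key column.
for key in (101, 102, 103, 104, 105):
    print(key, assign_split(key))
```

Because the assignment depends only on the key, the training pass and the test pass can each re-read the source and filter on the fly, at the cost of reading the source twice.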
Oct 25, 2002
Can anyone let me know why the extent scan fragmentation is so high?
I do have a clustered index on this table.
The fill factor is 0, and this table has a high insert rate because it is used to maintain history.
Rebuilding the indexes did not help.
DBCC SHOWCONTIG scanning 'ACCOUNTS' table...
Table: 'ACCOUNTS'(1061578820); index ID: 1, database ID: 5
TABLE level scan performed.
- Pages Scanned................................: 728157
- Extents Scanned..............................: 91759
- Extent Switches..............................: 93305
- Avg. Pages per Extent........................: 7.9
- Scan Density [Best Count:Actual Count].......: 97.55% [91020:93306]
- Logical Scan Fragmentation ..................: 0.33%
- Extent Scan Fragmentation ...................: 99.99%
- Avg. Bytes Free per Page.....................: 76.6
- Avg. Page Density (full).....................: 99.05%
DBCC execution completed. If DBCC printed error messages, contact your system administrator
Thanks in advance,
Shades
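For readers trying to follow the figures in the output above, the Scan Density line can be reproduced from the other lines. The Python sketch below assumes that Best Count is the page count divided by 8 pages per extent, rounded up, and that Actual Count is Extent Switches plus one; both assumptions match the numbers shown here, but the function name is made up. It does not derive Extent Scan Fragmentation, which is a separate measure.

```python
import math

PAGES_PER_EXTENT = 8  # a SQL Server extent holds eight pages


def scan_density(pages_scanned, extent_switches):
    """Return (best_count, actual_count, density_percent).

    Assumes best count = ceil(pages / 8) and actual count = switches + 1,
    which is consistent with the SHOWCONTIG output quoted in the post.
    """
    best_count = math.ceil(pages_scanned / PAGES_PER_EXTENT)
    actual_count = extent_switches + 1
    return best_count, actual_count, 100.0 * best_count / actual_count


best, actual, density = scan_density(728157, 93305)
print(best, actual, f"{density:.2f}%")
```

So a Scan Density of 97.55% says the scan visited only slightly more extents than the minimum needed, even though Extent Scan Fragmentation is reported as 99.99%.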
Jul 18, 2007
Hello. When reviewing DBCC SHOWCONTIG immediately after reindexing all indexes on a database, I see that ExtentFragmentation has values like 50 to 70%. These are SQL 2005 tables with clustered PKs, no large varchars/blobs, and at least 100 pages in the index. The numbers related to PAGE fragmentation are OK after reindexing, but not the EXTENT fragmentation numbers.
I noticed the drive needs defragmenting at the disk level. Is that a reason why reindexing doesn't fix the extent fragmentation numbers? Any other ideas on this? I can try defragmenting the disk over the weekend, taking the database offline then, but are there any other thoughts on why the extents show these high percentages? Is there a command to reset them that maybe isn't being run? For example, must I run update usage to get valid extent fragmentation numbers?
If there were MANY autogrows on the files, is that a different level of fragmentation? And how could all those small pieces of files be pulled back together? Thanks, Bruce
May 6, 2004
I'm facing this problem in my DB:
There are some tables that receive inserts and deletes daily. These tables have 3 nonclustered indexes, including the PK.
I noticed that the space used by these tables grows day after day, even though deletes occur daily.
These inserts and deletes follow the keys of the PK in ascending order.
I ran DBCC SHOWCONTIG on the tables and got results like this:
- Avg. Bytes Free per Page.....................: 7996.3
When I change the PK to clustered, this problem doesn't happen.
You can see the consequence of this: the users complain that the DB is out of space, but it's not true!
Could anybody help me understand why the extents are not being deallocated?
Thanks for the help!
Dec 28, 2007
On some tables when I run DBCC ShowContig followed by DBReindex followed by ShowContig I notice Extent Scan Fragmentation actually increases. Why does this happen? Below are the SHOWCONTIG results after running DBReindex three times.
After First DBReindex
- Pages Scanned................................: 986
- Extents Scanned..............................: 124
- Extent Switches..............................: 123
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 100.00% [124:124]
- Logical Scan Fragmentation ..................: 0.00%
- Extent Scan Fragmentation ...................: 47.58%
- Avg. Bytes Free per Page.....................: 91.0
- Avg. Page Density (full).....................: 98.88%
After Second DBReindex
- Pages Scanned................................: 986
- Extents Scanned..............................: 124
- Extent Switches..............................: 123
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 100.00% [124:124]
- Logical Scan Fragmentation ..................: 0.00%
- Extent Scan Fragmentation ...................: 20.16%
- Avg. Bytes Free per Page.....................: 91.0
- Avg. Page Density (full).....................: 98.88%
After Third DBReindex
- Pages Scanned................................: 986
- Extents Scanned..............................: 124
- Extent Switches..............................: 123
- Avg. Pages per Extent........................: 8.0
- Scan Density [Best Count:Actual Count].......: 100.00% [124:124]
- Logical Scan Fragmentation ..................: 0.00%
- Extent Scan Fragmentation ...................: 67.74%
- Avg. Bytes Free per Page.....................: 91.0
- Avg. Page Density (full).....................: 98.88%
Thanks, Dave
Jul 20, 2005
Hello everyone, we dropped the clustered & nonclustered indexes on a table, then rebuilt them. Logical fragmentation is near zero, but extent fragmentation is about 40%. How can this be if the indexes are brand new?
Apr 11, 2007
I am trying to model data in Analysis Services with the Advanced > Create Mining Model function in the Excel add-in. I am having trouble creating an association model that works like the Associate button next to the Advanced button.
The format of my data is like this
OrderID Product
100 Bike
100 Helmet
100 Shoes
200 Helmet
200 basketball
200 Bat
300 Shoes
300 Socks
The Associate button works perfectly, since it asks me which column is the transaction ID (OrderID) and which column I am trying to predict (Product). The advanced Create Mining Model option asks me to determine what the columns are...
OrderID=key Product=Input+Predict?
When I run the advanced Create Mining Model option with the association algorithm, I get a browser that shows no rules and only one-item itemsets (each product, but no combinations of products).
Does anyone know what I have to do to get it to work like the Associate button?
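One possible reading of the symptom above is that each row is being treated as its own case instead of the rows being grouped by OrderID. The Python sketch below is not the add-in's algorithm; it only illustrates that reading by counting itemset support both ways on the sample data from the post. The function and variable names are invented for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# The sample rows from the post: (OrderID, Product).
rows = [
    (100, "Bike"), (100, "Helmet"), (100, "Shoes"),
    (200, "Helmet"), (200, "basketball"), (200, "Bat"),
    (300, "Shoes"), (300, "Socks"),
]


def itemset_support(rows, size):
    """Count how many cases contain each itemset of the given size."""
    baskets = defaultdict(set)
    for case_id, product in rows:
        baskets[case_id].add(product)
    support = Counter()
    for items in baskets.values():
        for combo in combinations(sorted(items), size):
            support[combo] += 1
    return support


# Grouped by OrderID: two-item combinations exist.
pairs = itemset_support(rows, 2)

# Each row as its own case: only one-item itemsets are possible.
row_level = [(i, product) for i, (_order, product) in enumerate(rows)]
row_pairs = itemset_support(row_level, 2)

print(len(pairs), len(row_pairs))
```

If this reading is right, the fix would be on the modeling side, so that the products of one order end up in the same case, but that is an assumption and not something confirmed in this thread.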
Oct 18, 2006
Dear friends,
I encounter a serious problem.
I would like to develop an application that can create data mining structures and a mining model in SQL Server 2005 with VB.NET. I tried the code from chapter 14 of the book Data Mining with SQL Server 2005, but it did not work. Any good ideas?
Please help me.
Best regards,
Oct 20, 2006
Thank you very much for your help.
I tried the code you gave in your answer, and the errors I see are the following; they are more or less the same ones I had previously.
1. Any line that declares As Server or As Database, such as
Public Function CreateDatabase(ByVal srv As Server, ByVal databaseName As String) As Database
reports that type Database is not declared and, likewise, that type Server is not declared, and it does not offer me any option.
2. In addition, As DataSource, As RelationalDataSource, As RelationalDataSourceView, As ScalarMiningStructureColumn, and As DataSourceViewBinding all report that the type is not declared.
3. Finally, for
mc = New MiningModelColumn("Yearly income", Utils.GetSyntacticallyValidID("Yearly income", Type.GetType(MiningModelColumn)))
I get the error that it is not accessible in this context because it is 'Private'.
I have some more problems, but I think that solving the ones above will solve the rest.
Thank you anyway.
Best regards,
PhD student
Jul 18, 2006
I perform data mining on all products and a specific product category.
Do I need to create 2 data source views, one for all products and the other one for the specific product category?
Afterward, I run the Data Mining Wizard 2 times to create 2 mining structures.
I also need to add the same mining model (e.g. Bayes, Cluster) to each of these mining structures.
Is there any simple way to do it?
Sep 14, 2007
I cannot get the Mining Accuracy Chart and Mining Model Prediction tabs to work.
Please tell me how to do it, including how to filter the input data used to generate the lift chart and how to select the predictable mining model columns to show in the lift chart.
Sep 29, 2015
I followed the tutorial posted at [URL] ...
Everything was ok until the last step where I had to process the mining structure which resulted in a warning
"Informational (Data mining): Decision Trees found no splits for model, Tbl Decision Tree Example."
What does this error mean? How do I resolve it? Also, I only see the first level in the Mining Model Viewer, I don't see the levels 2 and 3.
Feb 23, 2007
Hi, all experts here,
I would like to know whether there is any way to integrate third-party data mining packages with the SQL Server 2005 data mining algorithms, so that we can compare all of them and get the best results when training models.
I am looking forward to hearing from you.
Thanks a lot.
With best regards,
Yours sincerely,
May 31, 2006
Hoping someone will have a solution for this error
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem that the data type of the column used in the mining structure is Long while the underlying field in the cube has a type of BigInt, or am I barking up the wrong tree?
Apr 30, 2015
I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project:
"Error 6
Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid."
I am finding it hard to locate the content that is not valid. I've been trying to find an answer to this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
Jun 4, 2015
Having successfully created :
- a data mining structure with about 80 columns.
- a data mining model using Microsoft_Decision_Trees with 2 prediction columns.
I thought I would then explore the possibility of having more than 2 prediction columns, in this case 20.
I get an error message and I can't work out:
a) whether this is because there is a limit on the number of prediction columns, and where that maximum is stated;
b) whether something else has become corrupted;
c) whether there is a known bug, and whether the error message is meaningful or not.
Either way, I'm unable to complete the Data Mining Wizard.
The error message is: Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.
Oct 25, 2007
Hi all,
I am using Microsoft_Time_Series and have set HISTORIC_MODEL_GAP to various values (from 1 to 21). I always get this error:
Error (Data mining): The 'HISTORIC_MODEL_GAP' data mining parameter is not valid for the 'My Time Series' model.
In the Algorithm Parameters window, this parameter is not there by default, so I have to add it.
Any tip will be greatly appreciated.