Can We Define Multiple Key Columns For A Mining Structure?
Jul 10, 2007
Hi, all,
I just found that we cannot define multiple key columns for a mining structure in the SQL Server 2005 Data Mining engine. Is there another way to define multiple key columns for a mining structure? In many cases the table we are mining has a composite key made up of several foreign keys, e.g. a fact table with transaction information and other foreign keys. If I cannot define the composite key for this fact table, will I have to add a named calculation in the data source view to produce a single key column based on the original composite key columns? Is that the best way to solve this problem, or are there other alternatives?
I hope my question is clear. I am looking forward to hearing from you shortly, and thanks a lot in advance for your advice and help.
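The named-calculation workaround mentioned above comes down to deriving one key value from the composite key columns. Below is a minimal Python sketch of that derivation, not the data source view expression itself. The function name make_case_key, the sample column names and the "|" separator are assumptions made for illustration. The separator matters because plain concatenation would map the pairs (1, 23) and (12, 3) to the same key.

```python
def make_case_key(*parts, sep="|"):
    """Join the parts of a composite key into a single string key.

    Hypothetical helper: the name and the default separator are
    illustrative assumptions, not part of any SQL Server API.
    """
    if not parts:
        raise ValueError("at least one key part is required")
    pieces = []
    for part in parts:
        if part is None:
            raise ValueError("composite key parts must not be NULL/None")
        text = str(part)
        if sep in text:
            # A separator inside a part could make two different keys collide.
            raise ValueError(f"key part {text!r} contains the separator {sep!r}")
        pieces.append(text)
    return sep.join(pieces)


# Made-up fact rows with a two-column composite key.
fact_rows = [
    {"CustomerID": 1, "ProductID": 23, "Amount": 10.0},
    {"CustomerID": 12, "ProductID": 3, "Amount": 7.5},
]
keys = [make_case_key(row["CustomerID"], row["ProductID"]) for row in fact_rows]
```

Without the separator both rows would produce the key "123", which is why the sketch refuses parts that already contain it.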
I would like to develop an application that can create data mining structures and a mining model in SQL Server 2005 with VB.NET. I tried the code from chapter 14 of the book Data Mining with SQL Server 2005, but it did not work. Any good ideas?
Thank you very much for your help. The errors I see in the code you gave in your answer are listed below; they are more or less the same ones I had previously.
I tried the code, but I initially encountered the following problems.
1. Any line that declares As Server or As Database, such as Public Function CreateDatabase(ByVal srv As Server, ByVal databaseName As String) As Database, gives me the error that type Database is not declared, and likewise that type Server is not declared, and it does not offer me any option to fix it.
2. In addition, As DataSource, As RelationalDataSource, As RelationalDataSourceView, As ScalarMiningStructureColumn and As DataSourceViewBinding all give me the error that the type is not declared.
3. Finally, for mc = New MiningModelColumn("Yearly income", Utils.GetSyntacticallyValidID("Yearly income", Type.GetType(MiningModelColumn))) I am told that it is not accessible in this context because it is 'Private'. I have some more problems, but I think that solving the ones above will solve the rest.
I perform data mining on all products and a specific product category. Do I need to create 2 data source views, one for all products and the other one for the specific product category? Afterward, I run the Data Mining Wizard 2 times to create 2 mining structures. I also need to add the same mining model (e.g. Bayes, Cluster) to each of these mining structures. Is there any simple way to do it?
I just found that I am not able to view the accuracy chart for my mining model. The error message is: no mining models are selected for comparison. This is quite strange.
I am hoping someone will have a solution for this error.
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem that the data type of the column used in the mining structure is Long while the underlying field in the cube has a type of BigInt, or am I barking up the wrong tree?
I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project:
"Error 6 Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid. 0 0 " I am finding it hard to locate the content that is not valid. I've been trying to find an answer to this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
I've tried those two operations in Management Studio. Although we can create a mining structure and a mining model in Management Studio, we cannot process the Analysis Services database.
(1) I create only a mining structure through CREATE MINING STRUCTURE. No error is reported. But if I process the Analysis Services database in Management Studio, I always get this error:
'Error : The '<mining_structure__name>' structure does not contain bindings to data (or contain bindings that are not valid) and cannot be processed.'
I then tried to create it by creating and running an XMLA script. It was successful. However, it's much harder to learn XMLA.
If any of you have created an Analysis Services database in Management Studio and then created a mining structure in the same place using a DMX script, were you able to process the database?
(2) Is there any use for the CREATE MINING STRUCTURE operation without binding it to a table? The examples I have seen so far do not show how to bind it. In my experience, processing the Analysis Services database with such a mining structure is doomed to fail.
(3) Is there any way to create a mining structure through the CREATE MINING STRUCTURE operation in Management Studio and use the RELATED TO clause to bind it to anything? The documentation only describes that clause as: Relationship to an attribute column (mandatory only if it applies), indicated by the RELATED TO clause.
(4) If this is the case, is there any use for the CREATE MINING STRUCTURE operation? If we use BI Dev Studio, it's much easier to use the wizard.
(5) I found I cannot create a mining model inside a mining structure through operation CREATE MINING MODEL. If you call that operation, you end up having a mining model and a mining structure with the same name. I found that in order to create a mining model inside a mining structure you have to call operation ALTER MINING STRUCTURE ADD MINING MODEL. Is it true this is the only way?
I am working with several tables, but for now I will mention just four: one fact table (named Usage) and three dimension tables, Periods, Products, and Regions. The fact table contains references to the dimension tables. Table Periods contains two other columns, month and year.
I created a cube containing columns from those four tables. Deployment was successful. Trouble comes when I want to create a mining structure using Time Series containing these columns:
- Period
- Amount (of money)
- Product name
- Region name
When I choose to use cube (instead of table) as source for mining structure, I'm forced to choose only one dimension (among the Periods, Products, and Regions). Whatever dimension I choose I end up being unable to use the column period as the Time-Key column. Effectively I cannot use Time Series method since I cannot use the column period.
(1) Why is this so (why does Visual Studio force us to use only one dimension from the cube)?
(2) Why does Visual Studio eliminate the column period, the column that has a relationship with the time dimension?
(3) What is the use of the cube for mining anyway? Is there still any use for it?
(4) What is the solution to the kind of problem I face?
I know this is a common OLAP error, but in fact I'm having trouble with it while trying to process a DM mining structure. I'm currently working on a website that gets data from its users and analyzes it using SSAS. The thing is, each time we add a new "analysis criterium" (sorry, I'm trying to translate our French BI language into English...), we have to build a new mining structure, which needs data about users who have actually answered the question associated with this criterium. Sometimes there are thousands of them, and other times only dozens, which is the case for the structure I'm having trouble with.
I have only about 200 tuples in the learning set, so many of the common criteria weren't filled in. I removed them using a stored procedure before feeding the structure, so that no column contains only "null" values.
Of course, I know that 200 learning cases is really not enough to build an accurate model, but the purpose was just a proof of concept for machine driven Mining Structure building, and that was supposed to work even with so few cases. When I process the MS, it fires: (Sorry it's in french, translation follows) Erreurs dans le moteur de stockage OLAP : La clé d'attribut est introuvable : Table : _x0032_0_EtudeIphone_Apprentissage, Colonne : EtudeIphone, Valeur : le nouvel iPhone (téléphone tactile et musical dApple). Erreurs dans le moteur de stockage OLAP : La clé d'attribut a été convertie en un membre inconnu parce que cette dernière est introuvable. Attribut Id de la dimension : 20_EtudeIPhone ~MC-Id de la base de données : ClassificationVDCE, Enregistrement : 2.
Roughly translated, it says: "Errors in the OLAP storage engine: The attribute key cannot be found. Table: <StrangeTable>, Column: <MyPredictableColumn>, Value: <OneOfTheInterestingValues>. Errors in the OLAP storage engine: The attribute key was converted to an unknown member because it could not be found. Attribute Id of the dimension..."
Why? Too few cases? I have structures based on the same template but associated with other criteria and they work perfectly.
I'm ready to answer any question, and give any detail. Thanks in advance.
Can anyone spot what I am missing here? The problem is that I am getting a null object for e.TextData in the t_OnEvent(object sender, TraceEventArgs e) function below. I am trying to get event notifications while processing the data mining structure.
I have quite a strange problem with a mining structure built on an OLAP data source. I am not able to edit the mining structure in the mining structure editor: the whole data source view within the editor is greyed out. Could anyone here please give me any advice on that?
Thank you very much in advance for any help.
I've tried to use the EXPORT command to export a mining model from Management Studio, but it reports that the export statement is not supported for OLAP mining structures. I've checked the EXPORT (DMX) reference and can't see any note saying that it is not applicable to OLAP structures.
- a data mining structure with about 80 columns.
- a data mining model using Microsoft_Decision_Trees with 2 prediction columns.
I thought I would then explore the possibility of having more than 2 prediction columns, in this case 20.
I get an error message and I can't work out:
a) if this is because there's a limit on the maximum number of prediction columns, and where that maximum is stated;
b) if something else has become corrupted;
c) if there's a known bug, and whether the error message is meaningful or not.
Either way, I'm unable to complete the data mining wizard.
The error message is: Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.
I posted a related thread before about the error below, which I got when processing a dimension. It seems that the "ClearCache" solution cannot resolve the issue when I want to process a mining structure.
OLE DB error: OLE DB or ODBC error: There is not enough procedure cache to run this procedure, trigger, or SQL batch. Retry later, or ask your SA to reconfigure SQL Server with more procedure cache. ; Sort failed because there is insufficient procedure cache for the configured number of sort buffers. Please retry the query after configuring lesser number of sort buffers.
Could someone please give me some suggestions? Your help will be much appreciated :-)
The SQL below is the start of a massive stored procedure for comparing two datasets, the output of which will go onto a report.
I was wondering if I could call a SQL Server procedure that would tell me the names of all the columns produced by this SP, so I can print them out and code the report more easily.
SELECT stk.StockNumber, stk.DefaultImageName,
       tVTP.Make as PolMake, stkV.VehicleMake,
       tVTP.Model as PolModel, stkV.VehicleModel,
       tVTP.ModelNo as PolModelNo, stkV.VehicleModelNo,
       tVTP.EngineNumber as PolEngineNumber, Stk.EngineNumber,
       tVTP.comHeadLightNumber as PolHeadLightNumber, Stk.comHeadLightNumber,
       tVTP.comTailLightNumber as PolTailLightNumber, Stk.comTailLightNumber,
       tVTP.comBumperLightNumber as PolBumperLightNumber, Stk.comBumperLightNumber,
       tVTP.comCornerLightNumber as PolCornerLightNumber, Stk.comCornerLightNumber,
       tVTP.Chassis as PolChassis,
       --Drive Train
       tVD.DriveTrainDescription as POlDriveTrainDescription, StktVD.DriveTrainDescription,
       --Body Type
       tVBT.BodyTypeDescription as PolBodyDescription, StkVBT.BodyTypeDescription
FROM tblStock Stk
--JOINS FOR THE Policy Definition
INNER JOIN tblVehicles V ON V.VehicleID = Stk.VehicleID
INNER JOIN tblVehicleType_Policy tVtP ON tVTP.VehicleMaster = V.VehicleMaster AND tVTP.Make = V.VehicleMake AND tVTP.Model = V.VehicleModel AND tVTP.ModelNo = V.VehicleModelNo
INNER JOIN tblVehicleDriveTrain tVD ON tVD.vehicleDrivetrainID = tVTP.DrivetrainID
INNER JOIN tblvehicleBodyType tVBT ON tVBT.VehicleBodyTypeID = tVTP.BodyTypeID
--JOINS FOR the Stock Definition
INNER JOIN tblVehicles STkV ON StkV.VehicleID = Stk.VehicleID
INNER JOIN tblvehicleBodyType StkVBT ON StkVBT.VehicleBodyTypeID = Stk.BodyTypeID
INNER JOIN tblVehicleDriveTrain StktVD ON StktVD.vehicleDrivetrainID = stk.DrivetrainID
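One way to get the column names of a result set without reading any rows is to ask the client library for the result metadata. The sketch below shows the idea with Python's built-in sqlite3 module, so it runs without a server. The two small tables and the shortened query are stand-ins invented for illustration, not the real schema above. On SQL Server 2012 and later, the system procedure sp_describe_first_result_set is, as far as I know, the closest server-side counterpart.

```python
import sqlite3


def result_column_names(conn, query):
    """Return the column names a query produces, in order.

    cursor.description is filled in by the DB-API driver once the
    statement has been executed, even if it returns zero rows.
    """
    cursor = conn.execute(query)
    return [column[0] for column in cursor.description]


# Stand-in schema (an assumption for the example, not the poster's tables).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblStock (StockNumber INTEGER, EngineNumber TEXT)")
conn.execute("CREATE TABLE tblPolicy (StockNumber INTEGER, EngineNumber TEXT)")

query = """
SELECT stk.StockNumber  AS StockNumber,
       pol.EngineNumber AS PolEngineNumber,
       stk.EngineNumber AS StkEngineNumber
FROM tblStock stk
JOIN tblPolicy pol ON pol.StockNumber = stk.StockNumber
"""
names = result_column_names(conn, query)
```

Every column is aliased in the sketch so that the reported names do not depend on how the driver labels unaliased expressions.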
Answers for either Sql2k or Sql2k5 are welcome in this thread. When we have source/destination files, we usually want to define their properties, the width of each field and so on. My question is related to this: how can such manual tasks be done via scripting inside the ETL itself? They become tedious when there are more than 20 columns.
Is it possible? I think so for 2005, but for 2000 I have no idea how to begin. The issue arises when a programmer must alter lots of columns because, for example, a new file format is released from the mainframe.
I am a bit confused about model evaluation (lift chart): should we map all the columns between the mining structure and the case table? For predictive models we have a predict column, so shouldn't we leave the predictive column unmapped between the mining structure and the case table? But it seems we are not allowed to omit the predictive column mapping between the mining structure and the case table.
Why is that? Could any experts here give me some explanation on that?
Hope my question is clear for your help.
Thanks a lot and I am looking forward to hearing from you shortly.
I have two tables, A and B. In table A, I need to define the primary key as a combination of 2 columns, and this primary key will be a foreign key in table B. Based on this PK and FK, I'll be writing a join to get the second column in table B.
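A small sketch of the setup described above: a two-column primary key in A, a matching two-column foreign key in B, and a join on both columns. It uses Python's built-in sqlite3 module so that it is runnable as-is. The column names KeyPart1, KeyPart2, Name and Detail are made up for the example, and the SQLite DDL is only an approximation of what the T-SQL would look like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE A (
    KeyPart1 INTEGER NOT NULL,
    KeyPart2 INTEGER NOT NULL,
    Name     TEXT,
    PRIMARY KEY (KeyPart1, KeyPart2)          -- composite primary key
);
CREATE TABLE B (
    BID      INTEGER PRIMARY KEY,
    KeyPart1 INTEGER NOT NULL,
    KeyPart2 INTEGER NOT NULL,
    Detail   TEXT,
    FOREIGN KEY (KeyPart1, KeyPart2)          -- composite foreign key
        REFERENCES A (KeyPart1, KeyPart2)
);
INSERT INTO A VALUES (1, 10, 'first'), (1, 20, 'second');
INSERT INTO B VALUES (100, 1, 10, 'detail for first'),
                     (101, 1, 20, 'detail for second');
""")

# The join has to compare both key columns, not just one of them.
joined = conn.execute("""
    SELECT A.Name, B.Detail
    FROM A
    JOIN B ON B.KeyPart1 = A.KeyPart1 AND B.KeyPart2 = A.KeyPart2
    ORDER BY B.BID
""").fetchall()
```

Joining on only one of the two columns would pair every B row with every A row that shares that single value, which is the usual mistake with composite keys.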
I have 7 source databases and one target database, all using the same structure. The structure is made of 10 tables, with foreign key constraints.
I need to merge the source databases into the target (which won't have any data before that process, but will already have the correct schema), and to keep the relationships between the records.
I know how to iterate over the source databases (with SMO foreach), but I'd like to know if someone can advise the best copy method for that context in SSIS ? (I don't want to keep the primary keys, but I need to keep the relationships...)
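The requirement of new primary keys with the relationships kept intact is usually met by loading parent tables first and keeping a map from each old key to the new one, then translating foreign keys through that map when loading the children. The Python sketch below only illustrates that remapping idea with in-memory lists; it is not an SSIS package, and the parents/children layout is an assumption made for the example.

```python
def merge_sources(sources):
    """Merge several source databases into one target, issuing new keys.

    Each source is a dict with:
      "parents":  list of (old_id, name)
      "children": list of (old_id, parent_old_id, value)
    Returns (target_parents, target_children) with freshly assigned ids.
    """
    target_parents, target_children = [], []
    next_parent_id, next_child_id = 1, 1
    for source in sources:
        # The old->new map is per source: the same old id can occur in each one.
        id_map = {}
        for old_id, name in source["parents"]:
            id_map[old_id] = next_parent_id
            target_parents.append((next_parent_id, name))
            next_parent_id += 1
        for _old_id, parent_old_id, value in source["children"]:
            # Translate the foreign key through the map built above.
            target_children.append((next_child_id, id_map[parent_old_id], value))
            next_child_id += 1
    return target_parents, target_children


# Two made-up sources that both use id 1 for different parents.
source_a = {"parents": [(1, "alpha")], "children": [(1, 1, "a-child")]}
source_b = {"parents": [(1, "beta")], "children": [(1, 1, "b-child")]}
parents, children = merge_sources([source_a, source_b])
```

With ten tables the same pattern would repeat per table in dependency order, with one map per parent table.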
How do I correctly choose the key column in a "Mining Structure" for Microsoft Analysis Services?
I have table:
"Incoming goods"
Create table Income (
    ID int not null identity(1, 1),
    [Date] datetime not null,
    GoodID int not null,
    PriceDeliver decimal(18, 2) not null,
    PriceSalse decimal(18, 2) not null,
    CONSTRAINT PK_Income PRIMARY KEY CLUSTERED (ID),
    CONSTRAINT FK_IncomeGood foreign key (GoodID) references dbo.Goods ( ID )
)
I'm trying to build a relationship (regression) between “Price Sale” from Good and “Price Deliver”. But I do not know which column is the better choice as the “key column”: ID or GoodID?
I am wondering where I can store my mining results in the data mining engine. For example, I have mining results such as accuracy charts, decision trees, and other formats of results, depending on the mining algorithms I used. Where can I actually store these results for later use by Reporting Services? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
Can someone please assist? I have no problem using the provided Algorithms (NaiveBayes, Decision Tree, etc) from SQL Server 2005 Data Mining. For example: If I want to predict whether the customers want to buy bike from the following data, then I use Age, Salary, Gender as input/attribute/feature selection and BuyBike column as "Predict" column.
Table
Age  Salary  Gender  BuyBike
------------------------------------
However, say that I have 10,000 types of bikes to predict. How do I do that?
Age  Salary  Gender  BuyBike1  BuyBike2  BuyBike3  ......  BuyBike10000
------------------------------------------------------------------------------------
Are there any online resources discussing this issue? I am desperately trying to solve this problem. Please assist!
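One common way to avoid thousands of predictable columns is to reshape the data so that each purchase becomes its own row, keyed by customer, which is the shape a nested table expects. Whether that fits this particular problem is an assumption on my part, and the CustomerID column below does not appear in the post; it is added because the long form needs something to tie rows back to a case. A minimal Python sketch of the reshaping:

```python
def wide_to_long(rows, id_field, prefix):
    """Turn wide flag columns (BuyBike1, BuyBike2, ...) into (id, item) pairs.

    Only flags with a truthy value produce a row, so the long form holds
    one row per actual purchase rather than one column per bike type.
    """
    long_rows = []
    for row in rows:
        for field, value in row.items():
            if field.startswith(prefix) and value:
                long_rows.append((row[id_field], field[len(prefix):]))
    return long_rows


# Made-up sample in the wide layout from the question.
wide_rows = [
    {"CustomerID": 1, "Age": 34, "Salary": 50000, "Gender": "F",
     "BuyBike1": 1, "BuyBike2": 0, "BuyBike3": 1},
    {"CustomerID": 2, "Age": 51, "Salary": 72000, "Gender": "M",
     "BuyBike1": 0, "BuyBike2": 0, "BuyBike3": 0},
]
purchases = wide_to_long(wide_rows, "CustomerID", "BuyBike")
```

The customer attributes (Age, Salary, Gender) would stay in the case table, with the purchase rows linked to it by CustomerID.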
I have a strange problem with SQL Server 2005 data mining models. I selected the input columns for my mining model (they differ from the input columns of its mining structure, since I ignored some of the columns for this model). But the mining model still used all the input columns from the mining structure rather than the ones I chose for the model.
Could anyone here please give me some guidance and advice on this? I really need help with it.
The request is to combine (via merge, join, CASE statement, union, or similar) up to four unique columns, each in a separate table, into a new combined table (matrix) of results.
I have a question about SQL Server. For example, I have a table with two columns, like the following table, and I don't know how to update these two columns with identity numbers, but only in the fields that are equal to 111.
A newbie question. I am tearing my hair out trying to work out how, in SQL Server 2005, to get a printout (or even better, a file I can save and incorporate in a Word document), or both, that shows the structure of all the tables in my database.
I want to list all tables (or selected tables perhaps) , and all columns in those tables, with the attributes of each column (nvarchar(2) etc or decimal(18,5) etc). Just a simple listing of all tables and their columns and the attributes of those columns.
Surely this must be possible with a simple one click operation in Sql Server 2005. I have created a database diagram which gives me part of what I want, but that just shows the tables, relationships, and column names, not the attributes of each column which is what I need as well.
I don't want to have to start installing third-party products to do this, and I have no great script-writing capabilities. Surely such a basic function is easily achievable with one or two clicks in SQL Server 2005, from a menu somewhere in SQL Server Management Studio?
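A listing like the one asked for above can be produced from the database's own catalog. On SQL Server, the INFORMATION_SCHEMA.COLUMNS view exposes TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, NUMERIC_PRECISION and NUMERIC_SCALE for this purpose. The Python sketch below shows the same idea against SQLite's catalog instead, purely so that it runs without a server; the Products table is invented for the example.

```python
import sqlite3


def describe_schema(conn):
    """Return a plain-text listing of every table, its columns and their types."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(table)
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, name, col_type, notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            nullability = "NOT NULL" if notnull else "NULL"
            lines.append(f"    {name}  {col_type}  {nullability}")
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Code nvarchar(2) NOT NULL, Price decimal(18,5))")
schema_text = describe_schema(conn)
```

The returned text can be printed or written to a file and pasted into a document.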
Hey everybody, I'm absolutely new to any sort of data management, so here goes: suppose we store 100 .txt or .doc files in SQL Server and we want no two files' contents to match by more than 60%. The questions that arise are:
1. How do we store files in MS SQL (binary format or normal text)?
2. How do we match the files?
3. What code do we write in C# for this purpose?
4. Does this have anything to do with pattern recognition?
I'd ask all new and active experienced users to weigh in. Please help me!
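For the matching question, one simple starting point is a pairwise similarity ratio over the text of each file. The sketch below uses SequenceMatcher from Python's standard difflib module; the post asks about C#, so treat this as an illustration of the idea rather than the code to ship. The post does not say how "match 60%" should be measured, so using this particular ratio is an assumption, and the file names and contents are made up.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, text_a, text_b).ratio()


def too_similar_pairs(docs, threshold=0.6):
    """List every pair of documents whose similarity exceeds the threshold.

    docs maps a file name to its extracted text. The comparison is
    pairwise, so the work grows with the square of the number of files.
    """
    names = sorted(docs)
    pairs = []
    for index, first in enumerate(names):
        for second in names[index + 1:]:
            score = similarity(docs[first], docs[second])
            if score > threshold:
                pairs.append((first, second, score))
    return pairs


docs = {
    "a.txt": "the quick brown fox jumps over the lazy dog",
    "b.txt": "the quick brown fox jumps over the lazy cat",
    "c.txt": "1234567890 0987654321",
}
flagged = too_similar_pairs(docs)
```

For .doc files the text would have to be extracted first, since comparing the raw binary would mostly measure the file format rather than the content.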
While recently working with several mining models, I came across something that struck me as pretty odd - and I'm hoping to find an explanation for the behavior.
Consider the following setup:
- A single table in the relational database represents the only case table
- A single, continuous column is the predictable
- A mining structure has been created
- The mining structure contains a single model, based on the MS Decision Trees algorithm
- Input columns were selected for the model via the BI Studio wizard (i.e., those provided via the "Suggest" button)
- The structure has been fully processed
Now, the interesting parts:
1. I view the scatterplot for the mining model, under the Mining Accuracy Chart tab
2. Back on the Mining Structure tab, I delete one of the input columns
3. I add the same column back into the structure
4. The structure is fully processed again
5. When I view the scatterplot for the mining model, under the Mining Accuracy Chart tab, a different set of data points is presented for the model predictions
6. A different set of decision trees under the Mining Model Viewer tab confirms this
How could different patterns have been found this second time around, even though all of the input columns were the same (as well as the training cases)?
(Note: I encountered this situation while creating a new mining model that was identical to an existing one. Even though the models received the exact same inputs and training cases, they yielded different results. I was able to reproduce the behavior by using steps 1-6 above, though.)
Can someone provide some insight on this behavior, or some kind of explanation of what may be happening?
I am using SQL Server 2012 and am very new to SQL. I have a request from a physician for a list of his patients that meet certain criteria. This is stored in a temp table named #cohort.
Using this cohort he wants each row to be one patient with a list of labs, vitals, etc. Three items are the most recent lab value and date. I could query each lab individually and place it into a temp table and then join all temp tables at the end, but I am trying to move past that and have all labs in one temp table. All temp tables are joined with PatientSID.
I tried to do something for just 2 labs, but it is not working. There could be null values when joined with the #cohort table.
Individually the SELECT statements pull in the most recent lab value and date, but I cannot get them into a temp table with one row of PatientSID and then the lab value and date if they exist.
IF OBJECT_ID ('TEMPDB..#lab') IS NOT NULL DROP TABLE #lab
SELECT cohort.PatientSID
,SubQuery1.LabChemResultNumericValue AS 'A1c%'
,SubQuery1.LabChemCompleteDateTime AS 'A1c% Date'
,SubQuery2.LabChemResultNumericValue AS 'LDL'
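The target shape described above is one row per patient in the cohort, with the latest value and date for each lab, and empty cells where a patient has no result. The Python sketch below shows that logic on in-memory rows; the function name, the row layout and the sample values are assumptions for illustration. In T-SQL the same shape is commonly reached by LEFT JOINing #cohort to one "latest row" subquery per lab, for example one built with ROW_NUMBER() partitioned by PatientSID.

```python
def latest_labs(cohort, lab_rows, lab_names):
    """Build one row per patient holding the most recent value of each lab.

    lab_rows holds (patient_sid, lab_name, value, iso_datetime) tuples.
    ISO-8601 strings are compared as text, which sorts chronologically.
    """
    latest = {}
    for patient, lab, value, when in lab_rows:
        key = (patient, lab)
        if key not in latest or when > latest[key][1]:
            latest[key] = (value, when)

    result = []
    for patient in cohort:  # every cohort patient appears, like a LEFT JOIN
        row = {"PatientSID": patient}
        for lab in lab_names:
            value, when = latest.get((patient, lab), (None, None))
            row[lab] = value
            row[lab + " Date"] = when
        result.append(row)
    return result


# Made-up sample data: patient 3 has no labs at all.
cohort = [1, 2, 3]
lab_rows = [
    (1, "A1c%", 7.1, "2014-01-05"),
    (1, "A1c%", 6.8, "2014-06-01"),
    (1, "LDL", 110, "2014-03-02"),
    (2, "LDL", 95, "2013-12-30"),
]
rows = latest_labs(cohort, lab_rows, ["A1c%", "LDL"])
```

Patients with no result for a lab keep None in both the value and date cells, which corresponds to the NULLs expected after the join with #cohort.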
I have a business need to create a report by querying data from a MS SQL 2008 database and displaying the result to users on a web page. The report initially has 6 columns of data, and 2 of the 6 contain JSON data, so the users have asked to have those 2 JSON columns parsed into 15 additional columns (the first JSON column has 8 key/value pairs and the second has 7). Here is what I have done so far:
I found a table-valued function (fnSplitJson2) from this link [URL]. Using this function I can parse a column of JSON data into a table. When I use the function against the first JSON column in my query (with CROSS APPLY), I get the right data back, but I get 8 rows for each row in my table. The reason for this side effect is that the function returns a table of 8 rows (8 key/value pairs) for each JSON string it parses.
1. First question: How do I modify my current query (see below) so that for each row in my table I get back one row with 19 columns?
SELECT A.ITEM1,A.ITEM2,A.ITEM3,A.ITEM4, B.* FROM PRODUCT A CROSS APPLY fnSplitJson2(A.ITEM5,NULL) B
If I update my query (see below) and call the function twice within the CROSS APPLY clause, I get this error: "The multi-part identifier "A.ITEM6" could not be bound."
2. My second question: How do I get around this error?
SELECT A.ITEM1,A.ITEM2,A.ITEM3,A.ITEM4, B.*, C.* FROM PRODUCT A CROSS APPLY fnSplitJson2(A.ITEM5,NULL) B, fnSplitJson2(A.ITEM6,NULL) C
I am using Microsoft SQL Server 2008 R2 version. Windows 7 desktop.
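The goal in the first question is a pivot: the key/value pairs of each JSON column should become columns of the same output row instead of extra rows. The Python sketch below shows that flattening on a single in-memory row; it does not use fnSplitJson2, and the JSON keys are invented for the example. On the second question, my understanding is that a function listed after a comma cannot see columns of table A, whereas a second CROSS APPLY can.

```python
import json


def flatten_row(row, json_fields):
    """Return one flat dict: the plain columns plus one column per JSON key.

    Each new column is named <json column>_<key> so that the same key
    appearing in both JSON columns does not collide.
    """
    flat = {name: value for name, value in row.items() if name not in json_fields}
    for field in json_fields:
        pairs = json.loads(row[field]) if row[field] else {}
        for key, value in pairs.items():
            flat[f"{field}_{key}"] = value
    return flat


# Made-up row: four plain columns and two JSON columns.
row = {
    "ITEM1": "a", "ITEM2": "b", "ITEM3": "c", "ITEM4": "d",
    "ITEM5": '{"color": "red", "size": "L"}',
    "ITEM6": '{"weight": 3}',
}
flat = flatten_row(row, ["ITEM5", "ITEM6"])
```

With 8 keys in the first JSON column and 7 in the second, the same approach gives 4 + 8 + 7 = 19 columns per row.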