My records look like the ones below. I created a clustered index on the DATE field. This is the only index on the table. The table contains nearly 500,000 records. After indexing, the DB size increased from 14 GB to 26 GB.
The recovery model of the DB is SIMPLE.
Shall I shrink the DB to reduce its size? Will that affect the index?
2015-03-01 00:07:10.000    110
2015-03-01 00:07:11.000    110
2015-03-01 00:07:12.000    110
2015-03-01 00:07:13.000    110
2015-03-01 00:07:10.000    111
2015-03-01 00:07:11.000    111
2015-03-01 00:07:12.000    111
2015-03-01 00:07:13.000    111
I am wondering where I can store my mining results in the data mining engine. For example, I have results such as accuracy charts, decision trees, and other formats, depending on the mining algorithms I used. Where can I store these results so that Reporting Services can use them later? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
I would like to know if there is any way to integrate third-party data mining packages with the SQL Server 2005 data mining algorithms, so that we can compare all of them and get the best results for training models.
Hoping someone will have a solution for this error:
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem that the data type of the column used in the mining structure is Long while the underlying field in the cube is BigInt, or am I barking up the wrong tree?
I'm a beginner with SQL 2012 SSDT & SSMS. I get this error message when I try to deploy my project:
"Error 6 Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid. 0 0 " I am finding it hard to locate the content that is not valid. I've been trying to find an answer to this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
- a data mining structure with about 80 columns.
- a data mining model using Microsoft_Decision_Trees with 2 prediction columns.
I thought I would then explore the possibility of having more than 2 prediction columns, in this case 20.
I get an error message and I can't work out: a) if this is because there's a limit on the number of prediction columns, and where that maximum is stated; b) if something else has become corrupted; c) if there's a known bug, and whether the error message is meaningful or not.
Either way, I'm unable to complete the data mining wizard.
The error message is: Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.
I am using Microsoft_Time_Series and have set HISTORIC_MODEL_GAP to various values (from 1 to 21). I always get this error: Error (Data mining): The 'HISTORIC_MODEL_GAP' data mining parameter is not valid for the 'My Time Series' model.
In the Algorithm Parameters window, this parameter is not there by default, so I have to add it.
Implementing data mining Add-in in an academic setting? We need to handle over 150 new students a semester and have their connection to Analysis Services survive for their four years at the college. We are introducing data mining to every freshman business student as a unit within their Intro to Excel class (close to a month of work to give them a sense of what is possible). Other courses later in their curriculum will expand on that introduction.
Once implemented, we would have as many as 900 connections to manage (four years from now). It is possible that multiple sections will be running at the same time, so 40 students may be accessing the data mining tools concurrently.
Is there a way to "bulk establish" the access credentials and establish those databases?
In an SSAS database I have created a data mining structure using the Time Series algorithm. While processing the SSAS DB, the data mining structure takes a long time to process. How can we reduce the processing time?
Hi, I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output). The Lift Chart/Mining Legend in Analysis Services shows three results: Score, Population Correct (%), and Predict Probability (%).
Population Correct, I believe, is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?
However, I can't work out how the other two are derived, in particular the 'SCORE'. To give a live example, the scores were as follows:
Model Score Pop Correct Pred Probability
Decision Trees 0.83 76.59% 54.28%
Neural Network 0.75 67.63% 50.05%
Ideal Model 100.00%
Can anyone help with this and give a detailed explanation?
I am trying to model data in Analysis Services with the Advanced Create Mining Model function in the Excel add-in. I am having trouble creating an association model that works like the one produced by the Associate button, which sits above the Advanced button.
The format of my data is like this:
OrderID Product
100 Bike
100 Helmet
100 Shoes
200 Helmet
200 basketball
200 Bat
300 Shoes
300 Socks
The Associate button works perfectly, since it asks me which column is the transaction ID (OrderID) and which column I am trying to predict (Product). The Advanced Create Mining Model function asks me to determine what the columns are...
OrderID=key Product=Input+Predict?
When I run the Advanced Create Mining Model function with the association algorithm, I get a browser that shows no rules, and support only for one-item itemsets (each product on its own, but no combinations of products).
Does anyone know what I have to do to get it to work like the associate button?
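To make the symptom concrete, here is a minimal sketch in plain Python (not the Excel add-in or Analysis Services) using the sample rows above. It counts how often two products occur together, first when rows are grouped into one transaction per OrderID, and then when every row is treated as its own case. The function names `group_by_order` and `pair_support` are made up for this illustration, and the idea that the Advanced function is treating each row as a separate case is only one possible reading of the symptom, not a confirmed diagnosis.

```python
from collections import Counter
from itertools import combinations

# Rows copied from the sample data above: (OrderID, Product).
rows = [
    (100, "Bike"), (100, "Helmet"), (100, "Shoes"),
    (200, "Helmet"), (200, "basketball"), (200, "Bat"),
    (300, "Shoes"), (300, "Socks"),
]

def group_by_order(rows):
    """Collect the products of each order into one set (one transaction per OrderID)."""
    baskets = {}
    for order_id, product in rows:
        baskets.setdefault(order_id, set()).add(product)
    return baskets

def pair_support(baskets):
    """Count in how many transactions each unordered product pair appears together."""
    counts = Counter()
    for products in baskets.values():
        for pair in combinations(sorted(products), 2):
            counts[pair] += 1
    return counts

# Grouped by OrderID: two-item itemsets exist.
baskets = group_by_order(rows)
support = pair_support(baskets)

# Every row treated as its own case: each "transaction" holds a single product,
# so no two-item itemsets can ever be found.
per_row_cases = {i: {product} for i, (_, product) in enumerate(rows)}
per_row_support = pair_support(per_row_cases)

print(len(support), "pairs when grouped by OrderID")
print(len(per_row_support), "pairs when each row is its own case")
```

With grouping, pairs such as Bike and Helmet show up; without it, only single products can be counted, which matches the "one-item itemsets only" result described above.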
I have a database available and it is in live production.
I want to show only three tables in the front-end application, in a dropdown menu. I searched the internet and found only queries for listing all the tables in the DB.
I perform data mining on all products and a specific product category. Do I need to create 2 data source views, one for all products and the other one for the specific product category? Afterward, I run the Data Mining Wizard 2 times to create 2 mining structures. I also need to add the same mining model (e.g. Bayes, Cluster) to each of these mining structures. Is there any simple way to do it?
What do you use to check the data structure of a source database? For instance, I've got a database which I can connect to using an ODBC connection in SSIS. To get a better understanding of the data and the relationships between the tables, I would like to get a database diagram (ERD) of the source DB. I would also like to be able to import this diagram into SQL Server, in the database diagram section, when I load this data into a DWH. Any tips or free tools for better understanding the structure of an external source DB?
People say that Microsoft was the first major database vendor to include data mining features in a relational database. What does this mean exactly? Thanks a lot for any guidance.
I am trying to compute the actual size of the data and indexes in my database. I have used DBArtisan, Desktop DBA and SEM, and they all gave me different results. Does anybody know a valid, correct way of determining the size and utilization of the database?
- Also, I am trying to come up with archive/purge procedures. Are there publications, white papers or ideas about this issue?
I am wondering if it is possible to use SSIS to sample a data set into a training set and a test set and feed them directly to my data mining models, without saving them somewhere where they would occupy too much space. I really need guidance on that.
I need to confirm whether we can add space (increase the data file size) for a database that is configured for AlwaysOn in the same way as for mirroring, or whether we need to follow a different procedure.
I have a situation where the data files on both the primary and secondary replica are full. If I add space to the primary database, will it automatically be added on the secondary replica or not?
I found this pretty interesting. I checked the size of a database before implementing compression across all the user tables in it, and I checked the size again after implementing compression.
I did not find any difference. If I expand a table and check Properties -> Storage, I can see that PAGE compression is implemented on all the tables, but the DB has not become any smaller. Its size remains the same.
Hi, I have a problem importing data from SQL Server 2000 'text' columns to SQL Server 2005 nvarchar(max) columns. I get the following error on the transfer of any column that matches the above. The error is copied below.
Any help on this greatly appreciated...
ERROR : errorCode=-1071636471 description=An OLE DB error has occurred. Error code: 0x80004005.An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Unicode data is odd byte size for column 3. Should be even byte size.". helpFile=dtsmsg.rll helpContext=0 idofInterfaceWithError={8BDFE893-E9D8-4D23-9739-DA807BCDC2AC} (Microsoft.SqlServer.DtsTransferProvider)
I want to know whether a flat file is faster than an RDBMS for indexing. For example, for search engine indexing, would a flat file be better than an RDBMS in terms of performance, scalability, etc.?
Just wanted to know what the general rule of thumb is when sizing log file space against a database's data file. We allow the data file for our database to grow by 10%, unlimited. We do not allow our log file to autogrow, because of a specific and poorly written process (which we are in a three-month process of removing) that can balloon the log file size. Should it be 10% of the data file, i.e. if the data file size is 800 MB the log file should be 80 MB? I realize there are a myriad of factors that go into file size, but a general starting point would be nice. Thanks, Jeff
I have an application on SQL Server 2005. I have completed indexing all the physical primary and foreign keys, virtual primary and foreign keys, sort-order columns, WHERE clause fields and so on. On the first day, I indexed only the physical primary and foreign keys and the virtual primary and foreign keys, and I noticed that loading performance improved. So I continued with the remaining indexes on the second day. This time, I noticed that loading is slower by 0.5 to 1 second. Is it possible for loading performance to get slower after indexing? Please advise. Thanks.
I have two tables which are related. The first table (A) has a sequentially assigned unique key (primary) that has a clustered index built on it. This table has roughly 1,000,000 rows of data and grows daily.
The second table(B) has a sequentially assigned unique key (primary). There is a column in table(B) which contains table(A)'s unique key. For each row in the table(A) there are roughly 30 rows in table(B).
Should I build a clustered index on the table(B) column which contains the key to table(A) or a non-clustered index?
I am trying to delete data from tables where the ModifiedDate is older than 9 years in the AdventureWorks2012 database. The console notifies me that the foreign keys are dropped, but the DELETE statement throws errors. I am sure that somewhere the key constraints are not getting altered, but I'm not able to figure it out as I'm a relative beginner to T-SQL. The error and code:
The DELETE statement conflicted with the REFERENCE constraint "FK_SalesOrderHeaderSalesReason_SalesReason_SalesReasonID". The conflict occurred in database "AdventureWorks2012", table "Sales.SalesOrderHeader
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | Out-Null
$option_drop = new-object Microsoft.SqlServer.Management.Smo.ScriptingOptions;
$option_drop.ScriptDrops = $true;
I am wondering whether there is any way to select only a portion of a data set to train the mining model. I mean that we don't need to split the data set in advance; what I want is to be able to select any random portion of a selected data set to train a mining model. Any advice?
I am looking forward to hearing from you, and thanks a lot in advance for your advice and help.
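As an illustration of the idea only, here is a minimal sketch in plain Python of picking a random portion of a data set for training and keeping the rest for testing, without splitting the data in advance. It is not an Analysis Services feature; the function name `random_split` and the parameters `train_fraction` and `seed` are assumptions made for this example.

```python
import random

def random_split(records, train_fraction=0.7, seed=None):
    """Return (train, test): a random portion of records and the remainder."""
    rng = random.Random(seed)      # a fixed seed makes the split repeatable
    shuffled = list(records)       # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical data set: 100 case IDs standing in for the selected rows.
train, test = random_split(range(100), train_fraction=0.7, seed=42)
print(len(train), "training cases,", len(test), "test cases")
```

Changing `seed` draws a different random portion from the same data set, so each training run can use a different sample while nothing is stored separately.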
I have a CSV file that I am importing into SQL Server using an SSIS package, through a Flat File source task.
A few points about the data and its handling:
1) Inside the procedure they drop the index, then populate the table, then create the same index again.
2) The data is huge (say, in the millions of rows).
My doubt: which is the best way to import the data?
1) Just insert the data without dropping the index.
2) Drop the index, populate the table, re-create the index (the way they do it right now).
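One way to decide between the two options is to time both against a copy of the real data. The sketch below only shows the shape of such a measurement. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in, so its timings say nothing about SQL Server or SSIS; the table name `staging`, the index name `ix_staging_id` and the function `load` are made up for this example.

```python
import sqlite3
import time

def load(rows, drop_index_first):
    """Load rows into an indexed table and return (row_count, seconds_taken)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER, amount REAL)")
    conn.execute("CREATE INDEX ix_staging_id ON staging (id)")

    start = time.perf_counter()
    if drop_index_first:
        # Option 2: drop the index before the bulk insert...
        conn.execute("DROP INDEX ix_staging_id")
    conn.executemany("INSERT INTO staging (id, amount) VALUES (?, ?)", rows)
    if drop_index_first:
        # ...and re-create the same index once the table is populated.
        conn.execute("CREATE INDEX ix_staging_id ON staging (id)")
    conn.commit()
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
    conn.close()
    return count, elapsed

# Hypothetical test data; a real test would use a sample of the CSV file.
rows = [(i % 1000, float(i)) for i in range(20000)]

kept_count, kept_seconds = load(rows, drop_index_first=False)
dropped_count, dropped_seconds = load(rows, drop_index_first=True)
print(f"index kept:    {kept_count} rows in {kept_seconds:.3f}s")
print(f"index dropped: {dropped_count} rows in {dropped_seconds:.3f}s")
```

The same pattern (same data, same table, only the index handling changed) can be repeated on the actual server, where the result depends on the row count, the number of indexes and the hardware.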
Hi, I am new to the MSSQL 2000 DBA role and am trying to learn more about Analysis Services, data warehousing and data mining. Can any expert out there recommend some good books or web articles to read? Thanks