Which Certifications Lead To A Position Involving Data Mining?
Jan 20, 2007
I'm a newbie in the realm of database reporting. At my current position, I'm reporting off of CRM databases using Crystal v11. Previously I had experience working with HR databases using the same reporting tool. I am interested in progressing to database design and scripting. Any suggestions on which certifications to pursue?
I am wondering where I can store my mining results from the data mining engine. For example, I have results such as accuracy charts, decision trees, and other output formats from the different mining algorithms I used. Where can I actually store these results for Reporting Services to use later? Is it possible to do that in SQL Server 2005?
Thanks a lot for any help and guidance in advance.
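Model content itself lives inside Analysis Services, and the accuracy charts the designer shows are computed on demand rather than persisted. One common workaround, sketched below with hypothetical object names, is to define an MSOLAP linked server and copy DMX query results into a relational table that Reporting Services can read:

EXEC sp_addlinkedserver
     @server = 'SSAS_DM', @srvproduct = '',
     @provider = 'MSOLAP', @datasrc = 'localhost',
     @catalog = 'MiningDB';   -- hypothetical SSAS database

-- dbo.MiningModelContent is a hypothetical staging table for reports
INSERT INTO dbo.MiningModelContent (NodeCaption, NodeSupport, NodeProbability)
SELECT *
FROM OPENQUERY(SSAS_DM,
     'SELECT NODE_CAPTION, NODE_SUPPORT, NODE_PROBABILITY
      FROM [MyDecisionTree].CONTENT');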
I would like to know if there is any way to integrate third-party data mining packages with the SQL Server 2005 data mining algorithms, so that we can compare all of them and get the best results when training models.
Hoping someone will have a solution for this error
Errors in the metadata manager. The data type of the '~CaseDetail ~MG-Fact Voic~6' measure must be the same as its source data type. This is because the aggregate function is not set to count or distinct count.
Is the problem that the data type of the column used in the mining structure is Long while the underlying field in the cube is BigInt, or am I barking up the wrong tree?
I'm a beginner with SQL Server 2012 SSDT & SSMS. I get this error message when I try to deploy my project:
"Error 6 Error (Data mining): KEY SEQUENCE columns are not supported at the case level. The 'Customer Key' column of the 'TK448 Ch09 Cube Clustering' mining structure contains content that is not valid. 0 0 " I am finding it hard to locate the content that is not valid. I've been trying to find a answer for this problem but can't seem to find anything. How can I locate the content that is not valid and change or delete it so that I can deploy this solution?
I have:
- a data mining structure with about 80 columns
- a data mining model using Microsoft_Decision_Trees with 2 prediction columns
I thought I would then explore the possibility of having more than 2 prediction columns, in this case 20. I get an error message and I can't work out: a) if this is because there's a limit on the maximum number of prediction columns, and where that maximum is stated; b) if something else has become corrupted; or c) if there's a known bug, and whether the error message is meaningful or not. Either way, I'm unable to complete the data mining wizard.
The error message is: Errors in the metadata manager. Either the mining structure with the ID of '[my model Structure]' does not exist in the database with the ID of 'DMAddinsDB', or the user does not have permissions to access the object.
I am using Microsoft_Time_Series and have set HISTORIC_MODEL_GAP to various values (from 1 to 21). I always get this error: Error (Data mining): The 'HISTORIC_MODEL_GAP' data mining parameter is not valid for the 'My Time Series' model.
In the Algorithm Parameters window, this parameter is not present by default, so I have to add it.
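For what it's worth, HISTORIC_MODEL_GAP is documented as an Enterprise-edition parameter and is used together with HISTORIC_MODEL_COUNT, so running a non-Enterprise edition, or setting the gap alone, are both plausible causes of the "not valid" error. A DMX sketch with hypothetical structure and column names:

ALTER MINING STRUCTURE [My Time Series Structure]
ADD MINING MODEL [My Time Series]
(
    [Reporting Date],   -- the KEY TIME column defined on the structure
    [Amount] PREDICT
)
USING Microsoft_Time_Series (HISTORIC_MODEL_COUNT = 2, HISTORIC_MODEL_GAP = 7);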
Implementing data mining Add-in in an academic setting? We need to handle over 150 new students a semester and have their connection to Analysis Services survive for their four years at the college. We are introducing data mining to every freshman business student as a unit within their Intro to Excel class (close to a month of work to give them a sense of what is possible). Other courses later in their curriculum will expand on that introduction.
Once implemented, we would have as many as 900 connections to manage (four years from now). It is possible that multiple sections will be running at the same time, so 40 students may be accessing the data mining tools concurrently.
Is there a way to "bulk establish" the access credentials and establish those databases?
In an SSAS database I have created a data mining structure using the Time Series algorithm. While processing the SSAS database, data mining takes a long time to process. How can we reduce the processing time?
Hi, I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output). The Lift Chart/Mining Legend in Analysis Services shows three results: Score, Population Correct (%), and Predict Probability (%).
Population Correct, I believe, is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?
However, I can't work out how the other two are derived, in particular the 'Score'. To give a live example, the scores were as follows:
Model            Score   Pop Correct   Pred Probability
Decision Trees   0.83    76.59%        54.28%
Neural Network   0.75    67.63%        50.05%
Ideal Model              100.00%
Can anyone help with this and give a detailed explanation?
I am trying to model data in Analysis Services with the Advanced Create Mining Model function in the Excel add-in. I am having trouble creating an association model that works like the Associate button above the Advanced button.
The format of my data is like this
OrderID Product
100 Bike
100 Helmet
100 Shoes
200 Helmet
200 basketball
200 Bat
300 Shoes
300 Socks
The associate button works perfectly since it asks me which column is the transaction id (orderid) and which column I am trying to predict (product). The advanced create mining model asks me to determine what the columns are...
OrderID=key Product=Input+Predict?
When I run the advanced create mining model associate, I get a browser that gives me no rules and support for only one-item itemsets (each product on its own, but no combinations of products).
Does anyone know what I have to do to get it to work like the associate button?
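The Associate button reshapes the input into a case table of orders with a nested table of products, which is what Microsoft_Association_Rules expects; declaring Product as a flat Input+Predict column at the case level gives exactly this symptom of one-item itemsets and no rules. A DMX sketch of the required shape (model name and thresholds are illustrative):

CREATE MINING MODEL [Order Associations]
(
    OrderID LONG KEY,
    [Products] TABLE
    (
        Product TEXT KEY
    )
)
USING Microsoft_Association_Rules (MINIMUM_SUPPORT = 0.03, MINIMUM_PROBABILITY = 0.4);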
I perform data mining on all products and on a specific product category. Do I need to create 2 data source views, one for all products and the other for the specific product category, and then run the Data Mining Wizard twice to create 2 mining structures? I also need to add the same mining models (e.g. Bayes, Cluster) to each of these mining structures. Is there any simple way to do it?
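One option, sketched here with hypothetical names: build each structure once (two data source views, or one view filtered two ways), then attach the extra models with DMX instead of re-running the wizard for every algorithm:

ALTER MINING STRUCTURE [All Products]
ADD MINING MODEL [All Products Clusters]
(
    CustomerKey,
    Age,
    [Bike Buyer] PREDICT
)
USING Microsoft_Clustering;
-- repeat with the same column list against the category structure,
-- swapping in Microsoft_Naive_Bayes etc. for the other algorithms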
NOTE: I apologize to anyone (especially moderators) who may notice that I am basically repeating a question I already posted in another recent thread. I am reposting because I want to filter my question down to its crux; the other thread may not have asked it in the clearest way.
The Question:
The scenario is this: (1) I have a salesperson dimension with a hierarchy. (2) In this hierarchy, Bill and Ted roll up to John. (3) Bill sells 10 units, Ted sells 8 units, and John sells 5 units.
When you process this hierarchy, what would you expect the total to be for John? (A) 23, which is the sum of Bill, Ted, and John, or (B) 18, which is the sum of Bill and Ted only, overwriting John's own number?
I say (A), and I think most will choose the same; all the examples I've seen reflect (A).
I am so frustrated, I don't know what to do. I have spoken to everyone I know about how to explain this error message, but my supervisor wants to dispute everything I tell him.
Could someone PLEASE explain the following error message and how I can go about finding what fields that it's talking about?
e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 15., ; Info: Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 15.
I have two environments (2 separate companies on separate servers). Environment #1 (referred to as Env1 from this point on) runs the report and generates a few of these errors (10, for example). Environment #2 runs the same report, modified only SLIGHTLY for the other company, and generates 50 errors. I'd be more than happy to post the log here if someone needs it.
Someone, please, save me from logfile hell. I have told this supervisor that "No data for the field" pretty much means that there was NO DATA RETURNED. We know for a FACT that these two environments are completely different. One runs on real servers and the other is running on partial real servers and virtual servers. We have identified several differences in the configurations, but he wants me to nail down EXACTLY what is causing this error. Please remember, "No Data" is not a good enough response for this person.
Please, for the love of all that is Chocolate, HELP!!!!!!!!!!!!!!!
Thanks in advance,
Jim Evans Microsoft Certified Application Developer
I am wondering if it is possible to use SSIS to sample a data set into a training set and a test set and feed them directly into my data mining models, without saving them anywhere, since that occupies too much space. I really need guidance on that.
I need to change my usernames in a column from JSmith to ABCSmith. What would be my update statement to make that change? I need something that would basically start at position 1, for a length of 1, then use that to replace with ABC...
Here is what I have been trying:
UPDATE tblusers, SET userLoginID = replace(userloginID, 1, 'ABC')
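REPLACE matches strings rather than positions, so passing 1 as the pattern won't do what's wanted here. A sketch using STUFF, which deletes a run of characters by position and inserts new text, assuming only the leading 'J' should be swapped for 'ABC':

-- delete 1 character starting at position 1, insert 'ABC' in its place
UPDATE tblusers
SET userLoginID = STUFF(userLoginID, 1, 1, 'ABC');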
Hi, I am having a problem auditing the column data in tables. My requirement is to write a trigger that is capable of auditing columns that will be added in the future as well, without using dynamic SQL. Is there any way to do so? I feel it is possible if I can get the column data based on ordinal position. Can anybody suggest an approach? My setup is like this: I have a base_table to be audited, an Audit_spec table which contains the name of the table and the columns to be audited, and an Audit table which actually captures the table name, column name, old value, and new value. I have to audit only those columns listed in Audit_spec. If a schema change (like a new column being added) happens to base_table and I want that column to be audited, I should be able to handle the newly added column without any changes to my trigger code.
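A minimal sketch of one way to stay schema-proof without dynamic SQL: capture the whole row as XML, so columns added later are picked up automatically. This stores entire old/new row images rather than one Audit row per column from Audit_spec, so it is a variation on the design described, and the table and column names are hypothetical:

CREATE TRIGGER trg_base_table_audit
ON base_table
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    -- one audit row per statement, holding all affected rows as XML;
    -- Audit is assumed here to have an XML column for the payload
    INSERT INTO Audit (table_name, row_version, row_data, changed_at)
    SELECT 'base_table', 'old', (SELECT * FROM deleted FOR XML RAW, TYPE), GETDATE()
    UNION ALL
    SELECT 'base_table', 'new', (SELECT * FROM inserted FOR XML RAW, TYPE), GETDATE();
END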
I am getting the following errors in my reporting server log. Can anyone help me?
ReportingServicesService!processing!11!8/22/2006-01:04:04:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 10., ; Info: Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 10.
ReportingServicesService!processing!11!8/22/2006-01:04:04:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 14., ; Info: Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 14.
ReportingServicesService!processing!11!8/22/2006-01:04:04:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 18., ; Info: Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 18.
ReportingServicesService!processing!11!8/22/2006-01:04:04:: e ERROR: Throwing Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 22., ; Info: Microsoft.ReportingServices.ReportProcessing.ReportProcessingException: There is no data for the field at position 22.
Hi Guru,
After spending quite some time watching my box, I've seen that PAGEIOLATCH is a lead blocker on my SQL Server 2000 server. Below is the detail:

SPID  lastwaittype    waitresource        blocked  status    cmd
57    LCK_M_S         KEY: 7:963690681:8  65       sleeping  execute
65    PAGEIOLATCH_SH  7:1:217904          0        sleeping  select

I thought latching should be very short-term synchronization. From the sysprocesses table, I saw the latch wait sleeping for a minute without doing any work. My database is about 23GB with more than 5000 tables. The RAID subsystem is RAID1 with 1 disk mapped to C and D logically. Data files and tempdb files are located in one location; the tranlog file and log backup files are located together on a different disk spindle. Currently we are experiencing severe slowness and are IO bound. I'm ready to rebuild the server, putting in RAID10 and RAID1 and distributing multiple data files across different RAID10 arrays, with tempdb and log files on RAID1. Other than this, how do I minimize the IO latch contention?
Thanks so much,
Silaphet
I am trying to delete rows from tables where the ModifiedDate is older than 9 years in the AdventureWorks2012 database. I get notified at the console that the foreign keys are dropped, but the delete statement is throwing errors. I am sure that somewhere the key constraints are not getting altered, but I'm not able to figure it out, as I'm a relative beginner with T-SQL. The error and code:
The DELETE statement conflicted with the REFERENCE constraint "FK_SalesOrderHeaderSalesReason_SalesReason_SalesReasonID". The conflict occurred in database "AdventureWorks2012", table "Sales.SalesOrderHeader

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO") | Out-Null
$option_drop = new-object Microsoft.SqlServer.Management.Smo.ScriptingOptions;
$option_drop.ScriptDrops = $true;
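Rather than scripting constraint drops with SMO, the delete can respect the constraints by removing child rows before their parents. A T-SQL sketch assuming the standard AdventureWorks2012 schema and that "older than 9 years" is measured from today:

DECLARE @cutoff datetime = DATEADD(YEAR, -9, GETDATE());

-- children of SalesOrderHeader go first
DELETE sr
FROM Sales.SalesOrderHeaderSalesReason AS sr
JOIN Sales.SalesOrderHeader AS soh ON soh.SalesOrderID = sr.SalesOrderID
WHERE soh.ModifiedDate < @cutoff;

DELETE d
FROM Sales.SalesOrderDetail AS d
JOIN Sales.SalesOrderHeader AS soh ON soh.SalesOrderID = d.SalesOrderID
WHERE soh.ModifiedDate < @cutoff;

DELETE FROM Sales.SalesOrderHeader
WHERE ModifiedDate < @cutoff;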
I am wondering if there is any way to select only a portion of a data set to train a mining model. In this case, I mean we don't need to split the dataset in advance; what I want is to be able to select a random portion of a selected dataset to train a mining model. Any advice?
I am looking forward to hearing from you, and thanks a lot in advance for your advice and help.
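A sketch of sampling at query time so nothing extra has to be stored; table and key names here are hypothetical. (SSIS also ships Percentage Sampling and Row Sampling transformations that do this in the pipeline.)

-- random 70% training draw, different every run
SELECT TOP 70 PERCENT *
FROM dbo.CustomerCases
ORDER BY NEWID();

-- repeatable variant: hash the key into buckets so the training set
-- (< 70) and test set (>= 70) are disjoint and stable across runs
SELECT *
FROM dbo.CustomerCases
WHERE ABS(CHECKSUM(CaseKey)) % 100 < 70;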
Do you have any stored procs or utility to track down the lead blocker as well as the blocked processes on SQL Server 2000? Similar to a tree structure with the lead blocker on top followed by the processes being blocked by the lead blocker.
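On SQL Server 2000 this can be pulled straight from master.dbo.sysprocesses; a lead blocker is a session that appears in someone else's blocked column but is not blocked itself. A minimal sketch:

-- lead blockers
SELECT spid, status, lastwaittype, waitresource, cmd, loginame
FROM master.dbo.sysprocesses
WHERE blocked = 0
  AND spid IN (SELECT blocked FROM master.dbo.sysprocesses WHERE blocked <> 0);

-- the chains underneath them
SELECT spid, blocked, status, lastwaittype, waitresource, cmd
FROM master.dbo.sysprocesses
WHERE blocked <> 0;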
Starting with a SQL Server 2000 database, my company built a new server with SQL Server 2005 and restored the SS2000 database from backups. We're now seeing some extreme issues where queries that took 30 seconds now take 3 minutes. Would this make you suspect index issues? Any recommendations on diagnosing and fixing this would be appreciated.
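Stale statistics and row-count metadata are a common cause of plan regressions after restoring a 2000 database onto 2005, so the usual first steps are worth trying before digging deeper; database and table names below are hypothetical:

USE MyRestoredDb;
DBCC UPDATEUSAGE ('MyRestoredDb');   -- correct inaccurate page/row counts
EXEC sp_updatestats;                 -- refresh statistics database-wide
-- for tables behind the worst queries, a full scan is more thorough
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;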
-- Business rule: first name, middle name and last name can all be null
-- DDL
create table #cat (
    catID char(8) primary key,
    first_name varchar(15) null,
    middle_name varchar(2) null,
    last_name varchar(15) null
)

-- DML: populate sample data
insert into #cat values ('Black123','ghost','','bigger')
insert into #cat values ('Arab0123','Hama','','Abbas')
insert into #cat values ('Mixed001','',null,null)
insert into #cat values ('Mixed002',null,null,null)
insert into #cat values ('Mixed003',null,'','Smith')
insert into #cat values ('White123','','','Talley')
insert into #cat values ('Yello123','Nick','H','Pisa')

-- DML: name concatenation, get all or any
select (first_name + ' ' + middle_name + ' ' + last_name) as name
from #cat
-- the above does not meet the requirement

-- option 1
select (IsNull(first_name,'') + ' '
        + Case Len(middle_name) when 0 then '' else IsNull((middle_name + ' '),'') end
        + IsNull(last_name,'')) as name
from #cat

-- option 2
select (IsNull(first_name,'') + ' '
        + IsNull(NullIf(Coalesce((middle_name + ' '),''),''),'')
        + IsNull(last_name,'')) as name
from #cat

Q: Both option 1 and option 2 produce the same result; which one is more desirable? TIA.
Hi, I am new to this MSSQL 2000 DBA thing and am trying to learn more about Analysis Services / data warehousing / data mining. Can any expert out there recommend some good books or web articles to read? Thanks.
DECLARE @MaxCountHistogram TABLE
(
    MaxId         INT IDENTITY PRIMARY KEY NOT NULL,
    PublicationId INT NOT NULL,
    ProviderId    INT NOT NULL,
    DateLog       DATETIME NOT NULL,
    Amount        FLOAT NOT NULL
)

INSERT INTO @MaxCountHistogram VALUES
(432,3,'20150530',10.2564),
(432,3,'20150630',13.2564),
(432,5,'20150530',8),
(432,5,'20150630',13),
(433,3,'20150530',9),
(433,3,'20150630',11),
(433,5,'20150530',13),
(433,5,'20150630',21)
For each Publication and Provider, I need to get the differential between two different months, for example:
Period     Delta Amount                      ProviderId  PublicationId
20150530   10.2564                           3           432
20150630   3 (result of 13.2564 - 10.2564)   3           432
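On SQL Server 2012 or later, LAG computes this directly; a sketch to run in the same batch as the table variable above (COALESCE makes the first month's delta equal the amount itself, matching the example):

SELECT PublicationId,
       ProviderId,
       DateLog AS Period,
       Amount - COALESCE(LAG(Amount) OVER (
                    PARTITION BY PublicationId, ProviderId
                    ORDER BY DateLog), 0) AS DeltaAmount
FROM @MaxCountHistogram
ORDER BY PublicationId, ProviderId, DateLog;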