Sampling Data Set Via Integration Services Data Flow For Data Mining Models Without Saving Training And Test Data Sets?

Nov 24, 2006

Hi, all here,

Thank you very much for your kind attention.

I am wondering if it is possible to use SSIS to sample a data set into training and test sets and feed them directly to my data mining models, without saving them anywhere, since that would take up too much space? I really need guidance on this.
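For anyone with the same question: the split can be done in the source queries themselves, so neither subset ever has to be persisted. A minimal T-SQL sketch, assuming a hypothetical dbo.SourceData table with an integer key RowID (hashing a stable key keeps the two subsets disjoint and repeatable, unlike NEWID()):

-- Training query (~70% of rows); the test query uses >= 70 instead.
SELECT *
FROM dbo.SourceData
WHERE ABS(CHECKSUM(HASHBYTES('MD5', CAST(RowID AS varbinary(8))))) % 100 < 70;

Each query can then feed its own OLE DB Source, with the training branch ending in the Data Mining Model Training destination, so the sampled rows flow straight into the model without landing on disk.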

Thank you very much in advance for any help.

With best regards,

Yours sincerely,

View 5 Replies



Is There Any Way To Train A Portion Of A Training Data Set From A Selected Dataset For Data Mining?

Jun 19, 2007

Hi, all experts here,



I am wondering whether there is any way to select only a portion of a data set to train the mining model? In this case, I mean we don't need to split the dataset in advance; what I want is to be able to select any random portion of a selected dataset to train a mining model. Any advice?
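If the selected dataset lives in a SQL Server table, one option worth noting here (an addition, not from the original thread) is TABLESAMPLE in the training query. Note it samples data pages rather than rows, so the percentage is approximate:

-- Roughly 30% of the table; REPEATABLE fixes the sample between runs.
SELECT *
FROM dbo.SourceData TABLESAMPLE (30 PERCENT) REPEATABLE (42);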



I am looking forward to hearing from you, and thanks a lot in advance for your advice and help.



With best regards,



Yours sincerely,



View 3 Replies View Related

How To Split The Data Into Training And Validation Sets When Doing Data Mining?

Jun 15, 2007

Could I ask how to split the data into training and validation sets when doing data mining?
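In SQL Server 2005 the usual answer is the SSIS Percentage Sampling or Row Sampling transformation. SQL Server 2008 later built a holdout into the mining structure itself, which reserves a share of cases for validation; a DMX sketch of that, with invented names:

CREATE MINING STRUCTURE [CustomerStructure] (
    CustomerKey LONG KEY,
    Age LONG CONTINUOUS,
    Churned TEXT DISCRETE
)
WITH HOLDOUT (30 PERCENT)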



Thanks

View 1 Replies View Related

Where Can I Store The Mining Results From Mining Models In SQL Server 2005 Data Mining Engine?

Apr 26, 2006

Hi, all here,

I am wondering where I can store my mining results in the data mining engine? For example, I get mining results such as accuracy charts, decision trees, and other formats of results based on the different mining algorithms I used, so where can I actually store the results for Reporting Services to use later? Is it possible to do that in SQL Server 2005?
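The accuracy figures are not stored anywhere Reporting Services can read them directly, so one workable pattern (a suggestion, not a built-in feature) is to persist the numbers yourself in a plain relational table:

CREATE TABLE dbo.MiningModelResults (
    ModelName   sysname      NOT NULL,
    MetricName  varchar(50)  NOT NULL,  -- e.g. 'PopulationCorrectPct'
    MetricValue decimal(9,4) NOT NULL,
    CapturedAt  datetime     NOT NULL DEFAULT GETDATE()
);

INSERT INTO dbo.MiningModelResults (ModelName, MetricName, MetricValue)
VALUES ('DecisionTreeModel', 'PopulationCorrectPct', 76.59);

A report can then read this table like any other; the model content itself (trees, rules) can also be queried with a DMX SELECT against [Model].CONTENT and flattened the same way.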

Thanks a lot for any help and guidance in advance.

View 4 Replies View Related

How Is The 'Score' Value Derived In The Lift Chart/Mining Legend For Data Mining Models?

Sep 26, 2006

Hi,
I have just run a simple data set through a model to predict a simple true or false value (i.e. binary output).
The Lift Chart/Mining Legend in Analysis Services shows three results: Score, Population Correct (%), and Predict Probability (%).

Population Correct, I believe, is the percentage of predictions it got right out of the total number of predictions it tried to make. Is this correct?

However, I can't work out how the other two are derived, in particular the 'Score'. To give a live example, the results were as follows:

Model            Score   Pop Correct   Pred Probability
Decision Trees   0.83    76.59%        54.28%
Neural Network   0.75    67.63%        50.05%
Ideal Model              100.00%


Can anyone help with this and give a detailed explanation?

Many thanks,
S Rajput

View 4 Replies View Related

Where Can I Get Data Mining, Analysis, Integration And Reporting Services?

Jul 26, 2007

Hi,
I'm trying to learn about Analysis, Integration, and Reporting Services. I have installed SQL Server 2005 Management Studio Express, but I can't find these in the Start menu as mentioned in the tutorial (Start > SQL Server 2005 > Business Intelligence Development Studio, for Reporting Services).
What do I need to do? Please help me.

Regards.

View 1 Replies View Related

Data Mining Training Courses

Apr 27, 2007

I have been trying to use SQL 2005 data mining for about 8 weeks. I am becoming frustrated because I am not able to make progress nor am I able to exploit the power of the system.

I need a training course! I have asked Microsoft in the UK for recommendations but they have been unable to help. I have searched for courses in the UK and US without success.

I am coming to the Microsoft BI event in Seattle - will there be any opportunities there to get or find help? (In Seattle I intend to concentrate on the Excel add-ins.)

Thank you

View 5 Replies View Related

Integration Services (Data Flow Error)

Apr 25, 2006



When executing my data flow package, which contains only one source and one destination,

OLE db source -> SQL server destination

the following errors occur in my output:

Error: 0xC0202009 at Data Flow Task(infraction action), SQL Server Destination [3600]: An OLE DB error has occurred. Error code: 0x80040E14.

Error: 0xC0202071 at Data Flow Task(infraction action), SQL Server Destination [3600]: Unable to prepare the SSIS bulk insert for data insertion.


Error: 0xC004701A at Data Flow Task(infraction action), DTS.Pipeline: component "SQL Server Destination" (3600) failed the pre-execute phase and returned error code 0xC0202071.


I've checked the structure of my source and destination tables but nothing seems to be wrong.

If anyone has ever faced these errors, please help me. :D

View 22 Replies View Related

Can We Automate Data Mining Model Management?

Jun 12, 2006

Hi, all here,

I have a question about automating data mining model management. As we know, in many businesses patterns vary frequently, so the mining models will need to be recreated afterwards according to new rules appearing in the data. But can we make this whole process automated, for example automatically assessing mining model accuracy and automatically recreating the mining models based on predefined specifications? Would anyone here please give me an idea about that?
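The retraining half, at least, can be scripted: an Analysis Services DDL/DMX step inside a scheduled SSIS package can clear and retrain a model. A DMX sketch with hypothetical names:

-- Drop the existing training cases, then retrain from the relational source.
DELETE FROM MINING STRUCTURE [CustomerStructure];

INSERT INTO MINING STRUCTURE [CustomerStructure] (CustomerKey, Age, Churned)
OPENQUERY([MyDataSource],
    'SELECT CustomerKey, Age, Churned FROM dbo.CustomerTraining');

Automated accuracy assessment is harder in 2005; one approach is to predict against a held-back set, compute the accuracy in T-SQL, and have the package recreate the model only when the number drops below a threshold.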

Thanks a lot for any guidance and help for that.

With best regards,

View 3 Replies View Related

Integration Services :: Use Of Pointer In Data Flow Task

May 9, 2012

I'm a beginner in SSIS. How is a pointer used in a Data Flow task (transformations)?

Royal PS

View 11 Replies View Related

SQL Integration Services Data Flow Task Slows Down

Dec 7, 2007

We are using an OLE DB Source for the Data Flow Source and an OLE DB Destination for the Data Flow Destination. The amount of data being moved is about 30 million rows, gathered using a SQL command. There are no other transformations in between; it goes straight from one to the other. The flow starts amazingly fast, but after 5 million rows it slows considerably. I wondered if anyone has experienced anything similar with large loads.

View 6 Replies View Related

What Data Mining Tasks Can Be Automated And Scheduled Via Integration Services Packages?

Jun 14, 2006

Hi, all here,

Would any expert here please give me guidance about which data mining tasks can be automated and scheduled via Integration Services packages? Also, if we automate the tasks, can we also automatically save the results somewhere? For example, if we automate assessing the accuracy of a mining model, we will want to know the accuracy later, so we need to save all the results from the automated actions. Is it possible to realize this?

Thanks a lot in advance for any guidance and help for this.

With best regards,

Yours sincerely,

View 3 Replies View Related

Nested Tables With Data Mining Training Transform?

Nov 23, 2007

Hi... I can't figure out how to put nested tables into the Data Mining Model Training transform (SSIS). Can anybody help me? An example, please!
Diego B.

View 3 Replies View Related

Integration Services :: How To Ignore Eventual Data Flow Failure

May 11, 2015

Data flow A takes data from Excel file A, data flow B from Excel file B, and data flow C from Excel file C. What I'd like is this: if something goes wrong in data flow A, I am alerted but the package continues running. The same for data flow B: if A is OK, go on; if B fails, send me the mail but continue to the end (so data flow C still runs).

View 2 Replies View Related

Integration Services :: Using Sensitive Project Parameters In Data Flow Tasks

Feb 11, 2014

I have a requirement to read an encrypted file as a data source. I am not allowed to save an unencrypted text file version on disk at any time, for any length of time, so I created a custom source component that reads an encrypted CSV file, decrypts it, and then passes each row of data to the pipeline and ultimately to an OLE DB destination. Basically it is just a text file reader with an added class that decrypts the file before the component sets columns or reads rows.

The custom component, “Encrypted File Source”, has a custom property “EncryptionKey” with the encryption-required flag set to true (code below) and is declared as eligible to be set in expressions.

IDTSCustomProperty100 EncryptionKey = ComponentMetaData.CustomPropertyCollection.New();
EncryptionKey.Name = "EncryptionKey";
EncryptionKey.Description = "Secure String key value to decrypt the file";
EncryptionKey.Value = string.Empty;
EncryptionKey.ExpressionType = DTSCustomPropertyExpressionType.CPET_NOTIFY;
EncryptionKey.EncryptionRequired = true;

I want to be able to set the password for the encrypted file in the SQL Agent job that executes the SSIS project. This means I have an environment with a variable, “DataPassword”, that is set to sensitive. It maps to a project parameter in the SQL Agent job that is also set to sensitive. Now I want to access that sensitive project parameter inside my data flow, specifically in the Encrypted File Source task that I created, and set my EncryptionKey to that project parameter.

The problem is that SSIS says:

"expression cannot be evaluated. ... The Expression will not be evaluated because it contains sensitive parameter value "$Project::DataFilePassord". Verify that the expression is used properly and that it protects sensitive information." ((Microsoft.DataTransformationsServices.Controls))

[Code] ....

I am using SQL Server 2012, on a Windows 7 box with VS2010 Premium.

View 4 Replies View Related

Integration Services :: SSIS - Can Use Variables In A Data Flow Task Command

Jul 2, 2010

In my SSIS Data Flow task, I have a query that retrieves data based on a couple of date parameters. Is there a way we can pass/use the variables defined in the SSIS package in the query?

(I am assigning values to those variables from C# code)

The query should look like this:

select ordernumber, customerid from salesorder

where statecode=3 and datefulfilled between @variable1 and @variable2
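For reference, with an OLE DB Source set to "SQL command" the usual pattern is ? placeholders mapped to the package variables through the Parameters button (positionally: parameter 0 to @variable1, parameter 1 to @variable2):

select ordernumber, customerid from salesorder
where statecode = 3 and datefulfilled between ? and ?

Alternatively, build the entire statement into a string variable from the C# code and switch the source to "SQL command from variable".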

View 8 Replies View Related

Integration Services :: Data Flow Task Flat File Connection

Apr 29, 2015

I have a Data Flow task within a ForEach Loop container. The source of the flow is an ADO.NET connection and the destination is a Flat File connection. I loop through a collection of strings in the ForEach loop and, based on the string content, write some data to the same destination file in each iteration, overwriting the previous version. I am running into the following errors:

[Flat File Destination [38]] Warning: The process cannot access the file because it is being used by another process.
[Flat File Destination [38]] Error: Cannot open the datafile "Example.csv".
[SSIS.Pipeline] Error: Flat File Destination failed the pre-execute phase and returned error code 0xC020200E.

I know what's happening but I don't know how to fix it. The first time through the ForEach loop, the destination file is updated; the second time is when this error pops up. I think it's because the first iteration is not closing the destination file. How do I force a close of the file within the Data Flow task or through a subsequent Script Task? This works within a SQL 2008 package on one server but not within a SQL 2012 package on a different server.

View 5 Replies View Related

SSIS Data Mining Model Training Transform (Nested Tables)

Oct 26, 2006

I can't figure out how to put nested tables into the Data Mining Model Training transform (SSIS). I can do a simple case table, but how do you get nested tables into the DM Training transformation? Any ideas? Samples?
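For comparison, this is roughly what the equivalent nested-table training looks like in raw DMX (all names invented): the case query and the nested query are related on the case key, which is the shape the SSIS transformation has to reproduce.

INSERT INTO MINING STRUCTURE [MovieStructure]
    ([MovieID], [Class], [Actors](SKIP, [Gender]))
SHAPE
    { OPENQUERY([MyDataSource], 'SELECT ID, Class FROM Movie ORDER BY ID') }
APPEND
    ( { OPENQUERY([MyDataSource],
        'SELECT MovieID, Gender FROM Actor ORDER BY MovieID') }
      RELATE ID TO MovieID )
    AS [Actors]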

Thanks in advance,

-Young K

View 3 Replies View Related

Integration Services :: Clear SharePoint List Before Running SSIS Data Flow

Oct 19, 2015

I am using the SharePoint adapters from Codeplex that allow me to use SharePoint source and destination tasks in SSIS for SQL Server 2008 and SharePoint 2010.  I am able to pull the data from the SQL Server and insert it into the SharePoint List.

However, I prefer to just have fresh data every time, so I'd like to add a step to delete all the items in the list before inserting the new ones.  Is there a way I can configure the SharePoint SSIS destination task to clear all the items before I insert new ones?

View 3 Replies View Related

Integration Services :: Check Table For Existing Record Before Data Flow Task

Jun 1, 2015

Using SSIS 2012 (within Visual Studio) on Windows 7.

Before allowing my Data Flow task to fire, I'd like to check the target table (OLE DB Destination) for a specific date value in a specific field. I've seen how the Lookup Task is commonly used to check for dupes before inserting, but I'm not able to use that method because the data value I want to search the table for is contained in a Global Variable (let's say "MyVariableDate"). 

Is there any way to check for any records in a target table where Date1 = MyVariableDate (i.e. scanning the entire table for any occurrence of MyVariableDate in the Date1 field)?
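An Execute SQL Task ahead of the data flow covers this: run a parameterized count, capture the result in a package variable, and put an expression on the precedence constraint leading into the data flow. The query itself, with ? mapped to MyVariableDate (target table name invented):

-- 0 means no existing record for that date, so the data flow may run.
SELECT COUNT(*) AS MatchCount
FROM dbo.TargetTable
WHERE Date1 = ?;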

View 12 Replies View Related

Integration Services :: Multiple Data Flow Tasks Within Foreach Loop Container

Nov 3, 2015

Suppose I have a “Foreach Loop Container” that iterates over a list. Is it possible to execute different data flow tasks based on the input?

Example: the list contains elements L1, L2 & L3.

The ForEach Loop Container checks the input: if it is L1 it should execute DF Task1, if L2 then DF Task2, and similarly for L3.

Is it possible to achieve this?

View 4 Replies View Related

Integration Services :: Event Handler Data Flow Fails When Running Package?

Sep 25, 2015

I have created an event handler that contains a Data Flow task with an OLE DB source and an Excel destination.

This event is executed/triggered based on an execute SQL task failure in the control flow Sequence container.

However, when I execute the Data Flow task of the event handler on its own, it runs successfully, but it fails when I execute the whole package.

I get the below error message:

[OLE DB Source [21]] Error: SSIS Error Code DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER.  The AcquireConnection method call to the connection manager "TK463DW" failed with error code 0xC0202009.  There may be error messages posted before this with more information on why the AcquireConnection method call failed.

I have tried setting the property 'DelayValidation' to 'True' on all the Control Flow and Data Flow tasks in the package and in the event handler, but still I could not fix this. Not sure what I am missing.

View 4 Replies View Related

Problems Predicting Unseen Cases Of Nested Tables Data Mining Models

May 27, 2007

Let's take the following example:



Movie train table:

ID   Class
1    +
2    +
3    -
4    +
5    -



Actor train nested table:

ID   MovieID   Gender
1    1         F
2    1         M
3    1         F
4    1         F
5    2         M
6    2         M
7    2         F
8    3         F
9    3         F
10   4         M
11   4         M
12   4         F
13   4         F
14   5         F
15   5         M



We want to build a classifier model in order to predict the Class of a movie based on the gender of the movie's actors. To deal with the nested table, Analysis Services maps each record of the nested table to an attribute of the case table. These attributes are named Actor(n).Gender with n = 1..15, and so they are dependent on the nested table record numbers. Both the Microsoft Decision Trees and Microsoft Naive Bayes algorithms use these attributes without any modification.



We are implementing a Relational Naive Bayes algorithm and we are planning to aggregate such attributes in order to make them independent of the nested table record numbers.


Next we tried to predict some unseen cases, and here we face a huge problem.


Let's take two more tables of unseen cases:



Movie test table:

ID   Class
6    +
7    NULL
8    NULL



Actor test nested table:

ID   MovieID   Gender
1    6         F
2    6         M
3    6         F
4    6         F
16   7         F
17   7         M
18   7         F
19   7         F
20   7         F
21   8         M
22   8         M
23   8         F



Predicting the Class of movie 6 is not a problem, since that movie's actors were included in the training dataset, so when the records are mapped to attributes they already exist in the model. But when you try to predict movies 7 and 8, which have unseen actors, all the new attributes are simply ignored in the Algorithm::Predict call (in_ulCaseValues is zero!) because they do not exist in the model!
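For reference, the prediction being described would be issued roughly like this in DMX (model and data source names invented, tables as in the example):

SELECT t.[ID], Predict([Class])
FROM [MovieModel]
PREDICTION JOIN
SHAPE
    { OPENQUERY([MyDataSource], 'SELECT ID FROM MovieTest ORDER BY ID') }
APPEND
    ( { OPENQUERY([MyDataSource],
        'SELECT MovieID, Gender FROM ActorTest ORDER BY MovieID') }
      RELATE ID TO MovieID )
    AS [Actors] AS t
ON t.[Actors].[Gender] = [MovieModel].[Actors].[Gender]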



What is the solution?

View 3 Replies View Related

Data Mining :: Informational (Data Mining) - Decision Trees Found No Splits For Model

Sep 29, 2015

I followed the tutorial posted at [URL] ...

Everything was OK until the last step, where I had to process the mining structure, which resulted in a warning:

"Informational (Data mining): Decision Trees found no splits for model, Tbl Decision Tree Example."

What does this warning mean? How do I resolve it? Also, I only see the first level in the Mining Model Viewer; I don't see levels 2 and 3.

View 2 Replies View Related

Integration Services :: SSIS VB Script Loading Data Into Oracle DB Missing Some Data

Nov 10, 2015

I'm using a Script Component to load data into an Oracle DB due to a poor performance issue. Now I have found that it misses some data during the transmission. (The original post included screenshots comparing the SQL Server and Oracle results.)

DDL:

create table Person
(
BusinessEntityID Integer,
FirstName nvarchar2(50),
MiddleName nvarchar2(50),
LastName nvarchar2(50)
);

Result: (screenshot in the original post)

I followed this article: [URL] ....

VB Script: 
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

[Code] ..........

View 8 Replies View Related

Integration Services :: SSIS - Managing Data Integrity When Importing Sharepoint Data

Sep 28, 2015

I set up this package to import data from a SharePoint list into a SQL Server table. The primary key of my SQL table is mapped to the Title column of the SharePoint list. There is a possibility that duplicate values will be entered in the Title field of the SharePoint list, so when importing data into my table via SSIS, my package always errors out when it comes across duplicate values. How have others managed data integrity when importing from a SharePoint list with the Title column mapped to the primary key of a table?
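One common workaround (a suggestion, with invented column names) is to land the list in a keyless staging table first and deduplicate on the way into the keyed table:

-- Keep one row per Title and skip the rest.
INSERT INTO dbo.TargetTable (Title, Col1, Col2)
SELECT Title, Col1, Col2
FROM (
    SELECT Title, Col1, Col2,
           ROW_NUMBER() OVER (PARTITION BY Title ORDER BY Modified DESC) AS rn
    FROM dbo.StagingTable
) AS s
WHERE rn = 1;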

View 4 Replies View Related

Integration Services :: SSIS 2012 - Can't Drag Objects Or Resize In Control Or Data Flow

Feb 3, 2014

I recently upgraded to 2012 SP1 CU5 and have found the SSDT GUI for SSIS to be almost unusable. I can't drag or resize items; any time I try, they either automagically shrink to the tiniest possible size, shoot off to some extreme, or just shake uncontrollably. I didn't have these problems on previous versions (I don't remember which version it was).

Is there a fix for this?

View 9 Replies View Related

Integration Services :: How To Achieve Check Point Like Functionality In Data Flow Task Itself In SSIS

Jun 4, 2015

I have a huge amount of data that I am loading from Excel into a database table. After loading about 80 percent of the data I get an error and my package fails; it has lots of transformations and takes around 6 hours to process completely, so I don't want it to reload from the start. If I run it again, it should start from the record after the one where I got the error.
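SSIS checkpoints restart at the task level, not partway through a data flow, so a common pattern (a suggestion, with invented names) is a watermark: land the raw rows in a staging table with a row number first, then drive the expensive transformations from a query that resumes past the last committed row.

-- 1) Read the high-water mark into a package variable before the data flow.
SELECT ISNULL(MAX(LastLoadedRow), 0) AS LastLoadedRow FROM dbo.LoadWatermark;

-- 2) The source query resumes after it (? mapped to that variable).
SELECT *
FROM dbo.StagingRows
WHERE RowNumber > ?
ORDER BY RowNumber;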

View 3 Replies View Related

Integration Services :: DefaultBufferMaxRows - Is It Determined By Row Length Of Data Flow Task Source Or Destination

Oct 18, 2015

We have a single generic SSIS package that is used to import several hundred iSeries tables into SQL. I am not looking to rewrite the process. But I am looking for ways to improve performance.

I have tried retain same connection, maximum insert commit size, lock table (tablock), removed some large columns, played with the log file location and size, and now I am working to tweak the defaultbuffermaxrows.

To describe the data flow: there are six data flow tasks (DFTs) working at the same time. Each DFT has its own list of iSeries tables and columns and the corresponding generic SQL table names. Each DFT determines its list of tables based on the number of columns to import, so there is DFT30 (iSeries tables with 1-30 columns to import), DFT60 (31-60 columns to import), etc. The destination SQL tables are generically called Staging30, Staging60, etc. Each column in the generic staging tables is varchar(100). The DFTs are comprised of an OLE DB Source and an OLE DB Destination.

The OLE DB Source uses a SQL command from a variable to build a SELECT statement, over a connection manager that uses an IBM iAccess IBMDA400 provider. The SQL command ends up looking like this for DFT30. This specific example imports the iSeries table TDACLR, which has only two columns, so it is copied to the Staging30 table.

select TCREAS AS C1,TCDESC AS C2,0 AS C3,0 AS C4,0 AS C5,0 AS C6,0 AS C7,0 AS C8,0 AS C9,0 AS C10,0 AS C11,0 AS C12,0 AS C13,0 AS C14,0 AS C15,0 AS C16,0 AS C17,0 AS C18,0 AS C19,0 AS C20,0 AS C21,0 AS C22,0 AS C23,0 AS C24,0 AS C25,0 AS C26,0 AS C27,0 AS
C28,0 AS C29,0 AS C30,''TDACLR'' AS T0 from Store01.TDACLR

The OLE DB Source variable value looks like the following (not showing all 30 columns):

select cast(0 AS varchar(100)) AS C1,cast(0 AS varchar(100)) AS C2,cast(0 AS varchar(100)) AS C3,cast(0 AS varchar(100)) AS C4,cast(0 AS varchar(100)) AS C5, ... cast(0 AS varchar(100)) AS C30.

The OLE DB Destination uses OpenRowSet Using FastLoad From Variable. The insert into Staging30 ends up looking like this:

insert bulk STAGE30([C1] varchar(100), [C2] varchar(100), [C3] varchar(100), [C4] varchar(100), [C5] varchar(100), ... , [C30] varchar(100), [T0] varchar(20))

Of course we then copy and transform the Staging30 data to the SQL table that equals T0.

But back to DefaultBufferMaxRows. Previously the DFTs had the default values of 10000 for DefaultBufferMaxRows and 10485760 for DefaultBufferSize. I added a SQL task to SUM the iSeries column sizes (TCREAS and TCDESC in this example) and set DefaultBufferMaxRows by dividing the SUM of the columns' max_length into 10485760, but I did not see a performance improvement. Do you think that redefining the columns as varchar(100) for the insert is significant? Should I perhaps SUM the actual number of columns (2) as 2x100, or SUM the full 30x100?
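On the arithmetic: a Staging30 row is 30 varchar(100) columns plus a varchar(20), so its worst case is about 3,020 bytes, and the default 10 MB buffer tops out around 3,472 such rows:

SELECT 10485760 / ((30 * 100) + 20) AS MaxRowsPerBuffer;  -- = 3472

That may also be why sizing from the iSeries max_length sums showed no gain: the pipeline allocates buffers from the data flow's own column widths (the varchar(100) casts), not from the source column widths.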

View 4 Replies View Related

Integration Services :: SSIS Data Flow Items TAB Missing On Visual Studio 2013?

Sep 22, 2015

Basically I'm trying to create an SSIS workflow to download SharePoint list data to SQL Server on a schedule of some kind. Do we actually have to use the GAC install approach in order to get the SharePoint List Destination and SharePoint List Source entries to appear among the SSIS project workflow entities?

View 4 Replies View Related

Mapping Of SQL Server Data Types To Integration Services Data Type

Oct 14, 2005

Does anyone know of any cross-references between SQL Server data types and the new data types introduced with SQL Server Integration Services? 
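There was no official cross-reference at the time, but a few of the common mappings, from memory (worth verifying against Books Online):

SQL Server type       SSIS pipeline type
int                   DT_I4
bigint                DT_I8
varchar               DT_STR
nvarchar              DT_WSTR
datetime              DT_DBTIMESTAMP
decimal / numeric     DT_NUMERIC
uniqueidentifier      DT_GUID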

View 6 Replies View Related

Integration Services :: Best Way To Value Data Column In Data Pump From A Flat File

Aug 28, 2015

I have to populate [CreateDate] in the data pump from my Flat File Source into my OLE DB Destination SQL Server table. Should I use a variable within the SSIS package, or a Derived Column task within the data flow between the Flat File Source and the OLE DB Destination?
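Either works; a Derived Column with GETDATE() is evaluated as rows flow through, while a package variable is fixed when it is assigned. A third option (a suggestion, table name invented) is a default constraint, so the column can simply be left unmapped in the destination:

-- The table stamps CreateDate itself whenever the column is left unmapped.
ALTER TABLE dbo.TargetTable
ADD CONSTRAINT DF_TargetTable_CreateDate DEFAULT (GETDATE()) FOR CreateDate;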

View 2 Replies View Related

How Can We Compare Other Data Mining Software Packages With SQL Server 2005 Data Mining?

Feb 23, 2007

Hi, all experts here,

I would like to know if there is any way to integrate third-party data mining packages with SQL Server 2005 data mining algorithms, so that we can run a comparison among all of them to get the best results for training models.

I am looking forward to hearing from you.

Thanks a lot.

With best regards,

Yours sincerely,

View 1 Replies View Related






