Approach Help: Loading Data From Flat Files Into A Relational Table When Some Columns Contain Only Spaces
Sep 18, 2007
Hi,
My input is a flat file source, and the data contains spaces in a few columns. These columns are linked to another table as a foreign key, so when I try loading them into a relational structure a foreign key violation occurs. Is there a standard method to replace these spaces?
What approach should I take so that the data gets loaded into a relational structure?
For example:

Name  Age  Salary  Address
dsds  23           fghghgh

Salary  Description  Level
2345    nnncncn      4

Salary is just the column used in this example; in the real scenario the datatype is char. What approach should I take to load the data while cleansing the spaces in SSIS?
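One common way to handle this (a sketch, not the only option; dbo.Staging and the Salary column are hypothetical names) is to land the file in a staging table first and turn space-only values into NULL before loading the relational tables, so the foreign key column is NULL rather than an unmatched blank:

    -- blank/space-only char values trim down to '', which we replace with NULL
    UPDATE dbo.Staging
    SET Salary = NULL
    WHERE LTRIM(RTRIM(Salary)) = '';

The same replacement can also be done inside the data flow with a Derived Column transformation before the destination, if you prefer to avoid the staging pass.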
I was testing my packages today, and they ran successfully when I didn't have any valid data in the flat file. One reason was that, since no rows were returned from the flat file, none of the validation script components raised any errors.
How do I count the number of rows read from the flat file in the package, and continue only if at least one row was read?
I tried using a Conditional Split, but since the row count value isn't available until the Data Flow task runs, this didn't help.
Is it best for me to have two Data Flow tasks, one that returns the count of records from the flat file and another that starts only if there are any rows? My problem then is: if I have rows to process, how do I transfer the flat file data to validate from DataFlowTask1 to DataFlowTask2?
I have a Script Task which counts the rows and sets the TaskResult, but once the TaskResult is success, how do I use the values read in DataFlowTask1?
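One common pattern for this (a sketch; the variable name User::RowCount is an assumption): in DataFlowTask1, add a Row Count transformation that writes the number of rows read into an integer package variable, then set the precedence constraint between DataFlowTask1 and DataFlowTask2 to "Expression and Constraint" with an expression such as:

    @[User::RowCount] > 0

Rather than passing pipeline data directly between the two Data Flow tasks, DataFlowTask2 would normally re-read the flat file itself, or DataFlowTask1 can land the validated rows in a staging table or Raw File destination that DataFlowTask2 then uses as its source.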
Can anyone help me with how to migrate flat files to MS Access? I couldn't find the provider I need to use. Should I use the OLE DB destination or some other? Please help.
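There is no dedicated Access destination; the usual route for .mdb files is an OLE DB destination whose connection manager uses the Jet 4.0 OLE DB provider, with a connection string along these lines (the path is a placeholder):

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\path\to\database.mdb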
I am trying to transfer data from an OLE DB source to a flat file destination. My client needs a flat file with a record length of 80 bytes. What does this mean? Does it refer to the output width, or does it mean the flat file format should be fixed length? Please explain.
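"Record length 80 bytes" normally means a fixed-width format: every record is exactly 80 characters, with each column padded out to a fixed offset. In the Flat File Connection Manager you would set Format to "Fixed width" and size the columns so they total 80. As a rough illustration of the idea in T-SQL (column names and widths here are made up):

    -- CAST to CHAR(n) pads each value with trailing spaces to a fixed width
    SELECT CAST(CustomerID   AS CHAR(10))
         + CAST(CustomerName AS CHAR(40))
         + CAST(City         AS CHAR(30)) AS Record80   -- 10 + 40 + 30 = 80 bytes
    FROM dbo.Customers;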
I have one SSIS package; when I executed it I got the following error messages:
Error 0xc02020a1: Data Flow Task: Data conversion failed. The data conversion for column "column58" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.". (SQL Server Import and Export Wizard)
Error 0xc020902a: Data Flow Task: The "output column "column58" (250)" failed because truncation occurred, and the truncation row disposition on "output column "column58" (250)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component. (SQL Server Import and Export Wizard)
Error 0xc0202092: Data Flow Task: An error occurred while processing file "D:\ok_files\musercm.txt" on data row 1. (SQL Server Import and Export Wizard)
[DTS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "Flat File Source" (1) returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
[DTS.Pipeline] Error: SSIS Error Code DTS_E_THREADFAILED. Thread "WorkThread0" has exited with error code 0xC0047039. There may be error messages posted before this with more information on why the thread has exited.
I am using a flat file as the source and SQL 2005 as the destination. My flat file has Tab as the column delimiter, {CR}{LF} as the row delimiter, and ["] as the text qualifier.
I tried the advanced settings options for flat files, but they didn't work.
To solve this problem I am also using a Script Component for column 58.
I am also getting more than 60 truncation warnings.
Hi, I am loading data into tables with the constraints on and redirecting the error rows into a separate table. Is there a way to capture the error rows when loading the data directly without constraints, then adding the constraints back with an Execute SQL Task and redirecting the violating rows to the error table? That approach would make the loads quicker; the approach I am using now works on a row-by-row basis. And if I drop the constraints, load the data, and then add the constraints back, will this deposit the same error rows as the current approach? Please send me your suggestions.
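A set-based sketch of that idea (all object and constraint names here are hypothetical): disable the foreign key, bulk load, capture and remove the violators in one pass, then re-enable with WITH CHECK. Because the violators are found by the same foreign key test, this should identify the same rows the row-by-row redirect does, as long as nothing else changes between the loads:

    ALTER TABLE dbo.FactSales NOCHECK CONSTRAINT FK_FactSales_DimCustomer;

    -- ... fast bulk load into dbo.FactSales goes here ...

    -- capture rows whose key has no match in the dimension
    INSERT INTO dbo.FactSales_Errors
    SELECT f.*
    FROM dbo.FactSales f
    LEFT JOIN dbo.DimCustomer c ON c.CustomerKey = f.CustomerKey
    WHERE c.CustomerKey IS NULL;

    -- remove them so the constraint can be trusted again
    DELETE f
    FROM dbo.FactSales f
    LEFT JOIN dbo.DimCustomer c ON c.CustomerKey = f.CustomerKey
    WHERE c.CustomerKey IS NULL;

    ALTER TABLE dbo.FactSales WITH CHECK CHECK CONSTRAINT FK_FactSales_DimCustomer;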
I am loading data using SSIS 2008 from a table in a SQL Server 2008 DB to an Excel 97 sheet pre-defined with column headers. All the columns in Excel have the 'Text' format property, and the columns in the SQL Server table are defined as nvarchar. One of the columns has trailing spaces in a few rows in the DB, but after exporting to Excel 97 the spaces are gone. We need to retain the whitespace in the column values. How can we do that?
How can I load CSV file data into a SQL Server table? I know there are ways like BULK INSERT and others to load CSV data into a table, but in my case the table doesn't exist and has to be created at run time. With a simple insert into a temp table we do something like SELECT * INTO #temp FROM tablename, and that creates the temp table. I need something like that which creates the table and loads the data into it, because the CSV files will have different numbers of columns and different names, so I cannot create the table structure in advance. I have to create the table at run time.
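One option (a sketch; it assumes the Jet/ACE text provider is installed, 'Ad Hoc Distributed Queries' is enabled, and the folder/file names are placeholders) is to let the text provider infer the columns and let SELECT ... INTO create the table at run time:

    -- the provider reads the CSV header (HDR=YES) and infers column names/types;
    -- SELECT ... INTO then creates dbo.ImportedCsv with that inferred structure
    SELECT *
    INTO dbo.ImportedCsv
    FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                    'Text;Database=C:\ImportDir\;HDR=YES',
                    'SELECT * FROM [data.csv]');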
I want to load flat files into a single table, but the flat files can have a variable number of columns, up to a maximum of 10. The table in my database has 10 columns, so if I load a flat file having 6 columns, the rest of the columns in the table should be null. I don't want to use a Script task for this, as I am not good at writing C# code.
So I have been trying to get my SQL query to work for a large database that I have. I have (let's say) two tables, Table_One and Table_Two. Table_One has three columns: Type, Animal and TestID, and Table_Two has two columns: Test_Name and Test_ID. An example with values is below:
In Table_One all types come under one column (Type), and the values for all types (Mammal, Fish, Bird, Reptile) come under another column (Animal). Table_One and Table_Two can be linked by Test_ID.
I am trying to create a table such as shown below:
This should be my final table. The approach I am currently using is to make multiple instances of Table_One and use joins to form this final table, so the columns Bird, Reptile, Mammal and Fish all come from a different copy of Table_One.
For example:
SELECT Test_Name AS 'Test_Name',
       Table_Bird.Animal AS 'Birds',
       Table_Mammal.Animal AS 'Mammal',
       Table_Reptile.Animal AS 'Reptile',
       Table_Fish.Animal AS 'Fish'
FROM Table_One
[Code] .....
The problem with this query is that it only works when all entries for Birds, Mammals, Reptiles and Fish have some value. If one field is empty, as for Test_Two or Test_Three, it doesn't return that record. I used OR instead of AND in the WHERE clause, but that didn't work either.
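A single pass with a LEFT JOIN and conditional aggregation avoids the multiple self-joins entirely and keeps tests that are missing one or more animal types (a sketch; the table and column names follow the example above):

    -- one row per test; missing types simply come back as NULL
    SELECT t2.Test_Name,
           MAX(CASE WHEN t1.Type = 'Bird'    THEN t1.Animal END) AS Bird,
           MAX(CASE WHEN t1.Type = 'Mammal'  THEN t1.Animal END) AS Mammal,
           MAX(CASE WHEN t1.Type = 'Reptile' THEN t1.Animal END) AS Reptile,
           MAX(CASE WHEN t1.Type = 'Fish'    THEN t1.Animal END) AS Fish
    FROM Table_Two t2
    LEFT JOIN Table_One t1 ON t1.TestID = t2.Test_ID
    GROUP BY t2.Test_Name;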
I have a transaction table with about 40 crore (400 million) rows in the source. It doesn't have timestamp or unique key columns; it has only Bill_Month and Bill_Year columns. For loading this table into staging, I added a new datetime column, building a bill_date from those two columns with the day defaulted to 01. Then:
* First we delete the last 3 months of data from the staging tables.
* Get the last 3 months of data from the source table.
* Load those 3 months of data from source to the staging table.
We do this because we only get updates for the last three months of data. Now I have to include this transaction table as a fact table in the DW. What will be the best practice for loading the fact table by picking data from the staging table? We also have to look up the dimensions for foreign keys.
* Should I implement the same method of deleting the last 3 months of records and loading them again, as in the sketch below?
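Since the source only ever changes the last three months, repeating the same delete-and-reload window on the fact table is a defensible pattern, and the dimension lookups can be ordinary joins. A sketch with hypothetical object names (note the window boundary should really align to whole bill months rather than the raw GETDATE() shown here):

    -- clear the rolling window in the fact table
    DELETE FROM dw.FactBilling
    WHERE Bill_Date >= DATEADD(MONTH, -3, GETDATE());

    -- reload it from staging, resolving dimension foreign keys with joins
    INSERT INTO dw.FactBilling (DateKey, CustomerKey, Amount)
    SELECT d.DateKey, c.CustomerKey, s.Amount
    FROM stg.Transactions s
    JOIN dw.DimDate     d ON d.CalendarDate = s.Bill_Date
    JOIN dw.DimCustomer c ON c.CustomerCode = s.Customer_Code
    WHERE s.Bill_Date >= DATEADD(MONTH, -3, GETDATE());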
I have a business need to create a report by querying data from an MS SQL 2008 database and displaying the result to users on a web page. The report initially has 6 columns of data, and 2 of the 6 contain JSON data, so the users requested that those 2 JSON columns be parsed into 15 additional columns (the first JSON column has 8 key/value pairs and the second has 7). Here is what I have done so far:
I found a table-valued function (fnSplitJson2) from this link [URL]. Using this function I can parse a column of JSON data into a table. So when I use the function against the first JSON column in my query (with CROSS APPLY), I get the right data back, but I get 8 rows for each row in my table. The reason for this side effect is that the function returns a table of 8 rows (8 key/value pairs) for each JSON string it parses.
1. First question: how do I modify my current query (see below) so that for each row in my table I get back one row with 19 columns?
SELECT A.ITEM1,A.ITEM2,A.ITEM3,A.ITEM4, B.* FROM PRODUCT A CROSS APPLY fnSplitJson2(A.ITEM5,NULL) B
I updated my query (see below) to call the function twice within the CROSS APPLY clause, and I got this error: "The multi-part identifier "A.ITEM6" could not be bound."
2. Second question: how do I get around this error?
SELECT A.ITEM1,A.ITEM2,A.ITEM3,A.ITEM4, B.*, C.* FROM PRODUCT A CROSS APPLY fnSplitJson2(A.ITEM5,NULL) B, fnSplitJson2(A.ITEM6,NULL) C
I am using Microsoft SQL Server 2008 R2 version. Windows 7 desktop.
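For the second question: the comma after B starts a new join in which A is no longer in scope; chaining a second CROSS APPLY keeps A visible to both function calls:

    SELECT A.ITEM1, A.ITEM2, A.ITEM3, A.ITEM4, B.*, C.*
    FROM PRODUCT A
    CROSS APPLY fnSplitJson2(A.ITEM5, NULL) B
    CROSS APPLY fnSplitJson2(A.ITEM6, NULL) C;

For the first question, the 8 rows per product have to be pivoted back into columns, e.g. with conditional aggregation over whatever key/value columns fnSplitJson2 returns (the column names propertyName/propertyValue and the key 'price' below are assumptions; repeat the MAX(CASE ...) line once per key):

    SELECT A.ITEM1,
           MAX(CASE WHEN B.propertyName = 'price' THEN B.propertyValue END) AS price
    FROM PRODUCT A
    CROSS APPLY fnSplitJson2(A.ITEM5, NULL) B
    GROUP BY A.ITEM1;   -- group by every A column you select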
Does anyone have an example of performing a source-to-destination data load where another SQL SELECT statement controls the source statement? What I would like to do is split up a huge data move by looping on the source and modifying the source SELECT's WHERE clause using values from a control table. I understand how to modify the source statement by using an expression with variables; now I'm trying to figure out how to loop through a control table to drive the source task.
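One way to sketch this (all names are assumptions): keep the key ranges in a control table, read it with an Execute SQL Task (ResultSet = Full result set) into an Object variable, and enumerate it with a Foreach Loop (ADO enumerator) that maps each row to User::RangeStart / User::RangeEnd. The control table:

    CREATE TABLE dbo.LoadControl (
        RangeStart INT NOT NULL,
        RangeEnd   INT NOT NULL
    );

Inside the loop, the data flow's OLE DB Source uses "SQL command from variable", with the variable built by an expression such as:

    "SELECT * FROM dbo.BigTable WHERE KeyCol >= "
        + (DT_WSTR, 12) @[User::RangeStart]
        + " AND KeyCol < "
        + (DT_WSTR, 12) @[User::RangeEnd]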
How's everyone doing with 2005? I still feel like my heels are dragging a bit. The "knowledge" out there still seems a bit sparse. Am I right?
I'm about 1/5 of the way through a BI book and decided the course will probably be necessary as well: http://www.kimballgroup.com/html/kucourseMDWD.html There seems to be a lot to come to grips with. So that's all paid for.
It's been years since I've been on a course of any kind. Too busy working! What's the etiquette? Trousers and T-shirt? Tie? Sandals and three-quarter lengths?
This particular course seems to cover the "general overall approach" - which I am pretty sure is a necessary part of what you need to be able to do. The book related to the course (which I have already bought) seems really helpful and valuable.
But I don't think it will cover the nitty-gritty of SSIS - where most of the grunt work happens. They have another course for that, but I can't wait that long. You can spend your whole life just reading books!
Are all the new little things in 2005 SSIS just a little too specialised? A little too clever? Do they limit us, or empower us? Can we use them to get the job done? Or will it be a case again of "clever workarounds" when we travel far down a road and find out (when it's too late and we are committed) that it can't provide us with what we want? Is it really saving us time? Are we better off writing the scripts ourselves?
------------------------
Me: What do you want to know from your data warehouse?
Client: Err...Emm...Everything
Me: OK, that's great. That's all I need to know. I'll see you when it's done.
I have a few questions for you guys. I have a client application that can be offline or online. While offline, records can be added that need to be synced to production later.
I will use RDA to pull the tables down, and this is working fine. Now what if I have multiple tables where I want a foreign key relationship?
From everything I've read, with RDA I can only pull down one table at a time. Now say I create a constraint after pulling the two or more tables down, and while in offline mode I create a new record in two separate tables with a foreign key/primary key relationship.
When I do the push to the server, will it automatically update the (local) foreign key reference to the right one on the production server? Or will I get a duplicate primary key error? On the production server the primary key will be different because of the identity column. This is very important because I will have multiple clients.
We are working on a data warehouse app. The DW has been loaded with transactional data from the start of September, and we want to refresh the DW with a full load from the original source. This full load will consist largely of the same records that we loaded initially into the DW, but some records will be new and others will have changed.
During the load I want to direct input records NOT already in the DW to a "mods" table and ignore those input records that already exist in the DW. Can SSIS help out with this task?
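It can: the usual SSIS tool is a Lookup against the DW's business key with unmatched rows redirected (in 2005 that means setting the Lookup's error output to "Redirect row" and sending it to the "mods" table). The same split can also be done set-based after staging the input, sketched here with hypothetical names:

    -- only rows whose business key is not already in the DW go to mods
    INSERT INTO dbo.Mods (BusinessKey, Col1, Col2)
    SELECT s.BusinessKey, s.Col1, s.Col2
    FROM stg.SourceRows s
    LEFT JOIN dw.FactRows f ON f.BusinessKey = s.BusinessKey
    WHERE f.BusinessKey IS NULL;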
I have a requirement to bulk load CSV files into a SQL table. Sometimes some columns do not come in the CSV file (sometimes 100 columns, sometimes 80), and then the package fails. How do I create a table dynamically based on the CSV file structure?
The column prod_Master.M2_Pct is defined as float. Yet for some of the rows, the value in [% of Total Target] comes up as NULL even though there is a number in the prod.Measure column, and there is valid non-null data in prod_Master.M2_Pct.
I tried LTRIM(RTRIM(prod.Measure)), but no change.
USE ProdDB
SELECT TOP (100) PERCENT
       dbo.prod.ProdNo AS [Prod No],
       dbo.prod.ProdName AS [Prod Name],
       CASE WHEN dbo.prod.Measure = 'P1' THEN dbo.prod_Master.P1_Pct
            WHEN dbo.prod.Measure = 'P2' THEN dbo.prod_Master.P2_Pct
            WHEN dbo.prod.Measure = 'P3' THEN dbo.prod_Master.P3_Pct
            WHEN dbo.prod.Measure = 'P4' THEN dbo.prod_Master.P4_Pct
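One thing worth checking (a guess based on the column named in the question): the CASE only tests 'P1' through 'P4', and a CASE with no matching WHEN and no ELSE returns NULL. So any row whose Measure value is, say, 'M2' produces NULL in [% of Total Target] even though prod_Master.M2_Pct holds valid data. Adding the missing branch would look like:

    CASE dbo.prod.Measure
        WHEN 'P1' THEN dbo.prod_Master.P1_Pct
        WHEN 'P2' THEN dbo.prod_Master.P2_Pct
        WHEN 'P3' THEN dbo.prod_Master.P3_Pct
        WHEN 'P4' THEN dbo.prod_Master.P4_Pct
        WHEN 'M2' THEN dbo.prod_Master.M2_Pct   -- the column the question says holds data
    END AS [% of Total Target]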
I am just now starting the switch from .NET 1.1 to .NET 2.0. I really like the new way of using the SqlDataSource and setting up views declaratively, as opposed to doing it all in code, which brings me to my question.
In some of my applications I have a single stored procedure return multiple result sets to a single DataSet where I have a DataRelation set up. Then I can have nested DataGrids that use the GetChildRows() method to filter the results and display the hierarchical data. I would like to do something similar with the SqlDataSource and GridViews, but I haven't found a way to get multiple result sets.
One thought I had was to create a strongly typed DataSet and then use the ObjectDataSource object, but I still didn't see a way to get child rows out of the data source. I've seen an example that uses a <FilterParameter> to get nested data, but it makes an extra trip to the server for each parent item, as it just puts an extra parameter in the WHERE clause of the query.
Hi, please help with this simple problem. DTS transfer or any other method? I have a Customer_Order table and a Customer_Order_Details table. For OrderID = 1, I have 3 rows of order details. When I transfer the Customer_Order row for OrderID 1 in DTS, the system should transfer the order row as well as the order details rows for OrderID = 1. How do I customize this in DTS, or is there any other way to transfer this data from the source DB to the destination DB?
KAMAL KUMAR V
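In plain T-SQL (a sketch; the SourceDB/DestDB names are placeholders for however the two databases are reachable from one connection), the parent row goes first so the foreign key on the details is satisfied:

    -- parent first, then children, filtered by the same OrderID
    INSERT INTO DestDB.dbo.Customer_Order
    SELECT * FROM SourceDB.dbo.Customer_Order WHERE OrderID = 1;

    INSERT INTO DestDB.dbo.Customer_Order_Details
    SELECT * FROM SourceDB.dbo.Customer_Order_Details WHERE OrderID = 1;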
I have developed some packages to load data into "Fact" tables in the data warehouse. Some packages are OK, others are not. What is the problem? Some packages load fact tables with lots of "Lookup - Data Flow Transformation" components in the data flow task (lookups against dimension tables), and they are very, very slow - too slow to be chosen as a solution.
Do you have any other solutions that avoid using the "Lookup - Data Flow Transformation"? Any other solution (SSIS, T-SQL and so on) is welcome to speed up the fact table loading process.
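If the dimensions are small enough, setting each Lookup to full cache usually helps on its own. Otherwise the lookups can be pushed into set-based T-SQL after staging the source rows, along these lines (hypothetical names; unmatched rows are mapped to an "unknown" member rather than failing row by row):

    INSERT INTO dw.FactOrders (ProductKey, DateKey, Quantity)
    SELECT ISNULL(p.ProductKey, -1),   -- -1 = unknown dimension member
           ISNULL(d.DateKey, -1),
           s.Quantity
    FROM stg.Orders s
    LEFT JOIN dw.DimProduct p ON p.ProductCode  = s.ProductCode
    LEFT JOIN dw.DimDate    d ON d.CalendarDate = s.OrderDate;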
I want to encrypt certain data like passwords, SSNs, credit card info etc. before saving it in the database. This encrypted data should also be queryable using standard SQL statements like:
select * from users where userid=454 and password = 'encrypted data'
The mechanism to encrypt data could be in a .NET application. The code that does encryption/decryption should also be protected so that it doesn't work if it falls into the wrong hands.
Can anyone suggest the best way to accomplish the above?
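A common split (sketched with hypothetical column names; HASHBYTES, ENCRYPTBYPASSPHRASE and DECRYPTBYPASSPHRASE are available from SQL Server 2005 on): store a one-way hash for the password, so the lookup stays a plain equality test, and use reversible cell-level encryption only for data you must read back, such as the SSN or card number:

    -- password: compare hashes, never the plaintext
    SELECT *
    FROM users
    WHERE userid = 454
      AND password_hash = HASHBYTES('SHA1', @password);

    -- SSN/card data: reversible encryption keyed by a passphrase
    UPDATE users
    SET ssn_enc = ENCRYPTBYPASSPHRASE(@passphrase, ssn_plain)
    WHERE userid = 454;

    SELECT CONVERT(VARCHAR(11), DECRYPTBYPASSPHRASE(@passphrase, ssn_enc)) AS ssn
    FROM users
    WHERE userid = 454;

Keeping the passphrase or keys out of the database (e.g. in the protected .NET tier) is what addresses the "wrong hands" concern.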
I wonder what would be the best way (and, to be honest, how to do it at all) to perform data normalization with SSIS. The scenario is as follows: I get a plain table with several columns in it.
* Some columns can be copied straight into the destination table.
* Some (string) columns should be looked up in another table to get an ID.
* On success, just replace the string with the ID.
* On failure, create a new record in the lookup table and return the newly created ID.
Thanks for any ideas, and maybe short samples.
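A set-based sketch of that lookup step (all names hypothetical): first insert any strings the lookup table doesn't know yet, after which a plain join resolves every string to its ID:

    -- 1) add the strings that are missing from the lookup table
    INSERT INTO dbo.NameLookup (Name)
    SELECT DISTINCT s.Name
    FROM stg.Source s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.NameLookup l WHERE l.Name = s.Name);

    -- 2) now every string resolves, so the join returns an ID for each row
    INSERT INTO dbo.Destination (NameID, OtherCol)
    SELECT l.NameID, s.OtherCol
    FROM stg.Source s
    JOIN dbo.NameLookup l ON l.Name = s.Name;

In pure SSIS terms, this is a Lookup whose unmatched rows are redirected to a branch that inserts into the lookup table, with the matched and newly inserted rows converging on the destination.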
Here I will describe my problem.
1. We are loading a large amount of data from the database on a background thread which starts in the Application_Start event in the global.asax.cs file. The data is then cached for subsequent requests to improve performance.
2. Now when we put the application on a web farm/garden, it is not able to load the application.
3. We are sending requests to the servers through a router kind of application.
4. This application is working fine in a single-server environment.
Hi, I am trying to use this query to pull all the publications stored in the database and all the authors contributing to each publication (a one-to-many relationship). I am trying to use a subquery so that I can display the results on one row of a GridView (including a consecutive list of all the authors). However I am receiving this error: Incorrect keyword near the word SET?
Maybe I need to add a temp column in the subquery to pull all the related authors for a single publication, but I don't know the SQL for this. Can anyone help?
Thanks

SELECT ISNULL(Publication.month, '') + ' ' + ISNULL(CONVERT(nvarchar, Publication.year), '') AS SingleColumn,
       Publication.publicationID,
       Publication.title
FROM Publication
WHERE Publication.publicationID IN (SELECT (CONVERT(nvarchar, Authors.authorName))
                                    FROM Authors
                                    INNER JOIN PublicationAuthors ON Authors.authorID = PublicationAuthors.authorID)
  AND Publication.typeID IN (SELECT PublicationType.typeName
                             FROM PublicationType
                             INNER JOIN PublicationType ON Publication.typeID = PublicationType.typeID
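For collapsing the authors onto one row, a correlated FOR XML PATH subquery (works on SQL 2005 and later) is the usual trick. A sketch, assuming PublicationAuthors carries a publicationID column linking it to Publication:

    SELECT p.publicationID,
           p.title,
           ISNULL(p.[month], '') + ' ' + ISNULL(CONVERT(NVARCHAR(10), p.[year]), '') AS SingleColumn,
           STUFF((SELECT ', ' + a.authorName
                  FROM PublicationAuthors pa
                  JOIN Authors a ON a.authorID = pa.authorID
                  WHERE pa.publicationID = p.publicationID
                  FOR XML PATH('')), 1, 2, '') AS AuthorList   -- STUFF strips the leading ', '
    FROM Publication p;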
Hello everyone, this is my first post here so hopefully I am not asking a common question.
I am trying to create a flat dataset in SQL 2005. Basically I run a query and I get multiple rows for the same primary key. The query I am running is quite large and has several different tables connected to it, here is a small sample of what it looks like...
Typeid (Primary Key)  Individual  Address
1                     Sam         912 Ave. J
1                     Sam         913 Ave. Q
1                     Sam         914 Ave. R
2                     Mike        1000 Ave. O
3                     Jill        1001 Ave. O
I want it to kind of look like this:

TypeID  Individual  Address_1    Address_2    Address_3
1       Sam         912 Ave. J   913 Ave. Q   914 Ave. R
2       Mike        1000 Ave. O
3       Jill        1001 Ave. O
As I said before, this query is pretty big and has several fields like Address where multiple rows are produced for one primary key.
If it is not possible to do this in SQL 2005, is there a program that may be able to? Right now we are using SPSS as sort of a band-aid: we run the query in small portions like the one in the sample and then restructure the data in sections, but this takes several hours.
Anyways, thanks for any help that you may be able to provide.
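This can be done in SQL 2005 with ROW_NUMBER() plus conditional aggregation. A sketch: wrap the existing big query as the derived table (dbo.YourBigQuery below is a placeholder for it), and extend the MAX(CASE ...) list to the maximum number of addresses expected:

    SELECT TypeID,
           Individual,
           MAX(CASE WHEN rn = 1 THEN Address END) AS Address_1,
           MAX(CASE WHEN rn = 2 THEN Address END) AS Address_2,
           MAX(CASE WHEN rn = 3 THEN Address END) AS Address_3
    FROM (SELECT TypeID, Individual, Address,
                 ROW_NUMBER() OVER (PARTITION BY TypeID ORDER BY Address) AS rn
          FROM dbo.YourBigQuery) AS t   -- your existing query as a view/derived table
    GROUP BY TypeID, Individual;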
I have played around with SSIS, in addition to reading an SSIS book front to back, but I am still a little confused as to how to import an XML file with relational data.
Basically I want to import the XML data into three tables: categories, products and fields. A product can belong to one or more categories and has one or more fields which store information about the product.
Using the XML Source component I can load the XML from the file, but I can only output one section (category, product or field) at a time. Since the relationships are inferred from the hierarchical structure of the XML (e.g. the fields don't store an ID of the product they belong to), I am not sure how to import the relationships into my tables.
If anyone has any tips on how I can go about that, it would be most appreciated :)