Fuzzy Grouping Matching Nulls To Empty Strings/spaces

May 30, 2007

Will the fuzzy grouping task match a null value to an empty string (or spaces)? I've got 5 columns I'm matching on, and one of them may be null for certain rows but an empty string for others. Given the 4 other columns may match, will this difference stop similar columns being grouped together?

(Someone's modified my grouped data since it was deduped, which takes a while, and I'm hoping for a quick answer on this).

Thanks in advance.

Ben
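
One way to sidestep the question, offered as a rough sketch and not as a statement of how the Fuzzy Grouping transformation itself treats NULLs: normalize the column upstream so that NULL, empty strings and all-space strings become one value before any matching happens. The function name `normalize_blank` and the sample rows are illustrative assumptions, not part of the original post.

```python
from typing import Optional


def normalize_blank(value: Optional[str]) -> str:
    """Map None, '' and whitespace-only strings to a single empty string.

    Non-blank values are returned unchanged.
    """
    if value is None or value.strip() == "":
        return ""
    return value


# Hypothetical rows: (matched column, column that may be NULL / '' / spaces)
rows = [
    ("Smith", None),
    ("Smith", ""),
    ("Smith", "   "),
    ("Smith", "Apt 4"),
]

cleaned = [(name, normalize_blank(extra)) for name, extra in rows]
for row in cleaned:
    print(row)
```

In an SSIS data flow the same idea would usually be expressed in a Derived Column placed before the grouping step, so the comparison never sees two different kinds of "blank".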

View 3 Replies

Do Not Keep NULLS Using SSIS Bulk Insert Task - Insert Empty Strings Instead Of NULLS

May 15, 2008

I have two SSIS packages that import from the same flat file into the same SQL 2005 table. I have one flat file connection (to a comma delimited file) and one OLE DB connection (to a SQL 2005 Database). Both packages use these same two Connection Managers. The SQL table allows NULL values for all fields. The flat file has "empty values" (i.e., ,"", ) for certain columns.

The first package uses the Data Flow Task with the "Keep nulls" property of the OLE DB Destination Editor unchecked. The columns in the source and destination are identically named thus the mapping is automatically assigned and is mapped based on ordinal position (which is equivalent to the mapping using Bulk Insert). When this task is executed no null values are inserted into the SQL table for the "empty values" from the flat file. Empty string values are inserted instead of NULL.

The second package uses the Bulk Insert Task with the "KeepNulls" property for the task (shown in the Properties pane when the task is selected in the Control Flow window) set to "False". When the task is executed NULL values are inserted into the SQL table for the "empty values" from the flat file.

So using the Data Flow Task " " (i.e., blank) is inserted. Using the Bulk Insert Task NULL is inserted (i.e., nothing is inserted, the field is skipped, the value for the record is omitted).

I want to have the exact same behavior on my data in the Bulk Insert Task as I do with the Data Flow Task.

Using the Bulk Insert Task, what must I do to have the Empty String values inserted into the SQL table where there is an "empty value" in the flat file? Why & how does this occur automatically in the Data Flow Task?

From a SQL Profiler trace comparison of the two methods I do not see where the syntax of the insert command or the statements for the preceding captured steps has dictated this change in the behavior of the inserted "" value for the recordset. Please help me understand what is going on here and how to accomplish this using the Bulk Insert Task.

View 2 Replies View Related

Storage And Performance Of NULLs And Empty Strings (was Noobish Question)

Feb 11, 2005

Am I right in assuming that when I have a column where all fields contain NULL, this does not increase the total data storage size of my database? Also, what kind of impact would it have on performance?

And what if I inserted "" in varchar columns? I would think the increase in size would be marginal?

The reason I'm asking is that I want to use an existing table and stored procedures for another purpose, but only need half of the columns. But it would significantly simplify application development.

View 3 Replies View Related

Difference Between The Fuzzy Lookup And Fuzzy Grouping In SSIS

Aug 14, 2007

Dear Friends,

I think Fuzzy Lookup compares the columns we are mapping by spelling (it rejects a record if at least one letter is different in any mapped column), e.g. RAVI != REVI.

What is Fuzzy Grouping? Please explain.

regards
koti

View 3 Replies View Related

Fuzzy Phrase Matching

Oct 3, 2007

A column in my database contains phrases such as "Extreme Golf: The Showdown" or "Welcome to Happy Land". I need to write a search engine so that users can type in phrases such as "Golf Extreme Showdown" or "Happy Land" and the correct, or closest matched, results will be returned. I don't need variations of words, just a phrase keyword match based search. I know I could do this by using multiple LIKE %% statements OR'd together, but this would be too performance intensive. So, I have heard I should use charindex somehow to achieve this in a stored procedure. Does anyone have any clue how to solve this problem? Thanks!
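
To make the requirement concrete, here is a minimal sketch of keyword-overlap ranking: each stored phrase is scored by the fraction of the user's words it contains, ignoring word order and punctuation. It only illustrates the matching logic in Python; it is not a stored procedure and says nothing about performance in SQL Server. The names `tokens` and `rank_phrases` are assumptions made for this example.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_phrases(query: str, phrases: List[str]) -> List[Tuple[float, str]]:
    """Return (score, phrase) pairs, best first.

    score = fraction of the query's words that appear in the phrase.
    Phrases that share no word with the query are left out.
    """
    query_words = tokens(query)
    scored = []
    for phrase in phrases:
        shared = len(query_words & tokens(phrase))
        if shared:
            scored.append((shared / len(query_words), phrase))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


titles = ["Extreme Golf: The Showdown", "Welcome to Happy Land"]
print(rank_phrases("Golf Extreme Showdown", titles))
print(rank_phrases("Happy Land", titles))
```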

View 7 Replies View Related

Full Text With Fuzzy Matching

Jan 11, 2007

Andrew Worral writes "I am currently working on a website that uses Full Text search to search the names of companies.

We are having trouble figuring out what tools are best suited for this with SQL 2005 Standard/Enterprise and how to implement them.

The first issue to address would be misspelled words, such as searching for "Davids Shoe Repare" and returning "David's Shoe Repair".

Besides the spelling in "Davids Shoe Repare" there is also the issue of the " ' " in David's, which we have not come up with a good solution for yet. So a search for David will not return "David's".

I have done a little looking into Fuzzy matching with Integration Services but I am not sure this is the right tool, nor am I sure of the overhead involved and any speed issues with this. Nor am I in any way overly familiar with Integration Services.

What would you suggest?

Thank you in advance!

-Andy"

View 7 Replies View Related

Fuzzy Matching - Address Cleansing

Oct 11, 2006

Hi *,

does anyone know if MS supports some kind of breaking strategy within Fuzzy Lookup/Grouping?

Besides that, I'd like to perform an address cleansing operation on a CRM database. I don't have a reference table (Street, Zip, LastLine, etc.) for that. Where can I get an appropriate database? Does anyone have some experience with this issue?

Thanks a lot.
S.

View 5 Replies View Related

Fuzzy Grouping Error

Oct 16, 2005

I am using the Sept CTP, I am doing a fuzzy grouping on 1.5Mil records.

View 7 Replies View Related

Fuzzy Grouping Using Original Key

Nov 14, 2007

I managed to get fuzzy grouping working. The relevant output (_key_in and _key_out) are stored in a new table that is a copy of the old table + fuzzy grouping columns.

How do I get SSIS to store the _key_in and _key_out in the original table?
The new matching column _key_out refers to the new key: _key_in. How could I get SSIS to translate that to a matching column that refers to my original key?

View 1 Replies View Related

Fuzzy Grouping In SSIS

Aug 2, 2007

Hi folks,

What is the use of Fuzzy Grouping in SSIS?

Please also give me an example.

regards
koti

View 1 Replies View Related

Fuzzy Grouping Errors

May 15, 2006

Hi - we have been evaluating using Fuzzy Grouping and Lookup for maintaining our large list of customer records. Initial testing with Grouping on about 300K records went great but now with a larger sample of 7.3 million records we are running into problems. It doesn't appear to be system limitation - the index is built reasonably quickly and without errors but when it starts the matching we get these errors:

[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: The ProcessInput method on component "Fuzzy Lookup" (86) failed with error code 0x8000FFFF. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.

[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread0" has exited with error code 0x8000FFFF.

[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.

[Fuzzy Grouping Inner Data Flow : OLE DB Source [1]] Error: The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.

[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread1" has exited with error code 0xC0047039.

[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: The PrimeOutput method on component "OLE DB Source" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.

[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "SourceThread0" has exited with error code 0xC0047038.

One thing we did find is that our test server didn't have SP1 installed, and installing it seemed to help a lot (we were getting buffer errors prior to SP1). One other note - the destination table is populated with all the data but no scoring has been applied to it.

Does anyone have any ideas what could be causing this?

Thanks!

Keith Doyle

View 5 Replies View Related

Fuzzy Grouping In Parallel

May 11, 2007

Hello,

I have created a project to de-duplicate addresses.

I understand that Fuzzy Grouping takes less time when it has a smaller volume of data to process.

My source feed file is sometimes huge, so I am splitting the input into multiple branches based on the first letter of the city. There are 7 branches in the process.

                      Source File Feed
                              |
                  Split data into 7 groups
                              |
   +--------+--------+--------+--------+--------+--------+
   |        |        |        |        |        |        |
 FzGrpg   FzGrpg   FzGrpg   FzGrpg   FzGrpg   FzGrpg   FzGrpg
   |        |        |        |        |        |        |
 Split    Split    Split    Split    Split    Split    Split
  / \      / \      / \      / \      / \      / \      / \
<- Write the Canonicals and Dupes from each of these splits into database ->

When I designed this I was hoping that each of the Fuzzy Grouping tasks will execute in parallel.

But in reality they are processing one after the other.

Is there any way to make them execute in parallel?

Appreciate your help.

Thanks

KM

View 12 Replies View Related

Extracting The Duplicates Using Fuzzy Grouping

Jul 29, 2006

Hi,

I have an Oracle table called "Party" which contains Party_Id as primary key and has Party_Name, Party_Addr etc. as fields. We have many duplicate party details (such as party_name and party_addr) in this table. We are trying to avoid duplicates using the fuzzy logic of SSIS.

1. Can anybody suggest how to create a package to avoid duplicates using fuzzy logic for this scenario? (Step-by-step instructions would help me understand SSIS.)

2. Could you please provide me with some samples for fuzzy matching? (Please send a sample to my email.)

View 1 Replies View Related

Fuzzy Grouping Causing Minidump

Oct 18, 2007

I was running a Fuzzy Grouping task on SQL Server Enterprise Edition SP1 without any issues. I then applied SP2 and now that same Fuzzy Grouping is causing a minidump and terminating the process.

First, does anybody know anything about this kind of issue?

Second, I tried to run the minidump file in Visual Studio but I cannot actually run the dump file in Visual Studio as I keep getting the following exception:

Debugging information for 'DtsDebugHost.exe' cannot be found or does not match. No symbols loaded.

Finally, I did obtain a random error on the server itself that displayed the GUID: 58FC39EB-9DBD-4EA7-B7B4-9404CC6ACFAB.

This GUID appears to be tied to a Dr. Watson error but, again, I cannot figure out what process is breaking.

Can somebody please help?

View 1 Replies View Related

Need Help In Address Cleansing-- Fuzzy Grouping?

Oct 4, 2007

Hi,

We do not have any address cleansing tools, and the requirement is that we have to cleanse the data by finding the best possible record, the one which has all the info, and updating the other records accordingly.

I am not sure whether we can do this with the Fuzzy Grouping transformation.

Example:

I have a source table with the following info.

Customer_id | Location_Address | Location_City | Location_State | Location_Zip | Location_County
TT101       | 252 HARVARD RD   | ATLANTA       | GA             | 30340        | FULTON
TT101       |                  |               |                | 30340        |
TT101       | 252 HARVARD RD   | ATLANTA       |                |              |
TT102       | 125TEST          | CUMMING       |                |              |
TT102       | 125 TEST DR      | CUMMING       | GA             | 30040        | FORSYTH
TT102       |                  |               | GA             | 30040        |

Please let me know the solution

Thanks in advance.

View 4 Replies View Related

Dynamically Configuring Fuzzy Grouping

Jun 7, 2007

Hello,

I have been struggling with this for quite awhile so any help would be appreciated.

I need to know if there is a way to populate the fuzzy grouping control dynamically. I know you can programmatically design a package and customize it in C#, but for our purposes we would like to control the SSIS package via database settings. When the settings change, the package would then act differently. It's a simple package consisting of an Input - fuzzy grouping - conditional split - output. The connections are set up dynamically using parameters, expressions and a script task. Is there any way I could do a similar thing for Fuzzy Grouping?

View 13 Replies View Related

Fuzzy Grouping Performance Issue

May 30, 2007

Hello All,

We have a SSIS package which includes Fuzzy Grouping in Data Flow. It takes two columns from source table and saves outputs in different table with match score etc. Following is the way we are doing it:
1. Load required data from table using OLEDB connection (source)
2. Sort the data
3. Apply Fuzzy grouping (using a dedicated database instead of tempdb, and MinSimilarity = 0.6)
4. Send to destination table using OLEDB connection (destination)

The input table has millions of records. The package takes too long to execute, and sometimes it even fails after running for 12 hours. Any suggestions for performance improvement are welcome.

Appreciate your help.

Thanks and regards,
Ashish Basran

View 1 Replies View Related

Fuzzy Grouping Resource Usage

Nov 21, 2007

I have a few questions about the amounts of resources used by the fuzzy grouping transformation. I am running a little less than 5mil records through a fuzzy grouping that exact matches one column and fuzzy matches one. The server executing the package is a dual-core xeon with 2gb ram, running a default instance of sql 2005 enterprise.

I have been attempting to execute this package for a while now but it keeps erroring out for various reasons. At first, it was from a lack of available memory. I limited the memory usage of sql server to 256mb and set the buffer temp storage path, which alleviated those errors. However, now, my tempdb transaction log is growing significantly. It failed once for not being able to grow and reallocate quickly enough, but enlarging the auto-growth factor fixed that. Then, it filled up the volume the tempdb log was on, so now I have moved it to the san and am about to try again.

I was wondering, does anyone have a general idea on approximate resource usage by fuzzy grouping? Specifically, is there an approximate relation between the number of records grouped and the amount of ram/pagefile required? Also, on the database backend, how big can I expect the tempdb data/log files to get?

View 5 Replies View Related

Fuzzy Lookup / Grouping Design

Apr 7, 2008

Hi,

I need some advice on fuzzy lookup / grouping design.
I have a requirement that, I think, is between lookup and grouping transformations.

In one of our applications, users can enter manually a label for some information in the database.
Every month, I will store all the new data in our OLAP DB, and I want to group these labels with a fuzzy logic.
Historical data (already loaded) have to be grouped, as well as new data coming every month.

I have no predefined canonical data, so Fuzzy Lookup does not seem suited to my problem.
Fuzzy Grouping seems OK, but it would require feeding the historical data as well as the new data into the Fuzzy Grouping transformation to form the groups. This does not seem efficient to me.

Any clue ?

M.D

View 1 Replies View Related

Fuzzy Grouping Similarity Calculations

Apr 30, 2008

Hi all,

My question is how to calculate similarity using a SQL query, for example with LIKE % or ORDER BY. I am writing a function that works like Fuzzy Grouping, but I do not know how to get the result; that is, how the selected rows of data get matched to each other.

I hope my question is clear. How do I write the correct query? What should I do? I am a newbie in Integration Services, so I need a step-by-step explanation of any corrections.

I am looking forward to hearing from you shortly and thanks a lot in advance.

Thanks!

rgds,
xuenly
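
For orientation, here is a small sketch of one common way to turn "how different are two strings" into a 0-to-1 similarity: Levenshtein edit distance divided by the longer length. This is a generic textbook measure and an assumption on my part; the thread does not say what Fuzzy Grouping uses internally, and this is not claimed to reproduce its scores. The function names are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts,
    deletes and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute (or keep)
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """1.0 for identical strings, approaching 0.0 as they diverge."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


print(similarity("RAVI", "REVI"))
```

Rows could then be grouped by keeping pairs whose similarity is at or above a chosen threshold, which plays a role comparable to a minimum-similarity setting.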

View 3 Replies View Related

T-SQL (SS2K8) :: Specific Column Matching With Nulls

Apr 9, 2015

I'm working on a join between two tables where I only want one row returned... I'm matching on two columns between two tables. One of those columns in the target table could be null. I only want one record returned.

create table #vehicle(id int, vehiclemake varchar(10), vehiclemodel varchar(10), classtype varchar(1))
create table #class(id int, classtype varchar(1), value int)

insert into #vehicle values(1, 'AUDI', 'R8', 'A')
insert into #vehicle values(2, 'AUDI', null, 'B')

insert into #class values(1, 'A', 100)
insert into #class values(2, 'B', 1)

[Code] ....

Using the above example, if a VehicleModel other than 'R8' is specified, then I want it to return the other class type record.

This is going to be used as a join within a bigger statement, so I'm not sure ordering and returning top 1 is going to work.
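
The rule being asked for can be stated as "prefer the row whose model matches exactly, otherwise fall back to the row whose model is NULL". The sketch below shows only that selection logic in Python, using the sample rows from the post; `pick_class` is a hypothetical helper name, and this is not the T-SQL join itself.

```python
from typing import Optional

# Sample data copied from the #vehicle and #class tables in the post.
vehicles = [
    {"id": 1, "vehiclemake": "AUDI", "vehiclemodel": "R8", "classtype": "A"},
    {"id": 2, "vehiclemake": "AUDI", "vehiclemodel": None, "classtype": "B"},
]
class_values = {"A": 100, "B": 1}


def pick_class(make: str, model: Optional[str]) -> Optional[dict]:
    """Return exactly one class row for (make, model), or None.

    An exact model match wins; a vehicle row with a NULL model acts
    as the catch-all for every other model of that make.
    """
    candidates = [
        v for v in vehicles
        if v["vehiclemake"] == make and v["vehiclemodel"] in (model, None)
    ]
    if not candidates:
        return None
    # False sorts before True, so a non-NULL (exact) model is preferred.
    best = min(candidates, key=lambda v: v["vehiclemodel"] is None)
    return {"classtype": best["classtype"], "value": class_values[best["classtype"]]}


print(pick_class("AUDI", "R8"))
print(pick_class("AUDI", "TT"))
```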

View 9 Replies View Related

SQL Server 2008 :: White Spaces Not Getting Ignored For Plan Guide Matching?

May 1, 2015

In sp_create_plan_guide documentation, it's written:

When SQL Server matches the value of statement_text to batch_text and @parameter_name data_type [,...n ], or if @type = 'OBJECT', to the text of the corresponding query inside object_name, the following string elements are not considered:

White space characters (tabs, spaces, carriage returns, or line feeds) inside the string.
Comments (-- or /* */).
Trailing semicolons

On SQL Server 2008 SP3, I created a plan guide for a query. Now, if I execute the query exactly how it was defined in the plan guide, SQL Server matches it and uses the plan guide to optimize the query.

However, if I add just a space between a column name and an operator in the WHERE clause, the plan guide is ignored. How come it doesn't ignore the extra space, like mentioned in the documentation?

View 3 Replies View Related

How To Concatenate Strings That Have Trailing Spaces?

Jul 20, 2005

I am trying to export data from a SQL Server database into a text file using a stored procedure. I want to be able to read it and debug it easily; therefore, I want all the columns to indent nicely. This means I need to append trailing spaces to a text string (such as "Test1 ") or append leading space in front of a text string that contains a number (such as " 12.00"). Now, the stored procedure works fine when I run it in Query Analyzer. But it doesn't work correctly when I run it using ISQL - All the columns are not indented. I am wondering why it doesn't work in ISQL.

This is what I want, and this is also what I get when I run the stored procedure using Query Analyzer:

Test1 , 2,Test1.txt , 1.00, 1.00
Test22 , 2,Test22.txt , ,
Test333 , 2,Test333.txt , 30.00, 30.00

This is what I get if I run the stored procedure using ISQL (isql -S myserver -E -w 556 -h-1 -n -d mydb -Q "exec MyTest"):

Test1, 2,Test1.txt, 1.00, 1.00
Test22, 2,Test22.txt, ,
Test333, 2,Test333.txt, 30.00, 30.00

You can see that the result from ISQL has the following differences:

1. It puts a space in front of each row.
2. It appends enough spaces at the end of each line to make the line length to be exactly 61 characters.
3. It gets rid of the trailing space from each column.
4. It leaves only one blank space if the column has nothing but a series of spaces.

The following is the stored procedure that I am testing:

create procedure MyTest
as
set nocount on
create table #Test
(
Field1 varchar(10) null,
Field2 varchar( 5) null,
Field3 varchar(20) null,
Field4 varchar(10) null,
Field5 varchar(10) null
)
insert into #Test values( "Test1 ", " 2","Test1.txt ", " 1.00", " 1.00" )
insert into #Test values( "Test22 ", " 2","Test22.txt ", " ", " " )
insert into #Test values( "Test333 ", " 2","Test333.txt ", " 30.00", " 30.00" )
select Field1 + "," +
Field2 + "," +
Field3 + "," +
Field4 + "," +
Field5
from #Test
drop table #Test
go

Strangely, the differences #3 and #4 only show up when I use the SELECT statement on a table. They don't show up when I use SELECT statements to show constant text strings or string variables, like this:

set nocount on
select "Test1 " + "," +" 2" + "," +"Test1.txt " + "," +" 1.00" + "," +" 1.00"
select "Test22 " + "," +" 2" + "," +"Test22.txt " + "," +" " + "," +" "
select "Test333 " + "," +" 2" + "," +"Test333.txt " + "," +" 30.00" + "," +" 30.00"

The result is like the following if I use constant text strings or string variables:

Test1 , 2,Test1.txt , 1.00, 1.00
Test22 , 2,Test22.txt , ,
Test333 , 2,Test333.txt , 30.00, 30.00

I need to run it from ISQL because that is how I run _all_ my other stored procedures. I don't want to do anything differently just because I need to run this stored procedure.

Thanks in advance for any suggestion.

Jay Chan

View 4 Replies View Related

Keep Duplicate With Highest Score Fuzzy Grouping

Jan 22, 2012

I have recently decided to dedupe my data, but after running Fuzzy Grouping I am having a problem with the query that updates which duplicate to keep.

_key_in is unique, _key_out is the duplicates so for example:

_key_in , _key_out , name , score , dedupe
1 , 1 , ron , 10 , purge
2 , 1 , ronn , 15 , keep
3 , 3 , john , 5 , keep
4 , 4 , matt , 15 , keep
5 , 4 , mat , 10 , purge
6 , 4 , matt , 15 , purge

I want to keep the row with the highest score in each _key_out group by setting the field de_dupe to 'keep' and the remainder to 'purge'. The score can also be the same within a group of duplicates; in that case I just need to keep one, and it doesn't matter which one. The query I have below nearly works, but it marks every duplicate that shares the top score as 'keep'.

Code:
UPDATE b
SET b.dedupe_result = 'keep'
FROM
[BusinessListings].[dbo].[MongoOrganisationACTM1Destination] b
INNER JOIN

[Code] ....
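
The tie problem usually comes from comparing each row against the group's maximum score, which several rows can equal. A sketch of the intended rule, "exactly one keeper per _key_out, first row wins a tie", is shown below in Python using the sample rows from the post; `mark_keep_or_purge` is a hypothetical name, and the elided UPDATE statement above is not reproduced or assumed.

```python
from typing import Dict, List

# Sample rows copied from the post (the expected dedupe column is omitted).
rows = [
    {"_key_in": 1, "_key_out": 1, "name": "ron", "score": 10},
    {"_key_in": 2, "_key_out": 1, "name": "ronn", "score": 15},
    {"_key_in": 3, "_key_out": 3, "name": "john", "score": 5},
    {"_key_in": 4, "_key_out": 4, "name": "matt", "score": 15},
    {"_key_in": 5, "_key_out": 4, "name": "mat", "score": 10},
    {"_key_in": 6, "_key_out": 4, "name": "matt", "score": 15},
]


def mark_keep_or_purge(rows: List[dict]) -> Dict[int, str]:
    """Return {_key_in: 'keep' | 'purge'} with one 'keep' per _key_out."""
    best_per_group: Dict[int, dict] = {}
    for row in rows:
        current = best_per_group.get(row["_key_out"])
        # Strictly greater: on equal scores the earlier row stays the keeper.
        if current is None or row["score"] > current["score"]:
            best_per_group[row["_key_out"]] = row
    keepers = {row["_key_in"] for row in best_per_group.values()}
    return {
        row["_key_in"]: ("keep" if row["_key_in"] in keepers else "purge")
        for row in rows
    }


print(mark_keep_or_purge(rows))
```

In T-SQL the same "one winner per group" rule is commonly written by numbering the rows within each _key_out with ROW_NUMBER(), ordered by score descending plus a tie-breaker such as _key_in, and keeping only row number 1.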

View 2 Replies View Related

Fuzzy Grouping Seemingly Corrupting Data

Jan 10, 2007

I've seen one other post on this topic from October 2005 and I thought I'd bring it up again. I've a Fuzzy Grouping component in my data flow. The output data from it appears to be the result of records spliced into other records. This includes pass-through columns, not merely "clean" or similarity columns. For example (I've added the suffixes for illustrative purposes):

AddressLine1_in: 162 OAKMONT
AddressLine1_out: 162 OAKMONTLAMINATION INC

CityStateZip_in: Alexander, AR 72002-8539
CityStateZip_out: Alexander, AR 72002-8539116-7066

These are just pass-through columns, although "used" columns are seeing something similar (below.) Any others with this experience?

City_in: Alexander
City_out: Alexandertle Rock

View 1 Replies View Related

Fuzzy Grouping - First Name Similarities; Bill = William, Etc...

Aug 14, 2007

Hello,

I was wondering how Fuzzy Grouping deals with and handles first name similarities. Is there a way to configure it so that Anthony = Tony, Bill = William, etc.? I created a simple package with several rows containing similar first names and ran the fuzzy grouping on the first name column. I received only one possible duplicate of Will = William which was at 56%. I lowered the threshold down to 1% and still only one match.

Now I understand and appreciate the reasons for this but was wondering if this type of situation was considered and a way of dealing with it is available.

Thanks,
Beac

View 3 Replies View Related

Fuzzy Lookup And Grouping Training Dataset?

Mar 2, 2008

Hi All,

Is there a way the fuzzy lookup or grouping can be trained so that similarities and confidence values rely on previously matched strong links?

For example: I can link 80% of my two datasets using one strong identifier (say phone #) which I trust. My goal then, is to use the probability of matching of the rest of my linking fields (say Name,Address,Gender,DOB) in a "matched by phone number" pair to train a fuzzy lookup task to be done on the unlinked 20% of the datasets.

This "training set" would in theory influence the similarity and confidence values of the fuzzy output since each linking column would carry a different weight or contribution towards a confident match.

Does anyone out there know how to do this in practice in SSIS?

View 1 Replies View Related

Fuzzy Grouping: Any Success With > 3 Million Records?

May 18, 2006

I have tried to process > 3 million Fuzzy grouping records on two different servers with no success. 3 mill works but anything above 4 mill doesn't. Some background:

We are trying to de-dup our customer table on: name (.5 min), address1 (.5 min), city (.5 min), state (exact). .8 overall record min score.
Output includes additional fields: customerid, sourceid, address2, country, phonenumber
Without SP1 installed I couldn't even get a few hundred thousand records to process
Two different servers - same problems. Note that SSIS and SQL Server are running locally on both
The higher end server has 4GB RAM, the other 2.5 GB RAM. Plenty of free disk space on both
SQL Server is configured to use 2 GB of RAM max
The page file is currently at 15GB

After running a number of tests on both servers, trying different batch sizes etc., the one thing I noticed is that it seems to always error out when SSIS takes over and starts chewing up all the available RAM. This happens after the index is created and SSIS starts "warming caches". On both servers SQL Server uses up about 1.6GB of RAM at this point while SSIS keeps taking over RAM until all physical RAM is used up.

Some questions:

Has anyone been able to process more than 3 million records and if so what is your hardware configuration?
Should we try running SSIS from a different server so it has access to the full amount of physical RAM? (so it doesn't have to fight for RAM with SQL Server)
Should we install Win 2003 Enterprise Server so we can add more RAM?
Any ideas why switching to the page file might be causing errors?

Thanks!!

Keith Doyle

View 17 Replies View Related

Matching Strings In Different Tables Of Same Database

Jul 20, 2005

I have a situation where I want to pull strings from one table of a SQL 2000 database and find matches for it in other tables of the same database and have those values returned. i.e. In one table I have prospects and I want to match their names to a table that stores the names of prospects turned into customers. I want to write a query that looks through every entry and returns a match for each corresponding value (from prospects to customers). So if "Smith" is found in prospects I want SQL to return "Smith" in customers with full contact info.

Any pointers on getting started on this is greatly appreciated. Or if you could just point me to a reference. Obviously, I need to do some kind of parsing. I just need to be pointed in the right direction.

Thx.

View 2 Replies View Related

SQL Matching Two Multiple Valued Strings

Jul 20, 2005

I am a little stumped and wondering if someone might have an idea how to go about doing this.

Following on from this guide http://www.4guysfromrolla.com/webtech/031004-1.shtml on matching a comma-delimited string, I would like to expand on this and match two comma-delimited strings in a sproc.

In my database, table A has a city field containing a comma-delimited string, i.e. 'sydney, new york, chicago'. I am passing a similar comma-delimited string to a sproc and returning the matching ids.

So, we have table A:

id/city
1/sydney, new york, chicago
2/new york, san antonio
3/beijing, sydney
4/london,beijing

Passing the string 'sydney, new york' needs to return: id 1,2,3 (1,2 match new york and 1,3 match sydney).

Any ideas?
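
The matching rule described here amounts to "return every id whose city list shares at least one value with the search list". A minimal sketch of that rule in Python, using the sample rows from the post, is below; it illustrates the set logic only and is not the sproc. `split_values` and `matching_ids` are names invented for the example.

```python
from typing import Dict, List, Set


def split_values(text: str) -> Set[str]:
    """Split a comma-delimited string into a set of trimmed, lower-cased values."""
    return {part.strip().lower() for part in text.split(",") if part.strip()}


# Table A from the post: id -> comma-delimited city field.
table_a: Dict[int, str] = {
    1: "sydney, new york, chicago",
    2: "new york, san antonio",
    3: "beijing, sydney",
    4: "london,beijing",
}


def matching_ids(search: str, table: Dict[int, str]) -> List[int]:
    """Ids whose city list has at least one value in common with the search list."""
    wanted = split_values(search)
    return sorted(
        row_id for row_id, cities in table.items()
        if wanted & split_values(cities)
    )


print(matching_ids("sydney, new york", table_a))
```

In a database, the equivalent approach is typically to split both strings into rows (one value per row) and join on the value, which is also why storing one city per row tends to make this kind of query simpler.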

View 1 Replies View Related

Replace Nulls With Blank Spaces In Float Data Type

May 8, 2008

Hello,
I have a simple question. Is it at all possible to replace nulls with blank spaces in a column of float data type?
The column has null values (written) in some rows and numbers in other rows. I want to remove the nulls before copying the data to another file.
Thanks
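
A float column can only hold a number or NULL, so a "blank" has to be produced at the point where the value is turned into text for the output file. The sketch below shows that conversion step in Python; `blank_if_null` and `export_rows` are hypothetical names, and the two sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterable, Optional, Tuple


def blank_if_null(value: Optional[float]) -> str:
    """Render a nullable float as text: '' for NULL, the number otherwise."""
    return "" if value is None else str(value)


def export_rows(rows: Iterable[Tuple[int, Optional[float]]]) -> str:
    """Write (id, amount) rows as comma-separated text with blanks for NULLs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row_id, amount in rows:
        writer.writerow([row_id, blank_if_null(amount)])
    return buffer.getvalue()


print(export_rows([(1, 12.5), (2, None)]))
```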

View 7 Replies View Related

Fuzzy Grouping Transform Corrupts Pass-through Data

Aug 2, 2005

We are working with a client and are using Fuzzy Group transform for de-duping, and hierarchy creation for a national account list.

View 4 Replies View Related

Handling Strings With Embedded Nulls

Jan 30, 2008

I am in the process of converting DB2 mainframe data to SQL2005. During the conversion I ran into an issue with the DB2OLEDB provider not handling strings with embedded nulls. With the help of the Microsoft Tech support folks I was able to get a fix for the x64 DB2OLEDB provider to handle strings with embedded nulls.

The problem now however is that it appears that when the Data is copied to the Pipeline buffer it is truncated at the first null regardless of the DT_STR length. I have read where .NET is supposed to handle embedded nulls in strings but I am not sure what I need to do to get SSIS packages to handle this situation.

I know when I preview the query in the OLEDB provider within SSIS the data is correct, but as soon as it is passed to the SQL connector, a scripting component or a data conversion component the string is truncated at the first occurrence of the embedded null.

I also tried doing a straight copy from the Data provider to a flat file, but the strings are once again truncated.

Has anyone else experienced any similar problems or found any resolutions to this type of problem? I am getting down to crunch time on this conversion project and any help would be most appreciated.

View 4 Replies View Related