Fuzzy Search Component

Jul 30, 2007

Good day.

I'd like to add "fuzzy search" functionality to my application.

"Fuzzy search" in this topic means selecting rows from a DB table whose "fuzzy search" coefficient (calculated against a reference string) is not less than some_predefined_const. The algorithm used to calculate the coefficient can vary.

So with the reference string "Margaret", a "fuzzy search" can find "Nargaret", "Margoret", "Margret", etc.
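One common way to define such a coefficient is normalized edit distance. The sketch below is only an illustration in Python, not a SQL Server or .NET component: the function names, the case-insensitive comparison and the 0.8 default threshold are all assumptions chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Coefficient in [0, 1]; 1.0 means identical (case is ignored)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def fuzzy_select(rows, reference, threshold=0.8):
    """Keep the rows whose coefficient is not less than the threshold."""
    return [r for r in rows if similarity(r, reference) >= threshold]
```

Each of "Nargaret", "Margoret" and "Margret" is one edit away from "Margaret", so all three clear a 0.8 threshold. Scanning every row this way is linear in the table size, which is one reason a tuned component may still be attractive.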

IMHO the time needed to develop, test and tune such code would be quite long, so I would prefer to buy a "fuzzy search" component.

Does anybody know where I can get such a component, either a server version (SQL 2005) or a client version (.NET)?
How much would such a component cost?


Fuzzy Search - Exposing SSIS Fuzzy Capabilities Outside Of SSIS?

Apr 15, 2008

I've been looking into ways to accomplish a fuzzy search, and SSIS makes that possible if I want to do a bulk import or something like it. But what if I just want to look stuff up at any given time without having to run the package?

Is it possible to expose the fuzzy lookup outside of SSIS to, for example, T-SQL?

Here's an example:
I want to look up the music artist "Notorious BIG", but in the database it is "Notorious B.I.G.". If I use the SSIS fuzzy lookup I basically get what I'm looking for. But how would I call this from a web application? I then tried Full Text Search, but that doesn't really work out as well.

Will I have to rewrite the logic that the fuzzy lookup uses to enable it to work, i.e. use Full Text Indexes, FreeTextTable, ContainsTable, SoundEx and the like to come even somewhat close to what the Fuzzy Lookup does?

Fuzzy Search

Oct 8, 2007

Is there a built-in capability in SQL Server 2005 to do a search that can handle spelling errors? For example, we are searching for "hanovr" and our database contains "hanover". When there is a spelling error, searching with LIKE, CONTAINS or FREETEXT does not give me the results. Is there an out-of-the-box solution for this problem?
Please advise.

Fuzzy Search?

Jan 23, 2006

How do I do a fuzzy search? If I have a table of full names, I'd like the user to be able to do a search and find the record "Charles Montgomery Burns" with "Monty Burns" or "Montgomry" (misspelling).

Every major web site does this kind of thing (Amazon, Google, etc).

Someone suggested SOUNDEX, but this really doesn't fit the bill. Misspellings often don't use the same sound signature as the originals. Plus, that doesn't handle multi-word searchable texts very well.
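For the multi-word case, one option that avoids sound signatures is to score each search word against its best-matching word in the stored name. The following is a rough Python sketch using the standard library's difflib.SequenceMatcher; the function names and the averaging rule are assumptions made for illustration, not how Amazon or Google do it.

```python
from difflib import SequenceMatcher


def token_score(query_token: str, name_token: str) -> float:
    """Similarity of two words in [0, 1], ignoring case."""
    return SequenceMatcher(None, query_token.lower(), name_token.lower()).ratio()


def name_score(query: str, full_name: str) -> float:
    """Average, over the query words, of the best match among the name's words."""
    q_tokens = query.split()
    n_tokens = full_name.split()
    if not q_tokens or not n_tokens:
        return 0.0
    best = [max(token_score(q, n) for n in n_tokens) for q in q_tokens]
    return sum(best) / len(best)
```

With this scoring, "Monty Burns" rates "Charles Montgomery Burns" well above an unrelated name, and the misspelling "Montgomry" stays very close to "Montgomery". The candidate names still have to come from somewhere, so in practice this would re-rank a shortlist rather than scan the whole table.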

Others have suggested tries or suffix trees. If I went this route, wouldn't I have to preload all data out of the database and into this custom structure upon app startup? Is there any way around that? Also, this solution seems like it would require a lot of dev time (building a custom suffix tree with fuzzy lookup capabilities).

Is there a commonly known and acceptable solution to this?

(sorry, also posted to MySQL group; I'm using both databases so a solution in either would be satisfactory)

How To Make A Fuzzy Search Engine

Dec 20, 2007

When I search for Peter, "Peter,Pan" was successfully retrieved from the database. But when I search for Peter Pan, it found no results. How can I make it find results for Peter Pan too? In other words, how do I make my search engine more powerful, fuzzy and intelligent?

My current code:

<asp:SqlDataSource ID="dsSearch" runat="server"
    ConnectionString="<%$ ConnectionStrings:csHPDB %>"
    SelectCommand="SELECT * FROM [employee_table] WHERE ([employee_name] LIKE '%' + @employee_name + '%')">
    <SelectParameters>
        <asp:QueryStringParameter Name="employee_name" QueryStringField="Search" Type="String" />
    </SelectParameters>
</asp:SqlDataSource>
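The single LIKE fails because the literal text "Peter Pan" (with a space) never occurs in "Peter,Pan". One simple workaround is to split the search text into words and require each word to match separately. The sketch below shows the idea in Python against an in-memory SQLite table standing in for the real database; build_search and search are hypothetical helper names, and only the table and column names come from the post.

```python
import re
import sqlite3


def build_search(term, column="employee_name", table="employee_table"):
    """Return (sql, params) requiring every word of term to appear, or None."""
    words = re.findall(r"\w+", term)
    if not words:
        return None
    where = " AND ".join(f"[{column}] LIKE ?" for _ in words)
    sql = f"SELECT * FROM [{table}] WHERE {where}"
    params = [f"%{w}%" for w in words]  # one parameter per word
    return sql, params


def search(conn, term):
    """Run the word-by-word search on an open DB-API connection."""
    built = build_search(term)
    if built is None:
        return []
    sql, params = built
    return conn.execute(sql, params).fetchall()
```

The words are passed as parameters rather than concatenated into the SQL text. LIKE wildcards typed by the user (% and _) are not escaped here, which a real page would probably want to handle.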

Difference Between The Fuzzy Lookup And Fuzzy Grouping In Ssis

Aug 14, 2007

Dear Friends,

i think fuzzy lookup


what is fuzzy grouping ???? please explain


Search Component For SQL Server

Jun 28, 2001

hi geeks,

I need to build a generic text search component for MS SQL Server 7.0. It is basically one of those database searches that appear on web sites. Is there any stored procedure or service that MS SQL Server provides for text-based search on a database? Mind you, my requirement is not to search a table but an entire database.
Has anyone done this before?

keep the faith

The Component Metadata For Component DataReader Source (1113) Could Not Be Upgraded To The Newer Version Of The Component.

Oct 26, 2007


I have a package that has a data flow task. This task imports data from a DB2 database (using the IBM OLE DB provider for DB2) and adds it to a SQL Server database table. This package was created on the server. Then, through version control (using TFS source control), I checked out the package on my local machine, and when I open the package I get the following 3 errors.

Error 1 Validation error. Import Account Num from BMGP_BDR: DTS.Pipeline: The component metadata for "component "DataReader Source" (1113)" could not be upgraded to the newer version of the component. The PerformUpgrade method failed.

Error 2 Error loading BMAG Download Xref Tables - bmag.dtsx: Microsoft.SqlServer.Dts.Pipeline.ComponentVersionMismatchException: The version of component "DataReader Source" (1113) is not compatible with this version of the DataFlow. [[The version or pipeline version or both for the specified component is higher than the current version. This package was probably created on a new version of DTS or the component than is installed on the current PC.]] at Microsoft.SqlServer.Dts.Pipeline.ManagedComponentHost.HostCheckAndPerformUpgrade(IDTSManagedComponentWrapper90 wrapper, Int32 lPipelineVersion)

Error 3 Error loading BMAG Download Xref Tables - bmag.dtsx: The component metadata for "component "DataReader Source" (1113)" could not be upgraded to the newer version of the component. The PerformUpgrade method failed.

Please advise.
Thank you.

The Component Metadata For Component DataReader Source Could Not Be Upgraded To The Newer Version Of The Component.

Jan 23, 2007


I have a package which reads an Access file from a folder. My connection manager to this file is .NET Providers for OleDb\Microsoft Jet 4.0 OLE DB Provider.

Package works from my computer. But when I execute it on the server as a SQL Agent job, I get

The component metadata for "component "DataReader Source" (1)" could not be upgraded to the newer version of the component. The PerformUpgrade method failed.

I copied the mdb file to a folder on the server which my packages have no problem reading data from.

My packages run under the same domain account as defined in proxies.

I'd appreciate any help.




SQL 2000 MS Search: Boolean Search Doesn't Work When Search By Phrase

Aug 9, 2006

I'm just wondering whether this is a bug in MS Search or whether I am doing something wrong.

I have a query below

declare @search_clause varchar(255)

set @Search_Clause = ' "hepatitis b" and "hepatocellular carcinoma"'

select * from results

where contains(finding,@search_clause)

I don't get the correct result at all.

If I change my search_clause to "hepatitis" and "hepatocellular carcinoma" (without the "b"),

then I get the correct result.

It seems MS Search doesn't like a phrase that contains a single letter or something of the sort. Or is it a known bug?

Anyone know?


Use Of A SSIS Variable Of Type "Object" Inside Script Component And Task Component

Mar 16, 2007

In a Data Flow, I need to use an SSIS variable of type "Object" inside a Script Component and assign to it the contents of 'n' variables of string type.
On exiting from the script the variable of type object should contain something like in the following lines:
On exiting from the data flow I will use the variable of type Object in a Script Task, by reading each element in a cyclic fashion.
Has anyone experienced something like this? Could anyone provide an example?
Thanks in advance!

A Custom Component For Use As A VIEW In SSIS- Is It Possible To Create One MERGE Like Component With More Than 2 Inputs

Aug 13, 2007

Hi all
I'm on a project which uses a lot of views for joining 2 or more tables. Using the MERGE component in SSIS will be a huge effort because it only has 2 inputs and I have to SORT the input too.
Isn't it possible to have a VIEW-like component that joins more than 2 tables and DOESN'T need sorting?
(I've thought about creating views in the database engine, but that breaks my data flow in SSIS and isn't a practical solution.)

Reference To Preceding Component From Custom Dataflow Transformation Component

Mar 30, 2006

I am writing a custom dataflow transformation component and I need to get the name of the preceding component.

I have been trying to find a way to get a reference to the Package object, MainPipe object or IDTSPath90 object (connecting to the IDTSInput90 of my component) from my component because I think from there I can get to the information I want.

Does anyone have any suggestions?

TIA . . . Ed

Serious Script Component Bug - Clears Out All Code Inside Component

Nov 27, 2007

No idea where this bug crept in from. Have been using SSIS for 1.5 years now without hitting this problem.

I had a script component opening an XML document and parsing it using XPATH. I added some code that uses StreamReader / Streamwriter (closing one stream before starting the other). The code works without issue in my C# app.

And it ran without issue 2-3 times in SSIS. Then suddenly after running my package again, the script component says it completes successfully, yet nothing happens. I set a breakpoint on the first line of code - it never hits it. I add a msgbox as the first line of code - and it never displays.

I then close my package / exit out of ssis ... and then re-open it. When i open my script component, all of my code is GONE. All references that I added are gone.

I tried adding the streamreader/writer process to a dll I created from my c# app ... and added the DLL to the package -- same result.

I can reproduce this on 2 different computers.

Anyone experience this problem ? Any idea how to stop it ? Or debug it ?

Here is a slimmed down code sample of what causes the error :

Imports System.IO
Imports System.Xml

Public Class ScriptMain
    Public Sub Main()
        Dim xmlDoc As New XmlDocument
        Try
            ' (XML load step omitted from the slimmed-down sample)
            MsgBox("xmlLoaded") ' this doesn't display once the package starts "acting up"
        Catch ex As Exception
            UpdateXML("c:ulkasync_86281519_20070628045850225_4.xml", ex.Message)
        End Try
        Dts.TaskResult = Dts.Results.Success
    End Sub

    Private Sub UpdateXML(ByVal fileName As String, ByVal message As String)
        ' Pull the offending character code (e.g. "0x1A") out of the exception message
        Dim invalidChar As String = message.Trim().Substring(message.Trim().IndexOf("0x"), 4)
        Dim rd As StreamReader = New StreamReader(fileName)
        Dim xml As String = rd.ReadToEnd()
        xml = xml.Replace(invalidChar, String.Empty)
        xml = xml.Replace("", String.Empty)
        xml = xml.Replace("<![CDATA[<![CDATA[", "<![CDATA[")
        xml = xml.Replace("]]>]]>", "]]>")
        Dim wr As StreamWriter = New StreamWriter(fileName)
        Dim xdoc As XmlDocument = New XmlDocument()
        Try
            ' (write-back and reload steps omitted from the slimmed-down sample)
        Catch ex As Exception
            UpdateXML(fileName, ex.Message)
        End Try
    End Sub
End Class

Fuzzy Lookup

Feb 16, 2007


I am using a fuzzy lookup to cleanse data from a sales line details table during the import process. The sales order line details contain a field called 'reference', and this is compared to a field called 'category' in another table.
Using data viewers to check through the cleansing process, I notice that the fuzzy lookup doesn't seem to match correctly, e.g.
tbl.salesline.reference = 'I3' -> tbl.sales.category ='I03'
The above is OK, but the lookup also returns the following:
tbl.salesline.reference = 'I9' -> tbl.sales.category ='I01'
The value I9 doesn't exist; it was miskeyed during user entry and should have been 'I99'. I would have expected the fuzzy lookup to pick up the I99 value, as at least two of the characters match, but no, it picks the first 'I*' in the table.
If I expand the fuzzy lookup to return more results, e.g. 5 per record, then it returns the first 5 results: I01, I02, I03 and so on.
Is there a way of improving the fuzzy lookup itself?

Nov 12, 2007

Hi all, I have been trying for a while now to clean some data that contains duplicates using fuzzy grouping. I can get as far as identifying the duplicate data using fuzzy grouping, but how do I get it out so I can insert the non-duplicate data into a dimension table (table1)?

What I am also stuck on is how to get the data that isn't duplicated into table1 as well, or is this done in the same step? Please help, deadlines are creeping up on me.
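In the Fuzzy Grouping output, each row carries a _key_in value and a _key_out value that points at the canonical row of its group, so a row whose two keys are equal is either a group's representative or a row that had no duplicates. The filter can be sketched in Python as below; inside SSIS the same test would normally be a Conditional Split on the two columns. The sample column "name" and the function name are made up for the illustration.

```python
def canonical_rows(rows):
    """Keep one row per group: those whose _key_in equals their _key_out."""
    return [r for r in rows if r["_key_in"] == r["_key_out"]]


# Hypothetical fuzzy grouping output: rows 1 and 2 were grouped together.
sample = [
    {"_key_in": 1, "_key_out": 1, "name": "Acme Ltd"},
    {"_key_in": 2, "_key_out": 1, "name": "Acme Limited"},
    {"_key_in": 3, "_key_out": 3, "name": "Globex"},
]
print([r["name"] for r in canonical_rows(sample)])
```

Because non-duplicated rows also have equal keys, one filter covers both the de-duplicated and the never-duplicated data in a single step.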

Thanx for your time.

Fuzzy Lookup

Feb 6, 2008

The Enterprise edition of SQL Server includes some advanced BI features, for example the fuzzy lookup feature of IS. If the IS package lives on an Enterprise edition of SQL Server and the database the package is targeting lives on a Standard edition of SQL Server, can the advanced features be used? Can you run a fuzzy lookup against a database on a Standard edition of SQL Server when the IS package lives on an Enterprise edition of SQL Server? Thanks!


Fuzzy Lookup

Jan 19, 2007

Hi Friends,

Can somebody briefly explain to me the difference between fuzzy lookup and fuzzy grouping?

thanks and regards

Help W/ Stored Procedure? - Full-text Search: Search Query Of Normalized Data

Mar 29, 2008

Hi - I'm short of SQL experience and hacking my way through creating a simple search feature for a personal project. I would be very grateful if anyone could help me out with writing a stored procedure.

Problem: I have two tables with three columns indexed for full-text search. So far I have been able to successfully execute the following query returning matching row ids:

CREATE PROCEDURE dbo.Search_Articles
    @searchText varchar(150)
AS
    SELECT ArticleID
    FROM articles
    WHERE CONTAINS(Description, @searchText) OR CONTAINS(Title, @searchText)
    UNION
    SELECT ArticleID
    FROM article_pages
    WHERE CONTAINS(Text, @searchText);

    RETURN

This returns the ArticleID for any articles or article_pages records where there is a text match. I ultimately need the stored procedure to return all columns from the articles table for matches and not just the ArticleID. It seems like maybe I should try using some kind of JOIN between the result of the UNION above and the articles table? But I have so far been unable to figure out how to do this, as I can't seem to declare a name for the result table of the UNION above. Perhaps there is another, more elegant solution?

Thanks!
Peter
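The usual way to name the result of a UNION is to wrap it in parentheses as a derived table and give it an alias after the closing parenthesis. The sketch below demonstrates that shape with Python's sqlite3 module; since SQLite has no CONTAINS, a LIKE pattern stands in for the full-text predicate, and the alias m and the function name are assumptions. Only the table and column names come from the post.

```python
import sqlite3

# The UNION is wrapped as a derived table aliased "m", then joined to articles.
QUERY = """
SELECT a.*
FROM articles AS a
JOIN (
    SELECT ArticleID FROM articles
    WHERE Description LIKE :pat OR Title LIKE :pat
    UNION
    SELECT ArticleID FROM article_pages
    WHERE Text LIKE :pat
) AS m ON m.ArticleID = a.ArticleID
ORDER BY a.ArticleID
"""


def search_articles(conn, text):
    """Return full articles rows whose own text or any page text matches."""
    return conn.execute(QUERY, {"pat": f"%{text}%"}).fetchall()
```

UNION already removes duplicate ArticleID values, so an article that matches on several pages comes back once. In T-SQL the same structure should carry over with the CONTAINS predicates put back in place of LIKE.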

SQL Search :: Full Text Search With Single Character Returns All Rows

Jul 21, 2015

Our clients want to be able to do full text search with a single letter. (Is the name Newton, Nathan, Nick? Is the ID N1, N2...?) Doing a single-character full text search on a table works 25 out of 26 times. The letter that doesn't work is 'n'. The WHERE clause CONTAINS(full_text_field, ' "n*" ') returns all rows, even rows that have no 'n' in them anywhere. Adding a second letter after the "n" works as expected.

Here is an example

create table TestFullTextSearch (
Id int not null,
AllText nvarchar(400)
);
create unique index test_tfts on TestFullTextSearch(Id);
create fulltext catalog ftcat_tfts;


SQL Server 2014 :: Semantic Search Not Finding Keywords Identified By Full-Text Search?

Nov 6, 2014

I have a scenario where the standard Full-Text Search identifies keywords but Semantic Search does not recognize them as keywords. I'm hoping to understand why Semantic Search might not recognize them. This is being used in the context of medical terminology, and the specific keywords I noticed missing right off the bat were medications.

For instance, if I put the following string into a FT indexed table

'J9355 - Trastuzumab (Herceptin)'
'J9355 - Trastuzumab emtansine'

The Semantic Search recognized 'Herceptin' and 'Emtansine' but not 'Trastuzumab'

Nor in

'J8999 - Everolimus (Afinitor)'

It did not recognize 'Afinitor' as a keyword.

In all cases the base Full-Text index did find those keywords, and they were identifiable using the DMV sys.dm_fts_index_keywords_by_document. It does show the index as having completed.

Why might certain words not be picked up while others are? Could it be a language/dictionary issue? I am using English and accent-insensitive settings.

Create Site Search Using Sql Server Full Text Search

Jul 24, 2007

Would you use the SQL Server "full text search" feature as your site index? For some reason I can't make Index Server my site search catalog, and I wonder if full text is the solution. I think that I will have to create a new table called something like "site text", and I will need to write every text twice: once to its own table (let's say the "articles" table) and once to the "site text" table. Otherwise, there are problems finding the right URL of the text, searching different tables with different column names, and so on.
So I thought I would create a site search table with the columns:
id, text, url
and write everything to this table.
But somehow it looks like the wrong way, in that every forum post, every article, album picture or joke will be inserted twice into SQL Server.
What do you think?

SQL Search :: Full Text Search Of PDF Files In A File Table

Mar 30, 2013

I have installed the Adobe iFilter 11 64 bit and set the path to the bin folder. I still cannot find any text from the PDF files. I suspect I am missing something trivial, because I don't find much when I Bing for this, so it must not be a common problem. Here is the code.

--Adobe iFilter 11 64 bit is installed
--The Path variable is set to the bin folder for the Adobe iFilter.
--SQL Developer version 64 bit on both Windows 7 and Windows 8.
USE master;


How Can I Search Through DOCX (MS Word 2007) Documents By SQL Server 2005 Full Text Search Engine?

Dec 11, 2006

How can I search through DOCX (MS Word 2007) documents with the SQL Server 2005 Full Text Search engine?

Do I need to download something?

Fuzzy Phrase Matching

Oct 3, 2007

A column in my database contains phrases such as "Extreme Golf: The Showdown" or "Welcome to Happy Land". I need to write a search engine so that users can type in phrases such as "Golf Extreme Showdown" or "Happy Land" and the correct, or closest-matching, results will be returned. I don't need variations of words, just a search based on phrase keyword matches. I know I could do this by using multiple LIKE '%...%' conditions OR'd together, but this would be too performance intensive. I have heard I should use CHARINDEX somehow to achieve this in a stored procedure. Does anyone have any clue how to solve this problem? Thanks!
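The matching rule described here (word order ignored, closest match first) can be sketched as a keyword-overlap score. The Python below is only an illustration of that rule, not a CHARINDEX-based stored procedure; the function names and the score definition (fraction of query keywords found in the phrase) are assumptions.

```python
import re


def keywords(text):
    """Lower-cased set of alphanumeric words; punctuation is dropped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_phrases(query, phrases):
    """Return (score, phrase) pairs, best first, for phrases sharing a keyword."""
    q = keywords(query)
    scored = []
    for phrase in phrases:
        hits = len(q & keywords(phrase))
        if hits:
            scored.append((hits / len(q), phrase))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored
```

"Golf Extreme Showdown" scores 1.0 against "Extreme Golf: The Showdown" because all three query words appear in it, regardless of order. On the database side, a word-level index such as full-text search would be the usual way to avoid scanning every row.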

Fuzzy Lookup And Case

May 25, 2007


Could someone please help!

I'm doing a fuzzy lookup based on 3 fields (Surname/DOB/Gender). The only difference between the two sets of data is the case of the first letter of the Surname.

The reference table has "Stuart" and the lookup input has "stuart". I have set the Fuzzy Lookup input for Surname to Ignore Case, but it still won't match.

The DOB/Gender values are exactly the same.

Why does this not work? Is there a workaround?

Many Thanks, Deano

Fuzzy Grouping Error

Oct 16, 2005

I am using the Sept CTP, I am doing a fuzzy grouping on 1.5Mil records.


May 16, 2006

I am trying to run a SSIS package that contains a fuzzy lookup. I am using a flat file with about 7 million records as the input. The reference table has about 2000 records. The package fails after about 40,000 records with the following information:


Warning: 0x8007000E at Data Flow Task, Fuzzy Lookup [228]: Not enough storage is available to complete this operation.
Warning: 0x800470E9 at Data Flow Task, DTS.Pipeline: A call to the ProcessInput method for input 229 on component "Fuzzy Lookup" (228) unexpectedly kept a reference to the buffer it was passed. The refcount on that buffer was 2 before the call, and 1 after the call returned.
Error: 0xC0047022 at Data Flow Task, DTS.Pipeline: The ProcessInput method on component "Fuzzy Lookup" (228) failed with error code 0x8007000E. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "WorkThread0" has exited with error code 0x8007000E.
Error: 0xC02020C4 at Data Flow Task, Flat File Source [1]: The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.
Error: 0xC0047039 at Data Flow Task, DTS.Pipeline: Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "WorkThread1" has exited with error code 0xC0047039.
Error: 0xC0047038 at Data Flow Task, DTS.Pipeline: The PrimeOutput method on component "Flat File Source" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
Error: 0xC0047021 at Data Flow Task, DTS.Pipeline: Thread "SourceThread0" has exited with error code 0xC0047038.


I have tried many things - changing the BufferTempStoragePath path to a drive that has plenty space, changed the MaxInsertCommitSize to 5,000...

What else can I do?


Fuzzy Lookup Problems.

Mar 8, 2006

Fuzzy lookup seems to be causing some problems to me. It seems to work at times and doesn't at other times. It would work a couple of times fine and give me the desired results but then without changing anything in the dataflow or the data the next few times it would not run at all and fail the pre-execute of the.

Now I'm currently getting the following error:

[Fuzzy Lookup [248]] Error: An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Login timeout expired". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "An error has occurred while establishing a connection to the server. When connecting to SQL Server 2005, this failure may be caused by the fact that under the default settings SQL Server does not allow remote connections.". An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Named Pipes Provider: Could not open a connection to SQL Server [233]. ".

[DTS.Pipeline] Warning: A call to the ProcessInput method for input 249 on component "Fuzzy Lookup" (248) unexpectedly kept a reference to the buffer it was passed. The refcount on that buffer was 2 before the call, and 1 after the call returned.

[DTS.Pipeline] Error: The ProcessInput method on component "Fuzzy Lookup" (248) failed with error code 0xC0202009. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.

Any help would be appreciated.

Fuzzy Lookup Error

Oct 18, 2006


I get the following error when I use Fuzzy Lookup in a Data Flow task with the TransactionOption property set to "Required":

[Fuzzy Lookup [61]] Error: An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80004005 Description: "Cannot create new connection because in manual or distributed transaction mode.".

When I change the TransactionOption property to "Supported" it works fine.
I need the property set to Required so that the work is rolled back in the event of a failure.
Any ideas on how to get the Fuzzy Lookup to work?

Fuzzy Grouping Using Original Key

Nov 14, 2007

I managed to get fuzzy grouping working. The relevant output (_key_in and _key_out) are stored in a new table that is a copy of the old table + fuzzy grouping columns.

How do I get SSIS to store the _key_in and _key_out in the original table?
The new matching column _key_out refers to the new key, _key_in. How could I get SSIS to translate that into a matching column that refers to my original key?
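If the original key is passed through the Fuzzy Grouping transformation as an ordinary column, the translation amounts to a self-lookup: find the row whose _key_in equals this row's _key_out and take its original key. A small Python sketch of that lookup follows; the column name CustomerID and the "canonical_" prefix are hypothetical, and in a package the equivalent would be a join of the output to itself on _key_out = _key_in.

```python
def map_to_original_key(rows, key_column):
    """Add canonical_<key_column>: the original key of each row's group leader."""
    # Index every row's original key by the surrogate key fuzzy grouping assigned.
    original_by_key_in = {r["_key_in"]: r[key_column] for r in rows}
    return [
        {**r, "canonical_" + key_column: original_by_key_in[r["_key_out"]]}
        for r in rows
    ]


# Hypothetical output rows with a pass-through CustomerID column.
sample = [
    {"_key_in": 1, "_key_out": 1, "CustomerID": "C-100"},
    {"_key_in": 2, "_key_out": 1, "CustomerID": "C-245"},
    {"_key_in": 3, "_key_out": 3, "CustomerID": "C-300"},
]
for row in map_to_original_key(sample, "CustomerID"):
    print(row["CustomerID"], "->", row["canonical_CustomerID"])
```

The resulting pairs of original key and canonical original key could then be written back to the original table with an ordinary UPDATE keyed on the original key.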

Fuzzy Logic Performance

Jan 12, 2006

I am just wondering if someone out there has tried fuzzy matching on large-scale databases, i.e. about 20 million contact records. Suppose I wanted to perform matching/grouping on 10,000 incoming messages. How long does this usually take? How does it depend on the number of fields chosen for the match?

Any insight is greatly appreciated,


Fuzzy Grouping In SSIS

Aug 2, 2007

Hi folks,


and please give me the example


