Really Better Soundex?
Feb 6, 2008
Hi guys and gals (in case Tara reads this...)
We've installed this on our server:
http://microsoft.apress.com/feature/72/sql-server-soundexing
(all the udf attempts were way too slow; this uda is nice and quick)
But we have not been able to reproduce the supposed improved functionality. No matter what misspelled name we send it, soundex and soundex_nara return the same amount of clients (in a db of over half a million). The return codes are sometimes different for the two functions but it doesnt ever result in an actual difference in query results. Has anyone actually experienced a difference in returned results?
EDIT:
Have a look at this:
http://en.wikipedia.org/wiki/Levenshtein_distance
View 1 Replies
Apr 30, 2002
Can someone please explain how to do a query using the soundex function?
If I have the last name of Smith and I want to get similar matches
with soundex, what would be the syntax?
SELECT * FROM NAME WHERE LNAME = 'SMITH'
Thanks,
Kellie
View 1 Replies
View Related
Jul 1, 2004
Has anyone used SOUNDEX() practically?
Everytime i look through bol searching for a function i end up playing with soundex and diffference but have never been able (or wanted to take the time) to understand the values they generate..
just asking.
View 3 Replies
View Related
Oct 27, 2006
I have two tables in Access database. One table has 166,000 rows and another -75,000 rows.
Both tables consist of the ids, names and addresses.
I need to run a program, which was written in VBA, to find similar names and cities in these two tables, and then I need to populate the third table with id only from both tables. I forgot to mention I use Soundex function in Access. I try to run this program for 48 hours, and it still running. Do you think this is a good idea to use Access for this task ? Or could you please help me to find another solution?
Please let me know.
Rose
View 1 Replies
View Related
Feb 15, 2005
Hello,
Does the SOUNDEX function only apply to english (surnames) or can it be used for other languages e.g. Dutch.
Well, it can be used ofc. but does it produce meaningful output ?
Thanks,
Rick
View 3 Replies
View Related
Mar 6, 2002
Disgruntled with Soundex I went looking for a better phonetic matching algorithm.
Turns out there is a rather good one called Metaphone, which comes in two variants (Simple and Double)
I could find the source for this in C++, but I wanted to have it as a user function.
So here it is:
CREATE FUNCTION dbo.Metaphone(@str as varchar(70))
RETURNS varchar (25)
/*
Metaphone Algorithm
Created by Lawrence Philips.
Metaphone presented in article in "Computer Language" December 1990 issue.
Translated into t-SQL by Keith Henry (keithh_AT_lbm-solutions.com)
*********** BEGIN METAPHONE RULES ***********
Lawrence Philips' RULES follow:
The 16 consonant sounds:
|--- ZERO represents "th"
|
B X S K J T F H L M N P R 0 W Y
Drop vowels
Exceptions:
Beginning of word: "ae-", "gn", "kn-", "pn-", "wr-" ----> drop first letter
Beginning of word: "x" ----> change to "s"
Beginning of word: "wh-" ----> change to "w"
Beginning of word: vowel ----> Keep it
Transformations:
B ----> B unless at the end of word after "m", as in "dumb", "McComb"
C ----> X (sh) if "-cia-" or "-ch-"
S if "-ci-", "-ce-", or "-cy-"
SILENT if "-sci-", "-sce-", or "-scy-"
K otherwise, including in "-sch-"
D ----> J if in "-dge-", "-dgy-", or "-dgi-"
T otherwise
F ----> F
G ----> SILENT if in "-gh-" and not at end or before a vowel
in "-gn" or "-gned"
in "-dge-" etc., as in above rule
J if before "i", or "e", or "y" if not double "gg"
K otherwise
H ----> SILENT if after vowel and no vowel follows
or after "-ch-", "-sh-", "-ph-", "-th-", "-gh-"
H otherwise
J ----> J
K ----> SILENT if after "c"
K otherwise
L ----> L
M ----> M
N ----> N
P ----> F if before "h"
P otherwise
Q ----> K
R ----> R
S ----> X (sh) if before "h" or in "-sio-" or "-sia-"
S otherwise
T ----> X (sh) if "-tia-" or "-tio-"
0 (th) if before "h"
silent if in "-tch-"
T otherwise
V ----> F
W ----> SILENT if not followed by a vowel
W if followed by a vowel
X ----> KS
Y ----> SILENT if not followed by a vowel
Y if followed by a vowel
Z ----> S
*/
AS
BEGIN
Declare@Result varchar(25),
@str3char(3),
@str2 char(2),
@str1 char(1),
@strp char(1),
@strLen tinyint,
@cnt tinyint
set @strLen = len(@str)
set@cnt=1
set@Result=''
--Process beginning exceptions
set @str2 = left(@str,2)
if @str2 in ('ae', 'gn', 'kn', 'pn', 'wr')
begin
set @str = right(@str , @strLen - 1)
set @strLen = @strLen - 1
end
if@str2 = 'wh'
begin
set @str = 'w' + right(@str , @strLen - 2)
set @strLen = @strLen - 1
end
set @str1 = left(@str,1)
if @str1= 'x'
begin
set @str = 's' + right(@str , @strLen - 1)
end
if @str1in ('a','e','i','o','u')
begin
set @str = right(@str , @strLen - 1)
set @strLen = @strLen - 1
set@Result=@str1
end
while @cnt <= @strLen
begin
set @str1 = substring(@str,@cnt,1)
if @cnt <> 1
set@strp=substring(@str,(@cnt-1),1)
elseset@strp=' '
if @strp<> @str1
begin
set @str2 = substring(@str,@cnt,2)
if @str1in('f','j','l','m','n','r')
set@Result=@Result + @str1
if @str1='q'set @Result=@Result + 'k'
if @str1='v'set @Result=@Result + 'f'
if @str1='x'set @Result=@Result + 'ks'
if @str1='z'set @Result=@Result + 's'
if @str1='b'
if @cnt = @strLen
if substring(@str,(@cnt - 1),1) <> 'm'
set@Result=@Result + 'b'
else
set@Result=@Result + 'b'
if @str1='c'
if @str2 = 'ch' or substring(@str,@cnt,3) = 'cia'
set@Result=@Result + 'x'
else
if @str2in('ci','ce','cy')and@strp<>'s'
set@Result=@Result + 's'
elseset@Result=@Result + 'k'
if @str1='d'
if substring(@str,@cnt,3) in ('dge','dgy','dgi')
set@Result=@Result + 'j'
elseset@Result=@Result + 't'
if @str1='g'
if substring(@str,(@cnt - 1),3) not in ('dge','dgy','dgi','dha','dhe','dhi','dho','dhu')
if @str2 in ('gi', 'ge','gy')
set@Result=@Result + 'j'
else
if(@str2<>'gn') or ((@str2<> 'gh') and ((@cnt + 1) <> @strLen))
set@Result=@Result + 'k'
if @str1='h'
if (@strp not in ('a','e','i','o','u')) and (@str2 not in ('ha','he','hi','ho','hu'))
if@strp not in ('c','s','p','t','g')
set@Result=@Result + 'h'
if @str1='k'
if @strp <> 'c'
set@Result=@Result + 'k'
if @str1='p'
if @str2 = 'ph'
set@Result=@Result + 'f'
else
set@Result=@Result + 'p'
if @str1='s'
if substring(@str,@cnt,3) in ('sia','sio') or @str2 = 'sh'
set@Result=@Result + 'x'
elseset@Result=@Result + 's'
if @str1='t'
if substring(@str,@cnt,3) in ('tia','tio')
set@Result=@Result + 'x'
else
if@str2='th'
set@Result=@Result + '0'
else
if substring(@str,@cnt,3) <> 'tch'
set@Result=@Result + 't'
if @str1='w'
if @str2 not in('wa','we','wi','wo','wu')
set@Result=@Result + 'w'
if @str1='y'
if @str2 not in('ya','ye','yi','yo','yu')
set@Result=@Result + 'y'
end
set @cnt=@cnt + 1
end
RETURN @Result
END
K e i t h H e n r y
Edited by - khenry on 03/06/2002 06:41:15
View 8 Replies
View Related
Jul 23, 2005
Hi AllWe are using soundex (and later tried Nysiis) for fuzzy name searchsoftware. But we faced a lot of problems the search accuracy was not verygood also we saw a lot of misses of relevant names.There are many problems other than precision and accuracy, with soundex andNYSIIS.e.g.Look for Smith and it will come-up with around 250 very popular last names.That dones not help much when a user is searching for "John Smith". Also, itdoes not return Creighton for Kryton as the search string.I googled a little and saw soming called NamiX. Without contacting thecompany ( arizcon.com ) directly, I wanted to get feedback from newsgroupsor people who are experts at this. Has anyone used this software? If so, Isit as good as they claim?Thanks a million in advance.Steve Creighton(please remove .antispam from email address) or post back your answers tothis group
View 1 Replies
View Related
May 11, 2006
Hi
We have an application we are developing that will run worldwide. When we are searching data for peoples names I would like to use Soundex but I am not sure it works effectivly for any language other than English. Does anyone know if it is effective in Dutch, German, French etc?
View 7 Replies
View Related
May 11, 2006
Hello,
I need to compare movie names from two systems. In both systems these names are entered manually by operators. I would like to compare them and give a rating on how close these names are equal.
Stripping special characters, and spaces is just not enough. It can happen that they key in sligthly different names. I've tried to use soundex but as we have over 15000 movie titles over the years i'm getting to many equal soundexes to use this as a comparison key.
Any ideas if there are techniques to do this ...
Kind Regards
View 1 Replies
View Related