Introduction
phonetic algorithm matches two different words with
similar pronunciation to the same code, which allows phonetic similarity
based word set comparison and indexing.Often it is quite difficult to find atypical name (or surname) in the database, for example:
— Hey, John, look for Adolf Schwarzenegger.In this case, the use of phonetic algorithms can significantly simplify the problem.
— Adolf Shwardseneger? There is no such person!
Soundex algorithm is very useful for searching in lists of people in databases, as well as for using in a spell checker. They are often used in combination with the algorithms of fuzzy search, providing users with a handy search by name and surname in databases, lists of people and so on.
soundex sql soundex sql soundex sql soundex sql soundex in sql soundex in sql soundex in sql soundex in sql
Soundex
One of the first algorithms was Soundex invented in the 1910s by Robert Russell. This algorithm (its American version) matches words to the numerical index like A126. Its working principle is based on the partition of consonants in groups with ordinal numbers, which are then compiled to the resulting value. Later several improvements was suggested.Soundex Coding Guide
Number | Represents the Letters |
1 | B, F, P, V |
2 | C, G, J, K, Q, S, X, Z |
3 | D, T |
4 | L |
5 | M, N |
6 | R |
Disregard the letters A, E, I, O, U, H, W, and Y.
Every soundex code consists of a letter and three numbers, such as W-252. The letter is always the first letter of the surname. The numbers are assigned to the remaining letters of the surname according to the soundex guide shown below. Zeroes are added at the end if necessary to produce a four-character code. Additional letters are disregarded.
Examples:
- Washington is coded W-252 (W, 2 for the S, 5 for the N, 2 for the G, remaining letters disregarded).
- Lee is coded L-000 (L, 000 added).
Examples :
soundex sql server soundex sql server soundex code soundex code soundex code soundex code
F234 → Fusedale.
G535 → Genthner, Gentner, Gianettini, Gunton.
G640 → Garlee, Garley, Garwell, Garwill, Gerrell, Gerrill, Giral, Gorelli, Gorioli, Gourlay, Gourley, Gourlie, Graal, Grahl, Grayley, Grealey, Greally, Grealy, Grioli, Groll, Grolle, Guerola, Gurley.
H326 → Hadcroft, Hadgraft, Hatchard, Hatcher, Hatzar, Hedger, Hitscher, Hodcroft, Hutchcraft.
P630 → Parade, Pardew, Pardey, Pardi, Pardie, Pardoe, Pardue, Pardy, Parradye, Parratt, Parrett, Parrot, Parrott, Pearde, Peart, Peaurt, Peert, Perdue, Peret, Perett, Perot, Perott, Perotti, Perrat, Perrett, Perritt, Perrot, Perrott, Pert, Perutto, Pirdue, Pirdy, Pirot, Pirouet, Pirt, Porrett, Porritt, Port, Porte, Portt, Prate, Prati, Pratt, Pratte, Pratty, Preddy, Preedy, Preto, Pretti, Pretty, Prewett, Priddey, Priddie, Priddy, Pride, Pridie, Pritty, Prott, Proud, Prout, Pryde, Prydie, Purdey, Purdie, Purdy.
soundex number soundex number soundex number soundex number soundex sql server soundex sql server
Soundex in SQL:
ALTER FUNCTION fn_soundex(@Getname VARCHAR(500))RETURNS VARCHAR(500)
BEGIN
-- To Store First Letter
DECLARE @GetFirstLetter VARCHAR(20);
-- To Store Soundex Code
DECLARE @SetSoundexCode VARCHAR(20);
-- To Find length of string
DECLARE @Code_len INT;
-- Default Value
SET @SetSoundexCode = '0'
-- Get First Letter
SET @GetFirstLetter = substring(@Getname,1,1)
-- Copy Remain Character to another String
SET @Getname = substring(@Getname,2,len(@Getname))
-- Assign Code according to Letter
SET @Getname = replace(@Getname,'A',0)
SET @Getname = replace(@Getname,'E',0)
SET @Getname = replace(@Getname,'I',0)
SET @Getname = replace(@Getname,'O',0)
SET @Getname = replace(@Getname,'U',0)
SET @Getname = replace(@Getname,'H',0)
SET @Getname = replace(@Getname,'W',0)
SET @Getname = replace(@Getname,'Y',0)
SET @Getname = replace(@Getname,'B',1)
SET @Getname = replace(@Getname,'F',1)
SET @Getname = replace(@Getname,'P',1)
SET @Getname = replace(@Getname,'V',1)
SET @Getname = replace(@Getname,'C',2)
SET @Getname = replace(@Getname,'G',2)
SET @Getname = replace(@Getname,'J',2)
SET @Getname = replace(@Getname,'K',2)
SET @Getname = replace(@Getname,'Q',2)
SET @Getname = replace(@Getname,'S',2)
SET @Getname = replace(@Getname,'X',2)
SET @Getname = replace(@Getname,'Z',2)
SET @Getname = replace(@Getname,'D',3)
SET @Getname = replace(@Getname,'T',3)
SET @Getname = replace(@Getname,'L',4)
SET @Getname = replace(@Getname,'M',5)
SET @Getname = replace(@Getname,'N',5)
SET @Getname = replace(@Getname,'R',6)
-- Get Length of String
SET @Code_len = len(@Getname)
-- Remove Space from both side
SET @Getname = ltrim(rtrim(@Getname))
-- Check if length is greater then zero
WHILE @Code_len > 0
BEGIN
-- Compare two letter at same time to check duplicate
-- If character is repeating character then make it Zero...
-- We will remove Zero at the End
IF charindex(substring(@Getname,@Code_len,1),substring(@Getname,@Code_len-1,1)) > 0
SET @SetSoundexCode = @SetSoundexCode + '0'
ELSE
SET @SetSoundexCode = @SetSoundexCode + substring(@Getname,@Code_len,1)
SET @Code_len = @Code_len - 1
END
SET @SetSoundexCode = @GetFirstLetter + reverse(replace(@SetSoundexCode,'0',''))+'000'
RETURN LEFT(@SetSoundexCode,4)
END
GO
There are lot many variance in Soundex Search algorithm.
Referenced from SOUNDEX (Transact-SOL) soundex sql server soundex sql server
Useful Link:
http://www.archives.gov/research/census/soundex.html
soundex function soundex function soundex function soundex function sql soundex sql soundex sql soundex sql soundex
No comments:
Post a Comment