Soundex

From WikiMD's Wellness Encyclopedia

Phonetic algorithm for indexing names by sound





Class: Phonetic algorithm



Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. The goal of Soundex is to encode homophones to the same representation so that they can be matched despite minor differences in spelling. Soundex is used primarily in genealogy and data management.

History

Soundex was developed by Robert C. Russell and Margaret King Odell and patented in 1918 and 1922. A variant, American Soundex, was used in the 1930s to index United States Census records from 1890 through 1920, so that names could be matched despite variations in spelling.

Algorithm

The Soundex algorithm converts a name to a four-character code. The first character of the code is the first letter of the name, and the remaining three characters are digits that encode the consonants that follow. Similar-sounding consonants share the same digit, while vowels (along with h, w, and y) are not encoded unless they are the first letter.

Steps

1. Retain the first letter of the name.
2. Remove all occurrences of 'h' and 'w' except the first letter.
3. Replace all consonants (including the first letter) with digits as follows:

  - b, f, p, v → 1
  - c, g, j, k, q, s, x, z → 2
  - d, t → 3
  - l → 4
  - m, n → 5
  - r → 6

4. Replace each run of adjacent identical digits with a single digit.
5. Remove all occurrences of a, e, i, o, u, y except the first letter.
6. If the first symbol is now a digit, replace it with the letter retained in step 1.
7. If the result is too short (fewer than four characters), pad with zeros.
8. If the result is too long, truncate to four characters.
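The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `soundex`, the `CODES` table name, the choice to discard non-ASCII and non-letter characters, and the empty-string result for empty input are all assumptions made for this example. The sketch also puts the retained first letter back in front after encoding, since the code must start with a letter.

```python
# Letter-to-digit table, copied from the consonant groups listed above.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return a four-character Soundex code for `name` (illustrative sketch)."""
    # Assumption: only ASCII letters are considered; everything else is dropped.
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""  # assumption: no letters means no code

    first = letters[0]  # retain the first letter

    # Drop 'h' and 'w' everywhere except the first position.
    kept = [first] + [ch for ch in letters[1:] if ch not in "hw"]

    # Replace consonants (including the first letter) with digits;
    # vowels and 'y' stay in place for now so they can separate digits.
    encoded = [CODES.get(ch, ch) for ch in kept]

    # Collapse each run of adjacent identical digits into one digit.
    collapsed = []
    for sym in encoded:
        if sym.isdigit() and collapsed and collapsed[-1] == sym:
            continue
        collapsed.append(sym)

    # Remove vowels and 'y' except in the first position.
    stripped = [collapsed[0]] + [s for s in collapsed[1:] if s not in "aeiouy"]

    # Put the retained first letter back in front.
    stripped[0] = first.upper()

    # Pad with zeros, then truncate to four characters.
    return ("".join(stripped) + "000")[:4]


print(soundex("Robert"))    # R163
print(soundex("Ashcraft"))  # A261
print(soundex("Pfister"))   # P236
```

Keeping the vowels in place until after the collapse step is what lets two identical digits survive when a vowel separates them, while removing 'h' and 'w' first lets the digits on either side of them merge (as in "Ashcraft").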

Example

For example, the Soundex code for "Robert" is R163:

  - R (first letter)
  - o (ignored)
  - b → 1
  - e (ignored)
  - r → 6
  - t → 3

Applications

Soundex is widely used in genealogy for matching surnames that sound similar but are spelled differently. It is also used in data management systems to find duplicate records.
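As a rough illustration of the duplicate-finding use case, the sketch below groups names by their Soundex code and reports every group with more than one member as a set of candidate duplicates. The helper names `soundex` and `candidate_duplicates`, the sample names, and the handling of non-letter characters are assumptions made for this example; a real system would typically review such candidates with further checks.

```python
from collections import defaultdict

# Translation table for the Soundex consonant groups (1 through 6).
_TABLE = str.maketrans("bfpv" + "cgjkqsxz" + "dt" + "l" + "mn" + "r",
                       "1111" + "22222222" + "33" + "4" + "55" + "6")


def soundex(name):
    """Compact Soundex sketch; non-ASCII and non-letter characters are dropped."""
    s = "".join(ch for ch in name.lower() if "a" <= ch <= "z")
    if not s:
        return ""
    # Drop 'h' and 'w' after the first letter, then encode consonants.
    body = s[0] + s[1:].replace("h", "").replace("w", "")
    symbols = body.translate(_TABLE)
    # Collapse runs of adjacent identical digits.
    out = []
    for sym in symbols:
        if not (out and sym.isdigit() and out[-1] == sym):
            out.append(sym)
    # Keep the original first letter plus the digits that follow it.
    tail = "".join(sym for sym in out[1:] if sym.isdigit())
    return (s[0].upper() + tail + "000")[:4]


def candidate_duplicates(names):
    """Group names by code and keep only groups with more than one name."""
    groups = defaultdict(list)
    for name in names:
        groups[soundex(name)].append(name)
    return {code: members for code, members in groups.items() if len(members) > 1}


records = ["Smith", "Robert", "Smyth", "Jones", "Rupert", "Schmidt"]
for code, members in candidate_duplicates(records).items():
    print(code, members)
```

Grouping by code avoids comparing every pair of records directly, at the cost of treating every shared code as a possible match.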

Limitations

Soundex has several limitations:

  - It is designed for English names and may not work well with names from other languages.
  - It can produce the same code for names that sound different.
  - It may not handle names with non-standard spellings well.
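A few of these limitations can be seen with a small experiment. The sketch below is illustrative only: the `soundex` helper is a compact version written for this example, and the name pairs are chosen here rather than taken from any reference list. Whether two names "sound alike" is a judgment call that depends on the speaker.

```python
from itertools import groupby

# Map each consonant to its Soundex digit.
CODES = {ch: digit
         for digit, group in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                              "4": "l", "5": "mn", "6": "r"}.items()
         for ch in group}


def soundex(name):
    """Compact Soundex sketch; non-ASCII and non-letter characters are dropped."""
    s = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not s:
        return ""
    # Drop 'h' and 'w' after the first letter, then encode consonants.
    body = s[:1] + [ch for ch in s[1:] if ch not in "hw"]
    symbols = [CODES.get(ch, ch) for ch in body]
    # groupby merges adjacent identical symbols; merged vowels are discarded
    # below anyway, so only the digit runs matter.
    collapsed = [sym for sym, _ in groupby(symbols)]
    digits = [sym for sym in collapsed[1:] if sym.isdigit()]
    return (s[0].upper() + "".join(digits) + "000")[:4]


# Pairs usually pronounced alike that receive different codes,
# because the first letter is always kept verbatim.
for a, b in [("Catherine", "Katherine"), ("Knight", "Night")]:
    print(a, soundex(a), b, soundex(b))

# A pair of distinct names that collapse to the same code.
print("Robert", soundex("Robert"), "Rupert", soundex("Rupert"))
```

The first two pairs differ only because the leading letter is never encoded, so spelling variants at the start of a name are not matched. The last pair shows the opposite effect, where grouping several consonants under one digit makes distinct names indistinguishable.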


Contributors: Prab R. Tumpati, MD