SoundEx Algorithm

SoundEx is a phonetic algorithm for indexing names by sound, as pronounced in English. It is commonly used with databases to help with searching and is available in many database engines such as PostgreSQL and MySQL. SoundEx is not included with SQLite by default, and there may be situations when you want to use it when searching.

Fortunately the algorithm is not all that difficult. You can read more about SoundEx on Wikipedia, but here are the general steps:

  1. Retain the first letter of the name and drop all other occurrences of a, e, i, o, u, y, h, w.
  2. Replace consonants with digits as follows (after the first letter):
    • b, f, p, v → 1
    • c, g, j, k, q, s, x, z → 2
    • d, t → 3
    • l → 4
    • m, n → 5
    • r → 6
  3. If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by ‘h’ or ‘w’ are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.
  4. If the word has too few letters to assign three numbers, append zeros until there are three numbers. If there are more than three numbers, retain only the first three. The result is always one letter followed by three digits.
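
To make the steps above concrete, here is a minimal sketch in Python (not the Xojo function from the User Guide). The names `CODES` and `soundex` are made up for this illustration, and the sketch assumes plain ASCII names, ignoring any other characters.

```python
# Step 2: map each coded consonant to its digit.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in group:
        CODES[ch] = digit


def soundex(name):
    """Return a letter plus three digits, or "" if name has no letters."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""

    # Step 1: keep the first letter as-is.
    result = letters[0].upper()
    # Step 3 also applies to the first letter, so remember its code.
    prev = CODES.get(letters[0], "")

    for ch in letters[1:]:
        if ch in "hw":
            # 'h' and 'w' are dropped and do not separate equal codes.
            continue
        code = CODES.get(ch, "")  # vowels and 'y' have no code
        if code and code != prev:
            result += code
        # A vowel resets prev, so the same code after it is emitted again.
        prev = code

    # Step 4: pad with zeros, then keep the letter and three digits.
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"), soundex("Ashcraft"))
```

Names that sound alike get the same code: "Robert" and "Rupert" both come out as R163, while "Ashcraft" becomes A261 because the "s" and "c" are separated only by an "h".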

The SoundEx User Guide topic has a Xojo function you can use in your projects. With this function, you can save SoundEx values to a column in the database and use that for comparison when searching.
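
As a rough illustration of the column approach, the Python sketch below stores a SoundEx value next to each name in an in-memory SQLite database and compares on that column when searching. The table `people`, the column `name_soundex`, and the helpers `soundex` and `find_similar` are all hypothetical names chosen for this example. The `soundex` helper is a condensed version of the steps listed earlier, not the Xojo function.

```python
import sqlite3

# Digit for each coded consonant: group 1 is "bfpv", group 2 is "cgjkqsxz", etc.
CODES = {ch: str(digit)
         for digit, group in enumerate(
             ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1)
         for ch in group}


def soundex(name):
    """Condensed SoundEx: one letter plus three digits ("" if no letters)."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    out, prev = letters[0].upper(), CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # ignored, and does not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            out += code
        prev = code
    return (out + "000")[:4]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, name_soundex TEXT)")

# Compute the SoundEx value once, when the row is saved.
for name in ["Robert", "Rupert", "Rubin", "Smith", "Smyth"]:
    conn.execute("INSERT INTO people VALUES (?, ?)", (name, soundex(name)))


def find_similar(conn, query):
    """Return stored names whose SoundEx value matches that of query."""
    rows = conn.execute(
        "SELECT name FROM people WHERE name_soundex = ? ORDER BY name",
        (soundex(query),),
    )
    return [row[0] for row in rows]


print(find_similar(conn, "Rupert"))
print(find_similar(conn, "Smithe"))
```

Because the code is computed when the row is written, the search is an ordinary equality comparison on `name_soundex`, and that column can be indexed like any other.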