MySQL/PHP user input - detecting entries that should be identical but are spelled a little differently

Does anyone know of any (PHP) scripts that can detect user input into a MySQL database that differs only slightly from existing entries?

Or how can you handle such a problem without having to resort to going "manually" through the database entries?

E.g. a script that can detect that

- Traversée Mont Blanc
- Salbitschijen, West Face

are very similar to

- Traversee Mont Blanc
- West face Salbitschijen

and then asks whether the first spelling should be replaced by the second, or vice versa...

thx !


MySQL has a phonetic matching function called SOUNDEX() that may help you.
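To get a feel for what phonetic matching catches, here is a small sketch using PHP's built-in soundex(). This is a stand-in for illustration only: MySQL's SOUNDEX() is a separate implementation and may return codes of a different length, so treat the example names as assumptions and try it on your own data.

```php
<?php
// PHP's soundex() used as a stand-in for MySQL's SOUNDEX(); the codes may differ.
// The name pairs are illustrative and not taken from any real database.
$pairs = [
    ['Blanc', 'Blank'],        // different spelling, same sound
    ['Kathryn', 'Catherine'],  // same sound, but a different first letter
];

foreach ($pairs as [$a, $b]) {
    // The code always starts with the first letter of the word,
    // so a difference there means the codes cannot match.
    $same = soundex($a) === soundex($b);
    printf("%s / %s: %s\n", $a, $b, $same ? 'same code' : 'different codes');
}
```

The first pair gets the same code, the second does not, because the code keeps the word's first letter as-is.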

You can also loop through the entries in PHP and use preg_replace (or similar) to swap accented characters like your é for their plain equivalents like e, and then compare the resulting strings.
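A minimal sketch of that idea, assuming the strings are UTF-8. The function name normalize_name and the character map are made up for this example, and the map is deliberately short, so extend it for whatever characters show up in your data.

```php
<?php
// Hypothetical helper: fold a few accented letters to plain ASCII,
// lowercase the result and tidy up whitespace. Assumes UTF-8 input.
function normalize_name(string $s): string
{
    // Hand-written map; by no means complete.
    $map = [
        'é' => 'e', 'è' => 'e', 'ê' => 'e', 'ë' => 'e',
        'É' => 'e', 'È' => 'e', 'Ê' => 'e', 'Ë' => 'e',
        'à' => 'a', 'â' => 'a', 'ä' => 'a',
        'ô' => 'o', 'ö' => 'o', 'ü' => 'u', 'ç' => 'c',
    ];
    $s = strtr($s, $map);                       // replace the mapped characters
    $s = strtolower($s);                        // ignore case differences
    return preg_replace('/\s+/', ' ', trim($s)); // collapse runs of whitespace
}

$a = 'Traversée Mont Blanc';
$b = 'Traversee Mont Blanc';

if (normalize_name($a) === normalize_name($b)) {
    echo "\"$a\" and \"$b\" look like the same entry\n";
}
```

Comparing the normalized forms leaves the stored values untouched, so you can still decide by hand which spelling to keep.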

You could also use a spellchecking library against a dictionary to try to correct common typos/spelling errors, but that will take a bit more work.

Of course, all of these solutions are best used for GENERATING A REPORT, not for making automatic changes. :)
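As a rough idea of what such a report could look like, here is a sketch that lists pairs of entries with a small edit distance, using PHP's levenshtein(). That function is not one of the suggestions above, just one possible way to flag near-duplicates. The function name similar_pairs and the threshold of 2 are assumptions for the example. Note that levenshtein() counts bytes, so in UTF-8 an é versus e difference counts as 2.

```php
<?php
// Hypothetical helper: return every pair of names whose edit distance
// is at most $maxDistance, as [nameA, nameB, distance].
function similar_pairs(array $names, int $maxDistance = 2): array
{
    $names = array_values($names); // make sure keys are 0..n-1
    $pairs = [];
    $n = count($names);
    for ($i = 0; $i < $n; $i++) {
        for ($j = $i + 1; $j < $n; $j++) {
            // Compare lowercased copies so case differences are ignored.
            $d = levenshtein(strtolower($names[$i]), strtolower($names[$j]));
            if ($d <= $maxDistance) {
                $pairs[] = [$names[$i], $names[$j], $d];
            }
        }
    }
    return $pairs;
}

// In practice $names would come from a SELECT on your table.
$names = ['Traversée Mont Blanc', 'Traversee Mont Blanc', 'Salbitschijen, West Face'];

foreach (similar_pairs($names) as [$a, $b, $d]) {
    echo "Possible duplicate: \"$a\" / \"$b\" (distance $d)\n";
}
```

This compares every entry with every other one, so it is only practical for a modest number of rows, and it only prints suggestions for a person to review.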

Well, for the first case (é vs. e), you'll have to actually search for both spellings, since they're separate characters; for the second (the reordered words), if you search for each word separately, you'll be able to find both versions. You'd be surprised how useless SOUNDEX() is when the "typo" is near the beginning of the word (though I guess it makes sense, since the "sound" is different).
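For the word-by-word idea, one way to do the comparison on the PHP side is to split each entry into words, sort them, and compare the results. This is only a sketch: word_key is a made-up name, the input is assumed to be valid UTF-8, and accents are not folded here, so é and e still count as different.

```php
<?php
// Hypothetical helper: build a key that ignores word order,
// punctuation and (ASCII) case.
function word_key(string $s): string
{
    $s = strtolower($s);
    // Split on anything that is not a letter or a digit.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $s, -1, PREG_SPLIT_NO_EMPTY);
    sort($words); // same words in any order give the same key
    return implode(' ', $words);
}

$a = 'Salbitschijen, West Face';
$b = 'West face Salbitschijen';

if (word_key($a) === word_key($b)) {
    echo "\"$a\" and \"$b\" contain the same words\n";
}
```

Entries that share a key are candidates to show to the user for merging.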

