MySQL/PHP user input - detecting input that should be exactly similar but is spelled a little differently


2 replies to this topic

#1 xenonoverlord

xenonoverlord
  • New Members
  • Pip
  • Newbie
  • 1 posts

Posted 13 February 2006 - 01:44 PM

Hello,

Does anyone know of any PHP scripts that can detect user input in a MySQL database that differs only slightly from an existing entry?

Or how can you handle such a problem without having to resort to going "manually" through the database entries?

E.g. a script that can detect that

- Traversée Mont Blanc
- Salbitschijen, West Face

is very similar to

- Traversee Mont Blanc
- West face Salbitschijen

And asks if 1 should be replaced by 2 or vice versa...

thx !

Tom

#2 wickning1

wickning1
  • Members
  • PipPipPip
  • Advanced Member
  • 405 posts

Posted 13 February 2006 - 03:39 PM

MySQL has a phonetic matching function called SOUNDEX() that may help you.

You can use PHP to loop through the entries, use preg_replace (or similar) to swap out characters like your é for e, and then compare the resulting strings.

You could also use a spellchecking library against a dictionary to try to correct common typos/spelling errors, but that will take a bit more work.

Of course all these solutions are best used for GENERATING A REPORT, not making automatic changes. :)
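To make the "swap out the accents, then compare, then report" idea concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration; the function names `fold` and `similarity_report` and the 0.85 threshold are assumptions made up for this example, not anything from an existing script. It only builds a list of suspicious pairs and changes nothing.

```python
import unicodedata
from difflib import SequenceMatcher
from itertools import combinations


def fold(text):
    """Lowercase and strip diacritics, e.g. 'Traversée' -> 'traversee'."""
    # NFKD splits 'é' into 'e' plus a combining accent, which we then drop.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def similarity_report(names, threshold=0.85):
    """Return (a, b, score) for every pair whose folded forms look alike.

    The threshold is an arbitrary starting point and would need tuning
    against real data.
    """
    report = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, fold(a), fold(b)).ratio()
        if score >= threshold:
            report.append((a, b, round(score, 2)))
    return report


routes = [
    "Traversée Mont Blanc",
    "Traversee Mont Blanc",
    "Salbitschijen, West Face",
]
for a, b, score in similarity_report(routes):
    print(f"{a!r} looks like {b!r} (score {score})")
```

Comparing every pair is quadratic in the number of rows, so this suits an occasional cleanup report more than a check on every insert. In PHP, the built-in `levenshtein()` or `similar_text()` functions could play the role of the similarity score.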

#3 fenway

fenway
  • Staff Alumni
  • MySQL Si-Fu / PHP Resident Alien
  • 16,199 posts
  • LocationToronto, ON

Posted 13 February 2006 - 05:15 PM

Well, for the first case, you'll have to actually search for both, since they're separate characters; for the second, if you search for each word separately, you'll be able to find both cases. You'd be surprised how useless SOUNDEX() is when the "typo" is near the beginning of the word (not surprisingly, I guess, since the "sound" is different).
Seriously... if people don't start reading this before posting, I'm going to consider not answering at all.
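The "search for each word separately" idea for the second case can be sketched as comparing entries by their set of words, ignoring order, case, punctuation and accents. This is an illustrative Python sketch, not a PHP or MySQL solution; `word_key` and `group_duplicates` are hypothetical names invented for the example.

```python
import re
import unicodedata
from collections import defaultdict


def word_key(text):
    """Build an order-independent key from the words of an entry."""
    # Strip accents first so 'Traversée' and 'Traversee' give the same word.
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep only word characters, lowercase them, and sort to ignore word order.
    words = re.findall(r"\w+", plain.casefold())
    return tuple(sorted(words))


def group_duplicates(names):
    """Group entries that share a word key; return groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[word_key(name)].append(name)
    return [members for members in groups.values() if len(members) > 1]


routes = [
    "Traversée Mont Blanc",
    "Traversee Mont Blanc",
    "Salbitschijen, West Face",
    "West face Salbitschijen",
]
for members in group_duplicates(routes):
    print("Possible duplicates:", members)
```

This only catches entries whose words match exactly after folding. A real typo inside a word would still need a fuzzy comparison on top, and the result is best treated as a list for a human to review.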



