Topic: https://forums.phpfreaks.com/topic/120268-misspelling-generator/
rodrico101 Posted August 18, 2008
Hello, I am trying to write a script that generates misspelled versions of a word to use in searches (e.g. searching auction sites for bargains listed under a misspelled name). I want to generate simple typos (e.g. Beatles => ebatles, beatels) and also substitutions of letters that are close to each other on the keyboard (e.g. B => v, n, g, h, f). I'm not quite sure where to start. Once all the words are generated, I want to put them all together and send them to a web search. Any help would be appreciated. Rod
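The two kinds of typo described in the question (swapped neighbouring letters and keys that sit next to each other) could be generated roughly as follows. This is only a sketch: the function names are hypothetical, the adjacency table assumes a QWERTY layout, and only single-byte lowercase letters are handled.

```php
<?php
// Hypothetical helpers for illustration; none of these names come from the thread.

/** Every variant of $word with one pair of neighbouring letters swapped. */
function transpositions(string $word): array
{
    $out = [];
    for ($i = 0; $i < strlen($word) - 1; $i++) {
        if ($word[$i] === $word[$i + 1]) {
            continue; // swapping identical letters changes nothing
        }
        $typo = $word;
        $typo[$i] = $word[$i + 1];
        $typo[$i + 1] = $word[$i];
        $out[] = $typo;
    }
    return $out;
}

/** Every variant of $word with one letter replaced by a nearby key. */
function adjacentKeyTypos(string $word): array
{
    // Assumed QWERTY neighbours: left/right keys plus the keys above and below.
    $neighbours = [
        'q' => 'wa',     'w' => 'qeas',   'e' => 'wrsd',   'r' => 'etdf',
        't' => 'ryfg',   'y' => 'tugh',   'u' => 'yihj',   'i' => 'uojk',
        'o' => 'ipkl',   'p' => 'ol',     'a' => 'qwsz',   's' => 'adwezx',
        'd' => 'sferxc', 'f' => 'dgrtcv', 'g' => 'fhtyvb', 'h' => 'gjyubn',
        'j' => 'hkuinm', 'k' => 'jliom',  'l' => 'kop',    'z' => 'asx',
        'x' => 'zcsd',   'c' => 'xvdf',   'v' => 'cbfg',   'b' => 'vngh',
        'n' => 'bmhj',   'm' => 'njk',
    ];
    $out = [];
    for ($i = 0; $i < strlen($word); $i++) {
        $key = $word[$i];
        if (!isset($neighbours[$key])) {
            continue; // digits, spaces and punctuation are left alone
        }
        foreach (str_split($neighbours[$key]) as $near) {
            $typo = $word;
            $typo[$i] = $near;
            $out[] = $typo;
        }
    }
    return $out;
}

/** Both kinds of typo combined, lowercased and de-duplicated. */
function misspellings(string $word): array
{
    $word = strtolower($word);
    $all = array_merge(transpositions($word), adjacentKeyTypos($word));
    return array_values(array_unique($all));
}

// Join the variants into one space-separated string.
echo implode(' ', misspellings('Beatles')), "\n";
```

How the list is joined for the web search depends on the query syntax of the site being searched, so the separator used here is a placeholder.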
Fadion Posted August 18, 2008
I don't think there is an automated technique for achieving this. You could keep a dictionary of commonly misspelled words and search within it. Or, in reverse, keep a dictionary of common words and, when a user searches for "ebatles", scramble it until it finds a word in the database. It sounds dull, but I have no smart ideas for this.
unkwntech Posted August 18, 2008
I have a system that generates regexes to catch intentionally misspelled words; this might help you get started.
<?php
// Map each letter to a regex fragment matching the letter and its usual look-alikes.
// Multi-character substitutes ("I3" for b, "ph" for f, "vv" for w) need alternation,
// because a character class only ever matches a single character.
$replace = [
    'a' => '[aA@]',
    'b' => '(?:[bB]|[Ili]3)',
    'c' => '[cC(kK]',
    'd' => '[dD]',
    'e' => '[eE3]',
    'f' => '(?:[fF]|[pP][hH])',
    'g' => '[gG]',
    'h' => '[hH]',
    'i' => '[iIl!1]',
    'j' => '[jJ]',
    'k' => '[cC(kK]',
    'l' => '[lL1!i]',
    'm' => '[mM]',
    'n' => '[nN]',
    'o' => '[oO0]',
    'p' => '[pP]',
    'q' => '[qQ]',
    'r' => '[rR]',
    's' => '[sS$5]',
    't' => '[tT7]',
    'u' => '[uUvV]',
    'v' => '[vVuU]',
    'w' => '(?:[wW]|vv|VV)',
    'x' => '[xX]',
    'y' => '[yY]',
    'z' => '[zZ2]',
];

$input = isset($_POST['word']) ? $_POST['word'] : '';
$word = str_split(strtolower($input));

foreach ($word as $i => $char) {
    if (isset($replace[$char])) {
        // Known letter: swap in its fragment.
        $word[$i] = $replace[$char];
    } else {
        // Digits and spaces pass through unchanged; punctuation is escaped.
        $word[$i] = preg_quote($char, '/');
    }
}

//$word = "/" . implode('', $word) . "/";
echo implode('', $word);
rodrico101 Posted August 18, 2008
Thanks...I will take a look at that... Rod
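The output of unkwntech's script is a regex fragment rather than a list of words, so it is used by wrapping it in delimiters and matching it against text. A minimal sketch of that step, assuming a cut-down substitution map and a hypothetical function name `buildLeetPattern`:

```php
<?php
// Hypothetical helper; the map is a shortened version for illustration only.
function buildLeetPattern(string $word): string
{
    $map = [
        'a' => '[aA@]',
        'e' => '[eE3]',
        'l' => '[lL1!i]',
        'o' => '[oO0]',
        's' => '[sS$5]',
        't' => '[tT7]',
    ];
    $pattern = '';
    foreach (str_split(strtolower($word)) as $char) {
        // Letters missing from the map are matched literally (escaped for safety).
        $pattern .= $map[$char] ?? preg_quote($char, '/');
    }
    // The "i" flag covers upper/lower case for the literal letters.
    return '/' . $pattern . '/i';
}

$pattern = buildLeetPattern('beatles');
echo $pattern, "\n";

// preg_match returns 1 when the text contains a disguised spelling of the word.
var_dump(preg_match($pattern, 'rare B3@7l3$ vinyl'));
```

Note that this approach recognises look-alike spellings in existing text; it does not enumerate the misspellings themselves.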
natbob Posted August 19, 2008
This may not work for you, but you could use a function like levenshtein (http://www.php.net/manual/en/function.levenshtein.php). It tells you how close two words are to each other. It would be used on the fly, comparing the search term against the database of items, like this:
<?php
// Note: the mysql_* functions were removed in PHP 7; mysqli_fetch_assoc($result)
// is the equivalent call when $result comes from mysqli.
$short = -1; // smallest distance found so far; -1 means no candidate yet
$search = strtolower($_GET['word']);

while ($row = mysql_fetch_assoc($result)) {
    $lev = levenshtein($search, strtolower($row['name']), 1, 1, 1);

    if ($lev == 0) { // perfect match: stop looking
        $ans = $row['name'];
        $id = $row['id'];
        $short = 0;
        break;
    }

    if ($lev <= $short || $short < 0) { // first candidate, or at least as close as the best so far
        $ans = $row['name'];
        $id = $row['id'];
        $short = $lev;
    }
}
?>
That code takes a MySQL query result and finds the closest match to the input ($_GET['word']), leaving the matching name in $ans and its id in $id.
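The same closest-match idea can be tried without a database. The sketch below assumes the candidate names are already in an array; the function name `closestMatch` and the sample names are made up for illustration.

```php
<?php
// Hypothetical helper: returns the candidate with the smallest edit distance,
// or null when the candidate list is empty.
function closestMatch(string $search, array $names): ?string
{
    $best = null;
    $shortest = -1; // -1 means no candidate seen yet

    foreach ($names as $name) {
        $distance = levenshtein(strtolower($search), strtolower($name));

        if ($distance === 0) {
            return $name; // exact match, nothing can beat it
        }
        if ($shortest < 0 || $distance < $shortest) {
            $best = $name;
            $shortest = $distance;
        }
    }
    return $best;
}

// Made-up sample data.
$names = ['Beatles', 'Beach Boys', 'Bee Gees'];
echo closestMatch('ebatles', $names), "\n"; // prints "Beatles"
```

Plain Levenshtein distance counts two swapped neighbouring letters as two edits, so "ebatles" is at distance 2 from "beatles" rather than 1.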
corbin Posted August 19, 2008
You could start with http://php.net/soundex, but for that you will need a dictionary.... Edit: Oh! There's also a spelling module in PHP, but I don't remember what it's called.
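As a rough illustration of the soundex idea, the sketch below compares the soundex key of a typed word against a small word list. The list and the function name `soundsLike` are assumptions made up for the example.

```php
<?php
// Hypothetical helper: returns the dictionary words that share a soundex key
// with the typed word.
function soundsLike(string $typed, array $dictionary): array
{
    $code = soundex($typed);
    return array_values(array_filter($dictionary, function ($word) use ($code) {
        return soundex($word) === $code;
    }));
}

// Made-up dictionary for illustration.
$dictionary = ['beatles', 'stones', 'kinks'];

print_r(soundsLike('beatels', $dictionary)); // finds "beatles"
print_r(soundsLike('ebatles', $dictionary)); // finds nothing
```

Soundex keeps the first letter of the word in its key, so a typo that changes the first letter (as in "ebatles") is not matched this way.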
teng84 Posted August 19, 2008
This? http://www.php.net/manual/en/function.pspell-check.php
corbin Posted August 19, 2008
Yup! Nice.
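For reference, a minimal sketch of how the pspell functions linked above might be used. It assumes the pspell extension and an English aspell dictionary are installed, and the function name `spellingSuggestions` is hypothetical. Note that pspell works in the opposite direction to the original question: it maps a misspelling back to likely correct words.

```php
<?php
// Hypothetical helper: returns spelling suggestions for $word, or an empty
// array when pspell or its English dictionary is unavailable.
function spellingSuggestions(string $word): array
{
    if (!function_exists('pspell_new')) {
        return []; // pspell extension is not loaded
    }

    $dictionary = @pspell_new('en');
    if ($dictionary === false) {
        return []; // no English dictionary could be opened
    }

    if (pspell_check($dictionary, $word)) {
        return [$word]; // already spelled correctly
    }

    $suggestions = pspell_suggest($dictionary, $word);
    return $suggestions === false ? [] : $suggestions;
}

print_r(spellingSuggestions('gutiar'));
```

The suggestions returned depend entirely on the installed dictionary, so no particular output is shown here.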