newb Posted June 3, 2010

How can I split compound words in an arbitrary PHP string? For example, I have strings such as $word = 'ohmygod'. Is it possible for PHP to detect the individual words that were joined together and separate them to form $word = 'oh my god'? Thanks.
newb (Author) Posted June 3, 2010

Another example: is it possible for PHP to do something like what Google does here: http://www.google.com/search?q=sportscar When I enter 'sportscar' in the search, it says "did you mean 'sports car'" and separates the two words.
Karlos94 Posted June 3, 2010

I believe that is down to a JavaScript spell check or something along those lines. However, if you're not a fan of JavaScript, there is a PHP alternative, but you'll need to do quite a bit of work. The PHP function: levenshtein()
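For readers unfamiliar with levenshtein(): it returns the edit distance between two strings, so a "did you mean" lookup can be sketched by picking the candidate with the smallest distance. The helper name closest_word and the candidate list below are made up for illustration, and this is only a rough sketch of the idea, not how Google's suggestion works.

```php
<?php
// Hypothetical helper: returns the entry of $candidates with the smallest
// edit distance to $input (the first one wins on ties), or null if the
// candidate list is empty.
function closest_word(string $input, array $candidates): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    foreach ($candidates as $candidate) {
        $distance = levenshtein($input, $candidate);
        if ($distance < $bestDistance) {
            $best = $candidate;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// 'sports car' is one insertion (the space) away from 'sportscar'.
echo closest_word('sportscar', ['sports car', 'sport', 'spots']), "\n";
```

Note that this only works against a known list of candidates, so it does not by itself find word boundaries in an arbitrary string.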
newb (Author) Posted June 3, 2010

OK, but the string is always some random combination of joined words like 'ohmygod' and 'laughoutloud', never the same one twice, and I need the words to be split up, not joined. The function you've pointed me to says I need a set array of words to check against in order for it to work. However, the string could be any combination of joined words, so I think this is no good for me.
newb (Author) Posted June 3, 2010

Is it possible to use preg_split to split joined words such as 'ohmygod'?
jd307 Posted June 3, 2010

I am by no means an expert here, but I *think* that this would require some kind of dictionary of words that a script could look up, as PHP itself does not know what would constitute each word in the string. For example, 'laughoutloud' would require a dictionary that had 'laugh', 'out' and 'loud' in it, so your script could extract each word from the string. You could possibly do something similar to this (though it is a very long method and not very efficient):

    $word = "laughoutloud";
    $dict1 = "laugh";
    $dict2 = "out";
    $dict3 = "loud";

    // Get length of string
    $len = strlen($word);
    $pos = 0;
    $string = "";

    // Add one character at a time until $string matches a dictionary word
    while ($pos < $len) {
        $string = $string . substr($word, $pos, 1);
        $pos = $pos + 1;
        if ($string == $dict1 || $string == $dict2 || $string == $dict3) {
            break;
        }
    }

    echo $string;

This will output $string as "laugh". Of course, you'd have to modify this code so that it handles the rest of the word: it figures out the first word and then quits the while loop, so you'd have to make it continue to process the "outloud" part of the initial string. You'd also have to develop the dictionary.

How does the word get constructed in the first place? If it is built from a list of words that get strung together, then that list could serve as your dictionary.
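The "continue with the rest of the string" step described above can be sketched as a recursive split that backs up when it hits a dead end. This is a minimal sketch, not a tuned implementation: the function name split_words and the tiny dictionary are hypothetical, and the dictionary is assumed to be an array keyed by word so that isset() can be used for lookups.

```php
<?php
// Hypothetical helper: splits $text into dictionary words, or returns null
// when no complete split exists. $dict maps word => true for fast lookups.
function split_words(string $text, array $dict): ?array
{
    if ($text === '') {
        return [];                 // nothing left: the split is complete
    }
    // Try the longest prefix first, then progressively shorter ones.
    for ($len = strlen($text); $len >= 1; $len--) {
        $prefix = substr($text, 0, $len);
        if (!isset($dict[$prefix])) {
            continue;
        }
        // Split the remainder; a null result means this prefix was a dead end.
        $rest = split_words((string) substr($text, $len), $dict);
        if ($rest !== null) {
            return array_merge([$prefix], $rest);
        }
    }
    return null;
}

$dict = array_fill_keys(['laugh', 'out', 'loud', 'oh', 'my', 'god'], true);
echo implode(' ', split_words('laughoutloud', $dict)), "\n"; // laugh out loud
echo implode(' ', split_words('ohmygod', $dict)), "\n";      // oh my god
```

The backtracking can get slow on long strings with many overlapping dictionary words, but for short names like the examples in this thread it should be fine.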
newb (Author) Posted June 3, 2010

The compound word strings come from URL names that are joined together, for example phpfreaks.com, and I'd like to put a space in between 'php' and 'freaks' somehow. Also, how could I add a dictionary list to the PHP script?
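On the dictionary question: one common approach is to keep a plain-text word list, one word per line, and load it into an array keyed by word. The sketch below assumes such a file exists; the function name load_dictionary is made up, and the demo writes a temporary file only to stand in for a real word list.

```php
<?php
// Hypothetical helper: reads one word per line from $path and returns an
// array keyed by lowercase word, suitable for isset() lookups.
function load_dictionary(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read word list: $path");
    }
    $dict = [];
    foreach ($lines as $line) {
        $word = strtolower(trim($line));
        if ($word !== '') {
            $dict[$word] = true;
        }
    }
    return $dict;
}

// Demo: a temporary file stands in for a real word list.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "PHP\nfreaks\nlaugh\n");
$dict = load_dictionary($path);
unlink($path);

var_dump(isset($dict['php']), isset($dict['freaks']), isset($dict['xyz']));
```

Keying the array by word means each lookup is a single isset() call rather than a scan of the whole list with in_array().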
newb (Author) Posted June 3, 2010

So I guess this is impossible ='(
newb (Author) Posted June 3, 2010

Well, I've written something using pspell. It's pretty sloppy, but it works. It can split at most three joined words, as that's all I require. If anyone thinks they can write something better, let me know.

    <?php
    $sentence = "laughoutloud";

    function extractwords($sentence) {
        $pspell_link = pspell_new("en_us");

        // Defaults, in case no dictionary word is found
        $firstword = $secondword = $thirdword = $remaining = "";

        // First word: try the longest prefix first, then shorter ones
        $size = strlen($sentence);
        for ($i = 0; $i < $size - 1; $i++) {
            $currentword = substr($sentence, 0, $size - $i);
            if (pspell_check($pspell_link, $currentword)) {
                $firstword = $currentword;
                $remaining = substr($sentence, strlen($firstword));
                break;
            }
        }

        // Second word: same search on the remainder; what is left is the third word
        $size = strlen($remaining);
        for ($i = 0; $i < $size - 1; $i++) {
            $secword = substr($remaining, 0, $size - $i);
            if (pspell_check($pspell_link, $secword)) {
                $secondword = $secword;
                $thirdword = substr($remaining, strlen($secondword));
                break;
            }
        }

        echo "$firstword<br />";
        echo "$secondword<br />";
        echo "$thirdword<br />";
    }

    extractwords($sentence);
    ?>
jd307 Posted June 6, 2010

I am no expert by any means, so the code example I gave is just basic. It looks like you are onto the right kind of idea. The only problem I have just thought of, and which I don't know how you would solve, is words that are made up of other words, e.g. yourself = your and self. Using my dictionary idea, if it detected the word "YOUR" it would separate that out, and you would end up with YOUR and SELF as two different words with a space between them instead of one whole word.

The only other suggestion I can offer is that if your URLs will only ever produce a limited number of possibilities, then you could "hard code" the wording or set up a database for it, e.g.:

    $sentence = "laughoutloud";
    if ($sentence == "laughoutloud") $word = "Laugh Out Loud";
    if ($sentence == "ohyummyihavepizza") $word = "Oh yummy I have pizza";

and so on...

I am sorry I cannot give any further assistance on this one, because I have no other clue how to tackle it. Good luck though!
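One possible way around the "yourself" problem, assuming the dictionary contains the compound word itself, is to prefer the split that uses the fewest words. The sketch below does that with a small table of best splits per position; fewest_words and the sample dictionary are hypothetical, and the approach cannot resolve cases where the two-word reading was actually intended.

```php
<?php
// Hypothetical helper: returns the split of $text that uses the fewest
// dictionary words, or null if no split exists. $dict maps word => true.
function fewest_words(string $text, array $dict): ?array
{
    $n = strlen($text);
    // $best[$i] holds the shortest word list covering the first $i characters.
    $best = [0 => []];
    for ($end = 1; $end <= $n; $end++) {
        for ($start = 0; $start < $end; $start++) {
            if (!isset($best[$start])) {
                continue;          // the first $start characters cannot be split
            }
            $word = substr($text, $start, $end - $start);
            if (!isset($dict[$word])) {
                continue;
            }
            $candidate = array_merge($best[$start], [$word]);
            if (!isset($best[$end]) || count($candidate) < count($best[$end])) {
                $best[$end] = $candidate;
            }
        }
    }
    return $best[$n] ?? null;
}

$dict = array_fill_keys(['your', 'self', 'yourself', 'help'], true);
echo implode(' ', fewest_words('helpyourself', $dict)), "\n"; // help yourself
```

If 'yourself' is missing from the dictionary, the function falls back to 'help your self', so the quality of the result still depends on the word list.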