gerkintrigg Posted November 29, 2009

Hi everyone. I'm trying to count how many instances of specific words are in a piece of code (with the tags removed). I used this code:

    $plus = substr_count(strip_tags($page), $r['word']);

The issue is that it counts words like "sealing" as containing words like "sea". I searched the net and the closest thing I found was this:

    if ((strlen($body) > 180) && (strlen($body) > 1)) {
        $whitespaceposition = strpos($body, " ", 175) - 1;
        $body = substr($body, 0, $whitespaceposition);
    }

The only issue is that once I strip the tags, there may not be any white space before or after a word, so I'm not convinced it will work as I want it to. Any suggestions?

Link to comment: https://forums.phpfreaks.com/topic/183340-substr_count-whole-words-in-html/

salathe Posted November 29, 2009

You could use preg_match_all, which returns the number of matches for the regexp provided to it (i.e., a value for $plus).

gerkintrigg (Author) Posted November 29, 2009

Like this?

    $plus=preg_match_all('~\b'.$r['word'].'\b(?![^<]*?>)~', $r['word'], strip_tags($my_page));

It just seems to give a very, very high number, usually 285, regardless of whether I check 5 words or 2.

salathe Posted November 29, 2009

Are you using exactly that code? If so, it makes no sense at all: the regex looks for a word, but you supply only that word itself as the subject parameter, and you pass a string where a by-reference variable should go. From the looks of things, you probably want something more like:

    $plus = preg_match_all('/\b'.preg_quote($r['word'], '/').'\b/', strip_tags($my_page));
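
A minimal sketch of the suggestion above, wrapped in a function so it can be tried on its own. The name count_whole_word and the sample HTML are made up for illustration and do not come from the thread. The sketch also passes a $matches variable explicitly, on the assumption that the code may run on an older PHP version where the third argument of preg_match_all is required.

```php
<?php
// Hypothetical helper (not from the thread): counts whole-word,
// case-sensitive occurrences of $word in $html once the tags are stripped.
function count_whole_word(string $html, string $word): int
{
    $text = strip_tags($html);

    // preg_quote() escapes regex metacharacters in the word;
    // '/' is passed because it is the pattern delimiter used here.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/';

    // $matches receives the matched strings; only the count is used.
    $count = preg_match_all($pattern, $text, $matches);

    // preg_match_all() returns false on error; treat that as zero here.
    return $count === false ? 0 : $count;
}

// "Sealing" is not counted, the two standalone "sea" words are.
echo count_whole_word('<p>The sea is calm.</p><p>Sealing the sea wall.</p>', 'sea');
```

With the sample input this prints 2. One caveat that may matter for the original question: strip_tags() removes tags without inserting anything in their place, so text from adjacent elements can fuse together (for example '<li>sea</li><li>wall</li>' becomes 'seawall'), and \b will then no longer see "sea" as a separate word.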

gerkintrigg (Author) Posted November 29, 2009

I just came across str_word_count. Is there a way of using that? I have tried:

    <?php
    $summary = "Text Text Web Web Text";
    $words = str_word_count($summary);
    echo "Total words in summary: $words";
    ?>

This outputs "5". Is there a way of passing other parameters to this function so that it only matches words like "Web" and outputs "2"? Or does that need a more complex regex?

salathe Posted November 29, 2009

Quote: Is there a way of passing other parameters to this function to only match words like "Web" so that it outputs "2"?

In short, no.
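
Although str_word_count has no parameter for counting one particular word, its array-returning format can be combined with array_count_values to get per-word totals. The following is only a sketch of that alternative, not something proposed in the thread. It reuses the sample string from the question, and the variable names $counts and $webCount are illustrative.

```php
<?php
$summary = "Text Text Web Web Text";

// Format 1 makes str_word_count() return an array of the words found
// instead of just their total.
$words = str_word_count($summary, 1);

// array_count_values() maps each distinct word to how often it occurs.
$counts = array_count_values($words);

// Fall back to 0 when the word does not occur at all.
$webCount = $counts['Web'] ?? 0;

echo "Occurrences of Web: $webCount";
```

This prints "Occurrences of Web: 2". The comparison is case-sensitive, and str_word_count has its own idea of what a word is (letters plus apostrophes and hyphens, depending on locale), so results may differ from the \b-based regex for some inputs.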