substr_count() whole words in HTML

gerkintrigg · November 29, 2009

Hi everyone.

I'm trying to count how many times specific words appear in a piece of HTML (with the tags removed).

I used this code:

$plus=substr_count(strip_tags($page), $r['word']);

The issue is that it counts partial matches: a word like "sealing" is counted as an occurrence of "sea".

I looked for it on the net and the closest I found was this:

if ((strlen($body) > 180) && (strlen($body) > 1)) { 
$whitespaceposition = strpos($body," ",175)-1; 
$body = substr($body, 0, $whitespaceposition); 
}

The only issue is that once I strip the tags, there may not be any white space before / after a word, so I'm not convinced it'll work as I want it to.

Any suggestions?
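
The whitespace concern is easy to reproduce: strip_tags() removes tags without putting anything in their place, so text from adjacent elements can run together. Below is a minimal sketch, assuming made-up sample markup ($html is hypothetical). Putting a space before each tag is only one possible workaround and is not something proposed in this thread; it would also add a space before any literal "<" in the text.

```php
<?php
// Hypothetical sample markup, for illustration only.
$html = '<li>sea</li><li>side</li>';

// strip_tags() removes the tags and inserts nothing in their place,
// so the two list items fuse into a single word.
$fused = strip_tags($html);

// One possible workaround: put a space before every "<" first,
// then strip the tags and collapse runs of whitespace.
$spaced = trim(preg_replace('/\s+/', ' ', strip_tags(str_replace('<', ' <', $html))));

echo $fused, "\n";   // seaside
echo $spaced, "\n";  // sea side
```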

salathe · November 29, 2009

You could use preg_match_all, which returns the number of matches for the regexp provided to it (i.e., a value for $plus).

gerkintrigg · November 29, 2009

Like this?

$plus=preg_match_all('~\b'.$r['word'].'\b(?![^<]*?>)~', $r['word'], strip_tags($my_page));

It just seems to give a very very high number... Usually 285 regardless of whether I check 5 words or 2...

salathe · November 29, 2009

Are you using exactly that code? If so, it makes no sense: the regex looks for a word, but the subject you pass is just that word itself, and you pass a string where a by-reference variable (for the matches) is expected.

From the looks of things, you probably want more like:

$plus = preg_match_all('/\b'.preg_quote($r['word'], '/').'\b/', strip_tags($my_page));
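
For anyone wanting to try this in isolation, here is a self-contained sketch of the same whole-word count. The function name count_whole_word() and the sample text are hypothetical, and plain parameters stand in for $r['word'] and $my_page. A $matches variable is passed as well because PHP versions before 5.4 require that third argument. Note that \b only behaves as a word boundary here if the search word itself starts and ends with a word character.

```php
<?php
/**
 * Count whole-word occurrences of $word in the text of $html.
 * Hypothetical helper, not taken from the thread.
 */
function count_whole_word(string $html, string $word): int
{
    // preg_quote() escapes regex metacharacters in the word; its
    // second argument also escapes the "/" delimiter.
    $pattern = '/\b' . preg_quote($word, '/') . '\b/';

    // preg_match_all() returns the number of full matches, or false on error.
    $count = preg_match_all($pattern, strip_tags($html), $matches);

    return $count === false ? 0 : $count;
}

// Made-up sample page, for illustration only.
$page = '<p>The sea was calm. They were sealing the deck by the sea.</p>';

echo count_whole_word($page, 'sea'), "\n";          // 2
echo substr_count(strip_tags($page), 'sea'), "\n";  // 3
```

The second line of output shows the original problem: substr_count() also counts the "sea" inside "sealing".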

gerkintrigg · November 29, 2009

I just came across str_word_count.

Is there a way of using that?

I have tried:

<?php
$txt = "Text Text Web Web Text";
// Count the words in $txt (the original passed an undefined $summary).
$words = str_word_count($txt);
echo "Total words in summary: $words";
?>

This outputs "5".

Is there a way of passing other parameters to this function to only match words like "Web" so that it outputs "2"?

Or does that need a more complex regex?

salathe · November 29, 2009

Is there a way of passing other parameters to this function to only match words like "Web" so that it outputs "2"?

In short, no.
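
str_word_count() has no parameter for matching one particular word, but passing 1 as its second argument makes it return the words as an array, and that array can be tallied with array_count_values(). Below is a minimal sketch using the sample string from the question; the variable names $counts and $web are made up for illustration. The tally is case-sensitive, and what str_word_count() treats as a word depends on the locale.

```php
<?php
$txt = "Text Text Web Web Text";

// Format 1 returns an array of the words instead of a total.
$words = str_word_count($txt, 1);

// Tally how often each distinct word occurs (case-sensitive).
$counts = array_count_values($words);

// Default to 0 when the word is absent.
$web = $counts['Web'] ?? 0;

echo "Occurrences of \"Web\": $web\n";  // Occurrences of "Web": 2
```

For text that comes from stripped HTML, the preg_match_all() approach with \b is probably still the more direct fit, since it needs no intermediate array.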
