
Common words


The Little Guy


I found a function online that searches a string and finds the words that occur most often in it. I'm not sure, but I think the following line is not 100% right:

preg_match_all('/([a-z]*?)(?=s)/i', $string, $matchWords);

 

For example, if I have a string about breast cancer, it finds matches, but it only returns "brea", even though the string contains the words:

breast

breasts

 

I would like to match full words, including words with prefixes and suffixes.

 

Any suggestions?

 

Here is the function I found:

function commonWords($string){
$stopWords = array('i','a','about','an','and','are','as','at','be','by','com','de','en','for','from','how','in','is','it','la','of','on','or','that','the','this','to','was','what','when','where','who','will','with','und','the','www');

$string = preg_replace('/ss+/i', '', $string);
$string = trim($string); // trim the string
$string = preg_replace('/[^a-zA-Z0-9 -]/', '', $string); // only take alphanumerical characters, but keep the spaces and dashes too
$string = strtolower($string); // make it lowercase

preg_match_all('/([a-z]*?)(?=s)/i', $string, $matchWords);
$matchWords = $matchWords[0];
foreach ( $matchWords as $key=>$item ) {
	if ( $item == '' || in_array(strtolower($item), $stopWords) || strlen($item) < 5) {
		unset($matchWords[$key]);
	}
}
$wordCountArr = array();
if ( is_array($matchWords) ) {
	foreach ( $matchWords as $key => $val ) {
		$val = strtolower($val);
		if ( isset($wordCountArr[$val]) ) {
			$wordCountArr[$val]++;
		} else {
			$wordCountArr[$val] = 1;
		}
	}
}
arsort($wordCountArr);
//$wordCountArr = array_slice($wordCountArr, 0, 10);
return $wordCountArr;                
}
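
One possible explanation, offered as a guess rather than a confirmed diagnosis: as posted, the pattern contains `(?=s)`, which is a lookahead for the literal letter "s", so the lazy `[a-z]*?` stops just before the first "s" and returns "brea". If the function originally used `(?=\s)` (a lookahead for whitespace), the backslash may have been lost when it was copied; the `/ss+/` on the whitespace-collapsing line looks like it could be the same kind of loss. The sketch below compares the two patterns and shows one alternative that matches whole words with `\b`. The name `countCommonWords` and its parameters are hypothetical, not part of the function above.

```php
<?php
// Hypothetical helper (not from the original post): count words of at least
// $minLength letters, skipping any stop words passed in.
function countCommonWords(string $text, array $stopWords = [], int $minLength = 5): array
{
    // Lowercase first so "Breast" and "breast" are counted together.
    $text = strtolower($text);

    // \b is a word boundary, so each match is a complete run of letters.
    preg_match_all('/\b[a-z]+\b/', $text, $matches);

    $counts = [];
    foreach ($matches[0] as $word) {
        if (strlen($word) < $minLength || in_array($word, $stopWords, true)) {
            continue;
        }
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }

    arsort($counts); // most frequent first
    return $counts;
}

// Compare the pattern as posted with a version where \s is restored.
$sample = 'breast breasts ';
preg_match_all('/([a-z]*?)(?=s)/i', $sample, $asPosted);  // lookahead for the letter "s"
preg_match_all('/([a-z]*?)(?=\s)/i', $sample, $restored); // lookahead for whitespace
```

With the sample string, `$asPosted[0]` contains "brea" but not "breast", while `$restored[0]` contains both "breast" and "breasts". The `\b[a-z]+\b` form does not depend on whitespace following each word, so it also picks up the last word of a trimmed string, which the lookahead version would miss. Grouping "breast" and "breasts" under a single entry would need a stemming step, which this sketch does not attempt.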

