Need a few pointers


newbtophp

Hi, I'm in a bit of trouble.

 

If I have a list of words each on a separate line:

 

Example Words:

something
say
phpfreaks us all
something
help
php
php arithmetic
php functions

 

How can I place each word in a variable (like with foreach)?

 

So at the end, if I echo that variable inside the foreach, it should echo 5 words.

 

Also, how would I retrieve the top recurring word from the words? (e.g. in this case it's 'php', since that word is mentioned the most)

 

:-\

 


// $array = $_POST['words'];
// would give you the array I build below.
// The code following it gives you the word count.

// file_put_contents($file, $str); // writes your string to $file
// $str = file_get_contents($file); // reads the contents into a string
// $array = file($file, FILE_IGNORE_NEW_LINES); // reads the contents into $array, with each line as an element

 

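// Every entry is quoted: bare words are treated as undefined constants
// and raise an error in PHP 8.
$array=array(
'something',
'say',
'phpfreaks us all',
'something',
'help',
'php',
'php arithmetic',
'php functions'
);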
$str = implode(" ", $array);

$word_array_check=array();

foreach($array as $word)
{
$count = substr_count($str,$word);
$check= "\"$word\" occurred $count times";
array_push($word_array_check,$check);
echo "$check<br>";
}

echo '<p>';
$word_array = array_unique($word_array_check);

foreach($word_array as $key=>$value)
{
  echo "$value<br>";
}

 

 

HTH

Teamatomic


I could've sworn there was a built-in function that returns an associative array containing each string and how many times it occurs. Maybe I'm just going crazy though. You could use something like this:

 

function count_word_occurrences($str, $case_sensitive = true)
{
	// Lowercase once up front when matching case-insensitively.
	$str = $case_sensitive ? $str : strtolower($str);
	$occurrences = array();
	foreach(str_word_count($str, 1) as $word)
	{
		// Count each distinct word only once.
		if(!isset($occurrences[$word]))
		{
			// preg_quote escapes any regex metacharacters in the word.
			$occurrences[$word] = preg_match_all('~\b' . preg_quote($word, '~') . '\b~', $str, $matches);
		}
	}
	return $occurrences;
}

$text = <<<TEXT
The Law, as quoted, lays down a fair conduct of life, and one not easy to follow.
I have been fellow to a beggar again and again under circumstances which prevented 
either of us finding out whether the other was worthy. I have still to be brother 
to a Prince, though I once came near to kinship with what might have been a veritable 
King and was promised the reversion of a Kingdom army, law-courts, revenue and policy 
all complete. But, to-day, I greatly fear that my King is dead, and if I want a crown 
I must go and hunt it for myself.
TEXT;

print_r(count_word_occurrences($text));

 

Output:

 

Array
(
    [The] => 1
    [Law] => 1
    [as] => 1
    [quoted] => 1
    [lays] => 1
    [down] => 1
    [a] => 6
    [fair] => 1
    [conduct] => 1
    [of] => 3
    [life] => 1
    [and] => 6
    [one] => 1
    [not] => 1
    [easy] => 1
    [to] => 6
    [follow] => 1
    [I] => 6
    [have] => 3
    [been] => 2
    [fellow] => 1
    [beggar] => 1
    [again] => 2
    [under] => 1
    [circumstances] => 1
    [which] => 1
    [prevented] => 1
    [either] => 1
    [us] => 1
    [finding] => 1
    [out] => 1
    [whether] => 1
    [the] => 2
    [other] => 1
    [was] => 2
    [worthy] => 1
    [still] => 1
    [be] => 1
    [brother] => 1
    [Prince] => 1
    [though] => 1
    [once] => 1
    [came] => 1
    [near] => 1
    [kinship] => 1
    [with] => 1
    [what] => 1
    [might] => 1
    [veritable] => 1
    [King] => 2
    [promised] => 1
    [reversion] => 1
    [Kingdom] => 1
    [army] => 1
    [law-courts] => 1
    [revenue] => 1
    [policy] => 1
    [all] => 1
    [complete] => 1
    [But] => 1
    [to-day] => 1
    [greatly] => 1
    [fear] => 1
    [that] => 1
    [my] => 1
    [is] => 1
    [dead] => 1
    [if] => 1
    [want] => 1
    [crown] => 1
    [must] => 1
    [go] => 1
    [hunt] => 1
    [it] => 1
    [for] => 1
    [myself] => 1
)

 

If you only want the top occurring words in the string, you can sort the result with arsort and take just the first few elements of the array.

 

@teamatomic: There are a few problems with that method. Most notably, substr_count doesn't count only full words. So if you were searching for the word "a", it would also count every "a" that is part of another word. Even if you padded the search string with spaces, that wouldn't account for other word boundaries (like ",", ";", etc). The \b word-boundary assertion in regex does.
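
To make the difference concrete, here is a minimal sketch. The sample sentence is made up for illustration and is not from the thread; it just compares the two counting approaches on the word "a".

```php
<?php
// Hypothetical sample sentence, chosen so "a" appears both alone and inside words.
$text = "a cat sat; a bat, a hat";

// substr_count counts every occurrence of the substring, inside other words too.
$loose = substr_count($text, "a");

// \b only matches at a word boundary, so just the standalone word "a" is counted.
// preg_quote guards against regex metacharacters in the search word.
$strict = preg_match_all('~\b' . preg_quote("a", '~') . '\b~', $text);

echo "substr_count: $loose\n";         // 7
echo "word-boundary regex: $strict\n"; // 3
```

The ";" and "," next to the words don't matter here, since \b treats any non-word character as a boundary.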

Link to comment
Share on other sites

Or alternatively..

 

$words = str_word_count($text, 1);
print_r(array_count_values($words));

 

And to limit to only the top 3:

 

$words = array_count_values(str_word_count($text, 1));
arsort($words);
$words = array_slice($words, 0, 3);
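
For the list in the opening post, the same idea might look roughly like this. It is only a sketch: it assumes the list arrives as one string with a line break after each entry, and the variable names are mine.

```php
<?php
// The word list from the opening post, one entry per line (assumed input format).
$list = "something\nsay\nphpfreaks us all\nsomething\nhelp\nphp\nphp arithmetic\nphp functions";

// Split on any line break so each line becomes one array element.
$lines = preg_split('/\R/', $list, -1, PREG_SPLIT_NO_EMPTY);

foreach ($lines as $line) {
    echo $line, "\n"; // each line is available in $line, one at a time
}

// Count individual words across all lines, ignoring case.
$counts = array_count_values(str_word_count(strtolower($list), 1));
arsort($counts);

// After arsort, the first key is the most frequent word.
$top = array_key_first($counts); // array_key_first needs PHP 7.3 or later
echo "Top word: $top ({$counts[$top]})\n";
```

With that list the top word comes out as "php", which is counted 3 times. If the list lives in a file instead, file() with FILE_IGNORE_NEW_LINES should give you the same $lines array.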
