
Comparing arrays containing strings -> stopword removal -> code not working!



underwinefx

Posted 20 September 2006 - 02:08 PM

I need a stopword removal implementation, and I am stumped by a few issues.

For example:

stopword.txt
is
the
that
this

<?php
// The stopwords will eventually be loaded from 'stopword.txt', which holds one word per line (ENTER pressed after every word).
// I assume loading that file gives the same array as the line below... am I wrong?
// This line would NOT be in the final code, since I will be using the file.
$stopword = array('is', 'the', 'that', 'this');
// Text containing stopwords that need to be removed.
$text = "this is a quote to say that anything is possible";
// Use explode to split the text into separate array elements, with a space as the delimiter.
$keyword = explode(" ", $text);
// Keep only the words that are not in the stopword list.
$keyword = array_diff($keyword, $stopword);
?>

The above code works...

<?php
$myFile = "stopword.txt"; // the file containing the stopwords
// Open the stopword.txt file for reading.
$fh = fopen($myFile, 'r');

// Read the whole stopword.txt file into a string.
$theData = fread($fh, filesize($myFile));

// To get the data from the text file as separate array elements using explode:
//$stopword=explode("\n",$theData);
$keyword = explode(" ", $text);
$keyword = array_diff($keyword, $stopword);
?>

The above code does NOT work. Why?
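
Going only by what is posted, two things stand out in the second snippet: the line that fills $stopword is commented out, and $text is never set, so array_diff() has nothing useful to compare. Even with that line restored, explode("\n", ...) can leave a trailing "\r" on every word if the file was saved with Windows line endings, plus an empty last element, and "is\r" never equals "is". A minimal sketch of a loader that sidesteps both, assuming the file holds one word per line. The name load_stopwords is hypothetical, and the temporary file only stands in for stopword.txt so the example is self-contained:

```php
<?php
// Hypothetical helper: read one stopword per line into an array.
function load_stopwords($path)
{
    // file() returns the lines as an array; the flags drop the line
    // ending of each line and skip empty lines.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return array(); // file missing or unreadable
    }
    // trim() removes any leftover "\r" and stray spaces around a word.
    $words = array_map('trim', $lines);
    // Drop entries that were only whitespace and reindex from 0.
    return array_values(array_filter($words, 'strlen'));
}

// Demo: a temporary file with Windows line endings stands in for stopword.txt.
$path = tempnam(sys_get_temp_dir(), 'stop');
file_put_contents($path, "is\r\nthe\r\nthat\r\nthis\r\n");
$stopword = load_stopwords($path);
unlink($path);
print_r($stopword);
```

With the list loaded this way, the explode() and array_diff() lines from the first snippet can be used unchanged.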

I would also like to know what else can be done. Can I compare each element of one array with the elements of the other and unset/remove the matches?

Should I use a loop? I am a beginner and at a loss as to how to implement this.
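
A loop does work, although array_diff() already performs the same comparison in one call. A minimal sketch of the explicit version, with the stopword list hard-coded as in the first snippet:

```php
<?php
$stopword = array('is', 'the', 'that', 'this');
$keyword  = explode(' ', 'this is a quote to say that anything is possible');

// Check every keyword against the stopword list and unset the matches.
foreach ($keyword as $index => $word) {
    if (in_array($word, $stopword, true)) {
        unset($keyword[$index]);
    }
}

// unset() leaves gaps in the numeric keys, so reindex from 0.
$keyword = array_values($keyword);
print_r($keyword);
```

The comparison is case-sensitive, so "This" would not match "this" unless both sides are lowercased first, for example with strtolower().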

In brief:
1. load the stopword list from stopword.txt into stopword array
2. separate the string into keywords by using space as delimiter.
3. compare each word of stopword array with each element in the keyword array
4. remove the elements from keyword array which are present in stopword array
5. return results of new keyword array without the stopwords for further coding
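
The five steps above can be sketched as a single function. This is one possible arrangement, not the only one: remove_stopwords is a hypothetical name, the temporary file stands in for stopword.txt, and splitting on any whitespace with preg_split() is an assumption that goes slightly beyond the single-space delimiter described in step 2.

```php
<?php
// Hypothetical helper covering steps 1-5.
function remove_stopwords($text, $stopwordFile)
{
    // 1. Load the stopword list, one word per line.
    $stopword = file($stopwordFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($stopword === false) {
        $stopword = array(); // unreadable file: treat as an empty list
    }
    $stopword = array_map('trim', $stopword); // strip any stray "\r" or spaces

    // 2. Split the text into keywords; PREG_SPLIT_NO_EMPTY drops the
    //    empty pieces that repeated spaces would otherwise produce.
    $keyword = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // 3 + 4. array_diff() keeps the keywords that do not appear in
    //        the stopword array.
    $keyword = array_diff($keyword, $stopword);

    // 5. Return the remaining keywords, reindexed from 0.
    return array_values($keyword);
}

// Demo with a temporary file standing in for stopword.txt.
$file = tempnam(sys_get_temp_dir(), 'stop');
file_put_contents($file, "is\nthe\nthat\nthis\n");
$result = remove_stopwords('this is a quote to say that anything is possible', $file);
unlink($file);
print_r($result);
```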


How can I implement this?

Regards,
Underwinefx



