torvald_helmer Posted April 10, 2007

I want to count the number of documents that contain a specified term (word). This is known as document frequency (DF).

I have an array that contains a list of txt files. I loop over these files, reading one at a time, and inside that loop I go through each term (word) in the file. Inside these two loops I count how many times each term occurs in the file and remove repeated occurrences, so my result is an array of the distinct terms in the file (one occurrence of each term) and an array with the number of times each term occurs in that file.

I also want to know in how many of the files each of these terms occurs. Can I do this inside the loop somehow? Or is there another smart way to solve this?
Barand Posted April 10, 2007

This may be useful: http://www.php.net/substr_count
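For context, a small sketch of what substr_count does, and one possible reason it may not fit whole-word counting: it counts raw substring matches, so a short term is also found inside longer words. The sample text and variable names below are made up for illustration.

```php
<?php
$text = 'the cat sat on the mat';

// substr_count counts raw substring matches, including matches inside longer words.
$substringHits = substr_count($text, 'at');   // found inside "cat", "sat" and "mat"

// Counting whole words instead: split the text into words, then tally each distinct word.
$wordCounts = array_count_values(str_word_count($text, 1));
$wholeWordHits = $wordCounts['at'] ?? 0;       // "at" never appears as a word of its own

echo $substringHits, "\n"; // prints 3
echo $wholeWordHits, "\n"; // prints 0
```

The `??` operator assumes PHP 7 or later; on older versions an isset() check would do the same job.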
torvald_helmer Posted April 10, 2007

It didn't work quite the way I wanted. I have tried this, but it doesn't seem to work either:

    foreach ($terms as $term) {        /* array of terms from one file */
        foreach ($files as $file) {    /* array of all files */
            if (in_array($term, $file)) {
                $DF = 'increase variable by one for each time';
            }
        }
    }
Barand Posted April 10, 2007

Does the $files array contain filenames or the contents of the files?
torvald_helmer Posted April 10, 2007

$files is the list of files; I use it to make sure I go through all the files. $terms contains all the words from one file.
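A minimal sketch of one way to get both counts in the same loop, not a definitive answer to the thread. It assumes PHP 7 or later, that each file's text has already been loaded into an array keyed by file name, and that a "term" is a lowercase run of word characters. The names tokenize(), countFrequencies() and the sample documents are hypothetical. The idea is that each file's distinct terms add exactly one to that term's document frequency, so in_array() against a file name is not needed.

```php
<?php
// Hypothetical helper: split text into lowercase terms (assumption: terms are runs of word characters).
function tokenize(string $text): array
{
    return preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: returns [term frequency per document, document frequency per term].
// $documents maps a document name to its text (assumption: contents are already loaded).
function countFrequencies(array $documents): array
{
    $tf = []; // $tf[$name][$term] = times $term occurs in that document
    $df = []; // $df[$term] = number of documents that contain $term

    foreach ($documents as $name => $text) {
        // array_count_values lists each distinct term once, with its count.
        $tf[$name] = array_count_values(tokenize($text));

        // Every distinct term in this document adds exactly one to its DF.
        foreach (array_keys($tf[$name]) as $term) {
            $df[$term] = ($df[$term] ?? 0) + 1;
        }
    }

    return [$tf, $df];
}

// Made-up sample data standing in for the contents of three txt files.
$documents = [
    'a.txt' => 'the cat sat on the mat',
    'b.txt' => 'the dog sat',
    'c.txt' => 'a bird',
];

[$tf, $df] = countFrequencies($documents);

// Here $tf['a.txt']['the'] is 2, $df['the'] is 2 and $df['cat'] is 1.
print_r($df);
```

With a $files array of file names like the one described above, $documents could be filled by calling file_get_contents($file) for each entry before passing it to countFrequencies().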