
search script pos muddle



#1 arfa


Posted 26 July 2006 - 08:38 PM

I am building a wee search script but am stuck trying to cover multiple occurrences, and I get tangled in a myriad of if() statements.
I am searching standard HTML and text files, so no SQL.

Scenario:
$simple_string = 'I am stuck trying to cover multiple occurrences of keywords in proximity';
$querry = 'stuck multiple';

open file(s)
strip tags
foreach (explode(' ', $querry) as $word) {        // test each query word in turn
    if (strpos($simple_string, $word) !== false) {
        $pos[] = strpos($simple_string, $word); } }

In the example above two values are returned for $pos
I take a window from -$start to +$end around each $pos to return a snippet, so we end up with two strings, e.g.:
> am stuck trying
> cover multiple occurrences

My ideal in the above example would be to return just one 'merged' string:
> am stuck trying to cover multiple occurrences

######
How can one $pos work out its position relative to another $pos?
######
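
One way the comparison could work, as a rough sketch only: collect every position, sort them, and extend the previous window whenever the next window overlaps it. The helper names all_positions() and merged_windows() and the window size $long = 15 are made up for illustration and are not part of the script below. Windows are cut at character offsets, so a snippet can end mid-word.

```php
<?php
// Hypothetical helper: every position of $word in $text, not just the first.
function all_positions($text, $word) {
    $found = array();
    if ($word === '') { return $found; }      // an empty needle is not searchable
    $offset = 0;
    while (($p = strpos($text, $word, $offset)) !== false) {
        $found[] = $p;
        $offset = $p + 1;                     // carry on just past this hit
    }
    return $found;
}

// Hypothetical helper: turn hit positions into merged array(start, end) windows.
function merged_windows($positions, $long, $full_len) {
    sort($positions);                         // compare neighbours in text order
    $windows = array();
    foreach ($positions as $p) {
        $start = max(0, $p - $long);          // clamp to the start of the text
        $end   = min($full_len, $p + $long);  // clamp to the end of the text
        $last  = count($windows) - 1;
        if ($last >= 0 && $start <= $windows[$last][1]) {
            // this window overlaps the previous one: extend it instead of adding
            $windows[$last][1] = max($windows[$last][1], $end);
        } else {
            $windows[] = array($start, $end);
        }
    }
    return $windows;
}

$simple_string = 'I am stuck trying to cover multiple occurrences of keywords in proximity';
$querry = 'stuck multiple';
$long = 15;                                   // assumed context size in characters

$pos = array();
foreach (explode(' ', $querry) as $word) {
    $pos = array_merge($pos, all_positions($simple_string, $word));
}

$snippets = array();
foreach (merged_windows($pos, $long, strlen($simple_string)) as $w) {
    $snippets[] = substr($simple_string, $w[0], $w[1] - $w[0]);
}
```

With these values the two windows overlap, so $snippets holds a single string containing both 'stuck' and 'multiple'. A smaller $long would leave them as two separate snippets.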

Here is my script start so far:
$page_dir = './files_here';
$str_pop = explode(' ', strtolower($querry)); // query words (assumed: split on spaces)
$long = 20;                                   // context chars either side of a hit (assumed value)
$file_array = array();
$dh = opendir($page_dir); // just one DIR for now
while (($files = readdir($dh)) !== false) {
    if (preg_match('/.\.htm$/', $files)) { // test htm files (preg_match in place of ereg)
        $file_array[] = $files;            // all files as array
    }
}
closedir($dh);
$one_bit = ''; $hits = 0;
foreach ($file_array as $incl) {
    $str = file_get_contents("$page_dir/$incl");
    $linker = str_replace('.htm', '', $incl); // for result reporting
    $tagless = strtolower(strip_tags($str));
    $full_len = strlen($tagless);             // total string length
    foreach ($str_pop as $word) {             // loop each word of string
        $p = strpos($tagless, $word);         // where is it (first occurrence only)
        if ($p === false) { continue; }       // word not in this file
        $hits++;
        $pos[] = $p;
        $x = max(0, $p - $long);              // snippet start, clamped to 0
        $y = min($full_len, $p + $long);      // snippet end, clamped to string length
        $one_bit .= substr($tagless, $x, $y - $x) . "<BR>"; // got one hit = one string bit
    }
}

I tried building an array of $pos and then comparing the proximity of the results.
if else if or and...
I have tried several other approaches to get tidy result strings, but it gets very convoluted and fuzzy in my head.

Or,
maybe there is another approach to this?

The site is relatively simple, so I am not particularly concerned with weighting, although suggestions on how to track this would also be welcome.
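
For the weighting, one simple possibility (a sketch under assumptions, not a tested design) is to count how often each query word occurs in a page and sort the pages by that total. The helper score_text() and the sample $pages array are invented for illustration. In the real script the text would come from $tagless for each file.

```php
<?php
// Hypothetical helper: score a page by how often the query words occur in it.
function score_text($text, $words) {
    $score = 0;
    foreach ($words as $word) {
        if ($word !== '') {                       // skip empty words from double spaces
            $score += substr_count($text, $word); // non-overlapping occurrences
        }
    }
    return $score;
}

// Made-up sample data standing in for the tag-stripped, lowercased file contents.
$pages = array(
    'about' => 'stuck on a search script',
    'help'  => 'multiple hits: stuck, stuck and stuck again',
    'home'  => 'nothing relevant here',
);

$words = explode(' ', 'stuck multiple');
$scores = array();
foreach ($pages as $name => $tagless) {
    $scores[$name] = score_text($tagless, $words);
}
arsort($scores);                                  // highest score first, keys preserved
```

After arsort() the keys of $scores are in ranking order, so the results can be listed by looping over them. Pages scoring 0 could simply be skipped.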

thanks - arfa



