Pulling source code from website


2 replies to this topic

#1 BladeMetal

BladeMetal
  • Members
  • PipPip
  • Member
  • 13 posts

Posted 13 April 2006 - 07:35 AM

Hey, I've got a problem. If anyone can help me I'll be most appreciative.
This is my code:
<?php

//initial feed url
$website = strtolower(file_get_contents("http://www.mywebsite.com"));

//set haystack as website variable
$haystack = $website;

//set needle to link identifier
$needle = "some text to find";

//determine how many times the needle is in the haystack
$foundtimes = substr_count($haystack,$needle);

//loop and save data to linkdb array
for ($i=0;$i<$foundtimes;$i++) {
       $pos[$i] = strpos($haystack,$needle,$i*350+4400);

       $linkdb[$i] = substr($website,$pos[$i],30);

    if (substr_count($linkdb[$i],'"')) {

        $linkdb[$i] = str_replace('"',' ',$linkdb[$i]);

    }

    //print out the farmed links
       echo $i." ".$pos[$i]." ".htmlentities($linkdb[$i])."<br>";

}

?>
My issue is in the for loop. The $needle string occurs 18 times in the website I'm pulling the contents from, so $foundtimes = 18.
Is there a way to find each occurrence of $needle in $haystack such that, once you find an occurrence, you log its position and, on the next pass through the loop, start searching for the same string from the position of the previous match + 1?

The above code works to an extent. The $i*350+4400 offset in the strpos call causes some of the text to be found more than once, which is not ideal of course.

I've tried:
$pos[0] = 0;
for ($i=1;$i<=$foundtimes;$i++) {
       $pos[$i] = strpos($haystack,$needle,$pos[$i-1]);

       $linkdb[$i] = substr($website,$pos[$i],30);

    if (substr_count($linkdb[$i],'"')) {

        $linkdb[$i] = str_replace('"',' ',$linkdb[$i]);

    }

    //print out the farmed links
       echo $i." ".$pos[$i]." ".htmlentities($linkdb[$i])."<br>";

}
But it didn't work: every pass through the loop returned the same text from the same position.

If anyone has any ideas, please let me know.
Thanks in advance.

#2 michaellunsford

michaellunsford
  • Members
  • PipPipPip
  • Advanced Member
  • 1,023 posts
  • LocationLouisiana, USA

Posted 13 April 2006 - 02:39 PM

You might be getting hung up by giving it the exact position of the previous occurrence as the offset. strpos() starts searching at the offset itself, so it finds the same match again. Try adding one to the offset: $pos[$i-1]+1
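
To make the offset idea concrete, here is a rough sketch rather than a drop-in fix. The findAllPositions() helper and the sample string are made up for illustration, and the type declarations assume PHP 7 or later. It keeps calling strpos() from one character past the last match until nothing more is found, so it does not need substr_count() to know how many times to loop:

```php
<?php
// Hypothetical helper (not from the original post): returns the position
// of every occurrence of $needle in $haystack, in ascending order.
function findAllPositions(string $haystack, string $needle): array
{
    $positions = [];

    // Guard: an empty needle would either error or match everywhere,
    // depending on the PHP version, so return nothing for it.
    if ($needle === '') {
        return $positions;
    }

    $offset = 0;
    // strpos() returns false when no further match exists. Compare with
    // !== false, because a match at position 0 is also "falsy".
    while (($pos = strpos($haystack, $needle, $offset)) !== false) {
        $positions[] = $pos;
        // Resume one character past the start of this match, so the
        // same occurrence is never returned twice.
        $offset = $pos + 1;
    }

    return $positions;
}

// Made-up sample text standing in for the downloaded page.
$haystack = 'abc link abc link abc';
$needle   = 'link';

foreach (findAllPositions($haystack, $needle) as $i => $pos) {
    // Same output format as the original loop: index, position, 30 chars.
    echo $i . ' ' . $pos . ' ' . htmlentities(substr($haystack, $pos, 30)) . "<br>\n";
}
```

One thing to watch: stepping by +1 also picks up overlapping matches, which substr_count() does not count. If that matters for your needle, step by strlen($needle) instead of 1.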

#3 BladeMetal

Posted 13 April 2006 - 09:10 PM

Thanks heaps. I can't believe I didn't do that before. Thanks again, it's much appreciated.



