BladeMetal

Pulling source code from website

Hey, I've got a problem. If anyone can help me I'll be most appreciative.
This is my code:
[code]
<?php

//initial feed url
$website = strtolower(file_get_contents("http://www.mywebsite.com"));

//set haystack as website variable
$haystack = $website;

//set needle to link identifier
$needle = "some text to find";

//determine how many times the needle is in the haystack
$foundtimes = substr_count($haystack,$needle);

//loop and save data to linkdb array
for ($i=0;$i<$foundtimes;$i++) {
    $pos[$i] = strpos($haystack,$needle,$i*350+4400);

    $linkdb[$i] = substr($website,$pos[$i],30);

    if (substr_count($linkdb[$i],'"')) {
        $linkdb[$i] = str_replace('"',' ',$linkdb[$i]);
    }

    //print out the farmed links
    echo $i." ".$pos[$i]." ".htmlentities($linkdb[$i])."<br>";
}

?>
[/code]
My issue is in the for loop. The $needle string occurs 18 times in the page I'm pulling the contents from, so $foundtimes = 18.
Is there a way to find each occurrence of $needle in $haystack such that, once you find an occurrence, you log its position and, on the next pass through the loop, start searching for the same string from the position of the previous match + 1?

The above code works to an extent. The $i*350+4400 offset in the strpos call causes some of the text to be found more than once, which is not ideal of course.

I've tried:
[code]
$pos[0] = 0;
for ($i=1;$i<=$foundtimes;$i++) {
    $pos[$i] = strpos($haystack,$needle,$pos[$i-1]);

    $linkdb[$i] = substr($website,$pos[$i],30);

    if (substr_count($linkdb[$i],'"')) {
        $linkdb[$i] = str_replace('"',' ',$linkdb[$i]);
    }

    //print out the farmed links
    echo $i." ".$pos[$i]." ".htmlentities($linkdb[$i])."<br>";
}
[/code]
And it didn't work. All it returned was the same text from the same position.

If anyone has any ideas, please let me know.
Thanks in advance.

You might be getting hung up by passing the exact position of the previous occurrence as the offset. strpos() starts searching at the offset itself, so it finds the same match again. Try adding one to the offset: $pos[$i-1]+1
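
For what it's worth, here is a minimal sketch of that idea as a complete loop. The sample $haystack and $needle are made-up stand-ins for the fetched page and the link identifier, and the $offset and $found variables are just names picked for illustration. It starts the first search at offset 0, so a match at the very start of the string is not skipped, and it stops when strpos() returns false rather than relying on substr_count().

```php
<?php
// Hypothetical stand-ins for the downloaded page and the search string.
$haystack = strtolower('<a href="x1">one</a> <a href="x2">two</a> <a href="x3">three</a>');
$needle   = 'href=';

$pos    = [];   // position of each match
$linkdb = [];   // text captured at each match
$offset = 0;    // where the next search starts

// strpos() returns false once there are no more matches,
// so the loop ends on its own without a precomputed count.
while (($found = strpos($haystack, $needle, $offset)) !== false) {
    $pos[] = $found;

    // Take up to 30 characters from the match and replace double quotes with spaces.
    $linkdb[] = str_replace('"', ' ', substr($haystack, $found, 30));

    // Resume one character past this match so it is not found again.
    $offset = $found + 1;
}

// Print out the collected matches.
foreach ($pos as $i => $p) {
    echo $i . ' ' . $p . ' ' . htmlentities($linkdb[$i]) . "<br>\n";
}
```

If overlapping matches should not count (which is how substr_count() counts them), advancing by $found + strlen($needle) instead of $found + 1 should give the same number of hits as $foundtimes.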
