
follow links found on page but not a search engine


DJphp


Hello,

 

I have a PHP script that opens a web page listing daily posts. Only 25 results are returned per page; if there are more than 25, the remaining results are reached by clicking a 'Page 2' button.

 

My script:

 

//open Page 1 of daily posts

 

$urlPage1 = "http://www.domain.com/forum/search.php?do=getdaily";

$dataPage1= file_get_contents($urlPage1);

 

 

// pattern to match for the "Goto Page 2" link, and match it

 

$patternPage2 = "/<a[^>]+href=\"(search\.php\?[^\"]+).*26/i";

preg_match_all($patternPage2, $dataPage1, $matchesPage2);

 

 

// capture all instances of the match and create a URL to follow, using one element from the array

 

$countarrayPage2 = count($matchesPage2[0]);

for ($i=0; $i < $countarrayPage2; $i++) {

  $urlPage2[$i] = $matchesPage2[1][$i];

}

$urlPage2['0'] = "http://www.domain.com/forum/" . $urlPage2['0'];

$urlPage2['0'] = str_replace("&amp;", "&", $urlPage2['0']);

 

// Get the contents of the Page 2

 

$dataPage2 = file_get_contents($urlPage2['0']);

 

 

// concatenate the two results (Page 1 and Page 2)

 

$data = $dataPage1;

$data .= $dataPage2;

 

 

 

The problem is that the script does not seem to follow the second link ($urlPage2['0']), so $dataPage2 comes back empty and the two pages cannot be concatenated.

 

Is there a way to do this?

 

Any help is appreciated.

 

Cheers,

DJphp
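
For reference, here is a minimal sketch of one way the "find the Page 2 link, then fetch it" step could be written. It assumes the Page 2 link carries a `page=2` parameter in its query string, which is a guess about the forum's markup and not something confirmed above. The helper names `extractNextPageUrl` and `fetchFirstTwoPages` are hypothetical. The idea is to capture only the href value, decode HTML entities such as `&amp;` before requesting the URL, and check each step for failure so that an empty or missing link is noticed.

```php
<?php
// Hypothetical helper: pull the "Page 2" link out of the first page's HTML.
// ASSUMPTION: the link's query string contains page=2 (not confirmed for this forum).
function extractNextPageUrl(string $html, string $baseUrl): ?string
{
    // Capture only the href value; (?!\d) avoids matching page=20, page=21, ...
    $pattern = '/<a[^>]+href="(search\.php\?[^"]*\bpage=2(?!\d)[^"]*)"/i';
    if (!preg_match($pattern, $html, $m)) {
        return null; // no Page 2 link found
    }
    // hrefs in HTML source contain &amp;, which must become & in the request URL.
    return $baseUrl . html_entity_decode($m[1]);
}

// Hypothetical helper: fetch page 1, follow the Page 2 link if present,
// and return the concatenated HTML.
function fetchFirstTwoPages(string $baseUrl, string $firstPath): string
{
    $page1 = file_get_contents($baseUrl . $firstPath);
    if ($page1 === false) {
        return ''; // first request failed
    }
    $nextUrl = extractNextPageUrl($page1, $baseUrl);
    if ($nextUrl === null) {
        return $page1; // no Page 2 link found, so return page 1 alone
    }
    $page2 = file_get_contents($nextUrl);
    return $page2 === false ? $page1 : $page1 . $page2;
}
```

It could be called as `$data = fetchFirstTwoPages("http://www.domain.com/forum/", "search.php?do=getdaily");`. If the forum ties search results to a session cookie, note that `file_get_contents` does not send cookies by default, so that would be worth checking too.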

 

 

