
Scraping links/content


Omzy


I've written a scraping script that fetches all the links on a page:

 

 
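// Assumes $html already holds the source of the page to scrape.
$dom = new DOMDocument();
@$dom->loadHTML($html); // @ suppresses warnings caused by malformed markup
$xpath = new DOMXPath($dom);
$links = $xpath->query('/html/body//a');
foreach ($links as $link) {
    $url = $link->getAttribute('href');
    // Escape the URL before echoing it back into HTML output.
    echo '<br />Link: ' . htmlspecialchars($url);
}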

 

I now need to extend this further: the script should follow each link and extract, for example, the H1 tag from the linked page. Can someone provide a sample solution?
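One possible way to approach this is sketched below. It is a minimal, unoptimised outline, not a finished scraper. The function names (`loadDom`, `extractLinks`, `extractFirstH1`, `scrapeH1s`) are hypothetical helpers made up for illustration. It assumes `allow_url_fopen` is enabled so that `file_get_contents()` can fetch URLs, that only absolute http(s) links should be followed (relative links are skipped rather than resolved), and that only the first H1 on each page is wanted.

```php
<?php
// Hypothetical helpers for illustration; none of these names come from the original script.

// Parse an HTML string into a DOMDocument without emitting warnings for bad markup.
function loadDom(string $html): DOMDocument
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $dom;
}

// Return the href value of every <a> element that has one.
function extractLinks(string $html): array
{
    if (trim($html) === '') {
        return [];
    }
    $xpath = new DOMXPath(loadDom($html));
    $urls = [];
    foreach ($xpath->query('//a[@href]') as $a) {
        $urls[] = $a->getAttribute('href');
    }
    return $urls;
}

// Return the trimmed text of the first <h1>, or null if the page has none.
function extractFirstH1(string $html): ?string
{
    if (trim($html) === '') {
        return null;
    }
    $xpath = new DOMXPath(loadDom($html));
    $nodes = $xpath->query('//h1');
    return $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
}

// Follow each absolute link found in $html and collect url => first H1 (or null).
function scrapeH1s(string $html): array
{
    $result = [];
    foreach (extractLinks($html) as $url) {
        // Assumption: relative links are skipped instead of being resolved.
        if (!preg_match('#^https?://#i', $url)) {
            continue;
        }
        // Assumption: allow_url_fopen is on; a failed fetch is recorded as null.
        $page = @file_get_contents($url);
        $result[$url] = ($page === false) ? null : extractFirstH1($page);
    }
    return $result;
}
```

Calling `scrapeH1s($html)` with the same `$html` used in the original script should give an array keyed by URL, which can then be looped over and echoed. For a large number of links, cURL with timeouts would probably be a more robust choice than `file_get_contents()`, since each request here blocks until it finishes.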

