Scraping links/content


Omzy

I've written a scraping script which fetches every link with the class "listinglink" on a page:

 

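$dom = new DOMDocument();
// The @ suppresses warnings caused by malformed HTML.
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every <a> whose class attribute is exactly "listinglink".
$links = $xpath->query("//a[@class='listinglink']");

foreach ($links as $item) {
    // $item is already the current <a> element, so no manual index is needed.
    $url = $item->getAttribute('href');
    echo '<a href="'.$url.'">'.$url.'</a><br/>';
}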

 

I now need to extend this further: the script needs to follow each link and get the content of a particular <p> tag on the linked page. So, for example, the page output should be as follows:

 

Link 1

P tag content

 

Link 2

P tag content

 

...and so on. The XPath of the <p> element I need is "/html/body/form/center/table/tbody/tr[5]/td[4]/table/tbody/tr[3]/td/p"

 

Can anyone assist me with this?
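
One possible approach, sketched under some assumptions: load each linked page into its own DOMDocument and run the second XPath query against it. The helper names loadXPath() and scrapeListings(), and the optional $fetch callable, are hypothetical and not part of the original script. The sketch also assumes the href values are absolute URLs that file_get_contents() can open. Note that XPath expressions copied from a browser often contain tbody steps that the browser added itself; if the raw HTML has no <tbody> tags, the query may match nothing until those steps are removed.

```php
<?php
// Sketch only: helper names and the injectable $fetch callable are assumptions.

/** Parse an HTML string into a DOMXPath, tolerating malformed markup. */
function loadXPath(string $html): ?DOMXPath
{
    if (trim($html) === '') {
        return null; // nothing to parse
    }
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return new DOMXPath($dom);
}

/**
 * For each <a class="listinglink"> in $listingHtml, fetch the target page and
 * return [url => text of the first node matching $pQuery, or null if missing].
 */
function scrapeListings(string $listingHtml, string $pQuery, ?callable $fetch = null): array
{
    // Default fetcher: file_get_contents(), which returns false on failure.
    $fetch = $fetch ?? function (string $url) {
        return @file_get_contents($url);
    };

    $results = [];
    $xpath = loadXPath($listingHtml);
    if ($xpath === null) {
        return $results;
    }

    foreach ($xpath->query("//a[@class='listinglink']") as $link) {
        $url = $link->getAttribute('href');
        $page = $fetch($url);
        $pageXPath = is_string($page) ? loadXPath($page) : null;
        $nodes = $pageXPath ? $pageXPath->query($pQuery) : false;
        $results[$url] = ($nodes && $nodes->length > 0)
            ? trim($nodes->item(0)->textContent)
            : null;
    }
    return $results;
}

// Usage sketch (needs network access; $html is the listing page from above):
// $query = '/html/body/form/center/table/tbody/tr[5]/td[4]/table/tbody/tr[3]/td/p';
// foreach (scrapeListings($html, $query) as $url => $text) {
//     echo '<a href="'.htmlspecialchars($url).'">'.htmlspecialchars($url).'</a><br/>';
//     echo '<p>'.htmlspecialchars((string) $text).'</p>';
// }
```

Passing the fetcher in as a callable keeps the parsing logic testable without network access; in normal use the third argument can simply be omitted.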

