
Xpath help


Omzy


Basically I generated an XPath query that finds and outputs a <P> tag from a given page.

 

The problem I am having is that this <P> tag contains <BR> tags within its content. For example:

 

<p>

100 New Drive

<br>

New Town

<br>

Manchester

<br>

M1 AAA

</p>

 

 
$address = $xpath->evaluate("/html/body/table/tr[5]/td[4]/p");

echo $address->item(0)->nodeValue;

 

This outputs:

 

100 New DriveNew TownManchesterM1 AAA

 

Ideally I want the contents of the <P> tag split into an array at each <BR> tag, so that I can store each element of the array in its own field in the database.

 

Anybody got any suggestions on how to do this?

Link to comment
https://forums.phpfreaks.com/topic/190975-xpath-help/
Share on other sites

Given your current code, the following should do what you want, or at least point you in the right direction.

 

// Start with an empty array so $addr is defined even if no text nodes match
$addr = array();
$texts = $xpath->query('text()', $address->item(0));
foreach ($texts as $text) {
    $addr[] = trim($text->wholeText);
}

print_r($addr);

 

The code should be pretty self-explanatory: it asks for the text nodes that are direct children of the paragraph and appends each one, trimmed, to the $addr array for later use. The output, if all goes to plan, should be:

 

Array
(
    [0] => 100 New Drive
    [1] => New Town
    [2] => Manchester
    [3] => M1 AAA
)
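
For anyone who wants to try this outside the original page, here is a minimal self-contained sketch. It assumes a cut-down sample document and a simplified `//p` query in place of the full path from the question, and it uses `nodeValue` rather than `wholeText`, which should give the same result for this markup. It also skips whitespace-only text nodes, which real pages often contain.

```php
<?php
// Sample markup based on the question; the real page layout will differ.
$html = '<html><body><p>100 New Drive<br>New Town<br>Manchester<br>M1 AAA</p></body></html>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Simplified query for this sample (assumption); the question uses
// /html/body/table/tr[5]/td[4]/p on the real page.
$address = $xpath->query('//p');

$addr = array();
// 'text()' is evaluated relative to the paragraph, so only its direct
// text-node children are returned; the <br> elements are skipped.
foreach ($xpath->query('text()', $address->item(0)) as $text) {
    $line = trim($text->nodeValue);
    if ($line !== '') {
        $addr[] = $line;
    }
}

print_r($addr);
```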


salathe,

 

Perhaps you can help me out with my final xpath query -

 

I've created a scraping script which fetches all links on a page:

 

$dom = new DOMDocument();

@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

$links = $xpath->query("//a[@class='listinglink']");

$i=0;

foreach($links as $item)
{
    $href = $links->item($i);
    $url = $href->getAttribute('href');
    echo '<a href="'.$url.'">'.$url.'</a><br/>';
    $i++;
}

 

I now need to extend this further: it needs to follow each link and perform the XPath query from my original post on the page it points to. Do you have any idea how I can do this?


Your original post was accessing a paragraph's text, your latest a series of anchors. Without more details, help will only be guess-work as the two do not appear to correlate.

 

Give a sample of the HTML that you're accessing and what you want to do with it more precisely.

 

P.S. You're using the foreach loop in a strange way. It could be changed to foreach($links as $href), which removes the need for the first and last lines within the loop (and for the $i counter altogether).
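
A rough sketch of the loop with that change applied, run against some made-up markup. The sample HTML and the `$urls` array are only there for illustration and are not part of the original script.

```php
<?php
// Hypothetical sample markup standing in for the real listing page.
$html = '<html><body>'
      . '<a class="listinglink" href="/listing/1">One</a>'
      . '<a class="other" href="/about">About</a>'
      . '<a class="listinglink" href="/listing/2">Two</a>'
      . '</body></html>';

// Collect libxml parse warnings internally instead of using the @ operator.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$links = $xpath->query("//a[@class='listinglink']");

$urls = array(); // illustration only: keeps the hrefs for later use
foreach ($links as $href) {
    // $href is already the DOMElement, so no counter or item() call is needed.
    $url = $href->getAttribute('href');
    $urls[] = $url;
    echo '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($url) . '</a><br/>' . "\n";
}
```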


Hi Salathe,

 

Basically the most recent code I posted is meant to grab all links (with a class value of 'listinglink') from a given page.

 

I now want to run the code from my original post upon each of those links. So basically it's going to go into each of those links, find the required <P> tag and output its data underneath the link.

 

So a sample output would be:

 

Link 1

P tag content

 

Link 2

P tag content

 

...and so on.

 

I tried making another cURL request within the foreach loop but it doesn't seem to work.

 

P.S. thanks for the helpful tip on the foreach loop!
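
One way the two steps might be chained, as a rough sketch only. The helper names (`xpathFor`, `addressLines`, `scrapeListings`), the `$fetch` callable, the example.com URLs and the `//p` query are all assumptions made for illustration. On a real site `$fetch` would wrap whatever HTTP call is already in use (a cURL request, for example), relative hrefs would need resolving against the listing page's URL first, and the query would be the full path from the first post.

```php
<?php
libxml_use_internal_errors(true);

// Hypothetical helper: parse an HTML string and return a DOMXPath for it.
function xpathFor($html) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    return new DOMXPath($dom);
}

// Hypothetical helper: split the first node matched by $query on its <br> tags.
function addressLines(DOMXPath $xpath, $query) {
    $lines = array();
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return $lines; // nothing matched on this page
    }
    foreach ($xpath->query('text()', $nodes->item(0)) as $text) {
        $line = trim($text->nodeValue);
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return $lines;
}

// Hypothetical helper: $fetch is any callable that takes a URL and returns
// that page's HTML, or false on failure.
function scrapeListings($listingHtml, $fetch, $addressQuery) {
    $results = array();
    $listing = xpathFor($listingHtml);
    foreach ($listing->query("//a[@class='listinglink']") as $href) {
        $url  = $href->getAttribute('href');
        $page = call_user_func($fetch, $url);
        $results[$url] = ($page === false || $page === '')
            ? array()
            : addressLines(xpathFor($page), $addressQuery);
    }
    return $results;
}

// In-memory stand-ins for the real pages, so the sketch runs without a network.
$pages = array(
    'http://example.com/1' => '<html><body><p>100 New Drive<br>New Town<br>Manchester<br>M1 AAA</p></body></html>',
    'http://example.com/2' => '<html><body><p>2 Old Road<br>Leeds<br>LS1 BBB</p></body></html>',
);
$fetch = function ($url) use ($pages) {
    return isset($pages[$url]) ? $pages[$url] : false;
};
$listingHtml = '<html><body>'
    . '<a class="listinglink" href="http://example.com/1">One</a>'
    . '<a class="listinglink" href="http://example.com/2">Two</a>'
    . '</body></html>';

$results = scrapeListings($listingHtml, $fetch, '//p');

// Print each link followed by its address lines.
foreach ($results as $url => $lines) {
    echo $url, "\n", implode("\n", $lines), "\n\n";
}
```

Each linked page gets its own DOMDocument and DOMXPath here, which may be the piece that is missing when the second request is simply dropped into the existing loop: the original `$xpath` object only knows about the listing page.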


Archived

This topic is now archived and is closed to further replies.
