Omzy Posted February 4, 2010

I've written an XPath query that finds and outputs a <P> tag from a given page. The problem is that this <P> tag contains <BR> tags within its content. For example:

    <p>
    100 New Drive <br>
    New Town <br>
    Manchester <br>
    M1 AAA
    </p>

    $address = $xpath->evaluate("/html/body/table/tr[5]/td[4]/p");
    echo $address->item(0)->nodeValue;

This outputs:

    100 New DriveNew TownManchesterM1 AAA

Ideally I want the contents of the <P> tag split into an array at each <BR> tag, so that I can put each element of the array into its own field in the database. Has anybody got any suggestions on how to do this?

salathe Posted February 4, 2010

Given your current code, the following should do what you want, or at least point you in the right direction.

    $addr = array();
    $texts = $xpath->query('text()', $address->item(0));
    foreach ($texts as $text) {
        $addr[] = trim($text->wholeText);
    }
    print_r($addr);

The code should be fairly self-explanatory: it asks for the text nodes that are direct children of the paragraph (the <BR> elements sit between them) and pushes each one, trimmed, onto the $addr array for later use. The output, if all goes to plan, should be:

    Array
    (
        [0] => 100 New Drive
        [1] => New Town
        [2] => Manchester
        [3] => M1 AAA
    )
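
For anyone who wants to try the text() approach in isolation, here is a minimal self-contained sketch. It assumes the sample paragraph from the first post sits directly under <body> (the table wrapper is left out, so the path is shortened to /html/body/p), and it adds a check that skips whitespace-only text nodes, which the reply above does not do.

```php
<?php
// Sample markup from the first post; the surrounding table is omitted (assumption).
$html = '<html><body><p> 100 New Drive <br> New Town <br> Manchester <br> M1 AAA </p></body></html>';

$dom = new DOMDocument();
@$dom->loadHTML($html);                 // @ hides parser warnings on sloppy HTML
$xpath = new DOMXPath($dom);

// Shortened path for this sample only; the thread uses /html/body/table/tr[5]/td[4]/p.
$address = $xpath->query('/html/body/p');

$addr = array();
if ($address->length > 0) {
    // 'text()' is evaluated relative to the paragraph: its direct text-node children.
    foreach ($xpath->query('text()', $address->item(0)) as $text) {
        $line = trim($text->wholeText);
        if ($line !== '') {             // skip nodes that contain only whitespace
            $addr[] = $line;
        }
    }
}

print_r($addr);
```

Each element of $addr can then be mapped to its own database field.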

Omzy Posted February 4, 2010

holy sh*t! that is beautiful! thanks buddy!

Omzy Posted February 4, 2010

salathe,

Perhaps you can help me out with my final XPath query. I've created a scrape script which fetches all links on a page:

    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);
    $links = $xpath->query("//a[@class='listinglink']");
    $i = 0;
    foreach ($links as $item) {
        $href = $links->item($i);
        $url = $href->getAttribute('href');
        echo '<a href="'.$url.'">'.$url.'</a><br/>';
        $i++;
    }

I now need to extend this further: it needs to go into each link and perform the XPath query from my original post. Do you have any idea how I can do this?

salathe Posted February 5, 2010

Your original post was accessing a paragraph's text; your latest accesses a series of anchors. Without more details, any help will only be guesswork, as the two do not appear to correlate. Give a sample of the HTML that you're accessing and say more precisely what you want to do with it.

P.S. You're using the foreach loop in a strange way. It could be changed to foreach($links as $href), saving the need for the first and last lines within the loop (and the $i counter).
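
The loop simplification mentioned in the P.S. might look like the sketch below. The sample HTML and its URLs are made up for illustration; only the class name 'listinglink' comes from the thread. The htmlspecialchars() calls are an addition that was not part of the original script.

```php
<?php
// Hypothetical index page; the hrefs are illustrative only.
$html = '<html><body>'
      . '<a class="listinglink" href="/listing/1">One</a>'
      . '<a class="other" href="/about">About</a>'
      . '<a class="listinglink" href="/listing/2">Two</a>'
      . '</body></html>';

$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$urls = array();
// foreach hands over each matched element directly, so no counter or item($i) is needed.
foreach ($xpath->query("//a[@class='listinglink']") as $href) {
    $url = $href->getAttribute('href');
    $urls[] = $url;
    // Escape the URL before echoing it back into HTML (added precaution).
    echo '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($url) . "</a><br/>\n";
}
```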

Omzy Posted February 5, 2010

Hi Salathe,

The most recent code I posted is meant to grab all links (with a class value of 'listinglink') from a given page. I now want to run the code from my original post on each of those links. So it's going to go into each of those links, find the required <P> tag and output its data underneath the link. A sample output would be:

    Link 1
    P tag content

    Link 2
    P tag content

...and so on. I tried using another curl() within the foreach loop but it doesn't seem to work.

P.S. Thanks for the helpful tip on the foreach loop!
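
The thread ends without an answer to this, so what follows is only one possible sketch, not a confirmed solution. It combines the two snippets already posted: collect the listing URLs, fetch each page, and run the paragraph query on it. The function names (addressLines, listingUrls, scrapeListings) and the $fetch callable are hypothetical helpers introduced here. Passing the fetcher in as a callable keeps the parsing logic testable without network access.

```php
<?php
// Trimmed, non-empty text nodes of the first node matched by $query.
function addressLines($html, $query)
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $lines = array();
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return $lines;                  // invalid expression or nothing matched
    }
    foreach ($xpath->query('text()', $nodes->item(0)) as $text) {
        $line = trim($text->wholeText);
        if ($line !== '') {
            $lines[] = $line;
        }
    }
    return $lines;
}

// href of every <a class="listinglink"> on the index page.
function listingUrls($html)
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html);
    $xpath = new DOMXPath($dom);

    $urls = array();
    foreach ($xpath->query("//a[@class='listinglink']") as $a) {
        $urls[] = $a->getAttribute('href');
    }
    return $urls;
}

// $fetch is any callable mapping a URL to an HTML string (hypothetical helper).
function scrapeListings($indexHtml, $query, $fetch)
{
    $result = array();
    foreach (listingUrls($indexHtml) as $url) {
        $page = call_user_func($fetch, $url);
        // A failed or empty fetch yields an empty list instead of a parse error.
        $result[$url] = (is_string($page) && $page !== '')
            ? addressLines($page, $query)
            : array();
    }
    return $result;
}
```

Against a real site, $fetch could be a small function wrapping curl with CURLOPT_RETURNTRANSFER enabled so that the page body is returned as a string. If the hrefs are relative, they would need to be resolved against the index page's URL before fetching.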