jasonc Posted October 4, 2017
The code is from a third-party site which I wish to get certain content from. I want to search within the "b" class (there is only one in the source code!) for the <dt>item7:</dt> and then grab the content of the element that follows it, which here is "7-". Otherwise, return an empty string.
$b = '
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/html">
<head>
</head>
<body>
<div class="b">
<dl>
<dt>item1:</dt><dd>1</dd>
<dt>item2:</dt><dd>2</dd>
<dt>item3:</dt><dd>3</dd>
<dt>item4:</dt><dd>4</dd>
<dt>item5:</dt><dd>5</dd>
<dt>item6:</dt><dd>6</dd>
<dt>item7:</dt><dd>7-</dd>
<dt>item8:</dt><dd>8</dd>
<dt>item9:</dt>
</dl>
</div>
</body>
</html>
';
$b = new SimpleXMLElement($b);
echo $b->dl->dt;
// echo the content of <dd> only if the previous <dt> node has the text 'item7' in it.
cyberRobot Posted October 4, 2017
Have you tried the children() method? More information can be found here: http://php.net/manual/en/simplexmlelement.children.php
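As a rough illustration of the children() suggestion, here is a minimal sketch that walks the direct children of a <dl> in document order and returns the <dd> that comes straight after a matching <dt>. The function name ddAfterDtViaChildren and the shortened $fragment are made up for this example, and it assumes the markup has already been reduced to well-formed XML.

```php
<?php
// Hypothetical helper (not from the thread): scan the <dl> children in order
// and return the text of the <dd> that immediately follows the matching <dt>.
function ddAfterDtViaChildren(SimpleXMLElement $dl, string $label): string
{
    $matched = false;
    foreach ($dl->children() as $child) {
        if ($matched) {
            // The element right after the matching <dt> must be a <dd>.
            return $child->getName() === 'dd' ? trim((string) $child) : '';
        }
        $matched = $child->getName() === 'dt' && trim((string) $child) === $label;
    }
    return ''; // label not found, or the matching <dt> was the last child
}

// Assumed, shortened sample: only the <dl> part of the page.
$fragment = '<dl><dt>item6:</dt><dd>6</dd><dt>item7:</dt><dd>7-</dd><dt>item9:</dt></dl>';
$dl = new SimpleXMLElement($fragment);
```

Calling ddAfterDtViaChildren($dl, 'item7:') on this fragment gives "7-", while 'item9:' gives an empty string because no element follows that <dt>.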
jasonc Posted October 4, 2017
I have looked at the children() link but it's gone completely over my head. It looks like the only way I can do this is the long way: search the HTML code for the <dt>item7:</dt>, then grab the text within the next element if the text was found. I was just wondering if there was an easier way of doing this without all the extra code.
requinix Posted October 4, 2017
There's a shorter method, but whether it's "easier" is debatable.
$b->registerXPathNamespace("h", "http://www.w3.org/1999/html");
$dd = $b->xpath("//h:div[@class='b']/h:dl/h:dd[preceding-sibling::h:dt[position()=1]='item7:']")[0];
Find any DIV with class='b', then go to its DL children, then to their DD children, but filter to the ones whose nearest preceding DT sibling has the value 'item7:'.
If the class isn't exactly "b" then it won't match, but the XPath query can be adjusted to suit.
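One caveat with the sample markup: the <html> tag declares xmlns twice, which a strict XML parser such as SimpleXMLElement is likely to reject before any XPath query runs. A hedged alternative is to load the page with the more forgiving HTML parser (DOMDocument::loadHTML) and query it with DOMXPath, where no namespace prefix is needed. The sketch below is not the method from the thread; the function name ddAfterDt and the shortened $sample are invented for illustration, and it assumes the label contains no single quote.

```php
<?php
// Hypothetical helper (not from the thread): return the text of the <dd> that
// immediately follows the <dt> with the given label inside div.b, else ''.
function ddAfterDt(string $html, string $label): string
{
    // Collect parser warnings quietly (the sample repeats the xmlns attribute).
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: $label contains no single quote, so it can be embedded directly.
    $query = sprintf(
        "//div[@class='b']/dl/dt[normalize-space()='%s']/following-sibling::*[1][self::dd]",
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }
    return trim($nodes->item(0)->textContent);
}

// Shortened copy of the markup from the question, duplicate xmlns included.
$sample = <<<'HTML'
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/html">
<head></head>
<body>
<div class="b">
<dl>
<dt>item6:</dt><dd>6</dd>
<dt>item7:</dt><dd>7-</dd>
<dt>item8:</dt><dd>8</dd>
<dt>item9:</dt>
</dl>
</div>
</body>
</html>
HTML;

echo ddAfterDt($sample, 'item7:'), PHP_EOL; // prints 7-
```

The following-sibling::*[1][self::dd] step only matches when the very next element is a <dd>, so a trailing <dt> such as item9: yields an empty string rather than picking up a value from elsewhere in the list.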
Phi11W Posted October 5, 2017
"The code is from a third-party site which I wish to get certain content from."
Screen-scraping from other web sites is generally a Bad Idea. "I know Engineers, they love to change things!" You can build a carefully crafted script that works today and then, in a couple of weeks' time and for no apparent reason, it suddenly stops working, and you have to drop everything, chase around and rewrite your script to work with their new web page design. Sure, it's "fun" the first couple of times you have to do this, but it gets "old" really quickly. You should be using a more stable, published API to get any data you require.
Regards, Phill W.