Jagarm Posted March 27, 2009

Hello everyone,

I am trying to extract some content from a page. The following is what I have so far:

    $content = file_get_contents("http://www.dfo-mpo.gc.ca/media/news-presse-eng.htm");
    echo preg_match_all("/<li>(<[^\r]*?)<\/li>/", $content, $matches, PREG_SET_ORDER);
    echo "<pre>";
    print_r($matches);
    echo "</pre>";

If you view the source of that page, you will see that I am trying to extract whatever sits between <li> and </li> whenever it contains a hyperlink.

I've been trying hard with no luck. The pattern works in my regex testbed, but not in PHP.

I would appreciate your help.

Thanks

nrg_alpha Posted March 27, 2009

When you need to look through the tags on a page, you can use DOMDocument / XPath instead of a regex. So if I understand you correctly, you only want to fetch <li> tags that have links within them? Perhaps something along the lines of:

    $dom = new DOMDocument;
    // @ suppresses the warnings libxml raises on malformed HTML
    @$dom->loadHTMLFile('http://www.dfo-mpo.gc.ca/media/news-presse-eng.htm');
    $xpath = new DOMXPath($dom);
    // select every <a> that is a direct child of an <li>
    $aTag = $xpath->query('//li/a');

    foreach ($aTag as $val) {
        echo 'href="' . $val->getAttribute('href') . '" - ' . $val->nodeValue . "<br />\n";
    }

Output:

    href="/media/news-presse-eng.htm" - News Releases
    href="/media/charges-inculpations-eng.htm" - Charges and Convictions
    href="/media/back-fiche-eng.htm" - Backgrounders
    href="/media/statement-declarations-eng.htm" - Ministerial Statements
    href="/media/speeches-discours-eng.htm" - Speeches
    href="http://www.glf.dfo-mpo.gc.ca/comm/nr-cp/alert-avis-e.php" - E-News
    href="/media/infocus-alaune-eng.htm" - Infocus
    href="/media/contacts-eng.htm" - Contacts
    . . . etc
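
If the goal is the whole content of each <li> that contains a link, rather than just the links themselves, the same DOMDocument / XPath approach can be adapted. The following is only a sketch under assumptions: the inline `$html` sample is hypothetical and stands in for the fetched page, `$items` is a name chosen for illustration, and `//li[a]` assumes the link is a direct child of the <li> (use `//li[.//a]` if it may be nested deeper).

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = <<<HTML
<html><body><ul>
  <li><a href="/media/news-presse-eng.htm">News Releases</a></li>
  <li>No link in this item</li>
  <li><a href="/media/contacts-eng.htm">Contacts</a> (updated)</li>
</ul></body></html>
HTML;

// Collect libxml parse warnings internally instead of silencing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Select only the <li> elements that have an <a> as a direct child.
$items = [];
foreach ($xpath->query('//li[a]') as $li) {
    // Rebuild the inner HTML of the <li> from its child nodes.
    $inner = '';
    foreach ($li->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    $items[] = trim($inner);
}

print_r($items);
```

With the real page, `loadHTMLFile()` with the URL from the original post would replace `loadHTML($html)`; the rest should work the same way, although that has not been checked against the live site.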

Jagarm Posted March 28, 2009

Thanks for the solution; that is easier than what I was doing.