daydreamer, posted August 14, 2009:

I am trying to match <h3 class=r><a href="" and save just the URLs into an array. This is what I have so far:

<?php
$pattern = '/class=r><a[\s]href="[A-Z0-9a-z.-]+"/';
preg_match_all($pattern, $data, $matches);
print_r($matches);
?>

What needs to be changed? Thanks.
MadTechie, posted August 14, 2009:

Well, most URLs have slashes and colons, so update the pattern to

$pattern = '%class=r><a\s+href="([/:A-Z0-9.-]+)"%i';

or try

preg_match_all('/class=r><a\s+href="([^"]*)"/i', $data, $matches);
$result = $matches[1]; // the URLs captured by the parentheses
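To illustrate why the second pattern is enough on its own, here is a minimal sketch. The sample markup in $data and its URLs are made up for illustration; the pattern is the one from the post above. With the default PREG_PATTERN_ORDER flag, $matches[0] holds the full matches and $matches[1] holds only what the parentheses captured.

```php
<?php
// Hypothetical sample input; the real $data would be the fetched page source.
$data = '<h3 class=r><a href="http://www.example.com/">One</a></h3>'
      . '<h3 class=r><a href="http://www.example.org/dir/page.php?x=1">Two</a></h3>';

// [^"]* takes everything up to the closing quote, so slashes, colons and
// query strings need no special handling in the character class.
preg_match_all('/class=r><a\s+href="([^"]*)"/i', $data, $matches);

// $matches[0] = full matches, $matches[1] = contents of the capture group.
$urls = $matches[1];
print_r($urls);
```

Because the pattern contains no forward slash, / is safe as the delimiter here. The URLs themselves never appear in the pattern, only in the input.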
daydreamer (Author), posted August 14, 2009:

OK, thanks, the last one works perfectly.
nrg_alpha, posted August 14, 2009:

Alternatively, you could also resort to DOM/XPath for this sort of thing.

Example:

$html = <<<EOF
<h3 class=r><a href="www.whatever.com">Link</a>
<table><tr><td>blah</td></tr></table>
<h3 class=r><a href="www.whatever2.org/somefolder/index.php">Link 2</a>
EOF;

$dom = new DOMDocument;
@$dom->loadHTML($html); // change loadHTML to loadHTMLFile, and replace $html with the real url encased in quotes
$xpath = new DOMXPath($dom);
$aTag = $xpath->query('//h3[@class="r"]/a'); // every <a> directly inside an <h3 class="r">
$arr = array(); // initialise so print_r() still works when nothing matches
foreach ($aTag as $val) {
    $arr[] = $val->getAttribute('href');
}
echo '<pre>'.print_r($arr, true);
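As a variation on the example above, the @ suppression can be swapped for libxml_use_internal_errors(), which collects parser warnings instead of hiding them. This is only a sketch and not part of the original post: the function name extractResultLinks and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: returns the href of every <a> directly inside <h3 class="r">.
function extractResultLinks($html)
{
    // Collect libxml warnings (malformed markup is common) instead of using @.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument;
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $links = array();
    foreach ($xpath->query('//h3[@class="r"]/a') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Made-up sample markup for illustration.
$html = '<h3 class=r><a href="http://www.example.com/">Link</a></h3>'
      . '<p>unrelated</p>'
      . '<h3 class=r><a href="http://www.example.org/somefolder/index.php">Link 2</a></h3>';

$links = extractResultLinks($html);
print_r($links);
```

Wrapping the lookup in a function keeps the error-handling toggle in one place, so the rest of the script sees libxml in whatever state it was in before the call.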