dreamwest Posted September 7, 2009
I'm trying to break apart a group of hyperlinks so I'm left with just the actual URLs:
html here<a class='link' href='http://www.site.com/?q=6J4W2G4fASI'>html here
html here<a class='link' href='http://www.site.com/?q=637yfuhqelhe'>html here
html here<a class='link' href='http://www.site.com/?q=hilghye8oyhi'>html here
etc...
so the result would be:
http://www.site.com/?q=6J4W2G4fASI
http://www.site.com/?q=637yfuhqelhe
http://www.site.com/?q=hilghye8oyhi
sKunKbad Posted September 7, 2009
Do you mean you are trying to strip the links out of an existing HTML document?
dreamwest (Author) Posted September 7, 2009
Yes.
thebadbad Posted September 7, 2009
You can grab the URLs with regex:
<?php
$str = "html here<a class='link' href='http://www.site.com/?q=6J4W2G4fASI'>html here html here<a class='link' href='http://www.site.com/?q=637yfuhqelhe'>html here html here<a class='link' href='http://www.site.com/?q=hilghye8oyhi'>html here";
preg_match_all('~<a\b[^>]+href\s?=\s?[\'"](.*?)[\'"]~is', $str, $matches);
echo '<pre>' . print_r($matches[1], true) . '</pre>';
?>
Note that $matches[1] will be an array of the URLs.
Or alternatively use PHP DOM:
<?php
$urls = array();
$str = "html here<a class='link' href='http://www.site.com/?q=6J4W2G4fASI'>html here html here<a class='link' href='http://www.site.com/?q=637yfuhqelhe'>html here html here<a class='link' href='http://www.site.com/?q=hilghye8oyhi'>html here";
$dom = new DOMDocument();
$dom->loadHTML($str);
$tags = $dom->getElementsByTagName('a');
foreach ($tags as $tag) {
    $urls[] = $tag->getAttribute('href');
}
echo '<pre>' . print_r($urls, true) . '</pre>';
?>
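A possible variation on the DOM approach, for markup that is not well formed (the sample's `<a>` tags are never closed, for instance). This is only a sketch: `extract_hrefs()` is a made-up helper name, not something from the thread, and the scalar type hints assume PHP 7 or later. It asks libxml to collect parser warnings internally rather than print them, and it skips anchors that have no `href`.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread).
function extract_hrefs(string $html): array
{
    // loadHTML() complains about an empty string, so bail out early.
    if ($html === '') {
        return array();
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($dom->getElementsByTagName('a') as $tag) {
        // Skip anchors without an href (e.g. <a name="...">).
        if ($tag->hasAttribute('href')) {
            $urls[] = $tag->getAttribute('href');
        }
    }
    return $urls;
}

$str = "html here<a class='link' href='http://www.site.com/?q=6J4W2G4fASI'>html here";
print_r(extract_hrefs($str));
```

Wrapping the logic in a function keeps the libxml error setting from leaking into the rest of the script.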
dreamwest (Author) Posted September 7, 2009
In reply to thebadbad's regex example:
Sweet, thanks! I knew preg_match_all would do it but couldn't get my head around the regex.
dreamwest (Author) Posted September 7, 2009
By the way, which is faster, regex or PHP DOM?
bundyxc Posted September 7, 2009
You most likely won't notice the difference either way, but my knowledge of the topic isn't very extensive. However, these guys seem to know what they're talking about.
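If you want to check the speed question on your own data, one rough way is to time both approaches side by side. This is only a sketch: `time_it()` is a hypothetical helper, the input is the sample link repeated 100 times (an arbitrary choice), and the numbers will vary with PHP version, input size, and machine, so it says nothing general about which approach wins.

```php
<?php
// Hypothetical timing helper (not from the thread): runs $fn repeatedly
// and returns the elapsed wall-clock time in seconds.
function time_it(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Assumed test input: the sample link repeated 100 times.
$str = str_repeat("html here<a class='link' href='http://www.site.com/?q=6J4W2G4fASI'>html here</a>\n", 100);

// Regex approach, using the pattern posted earlier in the thread.
$regex = function () use ($str) {
    preg_match_all('~<a\b[^>]+href\s?=\s?[\'"](.*?)[\'"]~is', $str, $matches);
    return $matches[1];
};

// DOM approach, with libxml warnings collected internally.
$dom = function () use ($str) {
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($str);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = array();
    foreach ($doc->getElementsByTagName('a') as $tag) {
        $urls[] = $tag->getAttribute('href');
    }
    return $urls;
};

printf("regex: %.4f s\n", time_it($regex, 200));
printf("DOM:   %.4f s\n", time_it($dom, 200));
```

Timing on input that resembles your real pages matters more than the absolute numbers from a toy string like this one.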
sKunKbad Posted September 7, 2009
PHP DOM is sweet. I like not needing regex, and the OOP involved makes things clearer because my brain is in full OOP mode.