kyleabaker Posted December 4, 2013

I'm trying to write a regex that takes an input string and returns each anchor tag it finds as a separate result. For example, the following string should return 3 results:

This is a test <a href="link1.html" data-modal="sdfdsf87ds87fdsf8bds8fb">string</a> example to <a href="link2.html">parse</a> and return <a class="someclass" href="link3.html">some</a> anchor tag results.

My expected results are:

1. <a href="link1.html" data-modal="sdfdsf87ds87fdsf8bds8fb">string</a>
2. <a href="link2.html">parse</a>
3. <a class="someclass" href="link3.html">some</a>

I'm testing this at http://regexpal.com/. The problem is that my regex ( <a (.+)[^<]*</a> ) selects everything from the start of the first anchor tag to the end of the last anchor tag, and I can't figure out how to split these apart.

Any suggestions so that it returns each tag as a separate result in the match array? Thanks in advance!

Link to comment https://forums.phpfreaks.com/topic/284529-regex-to-match-and-return-all-anchor-tags-in-a-string/
dalecosp Posted December 4, 2013

Try spaces, newlines, etc.? Or, perhaps, use something like DOMDocument to read the HTML instead of a regexp.
requinix Posted December 4, 2013

"Or, perhaps, use something like DOMDocument to read the HTML instead of a regexp."

That. Very that. Not only are regular expressions the wrong tool for dealing with HTML, DOMDocument is actually better at doing what you want. See getElementsByTagName.
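For anyone following along, here is a minimal sketch of the DOMDocument approach suggested above. It assumes the sample string from the original question; the variable names ($html, $doc, $anchors) are illustrative and not from the thread. Treat it as one possible way to do it rather than the definitive one.

```php
<?php
// Sketch only: pull each <a> element out of an HTML fragment with DOMDocument.
// $html is the sample string from the question (an assumption for illustration).
$html = 'This is a test <a href="link1.html" data-modal="sdfdsf87ds87fdsf8bds8fb">string</a> example to '
      . '<a href="link2.html">parse</a> and return <a class="someclass" href="link3.html">some</a> anchor tag results.';

$doc = new DOMDocument();

// Collect libxml warnings internally instead of emitting them;
// fragments and imperfect markup often trigger them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$anchors = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    // saveHTML($node) serializes only that node back to markup.
    $anchors[] = $doc->saveHTML($a);
}

print_r($anchors);
```

Because the markup is re-serialized by the parser, quoting or whitespace in the output may differ slightly from the input. If you only need a specific attribute, $a->getAttribute('href') inside the loop avoids string handling altogether.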
dalecosp Posted December 4, 2013

Yeah, that's what I use; he didn't say if it was a requirement ... never can tell when people are doing coursework ;)
.josh Posted December 5, 2013

I agree that a DOM parser is generally better for DOM parsing and manipulation, but regex isn't a bad alternative if what you are looking for is regular. If all you want is each complete anchor tag, this regex should work ($anchors will hold the results):

preg_match_all('~<a\s+.*?</a>~is',$string,$anchors);

If, however, you want to parse individual attributes or just the "text" of the anchor, etc., then using a DOM parser would definitely be better.

Since you are testing this in an online regex tester: <a\s+.*?</a> is the actual pattern, and the trailing i and s are modifiers. The i makes the match case-insensitive, and the s allows the dot to match newline characters, in case the "text" inside the anchor tags contains newlines. In other words, make sure to enable the equivalent flags in the tester as well.
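To see why the original pattern swallowed everything while the lazy one does not, here is a small side-by-side sketch. It assumes the sample string from the question; $greedy is a name introduced here for illustration only.

```php
<?php
// Sample string from the question (an assumption for illustration).
$string = 'This is a test <a href="link1.html" data-modal="sdfdsf87ds87fdsf8bds8fb">string</a> example to '
        . '<a href="link2.html">parse</a> and return <a class="someclass" href="link3.html">some</a> anchor tag results.';

// Pattern from the question: the greedy (.+) runs to the end of the line and
// backtracks only as far as the LAST </a>, so all three tags come back as one match.
preg_match_all('~<a (.+)[^<]*</a>~', $string, $greedy);

// Pattern from the reply: the lazy .*? stops at the FIRST </a> it can reach,
// so each anchor tag is matched separately.
preg_match_all('~<a\s+.*?</a>~is', $string, $anchors);

echo count($greedy[0]), "\n";  // 1
echo count($anchors[0]), "\n"; // 3
print_r($anchors[0]);
```

One limitation worth knowing: because the pattern requires whitespace after <a, a bare <a> tag with no attributes would not be matched.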
This topic is now archived and is closed to further replies.