Lexas
Posted July 10, 2009

Hello guys. I'm trying to use the WordPress plugin WordPress Easy Contents, but it has a bug that I'm trying to fix. The following expression is meant to catch HTML tags, h1 for example:

preg_match_all('#\<'.$element.'>(.+?)\</'.$element.'>#si', $content, $matches, PREG_SET_ORDER);

$element is the tag name to be caught, and $content is the input text to be searched. The problem is that this expression only works if the tag has no ID and no class. For example, <h1> works, but <h1 class="anything"> doesn't. I've tried a lot of combinations to mean "anything from here until '>'", but nothing worked. Any idea what can be used here?

Link to comment: https://forums.phpfreaks.com/topic/165431-preg_match_all-passing-through-possible-classes/

thebadbad
Posted July 10, 2009

preg_match_all("#<$element(\s+[^>]+)?>(.+?)</$element>#si", $content, $matches, PREG_SET_ORDER);

Added an optional subpattern: one or more whitespace characters followed by one or more characters that are not a >.
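
To see the difference the optional subpattern makes, here is a minimal sketch that runs the original pattern and the suggested one side by side. The sample $content is made up for illustration and does not come from the plugin.

```php
<?php
// Hypothetical sample input (an assumption, not taken from the plugin).
$content = '<h1>Intro</h1><h1 class="anything">Details</h1>';
$element = 'h1';

// Original pattern: only matches opening tags without attributes.
preg_match_all('#\<' . $element . '>(.+?)\</' . $element . '>#si', $content, $plain, PREG_SET_ORDER);

// Suggested pattern: optional whitespace plus attributes before the '>'.
preg_match_all("#<$element(\s+[^>]+)?>(.+?)</$element>#si", $content, $withAttrs, PREG_SET_ORDER);

echo count($plain), "\n";     // 1
echo count($withAttrs), "\n"; // 2
```

The second call picks up both headings, while the first one skips the heading that carries a class attribute.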

nrg_alpha
Posted July 10, 2009

Quote (thebadbad):
    preg_match_all("#<$element(\s+[^>]+)?>(.+?)</$element>#si", $content, $matches, PREG_SET_ORDER);
    Added an optional subpattern: one or more whitespace characters followed by one or more characters that are not a >.

Alternatively, you could simply use:

preg_match_all("#<{$element}[^>]*>(.+?)</{$element}>#si", $content, $matches, PREG_SET_ORDER);

With your pattern, should there be any attribute(s) after the $element tag name, you will be capturing them. If there is no need to capture, you can use non-capturing parentheses: (?: ... ), but I find simply using the negated character class easier.
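
A small sketch of the capturing point made above, using a made-up sample string: with the capturing attribute group the inner text lands at index 2, while the non-capturing group and the negated character class both keep it at index 1.

```php
<?php
// Hypothetical sample input for illustration only.
$content = '<h1 class="anything">Details</h1>';
$element = 'h1';

// Capturing group for the attributes: inner text moves to index 2.
preg_match_all("#<$element(\s+[^>]+)?>(.+?)</$element>#si", $content, $capturing, PREG_SET_ORDER);

// Non-capturing group: inner text stays at index 1.
preg_match_all("#<$element(?:\s+[^>]+)?>(.+?)</$element>#si", $content, $nonCapturing, PREG_SET_ORDER);

// Negated character class, no group at all. The braces around $element keep
// PHP from reading the following '[' as an array index inside the string.
preg_match_all("#<{$element}[^>]*>(.+?)</{$element}>#si", $content, $negated, PREG_SET_ORDER);

echo $capturing[0][2], "\n";    // Details
echo $nonCapturing[0][1], "\n"; // Details
echo $negated[0][1], "\n";      // Details
```

Code that reads $matches[...][1] for the inner text, as the original pattern implies, would need no change with the last two variants.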

thebadbad
Posted July 11, 2009

Quote (nrg_alpha):
    With your pattern, should there be any attribute(s) after the $element tag name, you will be capturing them. If there is no need to capture, you can use non-capturing parentheses: (?: ... ), but I find simply using the negated character class easier.

You're right that I should have used non-capturing parentheses; I simply forgot. Consider this sample string to see why I added the whitespace(s):

<acronym title="PHP Freaks"><a href="http://php.net/">PHP</a>F</acronym>

When $element = 'a', your pattern would (wrongfully) start matching at the opening <acronym> tag, because <a[^>]*> also matches <acronym title="PHP Freaks">.
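
The false match can be reproduced with the sample string from the post above. This is only a sketch; the variable names $loose and $strict are made up for the comparison.

```php
<?php
// Sample string from the post above.
$content = '<acronym title="PHP Freaks"><a href="http://php.net/">PHP</a>F</acronym>';
$element = 'a';

// Negated character class right after the tag name: '<a' also matches '<acronym'.
preg_match_all("#<{$element}[^>]*>(.+?)</{$element}>#si", $content, $loose, PREG_SET_ORDER);

// Required whitespace before any attributes: '<acronym' is rejected.
preg_match_all("#<{$element}(?:\s+[^>]+)?>(.+?)</{$element}>#si", $content, $strict, PREG_SET_ORDER);

echo $loose[0][1], "\n";  // <a href="http://php.net/">PHP
echo $strict[0][1], "\n"; // PHP
```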

nrg_alpha
Posted July 11, 2009

Quote (thebadbad):
    Quote (nrg_alpha):
        In your case, should there be some attribute(s) after the $element tag name, you will be capturing it. If there is no need to capture, you can use non-capturing parentheses: (?: ... ), but I find simply using the negated character class easier.
    You're right that I should have used non-capturing parentheses, simply forgot it. Consider this sample string to see why I added the whitespace(s):
    <acronym title="PHP Freaks"><a href="http://php.net/">PHP</a>F</acronym>
    When $element = 'a', your pattern would (wrongfully) capture the green part.

Right, I see what you're saying now. In that case, we could simply insert a \b word boundary inside the opening tag in the pattern:

<$element\b[^>]*>

This way, if $element = 'a', it will ignore tags like <acronym> or <abbr> and will find the actual anchor tags. That removes the need for an optional group that checks for whitespace followed by anything that is not a >.
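
Putting the word-boundary idea into a complete call might look like the sketch below. The helper name find_elements is hypothetical and not part of the plugin, and running the tag name through preg_quote() is an extra precaution that nobody in the thread asked for.

```php
<?php
// Hypothetical helper (not part of the plugin): returns the inner text of
// every <$element ...>...</$element> pair found in $content.
function find_elements(string $element, string $content): array
{
    // Escape regex metacharacters in the tag name; '#' is the pattern delimiter.
    $tag = preg_quote($element, '#');

    // \b stops 'a' from matching the start of 'acronym' or 'abbr'.
    $pattern = '#<' . $tag . '\b[^>]*>(.+?)</' . $tag . '>#si';

    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

    // Index 1 of each match set holds the captured inner text.
    return array_column($matches, 1);
}

$content = '<acronym title="PHP Freaks"><a href="http://php.net/">PHP</a>F</acronym>';

print_r(find_elements('a', $content));       // one entry: PHP
print_r(find_elements('acronym', $content)); // one entry: the anchor markup followed by F
```

One caveat: \b also sees a boundary before a hyphen, so a tag such as <a-custom> would still match when $element = 'a'. Like the other patterns in this thread, it also does not handle nested tags of the same name.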

thebadbad
Posted July 11, 2009

True, using a word boundary would be more appropriate.

nrg_alpha
Posted July 11, 2009

It was a good catch on your part, though. Looking at the OP's pattern and then at yours, I wasn't sure what you were getting at (hindsight has 20/20 vision, they say).

thebadbad
Posted July 11, 2009

I didn't go into detail on purpose, to let you figure it out yourself.