mac_gabe Posted November 10, 2010

Beginner question…

This pattern and replacement work (they change the br tags separating links in an HTML list into li tags):

    $pattern_strip_1= '/\<a(.*?)a\>\<br \/\>/';
    $replace_strip_1= '<li><a$1a></li>';

But this one doesn't:

    $pattern_strip_1= '/\<a href(.*?)a\>\<br \/\>/';
    $replace_strip_1= '<li><a href$1a></li>';

I don't get it - it's just a few more letters (" href"), so why should it make any difference? I'm using it in preg_replace.

Thanks for any shedding of light.

https://forums.phpfreaks.com/topic/218291-why-does-one-replace-work-and-the-other-not/
ManiacDan Posted November 10, 2010

The text you're matching against is generally the reason a match won't work. I bet you have <a class="something" href="url.php" />

-Dan
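A minimal sketch of the situation Dan is guessing at, run against made-up sample markup (the two links in `$html` are assumptions, not the original poster's data). It applies both patterns from the question to one link whose first attribute is href and one whose first attribute is class:

```php
<?php
// Hypothetical sample markup: href first on line 1, class first on line 2.
$html = '<a href="one.php">One</a><br />' . "\n"
      . '<a class="something" href="two.php">Two</a><br />';

// First pattern from the question: anything between "<a" and "a><br />".
$loose = '/\<a(.*?)a\>\<br \/\>/';
// Second pattern from the question: requires the literal text "<a href".
$strict = '/\<a href(.*?)a\>\<br \/\>/';

$looseOut  = preg_replace($loose, '<li><a$1a></li>', $html);
$strictOut = preg_replace($strict, '<li><a href$1a></li>', $html);

// The loose pattern rewrites both lines; the strict one leaves the
// class-first link untouched because "<a class" is not "<a href".
echo $looseOut, "\n\n", $strictOut, "\n";
```

With this input, only the first line is rewritten by the second pattern, which is consistent with the "works for some links, not others" symptom.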
.josh Posted November 11, 2010

To be more specific, your 2nd pattern expects to find the literal "<a href" at the beginning of the substring. So if href is not the first attribute in your link, the match will fail. I am assuming, based on the context of your "before" and "after", that you probably have anchor tags that aren't actually URL links and therefore want to exclude them by only looking for tags that have the href attribute. I'm not sure why you have an "a" at the closing of your link tags, but assuming that is really supposed to be there, this should work:

    $pattern_strip_1= '~<a([^>]*)href([^>]*)a><br />~';
    $replace_strip_1= '<li><a{$1}href{$2}a></li>';

This has a couple of additional optimizations:

- Use a negative character class instead of a match-all with a lazy quantifier. It is safer and more efficient.
- You don't need to escape all that stuff, only the characters that are the same as your pattern delimiter or mean something special to the regex engine. The only thing you had in there that really needed escaping was the / because you used that as your delimiter. On that note, when you are working with HTML, it's a good idea to choose a delimiter that doesn't commonly show up in HTML (like ~) so that you don't have to make your pattern more confusing by having lots of escaped chars in it.
- It's always a good idea to wrap your captured vars ($1, $2, etc...) in {..} to avoid ambiguity. Basically it lets PHP know that you meant $1 not $1a, etc...
- Even with this pattern, some assumptions are made. For example:
----- It assumes there is no spacing between the anchor tag and the br tag, and that both are on the same line.
----- It also assumes there is a space and a / in the br tag, which is technically correct markup, but most browsers will recognize <br> or <br/>, so this pattern doesn't account for those.
----- It assumes everything will be in lowercase. Regex is case-sensitive unless you specify otherwise.
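The assumptions listed above can each be relaxed in the pattern itself. What follows is a rough sketch, not the pattern from the post: it assumes well-formed `<a ...>text</a>` links, and the sample markup in `$html` is invented for illustration. Because `[^>]*` stops at the first `>`, it cannot reach past the end of the opening tag, so the sketch captures the attributes and the link text in separate groups.

```php
<?php
// Hypothetical input mixing attribute order, letter case, and <br> styles.
$html = '<a class="something" href="two.php">Two</a><br />' . "\n"
      . '<A HREF="three.php">Three</A> <br>' . "\n"
      . '<a name="top">Not a link</a><br/>';

// "~" as delimiter, so "/" needs no escaping; the trailing "i" makes the
// match case-insensitive.
// Group 1: everything inside the opening tag after "<a"; it must contain href.
// Group 2: the link text (lazy, so it stops at the first "</a>").
// "\s*" allows whitespace before the br tag; "<br\s*/?>" accepts
// <br>, <br/> and <br />.
$pattern     = '~<a\b([^>]*\bhref\b[^>]*)>(.*?)</a>\s*<br\s*/?>~i';
$replacement = '<li><a$1>$2</a></li>';

$out = preg_replace($pattern, $replacement, $html);
echo $out, "\n";
```

The anchor without an href is left alone, which matches the guess that non-link anchors should be excluded. Note that the literal `<a` and `</a>` in the replacement are lowercase, so an uppercase tag name in the input comes out lowercased while its attributes keep their original case.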
mac_gabe Posted November 17, 2010

Wow, many thanks for the replies. I've only just seen this (I screwed up my notifications somehow, so I thought this post hadn't been replied to). I will now check whether what you say explains it. I have a feeling that it will!
mac_gabe Posted November 17, 2010

Quote: "I bet you have <a class="something" href="url.php" />"

Indeed some of my links start with <a class="something" href="url.php" /> !! Thanks :-)

Quote: "I'm not sure why you have an "a" at the closing of your link tags"

Good point! Didn't need that :-)

Quote: "- You don't need to escape all that stuff"

Thanks, that has really cleaned things up.

Quote: "- It's always a good idea to wrap your captured vars ($1, $2, etc...) in {..} to avoid ambiguity. Basically it lets php know that you meant $1 not $1a, etc..."

I just tried this, but for some reason it seems to be inserting the curly brackets into the HTML?? Works OK in this instance without. Will investigate this and the "negative character class", which is also new for me.

Thanks for all your help - I've learnt so much.
ManiacDan Posted November 17, 2010

Quote: "I just tried this, but for some reason it seems to be inserting the curly brackets into the HTML?? Works OK in this instance without. Will investigate this and the "negative character class" which is also new for me."

The curly brackets are only necessary inside double-quoted strings where you're using PHP variables. $1 and $2 are technically not PHP variables, and they're inside single quotes anyway, so you don't need them.

Did you get it working?

-Dan
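For reference, preg_replace has its own delimited form for replacement references, written `${1}` rather than `{$1}`. A small sketch of the difference, using a made-up subject string and a one-group pattern chosen purely for illustration:

```php
<?php
// Hypothetical subject and pattern, used only to show the replacement syntax.
$subject = 'item';
$pattern = '~(item)~';

// '{$1}' in single quotes reaches preg_replace unchanged. The braces are not
// part of the reference syntax, so they are copied into the output.
$braces = preg_replace($pattern, '{$1}', $subject);

// '${1}' is the delimited form of the same reference: no braces in the output.
$delimited = preg_replace($pattern, '${1}', $subject);

// The delimited form matters when a literal digit follows the reference,
// because '$11' would be read as a reference to group 11.
$withDigit = preg_replace($pattern, '${1}1', $subject);

echo $braces, "\n", $delimited, "\n", $withDigit, "\n";
```

A letter directly after the reference, as in `$1a`, is not ambiguous, since references are numeric. That is why the original replacement string worked without any braces.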
mac_gabe Posted November 17, 2010

Yup, works great, thanks!