Botsta Posted September 7, 2011

I have the following code to replace a smiley with an image:

$post = '.';
$post = preg_replace('#\W:\)\W#', '<img src="smile.png" />', $post);

It only replaces the smiley if it's not next to an alphanumeric character. The problem is that it also replaces the characters beside it (in this case a full stop), but I want to keep them. Is there any way to check for \W without replacing it?
xyph Posted September 7, 2011

Your example fails due to the lack of a non-word character AFTER the smiley. That's something you might want to change.

To solve your problem, you want something like this:

<?php
$post = '..';
$post = preg_replace('#(\W):\)(\W)#', '\1<img src="smile.png" />\2', $post);
echo $post;
?>

I've wrapped the non-word characters in capturing groups, and used \1 and \2 in the replacement to put back whatever those groups captured.
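A minimal sketch of the capturing-group approach, for readers who want to run it. The sample strings and the alt text are assumptions chosen for illustration (the smiley text in the original posts did not survive), and `$1`/`$2` are PHP's alternative spelling of `\1`/`\2`.

```php
<?php
// Hypothetical inputs: the strings used in the thread were lost.
$pattern     = '#(\W):\)(\W)#';
$replacement = '$1<img src="smile.png" alt=":)" />$2';

// A non-word character on both sides: the smiley is replaced and
// the surrounding characters are put back by $1 and $2.
$result = preg_replace($pattern, $replacement, 'Hello :) world.');
echo $result, "\n";

// Nothing before the smiley: (\W) has no character to consume, so no match.
$atStart = preg_replace($pattern, $replacement, ':) hi');
echo $atStart, "\n";
```

Because `(\W)` must consume a real character, a smiley at the very start or end of the string is left untouched.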
.josh Posted September 8, 2011

Alternative, using a zero-width assertion instead of captured groups:

$post = '..';
$post = preg_replace('~\B:\)\B~', '<img src="smile.png" />', $post);
echo $post;
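A short sketch of how the `\B` version behaves. The inputs and the alt text are assumptions for illustration. `\B` consumes nothing, so the neighbouring characters survive without capture groups, and it also succeeds at the start and end of the string when the adjacent smiley character is a non-word character.

```php
<?php
// Hypothetical inputs, chosen to show where \B does and does not match.
$pattern = '~\B:\)\B~';
$img     = '<img src="smile.png" alt=":)" />';

$alone = preg_replace($pattern, $img, ':)');        // smiley is the whole string
$punct = preg_replace($pattern, $img, 'Nice :).');  // punctuation on the right
$glued = preg_replace($pattern, $img, 'abc:)def');  // letters on both sides

echo $alone, "\n", $punct, "\n", $glued, "\n";
```

In the last case the position between `c` and `:` is a word boundary, so `\B` fails there and the string is returned unchanged.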
Botsta Posted September 8, 2011

@.josh: This works great but doesn't work on the following smilies (which all have a number or letter in them): > 0.o

How can I fix this? Thanks.
xyph Posted September 8, 2011

Zero-width assertions are for very specific searches, and not something as generic as yours. Try my solution.
Botsta Posted September 10, 2011

xyph wrote: "Zero-width assertions are for very specific searches, and not something as generic as yours. Try my solution."

Your solution works exactly how I want it to, apart from one thing.

$string = ' :) ';

When this string is processed, only the middle smiley gets replaced by an image, because the other two have nothing next to them. Is there any way of allowing nothing next to the smiley as well as a non-alphanumeric character?

Thanks for your help.
xyph Posted September 10, 2011

xyph wrote: "Your example fails due to a lack of a non-word character AFTER the smiley. That's something you might want to change"

I noticed. This is tricky, and requires lookaheads/lookbehinds.

/(?<=(^|\W)):\)(?=($|\W))/m
Botsta Posted September 10, 2011

I get the following error when using that regex:

Warning: preg_replace(): Compilation failed: lookbehind assertion is not fixed length at offset 10

This is the code I used:

$post = '';
$post = preg_replace('/(?<=(^|\W)):\)(?=($|\W))/m', '\1<img src="smiley.png" />\2', $post);
echo $post;
xyph Posted September 10, 2011

Bah, silly PCRE garbage... We can hack around this though.

/((?<=^)|(?<=\W)):\)((?=$)|(?=\W))/m

Rather than put the OR inside the look-around, we use look-around OR look-around. This slows the function down considerably though, requiring 5 backtracks per failed attempt as opposed to one. I'm not sure if there's any other way to do it. I've tested this one.

Don't forget, images require an ALT attribute in HTML4/XHTML.

I'll see if I can make a more efficient RegEx.

[EDIT] Here's a better expression that only needs a single step to fail, and only uses a look-behind.

<?php
$expr = '/(?<=(^|\B))($|\W)/m';
$str = 'zomg foo bar :D ';
echo preg_replace( $expr, '\1^grin^\2', $str );
?>

Hope this helps, and hope you understand the expression.
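A runnable sketch of the "look-around OR look-around" pattern from the post above, using an assumed sample string and image markup. For comparison it also tries a second form that is not from the thread: PCRE's documentation says alternatives at the top level of a look-behind may have different fixed lengths, so `(?<=^|\W)` should compile where `(?<=(^|\W))` did not. Treat that second pattern as an assumption to check against your own PHP version.

```php
<?php
// Hypothetical sample input and image markup, chosen for illustration.
$img  = '<img src="smile.png" alt=":)" />';
$post = ':) a:) :)';

// Pattern from the thread: each alternative is its own fixed-length look-around.
$split = '/((?<=^)|(?<=\W)):\)((?=$)|(?=\W))/m';

// Assumed equivalent (not from the thread): alternation at the top level
// of a single look-behind, rather than inside a nested group.
$topLevel = '/(?<=^|\W):\)(?=$|\W)/m';

$fromSplit    = preg_replace($split, $img, $post);
$fromTopLevel = preg_replace($topLevel, $img, $post);

echo $fromSplit, "\n";
var_dump($fromSplit === $fromTopLevel);
```

The first and last smilies are replaced even though they touch the start and end of the string, while `a:)` is left alone because a letter precedes it.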
Botsta Posted September 10, 2011

/(?<=(^|\B))($|\W)/m

This doesn't work with smilies that have a number or letter at the beginning, e.g. xD 0.o

/((?<=^)|(?<=\W)):\)((?=$)|(?=\W))/m

This one works perfectly, but if you find an expression that's faster then please post it.

Thanks for all your help.
.josh Posted September 11, 2011

xyph wrote: "Bah, silly PCRE garbage..."

pfft... most regex engines do not support variable-length lookbehinds, and the few that do are very clunky and slooow at best.
xyph Posted September 11, 2011

I was blaming PCRE for my mistake, just being cynical. From what I understand, it's more PHP's implementation. Java allows finite variable-length look-arounds, and the .NET framework allows full expressions inside. What this does for speed, I have no idea.

@Botsta - The problem is zero-width assertions. \B will not match before the first letter of your smilie, and we can't use \W because ^, which matches the start of the string, is also zero-width. PHP's implementation of PCRE won't allow differing lengths (0,1) of matches within a look-around statement.

Will keep looking for solutions. The big issue here is that you're trying to use one statement for many complex expressions. You'll want to use the slow example for the smilies that won't work with the quicker expression, and the quick expression for everything else.
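A sketch of the `\B` problem described above, plus one option that was not proposed in the thread: negative look-arounds on `\w`. `(?<!\w)` and `(?!\w)` are each a fixed one character wide and also succeed at the start and end of the string, so they should behave the same for smilies that begin with a letter or digit. The sample strings and image markup are assumptions.

```php
<?php
// Hypothetical image markup and inputs, for illustration only.
$img = '<img src="grin.png" alt="xD" />';

// \B before a letter needs ANOTHER word character in front of it,
// so a free-standing xD is missed and one inside a word is hit.
$missed  = preg_replace('~\BxD\B~', $img, 'lol xD');
$mangled = preg_replace('~\BxD\B~', $img, 'taxDay');

// Swapped-in technique (not from the thread): "no word character
// immediately before or after", expressed with negative look-arounds.
$pattern  = '~(?<!\w)xD(?!\w)~';
$replaced = preg_replace($pattern, $img, 'xD lol xD');
$intact   = preg_replace($pattern, $img, 'taxDay');

// The same idea applied to a punctuation-only smiley.
$smile = preg_replace('~(?<!\w):\)(?!\w)~', '[smile]', ':) a:) :)');

echo $missed, "\n", $mangled, "\n", $replaced, "\n", $intact, "\n", $smile, "\n";
```

To cover many smilies with one expression, each smiley would need to be escaped (for example with `preg_quote`) before being placed between the two look-arounds.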
salathe Posted September 11, 2011

What about $post = ':):D8)'; ?
xyph Posted September 11, 2011

Matching that kind of string while making sure there are non-word characters before and after each smiley is extremely complex and not worth it. If he wanted to match a string like that, he'd want to forget about the above requirement. I think that's why his non-working examples all had spaces between them.

IMO, he should ditch that requirement and instead have a [nosmilie] tag, or tags inside which smilies will not be parsed, if he's worried about breaking up code or other content.
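A rough sketch of the [nosmilie] idea. Only the tag name comes from the post above; the closing tag `[/nosmilie]`, the function name `replace_smilies`, and the image markup are all assumptions. It splits the text on protected blocks and replaces smilies only in the pieces outside them, with no word-boundary requirement at all.

```php
<?php
// Hypothetical helper: everything except the [nosmilie] tag name is assumed.
function replace_smilies(string $text): string
{
    // With one capture group and PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // outside text, protected block, outside text, protected block, ...
    $parts = preg_split(
        '~(\[nosmilie\].*?\[/nosmilie\])~s',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) { // even indexes lie outside [nosmilie] blocks
            $parts[$i] = str_replace(':)', '<img src="smile.png" alt=":)" />', $part);
        }
    }

    return implode('', $parts);
}

$out = replace_smilies('a :) [nosmilie]:)[/nosmilie] :)');
echo $out, "\n";
```

This sketch leaves the tags themselves in the output; a real implementation would probably strip them after the replacement.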