redbullmarky Posted January 28, 2007

Hi all

I'll be the first to admit I suck at regex stuff...

I have an array of smilies and an array of filenames. I understand that, with the use of preg_quote, I can convert my smilies into strings that are suitable for regex.

However, I just need help with the pattern to search/replace under certain conditions: each side of the actual smiley MUST have one of the following:

1, a space
2, start/end of line
3, punctuation (comma, full stop, bracket, colon, semicolon, question mark, exclamation mark, etc.)

My problem is that when I use a pattern through preg_replace, the character I'm checking for on either side gets replaced too. So I guess the question is: how can I preg_replace, yet leave the surrounding character (whatever it is) intact?

Cheers ;)
Mark

Link to comment: https://forums.phpfreaks.com/topic/36073-solved-help-with-smily-regex/

c4onastick Posted January 28, 2007

Hey Mark,

You can use lookaround for stuff like this. effigy is the lookaround master around here.

[code]
$pattern = array(
    '/(?<=[\s,.;:\'"?!&%$]);-\)(?=[\s,.;:\'"?!&%$])/'
);
$replacements = array(
    '<img src=\'smiley_pic.jpg\'>'
);
$test = 'That\'s great!;-) This is a lot of fun ;-)!';
$test = preg_replace($pattern, $replacements, $test);
[/code]

The only problem with this, based on your criteria, is that lookbehinds have to be fixed length, so you can't use alternation to get that start of line anchor. Maybe effigy will know a way around this; I can't think of one off the top of my head.

I have to be honest, I rarely use preg_quote, but you should be able to concatenate the lookbehind and the lookahead with your quoted smiley to build the array.

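The concatenation suggested above could look roughly like this. It is only a sketch: the smiley-to-file map and the filenames are made up for illustration, and it still has the start/end-of-string gap mentioned in the post.

```php
<?php
// Hypothetical smiley => image map; the filenames are invented for this example.
$smilies = array(
    ';-)' => 'wink.gif',
    ':-)' => 'smile.gif',
);

// Characters allowed on either side of a smiley (same class as in the post above).
$boundary = '[\s,.;:\'"?!&%$]';

$patterns = array();
$replacements = array();
foreach ($smilies as $smiley => $file) {
    // preg_quote() escapes regex metacharacters; '/' is passed because it is the delimiter.
    $patterns[] = '/(?<=' . $boundary . ')' . preg_quote($smiley, '/') . '(?=' . $boundary . ')/';
    $replacements[] = '<img src="' . $file . '">';
}

$test = 'That\'s great!;-) This is a lot of fun ;-)!';
$result = preg_replace($patterns, $replacements, $test);
echo $result, "\n";
```

Both ;-) smilies are swapped for the image tag, while the "!" and the spaces around them are left alone.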
redbullmarky Posted January 28, 2007 (Author)

whoa

lookahead/lookbehinds are a bit new to me. what can they do for me? the preg_ stuff in your example kinda clouds the smiley bit, as there are lots of : and ) etc. but let's say i have something like:

[code]
<?php
$text = ":) hello this is some text (this is some more in brackets with smily at the end ;)) ok :)!";
$search = array(':)', ';)');
$replace = array('smile', 'wink');
// all the preg_quote / preg_replace stuff here
?>
[/code]

so far, i have a loop which escapes every entry in the smiley array and adds the starting and ending /, then just a simple:

[code]
$text = preg_replace($search, $replace, $text);
[/code]

to do the business. so i guess i'm looking to make sure that brackets, spaces, start/end, etc. are all treated the same. If your example does that, any chance you can split it into its parts so i can see what's what? :)

cheers for your help
Mark

c4onastick Posted January 28, 2007

[quote author=redbullmarky link=topic=124422.msg515503#msg515503 date=1170007930]If your example does that, any chance you can split it into its parts so i can see what's what? :)[/quote]

Sure,

[code]
$pattern = array(
    '/(?<=[\s,.;:\'"?!&%$]);-\)(?=[\s,.;:\'"?!&%$])/'
);
[/code]

Here the basic match is:

[code]/;-\)/[/code]

which will match this smiley ' ; - ) ' (I added spaces so it won't get parsed in this post), where the '/' characters are the delimiters.

This:

[code](?<=...)[/code]

is a "positive lookbehind assertion", which works kind of like an if statement. In English, it essentially means: match '; - )' only if it is immediately preceded by one of the characters in the character class I've got in there, [\s,.;:\'"?!&%$] in this case. (I escaped the ' so PHP won't get confused.)

This part:

[code](?=...)[/code]

is essentially the same thing, except it's a "positive lookahead assertion". This:

[code](?=[\s,.;:\'"?!&%$])[/code]

means match '; - )' only if one of the characters in this class immediately follows it.

The really cool thing about lookahead and lookbehind (collectively, lookaround) is that they don't "consume" any characters in the match. A lookaround just checks whether its pattern is there and allows the match to succeed if it is, without making the lookaround part of the final match.

I'll have to do some digging on that start of string thing. In order to not get stuck in huge loops, lookbehinds must be a fixed length, which stinks because I'd normally just use alternation to match the beginning of the line:

[code]preg_match('/(?:^|[\s,.;:\'"?!&%$])smiley/m', $foo);[/code]

'(?:...)' are non-capturing parentheses.

I'd probably just use two passes. This one:

[code]
$pattern = array(
    '/(?<=[\s,.;:\'"?!&%$]);-\)(?=[\s,.;:\'"?!&%$\n])/' // I added the newline character here
);
[/code]

And something like:

[code]
$pattern = array(
    '/^;-\)/m'
);
[/code]

The first one should get all the smilies except the ones at the true start of the line. The second one will just match smilies at the beginning of the line.

Hope that helps!

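To see the "doesn't consume" point in action, here is a small side-by-side sketch. The '[wink]' placeholder and the sample text are made up; the character class is the one from the post above.

```php
<?php
$text = 'fun ;-)!';

// Consuming version: the space and the "!" are part of the match, so they get replaced too.
$consumed = preg_replace('/[\s,.;:\'"?!&%$];-\)[\s,.;:\'"?!&%$]/', '[wink]', $text);

// Lookaround version: the neighbours are only checked, not matched, so they survive.
$kept = preg_replace('/(?<=[\s,.;:\'"?!&%$]);-\)(?=[\s,.;:\'"?!&%$])/', '[wink]', $text);

echo $consumed, "\n"; // fun[wink]
echo $kept, "\n";     // fun [wink]!
```

The first call is the behaviour Mark described in his original question; the second leaves the surrounding characters intact.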
effigy Posted January 29, 2007

[quote author=c4onastick link=topic=124422.msg515498#msg515498 date=1170007283]...lookbehinds have to be fixed length, so you can't use alternation to get that start of line anchor.[/quote]

Ah, but you can. Technically, alternation is fixed length because you're saying either/or. Of course, the "either" and "or" parts each have to be fixed length.

The code below should get you by unless you're working with Unicode and locales.

[code]
<pre>
<?php
    $tests = array(
        ':) :D :X',
        'Text:) :(Text :O',
        'Text:(Text',
        ' :) ',
        ':)',
    );
    $find = array(':)', ':(', ':D', ':O', ':X');
    $replace = array('--smile--', '--frown--', '--grin--', '--surprise--', '--silence--');
    foreach ($tests as $test) {
        echo $test, ' => ';
        $i = 0;
        foreach ($find as $smiley) {
            // Match only at the start/end of the string or next to a non-word character.
            // '/' is passed to preg_quote() so a smiley containing the delimiter is escaped too.
            $test = preg_replace('/(?<=^|[\s\W])' . preg_quote($smiley, '/') . '(?=\z|[\s\W])/', $replace[$i], $test);
            ++$i;
        }
        echo $test, '<br>';
    }
?>
</pre>
[/code]

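The same pattern can be wrapped in a small helper and run against the kind of text from Mark's earlier post. This is a sketch only: the function name replace_smilies and the shortened sample text are assumptions, not something from the thread.

```php
<?php
// Hypothetical helper; the name is invented for this example.
function replace_smilies($text, array $map)
{
    foreach ($map as $smiley => $replacement) {
        // Same lookbehind/lookahead as in the post above.
        $pattern = '/(?<=^|[\s\W])' . preg_quote($smiley, '/') . '(?=\z|[\s\W])/';
        $text = preg_replace($pattern, $replacement, $text);
    }
    return $text;
}

$result = replace_smilies(
    ':) hello (more in brackets ;)) ok :)!',
    array(':)' => 'smile', ';)' => 'wink')
);
echo $result, "\n"; // smile hello (more in brackets wink) ok smile!
```

Note that replacement strings go through preg_replace, so a replacement containing a backslash or a $ followed by a digit would be read as a backreference.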
c4onastick Posted January 29, 2007

Oh me of little faith! Thanks effigy.

c4onastick Posted January 29, 2007

AH! I figured out why it didn't work for me. I used:

[code](?<=(?:^|...))[/code]

Which doesn't work, by the way... I think because it hides the alternation from the compiler in the lookaround.

[code](?<=^|...)[/code]

Does work.

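If you want to check this on your own setup, a quick sketch like the one below reports whether each form compiles. The helper name pattern_compiles is made up, and the result for the nested form may differ depending on the PCRE version bundled with your PHP, so no output is claimed for it.

```php
<?php
// Hypothetical helper: returns true if PCRE accepts the pattern.
// preg_match() returns false on a compile error; @ hides the warning it would emit.
function pattern_compiles($pattern)
{
    return @preg_match($pattern, '') !== false;
}

$topLevel = '/(?<=^|[\s\W]):\)/';      // alternation directly inside the lookbehind
$nested   = '/(?<=(?:^|[\s\W])):\)/';  // same alternation wrapped in (?:...)

var_dump(pattern_compiles($topLevel));
var_dump(pattern_compiles($nested)); // depends on the PCRE version in use
```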
redbullmarky Posted January 29, 2007 (Author)

fantastic replies from the both of you - cheers lads ;D

After some hacking away I ended up with a bit of a mix and match of both, and it's working perfectly.

Thanks again!
Mark

effigy Posted January 29, 2007

[quote author=c4onastick link=topic=124422.msg515918#msg515918 date=1170053139]AH! I figured out why it didn't work for me. I used:[code](?<=(?:^|...))[/code]Which doesn't work by the way... I think because it hides the alternation from the compiler in the lookaround.[code](?<=^|...)[/code]Does work.[/quote]

Interestingly enough, I tried the same pattern in Perl and it was interpreted as a variable length lookbehind. I was able to get around this by "inverting" the pattern to [tt](?:^|(?<=[\s\W]))[/tt]. This also works in PHP.

On a side note, it may be better to check for [tt]\s\W[/tt] before [tt]^[/tt], because a beginning of line anchor can only match once (unless you're in multi-line mode), so putting it second gives you fewer failed attempts in your alternation checks.
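Putting the "inverted" form and the ordering tip together might look like the sketch below. The sample string is made up, and the smilies and replacements are a subset of the ones in the earlier code.

```php
<?php
$find = array(':)', ':(', ':D');
$replace = array('--smile--', '--frown--', '--grin--');

$patterns = array();
foreach ($find as $smiley) {
    // Leading boundary: a non-word character behind us, or the start of the string.
    // The lookbehind is tried first, as suggested above, since ^ can only succeed once.
    $patterns[] = '/(?:(?<=[\s\W])|^)' . preg_quote($smiley, '/') . '(?=[\s\W]|\z)/';
}

$result = preg_replace($patterns, $replace, ':) Text:( :D');
echo $result, "\n"; // --smile-- Text:( --grin--
```

The :( glued to "Text" is left alone because the character before it is a word character, which is the behaviour asked for in the first post.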