hannylicious Posted May 11, 2011

Hey gang,

Probably a simple solution, but I'm having issues with the regex for this:

$str = ' --- ----- ------f-oo-oo----';
$str = preg_replace('/^(-|\s)+-(.*[a-zA-Z0-9])[^\-]-+$/' , '$2', $str);
// This should be 'f-oo-oo' now - but it produced 'f-oo-o'
echo $str;

In the example above I'm trying to get rid of all leading spaces/hyphens, and any trailing spaces/hyphens. My regex gets rid of the leading characters, but it chomps off the last character of the string I'm trying to get (i.e. f-oo-oo becomes f-oo-o).

The application I'm using this for is a bit more complex: I'm replacing spaces with hyphens in the titles of articles. Some of the hyphenated titles have leading hyphens as well as trailing hyphens. I've noticed that some of my matches still have trailing hyphens after this regex runs. I think this is because those are on a 'new line', and I'm not sure how to check for that in regex either.

As an added bonus, if anyone knows how to make the output only lower-case that would rock!!
JAY6390 Posted May 11, 2011

Since you're using PHP, you can forget the need for the regex at all, and just use

$str = trim($str, '- ');

in its place.
hannylicious (Author) Posted May 11, 2011

Great idea Jay! Unfortunately that will strip all the hyphens, won't it? I still need the ones in between the words to remain, preferably.
JAY6390 Posted May 11, 2011

Nope, trim() removes any of those characters from the start and end until it finds one that isn't in the list. It outputs f-oo-oo
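For anyone who wants to check that behaviour, here is a minimal sketch. The second example, with its trailing newline and the longer character list, is an assumption added to cover the 'new line' case from the original question; it is not code from the thread.

```php
<?php
// trim() with a character list strips only from both ends and stops
// at the first character that is not in the list.
$str = ' --- ----- ------f-oo-oo----';
$trimmed = trim($str, '- ');
echo $trimmed, "\n"; // f-oo-oo (inner hyphens are untouched)

// Assumption: some titles end in a newline, so tab, CR and LF are added
// to the list. Double quotes are needed so "\n" is a real newline.
$withNewline = "--f-oo-oo--\n";
$trimmedNewline = trim($withNewline, "- \t\r\n");
echo $trimmedNewline, "\n"; // f-oo-oo
```

With only '- ' in the list, a trailing newline would stop the trim early and leave the hyphens before it in place, which may be what was happening with the leftover trailing hyphens.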
JAY6390 Posted May 11, 2011

Also, if you want the output lowercase, use strtolower()
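Putting the two suggestions together for the article-title case might look roughly like this. The function name slugify_title() and the exact order of steps are hypothetical, chosen only for illustration; the thread does not show the real application code.

```php
<?php
// Hypothetical helper: the name and steps are illustrative assumptions.
function slugify_title($title)
{
    // Replace every space with a hyphen, as described in the question.
    $slug = str_replace(' ', '-', $title);
    // Strip hyphens and whitespace (including newlines) from both ends only.
    $slug = trim($slug, "- \t\r\n");
    // Lower-case the result.
    return strtolower($slug);
}

echo slugify_title(" Foo Bar Baz \n"), "\n"; // foo-bar-baz
```

Note that strtolower() works on single bytes, so titles with non-ASCII letters would probably need mb_strtolower() from the mbstring extension instead.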
hannylicious (Author) Posted May 11, 2011

You're completely right Jay, great fix. Thanks a ton for those two fixes, they work perfectly and are so much simpler! That makes life so much easier!!
JAY6390 Posted May 11, 2011

No problem
.josh Posted May 11, 2011

Yep, trim() is definitely the better route. However, to answer your question about preg_replace()...

$str = ' --- ----- ------f-oo-oo----';
$str = preg_replace('/^(-|\s)+-(.*[a-zA-Z0-9])[^\-]-+$/' , '$2', $str);
// This should be 'f-oo-oo' now - but it produced 'f-oo-o'
echo $str;

preg_replace() has to match your pattern as a whole in order to make a replacement. Your pattern says:

Start at the beginning of the string, match one or more hyphens or "whitespace" characters (the group repeats, so it only keeps the last char it matched), followed by a hyphen, followed by (and capture) 0 or more of any character followed by one number or letter (either case) (end of 2nd captured group). Then match one of anything that is not a hyphen, followed by one or more hyphens after that, followed by end of string.

If you run that against preg_match_all(), you will see the following matches:

Array
(
    [0] => Array ( [0] => --- ----- ------f-oo-oo---- )
    [1] => Array ( [0] => - )
    [2] => Array ( [0] => f-oo-o )
)

Your first piece ^(-|\s)+ matches the leading run of spaces and hyphens, everything before the "f" except the very last hyphen. Your captured group (-|\s) only holds the last character of that run, which is a "-".

Then you have a literal "-" after that, which matches the final "-" before the "f".

Next, you have (.*[a-zA-Z0-9])

The .* will greedily match everything until the very end of the string, but then give up characters until it can match the rest of your pattern. Well, the next thing in your pattern is your [a-zA-Z0-9] character class, which matches a letter or number, so at first .* would take "f-oo-o" and the character class would take that last "o".

But wait... then you have a negated character class, [^\-], that says match anything that is not a hyphen. So the .* has to back up one more time and give up the 2nd to last "o" as well. Now the .* matches "f-oo-", the [a-zA-Z0-9] matches the 2nd to last "o", and the [^\-] matches the last "o".

So the 2nd capture ($2 - the part wrapped in the 2nd set of parentheses) in total is f-oo-o. The last "o" was matched by [^\-], which sits outside the capture, and that is why it gets chopped off.

Finally you have -+$ which matches one or more hyphens and then end of string, which matches the trailing "----".

Soo... if you want to do it the regex way, a pattern more like this will work:

$str = ' --- ----- ------f-oo-oo----';
$str = preg_replace('/^(-|\s)+|(-|\s)+$/' , '', $str);
// This is 'f-oo-oo' now
echo $str;

This pattern says: Start at beginning of string and match one or more hyphens or whitespace characters, OR match one or more hyphens or whitespace characters followed by end of string, and replace with "" (nothing).
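The walkthrough above can be checked with a short script. This is only a sketch: it uses preg_match() to read the two capture groups directly, and it spells the replacement pattern with the character class [-\s], which is an assumed equivalent of (-|\s) rather than the exact pattern posted.

```php
<?php
$str = ' --- ----- ------f-oo-oo----';

// Original pattern: [^\-] consumes the final "o", so that character
// ends up outside capture group 2.
$original = '/^(-|\s)+-(.*[a-zA-Z0-9])[^\-]-+$/';
preg_match($original, $str, $m);
echo $m[1], "\n"; // -       (a repeated group keeps its last iteration)
echo $m[2], "\n"; // f-oo-o

// Anchored alternation: strip a run of hyphens/whitespace at the start
// OR at the end; everything in between is left alone.
$fixed = preg_replace('/^[-\s]+|[-\s]+$/', '', $str);
echo $fixed, "\n"; // f-oo-oo
```

Because \s also matches newlines, this version removes trailing hyphens even when the string ends in a line break.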
hannylicious (Author) Posted May 11, 2011

Crayon,

Thanks so much for the in-depth response. The trim() worked out really nicely for what I wanted to do. I really appreciate this info too, as I have been struggling to come to grips with regex and how it works, and this gives me a much clearer explanation!

Thanks! I'm really glad I've come across this forum, you guys are the best! Hopefully after some time I'll be able to help others in the same manner!