
preg_replace() - Cuts off last character?


hannylicious


Hey gang,

 

Probably a simple solution, but I'm having issues with the regex for this -

 

$str = '  --- -----     ------f-oo-oo----';
$str = preg_replace('/^(-|\s)+-(.*[a-zA-Z0-9])[^\-]-+$/' , '$2', $str);
// This should be 'f-oo-oo' now - but it produced 'f-oo-o'
echo $str;

 

In the example above I'm trying to get rid of all leading spaces/hyphens and any trailing spaces/hyphens. My regex gets rid of the leading ones, but it chomps off the last character of the string I'm trying to keep (i.e. f-oo-oo becomes f-oo-o).

 

The application I'm using this for is a bit more complex - I'm replacing spaces with hyphens in the titles of articles.  Some of the hyphenated titles have leading hyphens as well as trailing hyphens.  I've noticed that some of the strings still have trailing hyphens after this regex runs. I think this is because those hyphens are on a 'new line', and I'm not sure how to check for that in regex either.

 

As an added bonus, if anyone knows how to make the output only lower-case that would rock!!

 

 


Yep, trim() is definitely the better route.  However, to answer your question about preg_replace()...

 

$str = '  --- -----     ------f-oo-oo----';
$str = preg_replace('/^(-|\s)+-(.*[a-zA-Z0-9])[^\-]-+$/' , '$2', $str);
// This should be 'f-oo-oo' now - but it produced 'f-oo-o'
echo $str;

 

preg_replace() has to match your pattern as a whole in order to make a replacement.  Your pattern says: Start at the beginning of the string, match one or more hyphens or "whitespace" characters (but only the last matched char is captured, because a repeated group keeps just its final iteration), followed by a hyphen, followed by (and capture) 0 or more of any character followed by one number or letter of either case (end of 2nd captured group). Then match one character that is not a hyphen, followed by one or more hyphens after that, followed by end of string.  If you run that through preg_match_all(), you will see the following matches:

 

Array
(
    [0] => Array
        (
            [0] =>   --- -----     ------f-oo-oo----
        )

    [1] => Array
        (
            [0] => -
        )

    [2] => Array
        (
            [0] => f-oo-o
        )

)

 

Your first piece ^(-|\s)+ matches this part (marked with ^):

 

  --- -----     ------f-oo-oo----
^^^^^^^^^^^^^^^^^^^^^

 

but your captured group (-|\s) only holds the last marked "-", because a repeated group keeps just its final iteration

 

Then you have a literal "-" after that, which matches the final "-" before "f" (marked with ^):

 

  --- -----     ------f-oo-oo----
                     ^

 

Next, you have (.*[a-zA-Z0-9])  The .* will greedily match everything until the very end of the string, but then give up characters until it can match the rest of your pattern.  Well the next thing in your pattern is your [a-zA-Z0-9] character class which matches a letter or number, so .* will match the part marked with ~, and the character class will match that last "o" (marked with ^):

 

  --- -----     ------f-oo-oo----
                      ~~~~~~^

 

But wait... then you have a negated character class that says match anything that is not a hyphen, so the .* has to back up one more time and give up the 2nd to last "o" as well. Now .* matches the part marked with ~, the [a-zA-Z0-9] matches the 2nd to last "o" (marked with ^), and the [^\-] matches the last "o" (marked with !):

 

  --- -----     ------f-oo-oo----
                      ~~~~~^!

 

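So the 2nd capture ($2 - the part wrapped in the 2nd set of parentheses) in total is f-oo-o. It comes up one character short because the last "o" was matched by [^\-], which sits outside the capture group.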

 

Finally you have -+$ which matches one or more hyphens and then end of string, which matches the ending hyphens (marked with ^):

 

  --- -----     ------f-oo-oo----
                             ^^^^
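
If you want to check that walkthrough yourself, here's a minimal sketch that runs the original pattern through preg_match() and dumps what each group captured. The variable names ($pattern, $m) are just for illustration.

```php
<?php
// Original pattern and input from the question, unchanged.
$pattern = '/^(-|\s)+-(.*[a-zA-Z0-9])[^\-]-+$/';
$str = '  --- -----     ------f-oo-oo----';

// preg_match() fills $m with the full match followed by each captured group.
if (preg_match($pattern, $str, $m)) {
    // $m[1] holds only the LAST character matched by the repeated group (-|\s)+.
    var_dump($m[1]); // string(1) "-"
    // $m[2] stops one character short because [^\-] consumes the final "o".
    var_dump($m[2]); // string(6) "f-oo-o"
}
```

Since the replacement is '$2', preg_replace() swaps the whole matched string for that second group, which is where the missing "o" comes from.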

 

 

Soo...if you want to do it the regex way, a pattern more like this will work:

 

$str = '  --- -----     ------f-oo-oo----';
$str = preg_replace('/^(-|\s)+|(-|\s)+$/' , '', $str);
// This now produces 'f-oo-oo'
echo $str;

 

This pattern says: Start at beginning of string and match one or more hyphens or whitespace characters, OR match one or more hyphens or whitespace characters followed by end of string, and replace with "" (nothing). 
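
Here's a rough sketch of that same idea next to the trim() route, plus the lower-case bonus. A few things here are my own assumptions and not from the posts above: the sample string has an upper-case F and a trailing newline added so both extras get exercised, [-\s] is used as a shorter equivalent of (-|\s), and strtolower() is assumed to be good enough because the titles are plain ASCII (mb_strtolower() is the usual choice if they are not).

```php
<?php
// Sample input: the string from the question, with an upper-case "F" and a
// trailing newline added for illustration (both are assumptions).
$str = "  --- -----     ------F-oo-oo----\n";

// Regex route: strip runs of hyphens/whitespace at either end of the string.
// [-\s] matches the same characters as (-|\s), without a capture group.
// \s also matches newlines, so hyphens followed by a line break are removed.
$viaRegex = preg_replace('/^[-\s]+|[-\s]+$/', '', $str);

// trim() route: the second argument lists the characters to strip from both
// ends (hyphen, space, tab, newline, carriage return).
$viaTrim = trim($str, "- \t\n\r");

// Lower-case the result (assumes ASCII-only titles).
echo strtolower($viaRegex), "\n"; // f-oo-oo
echo strtolower($viaTrim), "\n";  // f-oo-oo
```

Both routes leave the hyphens inside the title alone, since they only look at the two ends of the string.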

 

 


Crayon,

Thanks so much for the in-depth response.  The trim() worked out really nice for what I wanted to do.

 

I really appreciate this info too as I have been struggling to come to grips with regex and how it works and this gives me a much more clear explanation!  Thanks!

 

I'm really glad I've come across this forum, you guys are the best!  Hopefully after some time I'll be able to help others in the same manner!
