Preg_replace not stripping exclamation marks

Picabrillo · February 8, 2007

<?php
$val = "Normal (and bold) text!!";
$newval = preg_replace('/[^A-z0-9 -\']/', '', $val);
echo $newval;
// Outputs "Normal and bold text!!"
?>

Why is it doing this? Have I missed something? It should only be allowing what I've asked for in the pattern - I don't see any exclamation marks being allowed.

Thanks in advance

effigy · February 8, 2007

The sequence space, hyphen, \' creates a range of characters from the space to the single quote, which allows !, ", #, $, %, and & in addition to the two endpoints. If you want to include a literal hyphen, make sure it's either the first character in the character class (best practice) or escaped.

The same applies to A-z, which includes characters you may not expect.

For further reference see the ASCII table.
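
To make the range concrete, here is a minimal sketch that lists the characters between the space and the single quote by ASCII code point and then runs the original pattern against them. The helper name chars_in_range and the test string are made up for illustration; only the pattern comes from the thread.

```php
<?php
// Hypothetical helper (not from the thread): list every character that a
// character-class range "from-to" covers, going by ASCII code points.
function chars_in_range(string $from, string $to): string
{
    return implode('', array_map('chr', range(ord($from), ord($to))));
}

// The range in the original pattern runs from space (32) to ' (39).
$allowed = chars_in_range(' ', "'");
echo $allowed, "\n"; // prints:  !"#$%&'  (the first character is the space)

// The original negated class keeps everything in that range, so only the
// parentheses (40 and 41, just outside the range) are removed here.
$survivors = preg_replace('/[^A-z0-9 -\']/', '', '(!"#$%&)');
echo $survivors, "\n"; // prints: !"#$%&
```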

Picabrillo · February 8, 2007

Thanks effigy. Replaced '/[^A-z0-9 -\']/' with '/[^A-z0-9-\'\s]/' and it's working fine now.

effigy · February 8, 2007

Picabrillo wrote: "Thanks effigy. Replaced '/[^A-z0-9 -\']/' with '/[^A-z0-9-\'\s]/' and it's working fine now."

I wouldn't do that; it can be confusing.

Did you look into the A-z range? That's going to allow [, ], \, ^, _, and `.
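
A short sketch of the A-z problem, assuming plain ASCII input. The sample string is invented for illustration; it compares the loose range with the explicit A-Za-z ranges.

```php
<?php
// Code points strictly between 'Z' (90) and 'a' (97).
$extras = implode('', array_map('chr', range(ord('Z') + 1, ord('a') - 1)));
echo $extras, "\n"; // prints: [\]^_`

// Illustrative input containing each of those six characters.
$input = 'a[b]c\\d^e_f`g';

// A-z treats the punctuation as allowed, so nothing is stripped.
$loose = preg_replace('/[^A-z]/', '', $input);
echo $loose, "\n"; // prints: a[b]c\d^e_f`g

// A-Za-z allows letters only.
$strict = preg_replace('/[^A-Za-z]/', '', $input);
echo $strict, "\n"; // prints: abcdefg
```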

Picabrillo · February 9, 2007

I would be interested to know why it seems confusing. To me it looks fine, but then again I should be getting into the practice of making it understandable to others. What would you have done differently, effigy?

As for the A-z range, I see what you mean. Thanks for the ASCII link; that was much better than some I've been to. I assume that A-Za-z would work better?

obsidian · February 9, 2007

Picabrillo wrote: "I would be interested to know why it seems confusing. To me it looks fine, but then again I should be getting into the practice of making it understandable to others. What would you have done differently, effigy?

As for the A-z range, I see what you mean. Thanks for the ASCII link; that was much better than some I've been to. I assume that A-Za-z would work better?"

I would personally go with one range of characters and simply make it case insensitive for readability, but effigy may have an even better idea:

<?php
$newval = preg_replace('|[^-a-z\d \']|i', '', $val);
?>

effigy · February 9, 2007

Picabrillo wrote: "I would be interested to know why it seems confusing.... What would you have done differently, effigy?"

The best practice I suggested: placing the hyphen first in the list. It can be confusing for others because you have back-to-back ranges with the "loose" hyphen mixed in. If someone has to change the pattern, they might create a subtle bug without realizing it.

Picabrillo wrote: "As for the A-z range, I see what you mean. Thanks for the ASCII link; that was much better than some I've been to. I assume that A-Za-z would work better?"

Yes.

obsidian wrote: "I would personally go with one range of characters and simply make it case insensitive for readability, but effigy may have an even better idea:"

Actually, for such a small, simple pattern, A-Za-z without the /i would be best. This saves the engine from doing the extra work of case conversion.
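
Putting both suggestions together (hyphen first, explicit A-Za-z, no /i), the pattern might look like the sketch below. The function name strip_disallowed and the second test string are hypothetical; the allowed set (letters, digits, space, hyphen, apostrophe) is inferred from the patterns posted above, and it keeps a literal space rather than \s.

```php
<?php
// Hypothetical helper name; the allowed characters are an assumption based
// on the patterns in this thread: letters, digits, space, hyphen, apostrophe.
function strip_disallowed(string $val): string
{
    // The hyphen comes first so it is literal, and A-Za-z replaces A-z.
    return preg_replace('/[^-A-Za-z0-9 \']/', '', $val);
}

$result = strip_disallowed('Normal (and bold) text!!');
echo $result, "\n"; // prints: Normal and bold text

$kept = strip_disallowed("Rock-n-roll isn't [dead]_42!");
echo $kept, "\n"; // prints: Rock-n-roll isn't dead42
```

With the hyphen at the front, someone who later appends another character or range to the class cannot accidentally turn it into a range endpoint.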

Picabrillo · February 9, 2007

Thanks guys, that was really helpful.

One more thing: other than obviously typing less, are there general performance benefits to using shorthand character classes? From the impression I'm getting, the gains are small until you start dealing with more complex patterns (validating e-mail addresses or URIs, maybe). Again, I would guess it's as much to do with whether it's confusing to read or not.

effigy · February 9, 2007

Yes and no. Less typing means fewer chances to make an error and also less code to look at. These are huge benefits when trying to minimize human error. (And don't forget the intuitive negated shorthands, \D and \S for instance.) The non-typing benefits come into play when Unicode and locales are considered; however, if you know you're working with Unicode, I think it's better to use the \p{...} properties.
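
As a rough illustration of the \p{...} suggestion, the sketch below compares an ASCII-only class with one built on \p{L} (any letter) and \p{N} (any number) under the /u modifier. It assumes UTF-8 input and a PCRE build with Unicode property support; the sample string is invented, and the accented letters are written as \u{...} escapes so the result does not depend on the file's encoding.

```php
<?php
// "Café (crème)!!" written with escapes so the string is UTF-8 regardless
// of how this file is saved.
$val = "Caf\u{E9} (cr\u{E8}me)!!";

// ASCII-only class: the bytes of the accented letters are stripped too.
$ascii = preg_replace('/[^-A-Za-z0-9 \']/', '', $val);
echo $ascii, "\n"; // prints: Caf crme

// Unicode properties with /u: any letter or number is kept.
$unicode = preg_replace('/[^-\p{L}\p{N} \']/u', '', $val);
echo $unicode, "\n"; // prints: Café crème
```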
