
[SOLVED] Help with Apostrophes in Regular Expressions



Hi guys,

 

I have been desperately working on a regexp that should match only letters, spaces, apostrophes and dashes (i.e. John O'Reilly, John-Smith, etc.). The regular expression code now follows:

 

case "letters_only":
    if (isset($fields[$field_name]) && preg_match("/[^a-z-\'\\s]+\$/i", $fields[$field_name])  )
    //This statement will not work!
    //if (isset($fields[$field_name]) && preg_match("/^[^a-z-\'\\s]+\$/i", $fields[$field_name])  )
        $errors[] = $error_message;

 

The problem shows up when I enter a string such as 2'Hare: it is accepted as valid. What is wrong with my regexp?

 

Thanks.

Something along these lines?

 

$str = array('John O\'Reilly', 'John-Smith', 'John Smith', 'Jaques Rémi', 'Jill_Henning');
foreach($str as $val){
    echo $val . ' => ';
    echo (preg_match('#^[a-z\' -]+$#i', $val))? "Correct format<br />\n" : "Incorrect format<br />\n";
}

 

Output:

John O'Reilly => Correct format
John-Smith => Correct format
John Smith => Correct format
Jaques Rémi => Incorrect format
Jill_Henning => Incorrect format

 

EDIT:

What is wrong with my regexp?

 

A few things you need to consider... If you want to detect a dash as a literal within a character class, place it as either the very first character or the very last. Otherwise, if it sits somewhere in the middle, you must escape it: \-. If you don't, it becomes a range, much like 0-9 means a range from, well, zero through nine.
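To make the dash rule concrete, here is a minimal sketch (the variable names are made up for illustration) comparing a dash used as a range, an escaped dash in the middle, and an unescaped dash placed last:

```php
<?php
// Dash between two characters forms a range: [a-z] means "a through z".
$rangeHitsDash   = preg_match('/^[a-z]$/', '-');   // '-' is not in a..z

// Escaped dash in the middle is a literal: the class is 'a', '-', 'z'.
$escapedHitsDash = preg_match('/^[a\-z]$/', '-');
$escapedHitsM    = preg_match('/^[a\-z]$/', 'm');  // no longer a range

// Dash placed last needs no escape and is also a literal.
$lastHitsDash    = preg_match('/^[az-]$/', '-');

var_dump($rangeHitsDash, $escapedHitsDash, $escapedHitsM, $lastHitsDash);
// int(0), int(1), int(0), int(1)
```

Putting the dash last, as in my pattern above, is usually the easiest to read.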

 

You don't have to double the backslash in \s, as \s is already recognized as a shorthand character class for whitespace characters (speaking of which, if you do need to check for all whitespace characters, simply replace the literal space in my pattern with \s).

 

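Nor do you need to escape the final $ character. In a single-quoted pattern, \$ does not imply 'end of string' but looks for a literal dollar sign (not what you want in this case). Inside your double-quoted string, PHP turns \$ back into a plain $ before the regex engine sees it, which is one more reason to write patterns in single quotes.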
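For what it's worth, here is a small sketch of what the OP's double-quoted pattern looks like once PHP has processed the string, and why 2'Hare slips through. The variable names are mine, and this is only my reading of the posted snippet, not of the full script:

```php
<?php
// The OP's pattern exactly as written, in a double-quoted PHP string.
$op = "/[^a-z-\'\\s]+\$/i";

// PHP resolves "\\" to "\" and "\$" to "$" before PCRE sees the pattern,
// so the engine receives a plain end-of-string anchor.
echo $op, "\n"; // prints /[^a-z-\'\s]+$/i

// Because of that anchor, only a trailing run of invalid characters is caught.
$leading  = preg_match($op, "2'Hare"); // 0: the bad "2" is not at the end
$trailing = preg_match($op, "Hare2");  // 1: the bad "2" is at the end

// Without the anchor, an invalid character anywhere in the string is flagged.
$anywhere = preg_match('/[^a-z\'\s-]/i', "2'Hare"); // 1

var_dump($leading, $trailing, $anywhere);
```

Since the error is only pushed when preg_match() returns 1, a value with the bad character at the front produces no error at all.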

Indeed.. granted, I was going by what the OP stated:

regexp that should match only letters, spaces, apostrophes and dashes

So my pattern does do this...We could however easily tighten things up a bit...

 

'#^[a-z]+[ -][a-z]\'?[a-z]+$#i'

 

Granted, since this is all case insensitive, it will accept stuff like 'john-smith', or 'John o'RiLeY'... so I'm not sure how much fine-tuned control the OP is looking for. If we want to get even more strict and insist that the first letters of the first and last name be caps (and, in the event of an apostrophe in the last name, that the next character afterwards be capitalized as well), and we want to do this all in regex, we can make use of conditionals and go with something like this:

 

$str = array('John O\'Reilly', 'John O\'reilly', 'John-Smith', 'john-smith', 'John smith', 'John Smith', 'Jaques Rémi', 'Jill_Henning', 'j-o-h-n', '\'\'j\'\'');
foreach($str as $val){
    echo $val . ' => ';
    echo (preg_match('#^[A-Z][a-z]+[ -][A-Z](\')?(?(1)[A-Z])[a-z]+$#', $val))? "Correct format<br />\n" : "Incorrect format<br />\n";
}
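To see how the three patterns from this thread differ, here is a rough side-by-side sketch. The labels 'loose', 'tight' and 'strict' are just names I made up for the comparison:

```php
<?php
// The three patterns discussed in this thread, from loosest to strictest.
$patterns = array(
    'loose'  => '#^[a-z\' -]+$#i',
    'tight'  => '#^[a-z]+[ -][a-z]\'?[a-z]+$#i',
    'strict' => '#^[A-Z][a-z]+[ -][A-Z](\')?(?(1)[A-Z])[a-z]+$#',
);
$samples = array('John O\'Reilly', 'John O\'reilly', 'john-smith', 'j-o-h-n', '\'\'j\'\'');

// Collect one true/false result per sample and pattern.
$results = array();
foreach ($samples as $val) {
    foreach ($patterns as $name => $pattern) {
        $results[$val][$name] = (preg_match($pattern, $val) === 1);
    }
}

// Print a simple table: sample, then yes/no for each pattern.
foreach ($results as $val => $row) {
    echo $val . ' =>';
    foreach ($row as $name => $ok) {
        echo ' ' . $name . ':' . ($ok ? 'yes' : 'no');
    }
    echo "\n";
}
```

The loose pattern lets 'j-o-h-n' and ''j'' through, the tight one rejects those but still accepts 'john-smith', and the strict one only accepts 'John O'Reilly' out of these five.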

 

It really depends on how deep the rabbit hole goes I suppose...

Hmm.. I don't use PHP 4 (locally nor remotely). Something tells me you should switch to a provider that uses PHP 5. There have been enough advancements between those versions.

 

I don't see why numbers would be accepted however, as the pattern is only using a-z (case insensitive if you are using one of the code segments with the i modifier after the closing delimiter), space, dash and apostrophe.

I have now tried your solution in a separate PHP script on the remote server with PHP 4 and it works. It must be my scripts. I should preface this by saying that I am an amateur PHP developer, presently developing a PayPal-like front end onto a back-end banking server for our parish. I would greatly appreciate it if one of you would look through the scripts to see whether some variable or erroneous quoting is causing the First Name and Last Name fields to pass even if I type 2233John etc.

 

I am attaching the scripts for debugging help. I greatly appreciate all the input thus far.

 

Many thanks.

 

[attachment deleted by admin]

Sorry Mate... I don't have time to sift through large chunks of code. Please understand that while these forums are there to help people out, I don't think the spirit of these forums includes being a place to dump large amounts of code with the expectation that someone will take a considerable amount of their time to problem-solve it (not trying to sound difficult or rude here). If members are willing, all the more power to you.

 

But from a quick glance, I can see so many preg and split statements going on that something tells me your code structure needs to be considerably simplified. (A note on split: as the manual states, preg_split is often faster, and in the event you don't need a regex to split something, explode is often the faster choice still.)
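As a quick illustration of that note, here is a sketch with made-up input strings. Keep in mind that split() itself was deprecated in PHP 5.3 and removed in PHP 7, so these two are the ones to reach for anyway:

```php
<?php
// Hypothetical input, for illustration only.
$csv = 'apples,pears,plums';

// Fixed delimiter: explode() does the job without a regex engine.
$fixed = explode(',', $csv);

// Variable delimiter (commas and/or whitespace): preg_split() handles it.
$loose = preg_split('/[\s,]+/', 'apples, pears  plums');

print_r($fixed); // apples, pears, plums
print_r($loose); // apples, pears, plums
```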

 

Just looking through your validate_form.php, I can see a few regex problems. For instance, in your case:donations you have the pattern: /[^1|2|3]/

What this is saying is: anything that is not a 1, or a pipe, or a 2, or a pipe, or a 3. If you just want to say anything that is not a 1, 2 or 3, you can use something like [^123]. This is a character class, and a character class looks at a single character in the string and checks whether it matches (or doesn't match) any character listed within it.

 

If your intention is to say, from the start of the string, look for either a 1,2 or 3, your pattern could be: /^[123]/.
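A short sketch of the difference, using test strings I picked for illustration:

```php
<?php
// Inside a character class the pipe is just another literal character.
$withPipes = '/[^1|2|3]/';
$plain     = '/[^123]/';

// A lone pipe is "allowed" by the first class but rejected by the second.
$pipeVsWithPipes = preg_match($withPipes, '|'); // 0: nothing outside the class
$pipeVsPlain     = preg_match($plain, '|');     // 1: '|' is not 1, 2 or 3

// Anchored positive class: the string must start with 1, 2 or 3.
$startsOk  = preg_match('/^[123]/', '2x'); // 1
$startsBad = preg_match('/^[123]/', 'x2'); // 0

var_dump($pipeVsWithPipes, $pipeVsPlain, $startsOk, $startsBad);
```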

 

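In case "letters_only2": you don't need to double the backslash before the s in the pattern /^[a-zA-Z\\s\'.-]+/i, nor do you need to list A-Z, as you already have the i modifier after the closing delimiter, which makes the match case insensitive... Note also that without a closing $, this pattern only checks the start of the string.

It could simply be /^[a-z\s\'.-]+/i (add a $ before the closing delimiter if the whole value must consist of these characters).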

 

In case "letters_only3": you have the pattern /[^a-zA-Z \.]/ You can a) add an 'i' modifier after the closing delimiter to check for upper and lowercase, and b) drop the backslash before the dot, because inside a character class the dot is treated as a literal. So your pattern could be: /[^a-z .]/i
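If you want to convince yourself about the dot, here is a tiny sketch (the sample strings are made up):

```php
<?php
// Outside a class the dot matches any character (except newline).
$dotOutside = preg_match('/^.$/', 'a');   // 1

// Inside a class it is a literal dot, escaped or not.
$dotInside  = preg_match('/^[.]$/', 'a'); // 0
$dotEscaped = preg_match('/^[\.]$/', 'a'); // 0
$dotLiteral = preg_match('/^[.]$/', '.'); // 1

// The simplified letters_only3 pattern: 1 means an invalid character was found.
$clean = preg_match('/[^a-z .]/i', 'J. Smith'); // 0
$dirty = preg_match('/[^a-z .]/i', 'J. Sm1th'); // 1

var_dump($dotOutside, $dotInside, $dotEscaped, $dotLiteral, $clean, $dirty);
```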

 

In case "letters_only4":

You have the pattern /[^a-z'0-9-.()?:,!\"\\s]+\$/i

As I noted earlier within the thread, if you want to detect a dash within your character class, you need to either put it as the very first (or last) character or escape it, but you have 0-9-. which reads like a messy range (PCRE treats a dash directly after a range as a literal, but it is easy to misread). Once again, don't double the backslash in \s. That dollar sign at the end: I am assuming you mean end of string? If so, you don't escape that either, so your pattern could be:

/[^a-z0-9'.()?:,!\"\s-]+$/i

 

I don't think any of the suggestions above will solve your root problem, but cleaning things up certainly won't hurt.

All in all, I have a feeling your code can be greatly simplified, as there seems to be simply too much stuff going on to look efficient.

Thanks for your timely input. I have now cleaned up my code (i.e. fixed the faulty character class definitions, added the explode function, etc.).

 

FYI, the real reason the regexp was not working on the remote server was that magic_quotes_gpc was enabled in the remote php.ini file but not in my local PC's php.ini.

 

 

 

 

I went about solving the problem by adding a small php.ini file to the directory housing the php scripts. Everything finally started working!!!! :-)

 

ADDED THIS TO MY DIRECTORY

 

magic_quotes_runtime=off

magic_quotes_gpc=off

register_globals=on ; only as an example
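For anyone who cannot change php.ini, here is a rough sketch of how magic_quotes_gpc alters what reaches a validator. It assumes the setting behaves like addslashes() on submitted values, which is how the PHP manual describes it; the sample name is made up, and this does not reproduce the exact symptom described above, it only shows why the setting matters:

```php
<?php
// Simulate what magic_quotes_gpc did to a submitted value.
$submitted = "O'Reilly";
$quoted    = addslashes($submitted); // O\'Reilly

// The "letters, spaces, apostrophes and dashes" pattern from earlier in the thread.
$pattern = '#^[a-z\' -]+$#i';

$rawOk      = preg_match($pattern, $submitted);             // 1
$quotedOk   = preg_match($pattern, $quoted);                // 0: the backslash is not allowed
$restoredOk = preg_match($pattern, stripslashes($quoted));  // 1

var_dump($rawOk, $quotedOk, $restoredOk);
```

Running stripslashes() on the input before validating is one way to get the same behaviour on both servers while the setting is still on.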

Since it's your thread, *I think* only you will have access to it.. It's not just below the bottom breadcrumb navigation 'PHP Freaks Forums > PHP Coding > PHP Help > PHP Regex > Topic: Help with Apostrophes in Regular Expressions'? No big deal if you don't find it... Was more of a friendly reminder.
