
preg_match error


kartul


Hello all!

I'm trying to create a registration and login system for my page, and I want usernames to contain only a-zA-Z0-9._- characters. So I use preg_match in an if statement to check that (is this the right way?), but it gives me an error. I googled but found no answer.

Here is the error:

Warning: preg_match() [function.preg-match]: Compilation failed: range out of order in character class at offset 13 in C:\wamp\www\i-have-a-secret\index.php on line 18

 

And the if statement that produces the error message:

$username = (isset($_POST['username']) && strlen($_POST['username']) <= 25 && strlen($_POST['username']) >= 2 && preg_match('([a-zA-Z0-9._- ]+)', $_POST['username'])) 
        ? $filter->process($_POST['username']) // just returns mysql safe string
        : false;


Now, the error is gone, thank you! *sits and thinks for a minute* but it doesn't really solve what I was hoping for...

 

I have a new question now: how do I make the if statement fail when the username contains characters other than those in this regular expression? Is there some kind of function that I don't know of? Or what is the right way to do it?

 


Try adding '^', which inside the brackets basically means "not":

'([^a-zA-Z0-9._ -]+)'

Wow that did the trick! To my understanding, the ^ means beginning of a string. But I guess in some context it's not, lol.

Thanks a lot man!

Yes it does, when it's not in the brackets.  I'm not sure of the technical terms, but here is a good tutorial:

http://www.phpfreaks.com/tutorial/regular-expressions-part1---basic-syntax

 

How could I have missed it.

Thank you!


Just in case things aren't clear, let's go over the fixes offered.

 

Fixing the "range out of order in character class" error caused by [a-zA-Z0-9._- ] (since that is the only character class in the regex!). 

 

Ranges are things like a-z, which covers the characters from a through z (so, the lowercase alphabet), and 0-9.  Breaking apart your character class, there are in fact four ranges specified: 1. a-z, 2. A-Z, 3. 0-9 and 4. _-<space>, where <space> is a literal space character.  It is this final, accidental range that causes the problem.

 

It is worth going off on a bit of a tangent here to mention that the start and end characters of a range must be in order, meaning that z-a is not valid.  Take a look at a table of ASCII characters and note that the ASCII values of the two range characters must go from low to high. Taking the values for underscore and space, 95 and 32 respectively, we can see that the order is incorrect, just like z-a.
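If you would rather not look up an ASCII table, PHP's ord() gives the same numbers. This is only a small sketch to make the ordering visible; the variable names are mine, not from the original code.

```php
<?php
// ord() returns the byte value of a character. PCRE compares these values
// when it checks that a range runs from low to high.
$underscore = ord('_');
$space      = ord(' ');

// "_- " asks for a range from underscore down to space, which runs backwards,
// in the same way that z-a does.
$backwards = $underscore > $space;

printf("_ = %d, space = %d, backwards = %s\n", $underscore, $space, $backwards ? 'yes' : 'no');
```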

 

If you were to have written the character class as [a-zA-Z0-9. -_], there would have been no error message because it is perfectly okay to have a range of "space to underscore"... however (again, take a look at the ASCII table) that would allow matching of any character between space and underscore, things like !, %, =, and @!
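To see the three spellings side by side, here is a rough sketch that tests each one against a single % character. I have used / as the delimiter rather than the parentheses from the original post, and the array keys are just labels I made up.

```php
<?php
// Three spellings of the character class, each wrapped in "/" delimiters.
$patterns = [
    'broken'  => '/[a-zA-Z0-9._- ]/', // "_- " is a backwards range and fails to compile
    'toowide' => '/[a-zA-Z0-9. -_]/', // " -_" compiles, but spans space through underscore
    'fixed'   => '/[a-zA-Z0-9._ -]/', // a trailing "-" is a literal hyphen
];

$results = [];
foreach ($patterns as $name => $pattern) {
    // @ silences the compile warning; preg_match() returns false for a bad pattern,
    // 1 when the pattern matches, and 0 when it does not.
    $results[$name] = @preg_match($pattern, '%');
}

var_dump($results);
```

The "toowide" entry is the one to watch out for: it raises no warning, yet it happily accepts the % character.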

 

Negating the character class

 

First, "negated character class" is the technical term for [^…].  Negating the class changed the regex from asking, "can I match a sequence of one or more characters from the character class, anywhere within the subject string?" to "can I match a sequence of one or more characters that are not from the character class, anywhere in the subject string?"  The idea works, but it is also possible to make life simpler.
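The difference between the two questions is easiest to see with an input that mixes good and bad characters. A quick sketch, where 'bad*name' and 'good_name' are example inputs I picked for illustration:

```php
<?php
$allowed    = '/[a-zA-Z0-9._ -]+/';  // matches if ANY run of allowed characters exists
$disallowed = '/[^a-zA-Z0-9._ -]+/'; // matches if ANY run of other characters exists

// The first pattern is satisfied by "bad" alone, so it cannot reject the name.
$hasAllowed = preg_match($allowed, 'bad*name');

// The negated pattern finds the "*", which is what flags the name as invalid.
$hasDisallowed = preg_match($disallowed, 'bad*name');

// A clean name gives the negated pattern nothing to match.
$cleanHasDisallowed = preg_match($disallowed, 'good_name');

var_dump($hasAllowed, $hasDisallowed, $cleanHasDisallowed);
```

Note that with the negated class, a match means the username is bad, so the result has to be read the other way around in the if statement.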

 

Currently you have [^a-zA-Z0-9._ -]+, however all that you really care about is finding any occurrence of an invalid character.  If we can match one bad character, and exit from trying to match anything more at that point, then that makes life easier on the regex engine doing all of the hard work.  So, instead of asking for "one or more", we can just ask for "one"; this is done simply by removing the + repetition quantifier (i.e. we don't care about matching the character class more than once).  This is a very minor point, but always worth keeping in mind as no-one likes doing more work than is really necessary, computers included!
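Putting that together with the length checks from the original if statement, the whole test could look something like the sketch below. isValidUsername is a hypothetical helper name, not something from the original code, and it keeps the space in the class because the pattern in this thread does; drop it if usernames should not contain spaces.

```php
<?php
// Hypothetical helper: returns true when $username is 2 to 25 bytes long
// and contains no character outside a-zA-Z0-9._ - (space included).
function isValidUsername(string $username): bool
{
    $length = strlen($username);
    if ($length < 2 || $length > 25) {
        return false;
    }

    // No "+" needed: a single disallowed character is enough to reject.
    // preg_match() returns 0 when nothing matched, i.e. no bad character was found.
    return preg_match('/[^a-zA-Z0-9._ -]/', $username) === 0;
}

var_dump(isValidUsername('kartul'));   // valid characters, valid length
var_dump(isValidUsername('bad*name')); // contains a disallowed character
var_dump(isValidUsername('a'));        // too short
```

Comparing against 0 with === also treats a false return (a pattern that failed to compile) as "not valid", which seems the safer default.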

 

P.S. Good luck with the rest of the system.  :)
