[SOLVED] Alphanumeric plus hyphens, periods, and underscores.


I'm working on a CMS that needs to download MySpace profile pages. The URL format is standard, except for the username portion. The username can contain letters, numbers, underscores, hyphens, and periods.

 

Valid

_n.E.E.d.l.e_

_____NEEDLE-123

_-5678...needle-_

 

Invalid

n e e d l e

!@#$%^&*

needle'd!

 

I was using this regex:

^[a-zA-Z0-9_]*$

 

But it allowed ! (although I'm not sure how/why), and didn't allow hyphens/periods.

Thank you for your time.

^[a-zA-Z0-9_.-]*$

 

If the hyphen is placed as the first or last character in the character class, it is treated as a literal (as opposed to a range). Metacharacters like the dot lose their special meaning within a character class, so they do not require escaping. So one solution could be:

 

#^[a-z0-9_.-]*$#i

 

Granted, the star quantifier (zero or more times) means the pattern can also match an empty string, so if there is a minimum length for usernames, you could use an interval instead:

 

#^[a-z0-9_.-]{3,}$#i    // minimum 3 characters long...
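
As a rough sketch of how that pattern could be used for validation, here it is run against the sample usernames from the first post. The function name `is_valid_username` is made up for this example, and the minimum of 3 characters is just the value used above, not a known MySpace rule.

```php
<?php
// Hypothetical helper; the pattern is the one suggested above.
function is_valid_username(string $name): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    return preg_match('#^[a-z0-9_.-]{3,}$#i', $name) === 1;
}

$samples = [
    '_n.E.E.d.l.e_',
    '_____NEEDLE-123',
    '_-5678...needle-_',
    'n e e d l e',
    '!@#$%^&*',
    "needle'd!",
];

foreach ($samples as $sample) {
    echo $sample, ' => ', is_valid_username($sample) ? 'valid' : 'invalid', "\n";
}
```

The first three samples come out as valid and the last three as invalid. Note that `$` also matches just before a trailing newline; adding the `D` modifier would tighten that if it matters.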

 

Also note that patterns like these would also match a username like ....----a---.. So I'm not sure what kind of restrictions you have in place to check for those kinds of wacky situations.

 

 

I would actually consider ....----a---.. valid, so this isn't a problem. The application just has to request a page from MySpace, and if it were to request "myspace.com/....----a---..", then MySpace would just say:

 

Invalid Friend ID.
This user has either cancelled their membership, or their account has been deleted.

 

And I use that error message to tell if the username is invalid. The regex that I requested was simply so that users wouldn't use ! or anything similar, as that would just get a "Page cannot be found". The same is true when it comes to crazy ASCII characters.

 

With that said, I have a problem. I'm using the following snippet of code, and I'm sure that due to a syntactical error, it isn't working how it should.

 

$post = preg_replace('/#^[a-z0-9_.-]{1,}$#i/', '', 'd!@#agsd$%^&*');
print_r($post); // Outputs d!@#agsd$%^&* (unchanged)

 

As you can see, nothing has changed in the string. In my mind, the regex is supposed to do this:

 

Go to the string, and replace every character that is NOT alphanumeric/period/hyphen, with '' (deleting it completely). Once this is done, declare this new string as $post.

 

Obviously my logic is flawed. Thanks for the help guys!

Okay, since you want to replace things, you need to remove the ^ and $ anchors. Also, {1,} is effectively the same as +, not that you need a quantifier at all here, since preg_replace replaces every match anyway. Finally, it needs to be a negated character class, since you want to replace anything that is not one of those characters.

 

$post = preg_replace('/#[^a-z0-9_.-]#i/', '', 'd!@#agsd$%^&*');

Except you'll also need to remove the delimiting / characters. nrg chose a hash sign as the delimiter. You can use a forward slash instead, but you can't just wrap it around the existing pattern; you have to replace the hash signs with it.

 

$post = preg_replace('#[^a-z0-9_.-]#i', '', 'd!@#agsd$%^&*');
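
To see the difference the delimiters make, here is a small sketch that runs the corrected call with hash delimiters and the same pattern with slash delimiters. The variable names are only for illustration.

```php
<?php
$raw = 'd!@#agsd$%^&*';

// Hash delimiters, as in the corrected call above.
$viaHash = preg_replace('#[^a-z0-9_.-]#i', '', $raw);

// Same pattern with forward slashes as the delimiters instead.
$viaSlash = preg_replace('/[^a-z0-9_.-]/i', '', $raw);

echo $viaHash, "\n";  // dagsd
echo $viaSlash, "\n"; // dagsd
```

Both calls strip every character outside the class, so only the letters survive.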

 

Honestly, I just had no clue what I was doing. I saw "/[regex]/" so much that I just thought the forward slash was part of the syntax of a regular expression.

 

Thanks for all the help guys. I'm trying to learn regex, but it's a slow process. haha.

In PHP's PCRE (preg_*) functions, the delimiter (typically seen as the forward slash) can be any non-alphanumeric, non-whitespace ASCII character except the backslash.

So you can use #[regex]#, or /[regex]/, or ~[regex]~, or ![regex]!, or... you get the idea.

 

EDIT - just be mindful that if a character inside your pattern is the same as your delimiter, you must escape it (using the backslash). Otherwise, you'll run into "unknown modifier" messages. Not pretty.
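
A quick sketch of why the choice of delimiter matters once the pattern itself contains slashes. The profile URL below is a made-up example for illustration, not a confirmed MySpace URL format.

```php
<?php
// Hypothetical profile URL, used only for illustration.
$url = 'http://www.myspace.com/needle-123';

// Slash delimiters: every literal / inside the pattern has to be escaped.
$slashPattern = '/^http:\/\/www\.myspace\.com\/([a-z0-9_.-]+)$/i';

// Hash delimiters: the slashes can be written as they are.
$hashPattern = '#^http://www\.myspace\.com/([a-z0-9_.-]+)$#i';

preg_match($slashPattern, $url, $viaSlash);
preg_match($hashPattern, $url, $viaHash);

echo $viaSlash[1], "\n"; // needle-123
echo $viaHash[1], "\n";  // needle-123
```

Both patterns capture the same username; the hash version is just easier to read because nothing needs escaping.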

Thanks! I'm currently trying to learn from this tut: http://www.phpro.org/tutorials/Introduction-to-PHP-Regex.html

 

It's the best I've found. Thanks for helping me understand regex delimiters. I didn't know that you'd need any, as it's a string. I figured that the beginning and end of the string would work as delimiters. ;)

 

But I'm just going to assume that you can put functions/operators outside of the delimiters.

It's because they're called PCRE (Perl Compatible Regular Expressions). In Perl you'd do:

 

if ($string =~ m/^[a-z0-9]+$/i) {
    print "It's an alphanumeric string!";
}

 

You also use that notation in other places, like in the editor called vi(m).

Well yeah, it's PCRE, but they could still have made the modifiers a separate argument and kept the same regex syntax. They could easily have made the preg_xx functions take a pattern with no delimiters, wrapped in quotes, with the modifiers as a separate argument, and built the delimited form internally for the regex engine.
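
For what it's worth, that idea can be approximated in userland. This is only a sketch: `build_pattern` and `match_without_delimiters` are invented names, nothing like them ships with PHP, and the list of candidate delimiters is an arbitrary choice.

```php
<?php
// Hypothetical wrapper: takes a bare pattern plus modifiers and adds delimiters itself.
function build_pattern(string $body, string $modifiers = ''): string
{
    // Pick the first candidate delimiter that does not occur in the pattern body,
    // so nothing inside the body needs extra escaping.
    foreach (['#', '~', '/', '!', '%', '@'] as $delimiter) {
        if (strpos($body, $delimiter) === false) {
            return $delimiter . $body . $delimiter . $modifiers;
        }
    }
    throw new InvalidArgumentException('No unused delimiter available for this pattern.');
}

// Hypothetical preg_match-style function with the modifiers as a separate argument.
function match_without_delimiters(string $body, string $modifiers, string $subject): bool
{
    return preg_match(build_pattern($body, $modifiers), $subject) === 1;
}

var_dump(match_without_delimiters('^[a-z0-9_.-]+$', 'i', 'NEEDLE-123')); // bool(true)
```

The wrapper still hands a delimited string to preg_match underneath; it just hides that step from the caller.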
