
regex username


9three


Hey,

 

I'm trying to create a pattern, but I'm very new to regular expressions.

 

I'm checking for a username input. It needs to match the following:

 

1. Must start with either a number or letter

2. May contain one underscore or one hyphen (I don't know if this is possible, unless I create two patterns, one matching underscore and another matching a hyphen)

3. Must end with a letter

 

This is what I got:

 

preg_match('#^[a-z0-9]([_-a-z])$#i', $username)

 

I've seen people use both # and / to start a pattern. Is one of them correct, or is it a matter of preference? And what does the dollar sign do? As I understand it, the i allows both upper and lowercase?
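On the delimiter question, a minimal sketch may help. PHP's PCRE functions accept most punctuation characters as delimiters, so # and / (and ~) behave the same; the choice mostly affects which characters need escaping inside the pattern. The sample strings and variable names below are made up for illustration.

```php
<?php
// Hypothetical sample input, only for illustration.
$username = 'Some_Name';

// Same pattern, three different delimiters: the results are identical.
$withHash  = preg_match('#^[a-z]+_[a-z]+$#i', $username);
$withSlash = preg_match('/^[a-z]+_[a-z]+$/i', $username);
$withTilde = preg_match('~^[a-z]+_[a-z]+$~i', $username);

// Without the trailing i modifier the match is case-sensitive,
// so the uppercase S and N make it fail.
$caseSensitive = preg_match('#^[a-z]+_[a-z]+$#', $username);

// ^ anchors the start and $ anchors the end; without them the
// pattern may match anywhere inside a longer string.
$unanchored = preg_match('#[a-z]+_[a-z]+#i', '!!Some_Name!!');
$anchored   = preg_match('#^[a-z]+_[a-z]+$#i', '!!Some_Name!!');

var_dump($withHash, $withSlash, $withTilde, $caseSensitive, $unanchored, $anchored);
```

So the delimiter is a matter of preference, the i makes the letters case-insensitive, and the $ ties the pattern to the end of the input.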

 

thanks for any help


I don't think you can accomplish everything you want with one regex, and I'm not sure you'd want to. Running several checks lets you tell the user exactly what was wrong.

 

<?php
function checkUsername($username) {
    // first character check: the OP allows a letter or a number here
    $first = substr($username, 0, 1);
    if (preg_match('~[a-z0-9]~i', $first) < 1)
        return "{$username} must start with a letter or a number.";

    // last character check
    $last = substr($username, -1, 1);
    if (preg_match('~[a-z]~i', $last) < 1)
        return "{$username} must end with a letter.";

    // underscore check (at most 1 occurrence is allowed)
    $numUnderscores = substr_count($username, "_");
    if ($numUnderscores > 1)
        return "{$username} cannot contain more than 1 underscore.";

    // hyphen check (at most 1 occurrence is allowed)
    $numHyphens = substr_count($username, "-");
    if ($numHyphens > 1)
        return "{$username} cannot contain more than 1 hyphen.";

    // one separator at most: a hyphen or an underscore, not both
    if ($numHyphens == 1 && $numUnderscores == 1)
        return "{$username} can only contain a hyphen (-) or an underscore (_), not both.";

    // the separator is optional ("may contain"), so nothing more is required
    return true;
}

$usernames = array("test_valid", "1valid", "valid", "not-valid-d", "noted1", "not__valid", "vali-d");

foreach ($usernames as $username) {
    $message = checkUsername($username);
    if (!is_bool($message))
        echo $message . "<br />";
    elseif ($message)
        echo $username . " was valid.<br />";
}

?>

 

This way your user knows what they did wrong and can fix it properly.


My attempt (assuming I understand the criteria correctly):

 

$userName = '3fhj_kfere3-dj4u';
$validation = 'false'; // guilty till proven innocent
// initially must start with a number or letter, and end with a letter
if (preg_match('#^[a-z\d].*?[a-z]$#i', $userName, $match)) {
    // if both strpbrk() checks evaluate to false, there is no second dash and no second underscore
    $validation = (!strpbrk(substr(strpbrk($match[0], '-'), 1), '-') && !strpbrk(substr(strpbrk($match[0], '_'), 1), '_')) ? 'true' : 'false';
}
echo $validation;

 

@OP, while you declare the beginning and ending requirements of the string, you don't specify which characters are acceptable in between. Therefore, in my pattern, I used .*? (which accepts anything other than a newline, zero or more times, lazily). EDIT - If that is not what you want, you must be specific in this regard.

 

So the idea behind this is simple: $validation is set to false from the get-go. The initial regex then handles the starting and ending character requirements. If it passes, we set $validation to either true or false via a ternary operator, based on the following:

 

Check whether a dash is found (if so, strpbrk returns the string from that dash onwards), and then check whether there is another dash within that (after skipping the initial dash via substr). That second check must come back false, and the same check for the underscore must come back false too, for $validation to be true; otherwise it is false.
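The strpbrk/substr chain described above can be unpacked into separate steps. This is only a sketch: the helper name hasMoreThanOne and the sample strings are made up for illustration, and the (string) cast is there because substr can return false on older PHP versions.

```php
<?php
// Hypothetical helper: the same check as in the post, for a single
// separator character, written out step by step.
function hasMoreThanOne($haystack, $char) {
    // strpbrk() returns the rest of the string starting at the first
    // occurrence of $char, or false if $char never appears.
    $fromFirst = strpbrk($haystack, $char);
    if ($fromFirst === false) {
        return false; // no occurrence at all
    }
    // Drop the first occurrence itself, then look for another one.
    $afterFirst = (string) substr($fromFirst, 1);
    return $afterFirst !== '' && strpbrk($afterFirst, $char) !== false;
}

// Intermediate value for a string with two dashes.
echo strpbrk('ab-cd-ef', '-'), "\n";                     // the tail from the first dash on
var_dump(hasMoreThanOne('ab-cd', '-'));    // one dash only
var_dump(hasMoreThanOne('ab-cd-ef', '-')); // a second dash exists
var_dump(hasMoreThanOne('abcd', '-'));     // no dash at all
```

If counting is all that is needed, substr_count($haystack, $char) > 1 would express the same test more directly.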

 

So from what I gathered (and this is where I may admittedly have misunderstood), you allow at most one of each (dash / underscore) if either or both are found within the username.

 

If this is NOT the case, and you instead mean that once one of either is found there can be no more of either afterwards, a simple modification can be made: change the ternary operator line above to:

 

$validation = (!strpbrk(substr(strpbrk($match[0], '-'),1),'-_') && !strpbrk(substr(strpbrk($match[0], '_'),1),'-_'))? 'true' : 'false';
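To see where the two readings differ, here is a small sketch that wraps each ternary condition in a function. The function names oneOfEach and oneInTotal and the test strings are made up for illustration; the (string) casts only guard against false return values on older PHP versions.

```php
<?php
// Reading 1: at most one dash AND at most one underscore.
function oneOfEach($s) {
    return !strpbrk((string) substr((string) strpbrk($s, '-'), 1), '-')
        && !strpbrk((string) substr((string) strpbrk($s, '_'), 1), '_');
}

// Reading 2: at most one separator in total, dash or underscore.
function oneInTotal($s) {
    return !strpbrk((string) substr((string) strpbrk($s, '-'), 1), '-_')
        && !strpbrk((string) substr((string) strpbrk($s, '_'), 1), '-_');
}

// One dash plus one underscore: the two readings disagree here.
var_dump(oneOfEach('ab-cd_ef'));
var_dump(oneInTotal('ab-cd_ef'));

// A single dash passes both; two dashes fail both.
var_dump(oneOfEach('ab-cd'), oneInTotal('ab-cd'));
var_dump(oneOfEach('a-b-c'), oneInTotal('a-b-c'));
```

The only inputs where the choice matters are those that mix one dash with one underscore.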

 

 


Yeah, it looks like it would, but I just know from experience that the .*? will need to be replaced with actual character requirements (no offense to the OP, but some info always seems to be 'left out' or 'forgotten'). I only used .*? initially because the OP doesn't say what is (or isn't) acceptable between the initial and closing character.

 

And just to show how much of an off day I'm having, I reexamined my pattern and thought to myself, 'why the hell did I even bother making that dot match-all lazy?'

If .* is in fact acceptable, just let it match all the way to the end of the string (no lazy forward checking), and let the engine backtrack once at the end to see whether the final character matches the final character class in the pattern. Granted, on a single short username check, the speed difference between .* and .*? would be negligible.
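A quick sketch of the greedy versus lazy point: with both anchors in place the two quantifiers end up matching the same text, and the difference only shows in an unanchored pattern. The toy pattern and the string 'abcabc' are made up for contrast.

```php
<?php
$userName = '3fhj_kfere3-dj4u'; // sample value from the earlier post

// Anchored with ^ and $: both versions have to consume the whole
// string, so the matched text is the same either way.
preg_match('#^[a-z\d].*?[a-z]$#i', $userName, $lazy);
preg_match('#^[a-z\d].*[a-z]$#i', $userName, $greedy);
$sameWhenAnchored = ($lazy[0] === $greedy[0]);

// Unanchored toy pattern: here laziness changes what is matched.
preg_match('#a.*?c#', 'abcabc', $lazyToy);   // stops at the first c
preg_match('#a.*c#', 'abcabc', $greedyToy);  // runs to the last c

var_dump($sameWhenAnchored);
echo $lazyToy[0], "\n", $greedyToy[0], "\n";
```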


And yeah, none of this does any kind of check against "acceptable" things, like whether the user included profanity in their name.

 

Or whether the user wanted to use weird characters like Ô or / or " or ' or ; or Q@!$#^%()_+=| :)

 

It sucks to be vague, as you get what you ask for.
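To make the point about odd characters concrete, here is a sketch comparing the .* pattern from earlier in the thread with a version whose middle section is an explicit character class. The second pattern is an assumption about what "acceptable" might mean (letters, digits, underscore, hyphen); it is not something the OP specified, and it does not limit how many separators appear.

```php
<?php
// Pattern from the thread: anything except a newline is allowed in the middle.
$loose = '#^[a-z\d].*[a-z]$#i';

// Assumed stricter variant: only letters, digits, _ and - in the middle.
$strict = '#^[a-z\d][a-z\d_-]*[a-z]$#i';

// Hypothetical inputs for illustration.
$weird  = 'a;"/b';
$normal = 'a_b';

var_dump(preg_match($loose, $weird));   // accepted by the loose pattern
var_dump(preg_match($strict, $weird));  // rejected by the strict pattern
var_dump(preg_match($loose, $normal));
var_dump(preg_match($strict, $normal));
```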


Well, on the topic of being vague, I know we have a sticky for that kind of stuff. I just really, really wish people would read it (and better yet, abide by it). It would make problem solving in these threads so much smoother.

well if the problems were easy they'd hardly be problems now would they ;)

 

mmm...no... the regex would return false if it found those.

 

I don't follow...

 

ah foobar moment...see, I'm not perfect :P I forgot about the .*'s


Well damn, I can't believe none of us thought about that. There's no reason why those quantified classes can't just be combined with their respective starting/ending classes, using + instead of *. Doh!

 

edit:

 

oh wait, I knew there was a reason that was done.

 

The OP said it had to end with a letter, so the end of the pattern has to have 2 separate classes. No reason the first one can't be combined though. Also, you forgot the ? after the [_-], as the OP said it's optional. (It should also be [-_]: put the - first or last in a class, or escape it, so it can never be read as a range, unless you want a black hole to open up and swallow the universe.)

 

preg_match('/^[a-z\d]+[-_]?[a-z\d]*[a-z]$/i',$string);

 

 

 

and of course, we could, as before, debate about whether to make the + greedy or not.
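As a rough check, the pattern above can be wrapped in a function and run against the sample usernames from earlier in the thread. The function names are made up for illustration. Two things worth knowing, under the assumption that the pattern is used exactly as posted: it needs at least two characters, and a plain $ also matches just before a trailing newline, which the D modifier prevents.

```php
<?php
// Hypothetical wrapper around the pattern from the post above.
function isValidUsername($string) {
    return preg_match('/^[a-z\d]+[-_]?[a-z\d]*[a-z]$/i', $string) === 1;
}

// Variant with the D modifier, so $ matches only at the very end
// of the input and a trailing newline is rejected.
function isValidUsernameStrict($string) {
    return preg_match('/^[a-z\d]+[-_]?[a-z\d]*[a-z]$/iD', $string) === 1;
}

$usernames = array("test_valid", "1notvalid", "not-valid-d", "noted1", "not__valid", "vali-d", "a_b-c");

foreach ($usernames as $username) {
    echo $username, ": ", isValidUsername($username) ? "valid" : "invalid", "\n";
}

// Trailing newline: accepted by the plain $, rejected with D.
var_dump(isValidUsername("abc\n"), isValidUsernameStrict("abc\n"));
```

Because the pattern is anchored at both ends, making the + lazy should only change how the engine reaches its answer, not which usernames pass.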

