standard form input validation technique help


boo_lolly

i'm having a hard time finding what i'm looking for online. i figured the forums would be a good place. i have a standard contact form that is supposed to validate the following inputs:

 

full name      (only alpha and spaces, nothing else)

phone number (only ints and dashes (-), nothing else)

email  (i have a validation function for this)

arrival date  (only digits, dashes, slashes, in the form of month, day, year)

number of nights (only digits, nothing else)

number of guests (only digits, nothing else)

 

i'm having trouble with ereg_replace(). i want a pattern that returns false if the string contains anything the pattern doesn't allow, so a false result means the user entered something wrong. i need help finding these patterns, not so much implementing them in my functions. can anyone guide me?


/([^a-zA-Z\ ])/

 

full name  (only alpha and spaces, nothing else) --->  preg_match('/^[a-zA-Z ]+$/', 'Test Name');

phone number (only ints and dashes (-), nothing else) --->  preg_match('/^[0-9\-]+$/', '123-4567');

email  (i have a validation function for this)

arrival date  (only digits, dashes, slashes, in the form of month, day, year) ---> preg_match('/^[0-9]{1,2}[-\/][0-9]{1,2}[-\/]([0-9]{2}|[0-9]{4})$/', '01/01/1977');

number of nights (only digits, nothing else)  --->  preg_match('/^[0-9]+$/', '1234567');

number of guests (only digits, nothing else)  --->  preg_match('/^[0-9]+$/', '1234567');
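
To show how patterns like these might be wired into a form handler, here is a minimal sketch. The function name validate_booking_fields() and the field names are made up for illustration, and the date pattern here accepts only dashes and forward slashes as separators. preg_match() returns 1 on a match, 0 on no match and false on error, so the check compares against 1.

```php
<?php
// Hypothetical helper: the function name and field names are assumptions,
// not something taken from the original form.
function validate_booking_fields(array $input): array
{
    $patterns = [
        'full_name' => '/^[a-zA-Z ]+$/',
        'phone'     => '/^[0-9-]+$/',
        'arrival'   => '/^[0-9]{1,2}[-\/][0-9]{1,2}[-\/]([0-9]{2}|[0-9]{4})$/',
        'nights'    => '/^[0-9]+$/',
        'guests'    => '/^[0-9]+$/',
    ];

    $errors = [];
    foreach ($patterns as $field => $pattern) {
        // A missing field is treated as an empty string, which fails every pattern.
        $value = $input[$field] ?? '';
        if (preg_match($pattern, $value) !== 1) {
            $errors[] = $field;
        }
    }

    // An empty array means every field passed.
    return $errors;
}

$errors = validate_booking_fields([
    'full_name' => 'Test Name',
    'phone'     => '123-4567',
    'arrival'   => '01/01/1977',
    'nights'    => '3',
    'guests'    => '2',
]);
```

Returning the list of failed field names (rather than a single true/false) makes it easy to show an error next to each offending input.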


thanks for the help BenInBlack. i'm still a little confused tho. i'm not sure what all the forward slashes before and after a pattern mean. i'm also not really sure what would constitute a parenthesis. for example, i've been working on my own regex pattern for phone number validation. here's the pattern

 

(1)?(\-?)(\d{3})(\-)?(\d{3})(\-)?(\d{4})

 

it's supposed to return true for the following phone number entries:

1-555-235-2523

1555-123-5235

555-235-2351

5553523125

 

there doesn't have to be a one in front, and there doesn't have to be hyphens, but there HAS to be an area code, followed by the number. i'm not sure if i've gone parenthesis happy, or if it needs it. also, i'm having trouble figuring out a pattern match. you see, i'd like to have a pattern for the beginning of the phone number, if the user doesn't type in a 1 in the beginning (for the area code), then it cannot start with a hyphen (-). however, area codes may start with a 1, like 123-444-5555, and that would be fine, but this: '-123-444-555' would not be ok. does that make sense?

 

anyway, i'm looking for some constructive criticism, i'm really new to regex. any pointers for this particular pattern?

The slashes at the beginning and end of the regex are delimiters: they mark where the pattern starts and ends. PHP's preg functions are handy in this respect, since you can choose your own delimiter. I use '%' a lot when running regex over HTML or XML, since that text is full of slashes (this saves you from having to escape them all).
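
A small sketch of the delimiter point, using a made-up HTML snippet and the reserved example.com domain. Both calls look for the same thing (the host name in a URL); only the delimiter differs.

```php
<?php
// Made-up sample text for illustration.
$html = '<a href="http://www.example.com/a/b">link</a>';

// With '/' as the delimiter, every literal slash in the pattern must be escaped.
$with_slashes = preg_match('/http:\/\/([^\/"]+)\//', $html, $m1);

// With '%' as the delimiter, slashes are ordinary characters.
$with_percent = preg_match('%http://([^/"]+)/%', $html, $m2);

// Both calls capture the host name in element 1 of their match array.
echo $m1[1], "\n";
echo $m2[1], "\n";
```

The second pattern is the same regex, just easier to read because nothing needs a backslash in front of it.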

 

Parentheses in regex are used for two things: capturing and grouping. What are you going to do with the telephone number in the above example? People have a billion different ways of entering phone numbers (ex. 555-123-4567, 555.123.4567, 5551234567, etc. like your example above) so it might be easier to just strip everything that's not a digit and then make sure there are 10 or 11 digits left.

 

In PHP that'd look something like this:

$phone_no = '1-555-123-4567';

$phone_no = preg_replace('/[^\d]/', '', $phone_no);

if(strlen($phone_no) == 10){
        echo '10 digits';
        //Do something with 10 digits
}elseif(strlen($phone_no) == 11){
        echo '11 digits';
        //Do something with 11 digits
}else{
        echo 'Busted phone number';
        //Throw an error
}
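
A variation on the snippet above, as a rough sketch: it works on a copy, so the string the user typed stays untouched and can be put back in the form field if validation fails. The function name normalize_phone() is made up, and treating an 11-digit number as valid only when it starts with '1' is my assumption, not something stated above.

```php
<?php
// Hypothetical helper: returns the 10-digit number, or null if the input is not usable.
function normalize_phone(string $raw): ?string
{
    // Strip everything that is not a digit (the same pattern as above).
    $digits = preg_replace('/[^\d]/', '', $raw);

    // Assumption: an 11-digit number is only accepted with a leading '1'.
    if (strlen($digits) === 11 && $digits[0] === '1') {
        $digits = substr($digits, 1);
    }

    return strlen($digits) === 10 ? $digits : null;
}

$submitted  = '1-555-123-4567';          // what the user typed
$normalized = normalize_phone($submitted);

if ($normalized === null) {
    // $submitted is unchanged, so it can be echoed back into the form as-is.
    echo 'Busted phone number: ', $submitted, "\n";
} else {
    echo 'Digits to store: ', $normalized, "\n";
}
```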

 

Hope that helps!


thank you c4onastick for clearing that up. yeah, i know people enter in their phone number tons of ways but i kinda wanted to write a regex pattern that will clear this up once and for all. regex is dynamic enough to have a pattern that can match any phone number string written any way, and i want to find it. an all encompassing phone number validation regex pattern if you will. one that will take into account some people put a '1' in front of their area code, some do not, some put their area code inside parenthesis, and others don't use dashes or slashes at all. i'm sure there's a way to do it.

 

i have considered using preg_replace to make things easier, but if this phone number validation function i'm writing throws an error, i want to put the exact string the user inputted in the form field back in its place, with nothing done to it. of course, i could store it in the session and then manipulate the string with my function, keeping the original copy safe in the session, and use it later if it throws an error. but that's just too easy =)

 

i'm still looking for some advice. also, i'm having trouble wrapping my head around the preg_match() function with the pattern using a '^' in the beginning. like, the '^' says anything that doesn't match this, return false? or what? if it doesn't not match this, return true? i'm not sure when it's necessary to use it. can anyone explain?


Ah! The picture is becoming clearer!

 

'^' (like so many things in programming) means two different things depending on its context. Outside of a character class (things inside [square brackets]) it's an anchor, meaning it matches a position in the string, not a character: '^' matches the beginning of the string.

 

However, as the first character inside a character class (like I used above) it negates the character class.

[^\d]

Means match anything that's not a digit.
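
Here is a short sketch of both meanings side by side, with made-up sample strings. Note that preg_match() itself doesn't return true or false for "wrong" input; it returns 1 when the pattern matches somewhere and 0 when it doesn't, and it's up to you to decide which of those means "invalid".

```php
<?php
// Outside a character class, '^' anchors the match to the start of the string.
$anchored_hit  = preg_match('/^\d/', '555-1234');   // first character is a digit
$anchored_miss = preg_match('/^\d/', '-555-1234');  // first character is a dash

// As the first character inside [...], '^' negates the class.
$negated_hit  = preg_match('/[^\d]/', '555-1234');  // contains a non-digit (the dash)
$negated_miss = preg_match('/[^\d]/', '5551234');   // digits only, nothing to find

// Both at once: from start (^) to end ($), one or more characters that are not letters.
$no_letters = preg_match('/^[^a-zA-Z]+$/', '555-1234');

var_dump($anchored_hit, $anchored_miss, $negated_hit, $negated_miss, $no_letters);
```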


I suppose to take a stab at answering your original question about a phone number validating regex, I'd do something like this:

/^1?([-_. ])?\(?\d{3}\)?(?(1)\1|([-_. ])?)\d{3}(?(2)\2|([-_. ]))\d{4}$/

Validation gets tricky and ugly quick in regex. This might not be any easier to wrap your head around, but here's the breakdown, one piece of the pattern at a time:

Part: ^1?

Matches an optional '1' at the beginning, as in 1-888-555-1234.

Part: ([-_. ])?

Matches the optional punctuation between the '1' and the area code, as in 1-888-555-1234. This is capture group 1.

Part: \(? and \)?

These match optional parentheses around the area code, as in (888)555-1234.

Part: \d{3}

Matches the three digit area code.

Part: (?(1)\1|([-_. ])?)

This is where it really gets complicated: this is a regex conditional (if, then, else) construct. It says if the first capture group (the punctuation after the '1') matched, match whatever it matched again. Otherwise, match one of the optional separators, which becomes capture group 2.

Part: \d{3}

Again match three digits, as in 888-123-4567.

Part: (?(2)\2|([-_. ]))

Here's the next tricky part: it's possible the first capture will fail, so in that case the second capture group will have our delimiter in it, and \2 requires the same delimiter again. If group 2 did not match either, match a required delimiter out of the character class. (You could also make this optional to allow numbers like 8881234567 or (888)1234567 in)

Part: \d{4}$

Finally, match four digits at the end of the string, as in 888-123-4567.

 

Now it is true that you don't have to use that conditional regex here, but if you leave it out, it'll let things like 1-888_123.4567 in, which I'm not sure you want. The regex for that would look like this:

/^1?([-_. ])?\(?\d{3}\)?([-_. ])?\d{3}([-_. ])\d{4}$/
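
To compare the two patterns, a quick sketch that runs both against a handful of made-up numbers. The helper phone_matches() and the sample list are mine; the patterns are copied unchanged from above.

```php
<?php
// The two patterns from this post, unchanged.
$conditional = '/^1?([-_. ])?\(?\d{3}\)?(?(1)\1|([-_. ])?)\d{3}(?(2)\2|([-_. ]))\d{4}$/';
$plain       = '/^1?([-_. ])?\(?\d{3}\)?([-_. ])?\d{3}([-_. ])\d{4}$/';

// Hypothetical helper: true when the whole number matches the pattern.
function phone_matches(string $pattern, string $number): bool
{
    return preg_match($pattern, $number) === 1;
}

// Made-up sample inputs.
$samples = [
    '1-888-555-1234',
    '888-555-1234',
    '(888)555-1234',
    '1-888_123.4567',
    '888-555.1234',
    '8885551234',
    '1-888-555.1234',
];

foreach ($samples as $number) {
    printf(
        "%-16s conditional=%-3s plain=%s\n",
        $number,
        phone_matches($conditional, $number) ? 'yes' : 'no',
        phone_matches($plain, $number) ? 'yes' : 'no'
    );
}
```

The conditional version rejects mixed separators such as 1-888_123.4567 and 888-555.1234, while the plain one lets them through. One gap worth knowing about, if I'm reading the pattern right: when the separator after the leading '1' is present, group 2 never takes part in the match, so the last separator falls back to "any one from the class" and 1-888-555.1234 is still accepted.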

 

I fear I may have muddied the waters a bit, but hopefully that helps.

 

Cheers!


Here is a website I frequently refer to when creating tricky regexps:

http://www.regular-expressions.info/

 

There also exist sites where you can type a subject string in one field and a regexp in another and it will highlight the matching part; that makes it easy to test your regexp without having to edit code, load page, repeat.

 

One tip for creating long regexps: separate them into more than one variable.  For example:

<?php
  // Create a regexp to match an e-mail -- THIS IS NOT A WORKING EXAMPLE, IT ONLY ILLUSTRATES USING SEVERAL
  // VARS TO CREATE A SINGLE REGEXP, WHICH INCREASES READABILITY (sometimes)
  // An email is: <localpart>@<domain>
  $local = '...'; // part of the regexp that matches local part
  $domain = '...'; // part of the regexp that matches domain
  // Keep the reusable piece free of delimiters; add the slashes only around the finished pattern
  $email = "{$local}@{$domain}";
  $email_regexp = "/{$email}/";
  // Now we could easily create a regexp that accepts multiple emails separated by commas or semicolons
  $mult_email_regexp = "/({$email})([,;][ ]*{$email})*/";
?>
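
The same trick applied to the phone number pattern from earlier in the thread, as a runnable sketch. The variable names are made up, and this builds the simpler pattern without the conditionals (and without capture groups, since nothing refers back to them).

```php
<?php
// Each piece is a plain string without delimiters, so the pieces can be combined freely.
$sep      = '[-_. ]';         // one separator character
$prefix   = "1?{$sep}?";      // optional leading '1' and optional separator
$area     = '\(?\d{3}\)?';    // area code, optionally wrapped in parentheses
$exchange = '\d{3}';          // next three digits
$line     = '\d{4}';          // last four digits

// Delimiters and anchors are added once, around the assembled pattern.
$phone_regexp = "/^{$prefix}{$area}{$sep}?{$exchange}{$sep}{$line}$/";

echo $phone_regexp, "\n";
var_dump(preg_match($phone_regexp, '1-888-555-1234'));
```

Printing the assembled pattern is a cheap way to check that the pieces were glued together the way you expected.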
