Shadowing Posted March 31, 2012

Hey guys, I'm trying to check for matches.

Matches: 800-555-5555 | 333-444-5555 | 212-666-1234
Non-Matches: 000-000-0000 | 123-456-7890 | 2126661234

I'm getting hit with a lot of parse errors with this:

    (!preg_match('^(\d{3}-\d{3}-\d{4})*$)', $_POST['current_number'])

ragax Posted March 31, 2012

Hi Shadowing, two questions:

1. Why the star in (\d{3}-\d{3}-\d{4})*? This allows the regex to match "" (the empty string) as well as repeated numbers run together, such as 555-222-2222555-222-2222.
2. How do I know that 123-456-7890 is a non-match? (What is the rule in plain English, so that I can start thinking about the regex?)

Wishing you a fun weekend.

[Edit: typo ("these" instead of "this")]

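To illustrate the point about the star, here is a minimal sketch that compares the starred pattern with a single-number version. The "/" delimiters and the test inputs are assumptions added for illustration; PHP's preg functions require the pattern to be wrapped in delimiters, which the snippet in the first post lacks.

```php
<?php
// The poster's pattern, wrapped in "/" delimiters so preg_match() accepts it.
$starred = '/^(\d{3}-\d{3}-\d{4})*$/';
// The same pattern without the star: exactly one phone number.
$single  = '/^\d{3}-\d{3}-\d{4}$/';

// Hypothetical inputs chosen for illustration.
$inputs = array('', '800-555-5555', '555-222-2222555-222-2222', '2126661234');

foreach ($inputs as $input) {
    // preg_match() returns 1 on a match and 0 on no match.
    printf(
        "%-28s starred: %d  single: %d\n",
        '"' . $input . '"',
        preg_match($starred, $input),
        preg_match($single, $input)
    );
}
```

With the star, the empty string and two numbers run together both pass; without it, only the single well-formed number does.
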
Shadowing Posted March 31, 2012

Thanks for the response, ragax. I got it off a site that was listed in the "read first" post on the forum, but couldn't get it to work lol.

What I really want is this:

Matches: 800-555-5555 | 333-444-5555 | 212-666-1234
Non-Matches: 000-000-0000 | 123-456-7890 | 2126661234

It's for a phone number. I only want 3 characters - 3 characters - 4 characters, and only numeric.

ragax Posted March 31, 2012

Yes, but what is the rule to tell the difference between:

Non-Matches: 000-000-0000 | 123-456-7890

and

Matches: 800-555-5555 | 333-444-5555

They both have three digits, a dash, three digits, a dash, four digits. What is the rule, in plain English, to know that one is okay and the other one is not okay?

.josh Posted April 1, 2012

Hello Shadowing,

It looks like you want to

1) Match a standard American telephone number
2) Reject certain phone numbers because they are fake.

#1 is fairly easy, especially if you take the approach of not caring what the actual format is (different people use different formats; some people wrap the area code in parens, some people use hyphens while others use dots, etc...) by just stripping out anything that is not a number and then checking for 10 digits.

#2 on the other hand is pretty arbitrary. The only realistic thing you can do is either a) have a whitelist of all valid U.S. phone numbers. This is not feasible because it is a very large, ever-changing list. Or b) make a blacklist of the more common bullshit numbers people try to use. This is by far more feasible, but it will never be perfect, and there's no way to really make anything perfect, short of attempting option "a".

One possible compromise would be to make a whitelist of all currently known area codes. The list isn't all *that* long, but more importantly, even though the area code list can change...it is not something that changes very often. You can even see from the link a bunch that aren't actually in use yet but are reserved or up and coming. Yeah..still kinda a pita to maintain but you shouldn't have to check very often.

So anyways, I separated the area code and local number blacklists into two separate arrays in the code below so it will be easier to go that route if you wanna. But also since it is separated, it will have the added bonus of catching numbers like '1110000000'.

But as far as the local number blacklist...TBH I would not recommend adding anything else to that list. For example, one of your "baddie" phone numbers "6661234" ...this is more than likely a valid number.
I know for a fact that "666" is a valid local prefix in some areas, and I'm almost 100% positive most phone companies just randomly pick the last 4 digits or pick the "next available increment" when assigning you a phone number.

So anyways, with all that in mind, here is my suggested solution (this is example code to get the concept...you may wanna organize this how you see fit, according to whatever other code you have...for example, make the blacklist some object property etc...):

    function validatePhoneNumber ($number) {
      // black list of common fake area codes
      $blacklist['a'] = array('000','111','222','333','444',
                              '555','666','777','888','999',
                              '123','098','987');
      // black list of common fake local numbers
      $blacklist['l'] = array('0000000','1111111','2222222','3333333','4444444',
                              '5555555','6666666','7777777','8888888','9999999',
                              '4567890','7654321','1234567','6543210');

      // strip everything but the numbers
      $number = preg_replace("~[^0-9]~","",$number);

      // number is bad if not 10 digits
      if ( strlen($number)!=10 ) return false;

      // first 3 digits are the area code, last 7 are the local number
      $a = substr($number,0,3);
      $l = substr($number,-7);

      // number is bad if area code in blacklist
      if ( in_array($a,$blacklist['a']) ) return false;

      // number is bad if local number in blacklist
      if ( in_array($l,$blacklist['l']) ) return false;

      // number is good
      return true;
    }

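The area-code whitelist compromise mentioned in this post could look roughly like the following. This is only a sketch: hasKnownAreaCode() and the three sample area codes are made up for illustration and are not part of .josh's code.

```php
<?php
// Hypothetical helper: accept a number only if its area code is on a whitelist.
function hasKnownAreaCode($number, array $areaCodes)
{
    // Strip everything but the digits, as validatePhoneNumber() does.
    $digits = preg_replace('~[^0-9]~', '', $number);

    // A number without exactly 10 digits has no usable area code.
    if (strlen($digits) != 10) {
        return false;
    }

    // The first three digits are the area code; compare as strings.
    return in_array(substr($digits, 0, 3), $areaCodes, true);
}

// Illustrative sample only; a real list would hold every assigned area code.
$sampleAreaCodes = array('212', '604', '800');

var_dump(hasKnownAreaCode('212-666-1234', $sampleAreaCodes)); // bool(true)
var_dump(hasKnownAreaCode('000-000-0000', $sampleAreaCodes)); // bool(false)
```

Keeping the list as a parameter rather than hard-coding it means the list can be updated or loaded from elsewhere without touching the check itself.
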
ragax Posted April 1, 2012

Superb answer, .josh!

This answer gives me the motivation to start a text file with a list of useful threads for the "common expressions" sticky:

Phone numbers: matching with blacklists, etc.

Shadowing Posted April 10, 2012

Thanks for the reply, Josh.

Actually, all I really want is a filter to make sure they are keeping this format: 333-333-3456. When users type in their number, I display an onfocus example of the format.

That's a nice blacklist phone number function you built, though.

xyph Posted April 10, 2012

The dashes are redundant, and can be added after. What really matters is that they've entered exactly 10 digits, and that the number doesn't match anything in the blacklist.

If you wanted to add dashes afterwards, it's quite easy.

    <?php

    $digits = 6047638470; // can be string or int, doesn't matter

    if( strlen($digits) != 10 )
        echo 'Bad phone number length';
    else {
        // rebuild as ddd-ddd-dddd
        $digits = substr($digits,0,3).'-'.substr($digits,3,3).'-'.substr($digits,6);
        echo $digits;
    }

    ?>

Josh's answer covers numbers entered with hyphens as well, by first removing them, and then checking the length. This saves you from having to use the RegEx engine at all, speeding things up.

He may want to add trim() to his solution, or replace white-spaces with an empty string as well (so numbers like '604 763 8470' match as well).

If you MUST have numbers submitted in the exact formatting of ddd-ddd-dddd, you could use something like this:

    #^(\d{3})-(\d{3})-(\d{4})$#

Assert position at the beginning of the string «^»
Match the regular expression below and capture its match into backreference number 1 «(\d{3})»
   Match a single digit 0..9 «\d{3}»
      Exactly 3 times «{3}»
Match the character "-" literally «-»
Match the regular expression below and capture its match into backreference number 2 «(\d{3})»
   Match a single digit 0..9 «\d{3}»
      Exactly 3 times «{3}»
Match the character "-" literally «-»
Match the regular expression below and capture its match into backreference number 3 «(\d{4})»
   Match a single digit 0..9 «\d{4}»
      Exactly 4 times «{4}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»

Using preg_match with the 3rd argument will return all of the capturing groups (to check against a blacklist) as well as the entire string.

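As a rough sketch of that last point, the third argument of preg_match() can be used to pull out the three groups. The helper name splitPhoneNumber() and the array keys are hypothetical, chosen only for this example.

```php
<?php
// Hypothetical helper: return the three parts of a ddd-ddd-dddd number,
// or null if the input is not in exactly that format.
function splitPhoneNumber($input)
{
    if (preg_match('#^(\d{3})-(\d{3})-(\d{4})$#', $input, $matches) !== 1) {
        return null;
    }

    // $matches[0] is the whole matched string; 1..3 are the capturing groups.
    return array(
        'area'   => $matches[1],
        'prefix' => $matches[2],
        'line'   => $matches[3],
    );
}

print_r(splitPhoneNumber('604-763-8470'));
var_dump(splitPhoneNumber('6047638470')); // NULL
```

Because «$» also matches before a line break at the end of the string, an input with a trailing newline would still pass; adding the D modifier after the closing delimiter (#...$#D) is one way to rule that out if it matters.
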
.josh Posted April 10, 2012

Quoting xyph: "He may want to add trim() to his solution, or replace white-spaces with an empty string as well (so numbers like '604 763 8470' match as well)."

    // strip everything but the numbers
    $number = preg_replace("~[^0-9]~","",$number);

This already strips whitespaces and will effectively trim() it; it strips anything that is not a number. And then this:

    // number is bad if not 10 digits
    if ( strlen($number)!=10 ) return false;

checks to see if what is left over is 10 chars. So basically, the visitor can enter a phone number in any format they wish (even " [123]....(456)~7890~ " if they really really wanted to - I hear that's the unofficial format of some smaller towns in Nebraska backwoods areas).

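Putting the two ideas from the thread together (strip everything but digits, then add the dashes back), a minimal sketch might look like this. The function name normalizePhoneNumber() is hypothetical and the blacklist checks are left out for brevity.

```php
<?php
// Hypothetical helper: accept any formatting, return ddd-ddd-dddd or null.
function normalizePhoneNumber($input)
{
    // Strip everything but the digits; this also removes surrounding whitespace.
    $digits = preg_replace('~[^0-9]~', '', $input);

    // Anything other than exactly 10 digits is rejected.
    if (strlen($digits) != 10) {
        return null;
    }

    // Re-insert the dashes in the desired positions.
    return substr($digits, 0, 3) . '-' . substr($digits, 3, 3) . '-' . substr($digits, 6);
}

echo normalizePhoneNumber(' [123]....(456)~7890~ '), "\n"; // 123-456-7890
echo normalizePhoneNumber('604 763 8470'), "\n";           // 604-763-8470
var_dump(normalizePhoneNumber('555-1234'));                // NULL
```

This way the form can show the ddd-ddd-dddd example to users without rejecting input that merely uses different punctuation.
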
xyph Posted April 10, 2012

Quoting .josh: "This already strips whitespaces and will effectively trim() it."

    // strip everything but the numbers
    $number = preg_replace("~[^0-9]~","",$number);

Oops! I just buzzed through your code. I made the assumption that you used str_replace('-','',$input) to avoid using the RegEx engine at all.

Next time, I'll look closer. I was stuck on the idea that this solution didn't need RegEx at all.

Shadowing Posted April 14, 2012

Ahh, thanks Josh. That's pretty simple.