Is there a certain set of standard allowable characters for an email address?

 

Here's what I have:

function check_input_email($value){
	if(get_magic_quotes_gpc()) $value = stripslashes($value);
	if(!ereg("^[a-zA-Z0-9.@-_]+$", $value)) return FALSE;
	$value = mysql_real_escape_string($value);       
	return $value;
}

The idea is to allow letters, numbers, @, period, dash, and underscore.

 

Or is it better to just block certain symbols for email addresses?

 

Thanks
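One thing worth checking in the character class above: inside square brackets, "@-_" is read as a range running from "@" to "_", not as three separate characters. The sketch below uses preg_match rather than ereg (the ereg family was removed in PHP 7), and the sample addresses are made up for illustration.

```php
<?php
// As posted: "@-_" is a range from "@" (0x40) to "_" (0x5F), so the hyphen
// itself is NOT allowed, while "[", "\", "]" and "^" are.
$asPosted = '/^[a-zA-Z0-9.@-_]+$/';

// Hyphen moved to the end of the class, where it is taken literally.
$intended = '/^[a-zA-Z0-9.@_-]+$/';

var_dump(preg_match($asPosted, 'first-last@example.com')); // int(0): hyphen rejected
var_dump(preg_match($asPosted, 'a[b]@example.com'));       // int(1): brackets slip through
var_dump(preg_match($intended, 'first-last@example.com')); // int(1)
var_dump(preg_match($intended, 'a[b]@example.com'));       // int(0)
```

Placing the hyphen first or last in the class (or escaping it) avoids the accidental range.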


<?php
function emailCheck($e){
if(!ereg("^[^@]{1,64}@[^@]{1,255}$", $e)) return false;
$e_array = explode("@", $e);
$local_array = explode(".", $e_array[0]);
for($i=0;$i<count($local_array);$i++) if(!ereg("^(([A-Za-z0-9!#$%&'*+/=?^_`{|}~-][A-Za-z0-9!#$%&'*+/=?^_`{|}~\.-]{0,63})|(\"[^(\\|\")]{0,62}\"))$", $local_array[$i])) return false;
if(!ereg("^\[?[0-9\.]+\]?$", $e_array[1])){
	$domain_array = explode(".", $e_array[1]);
	if(count($domain_array)<2) return false;
	for($i=0;$i<count($domain_array);$i++) if (!ereg("^(([A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9])|([A-Za-z0-9]+))$", $domain_array[$i])) return false;
}
return true;
}
?>

 

That is what I use :)


I won't pretend that I fully understand that, because I don't.  But, are there certain rules you are assuming?  Like there are only a certain number of periods before the @ sign, and after the @ sign?

 

Or, there should only be a certain number of letters or characters after the last period after the @ sign?

 

Also, are these the symbols that you are allowing?

!#$%&'*+/=?^_`{|}~- 

http://code.iamcal.com/php/rfc822/

 

People seem to speak highly of that code. I would suggest that.

 

Thanks for sharing the info. 

Wow, I thought the previous poster's code was hard to understand, but that stuff looks like it was coded in a different language from a different planet.

It said something about the periods not working right if you're using PHP 4.x.

On the Lunarpages server where my website is hosted, it says the PHP version is 4.something.

 

I would really like to just allow characters that are allowable for an email address.

Actually, after reading on the wiki that

!#$%&*+-/=?^_`{|}~

are allowed, I'm thinking I'll just let them type whatever they want there and keep it under 300 characters, even though the wiki says 64@255, which is really 320 (64 + 1 + 255).

 

I'm not building an email service.  Plus they have to have a valid email address anyway to register with my site.  I'll send them an email to validate their registration.
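A minimal sketch of that "don't restrict, just sanity-check" approach might look like the following. The function name is hypothetical, and the 64 and 255 limits are simply the figures quoted above; the real proof that an address works is still the confirmation email.

```php
<?php
// Hypothetical helper: a loose sanity check, not RFC validation.
function looks_like_email($email) {
    $at = strrpos($email, '@');              // position of the last "@"
    if ($at === false) {
        return false;                        // no "@" at all
    }
    $local  = (string) substr($email, 0, $at);
    $domain = (string) substr($email, $at + 1);

    // 64 (local part) + 1 ("@") + 255 (domain) = 320 characters at most.
    return strlen($local) >= 1 && strlen($local) <= 64
        && strlen($domain) >= 1 && strlen($domain) <= 255;
}

var_dump(looks_like_email('someone@example.com')); // bool(true)
var_dump(looks_like_email('no-at-sign'));          // bool(false)
```

Anything that passes would then be sent the validation message, so a mistyped or fake address never gets activated.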

 

 

The above example seems more complex than it needs to be in my opinion (couldn't open the external link). Here is a simple function I created to ensure an email is properly formatted.

 

function is_email($email) {

    $formatTest = '/^[-\w+]+(\.[-\w+]+)*@[-a-z\d]{2,}(\.[-a-z\d]{2,})*\.[a-z]{2,6}$/i';
    $lengthTest = '/^(.{1,64})@(.{4,255})$/';

    return (preg_match($formatTest, $email) && preg_match($lengthTest, $email));

}    

 

The above validates the following:

 

Format test

- Username accepts: 'a-z', 'A-Z', '0-9', '_' (underscore), '-' (dash), '+' (plus), & '.' (period)

      Note: cannot start or end with a period (and periods cannot be in succession)

- Domain accepts: 'a-z', 'A-Z', '0-9', '-' (dash), & '.' (period)

      Note: cannot start or end with a period (and periods cannot be in succession); each part between periods must be at least 2 characters

- TLD accepts: 'a-z' & 'A-Z' only, 2 to 6 letters

 

Length test

- Username: 1 to 64 characters

- Domain: 4 to 255 characters
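A few sample calls show how these rules play out. The function body is copied from the post above; the addresses are made up for illustration.

```php
<?php
function is_email($email) {
    $formatTest = '/^[-\w+]+(\.[-\w+]+)*@[-a-z\d]{2,}(\.[-a-z\d]{2,})*\.[a-z]{2,6}$/i';
    $lengthTest = '/^(.{1,64})@(.{4,255})$/';

    return (preg_match($formatTest, $email) && preg_match($lengthTest, $email));
}

var_dump(is_email('user.name+tag@example.com'));  // bool(true)
var_dump(is_email('.user@example.com'));          // bool(false): starts with a period
var_dump(is_email('user..name@example.com'));     // bool(false): two periods in a row
var_dump(is_email('user@x.com'));                 // bool(false): domain part "x" is under 2 characters
var_dump(is_email('user@example.photography'));   // bool(false): last part is longer than 6 letters
```

The last two cases are worth knowing about before using the pattern: single-character domain labels and TLDs longer than six letters are turned away.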


For users' usernames I want to only allow a-zA-Z0-9-_.

But, how do I ensure that it starts with a letter or number and ends with a letter or number?

 

Right now I just have:

 

 

if(!ereg("^[a-zA-Z0-9-_.]+$", $value)) return FALSE;

 

Thanks
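One way to pin down the first and last characters is to give them their own character class and make the middle optional. This is only a sketch: the function name is hypothetical, it uses preg_match instead of ereg, and it assumes plain ASCII usernames.

```php
<?php
// Hypothetical helper name.
function is_valid_username($value) {
    // First character: letter or digit.
    // Optional rest: any mix of letters, digits, ".", "_", "-",
    // as long as the final character is again a letter or digit.
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[a-z0-9](?:[a-z0-9._-]*[a-z0-9])?$/iD', $value) === 1;
}

var_dump(is_valid_username('john.doe_99')); // bool(true)
var_dump(is_valid_username('a'));           // bool(true): a single letter is both first and last
var_dump(is_valid_username('_john'));       // bool(false): starts with an underscore
var_dump(is_valid_username('john-'));       // bool(false): ends with a dash
```

A minimum or maximum length, if wanted, could be added as a separate strlen check.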

 

 

 


Why would you not allow '+' (plus) or '.' (period)? Those are perfectly acceptable characters for an email address, and a period is very common.

 

But you can modify it if you wish. This should do what you asked, but I haven't tested it:

 

function is_email($email) {

    $formatTest = '/^[a-z0-9][-\w]+@[-a-z\d]{2,}(\.[-a-z\d]{2,})*\.[a-z]{2,6}$/i';
    $lengthTest = '/^(.{1,64})@(.{4,255})$/';

    return (preg_match($formatTest, $email) && preg_match($lengthTest, $email));

}    


I think that is a completely different topic; he is just trying to use this thread to get a different question answered.


I'm sorry, I meant for their username with my site.

For their email address, since there are so many things allowed, I just figure let them type in whatever they want.  I've decided not to restrict their email.  Since I validate their email address anyway, they have to use a real one.

 

I thought I was allowing periods:

if(!ereg("^[a-zA-Z0-9-_.]+$", $value)) return FALSE;

 

Is there a website that says what all the numbers in the braces mean?

 

Like for instance,

$formatTest = '/^[a-z0-9][-\w]+@[-a-z\d]{2,}(\.[-a-z\d]{2,})*\.[a-z]{2,6}$/i';

    $lengthTest = '/^(.{1,64})@(.{4,255})$/';

 

I understand that the 64 probably means allow up to 64 characters, and the 255 probably means allow up to 255 characters, but I don't understand all of the rules.

Is there a place that explains all the rules?
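For what it's worth, the numbers in braces are repetition counts: {n,m} means "between n and m of whatever comes just before", and {n,} means "n or more". The PHP manual's PCRE pattern syntax pages cover these along with the other symbols (^ and $ for start and end, [...] for a set of characters, + and * for "one or more" and "zero or more"). A small sketch, with a hypothetical helper and made-up inputs:

```php
<?php
// Hypothetical helper so each call reads as a yes/no answer.
function matches($pattern, $subject) {
    return preg_match($pattern, $subject) === 1;
}

// {2,6}  : at least 2 and at most 6 of the preceding item
// {2,}   : 2 or more, no upper limit
// {1,64} : between 1 and 64
var_dump(matches('/^[a-z]{2,6}$/', 'com'));      // bool(true): 3 letters
var_dump(matches('/^[a-z]{2,6}$/', 'c'));        // bool(false): fewer than 2
var_dump(matches('/^[a-z]{2,6}$/', 'example'));  // bool(false): 7 letters, more than 6
var_dump(matches('/^[a-z]{2,}$/', 'example'));   // bool(true): no upper limit
var_dump(matches('/^.{1,64}@/', str_repeat('a', 65) . '@x')); // bool(false): 65 characters before the @
```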

Reading will help.

 

http://www.faqs.org/rfcs/rfc822.html

 

That is the standard for an email address, which is why I referred you to the function below. It checks that the email is valid by the actual standards for an email address, which is exactly what you want, isn't it?

 

<?php

    function is_valid_email_address($email){
        $no_ws_ctl    = "[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f]";
        $alpha        = "[\\x41-\\x5a\\x61-\\x7a]";
        $digit        = "[\\x30-\\x39]";
        $cr        = "\\x0d";
        $lf        = "\\x0a";
        $crlf        = "($cr$lf)";

        $obs_char    = "[\\x00-\\x09\\x0b\\x0c\\x0e-\\x7f]";
        $obs_text    = "($lf*$cr*($obs_char$lf*$cr*)*)";
        $text        = "([\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f]|$obs_text)";
        $obs_qp        = "(\\x5c[\\x00-\\x7f])";
        $quoted_pair    = "(\\x5c$text|$obs_qp)";

        $wsp        = "[\\x20\\x09]";
        $obs_fws    = "($wsp+($crlf$wsp+)*)";
        $fws        = "((($wsp*$crlf)?$wsp+)|$obs_fws)";
        $ctext        = "($no_ws_ctl|[\\x21-\\x27\\x2A-\\x5b\\x5d-\\x7e])";
        $ccontent    = "($ctext|$quoted_pair)";
        $comment    = "(\\x28($fws?$ccontent)*$fws?\\x29)";
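        $cfws        = "(($fws?$comment)*($fws?$comment|$fws))";
        # Redefined on purpose: comments are stripped by the loop below before matching,
        # so CFWS only needs to cover folding whitespace here.
        $cfws        = "$fws*";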

        $atext        = "($alpha|$digit|[\\x21\\x23-\\x27\\x2a\\x2b\\x2d\\x2f\\x3d\\x3f\\x5e\\x5f\\x60\\x7b-\\x7e])";
        $atom        = "($cfws?$atext+$cfws?)";

        $qtext        = "($no_ws_ctl|[\\x21\\x23-\\x5b\\x5d-\\x7e])";
        $qcontent    = "($qtext|$quoted_pair)";
        $quoted_string    = "($cfws?\\x22($fws?$qcontent)*$fws?\\x22$cfws?)";
        $word        = "($atom|$quoted_string)";

        $obs_local_part    = "($word(\\x2e$word)*)";
        $obs_domain    = "($atom(\\x2e$atom)*)";

        $dot_atom_text    = "($atext+(\\x2e$atext+)*)";
        $dot_atom    = "($cfws?$dot_atom_text$cfws?)";

        $dtext        = "($no_ws_ctl|[\\x21-\\x5a\\x5e-\\x7e])";
        $dcontent    = "($dtext|$quoted_pair)";
        $domain_literal    = "($cfws?\\x5b($fws?$dcontent)*$fws?\\x5d$cfws?)";

        $local_part    = "($dot_atom|$quoted_string|$obs_local_part)";
        $domain        = "($dot_atom|$domain_literal|$obs_domain)";
        $addr_spec    = "($local_part\\x40$domain)";

        $done = 0;

        while(!$done){
            $new = preg_replace("!$comment!", '', $email);
            if (strlen($new) == strlen($email)){
                $done = 1;
            }
            $email = $new;
        }


        #
        # now match what's left
        #

        return preg_match("!^$addr_spec$!", $email) ? 1 : 0;
    }
?>

 

I removed most of the comments; if you want the originals, see the page itself.

 

That code essentially validates any input against the rules laid out in the above document. Read that document to figure out what this function is actually checking.
