Pulling out Phone Number and Email Addresses from an email message


I'm writing a script that will process incoming email messages.  The subject line of every message that I'm interested in processing will be formatted like this:

 

SMS from Sender's Name [(740) 123-4567]

 

The sender's name may or may not be there, but the rest of the subject will be as shown.  I need to extract the phone number without the parentheses or hyphen, i.e. given the above, I need to retrieve 7401234567.  The code that I've been trying produces the error message shown below my snippet.

$regex = "/\nSubject: SMS from .{0,}\[\((\d{3})\)\s(\d{3})-(\d{4})\])\n/imU";
if (preg_match($regex,$content,$matches)){   //<=this is line 91 (see the error message below)
  echo "Matches:  <br><pre>";
  print_r ($matches);
  echo "</pre><br><br><br><br><hr>";
}

 

Warning: preg_match() [function.preg-match]: Compilation failed: unmatched parentheses at offset 62 in /home/doni49/public_html/SMSTest.php on line 91

 

Also, I need to retrieve the email address listed in the From field--I need the email address without the person's name.  The snippet shown below produces what's in the quote below that--basically the entire contents of the From field with the exception of the greater than symbol.

From: "Don Ingram" <doni49@mydomain.com>

$regex = "/\nFrom: [.^<]{0,}<{0,}(.*)>{0,}\n/imU";

if (preg_match($regex,$content,$matches)){
  echo "Matches:  <br><pre>";
  print_r ($matches);
  echo "</pre><br><br><br><br><hr>";
}

 

"Don Ingram" <doni49@gmail.com

 

TIA!

Regarding the warning, it says right there "unmatched parentheses". There's at least one extra ( or ). In your case, an extra ).

 

Unless I'm missing something,

/^Subject: SMS from .* \[\((\d\d\d)\) (\d\d\d)-(\d\d\d\d)\]$/im

/^From: .* <([^>]+)>$/im
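
To get the single 10-digit string the original post asks for, the three captured groups from the Subject pattern can be concatenated. A minimal sketch, assuming `$content` holds the raw message with `\n` line endings; the sample message is made up for illustration, and the literal space before `\[` is dropped here so that a subject with no sender name still matches:

```php
<?php
// Hypothetical sample message; real input would come from the mail source.
$content = "From: \"Don Ingram\" <doni49@mydomain.com>\n"
         . "Subject: SMS from Sender's Name [(740) 123-4567]\n"
         . "\n"
         . "Message body\n";

$phone = null;
// .* matches zero or more characters, so the sender's name may be absent.
if (preg_match('/^Subject: SMS from .*\[\((\d{3})\) (\d{3})-(\d{4})\]$/im', $content, $m)) {
    // $m[1], $m[2], $m[3] hold the area code, prefix, and line number.
    $phone = $m[1] . $m[2] . $m[3];
}

$email = null;
// Capture whatever sits between the angle brackets at the end of the From line.
if (preg_match('/^From: .*<([^>]+)>$/im', $content, $m)) {
    $email = $m[1];
}

echo $phone, "\n", $email, "\n";
```

With the `m` modifier, `^` and `$` anchor to line boundaries, so there is no need to match `\n` explicitly on either side.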

IMO you should simplify it by breaking it down into multiple regexes.  Instead of trying to match parenthesis and hyphens and spaces that may or may not be there, simply match anything between [ and ] and then strip non-numbers out.  This will also help you in the future, in case other phone number formats are introduced.

 

Example:

$subject = "Subject: SMS from John Doe [(123) 457-7890]";
preg_match('~Subject:[^\[]*\[([^\]]*)\]~',$subject,$number);
$number = preg_replace('~[^0-9]~','',$number[1]);
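
Wrapped in a function, the same two-step idea might look like the sketch below. The name `extractPhoneDigits` is invented for this example, it assumes PHP 7.1 or later (for the nullable return type), and it assumes the phone number is the only bracketed text on the Subject line.

```php
<?php
/**
 * Returns the digits found between [ and ] on the Subject line,
 * or null if no such line exists. (Hypothetical helper.)
 */
function extractPhoneDigits(string $content): ?string
{
    // With the m flag, ^ anchors to the start of any line in the message.
    // The \r\n exclusions keep the match from running onto the next line.
    if (!preg_match('~^Subject:[^\[\r\n]*\[([^\]\r\n]*)\]~mi', $content, $match)) {
        return null;
    }
    // Strip everything that is not a digit: parentheses, spaces, hyphens, dots.
    return preg_replace('~\D~', '', $match[1]);
}

echo extractPhoneDigits("Subject: SMS from John Doe [(123) 457-7890]"), "\n"; // 1234577890
echo extractPhoneDigits("Subject: SMS from [123.457.7890]"), "\n";            // 1234577890
var_dump(extractPhoneDigits("Subject: no number here"));                      // NULL
```

Returning null on a failed match avoids reading `$number[1]` when the Subject line is missing or has no brackets.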

 

As for the email address... you say you have

 

From: "Don Ingram" <doni49@mydomain.com>

 

and you just want the address between the angle brackets?

 

$subject = 'From: "Don Ingram" <doni49@mydomain.com>';
preg_match('~From:[^<]*<([^>]*)>~',$subject,$email);
$email = $email[1];

Thanks guys.  Yes, I understood that it was having a problem with one or more mismatched parentheses.  But I had been reading and re-reading my code for more than an hour before posting this thread, so I asked for help.  Part of my issue was dealing with the fact that the person's name MAY OR MAY NOT be there (for both the phone number and the email).

 

After posting, I went back to the regex docs and tutorials and think I have a better understanding of how lookarounds and backrefs work.  I've read them in the past and could never get my head around them.

 

I'll have another look at this tonight.  And yes, I'll look at separating the searches into two separate searches.

My patterns will work whether or not there is a name.

 

Thanks for the help.  But it's still not returning the email address if there is no name.  Following is a snippet of code from my PHP file.  The commented-out lines show some of the things I've tried.  For now, it's fine because almost every message that needs to be processed will have a name.

//preg_match('~\nFrom:([^<]<([^>]*)>|([^<]*@[^>]*))\n~mU',$content,$email);
//preg_match('~\nFrom:([^<]<([^>]*)>\n|\nTo:([^<]*@[^>]*))\n~mU',$content,$email);
preg_match('~\nFrom:[^<]*<([^>]*)>\n~mU',$content,$email);
//preg_match('~From:([^<]*@.*)\n~mU',$content,$email);
$from = $email[1];

 

When $content contains the following line, the preg_match call that's not commented out finds the email address.

From: Don Ingram<doni49@gmail.com>

 

When $content contains the following line, NONE of the attempts that I've made will find the email address.

From: doni49@gmail.com
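
One way to cover both header forms is to make the angle brackets optional and capture only the address-looking token. This is a rough sketch, not a full RFC 5322 parser: the function name `extractFromAddress` is invented for this example, and it assumes the From header sits on a single line, the address contains no spaces, and nothing follows the address on that line.

```php
<?php
/**
 * Returns the address from the From line, with or without a display name
 * and with or without angle brackets, or null if none is found.
 * (Hypothetical helper.)
 */
function extractFromAddress(string $content): ?string
{
    // [^\r\n]*? lazily skips an optional display name on the same line.
    // The angle brackets on either side of the capture are optional.
    // The capture is a run of non-space, non-bracket, non-quote characters around an @.
    // \s*$ tolerates trailing whitespace, including the \r of a \r\n line ending.
    $pattern = '~^From:[^\r\n]*?<?([^\s<>"]+@[^\s<>"]+)>?\s*$~mi';
    return preg_match($pattern, $content, $match) ? $match[1] : null;
}

echo extractFromAddress("From: doni49@gmail.com\n"), "\n";
echo extractFromAddress("From: Don Ingram<doni49@gmail.com>\n"), "\n";
echo extractFromAddress("From: \"Don Ingram\" <doni49@mydomain.com>\n"), "\n";
```

The pattern in the uncommented `preg_match` call above requires a literal `<`, which is why it cannot match a bare `From: doni49@gmail.com` line.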