doni49 Posted December 25, 2010

I'm writing a script that will process incoming email messages. The subject line of every message that I'm interested in processing will be formatted like this:

SMS from Sender's Name [(740) 123-4567]

The name of the sender (the "Sender's Name" part above) may or may not be there--but the rest of the subject will be as shown. I need to extract the phone number without the parentheses or hyphen--i.e. given the above, I need to retrieve 7401234567.

The code that I've been trying produces the error message that I've shown below my snippet.

$regex = "/\nSubject: SMS from .{0,}\[\((\d{3})\)\s(\d{3})-(\d{4})\])\n/imU";
if (preg_match($regex,$content,$matches)){ //<=this is line 91 (see the error message below)
    echo "Matches: <br><pre>";
    print_r ($matches);
    echo "</pre><br><br><br><br><hr>";
}

Warning: preg_match() [function.preg-match]: Compilation failed: unmatched parentheses at offset 62 in /home/doni49/public_html/SMSTest.php on line 91

Also, I need to retrieve the email address listed in the From field--I need the email address without the person's name. Given the header line below, the snippet that follows it produces the output quoted at the end--basically the entire contents of the From field with the exception of the greater-than symbol.

From: "Don Ingram" <doni49@mydomain.com>

$regex = "/\nFrom: [.^<]{0,}<{0,}(.*)>{0,}\n/imU";
if (preg_match($regex,$content,$matches)){
    echo "Matches: <br><pre>";
    print_r ($matches);
    echo "</pre><br><br><br><br><hr>";
}

Output:

"Don Ingram" <doni49@gmail.com

TIA!
requinix Posted December 25, 2010

Regarding the warning, it says right there "unmatched parentheses". There's at least one extra ( or ). In your case, an extra ).

Unless I'm missing something, these should do it:

/^Subject: SMS from .* \[\((\d\d\d)\) (\d\d\d)-(\d\d\d\d)\]$/im
/^From: .* <([^>]+)>$/im
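For illustration, here is a minimal sketch of the fix described above: the original pattern with the extra ) removed, and the three captured groups joined into one digit string. The sample $content is made up, and the sketch assumes the message uses plain \n line endings (with \r\n endings the $ anchor would need adjusting). The .* is followed directly by \[ so the subject matches with or without a sender name.

```php
<?php
// Hypothetical sample input; a real script would already have $content.
$content = "To: someone@example.com\n"
         . "Subject: SMS from Sender's Name [(740) 123-4567]\n"
         . "From: \"Don Ingram\" <doni49@mydomain.com>\n";

// Same idea as the original pattern, minus the stray ")" after "\]".
// With the m flag, ^ and $ match at line boundaries, so no literal "\n" is needed.
$regex = '/^Subject: SMS from .*\[\((\d{3})\)\s(\d{3})-(\d{4})\]$/im';

$phone = null;
if (preg_match($regex, $content, $matches)) {
    // $matches[1], [2] and [3] hold the three digit groups.
    $phone = $matches[1] . $matches[2] . $matches[3];
}

echo $phone, "\n";
```

With the sample input above, $phone ends up as 7401234567; if the subject line is missing or formatted differently, it stays null.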
.josh Posted December 25, 2010

IMO you should simplify it by breaking it down into multiple regexes. Instead of trying to match parentheses and hyphens and spaces that may or may not be there, simply match anything between [ and ] and then strip the non-numbers out. This will also help you in the future, in case other phone number formats are introduced.

Example:

$subject = "Subject: SMS from John Doe [(123) 457-7890]";
preg_match('~Subject:[^\[]*\[([^\]]*)\]~',$subject,$number);
$number = preg_replace('~[^0-9]~','',$number[1]);

As for the email address... you say you have

From: "Don Ingram" <doni49@mydomain.com>

and you just want the address inside the angle brackets?

$subject = 'From: "Don Ingram" <doni49@mydomain.com>';
preg_match('~From:[^<]*<([^>]*)>~',$subject,$email);
$email = $email[1];
doni49 Posted December 25, 2010

Thanks guys. Yes, I understood that it was having a problem with one or more mismatched parentheses. But I had been reading and re-reading my code for more than an hour before posting this thread, so I asked for help.

Part of my issue was dealing with the fact that the person's name MAY OR MAY NOT be there (in both the phone number subject line and the email From line).

After posting, I went back to the regex docs and tutorials and think I have a better understanding of how lookarounds and backrefs work. I've read them in the past and could never get my head around them.

I'll have another look at this tonight. And yes, I'll look at splitting this into two separate searches.
.josh Posted December 25, 2010

my patterns will work whether or not there is a name.
doni49 Posted December 26, 2010

> my patterns will work whether or not there is a name.

Thanks for the help. But it's still not returning the email address if there is no name. Following is a snippet of code from my php file. The lines that I've got commented out show some of the things I've tried. For now, it's fine because almost every message that needs to be processed will have a name.

//preg_match('~\nFrom:([^<]<([^>]*)>|([^<]*@[^>]*))\n~mU',$content,$email);
//preg_match('~\nFrom:([^<]<([^>]*)>\n|\nTo:([^<]*@[^>]*))\n~mU',$content,$email);
preg_match('~\nFrom:[^<]*<([^>]*)>\n~mU',$content,$email);
//preg_match('~From:([^<]*@.*)\n~mU',$content,$email);
$from = $email[1];

When $content contains the following line, the pattern that's not commented out finds the email address:

From: Don Ingram<doni49@gmail.com>

When $content contains the following line, NONE of the attempts that I've made finds the email address:

From: doni49@gmail.com
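One possible way to cover both From forms shown above is a single pattern with two alternatives: an address inside angle brackets after an optional name, or a bare address with no brackets. This is only a sketch. The function name extractFromAddress is made up for illustration, the nullable type hints assume PHP 7.1 or later, and the bare-address branch is a loose "something@something" check rather than a full address validator.

```php
<?php
// Hypothetical helper: returns the address from the first From: line, or null if none is found.
function extractFromAddress(string $content): ?string
{
    // Alternative 1: optional display name, then <address>  -> group 1
    // Alternative 2: a bare address with no angle brackets   -> group 2
    $pattern = '/^From:\s*(?:[^<\r\n]*<([^>\r\n]+)>|([^\s<>]+@[^\s<>]+))\s*$/im';

    if (!preg_match($pattern, $content, $m)) {
        return null;
    }

    // When the bare form matches, group 1 is present but empty and group 2 holds the address.
    return $m[1] !== '' ? $m[1] : $m[2];
}

echo extractFromAddress("From: Don Ingram<doni49@gmail.com>\n"), "\n";
echo extractFromAddress("From: doni49@gmail.com\n"), "\n";
```

Keeping the two forms as separate alternatives means the bracketed case never has to guess where the name ends, and the bare case only kicks in when there is no < on the line.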
sasa Posted December 26, 2010

For the e-mail:

$pattern = '/From:.*?([A-Za-z0-9!#$%&\'*+-\/=?^_`{|}~\.]+@[A-Za-z0-9!#$%&\'*+-\/=?^_`{|}~\.]+)/';