steveclondon Posted February 12, 2007

I would like to make a regex so that I can take email addresses out of people's posts without taking out other information. I already have a regex for removing web addresses and email addresses, but would now like to catch all the alternative spellings that people might try. I just wondered if anyone had a regex for this, rather than me trying to think of everything people might type.
Jessica Posted February 12, 2007

Your title makes it sound like you want to extract emails in the "me at domain dot com" format, but your post sounds like you want to turn me@domain.com into that. That could be done with simple functions like strpos, substr, and str_replace, and I guess you could use an email regex to validate it. Here's the email regex I use:

    if (!preg_match("/^.+@[^\.].*\.[a-z]{2,}$/", $this->email)) {
        $msg = "Please enter a valid email.";
    }
steveclondon Posted February 12, 2007 (Author)

I don't think I have made myself clear. Users can post information on a website I am developing, but not their contact details, as there is a charge for that service. So I wanted a regex to catch some of the main ways people get around an automated process that is looking for an email address, such as:

me @me.com
me@ me.com
me at me dot com
me at me dot co dot uk
me (a) me dot com
me (at) me dot com
me (@) me . com

You get the idea. This must have been done before, and I just wondered if anyone had a regex for it rather than me writing one from scratch. I don't want it to take out any normal text, so I am aware there could be problems with "at" being used as a word.
steveclondon Posted February 12, 2007 (Author)

In fact, I have thought about it some more. I have the user's email address in the database, and chances are this is the one they will use. So I will break it down into its parts, such as "me", "me", "com". Then, if those parts appear in a post in that order with different characters between them, I know it is the email address and I will just remove the sequence.
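The idea in the post above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the function name `mask_known_email`, the `$maxGap` limit of 12 characters between parts, and the `-----` replacement are all assumptions, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper (not from the thread): mask a known address even when
// the separators between its parts have been replaced with other characters.
function mask_known_email(string $text, string $email, int $maxGap = 12): string
{
    // Split the stored address into its alphanumeric parts,
    // e.g. "steve@example.com" -> ["steve", "example", "com"].
    $parts = preg_split('/[^a-z0-9]+/i', $email, -1, PREG_SPLIT_NO_EMPTY);
    if (count($parts) < 2) {
        return $text; // nothing usable to search for
    }

    // Escape each part so it is matched literally.
    $quoted = array_map(function ($part) {
        return preg_quote($part, '/');
    }, $parts);

    // Allow a short run of arbitrary characters between consecutive parts.
    $gap = '.{1,' . $maxGap . '}?';
    $pattern = '/\b' . implode($gap, $quoted) . '\b/is';

    return preg_replace($pattern, '-----', $text);
}

echo mask_known_email(
    'Contact steve (at) example dot com for details',
    'steve@example.com'
), "\n";
// prints: Contact ----- for details
```

Short parts such as "me" also occur in ordinary prose, so a pattern like this can swallow normal text that happens to sit before the address. Keeping the allowed gap small limits that, but it needs testing against real posts.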
Jessica Posted February 12, 2007

You still are being confusing. You want them to not be able to post their email anywhere, so you're looking for that type of get-around? Is that what you're saying?
effigy Posted February 12, 2007

This, like your phone number post, is a sensitive issue. Below is an example that you should put through a great deal of testing if you decide to use it. It works for all of the entries except the last, because the "at" portion is required. (You could create another regex to look for emails without this part.) You also need to be sure that $domain_ends contains every possible ending you can find; without this information the regex will start consuming content.

Although this is a good start, a user can easily get around it by adding spaces to the content, for instance: "m y e m a i l @ m y d o m a i n . com." If you are running a pay site, you may want to add something about this to your TOS; I'm not sure how easy it is to get legal backing on this.

<pre>
<?php
// Sample inputs; the last one has no "at" portion and is not matched.
$emails = array(
    'me @me.com',
    'me@ me.com',
    'me at me dot com',
    'me at me dot co dot uk',
    'me (a) me dot com',
    'me (at) me dot com',
    'me (@) me . com',
    'me (@) me . co . uk',
    'me me com'
);

// A run of word characters: the local part or one domain label.
$content = '\w+';
// Character class: any run made up of '@', '(', ')', 'a' and 't'.
$at = '[@()at]+';
// The separator is optional, so whitespace alone can also divide the parts.
$dot = '(?:\.|dot)?';
// Add every possible domain ending here.
$domain_ends = '(?:net|com|uk|org|gov|edu)';

foreach ($emails as $email) {
    // Add some "content"
    $email = 'ABC DEF GHI ' . $email . ' JKL MNO PQR';
    echo $email, ' => ';
    $email = preg_replace(
        "/$content\s*$at\s*$content(?:\s*$dot\s*$content)*(?:\s*$dot\s*$domain_ends)/",
        '-----',
        $email
    );
    echo $email, '<br>';
}
?>
</pre>
Jessica Posted February 12, 2007

To counter the spaces thing, you could create a temporary string with all the spaces removed, and check that instead.
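A minimal sketch of that suggestion, assuming PHP 7 or later. The function name `contains_spaced_email` and the address pattern are illustrative assumptions, not code from the thread. It only reports whether the collapsed text contains something that looks like an address; it does not remove anything, because positions in the collapsed copy no longer line up with the original post.

```php
<?php
// Hypothetical helper (not from the thread): detect an address that has been
// disguised by inserting whitespace between its characters.
function contains_spaced_email(string $text): bool
{
    // Work on a temporary copy with all whitespace removed.
    $collapsed = preg_replace('/\s+/', '', $text);

    // Deliberately simple address pattern, used here only for detection.
    $pattern = '/[a-z0-9._%+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}/i';

    return preg_match($pattern, $collapsed) === 1;
}

var_dump(contains_spaced_email('m y e m a i l @ m y d o m a i n . com')); // bool(true)
var_dump(contains_spaced_email('Meet me at noon.'));                      // bool(false)
```

Collapsing whitespace can also glue unrelated words into something address-shaped (for example "email me @ home. come over"), so a hit is probably better treated as a flag for review than as grounds for automatic removal.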