



Matching company name against email


Well, unfortunately I just can't get the hang of regex, so I hope someone can help out.
I have a database of companies containing, among other things, company names, email addresses and URLs to their web sites.
However, a great many of them are not registered with URLs, and at a glance I notice that several of them do have an email address with a unique domain that also represents their company web site.

What I was thinking was to make a URL-suggest feature for those companies missing URLs IF the email address domain looks similar to the company name.

company: Digger A/S
email: blah@digger.no
= match

company: Ultra Star AS
email: blah@ultra.no
= match

company: Whatever A/S
email: blah@no.whatever.net
= match

company: Digger A/S
email: blah@yourfreehost.net
= NOT match

Any ideas for a regex for this feature?
And a step-by-step explanation of it would be great too...


$info = array();
$info[0]['company'] = 'Digger A/S';
$info[0]['email'] = 'blah@digger.no';

$info[1]['company'] = 'Ultra Star AS';
$info[1]['email'] = 'blah@ultra.no';

$info[2]['company'] = 'Whatever A/S';
$info[2]['email'] = 'blah@no.whatever.net';

$info[3]['company'] = 'Digger A/S';
$info[3]['email'] = 'blah@yourfreehost.net';

foreach ($info as $value) {
    // Capture the domain label just before the TLD into $match[1]
    if (preg_match('/[@.](\w+)\.\w{2,6}$/', $value['email'], $match)) {
        // Case-insensitive check: does the company name contain that label?
        if (strpos(strtolower($value['company']), strtolower($match[1])) !== FALSE) echo "$value[company] found in $value[email] \"$match[1]\"<br>\n";
        else echo "<b>No match</b> ($value[company], $value[email]) \"$match[1]\"<br>\n";
    }
}

The regular expression ('/[@.](\w+)\.\w{2,6}$/') finds the domain name and puts it in $match[1]. The code then checks whether that string occurs within the company name, comparing both in lower case. Note the use of !== here: strpos() returns offset 0 when the match is at the very start of the string, and a loose comparison would treat that 0 as FALSE.
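To see why the strict comparison matters, here is a minimal sketch using the first example company from the question. The variable names ($pos, $looseCheck, $strictCheck) are made up for illustration only.

```php
<?php
// strpos() returns the offset of the first occurrence, or false if there is none.
$company = strtolower('Digger A/S');
$pos = strpos($company, 'digger');   // found at the very start, so the offset is 0

$looseCheck  = ($pos != false);      // 0 is loosely equal to false, so this reports "not found"
$strictCheck = ($pos !== false);     // int 0 is not identical to bool false, so this reports "found"

var_dump($pos, $looseCheck, $strictCheck);
```

With the loose check, "Digger A/S" would wrongly be reported as not matching digger.no, which is exactly the case the !== guards against.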

Explanation of the regex (I'll explain it backwards):

$ - Anchoring the expression at the end of the line.
\w{2,6} - Two, three, four, five, or six "word characters." (TLD)
\. - A literal period (since "." means ANY character).
(\w+) - Capture one or more "word characters." (domain name)
[@.] - A character class containing the literal "@" and "." ("." loses its special meaning in a char class).

So this regular expression captures the part of the e-mail address before the top-level domain and after either a period or "@".
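As a quick check of the walkthrough above, this sketch runs the same pattern over the four example addresses from the question and collects what ends up in the capture group. The helper name domain_label() is hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper (not part of the code above): returns the label
// captured just before the TLD, or null when the pattern does not match.
function domain_label($email) {
    if (preg_match('/[@.](\w+)\.\w{2,6}$/', $email, $match)) {
        return $match[1];
    }
    return null;
}

$emails = array(
    'blah@digger.no',
    'blah@ultra.no',
    'blah@no.whatever.net',
    'blah@yourfreehost.net',
);

// Map each address to its captured label.
$labels = array();
foreach ($emails as $email) {
    $labels[$email] = domain_label($email);
}

print_r($labels);
```

The captured labels are digger, ultra, whatever and yourfreehost. Note that for no.whatever.net the "no." subdomain is skipped, because only the part directly before the TLD can satisfy the $ anchor. One limitation to keep in mind: \w does not match a hyphen, so an address like blah@my-company.no gives no match at all with this pattern.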

Make sense?

That is brilliant, thank you so much. After working with it and adjusting it to my needs, it is happily suggesting web addresses as we speak :)
I really appreciate your explanation, as it also helped me figure out another regex issue I was working on myself.

Thanks !


