Matching company name against email


alpine

Well, unfortunately I just can't get the hang of regex, so I hope someone can help out.
I have a database of companies containing, among other things, company names, email addresses and URLs to their web sites.
However, many of them are not registered with a URL, and at a glance I notice that several of them have an email address on a unique domain that also matches their company web site.

What I was thinking was to make a URL-suggest feature for those companies missing URLs IF the email address domain looks similar to the company name.

example:
company: Digger A/S
email: blah@digger.no
= match

example:
company: Ultra Star AS
email: blah@ultra.no
= match

example:
company: Whatever A/S
email: blah@no.whatever.net
= match

example:
company: Digger A/S
email: blah@yourfreehost.net
= NOT match

Any ideas for a regex for this feature?
A step-by-step explanation of it would be great too...

[code]<?php

$info = array();
$info[0]['company'] = 'Digger A/S';
$info[0]['email'] = 'blah@digger.no';

$info[1]['company'] = 'Ultra Star AS';
$info[1]['email'] = 'blah@ultra.no';

$info[2]['company'] = 'Whatever A/S';
$info[2]['email'] = 'blah@no.whatever.net';

$info[3]['company'] = 'Digger A/S';
$info[3]['email'] = 'blah@yourfreehost.net';


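foreach ($info as $value) {
    // Capture the label just before the TLD, e.g. "digger" from blah@digger.no.
    if (preg_match('/[@.](\w+)\.\w{2,6}$/', $value['email'], $match)) {
        // Case-insensitive substring check; compare with !== false because a hit at offset 0 is falsy.
        if (stripos($value['company'], $match[1]) !== false) {
            echo "$value[company] found in $value[email] \"$match[1]\"<br>\n";
        } else {
            echo "<b>No match</b> ($value[company], $value[email]) \"$match[1]\"<br>\n";
        }
    }
}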


?>[/code]

The regular expression ("/[@.](\w+)\.\w{2,6}$/") finds the domain name and puts any match in the $match[1] variable, then the code checks for the existence of the match within the company name string (note the use of !== since you might get a zero offset).
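If you want to reuse the check for a URL-suggest feature, here is a minimal sketch of how it could be wrapped in a function. The name suggestUrl() and the "http://www." prefix are my own assumptions, not anything your database requires, and the pattern has a second capture group added for the TLD so the full domain can be rebuilt. Treat the result as a suggestion to be reviewed, not a verified address.

```php
<?php
// Hypothetical helper (not part of the original code): returns a suggested
// URL when the domain label appears in the company name, otherwise null.
function suggestUrl(string $company, string $email): ?string
{
    // Same pattern as in the post, with the TLD captured as well.
    if (!preg_match('/[@.](\w+)\.(\w{2,6})$/', $email, $match)) {
        return null;
    }
    // Case-insensitive check; === false because offset 0 is a valid hit.
    if (stripos($company, $match[1]) === false) {
        return null;
    }
    // Assumption: the site lives at www.<label>.<tld>.
    return 'http://www.' . strtolower($match[1] . '.' . $match[2]);
}

// Usage: prints the suggestion for the first example from the question.
var_dump(suggestUrl('Digger A/S', 'blah@digger.no'));
```

Returning null for "no suggestion" keeps the caller simple: only rows with a non-null result need to be shown to whoever confirms the URLs.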

Explanation of the regex (I'll explain it backwards):

$ - Anchoring the expression at the end of the line.
\w{2,6} - Two, three, four, five, or six "word characters." (TLD)
\. - A literal period (since "." means ANY character).
(\w+) - Capture one or more "word characters." (domain name)
[@.] - A character class containing the literal "@" and "." ("." loses its special meaning in a char class).

So this regular expression captures the part of the e-mail address before the top-level domain and after either a period or "@".
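To see what actually ends up in the capture group, a small test script like the one below may help. It is only a sketch: capturedLabel() is a hypothetical helper, and the last two addresses are made-up examples (not from your data) chosen to show where the pattern falls short, namely two-part suffixes such as .co.uk and hyphenated domains, since \w does not match a hyphen.

```php
<?php
// The pattern from the post, unchanged.
$pattern = '/[@.](\w+)\.\w{2,6}$/';

// Hypothetical helper: returns the captured label, or null if nothing matched.
function capturedLabel($pattern, $email)
{
    return preg_match($pattern, $email, $m) ? $m[1] : null;
}

$emails = array(
    'blah@digger.no',
    'blah@no.whatever.net',
    'blah@digger.co.uk',   // made-up: two-part suffix, captures "co"
    'blah@ultra-star.no',  // made-up: hyphen is not a \w character
);

foreach ($emails as $email) {
    $label = capturedLabel($pattern, $email);
    echo $email, ' => ', ($label === null ? '(no match)' : $label), "\n";
}
```

If addresses like the last two show up in your data, the pattern would need adjusting, for example by allowing hyphens in the character class.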

Make sense?

That is brilliant, thank you so much. After working with it and adjusting it to my needs, it is happily suggesting web addresses as we speak :)
I really appreciate your explanation, as it also helped me figure out another regex issue I was working on by myself.

Thanks!