Matching company name against email


2 replies to this topic

#1 alpine


Posted 20 June 2006 - 05:04 AM

Well, unfortunately I just can't get the hang of regex, so I hope someone can help out.
I have a database of companies containing, among other things, company names, email addresses and URLs to their web sites.
However, many of them are registered without a URL, and at a glance I notice that several of them have an email address on a unique domain that also represents their company web site.

What I was thinking was to make a URL-suggest feature for those companies missing a URL IF the email address domain looks similar to the company name.

example:
company: Digger A/S
email: blah@digger.no
= match

example:
company: Ultra Star AS
email: blah@ultra.no
= match

example:
company: Whatever A/S
email: blah@no.whatever.net
= match

example:
company: Digger A/S
email: blah@yourfreehost.net
= NOT match

Any ideas for a regex for this feature?
A step-by-step explanation of it would be great too...

#2 Wildbug


Posted 20 June 2006 - 02:22 PM

<?php

$info = array();
$info[0]['company'] = 'Digger A/S';
$info[0]['email'] = 'blah@digger.no';

$info[1]['company'] = 'Ultra Star AS';
$info[1]['email'] = 'blah@ultra.no';

$info[2]['company'] = 'Whatever A/S';
$info[2]['email'] = 'blah@no.whatever.net';

$info[3]['company'] = 'Digger A/S';
$info[3]['email'] = 'blah@yourfreehost.net';


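foreach ($info as $value) {
    // Capture the label just before the TLD (e.g. "digger" in blah@digger.no)
    if (preg_match('/[@.](\w+)\.\w{2,6}$/', $value['email'], $match)) {
        // Strict !== is needed because strpos() returns 0 for a match at offset 0
        if (strpos(strtolower($value['company']), strtolower($match[1])) !== FALSE) {
            echo "$value[company] found in $value[email] \"$match[1]\"<br>\n";
        } else {
            echo "<b>No match</b> ($value[company], $value[email]) \"$match[1]\"<br>\n";
        }
    }
}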


?>

The regular expression ("/[@.](\w+)\.\w{2,6}$/") finds the domain name and puts it in $match[1]. The code then checks whether that captured name occurs anywhere in the company name string, case-insensitively. Note the use of !== FALSE: strpos() returns offset 0 when the name is found at the very start of the string, and a loose comparison would treat that 0 as "not found".
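The zero-offset point can be seen with a quick sketch, using the company name from the first example in the question:

```php
<?php
// strpos() returns the offset of the first occurrence, or false if absent.
$offset = strpos(strtolower('Digger A/S'), 'digger');

var_dump($offset);            // int(0): found at the very start of the string
var_dump($offset != false);   // bool(false): loose comparison treats 0 as "not found"
var_dump($offset !== false);  // bool(true): strict comparison reports the match
```

So with a loose comparison, "Digger A/S" against "digger" would wrongly be reported as no match.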

Explanation of the regex (I'll explain it backwards):

$ - Anchoring the expression at the end of the line.
\w{2,6} - Two, three, four, five, or six "word characters." (TLD)
\. - A literal period (since "." means ANY character).
(\w+) - Capture one or more "word characters." (domain name)
[@.] - A character class containing the literal "@" and "." ("." loses its special meaning in a char class).

So this regular expression captures the part of the e-mail address before the top-level domain and after either a period or "@".
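To see what actually gets captured, here is a minimal sketch that wraps the same pattern in a small function. The name domain_label() is made up for illustration and is not part of the code above, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original post): returns the label that
// sits just before the TLD, or null when the address does not fit the pattern.
function domain_label(string $email): ?string
{
    if (preg_match('/[@.](\w+)\.\w{2,6}$/', $email, $match)) {
        return $match[1];
    }
    return null;
}

// Addresses taken from the examples in the question.
foreach (['blah@digger.no', 'blah@no.whatever.net', 'blah@yourfreehost.net'] as $email) {
    echo $email, ' => ', domain_label($email), "\n";
}
```

This prints digger, whatever and yourfreehost for the three addresses. Two limits worth knowing: \w does not include a hyphen, so an address like blah@my-company.no does not match at all and the helper returns null, and for a two-part suffix such as blah@digger.co.uk the captured label is "co" rather than "digger".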

Make sense?

Twice a day my clock works PERFECTLY!  I can't figure out what's wrong with it.

#3 alpine


Posted 21 June 2006 - 08:07 PM

That is brilliant, thank you so much. After working with it and adjusting it to my needs, it is happily suggesting web addresses as we speak :)
I really appreciate your explanation, as it also helped me figure out another regex issue I was working on by myself.

Thanks !



