
Comparing two lists of email addresses with PHP


ionicle


Hey again, everybody.

 

I need help solving yet another PHP task.

 

I have two lists of email addresses, List 1 and List 2. Every address in List 2 also appears in List 1.

 

I would like PHP to compare both lists and remove from List 1 every email address that appears in List 2, plus every email address at any domain name that appears in List 2.

 

For instance:

 

List 1:

 

[email protected]

[email protected]

[email protected]

[email protected]

[email protected]

[email protected]

 

 

List 2:

 

[email protected]

[email protected]

 

 

Once processing is complete, List 1 should look like this:

 

List 1:

 

[email protected]

[email protected]

[email protected]

 

How would I go about doing that?

That looks pretty convenient! The thing is, I not only need every address in List 2 removed from List 1, but also every address in List 1 whose domain name matches a domain name in List 2.

 

array_diff() wouldn't help with that, I guess.
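
To make the gap concrete, here is a minimal sketch with made-up addresses on example.* domains (the addresses and variable names are illustrative assumptions, not the lists from this thread). array_diff() removes only exact string matches, so another address at a listed domain survives:

```php
<?php
// Hypothetical sample data; not the lists from the post above.
$list1 = array('ann@example.com', 'bob@example.com', 'cat@example.org');
$list2 = array('ann@example.com');

// array_diff() compares whole strings, so only the exact match is removed.
// array_values() reindexes the result from 0.
$remaining = array_values(array_diff($list1, $list2));

print_r($remaining);
// bob@example.com is still present although example.com is listed in $list2.
```

So array_diff() covers the first condition only; the domain condition needs a separate check.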

Try this

<?php
    // LIST 1
    $list1 = array(
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]"
    );
    // LIST 2
    $list2 = array(
        "[email protected]",
        "[email protected]"
    );
    
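    // Collect the domain (everything after the last "@") of each address in List 2.
    function extractUnwantedDomains($list2) {
        $list2Domains = array();
        foreach($list2 as $email) {
            $atPosition = strrpos($email, '@');
            if($atPosition === false) {
                continue; // no "@" found, so there is no domain to collect
            }
            $list2Domains[] = strtolower(substr($email, $atPosition + 1));
        }
        return array_unique($list2Domains);
    }
    
    // Return List 1 without the addresses in List 2 and without any
    // address whose domain occurs in List 2.
    function extractFinalList($list1, $list2) {
        $finalList = array();
        $difference = array_diff($list1, $list2);
        $unwantedDomains = extractUnwantedDomains($list2);
        foreach($difference as $email) {
            $atPosition = strrpos($email, '@');
            $domain = ($atPosition === false) ? '' : strtolower(substr($email, $atPosition + 1));
            // Compare whole domains instead of running an unescaped regex,
            // which could also match inside the local part of an address.
            if(in_array($domain, $unwantedDomains, true)) {
                continue;
            }
            $finalList[] = $email;
        }
        return $finalList;
    }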
    
    $finalList = extractFinalList($list1, $list2);
    var_dump($finalList);

You could try using array_diff() to get the difference. And then adapt the solution provided a few days ago to get unique domains:

http://forums.phpfreaks.com/topic/284098-pick-a-random-email-address-out-of-a-maillist/?do=findComment&comment=1459187

Just satisfying your second condition (removing all entries whose domain is found in the second list) will also remove the specific emails in the first condition, because every address in the second list contributes its own domain.

 

The quickest code would be to preprocess the arrays of addresses so that you have an array of arrays, where the main key is the domain name (shown in pseudo code form, not the actual array) -

 

$array1['yahoo.com'] = array([email protected],[email protected])

$array1['gmail.com'] = array([email protected])

$array1['hotmail.com'] = array([email protected])

...

 

$array2 ....

 

Then (untested), you should be able to use array_diff_key() (rather than array_diff(), which compares values, not keys), and the main key values in the second list will remove all the corresponding domain entries in the first list.

using array_diff_key() -

 $list1 = array(
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]"
);

$list2 = array(
"[email protected]",
"[email protected]"
);

$array1 = array();
foreach($list1 as $email){
    list($name,$domain) = explode('@',$email);
    if(!isset($array1[$domain])){$array1[$domain] = array();}
    $array1[$domain][] = $email;
}

$array2 = array();
foreach($list2 as $email){
    list($name,$domain) = explode('@',$email);
    if(!isset($array2[$domain])){$array2[$domain] = array();}
    $array2[$domain][] = $email;
}

$result = array_diff_key($array1,$array2);

$final = array();
foreach($result as $arr){ // there should be a way to do this without a loop...
    $final = array_merge($final,$arr);
}

echo '<pre>';
print_r($final);
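
On the "without a loop" comment: one possible way, assuming PHP 5.6 or later for argument unpacking, is to hand all the grouped arrays to a single array_merge() call. The sample $result below is a hypothetical stand-in for the output of array_diff_key(), with made-up addresses.

```php
<?php
// Hypothetical stand-in for $result: addresses grouped by domain.
$result = array(
    'example.org' => array('cat@example.org', 'dan@example.org'),
    'example.net' => array('eve@example.net'),
);

// array_values() drops the domain keys so they are not unpacked as
// named arguments on PHP 8; the leading array() keeps the call valid
// when $result is empty.
$final = array_merge(array(), ...array_values($result));

print_r($final);
```

Whether this is clearer than the explicit loop is a matter of taste; the loop also works on older PHP versions.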

Turns out JIXO's code works. For some reason though, when I load up a large number of email addies in the arrays, it screws up and spits out a blank result. No clue why.

 

I got another reply on Stack Overflow:

 

 

 
<?php
 
// Return the part of the address after the "@".
function domain($email){
    $x=explode('@',$email);
    return $x[1];
}

$list1=array("[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]");//first list
$list2=array("[email protected]","[email protected]");//second list
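// Collect the domain of every address in the second list.
$black_domains=array();
foreach($list2 as $l2){
    $black_domains[]=domain($l2);
}
// Keep only the addresses whose domain is not blacklisted.
$new_list1=array();
foreach($list1 as $l1){
    $domain=domain($l1);
    if(!in_array($domain,$black_domains)){
        $new_list1[]=$l1;
    }
}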

print_r($new_list1); //this gives new list
 
?>
 

 

 

That one works like a charm, even with very large lists.
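
For very large lists, in_array() scans the whole blacklist for every address. A hedged variant of the same idea stores the domains as array keys, so each check is a single isset() lookup, and lowercases domains so that Example.COM and example.com match. The function names emailDomain and removeListedDomains and the sample addresses are assumptions for illustration, not part of the Stack Overflow answer.

```php
<?php
// Hypothetical helper: returns the part after the last '@', lowercased,
// or null when the string contains no '@'.
function emailDomain($email)
{
    $at = strrpos($email, '@');
    return $at === false ? null : strtolower(substr($email, $at + 1));
}

// Hypothetical helper: drops every address in $list1 whose domain
// appears anywhere in $list2. Strings without an '@' are kept as-is.
function removeListedDomains(array $list1, array $list2)
{
    $blocked = array();
    foreach ($list2 as $email) {
        $domain = emailDomain($email);
        if ($domain !== null) {
            $blocked[$domain] = true; // key lookup instead of in_array()
        }
    }

    $kept = array();
    foreach ($list1 as $email) {
        $domain = emailDomain($email);
        if ($domain === null || !isset($blocked[$domain])) {
            $kept[] = $email;
        }
    }
    return $kept;
}

// Made-up sample data.
$list1 = array('ann@example.com', 'bob@Example.COM', 'cat@example.org', 'dan@example.net');
$list2 = array('ann@example.com', 'eve@example.net');

print_r(removeListedDomains($list1, $list2));
```

With this sample data only cat@example.org remains, since example.com and example.net both occur in the second list.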
