array_diff help!


Gho§t

I have a very simple script which should be working with no problems.  I have 2 arrays, and I'm simply trying to array_diff them to get the results from array1 which are not present in array2.

 

$scrubbed = array_diff($numbers, $dnc);

print_r($numbers);
print_r($dnc);
print_r($scrubbed);

 

The results:

Array
(
    [0] => 3042476001
    [1] => 3042476002
    [2] => 3042476003
    [3] => 3042476004
    [4] => 3042476005
    [5] => 3042476006
    [6] => 3042476007
    [7] => 3042476008
    [8] => 3042476009
    [9] => 3042476010
    [10] => 3042476011
    [11] => 3042476012
    [12] => 3042476013
    [13] => 3042476014
    [14] => 3047655291
    [15] => 3047281328
)
Array
(
    [0] => 3042476068
    [1] => 3047281328
    [2] => 3047652846
    [3] => 3047652982
    [4] => 3047655291
    [5] => 3047655398
    [6] => 3047655458
    [7] => 3047655719
    [8] => 3047657562
    [9] => 3047657747
)
Array
(
    [0] => 3042476001
    [1] => 3042476002
    [2] => 3042476003
    [3] => 3042476004
    [4] => 3042476005
    [5] => 3042476006
    [6] => 3042476007
    [7] => 3042476008
    [8] => 3042476009
    [9] => 3042476010
    [10] => 3042476011
    [11] => 3042476012
    [12] => 3042476013
    [13] => 3042476014
    [14] => 3047655291
    [15] => 3047281328
)

 

 

Now, $numbers is drawn from a single-column .csv file, and $dnc is concatenated from between 1 and 3 MySQL database queries (there are 3 separate Do Not Call lists).  Two numbers in $numbers (3047655291 and 3047281328) also appear in $dnc and should have been removed, but were not.  I have tried intval() with array_walk() to make sure all the types are the same, and I have also tried trim() with array_walk() in case there were any extra characters not showing up in the print_r() output.  I am at quite a loss here :(  Can someone help?

 

 

 

EDIT:

I have also tried the following workaround functions, one at a time, with the same result:

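// PHP cannot declare the same function name twice, so only one of these
// three attempts can be defined at a time.

// Attempt 1: linear search with in_array()
function array_diff_alt($array1, $array2){
	$result = array();
	foreach ($array1 as $value){
		if (!in_array($value, $array2)){
			$result[] = $value;
		}
	}
	return $result;
}

// Attempt 2: use the values of $a as keys, then unset every key found in $b
function array_diff_alt($a, $b) {
	$map = array();
	foreach($a as $val) $map[$val] = 1;
	foreach($b as $val) unset($map[$val]);
	return array_keys($map);
}

// Attempt 3: nested loops; keep a value only if it matches nothing in $b
function array_diff_alt($a, $b) {
	$result = array();
	foreach ($a as $avalue){
		$found = false;
		foreach ($b as $bvalue){
			if ($avalue == $bvalue){
				$found = true;
				break;
			}
		}
		if (!$found){
			$result[] = $avalue;
		}
	}
	return $result;
}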


Here it is

 

if (isset($_GET['csv_scrub'])){
	/*	
		check extension && security etc
		get array of numbers to be scrubbed
	*/
	if(is_uploaded_file($_FILES['csv_file']['tmp_name']) && end(explode(".", $_FILES['csv_file']['name'])) == 'csv'){
		$filename = "./" . $_FILES['csv_file']['name'];
		if (move_uploaded_file($_FILES['csv_file']['tmp_name'], $filename)){
			chmod($filename, 0777)or die('Could not chmod');
			if ($handle = fopen($filename, 'r')){
				$contents = fread($handle, filesize($filename));
				$numbers = explode("\n", $contents);
				fclose($handle);
			}
		}
	}

	// $numbers = what we get from CSV file
	$numbers = array_filter($numbers, "check_phone");
	$numbers = array_unique($numbers);
	$numbers = array_values($numbers);
	array_walk($numbers, 'trim');
	array_walk($numbers, 'intval');

	$cnumbers = count($numbers);

	// Get an array of each present phone prefix
	$prefixes = array();
	foreach ($numbers as $value){
		$prefix = substr($value, 3, 3);
		if(!in_array($prefix, $prefixes)){
			$prefixes[] = $prefix;
		}
	}

	// figure out area code - we assume that there will not be multiple area codes in a .csv here
	$area_code = substr($numbers[0], 0, 3);

	// get an array of all dnc numbers
	$dnc = retrieve_dnc($area_code, $prefixes);
	array_walk($dnc, 'trim');
	array_walk($numbers, 'intval');

	// for some reason array_diff fails.  see functions above that have also failed
	$scrubbed = array_diff($numbers, $dnc);

	// write to new file -- overwrite any previously existing file for this area code
	$handle = fopen("$area_code Scrubbed.csv", 'w');
	$file = '';
	foreach ($scrubbed as $skey => $svalue){			
		$file .= $svalue . "\n";
	}
	// If $filename isset, will show up as a link (anchor) on page.
	if (fwrite($handle, $file)){
		$filename = $area_code . ' Scrubbed.csv';
	}
	fclose($handle);


	// Calculate how many numbers were removed from the csv file
	echo $cscrubbed = count($scrubbed) . '<br /><pre style="color: white;">';


	//Debug
	echo "numbers<br>" . print_r($numbers);
	echo "dnc<br>" . print_r($dnc);
	echo "scrubbed<br>" . print_r($scrubbed);
	echo '</pre>';
	$entries_removed = $cnumbers - $cscrubbed;

}

 

 

And relevant functions:

function check_phone($x){
if (empty($x) || preg_match('/[0-9]/', $x) < 1 || strlen($x) < 10){
	return false;
} else {
	return true;
}
}

function retrieve_dnc($area_code, $prefix){
// we now grab all DNC entries, add them to an array, and remove dupes
$all_dnc = array();

//If $prefix is an array, we need to create OR statements for every prefix
if (is_array($prefix)){		
	if (@$_POST['dnc'] == 'on') {
		$like = '';
		$or = '';
		foreach ($prefix as $value){
			$like .= "$or number LIKE '$area_code$value%' ";
			$or = ' OR';
		}
		$dnc = do_query("SELECT number FROM `dnc` WHERE $like");
		while ($rez = mysql_fetch_assoc($dnc)) {
			$all_dnc[] = $rez['number'];
		}
	} else {
		$dnc = array();
	}
	if (@$_POST['dnc_fed'] == 'on') {
		$in = '(';
		$or = '';
		foreach ($prefix as $value){
			$in .= "$or number LIKE '$value%'";
			$or = ' OR';
		}
		$in .= ')';
		$dnc_fed = do_query("SELECT area, number FROM `dnc_fed` WHERE area = $area_code AND $in");
		while ($rez = mysql_fetch_assoc($dnc_fed)) {
			$all_dnc[] = $rez['area'] . $rez['number'];
		}
	} else {
		$dnc_fed = array();
	}
	if (@$_POST['dnc_state'] == 'on') {
		$like = '';
		$or = '';
		foreach ($prefix as $value){
			$like .= "$or number LIKE '$area_code$value%' ";
			$or = ' OR';
		}
		$dnc_state = do_query("SELECT number FROM `dnc_state` WHERE $like");

		while ($rez = mysql_fetch_assoc($dnc_state)) {
			$all_dnc[] = $rez['number'];
		}
	} else {
		$dnc_state = array();
	}
} else {	
	if (@$_POST['dnc'] == 'on') {
		$dnc = do_query("SELECT number FROM `dnc` WHERE number LIKE '$area_code$prefix%'");
		while ($rez = mysql_fetch_assoc($dnc)) {
			$all_dnc[] = $rez['number'];
		}	
	} else {
		$dnc = array();
	}
	if (@$_POST['dnc_fed'] == 'on') {
		$dnc_fed = do_query("SELECT area, number FROM `dnc_fed` WHERE area = '$area_code' AND number LIKE '$prefix%'");
		while ($rez = mysql_fetch_assoc($dnc_fed)) {
			$all_dnc[] = $rez['area'] . $rez['number'];
		}
	} else {
		$dnc_fed = array();
	}
	if (@$_POST['dnc_state'] == 'on') {
		$dnc_state = do_query("SELECT number FROM `dnc_state` WHERE number LIKE '$area_code$prefix%'");
		while ($rez = mysql_fetch_assoc($dnc_state)) {
			$all_dnc[] = $rez['number'];
		}
	} else {
		$dnc_state = array();
	}
}

$all_dnc = array_unique($all_dnc);
return $all_dnc;
}


Although I don't really see anything specifically "wrong" right now, I'm leaning toward this being a whitespace issue. If it were me debugging this, I'd start by looping through the arrays that are exhibiting the problem, and var_dump()ing the values both before and after the array_walk()s to see what, if any, difference there is. I'd also var_dump the array that it's being compared against just to make sure that data is clean as well.
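
A minimal sketch of that kind of check, using made-up values in place of the CSV data (the trailing "\r" is only an assumption about what a stray character might look like). var_dump() reports each string's length, and addcslashes() prints control characters as visible escapes:

```php
<?php
// Hypothetical values standing in for the array built from the CSV file.
$numbers = array("3047281328\r", "3047655291\r");

foreach ($numbers as $i => $value) {
    $clean = trim($value);

    // var_dump() shows the length, so a stray byte appears as 11 vs 10.
    var_dump($value, $clean);

    // addcslashes() turns control characters (ASCII 0-31) into visible escapes.
    echo $i, ': ', addcslashes($value, "\0..\37"), "\n";
}
```

If the real script still shows the longer length after the cleanup step, the cleaned values are probably not making it back into the array.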


array_walk() cannot be used this way with built-in functions such as trim() or intval(). It ignores whatever the callback returns, so the array is left unchanged. For array_walk() to modify the elements, the callback must take its parameter by reference (&).

array_map() does work with built-in functions, because it builds a new array from the callback's return values. Otherwise you need to write your own by-reference callback to use with array_walk().
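
To illustrate the difference, here is a small sketch with hypothetical sample values (the data and the $viaMap / $viaWalk names are made up for the example, not taken from the script above). Both approaches should end up with the same trimmed strings:

```php
<?php
// Hypothetical sample data with stray whitespace around each number.
$numbers = array("3042476001\r", "3047281328\r", " 3047655291 ");

// array_map() collects the callback's return values into a NEW array,
// so the result has to be assigned to something.
$viaMap = array_map('trim', $numbers);

// array_walk() only changes the array when the callback takes the
// element by reference and assigns to it.
$viaWalk = $numbers;
array_walk($viaWalk, function (&$value) {
    $value = trim($value);
});

var_dump($viaMap === $viaWalk);
```

In the script above that would mean writing $numbers = array_map('trim', $numbers); rather than calling array_map() on its own. The anonymous function needs PHP 5.3 or later; on older versions a named function declared with &$value does the same job.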


Pikachu, you were right about the whitespace issue.  However, my problem now is getting rid of whatever the extra character is.  PFMaBiSmAd, I have tried both trim() with array_map() and a custom function with array_walk(), but the extra character remains.  Here is the var_dump() output for the same 3 arrays:

 

$numbers
array(16) {
  [0]=>
  string(11) "3042476001
"
  [1]=>
  string(11) "3042476002
"
  [2]=>
  string(11) "3042476003
"
  [3]=>
  string(11) "3042476004
"
  [4]=>
  string(11) "3042476005
"
  [5]=>
  string(11) "3042476006
"
  [6]=>
  string(11) "3042476007
"
  [7]=>
  string(11) "3042476008
"
  [8]=>
  string(11) "3042476009
"
  [9]=>
  string(11) "3042476010
"
  [10]=>
  string(11) "3042476011
"
  [11]=>
  string(11) "3042476012
"
  [12]=>
  string(11) "3042476013
"
  [13]=>
  string(11) "3042476014
"
  [14]=>
  string(11) "3047281328
"
  [15]=>
  string(11) "3047655291
"
}

$dnc
array(10) {
  [0]=>
  string(10) "3042476068"
  [1]=>
  string(10) "3047281328"
  [2]=>
  string(10) "3047652846"
  [3]=>
  string(10) "3047652982"
  [4]=>
  string(10) "3047655291"
  [5]=>
  string(10) "3047655398"
  [6]=>
  string(10) "3047655458"
  [7]=>
  string(10) "3047655719"
  [8]=>
  string(10) "3047657562"
  [9]=>
  string(10) "3047657747"
}

$scrubbed
array(16) {
  [0]=>
  string(11) "3042476001
"
  [1]=>
  string(11) "3042476002
"
  [2]=>
  string(11) "3042476003
"
  [3]=>
  string(11) "3042476004
"
  [4]=>
  string(11) "3042476005
"
  [5]=>
  string(11) "3042476006
"
  [6]=>
  string(11) "3042476007
"
  [7]=>
  string(11) "3042476008
"
  [8]=>
  string(11) "3042476009
"
  [9]=>
  string(11) "3042476010
"
  [10]=>
  string(11) "3042476011
"
  [11]=>
  string(11) "3042476012
"
  [12]=>
  string(11) "3042476013
"
  [13]=>
  string(11) "3042476014
"
  [14]=>
  string(11) "3047281328
"
  [15]=>
  string(11) "3047655291
"
}

 

 

The file was exported from Excel.  Is there perhaps a character left by M$ that trim() does not normally remove?
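
One way to find out is to look at the raw bytes. The sketch below assumes the file uses Windows-style CRLF line endings, which would fit the string(11) lengths above but is not confirmed anywhere in this thread; the sample $contents and $dnc values are made up. Splitting on "\n" alone leaves a "\r" on every value, and bin2hex() shows it as a trailing 0d. Splitting on any kind of line ending with preg_split() avoids the problem before array_diff() ever runs:

```php
<?php
// Hypothetical file contents, ASSUMING Windows (CRLF) line endings.
$contents = "3042476001\r\n3047281328\r\n3047655291\r\n";

// Splitting on "\n" alone, as the script does, leaves "\r" on each value.
$raw = explode("\n", $contents);
echo bin2hex($raw[0]), "\n"; // hex dump: a carriage return shows up as 0d

// \R matches any line ending (CRLF, LF or CR); empty pieces are dropped.
$numbers = preg_split('/\R/', $contents, -1, PREG_SPLIT_NO_EMPTY);

// Made-up Do Not Call entries for the example.
$dnc = array('3047281328', '3047655291');

// array_diff() keeps the original keys, so reindex the result.
$scrubbed = array_values(array_diff($numbers, $dnc));
print_r($scrubbed);
```

For what it's worth, trim() strips "\r" by default (along with space, tab, "\n", NUL and vertical tab). So if the hex dump does end in 0d, the more likely culprit is that the trimmed values are never assigned back to $numbers. If it shows some other byte, that byte can be passed to trim() as its second argument.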


As an aside, the code you have to build your queries is a bit overcomplicated. For example, this

$in = '(';
$or = '';
foreach ($prefix as $value){
$in .= "$or number LIKE '$value%'";
$or = ' OR';
}
$in .= ')';

 

Could just be this:

// Collect into a new array so $prefix itself is left untouched for later use
$likes = array();
foreach ($prefix as $value)
{
    $likes[] = "number LIKE '$value%'";
}
$in = '(' . implode(' OR ', $likes) . ')';

 

In fact, you use the same process several times, so I would make a function with appropriate parameters and just call it each time.
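
Something along these lines, as a rough sketch. build_like_clause() is a hypothetical helper name, not part of the original script, and it assumes the patterns contain digits only; real code should escape the values or use prepared statements:

```php
<?php
// Hypothetical helper: builds "(col LIKE 'p1%' OR col LIKE 'p2%')".
// ASSUMES $patterns hold digits only; escape or bind anything else.
function build_like_clause($column, array $patterns)
{
    $likes = array();
    foreach ($patterns as $pattern) {
        $likes[] = "$column LIKE '$pattern%'";
    }
    return '(' . implode(' OR ', $likes) . ')';
}

$area_code = '304';
$prefixes  = array('247', '765');

// `dnc` and `dnc_state` store the full number, so prepend the area code.
$full = array();
foreach ($prefixes as $p) {
    $full[] = $area_code . $p;
}

echo build_like_clause('number', $full), "\n";
// `dnc_fed` stores the area code separately, so the bare prefixes are used.
echo build_like_clause('number', $prefixes), "\n";
```

An empty $patterns array would produce "()", which is not valid SQL, so the caller should check for that case before running the query.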
