neo115162 Posted October 25, 2012

Currently I have PHP code that displays my result as shown here: http://www.terranceweb.com/Capture.JPG

What I would like is for my function to check that each combination of addressid and personalnumber is unique. If any duplicates are found, it should tell me which files contain the duplicated addressid and personalnumber combination. For example:

Files: LabourForceSurvey2013_2012-10-17_11-09-26.xml, LabourForceSurvey2013_2012-10-17_14-33-47.xml, LabourForceSurvey2013_2012-10-17_15-56-10.xml, LabourForceSurvey2013_2012-10-17_16-33-22.xml and LabourForceSurvey2013_2012-10-18_08-14-12.xml have duplicate values with an Address id 1234 and Personal number 1

Files: LabourForceSurvey2013_2012-10-17_11-25-36.xml and LabourForceSurvey2013_2012-10-17_14-53-10.xml have duplicate values with an Address id 4567 and Personal number 1

etc.

    $this->bestandAddrPers[] = array(
        "bestand"        => $file,
        "addressid"      => $addressid,
        "personalnumber" => $Q2
    );

    private function displayDuplicate($_ARRAY)
    {
        foreach ( $_ARRAY as $key => $val ) {
            foreach ( $_ARRAY as $key2 => $val2 ) {
                if ( $key == $key2 ) continue;
                if ( $val['addressid'] == $val2['addressid'] && $val['personalnumber'] == $val2['personalnumber'] ) {
                    // Duplicates are shown here
                }
            }
        }
    }

Much thanks in advance.

Psycho Posted October 25, 2012

Might I first suggest giving your variables descriptive names rather than $key, $val, $key2, $val2. I think you will find it very helpful in the future. When you're in the thick of working on new code you know what those hold because you just created them. But when you (or someone else) come back later, the code is difficult to understand.

Also, your double foreach() loop seems unnecessary since you are only referencing two values from the inner array. Or was that just an attempt?

As to your request, it all depends on how you want to handle the duplicates. You might be able to do the processing in-line with what you have - or it might require you to preprocess the array first. Also, knowing how you are going to present the duplicates (i.e. the output) could affect the solution. But if you only want to know which address IDs and personal numbers have duplicates, that is easy enough.

Here is one possible solution (not tested, but the logic should be correct):

    //Create arrays to track duplicates
    $duplicateAddressIDs = array();
    $duplicatePersonalNos = array();

    //Function to remove non-duplicate values from duplicate arrays
    function removeNonDupes($subAry)
    {
        return count($subAry) > 1;
    }

    //Loop through all the records to add values to the duplicate arrays
    foreach($this->bestandAddrPers as $rec_idx => $record)
    {
        //Put all addr/per. numbers into duplicate arrays using value as key
        $duplicateAddressIDs[$record['addressid']][] = $rec_idx;
        $duplicatePersonalNos[$record['personalnumber']][] = $rec_idx;
    }

    //Remove all indexes in duplicate arrays that have only one value
    //i.e. the ones that really aren't duplicates
    $duplicateAddressIDs = array_filter($duplicateAddressIDs, 'removeNonDupes');
    $duplicatePersonalNos = array_filter($duplicatePersonalNos, 'removeNonDupes');

    //The arrays above now only contain duplicates, with the index being
    //the value that is duplicated and the value of the array being a
    //sub-array of all the record indexes that have that value duplicated
    // e.g. array('1234' => array('0', '2'))

    //Example Output
    echo "<h1>Duplicate Address IDs:</h1><br>\n";
    foreach($duplicateAddressIDs as $addrValue => $dupRecords)
    {
        echo "<h2>Address: {$addrValue}</h2><br>\n";
        echo "<ul>\n";
        foreach($dupRecords as $recId)
        {
            echo "<li>" . $this->bestandAddrPers[$recId]['bestand'] . "</li>\n";
        }
        echo "</ul>\n";
    }

    echo "<h1>Duplicate Personal Numbers:</h1><br>\n";
    foreach($duplicatePersonalNos as $noValue => $dupRecords)
    {
        echo "<h2>Personal Number: {$noValue}</h2><br>\n";
        echo "<ul>\n";
        foreach($dupRecords as $recId)
        {
            echo "<li>" . $this->bestandAddrPers[$recId]['bestand'] . "</li>\n";
        }
        echo "</ul>\n";
    }

Psycho Posted October 25, 2012

OK, I decided to test it. There was a typo: when filling the personal-number array, the [] had ended up inside the key brackets. I also changed $this->bestandAddrPers to just $bestandAddrPers because I was too lazy to create a class. Below is the original code with a couple of minor fixes and some test data.

    <?php

    //Test data
    $bestandAddrPers[] = array( "bestand" => 'file1', "addressid" => '1234', "personalnumber" => '1111');
    $bestandAddrPers[] = array( "bestand" => 'file2', "addressid" => '4567', "personalnumber" => '2222');
    $bestandAddrPers[] = array( "bestand" => 'file3', "addressid" => '8901', "personalnumber" => '3333');
    $bestandAddrPers[] = array( "bestand" => 'file4', "addressid" => '1234', "personalnumber" => '2222');
    $bestandAddrPers[] = array( "bestand" => 'file5', "addressid" => '4567', "personalnumber" => '4444');
    $bestandAddrPers[] = array( "bestand" => 'file6', "addressid" => '8888', "personalnumber" => '2222');
    $bestandAddrPers[] = array( "bestand" => 'file7', "addressid" => '8901', "personalnumber" => '1111');
    $bestandAddrPers[] = array( "bestand" => 'file8', "addressid" => '9999', "personalnumber" => '5555');

    //Create arrays to track duplicates
    $duplicateAddressIDs = array();
    $duplicatePersonalNos = array();

    //Function to remove non-duplicate values from duplicate arrays
    function removeNonDupes($subAry)
    {
        return count($subAry) > 1;
    }

    //Loop through all the records to add values to the duplicate arrays
    foreach($bestandAddrPers as $rec_idx => $record)
    {
        //Put all addr/per. numbers into duplicate arrays using value as key
        $duplicateAddressIDs[$record['addressid']][] = $rec_idx;
        $duplicatePersonalNos[$record['personalnumber']][] = $rec_idx;
    }

    //Remove all indexes in duplicate arrays that have only one value
    //i.e. the ones that really aren't duplicates
    $duplicateAddressIDs = array_filter($duplicateAddressIDs, 'removeNonDupes');
    $duplicatePersonalNos = array_filter($duplicatePersonalNos, 'removeNonDupes');

    //The arrays above now only contain duplicates, with the index being
    //the value that is duplicated and the value of the array being a
    //sub-array of all the record indexes that have that value duplicated
    // e.g. array('1234' => array(0, 3))

    //Example Output
    echo "<h2>Duplicate Address IDs:</h2>\n";
    foreach($duplicateAddressIDs as $addrValue => $dupRecords)
    {
        echo "<h3>Address: {$addrValue}</h3>\n";
        echo "<ul>\n";
        foreach($dupRecords as $recId)
        {
            echo "<li>" . $bestandAddrPers[$recId]['bestand'] . "</li>\n";
        }
        echo "</ul>\n";
    }

    echo "<h2>Duplicate Personal Numbers:</h2>\n";
    foreach($duplicatePersonalNos as $noValue => $dupRecords)
    {
        echo "<h3>Personal Number: {$noValue}</h3>\n";
        echo "<ul>\n";
        foreach($dupRecords as $recId)
        {
            echo "<li>" . $bestandAddrPers[$recId]['bestand'] . "</li>\n";
        }
        echo "</ul>\n";
    }

    ?>

Edited October 25, 2012 by Psycho

neo115162 (Author) Posted October 25, 2012

Hello Psycho, thank you very much for your prompt response. The duplicates I want to extract are actually the combination of the addressid and personalnumber together. The data comes from XML files: the addressid identifies a household, and each person in that household is assigned a personal number. So in the end an addressid or a personal number may each occur more than once, but the combination of both should never repeat. Not within the same file and most definitely not across files.

Psycho Posted October 25, 2012

OK, you can just modify the logic I already provided. Just create ONE array for the duplicates and use the concatenated addressid and personalnumber as the index.

    <?php

    //Test data
    $bestandAddrPers[] = array( "bestand" => 'file1', "addressid" => '1234', "personalnumber" => '1111');
    $bestandAddrPers[] = array( "bestand" => 'file2', "addressid" => '4567', "personalnumber" => '2222');
    $bestandAddrPers[] = array( "bestand" => 'file3', "addressid" => '8901', "personalnumber" => '3333');
    $bestandAddrPers[] = array( "bestand" => 'file4', "addressid" => '1234', "personalnumber" => '1111');
    $bestandAddrPers[] = array( "bestand" => 'file5', "addressid" => '4567', "personalnumber" => '4444');
    $bestandAddrPers[] = array( "bestand" => 'file6', "addressid" => '4567', "personalnumber" => '2222');
    $bestandAddrPers[] = array( "bestand" => 'file7', "addressid" => '8901', "personalnumber" => '1111');
    $bestandAddrPers[] = array( "bestand" => 'file8', "addressid" => '9999', "personalnumber" => '5555');

    //Create array to track duplicates
    $duplicateRecords = array();

    //Function to remove non-duplicate values from the duplicate array
    function removeNonDupes($subAry)
    {
        return count($subAry) > 1;
    }

    //Loop through all the records to add values to the duplicate array
    foreach($bestandAddrPers as $rec_idx => $record)
    {
        //Put each addr/per. number combination into the duplicate array,
        //using "addressid-personalnumber" as the key
        $duplicateRecords[$record['addressid'].'-'.$record['personalnumber']][] = $rec_idx;
    }

    //Remove all indexes in the duplicate array that have only one value
    //i.e. the ones that really aren't duplicates
    $duplicateRecords = array_filter($duplicateRecords, 'removeNonDupes');

    //The array above now only contains duplicates, with the index being
    //the combination that is duplicated and the value being a
    //sub-array of all the record indexes that have that combination
    // e.g. array('1234-1111' => array(0, 3))

    //Example Output
    echo "<h2>Files with duplicated address ID and Personal number:</h2>\n";
    foreach($duplicateRecords as $dupeValues => $dupRecords)
    {
        list($addrID, $persNo) = explode('-', $dupeValues);
        echo "<h3>Address ID: {$addrID}, Personal No: {$persNo}</h3>\n";
        echo "<ul>\n";
        foreach($dupRecords as $recId)
        {
            echo "<li>" . $bestandAddrPers[$recId]['bestand'] . "</li>\n";
        }
        echo "</ul>\n";
    }

    ?>

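For the sentence-style output asked for in the first post, the same grouping idea could be wrapped up roughly as follows. This is only a sketch under a few assumptions: the records have the same "bestand" / "addressid" / "personalnumber" layout as above, an addressid never contains a "-", and the helper names findDuplicateCombinations() and formatFileList() are made up for illustration (they are not from the original code).

```php
<?php
// Hypothetical helper: group file names by addressid/personalnumber
// combination and keep only the combinations seen more than once.
function findDuplicateCombinations(array $records)
{
    $groups = array();
    foreach ($records as $record) {
        // Composite key; assumes "-" never occurs inside an addressid
        $key = $record['addressid'] . '-' . $record['personalnumber'];
        $groups[$key][] = $record['bestand'];
    }

    $duplicates = array();
    foreach ($groups as $key => $files) {
        if (count($files) > 1) {
            $duplicates[$key] = $files;
        }
    }
    return $duplicates;
}

// Hypothetical helper: turn array('a', 'b', 'c') into "a, b and c"
function formatFileList(array $files)
{
    $last = array_pop($files);
    return $files ? implode(', ', $files) . ' and ' . $last : $last;
}

// Made-up sample data in the same shape as $bestandAddrPers
$records = array(
    array('bestand' => 'file1.xml', 'addressid' => '1234', 'personalnumber' => '1'),
    array('bestand' => 'file2.xml', 'addressid' => '4567', 'personalnumber' => '1'),
    array('bestand' => 'file3.xml', 'addressid' => '1234', 'personalnumber' => '1'),
    array('bestand' => 'file4.xml', 'addressid' => '1234', 'personalnumber' => '2'),
);

foreach (findDuplicateCombinations($records) as $key => $files) {
    // Limit of 2 so only the first "-" splits the key
    list($addressId, $personalNumber) = explode('-', $key, 2);
    echo 'Files: ' . formatFileList($files)
        . " have duplicate values with an Address id {$addressId}"
        . " and Personal number {$personalNumber}\n";
}
```

With the sample data above this prints a single line for the 1234 / 1 combination, listing file1.xml and file3.xml. Storing the file names directly (instead of record indexes) keeps the output loop simple; if the same file contains a combination twice, that file name simply appears twice in the list.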