
Good morning!

 

Long story short, I've got a lot of data that I need to condense! I was wondering if there is an easy way of doing this. I have a CSV file containing 3 columns: NAMES, WEIGHT, HEIGHT

What I want to do is sift through this file and pull out any rows whose WEIGHT and HEIGHT match exactly, then output how many names match as well as the corresponding names.

So, in the end I'll output something like "3 kids: 108lbs/5ft  John, Bob, Rick"

I can think of a few ways of doing this using various loops, but I wasn't sure if there is a better way. Any suggestions are very much appreciated.

https://forums.phpfreaks.com/topic/142799-pull-out-duplicate-values-from-array/

Hm.. I was considering using array_search inside a foreach loop. However, this gets messy once I come to a record whose duplicate I've already found, if that makes sense. I guess I'd have to include a line to delete the duplicate once found?

Alright, this sort of works for the 1st entry, then goes nuts with lots of Undefined offsets.

<?php

$count=0;

// Open our test text file, read it into an array $file
$file=file("glass.csv");

// Loop through the array, splitting each line into per-column arrays
foreach($file as $line) {
list($location[], $type[], $width1[], $height1[], $width2[], $height2[]) = explode(",", $line);
}

$count=0;
$count2=1;


// Loop through each entry in the list
foreach($location as $loc1) {

// Output the entry I'm checking against
echo $location[$count];

// Loop through list to find duplicates
foreach($location as $loc2) {
	if (isset($width1[$count2])) {
		if ($width1[$count] == $width1[$count2] && $height1[$count] == $height1[$count2]) {
			// If duplicate is found, append to previous output
			echo ", ".$location[$count2];

			// Remove entry from the list
			unset($location[$count2]);
			unset($width1[$count2]);
			unset($height1[$count2]);
		}
	}
$count2++;
}

echo "<br /><br />";

// Remove the original "check" entry
unset($location[$count]);
unset($width1[$count]);
unset($height1[$count]);
//$location = array_values($location);
//$width1 = array_values($width1);
//$height1 = array_values($height1);

$count++;
$count2=1;

}


?>
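A note on the Undefined offset notices: they most likely come from calling unset() on entries while $count and $count2 keep indexing those same positions. The outer loop still runs once per original entry, so sooner or later the counters point at keys that no longer exist. The inner loop also restarts at 1 every time, so a row can end up being compared with itself. Below is a minimal sketch of the same nested-loop idea that marks rows as used instead of deleting them. The inline $rows data and the $used array are assumptions made for illustration, standing in for the parsed glass.csv.

```php
<?php
// Hypothetical rows standing in for the parsed file: location, width, height.
$rows = array(
    array('A1', '24', '36'),
    array('B2', '30', '40'),
    array('C3', '24', '36'),
    array('D4', '24', '36'),
);

$used   = array(); // indexes already reported as part of a group
$groups = array(); // one comma-separated list per distinct width/height

$total = count($rows);
for ($i = 0; $i < $total; $i++) {
    if (isset($used[$i])) {
        continue; // already listed as a duplicate of an earlier row
    }

    $names = array($rows[$i][0]);

    // Only look at later rows, so a row is never compared with itself.
    for ($j = $i + 1; $j < $total; $j++) {
        if (!isset($used[$j])
            && $rows[$i][1] === $rows[$j][1]
            && $rows[$i][2] === $rows[$j][2]) {
            $names[]  = $rows[$j][0];
            $used[$j] = true; // mark instead of unset(), so indexes stay valid
        }
    }

    $groups[] = implode(', ', $names);
}

print_r($groups);
```

Nothing is removed from $rows, so every index stays valid for the whole run. The cost is still a comparison of every pair of rows, which is why grouping by a key tends to be simpler for large files.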

Quick example...

 

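<?php

// Read the file into an array of lines, dropping line endings and blank lines
$lines = file ( './glass.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES );

$matches = array ();

foreach ( $lines AS $line )
{
	// Split the row into name, weight, height and trim each field
	$line = array_map ( 'trim', explode ( ',', $line ) );

	// Key each row on weight + height; the '|' keeps e.g. 10/45 and 104/5 apart
	$item = '_' . md5 ( $line[1] . '|' . $line[2] );

	if ( isset ( $matches[$item] ) )
	{
		// Second or later row with this weight/height: a real duplicate
		if ( ! isset ( $matches[$item]['weight_height'] ) )
		{
			// Set the label once; it also marks the group as having duplicates
			$matches[$item]['weight_height'] = $line[1] . '/' . $line[2];
		}

		$matches[$item]['names'][] = $line[0];
	}
	else
	{
		// First row with this weight/height: remember the name only
		$matches[$item]['names'] = array ( $line[0] );
	}
}

unset ( $item, $line );

$lines = array ();

foreach ( $matches AS $name => $value )
{
	// Only groups that received a label contain more than one name
	if ( isset ( $value['weight_height'] ) )
	{
		$lines[] = sizeof ( $value['names'] ) . ' kids: ' . $value['weight_height'] . ' ' . implode ( ', ', $value['names'] );
	}
}

unset ( $matches, $name, $value );

// dump the results

print_r ( $lines );

?>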

 

 

example data... "./glass.csv"

 

bill, 104lbs, 5ft
sam, 104lbs, 5ft
tom, 104lbs, 5ft
adam, 104lbs, 5ft
tim, 104lbs, 4ft
sam, 104lbs, 6ft
kim, 104lbs, 7ft
brain, 104lbs, 3ft

 

example results...

 

Array
(
    [0] => 4 kids: 104lbs/5ft bill, sam, tom, adam
)

Assuming your file looks like this:

 

Joe, 100, 5

Jane, 104, 5

Sam, 110, 6

Dick, 100, 5

etc...

 

 

Use file() to grab the list and put each line into an array element. Use a foreach loop to loop through each line. Inside the loop, first trim the line, then explode it. Next, build a new array where the key is the weight|height values concatenated from the exploded line, and append the whole exploded line under that key. This effectively groups all lines with the same weight|height together in a multi-dimensional array. Then use another foreach loop to go through that array and build a new one containing only the groups that have more than one name. You now have a multi-dimensional array of all names with the same height and weight.
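The steps above can be sketched roughly like this. The inline $lines array is an assumption standing in for what file('./glass.csv') would return, and the names $groups and $duplicates are chosen only for illustration.

```php
<?php
// Hypothetical sample lines, as file('./glass.csv') might return them.
$lines = array("Joe, 100, 5\n", "Jane, 104, 5\n", "Sam, 110, 6\n", "Dick, 100, 5\n");

// Step 1: group every row under a "weight|height" key.
$groups = array();
foreach ($lines as $line) {
    $line = trim($line);
    if ($line === '') {
        continue; // skip blank lines
    }
    $row = array_map('trim', explode(',', $line));
    $key = $row[1] . '|' . $row[2];
    $groups[$key][] = $row;
}

// Step 2: keep only the groups with more than one row.
$duplicates = array();
foreach ($groups as $key => $rows) {
    if (count($rows) > 1) {
        $duplicates[$key] = $rows;
    }
}

print_r($duplicates);
```

This sketch keeps the weight|height string as the outer key. If plain numeric keys are preferred, array_values($duplicates) would renumber the groups.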

 

So...you should end up with a structure like this:

 

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => John
                    [1] => 108
                    [2] => 5
                )

            [1] => Array
                (
                    [0] => Bob
                    [1] => 108
                    [2] => 5
                )

            [2] => Array
                (
                    [0] => Rick
                    [1] => 108
                    [2] => 5
                )

        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] => Ray
                    [1] => 110
                    [2] => 5
                )

            [1] => Array
                (
                    [0] => Mary
                    [1] => 110
                    [2] => 5
                )

        )

)

 

As far as outputting the list to your format:

 

Start with a foreach loop to go through each group of names. To get the number of people in that group, simply do a count($val), where $val is the variable you specified as the value in your foreach loop. Every row in a group shares the same weight and height, so you can read them from the first row: $val[0][1] and $val[0][2].

For the list of names, use another foreach loop nested inside the first one to get each name from the inner arrays, and put them into a comma-separated list (as you showed in your post).
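As a rough sketch of that output step, assuming the grouped array has the shape shown earlier in this post (the $groups data below is typed in by hand for illustration, and the "lbs"/"ft" suffixes are added only to match the format in the original question):

```php
<?php
// Hypothetical grouped data: each group is a list of [name, weight, height] rows.
$groups = array(
    array(array('John', '108', '5'), array('Bob', '108', '5'), array('Rick', '108', '5')),
    array(array('Ray', '110', '5'), array('Mary', '110', '5')),
);

$output = array();
foreach ($groups as $val) {
    // Collect the names from the inner arrays.
    $names = array();
    foreach ($val as $person) {
        $names[] = $person[0];
    }

    // Weight and height are the same for the whole group, so read the first row.
    $output[] = count($val) . ' kids: ' . $val[0][1] . 'lbs/' . $val[0][2] . 'ft '
              . implode(', ', $names);
}

echo implode("\n", $output), "\n";
```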


 

Works like a charm!  Thank you thank you thank you!  Now I'll have to dissect it and figure out what makes it tick!

Hm... using PrintF's solution above, I'm having trouble outputting this info into a simple table.  I'm replacing his "dump the results" section with:

 

foreach ($lines as $line) {
print_r ( $line );
explode('|',$line);
echo "<tr>";
echo "<td>$line[0]</td>"; // Output QTY
echo "<td>$line[1]</td>"; // Output Weight
echo "<td>$line[2]</td>"; // Output Height
echo "<td>$line[3]</td>"; // Output Names
echo "</tr>";
}

 

This yields:

3|110|5|bill, ted, eggbert

as the print_r output for the first $line, but for some reason $line[1] shows the delimiter '|'. I can't figure out why. Any thoughts?

well seeing as how his code doesn't use | anywhere...kind of hard to be exploding by that...

Nah, I've modified the above code to add the |'s.  If I echo $line before exploding I get "3|110|5|bill, ted, eggbert", so I know the code works fine up to there.  Just, for some reason, exploding that string doesn't seem to yield the expected results.  Why would I wind up with a | for $line[1]?

hmm... okay, maybe I'm just using explode completely wrong. I've tried a simplified version of my code.

<?php
$line = "3,110,5,bill";
explode(',',$line);
echo $line[1];
?>

Which outputs a single comma (,).  Why would this set the delimiter as an array value?  Such a noob...
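For what it's worth, explode() returns the pieces as a new array and leaves its argument untouched, so the return value has to be assigned to something. In the snippet above $line is still the string "3,110,5,bill", and $line[1] is simply its second character, which happens to be the comma. The same thing explains the '|' in the table version. A minimal sketch ($parts and $secondChar are names picked for illustration):

```php
<?php
$line = "3,110,5,bill";

// $line is still a string here, so [1] picks out its second character.
$secondChar = $line[1];

// explode() does not modify $line; keep what it returns.
$parts = explode(',', $line);

echo $secondChar, "\n"; // the delimiter character
echo $parts[1], "\n";   // the second field
```

In the table loop, the equivalent change would be something like $parts = explode('|', $line); and then echoing $parts[0] through $parts[3].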

If you had said that's what you wanted in the first place, all you have to do is put everything into an array and sort by weight, then height.

Should have been more specific.  So sort by weight, then by height, that should order all the common entries together.  I would still need a way to consolidate the common entries though.

Well, since it's sorted, matching entries sit next to each other in the sorted list. All you would need to do is throw in a simple condition to see when the weight and height change. Start a new table in your loop when that happens, or however you want it to display.
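A minimal sketch of that sort-then-watch-for-a-change approach, assuming rows of name, weight, height (the $rows data and the $current marker are made up for illustration):

```php
<?php
// Hypothetical rows: name, weight, height.
$rows = array(
    array('Joe', 100, 5),
    array('Sam', 110, 6),
    array('Dick', 100, 5),
    array('Jane', 104, 5),
);

// Sort by weight first, then by height.
usort($rows, function ($a, $b) {
    if ($a[1] != $b[1]) {
        return $a[1] < $b[1] ? -1 : 1;
    }
    if ($a[2] != $b[2]) {
        return $a[2] < $b[2] ? -1 : 1;
    }
    return 0;
});

$groups  = array();
$current = null; // "weight|height" of the group being built

foreach ($rows as $row) {
    $key = $row[1] . '|' . $row[2];
    if ($key !== $current) {
        // Weight or height changed: start a new group (or a new table).
        $groups[] = array();
        $current  = $key;
    }
    $groups[count($groups) - 1][] = $row[0];
}

print_r($groups);
```

The order of names inside a group is not guaranteed here, since usort() only became a stable sort in PHP 8.0.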

This makes sense. However, my source data has gotten a little more complicated... I'm kicking myself now, but I sort of "dumbed down" my problem. What I'm actually doing is trying to consolidate a list of insulated glass sizes. Insulated glass has 2 panes, not necessarily the same size, so I have 2 widths and 2 heights. This makes things messy, and it seems like the first solution would be best for this since it finds exact matches.

 

Using the code posted by PrintF, I just need a simple way to figure out which entries aren't duplicates.

 

Sorry for not coming clean earlier  :)
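One possible way to get at the non-duplicates is to use the same grouping idea, key on all four dimensions, and then split the groups by size. This is only a sketch under a few assumptions: the column order is taken from the list() call earlier in the thread (location, type, width1, height1, width2, height2), the type column is ignored when matching, and the sample lines and the names $duplicates and $singles are invented for illustration.

```php
<?php
// Hypothetical lines: location, type, width1, height1, width2, height2.
$lines = array(
    "Kitchen, IG, 24, 36, 23, 35",
    "Bath, IG, 18, 30, 18, 30",
    "Den, IG, 24, 36, 23, 35",
);

$groups = array();
foreach ($lines as $line) {
    $row = array_map('trim', explode(',', $line));

    // Key on the four dimensions; '|' keeps the values from running together.
    $key = implode('|', array_slice($row, 2, 4));
    $groups[$key][] = $row[0];
}

$duplicates = array(); // sizes shared by two or more locations
$singles    = array(); // sizes that occur exactly once

foreach ($groups as $key => $locations) {
    if (count($locations) > 1) {
        $duplicates[$key] = $locations;
    } else {
        $singles[$key] = $locations[0];
    }
}

print_r($singles);
```

If the type should count towards a match as well, it could be included in the key by slicing from index 1 instead of 2.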
