Grouping in array or a better way


Solved by gizmola

I have an array of first and last names joined by an underscore (eg: Sally_Smith).

Ultimately, I want to count the number of people (elements) with the same last name.

My thinking is to run through the array and explode each element to isolate the last names, then use array_unique to create an array of $lastNameOnly and eventually use this as a counting mechanism against the original array.

Am I over-complicating this?


if all you want is a COUNT() per last name, you can do that in the query that's getting the data, with a GROUP BY last_name term.

if you want the name data and a count for each last name, i would index/pivot the data using the last name as the main array index when you fetch the data, from wherever it is coming from.
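
A minimal sketch of that index/pivot idea, assuming the data arrives as the "first_last" strings described in the question and that every entry contains an underscore. The variable names ($byLastName, $counts) are made up for illustration:

```php
<?php

// Sample data in the "first_last" format described in the question.
$names = ['bob_jones', 'sam_smith', 'jane_doe', 'john_smith'];

// Hypothetical structure: last name => list of first names.
$byLastName = [];

foreach ($names as $name) {
    // Split on the first underscore only, so there are at most two parts.
    // Assumes every entry actually contains an underscore.
    [$first, $last] = explode('_', $name, 2);
    $byLastName[$last][] = $first;
}

// The count per last name falls out of the grouped data.
// array_map() keeps the string keys when given a single array.
$counts = array_map('count', $byLastName);

print_r($byLastName);
print_r($counts);
```

Grouping first keeps the name data available, and the counts are just the size of each group.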

18 minutes ago, mac_gyver said:

if all you want is a COUNT() per last name, you can do that in the query that's getting the data, with a GROUP BY last_name term.

if you want the name data and a count for each last name, i would index/pivot the data using the last name as the main array index when you fetch the data, from wherever it is coming from.

From the way the question was worded, it doesn't appear that the data is in a relational database.

45 minutes ago, mac_gyver said:

if you want the name data and a count for each last name, i would index/pivot the data using the last name as the main array index when you -

 

12 minutes ago, phppup said:

generated (it) from other PHP coding.

if you post the code generating the array, someone could post an example, instead of just writing about how to do it.

Edited by mac_gyver
  • Solution
52 minutes ago, phppup said:

I have an array of first and last names joined by an underscore (eg: Sally_Smith).

Ultimately, I want to count the number of people (elements) with the same last name.

My thinking is to run through the array and explode each element to isolate the last names, then use array_unique to create an array of $lastNameOnly and eventually use this as a counting mechanism against the original array.

Am I over-complicating this?

To a degree, because you can create your desired list/count in one pass through the array.  

<?php

$names = ['bob_jones', 'sam_smith', 'jane_doe', 'john_smith', 'jill_jackson', 'matt_jones', 'john_doe', 'emily_smith'];
$lastNames = [];

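foreach ($names as $name) {
    // Everything after the first underscore is treated as the last name.
    $lname = substr($name, strpos($name, '_') + 1);
    // Increment the count if this last name was seen before, otherwise start at 1.
    if (array_key_exists($lname, $lastNames)) {
        $lastNames[$lname]++;
    } else {
        $lastNames[$lname] = 1;
    }
}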

var_dump($lastNames);

 

Hopefully it's clear to you that Barand's solution and the one I presented do the same thing; Barand just used a 100% functional approach, with a lambda (anonymous function) passed to array_map.

Compare/contrast the two solutions, and if you can understand them both, you'll have a good basis for solving future problems like this.
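
For comparison with the foreach loop, here is a rough sketch of what a functional version could look like. This is not Barand's actual code, only an illustration of the same idea: array_map with an anonymous function extracts the last names, and array_count_values tallies them. It assumes PHP 7.4+ for the arrow function syntax.

```php
<?php

$names = ['bob_jones', 'sam_smith', 'jane_doe', 'john_smith', 'jill_jackson', 'matt_jones', 'john_doe', 'emily_smith'];

// Map each "first_last" string to its last name with an anonymous function,
// then let array_count_values() count how often each last name occurs.
$lastNames = array_count_values(
    array_map(
        fn(string $name): string => substr($name, strpos($name, '_') + 1),
        $names
    )
);

print_r($lastNames);
```

Both versions make a single logical pass over the data; the difference is mostly one of style.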

One thing I would note about these solutions is that any solution is only as good as its suitability to the data. For example, if the names can have suffixes like jr, sr, 3rd, etc., you probably won't get what you want. They could be improved to handle those situations, should the data you're working with warrant it.
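
As a sketch of one way to handle suffixes, assuming (and this is purely an assumption about the data) that a suffix shows up as an extra underscore-separated part at the end, such as sam_smith_jr. The $suffixes list is hypothetical and would need to match the real data. Note that this version takes the last remaining part as the last name, which differs from the loop above, where everything after the first underscore is used.

```php
<?php

// Hypothetical data: a suffix, when present, is a trailing "_"-separated part.
$names = ['bob_jones', 'sam_smith_jr', 'john_smith', 'matt_jones_3rd'];

// Assumed list of suffixes to ignore; extend it to match the real data.
$suffixes = ['jr', 'sr', '2nd', '3rd'];

$lastNames = [];

foreach ($names as $name) {
    $parts = explode('_', $name);

    // Drop a trailing suffix, but only when a first and last name remain.
    if (count($parts) > 2 && in_array(end($parts), $suffixes, true)) {
        array_pop($parts);
    }

    // Treat the last remaining part as the last name.
    $lname = end($parts);
    $lastNames[$lname] = ($lastNames[$lname] ?? 0) + 1;
}

print_r($lastNames);
```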

Using PHPUnit to create unit tests for code like this is a really valuable investment, if you care about testing, maintainability and quality.

  • 3 months later...

@gizmola Sorry for the delay in returning to comment (I had some personal situations), and yes, I did realize the similarity.

I very much wanted to use Barand's method (to try something new and more brief) but it wouldn't operate (I think an extra parenthesis) and the FOREACH solution was more in line with my understanding and existing code.

If I recall correctly, I adapted this method successfully.

Thanks to all that helped.

I just want to add a different opinion to the subject. I am more into speed and efficiency, and I am always willing to rewrite my own code to find faster and better ways of accomplishing the same tasks. I've spent three years rewriting my website code and it is faster, smarter and better than all of the previous code. I do hate the fact that we do not yet have vision code or machine learning code in PHP (built-in features added to the language). Vision would be nice here but coding it will be a pain.

Anyway, I hate it when people make unnecessary variables and also use long variable names. Each letter used in a variable name is stored in memory. Memory saving code is always better, no matter what someone else will say. I have started using single letter variables and I love it. The code is cleaner and easier to read. Sometimes a legend is necessary as a multi-line comment but the single character variable works.

OP: The names array should not be coded in such a manner if it is not to be used in the manner that it is coded (bob_smith will be used as bob_smith). I think that once you start finding reasons to separate the names (last name only), you will discover that this method is not a good idea. I do not know why PHP coders waste memory with variables and variable names but complain about multi-dimensional arrays. I would rather see someone use a database or a multi-dimensional array when it comes to names. Gizmola pointed out rank, such as Senior (Sr.) or Junior (Jr.), which is a good point. Other naming conventions will also cause a problem. I think that anyone in this position should use a multi-dimensional array which separates first and last names, stores titles and ranks, et cetera. Then finding matching names is easier, and it simplifies other data crunching ideas that have yet to be thought of. It also becomes easier to add info to the arrays or remove info from them.
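
A small sketch of what such a multi-dimensional array might look like. The field names ('first', 'last', 'suffix') are only an assumed layout for illustration, not something taken from the OP's data:

```php
<?php

// Hypothetical structure: one associative array per person.
$people = [
    ['first' => 'bob',  'last' => 'jones', 'suffix' => null],
    ['first' => 'sam',  'last' => 'smith', 'suffix' => 'jr'],
    ['first' => 'john', 'last' => 'smith', 'suffix' => null],
];

// array_column() pulls out the 'last' field of every row;
// array_count_values() then counts each distinct last name.
$counts = array_count_values(array_column($people, 'last'));

print_r($counts);
```

With the fields already separated, no string splitting is needed and suffixes no longer interfere with the last name.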

As a note for users that want to save memory, I am including example code that reuses the foreach variable, rather than creating a second or third variable. Also, this code is 2-3 milliseconds faster. I am certain that this code can also be tweaked for speed and memory but it serves as a model of code reusability and memory-saving techniques. I'd like to see more people writing faster, smarter, better code - myself included.

$names = ['bob_jones', 'sam_smith', 'jane_doe', 'john_smith', 'jill_jackson', 'matt_jones', 'john_doe', 'emily_smith'];
$sn = []; //surnames
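foreach ($names as $x) {
    // Reuse $x: it now holds only the part after the first underscore.
    $x = substr($x, strpos($x, '_') + 1);
    // Start at 0 for an unseen surname, then add one.
    $sn[$x] = ($sn[$x] ?? 0) + 1;
}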
print_r($sn);
	
Edited by jodunno
formatting
