Jump to content

more efficient way? <---ambiguous title ftw


Recommended Posts

Let me just state from the beginning, I already have a solution to this problem.  I'm just curious if anybody else can come up with a more efficient way of doing it. 

 

Okay, for the sake of simplicity, let's say I pick a random number from 1-10 five times, like this:

 

for ($x = 0; $x < 5; $x++){
   $list[] = rand(1,10);
} // end for $x

 

So now I have a $list of randomly generated numbers, and numbers can be in any order, and there can be duplicate numbers.  Couple of list examples:

9,1,10,9,5
8,7,5,10,7
6,7,10,8,3
4,3,1,4,1
1,4,9,1,1
etc..

 

Now, I want to find out how many of each number there is, and I want it to be sorted highest to lowest, first by group count, then by number. So for instance, if I had this list:

 

9,1,10,9,5

 

it would be sorted as such:

 

9 => 2

10 => 1

5 => 1

1 => 1

 

Another (perhaps easier) way of explaining it, is if I have a db table with 1 column called num and each row contains a random number from 1 to 10, doing the following query would accomplish that ^^ :

 

select num, count(num) as c from table group by num order by c desc, num desc

 

Well my generated list of numbers is not coming from a database, so I have no desire to put the numbers in a database just to run a query and pull them back out.  In other words, I want to do it with php. 

 

Okay, so here is what I came up with:

 

function groupSort($array) {
   $groups = array_count_values($array);
   foreach($groups as $k => $v) {
      $tGroups[$v][] = $k;
   } // end foreach $groups
   foreach($tGroups as $key => $val) {
      rsort($val);
      foreach ($val as $v) {
         $result[$v] = $key;
      } // end foreach $val
   } // end foreach $tGroups
   return $result;
} // end groupSort

 

As mentioned, it works.  It does exactly what I want.  I'm just curious if it can be done better.

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/
Share on other sites

If I understand how $array is supposed to look when passed to it, I don't get output like you said you wanted:

 

 

<?php

$group = array();

/*for($i = 0; $i < 10; ++$i) {
$group[] = rand(0,3); //make sure to get dupes
}*/

$group = array(1,1,2,2,2,3,3,5);

print_r(groupSort($group));

function groupSort($array) {
   $groups = array_count_values($array);
   foreach($groups as $k => $v) {
      $tGroups[$v][] = $k;
   } // end foreach $groups
   foreach($tGroups as $key => $val) {
      rsort($val);
      foreach ($val as $v) {
         $result[$v] = $key;
      } // end foreach $val
   } // end foreach $tGroups
   return $result;
} // end groupSort

 

Output:

Array

(

    [3] => 2

    [1] => 2

    [2] => 3

    [5] => 1

)

 

 

 

Here is my solution:

 

<?php

$group = array();

/*for($i = 0; $i < 10; ++$i) {
$group[] = rand(0, 3);
}*/

$group = array(1,1,2,2,2,3,3,5);

$START = microtime(true);

for($i = 0; $i < 1000000; ++$i) {
groupSort($group);
}

$END = microtime(true);

echo ($END - $START) . PHP_EOL;

sleep(1);

$START = microtime(true);

for($i = 0; $i < 1000000; ++$i) {
groupSort2($group);
}

$END = microtime(true);

echo $END - $START;

function kvsort($a1, $a2) {
//$ax[0] is the number and $ax[1] is the freq

/*
	If the numbers are random, you may want the == condition first since it could potentially be more common than the > or <
	since random numbers should theoretically be distributed evenly
*/
if($a1[1] < $a2[1])
	return 1;
if($a1[1] > $a2[1])
	return -1;
return ($a1[0] < $a2[0]) ? 1 : -1;
}

function groupSort2($array) {
$freq = array_count_values($array);
$container = array();
foreach($freq as $n => $f) {
	$container[] = array($n, $f);
}
//Is usort slow?  Something tells me it may be, but something else tells me
//it shouldn't be any slower than sort() except for the overhead of an extra function call
//(as opposed for the code being contained in sort())
usort($container, "kvsort");
$r = array();
foreach($container as $v) {
	$r[$v[0]] = $v[1];
}
return $r;
}

function groupSort($array) {
   $groups = array_count_values($array);
   foreach($groups as $k => $v) {
      $tGroups[$v][] = $k;
   } // end foreach $groups
   foreach($tGroups as $key => $val) {
      rsort($val);
      foreach ($val as $v) {
         $result[$v] = $key;
      } // end foreach $val
   } // end foreach $tGroups
   return $result;
} // end groupSort

 

 

Running very basic/crude benchmarks now out of curiosity.

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/#findComment-754823
Share on other sites

oops, I forgot to post my final function.  It's missing one line (the arsort towards the top):

 

function groupSort($array) {
   $groups = array_count_values($array);
   arsort($groups);
   foreach($groups as $k => $v) {
      $tGroups[$v][] = $k;
   } // end foreach $groups
   foreach($tGroups as $key => $val) {
      rsort($val);
      foreach ($val as $v) {
         $result[$v] = $key;
      } // end foreach $val
   } // end foreach $tGroups
   return $result;
} // end groupSort

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/#findComment-754836
Share on other sites

Just in case you were wondering (and couldn't tell by looking), my version is much slower.

 

 

 

It starts out around 4x slower with small sets of numbers, and as the sets grow, so does the slower-ness.  With a set of 1000 numbers, my function is about 19 times slower.

 

 

lol.  Apparently usort is slow, like I had assumed.  Also, reassigning an array of 1000 values probably takes a bit of time (both times... once for the container, and once to extract from the container).

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/#findComment-754847
Share on other sites

Haha damn... I was originally looking into using usort or some other u___ function when trying to work my code out, but it just seemed like it would just be adding some sort of middle-man into the mix.  I kind of figured it would be slower; didn't think it would be that much, lol.

 

Well anyways, I'm happy with the function I came up with.  Just thought maybe there was some easier way and I was over-thinking it. 

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/#findComment-754854
Share on other sites

Yeah, going into it, I figured usort would be slower.  With usort, there is an extra function call for each data pair, and of course function calls can add up very quickly.

 

 

Your function actually scales very well.  No offense, but I wouldn't expect it to ;p.

 

 

Wow, just did some very crude tests, and your function scales amazingly well.

 

For 1000 calls to your function,

 

With an array of 10 elements:  0.025460004806519

For an array of 1000 elements: 2.8048558235168

 

 

 

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/#findComment-754861
Share on other sites

In real life, the "variable length" of the list will only be 5-10 numbers long and the numbers aren't quite as random as that example $list, so I'm not necessarily concerned about scaling.  At least, not from that dimension.  But, the random number picking/sorting will have a variable length of times it will be ran, upwards of millions and millions, so it's good to know it scales well.

Link to comment
https://forums.phpfreaks.com/topic/143848-more-efficient-way/#findComment-754867
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.