ksort issue


ejaboneta

I'm having a weird issue with ksort. When there's a '+' in the key, it throws off the array order. (See example at http://codepad.org/iNCUGe07)

for example:

 

<?php
$array = array(
'F+' => 'F+',
'N' => 'N',
'A' => 'A',
'M' => 'M', 
'B' => 'B', 
'D' =>  'D',
'G' => 'G', 
'C' => 'C', 
'F' => 'F',
'K' => 'K',
'0' => '0',
);
ksort($array);
print_r($array);
?>

 

I'd expect this:

Array
(
    [0] => 0
    [A] => A
    [B] => B
    [C] => C
    [D] => D
    [F] => F
    [F+] => F+
    [G] => G
    [K] => K
    [M] => M
    [N] => N
)

 

But what I get is this:

    [A] => A
    [C] => C
    [0] => 0
    [B] => B
    [D] => D
    [F] => F
    [F+] => F+
    [G] => G
    [K] => K
    [M] => M
    [N] => N

 

What's wrong?

 

Update: I've figured out it's a combination of the 0 key and keys more than one character long, not the '+'.


I haven't researched the exact reason, but it seems it's due to the sorting algorithm not using a "natural" order. You can overcome the problem by using natcasesort with a double dose of array_flip. If the keys and values are always the same, you don't need array_flip() at all, you can just natcasesort() the array.

 

$array = array_flip($array);
natcasesort($array);
$array = array_flip($array);

echo '<pre>';
print_r($array);
echo '</pre>';
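
One caveat with the flip trick: array_flip() drops entries when two values are the same, so it only suits arrays like the one above. A possible alternative, sketched here on the assumption that sorting the keys directly is what you want, is uksort() with strnatcasecmp() as the comparison. The shortened array is just for illustration, and the (string) casts are there because PHP stores the key '0' as the integer 0.

```php
<?php
// Shortened version of the array from the question (illustrative data).
$array = array('F+' => 'F+', 'N' => 'N', 'A' => 'A', 'B' => 'B', 'F' => 'F', '0' => '0');

// Sort by key using a case-insensitive "natural order" comparison.
// Keys are cast to string because the key '0' is stored as int 0.
uksort($array, function ($a, $b) {
    return strnatcasecmp((string) $a, (string) $b);
});

print_r(array_keys($array)); // keys in order: 0, A, B, F, F+, N
```

Because every comparison here is string against string, the mixed-type comparisons never come into play.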


Known bug. But it's not about the plus - it has to do partly with the "0" key (try sorting without it). ksort() has problems of the "undefined behavior" type when you mix integer/integer-string and string keys* but there's more to it than just that.

 

You can work around it by passing in a sort flag.

ksort($array, SORT_STRING);

 

* Array keys that are integer strings, like "0" and "123", are silently converted to actual numeric keys (0 and 123).

var_dump(array_keys(array("0" => "zero")));
// [0]=>int(0)
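
Putting the two points together, here is a small sketch of the workaround on a shortened version of the original array (the data is illustrative). With SORT_STRING every key is compared as a string, so the integer key 0 is treated as "0" and sorts ahead of the letters.

```php
<?php
// Shortened version of the array from the question (illustrative data).
$array = array('F+' => 'F+', 'N' => 'N', 'A' => 'A', 'C' => 'C', 'F' => 'F', '0' => '0');

// Compare all keys as strings instead of using the default SORT_REGULAR rules.
ksort($array, SORT_STRING);

// The key '0' is still stored as int 0, but it now sorts first.
var_dump(array_keys($array)); // keys in order: 0, A, C, F, F+, N
```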


I love delving into these sorts of problems and I think I know what it is so I figure I might as well share:

(this is all according to the PHP 5.4.0 source code but I'm sure the relevant bits haven't changed in a while)

(note that ksort() and sort() and friends work exactly the same way except for what bits they're looking at - ksort() looks at keys while sort() looks at values)

 

When built-in sorting functions do a normal sort they run a comparison function against pairs of values - exactly how the u*sort()s work with your own comparison function. The default function (sort_flags=SORT_REGULAR) is the first half of the issue. It has a long list of actions it takes based on the types of the arguments. For instance, if the two are long* then it compares them as longs, and if one's double* while the other is long then it compares them as doubles. There is nothing that compares strings versus longs so that case falls into the default action. Said default is to cast both to numbers (either long or double as needed) and compare that way.
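
To make that fallback concrete, here's a rough userland emulation of the rules described above. It's only a sketch: legacy_compare() is a made-up name, it assumes PHP 7+ for the <=> operator, and it covers only the int/float/string cases. It uses explicit casts so it behaves the same on any version (PHP 8 changed how the built-in operators compare numbers with non-numeric strings).

```php
<?php
// Hypothetical helper, not part of PHP: mimics the PHP 5-era SORT_REGULAR
// comparison for ints, floats and strings only.
function legacy_compare($a, $b)
{
    if (is_string($a) && is_string($b)) {
        // Two numeric strings are compared as numbers.
        if (is_numeric($a) && is_numeric($b)) {
            return (float) $a <=> (float) $b;
        }
        // Otherwise two strings are compared as strings.
        return strcmp($a, $b) <=> 0;
    }
    // Default action: cast both sides to numbers. A non-numeric
    // string such as "ABC" or "F+" becomes 0 here.
    return (float) $a <=> (float) $b;
}

var_dump(legacy_compare(5, 12));      // int(-1)
var_dump(legacy_compare("ABC", 123)); // int(-1)
var_dump(legacy_compare(0, "F+"));    // int(0)
var_dump(legacy_compare("F", "F+"));  // int(-1)
```

The third call is the interesting one for the original question: the key 0 compares as equal to every non-numeric key, while those keys still have an order among themselves.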

The second half of the issue is how PHP (Zend, actually) does sorting: quicksort**. Basically the list is split into two parts, where the first holds all the items less than the pivot and the second all the items greater than the pivot. The pivot then gets placed right in the middle. The two parts are sorted the same way recursively until the base case where there are only a couple items.

 

Here's where it gets unpredictable. When the array consists of mixed strings and numbers the comparisons may contradict each other. The simplest example is an array of three items:

array("ABC", 123, "789")

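There are three different comparisons:

"ABC" < 123    (the string is cast to the number 0)
123 < "789"    (the string is cast to the number 789)
"789" < "ABC"  (both are strings, so they are compared as strings)

Thus a contradiction: "ABC" < 123 < "789" < "ABC".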
 
Quite reasonably, quicksort does not look for contradictions like this because it assumes that the comparison function cannot possibly create any. Once two items have been compared it doesn't go back over them, so there are no infinite loops of constantly moving list items around as the function dictates. The end result is undefined behavior, because the same array items in different starting orders produce different results:
"ABC",  123,  "789" => "ABC",  123,  "789"
"ABC", "789",  123  =>  123,  "789", "ABC"
123,  "ABC", "789" => "789", "ABC",  123
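
The same effect can be reproduced outside the engine with a toy quicksort. This is a sketch, not Zend's code: legacy_compare() and toy_quicksort() are made-up names, the pivot is simply the first item, and PHP 7+ is assumed for the <=> operator. The exact orders therefore won't match the ones listed above; the point is only that the result depends on the starting order.

```php
<?php
// Hypothetical helper: emulates the PHP 5-era comparison via explicit casts.
function legacy_compare($a, $b)
{
    if (is_string($a) && is_string($b) && !(is_numeric($a) && is_numeric($b))) {
        return strcmp($a, $b) <=> 0;     // two strings: compare as strings
    }
    return (float) $a <=> (float) $b;    // anything else: compare as numbers
}

// Toy quicksort: first item is the pivot, smaller items go left,
// everything else goes right, then both parts are sorted recursively.
function toy_quicksort(array $items, callable $cmp)
{
    if (count($items) < 2) {
        return $items;
    }
    $pivot = array_shift($items);
    $left = array();
    $right = array();
    foreach ($items as $item) {
        if ($cmp($item, $pivot) < 0) {
            $left[] = $item;
        } else {
            $right[] = $item;
        }
    }
    return array_merge(
        toy_quicksort($left, $cmp),
        array($pivot),
        toy_quicksort($right, $cmp)
    );
}

$r1 = toy_quicksort(array("ABC", 123, "789"), 'legacy_compare'); // "789", "ABC", 123
$r2 = toy_quicksort(array(123, "ABC", "789"), 'legacy_compare'); // "ABC", 123, "789"
$r3 = toy_quicksort(array("789", "ABC", 123), 'legacy_compare'); // 123, "789", "ABC"
var_dump($r1, $r2, $r3);
```

Each run believes it has produced a sorted list, and each one is a different rotation of the same cycle, depending on which item happened to be the pivot.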