
Create custom sort with usort


gnomeplanet


I am trying to create a custom sort. I realise that it will have to be a usort function, as none of the standard methods work.

abcd fgh
abcd0
abcd1
abcd2
abcd3
abcd10
abcd11
abcd22
abcdefgh
abcd'fgh
abcd-fgh

The above is the order that I need my sort to output. Please notice the following:

a/ the list may contain digits, punctuation, or letters.

b/ in the order of sorting, a space comes first, then digits, then letters, then punctuation.

c/ where there are runs of digits, they are sorted in numerical order.

 

A variety of programs, such as Lightroom, already sort in this order, but I need to emulate this in PHP.

 

I am hoping that someone may recognise the pattern, and so be able to help with a usort function to copy it.

 

Thanking you for your consideration.

 


Hi Barand. Yes, I have tried natsort. It gives the following order:

abcd'fgh
abcd-fgh
abcd0
abcd1
abcd2
abcd3
abcd10
abcd11
abcd22
abcdefgh
abcd fgh

where the punctuation comes first, then the numbers (although the number order is correct), and then the letters, then the spaces.

 

The order that I need is a standard one for ?? programming language. Someone must recognise the order, I would have thought.

Edited by gnomeplanet

Hmm, I worked on a usort solution that compares each character between words and almost finished it, but there was one problem. You don't want numbers treated as individual characters. So, "abc2" would come before "abc10". That makes it much more difficult. What if the values were:

 

abc10def

abc1def

abc2def

 

Does the "abc10def" come before "abc2def" or the other way around?
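To make the question concrete, here is a minimal sketch of the two candidate answers, using only PHP's built-in sort flags: a plain character-by-character comparison (SORT_STRING) and a natural comparison (SORT_NATURAL), which treats each run of digits as one number. The variable names are illustrative only.

```php
<?php
$values = ['abc10def', 'abc1def', 'abc2def'];

// Character by character: after "abc1", the '0' sorts before the 'd'.
$byChar = $values;
sort($byChar, SORT_STRING);

// Natural order: each run of digits is compared as a whole number.
$natural = $values;
sort($natural, SORT_NATURAL);

echo implode(', ', $byChar), "\n";   // abc10def, abc1def, abc2def
echo implode(', ', $natural), "\n";  // abc1def, abc2def, abc10def
```

Which of the two is "right" depends on whether a digit run in the middle of a word should count as a number or as ordinary characters.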


On reading this post I was considering embarking on a natsort function with a custom collation sequence but chickened out. I decided I could live with natsort()'s default. I wish gnomeplanet luck in creating one.

 

If they ever bring back the competitions, this would be a good contender :)


I would temporarily modify the 'collation' of the characters so that a natural sort produces the required order. Note: this only works with the lower (7-bit) ASCII character set, because it uses the high/extended ASCII codes to make the 'magic' work:

<?php
$d[] = "abcd fgh";
$d[] = "abcd0";
$d[] = "abcd1";
$d[] = "abcd2";
$d[] = "abcd3";
$d[] = "abcd10";
$d[] = "abcd11";
$d[] = "abcd22";
$d[] = "abcdefgh";
$d[] = "abcd'fgh";
$d[] = "abcd-fgh";

// call back function to modify the 'collation' of the characters
function _collate($str){
    $arr = str_split($str);
    foreach($arr as $key=>$char){
        if($char == ' '){
            $arr[$key] = chr(0); // space -> null
        } else {
            if(!ctype_alnum($char)){
                $arr[$key] = chr(ord($char) + 128); // convert to high/extended ascii character
            }
        }
    }
    return implode($arr);
}

// call back function to restore the 'collation' of the characters
function _decollate($str){
    $arr = str_split($str);
    foreach($arr as $key=>$char){
        if($char == chr(0)){
            $arr[$key] = " "; // null -> space
        } else {
            if(ord($char) >= 128){
                $arr[$key] = chr(ord($char) - 128); // convert from high/extended ascii character
            }
        }
    }
    return implode($arr);
}

// source and result should look like this -
echo 'Source:<pre>',print_r($d,true),'</pre>';

shuffle($d); // make data random, for testing

$d = array_map('_collate',$d); // modify the 'collation' so it will natural case sort as expected

natcasesort($d);

$d = array_map('_decollate',$d); // restore the 'collation'

echo 'Result:<pre>',print_r($d,true),'</pre>';

You can also use this method in a usort() callback: alter the 'collation' of the two input values, compare the altered values with strnatcasecmp(), and return that result from the callback.
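A minimal sketch of that usort() variant follows. The function names collate_key() and collated_natcase_compare() are made up for this example, and the same 7-bit ASCII assumption applies. Because only the comparison keys are altered, the original strings never need to be converted back.

```php
<?php
// Hypothetical helper: remap characters so that strnatcasecmp() orders
// space < digits < letters < punctuation (assumes 7-bit ASCII input).
function collate_key(string $str): string
{
    $out = '';
    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        $char = $str[$i];
        if ($char === ' ') {
            $out .= chr(0);                 // space -> NUL, sorts first
        } elseif (!ctype_alnum($char)) {
            $out .= chr(ord($char) + 128);  // punctuation -> high byte, sorts last
        } else {
            $out .= $char;                  // letters and digits are left alone
        }
    }
    return $out;
}

// usort() callback: compare the remapped keys, not the original values.
function collated_natcase_compare(string $a, string $b): int
{
    return strnatcasecmp(collate_key($a), collate_key($b));
}

$d = ["abcd-fgh", "abcd10", "abcdefgh", "abcd fgh", "abcd2", "abcd'fgh", "abcd0"];
usort($d, 'collated_natcase_compare');
print_r($d); // abcd fgh, abcd0, abcd2, abcd10, abcdefgh, abcd'fgh, abcd-fgh
```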


Thanks, guys. It's very encouraging to see people taking this question seriously. It was only after doing some text processing in PHP that I noticed that there were differences in a supposedly 'standard' alphabetic sort from one place to another!

 

 

ANSI value sort - PSPad

abcd fgh
abcd0
abcd1
abcd10
abcd11
abcd2
abcd22
abcd3
abcdefgh
abcd'fgh
abcd-fgh

ASCII value sort - PSPad

abcd fgh
abcd'fgh
abcd-fgh
abcd0
abcd1
abcd10
abcd11
abcd2
abcd22
abcd3
abcdefgh


NUMERIC VALUE sort - PSPad

abcd fgh
abcd0
abcd1
abcd10
abcd11
abcd2
abcd22
abcd3
abcdefgh
abcd'fgh
abcd-fgh

The above orders were observed in the PSPad text editor, which gives a variety of ways to sort a list (plus ascending/descending). Just to remind you, the way that follows is the way that Adobe Lightroom sorts the list:
 

abcd fgh
abcd0
abcd1
abcd2
abcd3
abcd10
abcd11
abcd22
abcdefgh
abcd'fgh
abcd-fgh

and as I am developing keyword lists for Lightroom, this is what I am trying to emulate in PHP. 'Psycho' adds another level of complication still! I guess the 'numbers first' rule is the one I prefer, so his list would sort as:
 

abc10def
abc1def
abc2def


Full marks to 'mac_gyver' for coming up with a working solution. The case of the letters is not important, as Lightroom ignores them anyway.

 

Don't go to sleep, yet....

New Problem: I have been playing around some more, thinking that all would be fixed now, but then I realised there were still discrepancies: look at this new list:
 

20 def
20-45
20-def
20c
20cdef
ab def
ab-def
abcd0
abcd1
abcd2
abcd12
abcdef

Note how a space precedes a hyphen precedes a letter all of the time..


but Lightroom sorts this new list as:
 

20 def
20-45
20c
20cdef
20-def
ab def
abcd0
abcd1
abcd2
abcd12
abcdef
ab-def

What can be going on now?! What rules are these guys working to??
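One guess that happens to reproduce both Lightroom lists above (purely an assumption drawn from these two samples, not documented Lightroom behaviour): hyphens and apostrophes are ignored on the first pass and only used to break ties, while a space still sorts first, then numbers, then letters. A rough sketch, with made-up function names and ASCII input assumed:

```php
<?php
// Split into digit runs and single characters, dropping ' and -
// (ASSUMPTION: these two characters are ignorable on the first pass).
function lr_tokens(string $s): array
{
    preg_match_all('/\d+|[^\d\'-]/', strtolower($s), $m);
    return $m[0];
}

// Class of a token: space first, then numbers, then letters, then the rest.
function lr_rank(string $t): int
{
    if ($t === ' ') return 0;
    if (ctype_digit($t)) return 1;
    if (ctype_alpha($t)) return 2;
    return 3;
}

function lr_compare(string $a, string $b): int
{
    $ta = lr_tokens($a);
    $tb = lr_tokens($b);
    $n = min(count($ta), count($tb));
    for ($i = 0; $i < $n; $i++) {
        $x = $ta[$i];
        $y = $tb[$i];
        $cmp = lr_rank($x) <=> lr_rank($y);
        if ($cmp === 0) {
            // Numbers compare by value (int cast: fine for short runs only),
            // everything else by character code.
            $cmp = lr_rank($x) === 1 ? ((int)$x <=> (int)$y) : strcmp($x, $y);
        }
        if ($cmp !== 0) return $cmp;
    }
    // Shorter token list first; the raw strings break any remaining tie.
    return (count($ta) <=> count($tb)) ?: strcmp($a, $b);
}

$list = ['ab-def', '20c', 'abcd12', '20-45', 'abcdef', '20 def',
         'abcd2', '20-def', 'ab def', 'abcd0', '20cdef', 'abcd1'];
usort($list, 'lr_compare');
print_r($list);
```

If Lightroom really works this way, "ab-def" lands after "abcdef" because it is compared as "abdef". More test strings with other punctuation would be needed before trusting the guess.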



 


Personally, I think you are taking the wrong approach. Trying to reverse-engineer a non-trivial sort order may never be 100% accurate. There could be nuances with specific characters that would take a lot of time and effort to verify. Just not worth the effort.

 

I don't use Adobe Lightroom (AL), so I can't provide specific instructions, but I think a better approach may be to let AL tell you what the sort order is for a list of values. I would set up a page that takes a list of values as input and outputs that list in the order AL sorts it. Then read the output of that page to determine how AL sorted the records. You can then store a sort order attribute against your values.

 

Hopefully, this is just needed one time when setting up new records and not needed every time data is displayed as that could incur unnecessary overhead.
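As a rough illustration of the stored sort order idea, assuming you already have the values in the order AL displayed them (the $lightroomOrder array below is a hypothetical stand-in for that one-time output):

```php
<?php
// Hypothetical input: the keywords in the order Lightroom displayed them,
// captured once (for example copied by hand from the application).
$lightroomOrder = ['20 def', '20-45', '20c', '20cdef', '20-def'];

// Map each value to its position: this is the stored "sort order attribute".
$rank = array_flip($lightroomOrder);

$values = ['20-def', '20c', '20 def', '20cdef', '20-45'];
usort($values, function (string $a, string $b) use ($rank): int {
    // Values with no stored position are pushed to the end.
    return ($rank[$a] ?? PHP_INT_MAX) <=> ($rank[$b] ?? PHP_INT_MAX);
});
print_r($values); // same order as $lightroomOrder
```

In a database the position would simply be an extra integer column that you ORDER BY.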

Edited by Psycho

Hi Psycho - yes, that's exactly what I have been doing: trying to think of a minimal yet suitable set of letter-number-punctuation combinations that would, when entered, illustrate the sort order that the Lightroom program produces. As far as I know, that last set of lines seems to show things off the best, though of course, I might be wrong about this. Do you have a better idea for determining their full sort rules?

 

One thing we can be reasonably sure of: they are sorting according to a standard sort in whatever language they use to write the Lightroom program. I can see no reason why they would have decided to create some unique sort for their own purposes, therefore the sort is, to them, a standard one. It's just strange to us, because PHP doesn't provide a similar ordering. I was rather hoping that someone who is familiar with PHP and the ?? language would recognise the order being produced and be able to provide some insight.

