gnomeplanet Posted April 28, 2015

I am trying to create a custom sort. I realise that it will have to be a usort function, as none of the standard methods work.

abcd fgh
abcd0
abcd1
abcd2
abcd3
abcd10
abcd11
abcd22
abcdefgh
abcd'fgh
abcd-fgh

The above is the order that I need my sort to output. Please notice the following:
a/ the list may contain numbers, punctuation, or letters.
b/ in the order of sorting, a space comes first, then numbers, then letters, then punctuation.
c/ where there are numerical characters, they are sorted in numerical order.

A variety of programs, such as Lightroom, already sort in this order, but I need to emulate this in PHP. I am hoping that someone may recognise the pattern, and so be able to help with a usort function to copy it. Thanking you for your consideration.

Barand Posted April 28, 2015

Have you tried natsort()?

gnomeplanet Posted April 28, 2015 (edited)

Hi Barand. Yes, I have tried natsort. It gives the following order:

abcd'fgh
abcd-fgh
abcd0
abcd1
abcd2
abcd3
abcd10
abcd11
abcd22
abcdefgh
abcd fgh

Here the punctuation comes first, then the numbers (although the number order is correct), then the letters, then the spaces. The order that I need is a standard one for ?? programming language. Someone must recognise the order, I would have thought.

Edited April 28, 2015 by gnomeplanet

Psycho Posted April 28, 2015

Hmm, I worked on a usort solution that compares the words character by character and almost finished it, but there was one problem. You don't want numbers treated as individual characters, so that "abc2" comes before "abc10". That makes it much more difficult. What if the values were:

abc10def
abc1def
abc2def

Does "abc10def" come before "abc2def" or the other way around?

Barand Posted April 28, 2015

On reading this post I was considering embarking on a natsort function with a custom collation sequence but chickened out. I decided I could live with natsort()'s default. I wish gnomeplanet luck in creating one. If they ever bring back the competitions, this would be a good contender.

mac_gyver Posted April 28, 2015

i would temporarily modify the 'collation' of the character set so that it will naturally sort. note: this will only work with the lower ascii character set as it uses the high/extended ascii characters to make the 'magic' work -

<?php
$d[] = "abcd fgh";
$d[] = "abcd0";
$d[] = "abcd1";
$d[] = "abcd2";
$d[] = "abcd3";
$d[] = "abcd10";
$d[] = "abcd11";
$d[] = "abcd22";
$d[] = "abcdefgh";
$d[] = "abcd'fgh";
$d[] = "abcd-fgh";

// call back function to modify the 'collation' of the characters
function _collate($str){
    $arr = str_split($str);
    foreach($arr as $key=>$char){
        if($char == ' '){
            $arr[$key] = chr(0); // space -> null
        } else {
            if(!ctype_alnum($char)){
                $arr[$key] = chr(ord($char) + 128); // convert to high/extended ascii character
            }
        }
    }
    return implode($arr);
}

// call back function to restore the 'collation' of the characters
function _decollate($str){
    $arr = str_split($str);
    foreach($arr as $key=>$char){
        if($char == chr(0)){
            $arr[$key] = " "; // null -> space
        } else {
            if(ord($char) >= 128){
                $arr[$key] = chr(ord($char) - 128); // convert from high/extended ascii character
            }
        }
    }
    return implode($arr);
}

// source and result should look like this -
echo 'Source:<pre>',print_r($d,true),'</pre>';

shuffle($d); // make data random, for testing

$d = array_map('_collate',$d); // modify the 'collation' so it will natural case sort as expected
natcasesort($d);
$d = array_map('_decollate',$d); // restore the 'collation'

echo 'Result:<pre>',print_r($d,true),'</pre>';

you can also use this method in a usort call back function by altering the 'collation' of the two input values and using strnatcasecmp() to perform the comparison on the two altered values to give the return value from the call back function.

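The usort variant mentioned at the end of that post could look roughly like the sketch below. The names collate_key() and collated_compare() are made up for illustration; the remapping is the same idea as _collate() above, and it carries the same assumption that the input is plain lower ASCII. Because the comparison works on temporary copies, no restore step is needed afterwards.

```php
<?php
// Hypothetical helper: same remapping idea as _collate() in the post above.
// Space -> NUL (sorts first), other non-alphanumeric bytes -> shifted up by
// 128 (sort after the letters). Assumes plain lower-ASCII input.
function collate_key(string $str): string
{
    if ($str === '') {
        return '';
    }
    $out = '';
    foreach (str_split($str) as $char) {
        if ($char === ' ') {
            $out .= chr(0);
        } elseif (!ctype_alnum($char)) {
            $out .= chr(ord($char) + 128);
        } else {
            $out .= $char;
        }
    }
    return $out;
}

// Hypothetical usort callback: natural, case-insensitive comparison of the
// remapped copies. The original strings are never modified.
function collated_compare(string $a, string $b): int
{
    return strnatcasecmp(collate_key($a), collate_key($b));
}

$d = ["abcd-fgh", "abcd10", "abcdefgh", "abcd fgh", "abcd2", "abcd'fgh",
      "abcd0", "abcd22", "abcd1", "abcd11", "abcd3"];
usort($d, 'collated_compare');
print_r($d);
```

The remapping runs on every comparison, so for long lists the array_map / natcasesort version in the post may be the cheaper option.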
gnomeplanet Posted April 29, 2015

Thanks, guys. It's very encouraging to see people taking this question seriously. It was only after doing some text processing in PHP that I noticed that there were differences in a supposedly 'standard' alphabetic sort from one place to another!

ANSI value sort - PSPad
abcd fgh
abcd0
abcd1
abcd10
abcd11
abcd2
abcd22
abcd3
abcdefgh
abcd'fgh
abcd-fgh

ASCII value sort - PSPad
abcd fgh
abcd'fgh
abcd-fgh
abcd0
abcd1
abcd10
abcd11
abcd2
abcd22
abcd3
abcdefgh

NUMERIC VALUE sort - PSPad
abcd fgh
abcd0
abcd1
abcd10
abcd11
abcd2
abcd22
abcd3
abcdefgh
abcd'fgh
abcd-fgh

The above orders were observed in the PSPad text editor, which gives a variety of ways to sort a list (plus ascending/descending). Just to remind you, this is the way that Adobe Lightroom sorts the list:

abcd fgh
abcd0
abcd1
abcd2
abcd3
abcd10
abcd11
abcd22
abcdefgh
abcd'fgh
abcd-fgh

and as I am developing keyword lists for Lightroom, this is what I am trying to emulate in PHP.

'Psycho' adds another level of complication still! I guess the 'numbers first' rule is the one I prefer, so his list would sort as:

abc10def
abc1def
abc2def

Full marks to 'mac_gyver' for coming up with a working solution. The case of the letters is not important, as Lightroom ignores it anyway.

Don't go to sleep yet... New problem: I have been playing around some more, thinking that all would be fixed now, but then I realised there were still discrepancies. Look at this new list:

20 def
20-45
20-def
20c
20cdef
ab def
ab-def
abcd0
abcd1
abcd2
abcd12
abcdef

Note how a space precedes a hyphen, which precedes a letter, all of the time. But Lightroom sorts this new list as:

20 def
20-45
20c
20cdef
20-def
ab def
abcd0
abcd1
abcd2
abcd12
abcdef
ab-def

What can be going on now?! What rules are these guys working to??

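One rule set that happens to reproduce both Lightroom lists quoted in this thread is sketched below. It is a guess, not Lightroom's documented behaviour: it assumes hyphens and apostrophes are skipped when comparing (they only break exact ties), that a space sorts before digits and digits before letters, that digit runs compare by value, and that case is ignored. Under that guess "20-def" is compared as "20def" and so lands after "20cdef", and "ab-def" is compared as "abdef" and so lands after "abcdef". The function names lr_tokens(), lr_rank() and lr_compare() are hypothetical, and the code needs PHP 7 or later.

```php
<?php
// Sketch only: a guessed rule set, not Lightroom's documented behaviour.
// All function names here are hypothetical.

// Split into tokens: each run of digits is one token, every other character
// is its own token. Hyphens and apostrophes are skipped (assumption), but
// they still end a digit run, so "20-45" becomes the tokens 20 and 45.
function lr_tokens(string $s): array
{
    preg_match_all('/\d+|[^\d\'-]/', strtolower($s), $m);
    return $m[0];
}

// Rank of a token's class: space < number < letter < anything else.
function lr_rank(string $t): int
{
    if ($t === ' ') {
        return 0;
    }
    if (ctype_digit($t)) {
        return 1;
    }
    if (ctype_alpha($t)) {
        return 2;
    }
    return 3;
}

function lr_compare(string $a, string $b): int
{
    $ta = lr_tokens($a);
    $tb = lr_tokens($b);
    $n = min(count($ta), count($tb));
    for ($i = 0; $i < $n; $i++) {
        $ra = lr_rank($ta[$i]);
        $rb = lr_rank($tb[$i]);
        if ($ra !== $rb) {
            return $ra <=> $rb;
        }
        // Digit runs compare by value, everything else byte by byte.
        $c = ($ra === 1)
            ? ((int) $ta[$i] <=> (int) $tb[$i])
            : strcmp($ta[$i], $tb[$i]);
        if ($c !== 0) {
            return $c;
        }
    }
    // Shorter token list first; the skipped punctuation only breaks exact ties.
    return (count($ta) <=> count($tb)) ?: strcmp($a, $b);
}

$list = ['ab-def', '20c', 'abcd12', '20 def', 'abcdef', '20-def',
         'abcd2', '20-45', 'ab def', '20cdef', 'abcd0', 'abcd1'];
usort($list, 'lr_compare');
print_r($list);
```

Two lists are a thin basis for a rule, so other punctuation, leading zeros, very long digit runs and non-ASCII characters would all need checking against Lightroom itself.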
Psycho Posted April 29, 2015 (edited)

Personally, I think you are taking the wrong approach. Trying to reverse-engineer a non-trivial sort order may never be 100% accurate. There could be nuances with specific characters that would take a lot of time and effort to verify. Just not worth the effort.

I don't use Adobe Lightroom, so I can't provide specific instructions, but I think a possibly better approach is to use AL to tell you what the sort order would be for a list of values. I would set up a page that takes a list of values as input, passes the list to AL, and outputs it with AL's sort applied. Then read the output of that page to determine how AL sorted the records. You can then apply a sort order attribute to your values. Hopefully, this is only needed once, when setting up new records, and not every time data is displayed, as that could incur unnecessary overhead.

Edited April 29, 2015 by Psycho

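The "sort order attribute" idea can be sketched in a few lines, assuming the keywords have already been copied out of Lightroom in the order it displays them (how to get them out is not covered here). The variable names are hypothetical.

```php
<?php
// Hypothetical data: keywords as Lightroom displayed them (one-off manual step).
$lightroomOrder = ['20 def', '20-45', '20c', '20cdef', '20-def'];

// Map each value to its position; this is the stored "sort order attribute".
$sortIndex = array_flip($lightroomOrder);

// The same values, arriving in any order, are then sorted by stored position.
$values = ['20-def', '20c', '20 def', '20cdef', '20-45'];
usort($values, function ($a, $b) use ($sortIndex) {
    return $sortIndex[$a] <=> $sortIndex[$b];
});
print_r($values);
```

This only covers values that have already been through Lightroom once; a new keyword has no stored position until the list is refreshed.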
gnomeplanet Posted April 30, 2015

Hi Psycho - yes, that's exactly what I have been doing: trying to think of a minimal yet suitable set of letter-number-punctuation combinations that would, when entered, illustrate the sort order that the Lightroom program produces. As far as I know, that last set of lines seems to show things off the best, though of course I might be wrong about this. Have you a better idea for determining their full sort rules?

One thing we can be reasonably sure of: they are sorting according to a standard sort in whatever language they use to write the Lightroom program. I can see no reason why they would have decided to create some unique sort for their own purposes, therefore the sort is, to them, a standard one. It's just strange to us, because PHP doesn't provide a similar ordering.

I was rather hoping that someone who is familiar with PHP and the ?? language would recognise the order being produced and be able to provide some insight.