1. sort constants confusing
- Posted by petelomax Jun 09, 2017
- 2850 views
http://openeuphoria.org/docs/std_sort.html#_2964_sorting
I could not work out why there were 4 sort options. It reads, at best, like you have to use ASCENDING for sort() and NORMAL_ORDER for custom_sort().
As soon as I looked at the source, the penny dropped instantly.
The constants NORMAL_ORDER and REVERSE_ORDER are just aliases for ASCENDING and DESCENDING respectively, and behave identically.
Better yet, just bin two of them (I don't much care which).
Pete
2. Re: sort constants confusing
- Posted by DerekParnell (admin) Jun 12, 2017
- 2786 views
Agreed. Why they exist is a mystery. In any case I would have expected normal/abnormal or forward/reverse pairs instead.
3. Re: sort constants confusing
- Posted by Spock Jun 12, 2017
- 2778 views
Agreed. Why they exist is a mystery. In any case I would have expected normal/abnormal or forward/reverse pairs instead.
Stranger still is this sheepish note in the documentation for sort(), custom_sort() and sort_columns():
"This function uses the "Shell" sort algorithm. This sort is not "stable", i.e. elements that are considered equal might change position relative to each other"
Really? Does \std still have unstable sort routines after all these years or was it fixed and someone just needs to update the documentation? I thought we had thrashed this out in 2014? In fact, it was indicated as far back as 2011 but nobody did anything about it.
Spock
4. Re: sort constants confusing
- Posted by jimcbrown (admin) Jun 13, 2017
- 2744 views
I could not work out why there were 4 sort options. It reads, at best, like you have to use ASCENDING for sort() and NORMAL_ORDER for custom_sort().
As soon as I looked at the source, the penny dropped instantly.
The constants NORMAL_ORDER and REVERSE_ORDER are just aliases for ASCENDING and DESCENDING respectively, and behave identically.
Agreed. Why they exist is a mystery. In any case I would have expected normal/abnormal or forward/reverse pairs instead.
I traced the history of it.
ASCENDING/DESCENDING was CChris's idea - before that the code only sorted in ascending - http://scm.openeuphoria.org/hg/euphoria/annotate/d63e0d759325/include/std/sort.e#16
But he didn't invent the term - the initial commit for the file has the term 'ascending' in it - http://scm.openeuphoria.org/hg/euphoria/annotate/076e4e3c0f12/include/std/sort.e
I checked my copy of sort.e from RDS Eu 2.3 and it uses the same term. So ASCENDING goes back to RobCraig.
Derek was the one who added the custom_sort() aliases - and the misleading comments to go with it - http://scm.openeuphoria.org/hg/euphoria/annotate/1debb13c40cf/include/std/sort.e#24
Derek, why did you add a normal/reverse pair instead of your expected forward/reverse pair ?
5. Re: sort constants confusing
- Posted by DerekParnell (admin) Jul 02, 2017
- 2633 views
Derek, why did you add a normal/reverse pair instead of your expected forward/reverse pair ?
No idea now. Probably stupidity but that is not the salient lesson here, in my opinion.
The better question would be to ask why did such poor code get to stay in the distribution? I suspect that the answer to that relates to the lack of peer review before release packaging.
6. Re: sort constants confusing
- Posted by _tom (admin) Jul 04, 2017
- 2603 views
suggested | current |
---|---|
rising | ascending, normal_order |
falling | descending, reverse_order |
Can we standardize on rising and falling? These are the simplest English words I can suggest.
_tom
7. Re: sort constants confusing
- Posted by petelomax Jul 04, 2017
- 2610 views
- Last edited Jul 05, 2017
Can we standardize on rising and falling?
Let's not. There is nothing whatsoever wrong with any of the terms/constant names as given, the problem is just the implication that they are somehow different.
Having four constants implies (to my tiny brain) that there are four separate meanings, when in fact there are only two, which is as clear as can be in the source, but not in the documentation.
It is fair to say that someone would only ever notice this if they had not just looked at the source file.
Assuming that it is not practical to change eu2doc (or whatever it is called) to stop it from ripping out the "=" parts,
I think the simplest solution would just be to change, in std/sort.e itself:
--** The normal sort order used by the custom comparison routine. NORMAL_ORDER = ASCENDING,
to
--** The normal sort order used by the custom comparison routine. (same as ASCENDING) NORMAL_ORDER = ASCENDING,
and
--** Reverses the sense of the order returned by a custom comparison routine. REVERSE_ORDER = DESCENDING
to
--** Reverses the sense of the order returned by a custom comparison routine. (same as DESCENDING) REVERSE_ORDER = DESCENDING
I retract my suggestion to "bin two of em"
Pete
8. Re: sort constants confusing
- Posted by _tom (admin) Jul 04, 2017
- 2603 views
Simplify and add lightness. For an even smaller brain, ascending is more complex then rising; and such with descending vs falling. Nothing wrong with a total rewrite and rethink of the library.
We must include msort.e http://www.rapideuphoria.com/msort.zip in the standard library; possibly as a separate .e file.
Then, a clean way to present:
- qsort (quick an dirty unstable sort)
- insertion sort; tree/heap management (to maintain a sorted list)
- column sorting
- ????
I just downloaded all of the sort files from the Archive; this will take time to "sort out."
_tom
9. Re: sort constants confusing
- Posted by petelomax Jul 04, 2017
- 2616 views
I just downloaded all of the sort files from the Archive; this will take time to "sort out."
From March this year: http://rosettacode.org/wiki/Compare_sorting_algorithms%27_performance#Phix
10. Re: sort constants confusing
- Posted by _tom (admin) Jul 04, 2017
- 2573 views
Thanks Pete, now it will take even longer to "sort out."
There is another dimension to this stuff where an "unfashionable" algorithm is better on a small data-set (or some specific dataset) than a "fashionable" algorithm.
_tom
11. Re: sort constants confusing
- Posted by Spock Jul 04, 2017
- 2573 views
- Last edited Jul 05, 2017
Thanks Pete, now it will take even longer to "sort out."
There is another dimension to this stuff where an "unfashionable" algorithm is better on a small data-set (or some specific dataset) than a "fashionable" algorithm.
_tom
The central routine in msort.zip first checks the size of the data set and then selects one of two options:
n < 200 (call fast insertion sort)
else (call merge sort supplemented with fast insertion sort)
So whatever n there is the sort performance is always good, always stable. And without the stability property sorting by column is not guaranteed to be correct - an important goal in the real-world. Quicksort and Shellsort [Edit: Heapsort too] are not naturally stable sorts.
It would be interesting to compare the performance of the Phix code on Rosetta versus msort but.. I couldn't get the former to work (something about blah blah Visual .. 2013 blah blah) so I gave up.
Spock
12. Re: sort constants confusing
- Posted by petelomax Jul 04, 2017
- 2564 views
I couldn't get the former to work (something about blah blah Visual .. 2013 blah blah) so I gave up.
Yup. That would be msvcr120.dll not found, needed to run IUP. Sorry about that, not much I can do better.
It does at least suggest https://www.microsoft.com/en-us/download/details.aspx?id=40784 (a 6MB download)
Pete
13. Re: sort constants confusing
- Posted by Spock Jul 04, 2017
- 2546 views
I couldn't get the former to work (something about blah blah Visual .. 2013 blah blah) so I gave up.
Yup. That would be msvcr120.dll not found, needed to run IUP. Sorry about that, not much I can do better.
It does at least suggest https://www.microsoft.com/en-us/download/details.aspx?id=40784 (a 6MB download)
Pete
I installed it but the demo is still not running. Copying the dll to the various sub-folders of Phix didn't work either.
Spock
14. Re: sort constants confusing
- Posted by DerekParnell (admin) Jul 05, 2017
- 2494 views
suggested | current |
---|---|
rising | ascending, normal_order |
falling | descending, reverse_order |
Can we standardize on rising and falling? These are the simplest English words I can suggest.
I hope not. I've been giving this a bit more thought and I'm starting to think that normal / reverse are actually good choices.
In any given language, there is a collating order for characters (graphemes) that is accepted as the 'normal' way to do it, and the most common variation to the normal sequence is to 'reverse' it.
15. Re: sort constants confusing
- Posted by _tom (admin) Jul 07, 2017
- 2410 views
suggested | current |
---|---|
rising | ascending, normal_order |
falling | descending, reverse_order |
Can we standardize on rising and falling? These are the simplest English words I can suggest.
I hope not. I've been giving this a bit more thought and I'm starting to think that normal / reverse are actually good choices.
In any given language, there is a collating order for characters (graphemes) that is accepted as the 'normal' way to do it, and the most common variation to the normal sequence is to 'reverse' it.
OK.
We can standardize on NATURAL and REVERSE; the other choices will be there just there to help define "natural." We naturally write from left to right, hence small to large. But, we are accustomed to writing numbers themselves in their "natural" Arabic form!
- Output is in natural order which is "sorted from small to large (ascending or rising)."
- The optional output is reverse order which is "sorted from large to small (descending or falling)."
_tom
16. Re: sort constants confusing
- Posted by ne1uno Jul 07, 2017
- 2439 views
suggested | current | suggested | current |
---|---|---|---|
rising | ascending, normal_order | [quote _tom] [quote DerekParnell] [quote _tom] | rising |
ascending, normal_order | |||
falling | |||
descending, reverse_order |
Can we standardize on rising and falling? These are the simplest English words I can suggest. [/quote]
I hope not. I've been giving this a bit more thought and I'm starting to think that normal / reverse are actually good choices.
In any given language, there is a collating order for characters (graphemes) that is accepted as the 'normal' way to do it, and the most common variation to the normal sequence is to 'reverse' it.
[/quote]
OK.
We can standardize on NATURAL and REVERSE; the other choices will be there just there to help define "natural." We naturally write from left to right, hence small to large. But, we are accustomed to writing numbers themselves in their "natural" Arabic form!
- Output is in natural order which is "sorted from small to large (ascending or rising)."
- The optional output is reverse order which is "sorted from large to small (descending or falling)."
_tom
[/quote]
falling | descending, reverse_order |
Can we standardize on rising and falling? These are the simplest English words I can suggest.
I hope not. I've been giving this a bit more thought and I'm starting to think that normal / reverse are actually good choices.
In any given language, there is a collating order for characters (graphemes) that is accepted as the 'normal' way to do it, and the most common variation to the normal sequence is to 'reverse' it.
OK.
We can standardize on NATURAL and REVERSE; the other choices will be there just there to help define "natural." We naturally write from left to right, hence small to large. But, we are accustomed to writing numbers themselves in their "natural" Arabic form!
- Output is in natural order which is "sorted from small to large (ascending or rising)."
- The optional output is reverse order which is "sorted from large to small (descending or falling)."
_tom
natural is not a good choice. usually refers to a sort in numeric order try sorting 1, 2, 10, 20 you can't do it.
the normal ASCHII order is 1,10,2,20 there is a natural sort in the archive
17. Re: sort constants confusing
- Posted by Spock Jul 09, 2017
- 2374 views
- Last edited Jul 10, 2017
Can we standardize on rising and falling? These are the simplest English words I can suggest.
In any given language, there is a collating order for characters (graphemes) that is accepted as the 'normal' way to do it, and the most common variation to the normal sequence is to 'reverse' it.
True. The normal order for sorting is ascending. Sort({3,1,5,2,4}) ought to result in {1,2,3,4,5} without any need to explicitly specify the order.
The algorithm(s) in msort.zip all produce ascending order results. To have it descending I simply reverse the result, eg:
s = reverse( sort(s, n) ) -- n is column ['coz I have lots of column sorting processes]
However, because of the stability principle, this is not strictly the same as, eg:
s = sort(s, n, REVERSE)
Nevertheless, YAGNI and Occam's razor
So in the end that would lead to, eg:
s = sort(s) s = sort_reverse(s)
An optional parameter could give the desired column sorting.
Spock
18. Re: sort constants confusing
- Posted by kinz Jul 10, 2017
- 2313 views
I like the following forms of the sort command:
a=sort(b, UP) b=sort(a, DN) -- or DOWN a=sort(b, TO_MAX) b=sort(a, TO_MIN)
They are just self explained, IMHO.
Regards,
kinz
19. Re: sort constants confusing
- Posted by _tom (admin) Jul 12, 2017
- 2247 views
Base on the input from above...
My suggestion:
- leave the std/sort.e alone for compatibility reasons
eventually deprecate it? - in the demo folder: provide a set of sort algorithms (from allsorts and rosetta code) for those that have special requirements
- for stable sorting add: http://www.rapideuphoria.com/msort.zip from Mike
- for natural sorting add: http://www.rapideuphoria.com/nat_sort.zip from Andy Serpa
- The new sorting.e library will work on OE and Phix.
My working sorting definitions are:
stable
- as found in msort but not in std/sort.e
rank
- as output by the compare function (larger or smaller) or custom_compare
direction
- UP or DN (default small to large )
order
- the resulting sorted sequence
- default is numerical (mathematical number line)
- ASCII order for historical reasons
- unicode order is what UTF8 gives you
- natural order (as in nat_sort.zip)
- user defined for things like sorting playing cards
_tom
20. Re: sort constants confusing
- Posted by Spock Jul 12, 2017
- 2248 views
Base on the input from above...
My suggestion: .. for stable sorting add: http://www.rapideuphoria.com/msort.zip from Mike
Msort is also highly performant.
Spock
21. Re: sort constants confusing
- Posted by _tom (admin) Jul 19, 2017
- 2167 views
A demo of the new sorting include is: http://openeuphoria.org/pastey/293.wc
It provides stable sorting, column sorting, index sorting, along with number and natural order options. All sorts are in the up direction.
After changing a few identifiers this works on OE and Phix.
_tom
Forked into: Natural sort comparison
22. Re: sort constants confusing
- Posted by kinz Aug 01, 2017
- 2105 views
Tom wrote:
My working sorting definitions are:
Let's see.
stable
- as found in msort but not in std/sort.e
"Stable" doesn't seem to be one of the definitions at all.
Unstable is just bad. So "stable" is good? Or what?
So, maybe, msort is just one of the specific sortings?
And requires more specific definition?
rank
- as output by the compare function (larger or smaller) or custom_compare
Try to find rank in The Archive of RDS.
I found just "Frank Dowling".
Just 2 entries.
So, rank is not the Euphoria language word at all.
BTW, for example, I am the retired captain of the second rank
of Russian Navy, former submariner. Maybe....???
direction
- UP or DN (default small to large )
In classic EU (earlier 4), it is ORDER,
not direction. Direction is extra parameter.
Why?
EU works with numbers and only with numbers
and their values.
ERM reads:
"There aren't really any "characters" in Euphoria, just numbers (atoms)."
So, EU has just 2 kinds of sortings:
usual (just for numbers)
and
special (or specific) for the user defined types
(strings encoded as ascii, utf, koi8-r, unicode, etc etc).
Main EU sorting is usual, just for numbers, in two orders,
just "up" and "down".
order
- the resulting sorted sequence
- default is numerical (mathematical number line)
- ASCII order for historical reasons
- unicode order is what UTF8 gives you
- natural order (as in nat_sort.zip)
- user defined for things like sorting playing cards
No. After "default", i.e. usual, these are specific sortings, not orders.
Regards
kinz
23. Re: sort constants confusing
- Posted by _tom (admin) Aug 01, 2017
- 2104 views
The English language is not helping at all. We are using words with fuzzy meanings. The word "natural" seems to be a particular problem when used to describe sorting.
The solution is going to be that a few words will be chosen and then defined for use within Euphoria--ignoring the use (and misuse) of the same words elsewhere.
The idea of "stable sorting" is important. Download http://www.rapideuphoria.com/msort.zip. Spock has explained "stable" clearly in the docs that come with Mike Sort.
Here is another idea for your consideration:
To sort values is "to order values into numerical order from small to large."
To collate values is "to order values into a logical order from small to large."
- We sort numbers
There are no arguments as to what number value comes next. Some people call this "natural order". Sorting "up" from small to large is also "natural."
the eu:compare function will sort number values without problem
- we collate alphanumeric values
Collated data does not have to be in number order. When ascii encoded characters are sorted in numerical order the alphabetic order may not agree with everyone's idea of the correct order. Thus we want a "custom sort" that puts characters, into what is for characters, a "natural order". Some call this "lexical order" to distinguish it from number sorting.
We need a custom algorithm to collate values. Collating must consider ascii and utf encoding for characters.
The problem is if I just use "natural" you can not be sure what I mean unless you look at a definition I create for exclusive use in OE.
_tom
24. Re: sort constants confusing
- Posted by kinz Aug 15, 2017
- 1983 views
The English language is not helping at all...
Yes, Russian is much better, but English is not that bad too.
Do not try to define stable, natural, logical, ascii etc orders.
We have just two orders - UP and DOWN (DN), period.
Instead, describe the sortings using as many words as you need.
And dont worry be happy
Regards
kinz