Re: Natural sort comparison

new topic     » goto parent     » topic index » view thread      » older message » newer message

My preferred sort API might be (this is all draft, there may be niggles)

    sequence res = sort(sequence s, sequence options={}) 

where s is the data to be sorted and options is an even-length sequence with odd elements being recognised literal strings and even elements actual settings, eg

    ?sort(s,{"direction",-1,"compare",routine_id("my_compare")}) 

The available options would include at least:

even element odd element
"compare" routine_id("my_compare") where my_compare is a function that accepts two elements of s
"direction" +/-1 (look ma, no names to argue over)
"preproc" routine_id("my_preprocessor"), a function that accepts (idx, si), or a column set as per "return"
"return" -2, or {0..length(s[i])} ie: all, or a set of columns (aka retset)

As noted by Mike, "preproc" could be significantly more efficient than "compare"; it would be an error to have both.
In fact we might drop "compare" and only offer "preproc", and I now think we should.

This interface could/should effectively make custom_sort obsolete (albeit fiddly to convert compare->preproc).

I would use -2 to indicate "preproc not set", rather than -1 which could conflict with/hide errors from routine_id failures.

A preproc/retset element of 0 means "index".

I would also make it thread-safe, meaning you cannot have the file-level SortType/Column/Keys/CompareRtn.

The innards might look a bit like this:

 
    keys = s 
    if preproc!=-2 then 
        for i=1 to length(keys) do 
            if integer(preproc) then 
                keys[i] = call_func(preproc,{i,keys[i]}) 
            else 
                keys[i] = get_columns(keys[i],i,preproc) 
            end if 
        end for 
    end if 
    keys = actual_sort(keys) 
    if retset!=-2 then 
        for i=1 to length(keys) do 
            keys[i] = get_columns(keys[i],i,retset) 
        end for 
    end if 
    return keys     -- **NB** 

Note that if [parts of] the original s need to be returned, preproc must include them in keys, and "return" specified to extract them from and throw away the other working storage that formed part of keys[i].

A stable sort could be achieved by incorporating the index into keys.

I haven't attempted any of this at all, just a few ideas.

Pete

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu