RE: External sorting

new topic     » goto parent     » topic index » view thread      » older message » newer message

Well, I once wrote an external sort for a machine that had not one (remember
Cromemco computers, similar to PC's in the 80's, running Cromix, an
Unix-like OS?). I used a technique very similar to the NWay EU sort you
mention. There are many books explaining how to do that, of which I can
recommend Donald Knuth's "The Art of Computer Programming" (regrettably he
uses not a high-level language but a strange all-purpose Assembly invented
by him, called "MIX").
I promise to write such an external sort, not in the near future... Don't
take me too seriously in this respect, please ;)
Andy, according to what you usually post, you are by no means dumb. I think
once you read a good book or paper on external sorting, you'll be able to
program it without too much effort.
But let me briefly explain here how would I do that.
1) Read in the input file while it fits into memory (to be on the safe side,
use only 1Mb of memory, for example).
2) Internally sort the chunk that was read.
3) Write it to a temporary file, saving the starting and ending addresses in
a vector (sequence).
4) Repeat 1 - 3 until all of the input file was read.
5) Read in the chunks one record at a time, using the address vector
developed earlier to figure where to read and when to stop, and merge them
using a technique similar to the NWAY sorting. Write the resulting data to
the final output file.

What can be a bit complex is how to generalize the sort regarding possible
key positions, multiple keys, data positions, order (ascending /
descending), comparison options, etc., but Euphoria will help here, having
only a data type and the possibility of custom-comparison.

Regards.
----- Original Message -----
From: Andy Serpa <ac at onehorseshy.com>
To: EUforum <EUforum at topica.com>
Sent: Tuesday, June 10, 2003 8:23 PM
Subject: External sorting


>
>
> Has anyone written anything to do external sorting?  That is, sorting
> data in a file where the file is too large to fit entirely in memory?
> [writing out to a new file in sorted order.]  Ricardo Forno suggests in
> genfunc.e that his NWay sorts could be apapted to this task, but I'm too
> dumb to determine how one would do it.
>
> Anyone?
>
>
>
> TOPICA - Start your own email discussion group. FREE!
>
>

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu