Re: External sorting

new topic     » goto parent     » topic index » view thread      » older message » newer message

On Tue, 10 Jun 2003 23:23:12 +0000, Andy Serpa <ac at onehorseshy.com> wrote:

>
>
> Has anyone written anything to do external sorting?  That is, sorting 
> data in a file where the file is too large to fit entirely in memory? 
> [writing out to a new file in sorted order.]  Ricardo Forno suggests in 
> genfunc.e that his NWay sorts could be apapted to this task, but I'm too 
> dumb to determine how one would do it.
>
> Anyone?
>

This takes me back to my IBM mainframe days.

The general idea is to take into RAM as much as you dare, sort that and 
output it to a sort work file. You repeat this, creating as many sort work 
files as required until all the original (unsorted) data has been 
processed. This means you end up with a set of sorted files. Next, you take 
two (or more) files at a time and merge them, creating either the final 
file or additional work files. You repeat merging the work files until you 
have one completely sorted file.

There are a few optimisation possible, but this is the general idea.

To diagram this...

  [UNSORTED] --> (ram sorting) --> sortwk1, sortwk2, ...sortwkn

  [sortwk1] + [sortwk2] --> (merge phase) --> sortmg1
  [sortmg1] + [sortwk3] --> (merge phase) --> sortmg2
  . . . [sortmgn] + [sortwkn] --> (merge phase) --> final

-- 
Derek
-- 

cheers,
Derek Parnell

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu