Re: Sequences and long files

From:    Irv

>There's also the problem of Euphoria's native sequence
>format: a 300 meg data file would probably take 2 or 3
>times that much disk space.
>If s = {"Now is the time",123}, -- 20 bytes more or less --
>it takes 63 bytes to store on disk with a print(fn,s)

     A few more instructions keep the format readable by get() but save a
good deal of disk space.  Use printf to write the strings as quoted strings,
and write the quotes, braces, and commas out yourself.

-- fn is an open file number; record must hold record number
-- "whatever" on each pass through the outer loop
puts( fn, "{" )
for whatever = 1 to NumberOfRecords do
   puts( fn, "{" )
   for count = 1 to length( record ) do
      data = record[count]
      if sequence( data ) then
         -- wrap in {} so printf treats the string as a single %s value
         printf( fn, "\"%s\"", {data} )
      else
         printf( fn, "%d", data )
      end if
      if count != length( record ) then
         puts( fn, "," )
      end if
   end for
   puts( fn, "}" )
   if whatever != NumberOfRecords then
      puts( fn, "," )
   end if
end for
puts( fn, "}" )
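
     To check that a file written this way is still readable, a minimal
read-back sketch might look like the following.  The file name "data.txt" is
only an assumption for illustration; get() and GET_SUCCESS come from get.e
(std/get.e in newer Euphoria releases).

```euphoria
include get.e

integer fn
sequence result

fn = open( "data.txt", "r" )   -- "data.txt" is a placeholder name
if fn = -1 then
   puts( 1, "Could not open data.txt\n" )
else
   result = get( fn )          -- returns {status, value}
   close( fn )
   if result[1] = GET_SUCCESS and sequence( result[2] ) then
      -- result[2] is the outer sequence holding all the records
      printf( 1, "Read %d records\n", length( result[2] ) )
   else
      puts( 1, "File did not contain a valid sequence\n" )
   end if
end if
```

     Quoted strings come back from get() as ordinary sequences of character
codes, so the data is the same as what print() would have stored.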

     That's just spur-of-the-moment typing; it could probably be made
smaller and/or turned into a general-purpose recursive function.  Of course,
if you can't be sure that all your data is either strings or numbers, then
you'd also have to check for sequences inside of sequences and recurse down
properly.  You could also use some sort of compression on the strings
themselves before you save them, if they're big.
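
     As a rough sketch of the general-purpose recursive version, something
like the code below could work.  The names is_string and write_compact are
made up for this example.  It assumes that any flat sequence of printable
character codes may be written as a quoted string (get() returns the same
sequence either way), and it falls back on print() for non-integer atoms, so
those keep print()'s precision.

```euphoria
-- Hypothetical helper: returns 1 if s is a flat sequence of printable
-- characters with no quotes or backslashes, so it is safe to write in quotes.
function is_string( sequence s )
   for i = 1 to length( s ) do
      if not integer( s[i] ) then
         return 0
      end if
      if s[i] < 32 or s[i] > 126 or s[i] = '\"' or s[i] = '\\' then
         return 0
      end if
   end for
   return 1
end function

-- Hypothetical helper: writes x to file fn in a compact form get() can read.
procedure write_compact( integer fn, object x )
   if atom( x ) then
      if integer( x ) then
         printf( fn, "%d", x )
      else
         print( fn, x )            -- let print() format non-integer atoms
      end if
   elsif length( x ) > 0 and is_string( x ) then
      puts( fn, "\"" & x & "\"" )  -- a string: write it inside quotes
   else
      puts( fn, "{" )              -- anything else: recurse into the elements
      for i = 1 to length( x ) do
         write_compact( fn, x[i] )
         if i != length( x ) then
            puts( fn, "," )
         end if
      end for
      puts( fn, "}" )
   end if
end procedure

integer fn
fn = open( "data.txt", "w" )       -- "data.txt" is a placeholder name
write_compact( fn, {"Now is the time", 123} )
close( fn )
```

     For the example sequence this writes {"Now is the time",123}, which is
23 bytes.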

     I still like the idea of an index at the end of the file better,
though, keeping only one record and the index in RAM.  You'd just have to
rewrite the file every x edits or so.  That seems simpler than keeping track
of idle time, I'd think, but that would work too.
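
     Here is one way the index-at-the-end layout could be sketched.  Nothing
in it is a settled format: the fixed-width footer, FOOTER_LEN, and the names
save_indexed, load_index, and read_record are all assumptions for
illustration.  Each record is written with print(), its byte offset is
remembered, and the index plus a footer giving the index's own offset go
last.  seek() and where() come from file.e (std/io.e in newer releases).

```euphoria
include get.e
include file.e    -- seek() and where()

-- Assumed footer layout: a 10-digit offset followed by a newline.
constant FOOTER_LEN = 11

-- Hypothetical: writes every record, then the index, then the footer.
procedure save_indexed( sequence name, sequence records )
   integer fn
   sequence index
   atom index_pos
   fn = open( name, "wb" )   -- binary mode so offsets are exact byte counts
   if fn = -1 then
      return
   end if
   index = {}
   for i = 1 to length( records ) do
      index = append( index, where( fn ) )   -- offset of record i
      print( fn, records[i] )
      puts( fn, "\n" )
   end for
   index_pos = where( fn )
   print( fn, index )
   puts( fn, "\n" )
   printf( fn, "%010d\n", index_pos )        -- footer: where the index starts
   close( fn )
end procedure

-- Hypothetical: reads only the index; the records stay on disk.
-- fn should be opened with "rb".
function load_index( integer fn )
   atom size
   sequence r
   integer ok
   ok = seek( fn, -1 )                 -- -1 means seek to end of file
   size = where( fn )
   ok = seek( fn, size - FOOTER_LEN )
   r = get( fn )                       -- r[2] is the offset of the index
   ok = seek( fn, r[2] )
   r = get( fn )                       -- r[2] is the index itself
   return r[2]
end function

-- Hypothetical: fetches record n using the index from load_index().
function read_record( integer fn, sequence index, integer n )
   sequence r
   integer ok
   ok = seek( fn, index[n] )
   r = get( fn )
   return r[2]
end function
```

     This only covers saving and looking records up; edits would still mean
rewriting the file every so often, as described above, and error checking on
seek() and get() is left out to keep the sketch short.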
