1. Random.e - Euphoria Random Access Library

Today I finished a random access library and sent it to RDS.  This should
add another aspect to Euphoria programming.  Unlike the mydata.ex demo, this
does not store every record in one big sequence.  Instead, each record is
stored individually and marked by a leading '!' character.  I originally
thought having a '!' inside a record would split it in half, but it turns out
it does not.  Basically records are stored like this:

!(record)!(record)!(record)!(record)....

  To read data, the program first makes a map of the records.  The overhead
for making the map is very small.  I found that a database with 50,000 records
(each with three fields) takes about a third of a second to map.  Once it
has the map it will seek to the desired record and get your data.  You can
also overwrite data.  replace() is used when the records are all the same
size.  When the records have different lengths, over_write() is used.  This is
pretty slow because it first has to save the records after the one you're
writing to into a temporary variable, write your record, then rewrite the
records after that.  However, replace() will probably be used more since most
databases have records of the same length.
  I'm going to experiment with new ways to make the library faster, especially
the over_write() procedure.
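
  To make the mapping step concrete, here is a rough sketch in Python (not
Euphoria, and not the actual Random.e code).  The names build_map() and
read_record() are made up for illustration, and the parsing rule is an
assumption: after each '!' marker the record is read as one brace-balanced
{...} group, and scanning resumes after its closing brace.  Under that
assumption a '!' that falls inside a record is never looked at as a marker,
and loose bytes between records are simply skipped.

```python
def build_map(data):
    """Return a list of (start, length) pairs, one per record.

    Assumption: every record is a brace-balanced {...} group that
    immediately follows a '!' marker.
    """
    record_map = []
    pos = 0
    while True:
        mark = data.find("!", pos)
        if mark == -1:
            break                      # no more markers
        start = mark + 1
        if start >= len(data) or data[start] != "{":
            pos = start                # stray '!' with no record after it
            continue
        # Walk forward until the braces opened at `start` are all closed.
        depth = 0
        end = start
        while end < len(data):
            if data[end] == "{":
                depth += 1
            elif data[end] == "}":
                depth -= 1
                if depth == 0:
                    break
            end += 1
        if depth != 0:
            break                      # truncated record at end of data
        record_map.append((start, end - start + 1))
        pos = end + 1                  # resume scanning after the record
    return record_map


def read_record(data, record_map, n):
    """Return record number n (1-based) as text, using the map."""
    start, length = record_map[n - 1]
    return data[start:start + length]
```

  With a map like this, fetching record n is just one lookup plus one slice
(or one seek and one read on a real file), which is why the map only has to
be built once.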

Features include:
*  Multi-level (nested) sequences are supported
*  Records of different lengths
*  Stability
*  Easy to use

So what's the catch?
  The only catch is that when you over_write() a record with one that is
smaller, it will leave garbage at the end of the file.  However, this garbage
does not interfere with your data.  It is not mapped as a record, so it is
skipped.  The only effect is that it consumes, say, 18 bytes of wasted space
in your datafile.  I don't see this as a problem at all.  Once you have
written the majority of your data to disk, only one collection of garbage will
exist.  However, if you then append a record, that garbage gets stuck between
records, and another pile could form at the new end.  But again, this is not
really a problem.  I figure if someone had a database with 1,000 entries and
overwrote 50 records, say half with smaller and half with larger records than
the originals, odds are that there may be no garbage at all.  And if there
was, it would probably be less than one hundredth of a percent of total file
size.  Basically here's what happens though:

!{4,5,6,7,8}!{3,4,5,7,1}!{5,7,3,1,1}   --DB w/ 3 entries

                {4,6,3}                     --smaller record overwrites #2

!{4,5,6,7,8}!{4,6,3}!{5,7,3,1,1}1,1}  --1,1} is garbage from old last record

---Record {4,4,4,4,4} is appended to the end:

!{4,5,6,7,8}!{4,6,3}!{5,7,3,1,1}1,1}!{4,4,4,4,4}

  If you ask for record 3, it will return {5,7,3,1,1}, not {5,7,3,1,1}1,1}.  So
this will not corrupt data.  The garbage pile will of course go away if a
record before it is later overwritten with one that is 4 or more characters
longer than the original.
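
  The example above can be reproduced with a small Python sketch (again an
illustration, not the real Random.e source).  The function names mirror
replace() and over_write() but the signatures, the in-memory file, and the
explicit (start, length) arguments are all assumptions.  The sketch saves the
raw bytes after the old record, writes the new record, rewrites the saved
bytes, and never truncates the file, which is where the leftover bytes come
from.

```python
import io


def replace_same_size(f, start, length, new_record):
    """Overwrite a record in place; only valid when the sizes match."""
    if len(new_record) != length:
        raise ValueError("replace needs a record of the same length")
    f.seek(start)
    f.write(new_record)


def over_write(f, start, length, new_record):
    """Overwrite a record with one of a different length.

    Assumption: everything after the old record is saved as raw bytes
    and rewritten right after the new record.  The file is not
    truncated, so a shorter record leaves old bytes at the end.
    """
    f.seek(start + length)
    tail = f.read()          # save the records after the one being replaced
    f.seek(start)
    f.write(new_record)      # write the new record
    f.write(tail)            # rewrite the records that followed it


def append_record(f, record):
    """Add a record, with its '!' marker, at the end of the file."""
    f.seek(0, io.SEEK_END)
    f.write(b"!" + record)


# The three-record database from the example; record 2 starts at
# offset 13 and is 11 characters long.
db = io.BytesIO(b"!{4,5,6,7,8}!{3,4,5,7,1}!{5,7,3,1,1}")
over_write(db, 13, 11, b"{4,6,3}")
print(db.getvalue().decode())   # !{4,5,6,7,8}!{4,6,3}!{5,7,3,1,1}1,1}
append_record(db, b"{4,4,4,4,4}")
print(db.getvalue().decode())   # !{4,5,6,7,8}!{4,6,3}!{5,7,3,1,1}1,1}!{4,4,4,4,4}
```

  The slow part is the tail copy in over_write(): its cost grows with the
amount of data after the record, while replace_same_size() touches only the
record itself.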

Derek Brown
