1. Berkeley DB with Euphoria

Hello,

I would really like to be able to make a Euphoria wrapper for the 
Berkeley DB, which is open-source and found at:

http://www.sleepycat.com/

It looks like a sort of high-end version of EDS, in that the data is 
stored in key/value pairs and the key and value can be anything.  A 
Euphoria wrapper could probably be written to expose functions similar 
to those in EDS, making for an easy transition.  (My project is a little 
too large for EDS, but I would like the same flexibility.)
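
For comparison, this is roughly what the key/value model looks like in EDS itself. It is only a minimal sketch using the standard database.e routines; the file name "demo", the table name "items" and the sample records are made up for illustration.

```euphoria
include database.e

integer rn

-- "demo" and "items" are hypothetical names chosen for this example
if db_create("demo", DB_LOCK_NO) != DB_OK then
    puts(1, "could not create the database (does it already exist?)\n")
    abort(1)
end if
if db_create_table("items") != DB_OK then
    puts(1, "could not create the table\n")
    abort(1)
end if

-- keys and values can be any Euphoria object:
-- atoms, strings, or nested sequences
if db_insert("apple", {1.25, "fruit"}) != DB_OK then
    puts(1, "first insert failed\n")
end if
if db_insert({2002, 11, 5}, "a sequence used as a key") != DB_OK then
    puts(1, "second insert failed\n")
end if

-- look a record up by key; a positive result is the record number
rn = db_find_key("apple")
if rn > 0 then
    ? db_record_data(rn)
end if

db_close()
```

A Berkeley DB wrapper that kept this shape (create or open, insert by key, find by key, fetch the data) would presumably let existing EDS code move over with few changes.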

The problem is that the source (for Win32 anyway; it is cross-platform) 
is set up for Visual C++ 6.0, which I do not have, and I certainly don't 
have the knowledge or ability to attempt to compile it with something 
else like Borland.  I was able to find some pre-compiled binaries for 
some older versions on the net, but the latest version (4.1.x) has 
changed the file format somewhat, so I'd like to use that.

By any chance have any of you used/compiled this?

Barring that unlikely event, are there any *extremely* generous souls 
out there with VC++ 6.0 who would like to build it?  (In which case 
I'll make a wrapper, and then upload that and the binaries to RDS so 
anyone can use them.)

-- Andy


2. Re: Berkeley DB with Euphoria

Andy Serpa writes:
> Well, 1.2 GB is too big.  It is not the size so much 
> as the speed as EDS gets slower and slower as it grows...

I recently sped up my copy of EDS.
It reads records twice as fast as before, and it
inserts into and deletes from huge (100,000 record)
databases 3 times as fast. Even the released
EDS can insert/delete many times per second
on huge databases on a slow machine. 
If you acquire new records via a human entering 
data into a GUI, you'll never notice the time.

But if you are starting at 1.2 GB and growing 0.1 GB/month
I'd be a bit worried about the 4 GB limit. Of course
you could consider creating several separate databases
of 4 GB or less each.
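
One way the "several separate databases" idea could be arranged is to pick the file from the key. The following is only a sketch under stated assumptions: the file names, the table name "records", and the routines shard_for(), open_shards() and shard_insert() are all hypothetical, keys are assumed to be flat strings, and each file is assumed to exist already and to contain the table.

```euphoria
include database.e

-- Hypothetical file names; each file would be kept under the 4 GB limit.
constant SHARDS = {"part1.edb", "part2.edb", "part3.edb"}
constant TABLE = "records"   -- hypothetical table name

-- Choose a file from the key by summing its characters.
-- Assumes the key is a flat string short enough that the sum
-- still fits in a Euphoria integer.
function shard_for(sequence key)
    integer total
    total = 0
    for i = 1 to length(key) do
        total = total + key[i]
    end for
    return SHARDS[remainder(total, length(SHARDS)) + 1]
end function

-- Open every file once at startup.
procedure open_shards()
    for i = 1 to length(SHARDS) do
        if db_open(SHARDS[i], DB_LOCK_NO) != DB_OK then
            puts(1, "could not open " & SHARDS[i] & "\n")
            abort(1)
        end if
    end for
end procedure

-- Insert a record into whichever file its key maps to.
-- Returns DB_OK, or an error code if any step fails.
function shard_insert(sequence key, object data)
    if db_select(shard_for(key)) != DB_OK then
        return DB_OPEN_FAIL
    end if
    if db_select_table(TABLE) != DB_OK then
        return DB_OPEN_FAIL
    end if
    return db_insert(key, data)
end function
```

Lookups would go through shard_for() the same way, so a given key always lands in the same file. The catch is that changing the number of files later means moving existing records.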

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


3. Re: Berkeley DB with Euphoria

Matthew Lewis writes:
> Ooh, ooh!  When can we see it?!

I've sent you a copy.
I'll release it as part of version 2.4.
I want to test it some more before
I put a lot of people at risk of losing their data.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


4. Re: Berkeley DB with Euphoria

Rudy Toews writes:
> ...could something similar be a feature of EDS? having the 
> different open databases /selected tables use their own 
> 'buffers' without explicit 'selecting' them?
> i am currently building 1 database and many tables 
> but would like to break it up somewhat. and still move data 
> between them easier.

At the very outset, in designing EDS, I had a choice
of:
    1. have the user specify the database & table 
       on each operation, 
or:
    2. have a "context" where the database and
       table were established beforehand, and all database
       operations would have shorter parameter lists.

I went with #2. I don't regret it, but there are situations
where it is inconvenient. 
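
To make the trade-off concrete, here is a minimal sketch of how choice #1 could be layered on top of the existing #2 routines. The names insert_into() and fetch_from() are hypothetical and not part of database.e, and the sketch assumes each database was already opened with db_open() using the same path string that is passed in here.

```euphoria
include database.e

-- Choice #1 built from choice #2: every call names its database and table.
-- Returns DB_OK, or an error code if the database or table cannot be selected.
function insert_into(sequence db_path, sequence table_name,
                     object key, object data)
    if db_select(db_path) != DB_OK then
        return DB_OPEN_FAIL
    end if
    if db_select_table(table_name) != DB_OK then
        return DB_OPEN_FAIL
    end if
    return db_insert(key, data)
end function

-- Returns {data} if the key is found, or {} if it is not
-- (or if the database or table cannot be selected).
function fetch_from(sequence db_path, sequence table_name, object key)
    integer rn
    if db_select(db_path) != DB_OK then
        return {}
    end if
    if db_select_table(table_name) != DB_OK then
        return {}
    end if
    rn = db_find_key(key)
    if rn > 0 then
        return {db_record_data(rn)}
    end if
    return {}
end function
```

The parameter lists are longer and the table is reselected on every call, which is the price of not relying on a current context.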

When you select a table, EDS reads in 4 bytes per record
in that table. This is inefficient when you have a lot of records
(over 1000 say) and you are trying to rapidly flip back and forth 
between tables. Obviously some buffering would help,
i.e. keep the record pointers in memory. Maybe I'll do something
about this. Note that it's dangerous to modify part of the database
structure in memory only, in case your program crashes 
and the database on disk is left in an inconsistent (corrupted) state.
Simply keeping a copy of the record pointers in memory 
should be safe as long as other processes are locked out from
making changes. There's also the issue of how much memory
this might require. If you have lots of tables with lots of records
you might run out of memory. 

If you look at db_compress() you'll see that it copies records
between tables 20 at a time. That's another approach 
to the problem.
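
The chunked approach could be sketched as follows. This is not the actual db_compress() code, just an illustration of the idea; copy_table() and CHUNK are hypothetical names, both tables are assumed to live in the currently selected database, and the destination table is assumed to exist already.

```euphoria
include database.e

constant CHUNK = 20   -- records read per switch between tables

-- Copy every record from from_table to to_table, CHUNK records at a time,
-- so that db_select_table() is called twice per chunk, not twice per record.
-- Returns DB_OK, or the first error code encountered.
function copy_table(sequence from_table, sequence to_table)
    integer n, first, last, status
    sequence keys, data

    status = db_select_table(from_table)
    if status != DB_OK then
        return status
    end if
    n = db_table_size()
    first = 1
    while first <= n do
        last = first + CHUNK - 1
        if last > n then
            last = n
        end if

        -- read one chunk from the source table into memory
        keys = {}
        data = {}
        for rn = first to last do
            keys = append(keys, db_record_key(rn))
            data = append(data, db_record_data(rn))
        end for

        -- write the chunk to the destination table
        status = db_select_table(to_table)
        if status != DB_OK then
            return status
        end if
        for i = 1 to length(keys) do
            status = db_insert(keys[i], data[i])
            if status != DB_OK then
                return status
            end if
        end for

        -- switch back to the source table for the next chunk
        status = db_select_table(from_table)
        if status != DB_OK then
            return status
        end if
        first = last + 1
    end while
    return DB_OK
end function
```

A larger CHUNK means fewer table switches but more records held in memory at once.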

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


5. Re: Berkeley DB with Euphoria

Irv Mullins writes:

> include database.e as dbA
> include database.f as dbB
> ...
> Can you depend on this to always work properly? 
> I don't know, maybe Rob can answer.

Yes it will work.
You'll have two independent sets of database variables.

You can also switch back and forth between databases
with db_select(). It takes almost zero time. It's db_select_table()
that can be slow if you have a large table. Yesterday
I sped up db_select_table() quite a bit, with no buffering 
but faster code.
 
I've now added a lot of speedups to EDS.
Some were distributed to a few people recently.
I did some more very recently. 
I used profile_time in deciding what to speed up.

Rather than have a bunch of people ask for the latest
version, I've decided to place it in Recent User contributions.
Just keep in mind that it hasn't been tested as much as
the official 2.3 released version. 

The .zip also contains a new misc.e with pretty_print().
pretty_print() is now used by db_dump().

-----------------------------------------

database.e speedups:

   - Keys and records are read faster due to a faster decompress() routine.
     Almost twice as fast when the key or record data to be retrieved
     consists mainly of sequences of characters or small integers.
     This case is quite common.

   - Allocating new space in a database is much faster, up to 4x faster,
     especially in large databases with a large list of free blocks.

   - Inserting and deleting records in huge tables is now much faster.
     database.e is now about 25% faster for a table with 10,000 records
     and over 3x faster for a table with 100,000 records.

   - db_select_table() is faster.

   - get4() is faster, which speeds up almost everything
     to some extent.


Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


6. Re: Berkeley DB with Euphoria

Or use my namespace parser, which will convert a file included under
2 namespaces into 2 instances. (Personally I'd like to see the features
of my namespace parser remade in the interpreter itself, however Rob
indicated that having too many features might confuse new users.)

jbrown

Andy Serpa <renegade at earthling.net> wrote:
> 
> 
> irv at take.maxleft.com wrote:
> > 
> > Andy Serpa wrote:
> > 
> > > Yeah, exactly, so in my example of including database.e twice as "dbA" 
> > > and "dbB" you could:
> > > 
> > > dbA:db_open("database A")
> > > dbB:db_open("database B")
> > > 
> > > Now BOTH databases will be open and current so instead of using 
> > > db_select() and switching back & forth you just use the appropriate 
> > > "dbA:" or "dbB:" in front of the proper one.  Sort of a quick & dirty 
> > > way to get separate "instances" in an OOP-like fashion...
> > > 
> > > (Unless there is some reason that won't work?  Like I said, I haven't 
> > > actually tried it.)
> > 
> > It won't work, because Rob's implementation of namespacing doesn't 
> > create multiple instances of an include, it just makes "aliases" for 
> > one single instance. I, for one, think this is unfortunate.
> > 
> > Therefore, no matter what you prefix the db_* commands with, they 
> > still use the single copy of database.e with its variables (including 
> > the currently selected database file).
> > 
> 
> Ok, what if we just make two copies of database.e and then:
> 
> include databaseA.e as dbA
> include databaseB.e as dbB
> 
> Now would they be separate?
> 

