Re: Garbage Collection
- Posted by "Igor Kachan" <kinz at peterlink.ru> Jun 29, 2005
- 712 views
Robert Craig wrote: > posted by: Robert Craig <rds at RapidEuphoria.com> > > As most of you know, Euphoria uses reference-counting to > reclaim unused blocks of storage. Over the years, I've > periodically looked at the latest algorithms for > garbage collection, and after careful analysis I've > always decided to stick with reference counting. > > If you search the Web for "garbage collection", > you'll find many different algorithms, such as > mark/sweep, mark/compact and generational garbage collection. > It's a fairly active and interesting area of research. > There are many approaches, and many trade-offs to consider. > > Most people say that reference counting is slower than true > garbage collection (GC) because of the time wasted incrementing > and decrementing reference counts during execution. They also point out > that "circular" data structures can't be garbage collected this way > because the reference counts will never drop to zero. > Euphoria carefully avoids creating circular structures of pointers > (e.g. A -> B -> C -> A) so this is not a problem. > It's acknowledged that reference counting generally > avoids the random, noticable delays that can occur with the other methods. > e.g. you wouldn't want your action game to freeze for a second > every minute or so. If it were a big non-interactive program, > you wouldn't notice the GC delay, and the total time might > be less than with reference counting (RC). > > Main Advantages to GC over RC in Euphoria: > - eliminates all DeRef's (faster program execution > and reduced size (by 15%?) of translated code. > Less chance for coding errors.) > - very fast allocation of blocks (with most methods) and deallocation > is a no-op > - eliminates overheads (such as ref count field) on each block of storage > - eliminates storage fragmentation (with most methods) that could > prevent allocation of a big block, even though many small > non-contiguous blocks are free > > Main Disadvantages: > - random pauses in execution (there are ways to control this but > not completely eliminate it. RC can also cause small pauses sometimes, > but that can be almost completely eliminated quite easily) > - for efficiency you must pre-reserve memory well in excess of what > you actually need > - some Ref's of a sequence would have to become full copies > (maybe you could "lock" sequences that are going to be modified > repeatedly) > - RC seems to have better cache performance, since it reuses > blocks that were used recently, while GC keeps moving further > into fresh areas of memory > - GC is known for introducing subtle bugs. The algorithms must > be implemented very carefully. > > If anyone has any ideas on this, please post them. > I'm still actively thinking about all these issues. Rob, for example, in BASICs, there is the FRE() function, it returns amount of the available memory. Sometimes, we can see the Euphoria message: "You program has run out of memory. One moment, please ..." I'd prefer to make something like to:
if eu_FRE() < 256000 then ok = collect_all_garbage_immediatelly() end if
than get that message above. This collect_all_garbage_immediatelly() function may be the most powerful of the existent ones as plus to automated fast collection, I think. Then all pauses will be under programmer's control. I'm not very specialist on these things, just my $.02. Good Luck! Regards, Igor Kachan kinz at peterlink.ru