1. garbage collector

From rapid looking on sources I had concluded that Eu use reference
counting garbage collector.

Commonly known that refcounting GC can not handle circular structures.

Circstructs widely used in Lisp, Scheme, Haskel and so on.

Use Eu circular structs in internal work, or developers avoid to use it?

One more:
  Refcounting GC does not compact reclaimed space.
  How developers avoid this restriction, or this restriction is not
significant.

new topic     » topic index » view message » categorize

2. Re: garbage collector

Deepfinder wrote:
> 
> From rapid looking on sources I had concluded that Eu use reference
> counting garbage collector.
> 
> Commonly known that refcounting GC can not handle circular structures.
> 
> Circstructs widely used in Lisp, Scheme, Haskel and so on.
> 
> Use Eu circular structs in internal work, or developers avoid to use it?
> 
> One more:
>   Refcounting GC does not compact reclaimed space.
>   How developers avoid this restriction, or this restriction is not
> significant.

Not entirely sure what you mean by circular structures.
Do you mean a structure that refers to itself?
Eg s[2] = s

If I understand correctly, yes, Eu handles those types of references internally,
using pointers for efficiency and copying the data when it needs to.

If you mean being able to change s and having the changes also happen in s[2]
then no, Eu does not handle that. Eu programmers avoid it.

Regarding your question about reclaimed space, I beleive that the actual data
space can be reclaimed, but the space used by the symbol itself cannot.
It's insignificant.

Chris Bensler
~ The difference between ordinary and extraordinary is that little extra ~
http://empire.iwireweb.com - Empire for Euphoria

new topic     » goto parent     » topic index » view message » categorize

3. Re: garbage collector

Deepfinder wrote:
> From rapid looking on sources I had concluded that Eu use reference
> counting garbage collector.
> 
> Commonly known that refcounting GC can not handle circular structures.
> 
> Circstructs widely used in Lisp, Scheme, Haskel and so on.
> 
> Use Eu circular structs in internal work, or developers avoid to use it?

Yes, you are right. Reference counting does not handle
circular structures. That's why Euphoria detects at compile time
the few cases where a circular reference might be created,
and it issues internal code that will make a copy rather than risk 
a circular chain of pointer references.

> One more:
>   Refcounting GC does not compact reclaimed space.
>   How developers avoid this restriction, or this restriction is not
>   significant.

Euphoria coders do not have to think about reference counting
or garbage collection issues. Euphoria takes care of all that.
People can think that a logical full copy is always made of all data.
Euphoria does not have pointers or call by reference in the language
definition. This makes programs easier to understand and debug,
though perhaps a bit less efficient in some cases.
*Internally*, Euphoria tries to point at data rather than copy it, 
and it does efficient recycling of any unused data space, 
but this is *not* something that the Euphoria programmer needs to 
think about. (Thank goodness!) 

As for compaction, this is a problem for a standard garbage 
collector, not a reference-counter. Note that Euphoria could be converted
to use the garbage collection techniques of other languages,
without breaking existing code. However I've looked at this carefully
over the years, and I keep coming to the conclusion that 
reference-counting is better for Euphoria, at least in the interpreter, 
probably in the translator too, and I wouldn't want to use a 
different scheme for translated-to-C code.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

4. Re: garbage collector

> Yes, you are right. Reference counting does not handle
> circular structures. That's why Euphoria detects at compile time
> the few cases where a circular reference might be created,
> and it issues internal code that will make a copy rather than risk 
> a circular chain of pointer references.

Hm... Doublig of data is not good solution. Best solution is to make
improovements to GC tchnique...

> the few cases where a circular reference might be created...

Number of casses strongly depends of programming style... Functional
programming can advance quantity of cases...

> Euphoria coders do not have to think about reference counting
> or garbage collection issues. Euphoria takes care of all that.

Yes... the coders... But existing some peoples that want to enhace the
language. For example add OOP in the core of language. Add high order
functions to language, which strongly enhance expressive power of
language.

Eu can not care OOP in core of the language. Simple recursion force 
to use routine_id. Eu does not have high order functions, which are in
strong use in Lisp, Haskel, Pearl and so on.

> People can think that a logical full copy is always made of all data.
> Euphoria does not have pointers or call by reference in the language
> definition.

Yes Lisp, Scheme & so on does not have pointers or call by reference in
the language definition.

> *Internally*, Euphoria tries to point at data rather than copy it, 
> and it does efficient recycling of any unused data space, 
> but this is *not* something that the Euphoria programmer needs to 
> think about. (Thank goodness!) 

Yes, but what about programmers that want to enhance their language?
 

> As for compaction, this is a problem for a standard garbage 
> collector, not a reference-counter.

What you mean of standard GC?
Compaction is a weak place especially of refcounting GC.
Strong place is real time GC-ing.

Stop and copy GC is compacting by the nature of the algoritm.
But it is not for realtime systems.

> Note that Euphoria could be converted
> to use the garbage collection techniques of other languages,
> without breaking existing code. However I've looked at this carefully
> over the years, and I keep coming to the conclusion that 
> reference-counting is better for Euphoria, at least in the interpreter, 
> probably in the translator too, and I wouldn't want to use a 
> different scheme for translated-to-C code.

It is versions of refcounting GC's that do compacting.
For what we need compaction? It is vital for execution speed in vectors
(arrays).
It seems that Eu sequences is fragmented in memory and if it is true then
speed of array processing is not as maximum possible.
If we use the compaction GC, we can improove speed of array (sequences)
processing, because data is not, or little fragmented. Some versions of
Lisp, Scheme and others use lists linearizing when it is possiple,
and acces to element of vector (array) is fast and time independent.

It is not need to rebuild the Univerce, but it is need to change thinking
and make some improovemnts to GC technique to approach new speed horizonts.

And one more:
 You choose translation to C from IL. But why C? You can say: cause it is
like higher level assembler for crossplatforming. And C may feed high speed.

But it is not true. Interplatforming is care of IL but not of C. Better
choise is to translate not to C, but to asm of target platform. And speed
that we arrive will be maximal. Yes it needs some more knoweleges in asm
versus C to port to platform with different processor. But if system is
correctly planned and developed, work to porting will be minimal.

For example many Lisp systems are writen in Lisp itself, and minor amount
of code is writed in target asm. Porting to other platform need small
work to translate redundant core writen in asm.

Pls forgive me long post but it is very interesting to ask developer
himself :)

new topic     » goto parent     » topic index » view message » categorize

5. Re: garbage collector

Deepfinder wrote:
> > Yes, you are right. Reference counting does not handle
> > circular structures. That's why Euphoria detects at compile time
> > the few cases where a circular reference might be created,
> > and it issues internal code that will make a copy rather than risk 
> > a circular chain of pointer references.
> 
> Hm... Doublig of data is not good solution. Best solution is to make
> improovements to GC tchnique...
> 
> > the few cases where a circular reference might be created...
> 
> Number of casses strongly depends of programming style... Functional
> programming can advance quantity of cases...

In practice, there aren't that many cases. for example:

    s[i] = s
or
    s[i][j] = s

or indirectly,
    t = s
    s[i] = t

etc.

These are pretty unusual statements to write, and the
person would probably want a copy of s to be made, rather than
creating an "infinite loop" of pointer references.

> > Euphoria coders do not have to think about reference counting
> > or garbage collection issues. Euphoria takes care of all that.
> 
> Yes... the coders... But existing some peoples that want to enhace the
> language. For example add OOP in the core of language. Add high order
> functions to language, which strongly enhance expressive power of
> language.
> 
> Eu can not care OOP in core of the language. Simple recursion force 
> to use routine_id. Eu does not have high order functions, which are in
> strong use in Lisp, Haskel, Pearl and so on.
> 
> > People can think that a logical full copy is always made of all data.
> > Euphoria does not have pointers or call by reference in the language
> > definition.
> 
> Yes Lisp, Scheme & so on does not have pointers or call by reference in
> the language definition.
> 
> > *Internally*, Euphoria tries to point at data rather than copy it, 
> > and it does efficient recycling of any unused data space, 
> > but this is *not* something that the Euphoria programmer needs to 
> > think about. (Thank goodness!) 
> 
> Yes, but what about programmers that want to enhance their language?

Sorry, I have to do the best implementation for Euphoria as it is, 
not as it might be in some vague distant future.

> > As for compaction, this is a problem for a standard garbage 
> > collector, not a reference-counter.
> 
> What you mean of standard GC?

I often talk about "garbage collection" in Euphoria,
but some articles do not consider reference counting
to be garbage collection. So by "standard GC", I 
mean true garbage collectors that actually stop the world 
and clean up the mess when they run out of storage or
hit some predefined limit. i.e. they trace out all the 
currently live data and reorganize it, or compact it.

> Compaction is a weak place especially of refcounting GC.
> Strong place is real time GC-ing.
> 
> Stop and copy GC is compacting by the nature of the algoritm.
> But it is not for realtime systems.
> 
> > Note that Euphoria could be converted
> > to use the garbage collection techniques of other languages,
> > without breaking existing code. However I've looked at this carefully
> > over the years, and I keep coming to the conclusion that 
> > reference-counting is better for Euphoria, at least in the interpreter, 
> > probably in the translator too, and I wouldn't want to use a 
> > different scheme for translated-to-C code.
> 
> It is versions of refcounting GC's that do compacting.
> For what we need compaction? It is vital for execution speed in vectors
> (arrays).
> It seems that Eu sequences is fragmented in memory and if it is true then
> speed of array processing is not as maximum possible.

Sequences are always contiguous in memory, never fragmented,
though if an element of a sequence is not an integer, it will
reside somewhere else in memory.

> If we use the compaction GC, we can improove speed of array (sequences)
> processing, because data is not, or little fragmented. Some versions of
> Lisp, Scheme and others use lists linearizing when it is possiple,
> and acces to element of vector (array) is fast and time independent.
> 
> It is not need to rebuild the Univerce, but it is need to change thinking
> and make some improovemnts to GC technique to approach new speed horizonts.

Over the years, on several occasions, I've done a 
lengthy analysis of GC techniques in the hope of 
improving on the current reference-counting method.
I must say, it's very tempting. The current system
can cause a lot of wasted time and space. Unfortunately
the other GC techniques also waste a lot of time and space.
The two approaches are very different, and there are lots of trade-offs 
to consider. It's a fascinating topic. Perhaps I should write-up 
my findings sometime. At the risk of getting
sucked into it yet again, let me very briefly list a few
pros and cons of true (standard) GC for Euphoria,
off the top of my head. There are many
variations of GC that partially address a few of these issues,
while causing other issues.

     Pros:
         - don't waste time incrementing and decrementing
           reference counts
         - don't waste space storing reference count fields
         - storage allocation can be very fast, perhaps as
           little as incrementing a pointer
         - translator will not output code for decrementing
           (but will likely output code, in place of ref-count-increment,
            for possibly making a copy of a sequence)
     Cons:
         - for efficiency, a huge chunk of memory must be 
           pre-allocated, and at all times maybe half of that must be
           on standby, free of live data (i.e. wasted)
         - a complicated, notoriously error-prone, collector 
           must be written, and the world must stop at random times 
           while storage is reclaimed. The stoppages are often noticeable
           to the user, and may be unacceptable for real-time programming.
           The time spent on GC will likely be less than the time
           spent on ref-counting, but the Translator optimizes out 
           much of the potential reference counting, so it's probably
           not too bad.
         - not very friendly to the cache or storage heirarchy.
           You don't re-use the same memory immediately, rather
           you often touch memory that is not in cache
           and may even be swapped out, both in allocation and
           during GC. With reference counting, blocks of memory 
           are re-used almost immediately, often within the
           same for-loop or while-loop, while those addresses
           are still in cache.
         - since you don't keep track of reference counts, you don't
           know how many pointers there are to a piece of storage,
           so to avoid making a copy of a sequence on every assignment
           to an element, you must ensure that there is only
           one reference to all sequences. This causes extra copying
           of sequences and eliminates chances for sharing data (thereby 
           saving memory that way).

This is a very complex subject. 
Feel free to investigate the Euphoria source yourself
and see if you can find a better overall way to do things.
It's not obvious that reference-counting is better,
that's why I keep getting sucked into this from time to
time, hoping to get some of the pros, while reducing the cons.

> And one more:
>  You choose translation to C from IL. But why C? You can say: cause it is
> like higher level assembler for crossplatforming. And C may feed high speed.
> 
> But it is not true. Interplatforming is care of IL but not of C. Better
> choise is to translate not to C, but to asm of target platform. And speed
> that we arrive will be maximal. Yes it needs some more knoweleges in asm
> versus C to port to platform with different processor. But if system is
> correctly planned and developed, work to porting will be minimal.
> 
> For example many Lisp systems are writen in Lisp itself, and minor amount
> of code is writed in target asm. Porting to other platform need small
> work to translate redundant core writen in asm.

Generating C is much easier than generating asm, and is
much easier to debug, maintain and port. The interpreter is written
in C, so it's easy to take an IL op and use essentially the
same C code (optimized somewhat) in the translated code.
Why re-invent the wheel of asm optimization that is already 
built into any decent C compiler? It would be a major
project to build and maintain an asm back-end with optimization,
and I doubt it would buy you more than a few percent in speed.
The main benefit would be that Windows users would not have
to install a C compiler. (Linux/FreeBSD users already 
have GCC on their system.)

I should also someday write a design document 
for the Translator...

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

6. Re: garbage collector

Hi, Rob.
I am interested in garbage collection techniques.
Can you point me to a good text on the subjet, preferably available
 in the Web?
Regards.

new topic     » goto parent     » topic index » view message » categorize

7. Re: garbage collector

Ricardo M. Forno wrote:
> I am interested in garbage collection techniques.
> Can you point me to a good text on the subjet, preferably available
>  in the Web?

It's a fascinating topic, mainly because there is no
"one size fits all" solution to the problem, and there
are many trade-offs to consider.

Here's a good starting point:
   http://www.cs.kent.ac.uk/people/staff/rej/gc.html

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

8. Re: garbage collector

>This is a very complex subject. 
>Feel free to investigate the Euphoria source yourself
>and see if you can find a better overall way to do things.
>It's not obvious that reference-counting is better,
>that's why I keep getting sucked into this from time to
>time, hoping to get some of the pros, while reducing the cons.

Good... And I must use Watcom, is not it?

> should also someday write a design document for the Translator...

Ah! It someday will be much better to understand the job that you have done...

new topic     » goto parent     » topic index » view message » categorize

9. Re: garbage collector

Deepfinder wrote:
> >This is a very complex subject. 
> >Feel free to investigate the Euphoria source yourself
> >and see if you can find a better overall way to do things.
> >It's not obvious that reference-counting is better,
> >that's why I keep getting sucked into this from time to
> >time, hoping to get some of the pros, while reducing the cons.
> 
> Good... And I must use Watcom, is not it?

I use Watcom to build the official interpreter for
both DOS and Windows, but you can build the Euphoria 
interpreter using any of the C compilers supported 
by the Translator. All I really depend on for storage
allocation is the existence of a C malloc() or 
equivalent routine. 

(Trivia fact: I actually use Borland to
build the translator ecw.exe, just because it compiles
faster than Watcom, and also as a kind of regression self-test.)

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

10. Re: garbage collector

Robert Craig wrote:
> 
> Deepfinder wrote:
> > >This is a very complex subject. 
> > >Feel free to investigate the Euphoria source yourself
> > >and see if you can find a better overall way to do things.
> > >It's not obvious that reference-counting is better,
> > >that's why I keep getting sucked into this from time to
> > >time, hoping to get some of the pros, while reducing the cons.
> > 
> > Good... And I must use Watcom, is not it?
> 
> I use Watcom to build the official interpreter for
> both DOS and Windows, but you can build the Euphoria 
> interpreter using any of the C compilers supported 
> by the Translator. All I really depend on for storage
> allocation is the existence of a C malloc() or 
> equivalent routine. 
> 
> (Trivia fact: I actually use Borland to
> build the translator ecw.exe, just because it compiles
> faster than Watcom, and also as a kind of regression self-test.)
> 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    <a href="http://www.RapidEuphoria.com">http://www.RapidEuphoria.com</a>

Hi Rob.
Please, tell me what is a regression self-test.
Regards.

new topic     » goto parent     » topic index » view message » categorize

11. Re: garbage collector

Ricardo M. Forno wrote:
> Robert Craig wrote:
> > I use Watcom to build the official interpreter for
> > both DOS and Windows, but you can build the Euphoria 
> > interpreter using any of the C compilers supported 
> > by the Translator. All I really depend on for storage
> > allocation is the existence of a C malloc() or 
> > equivalent routine. 
> > 
> > (Trivia fact: I actually use Borland to
> > build the translator ecw.exe, just because it compiles
> > faster than Watcom, and also as a kind of regression self-test.)
> ...
> Hi Rob.
> Please, tell me what is a regression self-test.

In general, "regression" refers to a program failing
on test cases that it used to succeed on. "Self-test"
refers to a program that tests itself somehow.
euphoria\demo\sanity.ex is a good example of a program
that tests for regression by using lots of little
"self-tests" to check for failures.

When I run build.bat on Windows, the translator is
used to build a new interpreter and new translator.
The interpreter is built using the translator with
Watcom, while the translator itself is built using
the translator with Borland. So if I accidentally 
introduce a Borland-specific regression error in the translator,
I will probably notice it very soon because I will have
broken the ecw.exe translator executable itself.
I probably won't have to wait for a Borland user to 
report the regression bug.

(This incestuous build process is a bit hard to
fathom sometimes!)

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu