1. garbage collector
- Posted by Deepfinder <deepfinder at yandex.ru> Dec 14, 2006
- 560 views
From rapid looking on sources I had concluded that Eu use reference counting garbage collector. Commonly known that refcounting GC can not handle circular structures. Circstructs widely used in Lisp, Scheme, Haskel and so on. Use Eu circular structs in internal work, or developers avoid to use it? One more: Refcounting GC does not compact reclaimed space. How developers avoid this restriction, or this restriction is not significant.
2. Re: garbage collector
- Posted by Chris Bensler <bensler at nt.net> Dec 14, 2006
- 543 views
Deepfinder wrote: > > From rapid looking on sources I had concluded that Eu use reference > counting garbage collector. > > Commonly known that refcounting GC can not handle circular structures. > > Circstructs widely used in Lisp, Scheme, Haskel and so on. > > Use Eu circular structs in internal work, or developers avoid to use it? > > One more: > Refcounting GC does not compact reclaimed space. > How developers avoid this restriction, or this restriction is not > significant. Not entirely sure what you mean by circular structures. Do you mean a structure that refers to itself? Eg s[2] = s If I understand correctly, yes, Eu handles those types of references internally, using pointers for efficiency and copying the data when it needs to. If you mean being able to change s and having the changes also happen in s[2] then no, Eu does not handle that. Eu programmers avoid it. Regarding your question about reclaimed space, I beleive that the actual data space can be reclaimed, but the space used by the symbol itself cannot. It's insignificant. Chris Bensler ~ The difference between ordinary and extraordinary is that little extra ~ http://empire.iwireweb.com - Empire for Euphoria
3. Re: garbage collector
- Posted by Robert Craig <rds at RapidEuphoria.com> Dec 14, 2006
- 550 views
Deepfinder wrote: > From rapid looking on sources I had concluded that Eu use reference > counting garbage collector. > > Commonly known that refcounting GC can not handle circular structures. > > Circstructs widely used in Lisp, Scheme, Haskel and so on. > > Use Eu circular structs in internal work, or developers avoid to use it? Yes, you are right. Reference counting does not handle circular structures. That's why Euphoria detects at compile time the few cases where a circular reference might be created, and it issues internal code that will make a copy rather than risk a circular chain of pointer references. > One more: > Refcounting GC does not compact reclaimed space. > How developers avoid this restriction, or this restriction is not > significant. Euphoria coders do not have to think about reference counting or garbage collection issues. Euphoria takes care of all that. People can think that a logical full copy is always made of all data. Euphoria does not have pointers or call by reference in the language definition. This makes programs easier to understand and debug, though perhaps a bit less efficient in some cases. *Internally*, Euphoria tries to point at data rather than copy it, and it does efficient recycling of any unused data space, but this is *not* something that the Euphoria programmer needs to think about. (Thank goodness!) As for compaction, this is a problem for a standard garbage collector, not a reference-counter. Note that Euphoria could be converted to use the garbage collection techniques of other languages, without breaking existing code. However I've looked at this carefully over the years, and I keep coming to the conclusion that reference-counting is better for Euphoria, at least in the interpreter, probably in the translator too, and I wouldn't want to use a different scheme for translated-to-C code. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
4. Re: garbage collector
- Posted by Deepfinder <deepfinder at yandex.ru> Dec 15, 2006
- 533 views
> Yes, you are right. Reference counting does not handle > circular structures. That's why Euphoria detects at compile time > the few cases where a circular reference might be created, > and it issues internal code that will make a copy rather than risk > a circular chain of pointer references. Hm... Doublig of data is not good solution. Best solution is to make improovements to GC tchnique... > the few cases where a circular reference might be created... Number of casses strongly depends of programming style... Functional programming can advance quantity of cases... > Euphoria coders do not have to think about reference counting > or garbage collection issues. Euphoria takes care of all that. Yes... the coders... But existing some peoples that want to enhace the language. For example add OOP in the core of language. Add high order functions to language, which strongly enhance expressive power of language. Eu can not care OOP in core of the language. Simple recursion force to use routine_id. Eu does not have high order functions, which are in strong use in Lisp, Haskel, Pearl and so on. > People can think that a logical full copy is always made of all data. > Euphoria does not have pointers or call by reference in the language > definition. Yes Lisp, Scheme & so on does not have pointers or call by reference in the language definition. > *Internally*, Euphoria tries to point at data rather than copy it, > and it does efficient recycling of any unused data space, > but this is *not* something that the Euphoria programmer needs to > think about. (Thank goodness!) Yes, but what about programmers that want to enhance their language? > As for compaction, this is a problem for a standard garbage > collector, not a reference-counter. What you mean of standard GC? Compaction is a weak place especially of refcounting GC. Strong place is real time GC-ing. Stop and copy GC is compacting by the nature of the algoritm. But it is not for realtime systems. > Note that Euphoria could be converted > to use the garbage collection techniques of other languages, > without breaking existing code. However I've looked at this carefully > over the years, and I keep coming to the conclusion that > reference-counting is better for Euphoria, at least in the interpreter, > probably in the translator too, and I wouldn't want to use a > different scheme for translated-to-C code. It is versions of refcounting GC's that do compacting. For what we need compaction? It is vital for execution speed in vectors (arrays). It seems that Eu sequences is fragmented in memory and if it is true then speed of array processing is not as maximum possible. If we use the compaction GC, we can improove speed of array (sequences) processing, because data is not, or little fragmented. Some versions of Lisp, Scheme and others use lists linearizing when it is possiple, and acces to element of vector (array) is fast and time independent. It is not need to rebuild the Univerce, but it is need to change thinking and make some improovemnts to GC technique to approach new speed horizonts. And one more: You choose translation to C from IL. But why C? You can say: cause it is like higher level assembler for crossplatforming. And C may feed high speed. But it is not true. Interplatforming is care of IL but not of C. Better choise is to translate not to C, but to asm of target platform. And speed that we arrive will be maximal. Yes it needs some more knoweleges in asm versus C to port to platform with different processor. But if system is correctly planned and developed, work to porting will be minimal. For example many Lisp systems are writen in Lisp itself, and minor amount of code is writed in target asm. Porting to other platform need small work to translate redundant core writen in asm. Pls forgive me long post but it is very interesting to ask developer himself :)
5. Re: garbage collector
- Posted by Robert Craig <rds at RapidEuphoria.com> Dec 16, 2006
- 543 views
Deepfinder wrote: > > Yes, you are right. Reference counting does not handle > > circular structures. That's why Euphoria detects at compile time > > the few cases where a circular reference might be created, > > and it issues internal code that will make a copy rather than risk > > a circular chain of pointer references. > > Hm... Doublig of data is not good solution. Best solution is to make > improovements to GC tchnique... > > > the few cases where a circular reference might be created... > > Number of casses strongly depends of programming style... Functional > programming can advance quantity of cases... In practice, there aren't that many cases. for example: s[i] = s or s[i][j] = s or indirectly, t = s s[i] = t etc. These are pretty unusual statements to write, and the person would probably want a copy of s to be made, rather than creating an "infinite loop" of pointer references. > > Euphoria coders do not have to think about reference counting > > or garbage collection issues. Euphoria takes care of all that. > > Yes... the coders... But existing some peoples that want to enhace the > language. For example add OOP in the core of language. Add high order > functions to language, which strongly enhance expressive power of > language. > > Eu can not care OOP in core of the language. Simple recursion force > to use routine_id. Eu does not have high order functions, which are in > strong use in Lisp, Haskel, Pearl and so on. > > > People can think that a logical full copy is always made of all data. > > Euphoria does not have pointers or call by reference in the language > > definition. > > Yes Lisp, Scheme & so on does not have pointers or call by reference in > the language definition. > > > *Internally*, Euphoria tries to point at data rather than copy it, > > and it does efficient recycling of any unused data space, > > but this is *not* something that the Euphoria programmer needs to > > think about. (Thank goodness!) > > Yes, but what about programmers that want to enhance their language? Sorry, I have to do the best implementation for Euphoria as it is, not as it might be in some vague distant future. > > As for compaction, this is a problem for a standard garbage > > collector, not a reference-counter. > > What you mean of standard GC? I often talk about "garbage collection" in Euphoria, but some articles do not consider reference counting to be garbage collection. So by "standard GC", I mean true garbage collectors that actually stop the world and clean up the mess when they run out of storage or hit some predefined limit. i.e. they trace out all the currently live data and reorganize it, or compact it. > Compaction is a weak place especially of refcounting GC. > Strong place is real time GC-ing. > > Stop and copy GC is compacting by the nature of the algoritm. > But it is not for realtime systems. > > > Note that Euphoria could be converted > > to use the garbage collection techniques of other languages, > > without breaking existing code. However I've looked at this carefully > > over the years, and I keep coming to the conclusion that > > reference-counting is better for Euphoria, at least in the interpreter, > > probably in the translator too, and I wouldn't want to use a > > different scheme for translated-to-C code. > > It is versions of refcounting GC's that do compacting. > For what we need compaction? It is vital for execution speed in vectors > (arrays). > It seems that Eu sequences is fragmented in memory and if it is true then > speed of array processing is not as maximum possible. Sequences are always contiguous in memory, never fragmented, though if an element of a sequence is not an integer, it will reside somewhere else in memory. > If we use the compaction GC, we can improove speed of array (sequences) > processing, because data is not, or little fragmented. Some versions of > Lisp, Scheme and others use lists linearizing when it is possiple, > and acces to element of vector (array) is fast and time independent. > > It is not need to rebuild the Univerce, but it is need to change thinking > and make some improovemnts to GC technique to approach new speed horizonts. Over the years, on several occasions, I've done a lengthy analysis of GC techniques in the hope of improving on the current reference-counting method. I must say, it's very tempting. The current system can cause a lot of wasted time and space. Unfortunately the other GC techniques also waste a lot of time and space. The two approaches are very different, and there are lots of trade-offs to consider. It's a fascinating topic. Perhaps I should write-up my findings sometime. At the risk of getting sucked into it yet again, let me very briefly list a few pros and cons of true (standard) GC for Euphoria, off the top of my head. There are many variations of GC that partially address a few of these issues, while causing other issues. Pros: - don't waste time incrementing and decrementing reference counts - don't waste space storing reference count fields - storage allocation can be very fast, perhaps as little as incrementing a pointer - translator will not output code for decrementing (but will likely output code, in place of ref-count-increment, for possibly making a copy of a sequence) Cons: - for efficiency, a huge chunk of memory must be pre-allocated, and at all times maybe half of that must be on standby, free of live data (i.e. wasted) - a complicated, notoriously error-prone, collector must be written, and the world must stop at random times while storage is reclaimed. The stoppages are often noticeable to the user, and may be unacceptable for real-time programming. The time spent on GC will likely be less than the time spent on ref-counting, but the Translator optimizes out much of the potential reference counting, so it's probably not too bad. - not very friendly to the cache or storage heirarchy. You don't re-use the same memory immediately, rather you often touch memory that is not in cache and may even be swapped out, both in allocation and during GC. With reference counting, blocks of memory are re-used almost immediately, often within the same for-loop or while-loop, while those addresses are still in cache. - since you don't keep track of reference counts, you don't know how many pointers there are to a piece of storage, so to avoid making a copy of a sequence on every assignment to an element, you must ensure that there is only one reference to all sequences. This causes extra copying of sequences and eliminates chances for sharing data (thereby saving memory that way). This is a very complex subject. Feel free to investigate the Euphoria source yourself and see if you can find a better overall way to do things. It's not obvious that reference-counting is better, that's why I keep getting sucked into this from time to time, hoping to get some of the pros, while reducing the cons. > And one more: > You choose translation to C from IL. But why C? You can say: cause it is > like higher level assembler for crossplatforming. And C may feed high speed. > > But it is not true. Interplatforming is care of IL but not of C. Better > choise is to translate not to C, but to asm of target platform. And speed > that we arrive will be maximal. Yes it needs some more knoweleges in asm > versus C to port to platform with different processor. But if system is > correctly planned and developed, work to porting will be minimal. > > For example many Lisp systems are writen in Lisp itself, and minor amount > of code is writed in target asm. Porting to other platform need small > work to translate redundant core writen in asm. Generating C is much easier than generating asm, and is much easier to debug, maintain and port. The interpreter is written in C, so it's easy to take an IL op and use essentially the same C code (optimized somewhat) in the translated code. Why re-invent the wheel of asm optimization that is already built into any decent C compiler? It would be a major project to build and maintain an asm back-end with optimization, and I doubt it would buy you more than a few percent in speed. The main benefit would be that Windows users would not have to install a C compiler. (Linux/FreeBSD users already have GCC on their system.) I should also someday write a design document for the Translator... Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
6. Re: garbage collector
- Posted by Ricardo M. Forno <rmforno at tutopia.com> Dec 16, 2006
- 516 views
Hi, Rob. I am interested in garbage collection techniques. Can you point me to a good text on the subjet, preferably available in the Web? Regards.
7. Re: garbage collector
- Posted by Robert Craig <rds at RapidEuphoria.com> Dec 16, 2006
- 523 views
Ricardo M. Forno wrote: > I am interested in garbage collection techniques. > Can you point me to a good text on the subjet, preferably available > in the Web? It's a fascinating topic, mainly because there is no "one size fits all" solution to the problem, and there are many trade-offs to consider. Here's a good starting point: http://www.cs.kent.ac.uk/people/staff/rej/gc.html Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
8. Re: garbage collector
- Posted by Deepfinder <deepfinder at yandex.ru> Dec 20, 2006
- 550 views
>This is a very complex subject. >Feel free to investigate the Euphoria source yourself >and see if you can find a better overall way to do things. >It's not obvious that reference-counting is better, >that's why I keep getting sucked into this from time to >time, hoping to get some of the pros, while reducing the cons. Good... And I must use Watcom, is not it? > should also someday write a design document for the Translator... Ah! It someday will be much better to understand the job that you have done...
9. Re: garbage collector
- Posted by Robert Craig <rds at RapidEuphoria.com> Dec 20, 2006
- 526 views
Deepfinder wrote: > >This is a very complex subject. > >Feel free to investigate the Euphoria source yourself > >and see if you can find a better overall way to do things. > >It's not obvious that reference-counting is better, > >that's why I keep getting sucked into this from time to > >time, hoping to get some of the pros, while reducing the cons. > > Good... And I must use Watcom, is not it? I use Watcom to build the official interpreter for both DOS and Windows, but you can build the Euphoria interpreter using any of the C compilers supported by the Translator. All I really depend on for storage allocation is the existence of a C malloc() or equivalent routine. (Trivia fact: I actually use Borland to build the translator ecw.exe, just because it compiles faster than Watcom, and also as a kind of regression self-test.) Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
10. Re: garbage collector
- Posted by Ricardo M. Forno <rmforno at tutopia.com> Dec 21, 2006
- 543 views
Robert Craig wrote: > > Deepfinder wrote: > > >This is a very complex subject. > > >Feel free to investigate the Euphoria source yourself > > >and see if you can find a better overall way to do things. > > >It's not obvious that reference-counting is better, > > >that's why I keep getting sucked into this from time to > > >time, hoping to get some of the pros, while reducing the cons. > > > > Good... And I must use Watcom, is not it? > > I use Watcom to build the official interpreter for > both DOS and Windows, but you can build the Euphoria > interpreter using any of the C compilers supported > by the Translator. All I really depend on for storage > allocation is the existence of a C malloc() or > equivalent routine. > > (Trivia fact: I actually use Borland to > build the translator ecw.exe, just because it compiles > faster than Watcom, and also as a kind of regression self-test.) > > Regards, > Rob Craig > Rapid Deployment Software > <a href="http://www.RapidEuphoria.com">http://www.RapidEuphoria.com</a> Hi Rob. Please, tell me what is a regression self-test. Regards.
11. Re: garbage collector
- Posted by Robert Craig <rds at RapidEuphoria.com> Dec 21, 2006
- 546 views
Ricardo M. Forno wrote: > Robert Craig wrote: > > I use Watcom to build the official interpreter for > > both DOS and Windows, but you can build the Euphoria > > interpreter using any of the C compilers supported > > by the Translator. All I really depend on for storage > > allocation is the existence of a C malloc() or > > equivalent routine. > > > > (Trivia fact: I actually use Borland to > > build the translator ecw.exe, just because it compiles > > faster than Watcom, and also as a kind of regression self-test.) > ... > Hi Rob. > Please, tell me what is a regression self-test. In general, "regression" refers to a program failing on test cases that it used to succeed on. "Self-test" refers to a program that tests itself somehow. euphoria\demo\sanity.ex is a good example of a program that tests for regression by using lots of little "self-tests" to check for failures. When I run build.bat on Windows, the translator is used to build a new interpreter and new translator. The interpreter is built using the translator with Watcom, while the translator itself is built using the translator with Borland. So if I accidentally introduce a Borland-specific regression error in the translator, I will probably notice it very soon because I will have broken the ecw.exe translator executable itself. I probably won't have to wait for a Borland user to report the regression bug. (This incestuous build process is a bit hard to fathom sometimes!) Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com