Re: Constructive criticism
- Posted by DerekParnell (admin) Mar 07, 2009
Here we go again...
Please bear with me because I may be misunderstanding you.
You do not seem to want to improve Euphoria, nor do you seem to want to use Euphoria, so (and this is not a rhetorical question) why are you here? What exactly is it that you want?
Well, general graphs cannot be easily represented in EU with sequences, IMHO.
I disagree. Describe a general graph that you feel cannot be implemented in Euphoria and I will attempt to show you it can.
Lack of pass by reference makes a stack harder to write...
I disagree. To prove it to myself, I just wrote a library file that implements stacks and it took me about five minutes to write and debug. I can make it public if you wish to see it.
It is also easy to use...
------
include stack.e
include std/pretty.e

-- Create two stacks.
object s = new()
object t = new()

-- Push three values, of different types, onto the first stack
push(s, 1)     --> 1
push(s, "two") --> "two", 1
push(s, 3.4)   --> 3.4, "two", 1

-- Play around with the second stack
push(t, 100)   --> 100
dup(t)         --> 100, 100
push(t, 99)    --> 99, 100, 100
swap(t)        --> 100, 99, 100
push(t, 98)    --> 98, 100, 99, 100
rot(t)         --> 100, 99, 98, 100
push(t, 97)    --> 97, 100, 99, 98, 100
rrot(t)        --> 99, 97, 100, 98, 100
drop(t)        --> 97, 100, 98, 100

sequence r

-- Display all of the first stack
for i = 1 to depth(s) do
    r = pop(s)
    pretty_print(1, r[1], {2})
    puts(1, '\n')
end for
puts(1, "\n----\n")

-- Display all of the second stack
for i = 1 to depth(t) do
    r = pop(t)
    pretty_print(1, r[1], {2})
    puts(1, '\n')
end for
------
The output is ...
3.4
"two"
1
----
97
100
98
100
"Making a flexible list that contains a variety of different kinds of data objects is trivial in Euphoria, but requires dozens of lines of ugly code in other languages."
Well, it is easy in JavaScript, Python, Lua, Lisp, Smalltalk, etc. The thing is called dynamic typing. Many languages have it nowadays. - Outdated.
Please note that "many" is not the same as "all". In terms of the number of programs that are currently being maintained, I would guess that most are written in languages that do not support dynamic typing. For example, COBOL, Fortran, C, C++, Assembler. Of course there are libraries available for each of these that emulate dynamic typing, but it is not built into the languages themselves.
Also note that there are many respected programmers that feel that dynamic typing is a sign of poor programming discipline and the source of many bugs.
So, to dismiss Euphoria's claim that it supports dynamic typing as "outdated" seems to demonstrate that maybe the computing world is bigger than you care to admit or know about. But I'm just speculating here.
Additionally, Euphoria also supports static typing. If something is declared as integer then you can only assign integers to it, and likewise with atom and sequence. (NB. Integers are a subset of atoms, that is why you can assign all integers to atoms but not all atoms to integers)
"Data structure manipulations are very efficient since the Euphoria interpreter will point to large data objects rather than copy them."
Copy-on-write does not avoid all copies. - Misleading.
What is misleading? The statement is true. For example, if you have a 1000-element sequence, and you change the 501st element, a copy of the sequence is not made, just the element is altered. This is what the documentation is referring to.
Copy-on-write is not what is being discussed here in the documentation. For those who may be unsure of what this is, COW occurs when a sequence is modified and the original sequence is currently referenced by more than one variable. In this case, we take a copy so that the other references maintain an accurate view of the world.
sequence a
sequence b
sequence c
a = "123"   -- 'a' points to a literal
b = a       -- 'b' now points to the same literal
c = a       -- 'c' now points to the same literal
a[1] = '0'  -- 'a' is changed, so we take a copy of the
            -- literal, modify it and get 'a' to point to the new copy.
            -- This means that 'b' and 'c' still point to the original literal.
Notice that even though we have three sequences declared, in RAM there exist only two sets of sequence data. 'a' points to one set, and 'b' and 'c' point to the other. This is efficient.
"Programming in Euphoria is based entirely on creating and manipulating flexible, dynamic sequences of data. Sequences are it - there are no other data structures to learn. You operate in a simple, safe, elastic world of values, that is far removed from the rigid, tedious, dangerous world of bits, bytes, pointers and machine crashes."
True - if you do not interface with external libraries. - Misleading.
What exactly is the misleading part? It is simply saying that if your program is Euphoria, you do not have to work with a multitude of data structures, pointers and complexity, which is a major source of bugs and maintenance costs. This is a true statement. It is not misleading. One has a choice. However, if your program must work with those data structures required by external libraries, you can.
"Unlike other languages such as LISP and Smalltalk, Euphoria's "garbage collection" of unused storage is a continuous process that never causes random delays in execution of a program, and does not pre-allocate huge regions of memory."
Outdated - LISP and Smalltalk have better GCs nowadays. Use Java as an example. One needs to tell the JVM how much memory the program is going to use.
When this was written, both LISP and Smalltalk were relevant comparison languages. Today, one could use other modern languages as examples, such as D. Some modern languages choose to do garbage collection only at specific events (e.g. acquiring a new RAM block), during which time they suspend all threads until the GC has completed and then resume those threads. This can cause programs to have momentary pauses. This happens with some languages today. Euphoria does a continuous GC, so such pauses are not seen.
"The language definitions of conventional languages such as C, C++, Ada, etc. are very complex. Most programmers become fluent in only a subset of the language. The ANSI standards for these languages read like complex legal documents."
That suggests that one learns a language by reading an ANSI standard. - Misleading.
How does it suggest that? "Most programmers become fluent in only a subset of the language" is a true statement. Why is that so? Because the full language specification is more complex than the average programmer needs or is prepared to absorb.
The implied point is that Euphoria is not as complex ... and I'm not so sure that is true anymore. Version 4 adds many new things to the language and the full language is not going to be used by the average Euphoria programmer. This simply means that the coder who currently uses v3.1 can still continue to code with that subset of the language.
"You are forced to write different code for different data types simply to copy the data, ask for its current length, concatenate it, compare it etc. The manuals for those languages are packed with routines such as "strcpy", "strncpy", "memcpy", "strcat", "strlen", "strcmp", "memcmp", etc. that each only work on one of the many types of data."
That refers to C only. EU has a similar problem with "equal" vs. "=". - Misleading.
Yes, it refers (mainly) to C. The template feature of C++ and D makes it easier to avoid having to write similar functions that differ only in the data types being addressed. This is a side-effect of statically typed languages, by the way.
And yes, the equality test is not handled well in Euphoria. We hope to address this in a future release. Currently, if you need to 'bullet-proof' your tests, you need to use the equal() or compare() function because these operate on any data type. The "=" operator does not work for sequences as an equality test.
Much of the complexity surrounds issues of data type. How do you define new types? Which types of data can be mixed? How do you convert one type into another in a way that will keep the compiler happy? When you need to do something requiring flexibility at run-time, you frequently find yourself trying to fake out the compiler.
Maybe for C or some other statically typed languages. - Misleading.
How is this misleading? Yes it is referring to static typed languages, which are still the ones most maintained in the world. It is a relevant statement.
In these languages the numeric value 4 (for example) can have a different meaning depending on whether it is an int, a char, a short, a double, an int * etc. In Euphoria, 4 is the atom 4, period. Euphoria has something called types as we shall see later, but it is a much simpler concept.
Well, 4 never is an "int*" and never was - even in 1993. - Wrong.
You are wrong. Take for example the AmigaDOS operating system. The value at RAM address 0x00000004 is the only predefined value in the system. So the numeric value 4 could actually mean a RAM address.
Issues of dynamic storage allocation and deallocation consume a great deal of programmer coding time and debugging time in these other languages, and make the resulting programs much harder to understand. Programs that must run continuously often exhibit storage "leaks", since it takes a great deal of discipline to safely and properly free all blocks of storage once they are no longer needed.
The thing is called garbage collection and most languages have it nowadays. And Smalltalk and Lisp definitely had it in 1993. - Wrong, outdated and misleading.
As it turns out, most used languages do not have a built-in Garbage Collector. COBOL, Fortran, C, Pascal, and Ada do not have one. Most interpreted scripting languages do have one, and a few true compiled languages (e.g. D) also have one.
The documentation is still relevant. It is neither wrong, outdated, nor misleading.
Pointer variables are extensively used. The pointer has been called the "go to" of data structures. It forces programmers to think of data as being bound to a fixed memory location where it can be manipulated in all sorts of low-level, non-portable, tricky ways.
Well, that is exactly what EU's implementation does: If you cast pointers to integers and store your type information in the lowest bits in the pointer, you violate ANSI C. But it does not follow that this is generally needed when programming in C.
Why is the implementation of Euphoria being compared to the use of Euphoria? Of course the implementation uses pointers - it is written in C. Actually only part of the implementation is in C; a lot of Euphoria is actually written in Euphoria. This is the very point that the documentation is getting at. A coder using Euphoria does not have to think of memory addresses (pointers) and other RAM-mapped structures because all that has been taken care of for you by the implementation (under the hood).
And I do not believe that storing data in a pointer is an ANSI C violation. Can you direct me to the relevant document from which you got that information? I'm willing to be corrected. What the Euphoria backend is doing is using 29-bit pointers stored in a 32-bit object. As every RAM address used in the backend is aligned to an 8-byte boundary, the rightmost 3 bits of an address are always zero. So the backend shifts the address right 3 bits, leaving 29 bits containing the 'address' value. When this is stored in a 32-bit object, it leaves the leftmost 3 bits available for the backend to use. This is efficient.
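For anyone who wants to see the shift-and-tag idea spelled out, here is a minimal sketch in C. It is not the actual backend source: the names pack, unpack_addr and unpack_tag are made up for illustration, the meaning of the tag values is left open, and addresses are modelled as plain 32-bit integers so that no real pointer is cast.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants, chosen to match the 29 + 3 bit split described above. */
#define TAG_SHIFT 29u
#define ADDR_MASK 0x1FFFFFFFu   /* the low 29 bits of the word */

/* Pack an 8-byte-aligned 32-bit address and a 3-bit tag into one 32-bit word. */
uint32_t pack(uint32_t addr, uint32_t tag)
{
    assert((addr & 7u) == 0u);  /* alignment means the low 3 bits carry no information */
    assert(tag < 8u);           /* the tag must fit in the 3 freed bits */
    return (tag << TAG_SHIFT) | (addr >> 3);
}

/* Recover the original address by masking off the tag and shifting back. */
uint32_t unpack_addr(uint32_t word)
{
    return (word & ADDR_MASK) << 3;
}

/* Recover the 3-bit tag from the top of the word. */
uint32_t unpack_tag(uint32_t word)
{
    return word >> TAG_SHIFT;
}
```

Packing an aligned address such as 0x12345678 with any tag from 0 to 7 and then unpacking returns both values unchanged, because the three bits discarded by the shift were always zero.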
"A picture of the actual hardware that your program will run on is never far from your mind. Euphoria does not have pointers and does not need them."
Well, for cyclic object structures, you have to work around the lack of references and pointers by using an index. So arguably, EU needs them just like any other programming language.
Needs what? Pointers? The documentation only talks about pointers. Euphoria does not need pointers, that is a fact. It can use pointers, but does not need them. I think your argument here is misguided or disingenuous.
However, indexes as references are a different matter. As you are well aware, an index is not a RAM address (pointer / reference) so there is a significant difference right there. A sequence can be used to emulate RAM access, but there is a huge difference. Pointers only address single bytes, any bytes after the first byte pointed to must only be used in accordance to the data structure assumed to be there. Of course, with sequences, each 'location' can be used to store anything, not just single bytes. This leads to being able to store complex data structures in a simple manner, including cyclic objects.
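To make the index-as-reference idea concrete, here is a small sketch of a cyclic structure whose links are positions in a container rather than RAM addresses. It is written in C only as an illustration; the names node, ring and walk are hypothetical, and in Euphoria the array would be a sequence with indexes starting at 1.

```c
#include <assert.h>
#include <stddef.h>

#define NODE_COUNT 3

/* A node refers to its successor by position in the array, not by address. */
struct node {
    int    value;
    size_t next;    /* index into ring[] (hypothetical field name) */
};

/* Three nodes forming a cycle: 0 -> 1 -> 2 -> 0. */
static const struct node ring[NODE_COUNT] = {
    { 10, 1 },
    { 20, 2 },
    { 30, 0 },
};

/* Follow `steps` links starting at node `start` and return the value found there. */
int walk(size_t start, size_t steps)
{
    size_t i = start;
    assert(start < NODE_COUNT);
    while (steps-- > 0) {
        i = ring[i].next;
    }
    return ring[i].value;
}
```

Walking three steps from node 0 comes back to node 0, so the structure is genuinely cyclic even though no node holds a memory address.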
So why are you here? Are you here to help Euphoria or something else?
On one thing I think we agree: this section of the manual needs its language and focus updated.