Euphoria vs. structures -- a view
- Posted by "Boehme, Gabriel" <gboehme at MUSICLAND.COM> Feb 01, 1999
Hello. I am new to the Euphoria mailing list, but I have been using Euphoria for quite some time now. I immediately noticed that 90% of my new Euphoria mail had to do with adding structures to Euphoria. I had also noticed the large number of posts on structures when I'd browsed the website earlier. Having read through the varying opinions and ideas, I'd like to share some of my thoughts on the whole structure debate.

Many of the people who are "against" structures in Euphoria seem to fear that structures will turn Euphoria into another C++. I can understand where they're coming from, but this is obviously an oversimplification of the main issue here. There is no way structures would turn Euphoria into "C++ with sequences", unless RDS also decided to add memory pointers, and decided to get rid of parameter checks, intelligible error messages, and so on.

Many of the people who are "for" adding structures to Euphoria come up with all sorts of clever implementation and code-style ideas, but overlook one fundamental problem: How will a structure be treated in the context of Euphoria?

Is it a sequence? It could be, but then can we slice it like other sequences? Or must we use the predefined field names only? I don't think we'd want people trying to code "struct_var[3..5]", for example. Plus, functions like prepend(), reverse(), and upper() would have to be restricted in some way to prevent them from "accidentally" changing the fields in a structure. Some of *you* may write perfect code all the time, but I know *I* would need something preventing me from messing things up this way, and I suspect many other Euphoria programmers would appreciate some error-prevention of this type as well. If structures are going to be fundamental to Euphoria, this kind of error-checking will be *required*.

Okay then, so maybe we should treat the structure as an atom instead.
That way, we can't try to slice it like a sequence, so we're forced to use the field names, and it's "indivisible". But wait a minute, atoms are currently numeric values only. How much existing code would have to be rewritten to deal with non-numeric atom values? Hmm, perhaps this isn't such a great idea, either.

Well, perhaps we ought to be *really* clever and say that a structure is a completely *new* data type! It'll be like a sequence in that it contains many values, but it'll also be like an atom in that we can't slice it. So, every object() type would be either an atom(), a sequence(), or a structure() [with an integer() being a restriction of the atom()]. Looks pretty good at first glance, but how much existing code contains logic like this:

    if sequence(x) then
        -- it's a sequence
    else
        -- it's an atom
    end if

A new structure() type would require any such logic to be rewritten.

Umm, perhaps the structure() type should be a restriction of sequence(), much like integer() is a restriction of atom()? This one could work, perhaps -- prepend() and reverse() could be instructed not to modify structure-type sequences. However, I don't think a fixed-length, field-defined structure could be a sub-type of a variable-length, recursive sequence. Besides, the above example logic would still need additional checks for structures, as would much existing program logic.

There's another interesting problem with the idea of Euphoria structures that caught my attention. Robert Pilkington posted some sample code of his ideas for "elegant structures". I have included the portions relevant to my points below:

>structure pos_struct
>    atom x, y
>end structure
>
>structure test
>    atom a
>    integer i
>    sequence s
>    object x
>    pos_struct pos
>end structure
>
>    -- Create them
>sequence test my_var -- Setup my_var as a sequence
>
>    -- Once initialized, we access it
>? my_var[5].x -- Works

I was struck by how easy he assumed it would be to write something like "my_var[5].x".
To do that, the interpreter has to know that the "my_var" sequence contains only elements of structure "test". In *this* example it's just fine, though it greatly disturbs me to see the recursive nature of the sequence compromised in this manner. But that's beside the point -- take a look at my code below, based on the above:

    pos_struct pos_var
    ...
    sequence stuff
    stuff = {pos_var, my_var}
    ? stuff[1].x
    ? stuff[2][5].x

Do you see the problem? The interpreter would have to "keep track" of where the structure is. Pre-compile syntax checking would be impossible in this case, since the ".x" is only resolvable at run-time. What if stuff[1] isn't a structure? What happens then?

Okay, so perhaps we shouldn't be able to do this anyway, and should do something like this instead:

    pos_struct p
    test t
    for i = 1 to length(stuff) do
        if pos_struct(stuff[i]) then
            p = stuff[i]
            ? p.x
        elsif test(stuff[i]) then
            t = stuff[i]
            ? t.x
        ...
        end if
    end for

Do you see the problem? You end up having to type-check what you're dealing with -- this example could involve sequences and still need type-checking, so you haven't gained anything. Plus, the programmer is *forced* to move "stuff[i]" into a specially-defined variable in order to access the needed field.

Grape Vine mentioned in an earlier post that "the program should know what kind of data it is", but I don't believe any thought went into how difficult that can really be in a programming language with a recursive data type like the sequence. Imagine the above example recursed a few more times, or imagine structures containing sequences of *other* structures -- just because *you* don't need to write code like this doesn't mean that nobody else will.

Any way you look at it, structures would have an enormous impact on Euphoria's fundamental attitude towards data and memory access. The simple combination of atoms and sequences is part of what makes Euphoria such a revolutionary programming language in the first place.
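As a rough illustration of the run-time checking discussed above, here is a minimal sketch in Euphoria as it stands today, with no structure syntax at all. The names POS_X, POS_Y, pos and stuff are made up for this example, and the two-element sequence "record" is only a stand-in for the proposed pos_struct.

```euphoria
-- Sketch only: POS_X, POS_Y, pos() and stuff are hypothetical names.
constant POS_X = 1, POS_Y = 2

-- A "pos" is taken to be any 2-element sequence of atoms.
type pos(object p)
    if not sequence(p) then
        return 0
    end if
    if length(p) != 2 then
        return 0
    end if
    return atom(p[POS_X]) and atom(p[POS_Y])
end type

sequence stuff
stuff = {{3, 4}, 7, "hello", "hi"}

for i = 1 to length(stuff) do
    -- the check can only happen at run time, element by element
    if pos(stuff[i]) then
        ? stuff[i][POS_X]
    else
        puts(1, "not a pos\n")
    end if
end for
```

Note that the two-character string "hi" passes the pos() check, since a string is just a sequence of atoms. A check based on shape alone cannot tell a "record" from any other sequence of the right length.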
Introducing structures into Euphoria is not going to "simplify" anything, not by a long shot. Sure, *some* things will be simpler to do, but many other things are going to be a lot more complicated. Structures -- as they have currently been suggested -- would seriously compromise Euphoria's current elegance and ease-of-use. There is no way around it.

Now, before I get flamed to death, I would like to say that I started out IN FAVOR of adding structures to Euphoria. Having to define constants for every element of a sequence "record" is aggravating in the extreme! I loved how the object-oriented style of structures (particularly in Visual Basic) simplified code, made it easy to read, and helped to avoid naming conflicts. So I immediately tried to figure out how structures could be done in Euphoria. However, the more I thought about it, the more I realized how difficult it would really be. The variable-length, recursive sequence is FUNDAMENTALLY at odds with the fixed-length, field-named structure, IMO.

I think the main problem we face with Euphoria is that the *idea* of sequences is so different from data storage in other languages. This is why the pro-structure camp is constantly accused of trying to convert Euphoria into "C++ with sequences" -- we're trying to take structure ideas from C++, Visual Basic, Pascal, etc., and shoehorn them into Euphoria. It won't work. The shoe doesn't fit.

We need something different. Something that carries out the same task as structures, but does it in a much more elegant fashion. I believe *that* should be the focus of our efforts here -- not to come up with clever ways of writing structure definitions, but to come up with a way to incorporate the *essential* features of structures into the Euphoria way of programming. That, I believe, is what Robert Craig means when he says he wants "something elegant and powerful."

Now, if only we knew what that "something" was!
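For anyone who hasn't run into it, the constants-for-every-field idiom mentioned above looks roughly like the sketch below. The names POS_X, POS_Y, pos, here and shifted are invented for the example. It also shows how an ordinary sequence operation such as prepend() can quietly move the "fields" out from under their constants.

```euphoria
-- Sketch only: all names here are hypothetical.
constant POS_X = 1, POS_Y = 2   -- one constant per "field" index

-- User-defined type: a pos is a 2-element sequence of atoms.
type pos(object p)
    if not sequence(p) then
        return 0
    end if
    if length(p) != 2 then
        return 0
    end if
    return atom(p[POS_X]) and atom(p[POS_Y])
end type

pos here
here = {10, 20}
here[POS_X] = here[POS_X] + 5   -- field access through the constant
? here[POS_X]                   -- prints 15

-- prepend() knows nothing about "fields", so everything shifts by one:
sequence shifted
shifted = prepend(here, 0)
? shifted[POS_X]                -- prints 0, not the original x
```

With type checking left on, assigning the prepended result back to "here" would fail the pos() check at run time, so a user-defined type catches some of these slips, though only when the value lands in a variable declared with that type.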