Euphoria vs. structures -- a view

Hello. I am new to the Euphoria mailing list, but I have been using Euphoria
for quite some time now. I immediately noticed that 90% of my new Euphoria
mail had to do with adding structures to Euphoria. I had also noticed the
large number of posts on structures when I'd browsed the website earlier.
Having read through the varying opinions and ideas, I'd like to share some
of my thoughts on the whole structure debate.

Many of the people who are "against" structures in Euphoria seem to fear
that structures will turn Euphoria into another C++. I can understand where
they're coming from, but this is obviously an oversimplification of the main
issue here. There is no way structures would turn Euphoria into "C++ with
sequences", unless RDS also decided to add memory pointers, and decided to
get rid of parameter checks, intelligible error messages, and so on.

Many of the people who are "for" adding structures to Euphoria come up with
all sorts of clever implementation and code-style ideas, but overlook one
fundamental problem:

How will a structure be treated in the context of Euphoria?

Is it a sequence? It could be, but then can we slice it like other
sequences? Or must we use the predefined field names only? I don't think
we'd want people trying to code "struct_var[3..5]", for example. Plus,
functions like prepend(), reverse(), and upper() would have to be restricted
in some way to prevent them from "accidentally" changing the fields in a
structure. Some of *you* may write perfect code all the time, but I know *I*
would need something preventing me from messing things up this way, and I
suspect many other Euphoria programmers would appreciate some
error-prevention of this type as well. If structures are going to be
fundamental to Euphoria, this kind of error-checking will be *required*.
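
As a rough illustration of the kind of accident I mean, here is a sketch using
today's idiom of a sequence "record" indexed by constants. The names NAME, AGE
and person are made up for this example, and none of it is proposed structure
syntax:

```euphoria
-- Hypothetical field indices for a two-field "record" kept in a sequence.
constant NAME = 1, AGE = 2

sequence person
person = {"Ann", 30}

-- prepend() is perfectly legal on any sequence, but it shifts every field.
person = prepend(person, 0)

? length(person)          -- 3: the record has silently grown
? atom(person[NAME])      -- 1: the NAME slot now holds the number 0
? sequence(person[AGE])   -- 1: the AGE slot now holds the string "Ann"
```

Nothing in the language objects to any of this. A structure that is "just a
sequence" would presumably inherit the same behavior unless such calls were
restricted.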

Okay then, so maybe we should treat the structure as an atom instead. That
way, we can't try to slice it like a sequence, so we're forced to use the
field names, and it's "indivisible". But wait a minute, atoms are currently
numeric values only. How much existing code would have to be rewritten to
deal with non-numeric atom values? Hmm, perhaps this isn't such a great
idea, either.

Well, perhaps we ought to be *really* clever and say that a structure is a
completely *new* data type! It'll be like a sequence in that it contains
many values, but it'll also be like an atom in that we can't slice it. So,
every object() type would be either an atom(), a sequence(), or a
structure() [with an integer() being a restriction of the atom()]. Looks
pretty good at first glance, but how much existing code contains logic like
this:

if sequence(x) then
   -- it's a sequence
else
   -- it's an atom
end if

A new structure() type would require any such logic to be rewritten. Umm,
perhaps the structure() type should be a restriction of sequence(), much
like integer() is a restriction of atom()? This one could work, perhaps --
prepend() and reverse() could be instructed not to modify structure-type
sequences. However, I don't think a fixed-length, field-defined structure
could be a sub-type of a variable-length, recursive sequence. Besides, the
above example logic would still need additional checks for structures, as
would much existing program logic.
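
To make the concern concrete, here is a minimal sketch of the sort of routine I
have in mind. The function deep_sum() is invented for this example; the point
is only that its "else" branch takes for granted that whatever is not a
sequence must be an atom:

```euphoria
-- Adds up every atom in an arbitrarily nested object.
function deep_sum(object x)
    atom total
    if sequence(x) then
        total = 0
        for i = 1 to length(x) do
            total = total + deep_sum(x[i])
        end for
        return total
    else
        -- Assumption baked into the code: not a sequence means an atom.
        return x
    end if
end function

? deep_sum({1, {2, 3}, {{4}, 5.5}})   -- prints 15.5
```

If a structure were a third kind of object, a structure reaching the "else"
branch would be returned as though it were a number, and every routine written
this way would need revisiting.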

There's another interesting problem with the idea of Euphoria structures
that caught my attention. Robert Pilkington posted some sample code of his
ideas for "elegant structures". I have included the portions relevant to my
points below:

>structure pos_struct
>   atom x, y
>end structure
>
>structure test
>   atom a
>   integer i
>   sequence s
>   object x
>   pos_struct pos
>end structure
>
> -- Create them
>sequence test my_var  -- Setup my_var as a sequence
>
> -- Once initialized, we access it
>? my_var[5].x -- Works

I was struck by how easy he assumed it would be to write something like
"my_var[5].x". To do that, the interpreter has to know that the "my_var"
sequence contains only elements of structure "test". In *this* example it's
just fine, though it greatly disturbs me to see the recursive nature of the
sequence compromised in this manner. But that's beside the point -- take a
look at my code below, based on the above:

pos_struct pos_var
...

sequence stuff
stuff = {pos_var, my_var}
? stuff[1].x
? stuff[2][5].x

Do you see the problem? The interpreter would have to "keep track" of where
the structure is. Pre-compile syntax checking would be impossible in this
case, since the ".x" is only resolvable at run-time. What if stuff[1] isn't
a structure? What happens then?

Okay, so perhaps we shouldn't be able to do this anyway, and should do
something like this instead:

pos_struct p
test t
for i = 1 to length(stuff) do
   if pos_struct(stuff[i]) then
      p = stuff[i]
      ? p.x
   elsif test(stuff[i]) then
      t = stuff[i]
      ? t.x
   ...
   end if
end for

Do you see the problem? You end up having to type-check what you're dealing
with -- the same loop written with plain sequences would need exactly the same
checks, so you haven't gained anything. Plus, the programmer is *forced* to move
"stuff[i]" into a specially-defined variable in order to access the needed
field.
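
For comparison, here is a sketch of roughly the same loop using only what
Euphoria already has: a sequence "record", index constants, and a user-defined
type called as a function. The names POS_X, POS_Y and pos_record are
assumptions made for this example, not anything from Robert's proposal:

```euphoria
-- Hypothetical field indices for a "position" record stored as a sequence.
constant POS_X = 1, POS_Y = 2

-- User-defined type: accepts any two-element sequence of atoms.
type pos_record(object r)
    if not sequence(r) then
        return 0
    end if
    if length(r) != 2 then
        return 0
    end if
    return atom(r[POS_X]) and atom(r[POS_Y])
end type

sequence stuff
stuff = {{10, 20}, "hello", 7, {1.5, 2.5}}

for i = 1 to length(stuff) do
    if pos_record(stuff[i]) then
        ? stuff[i][POS_X]   -- reached for {10, 20} and {1.5, 2.5} only
    end if
end for
```

The type check is still there, but no temporary variable is needed. Note that
the check is purely structural: any two-atom sequence, such as the string "hi",
would pass as a pos_record too.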

Grape Vine mentioned in an earlier post that "the program should know what
kind of data it is", but I don't believe any thought went into how difficult
that can really be in a programming language with a recursive data type like
the sequence. Imagine the above example recursed a few more times, or
imagine structures containing sequences of *other* structures -- just
because *you* don't need to write code like this doesn't mean that nobody
else will.

Any way you look at it, structures would have an enormous impact on
Euphoria's fundamental attitude towards data and memory access. The simple
combination of atoms and sequences is part of what makes Euphoria such a
revolutionary programming language in the first place. Introducing
structures into Euphoria is not going to "simplify" anything, not by a long
shot. Sure, *some* things will be simpler to do, but many other things are
going to be a lot more complicated. Structures -- as they have currently
been suggested -- would seriously compromise Euphoria's current elegance and
ease-of-use. There is no way around it.

Now, before I get flamed to death, I would like to say that I started out IN
FAVOR of adding structures to Euphoria. Having to define constants for every
element of a sequence "record" is aggravating in the extreme! I loved how
the object-oriented style of structures (particularly in Visual Basic)
simplified code and made it easy to read, and helped to avoid naming
conflicts. So I immediately tried to figure out how structures could be done
in Euphoria. However, the more I thought about it, the more I realized how
difficult it would really be. The variable-length, recursive sequence is
FUNDAMENTALLY at odds with the fixed-length, field-named structure, IMO.

I think the main problem we face with Euphoria is that the *idea* of
sequences is so different from data storage in other languages. This is why
the pro-structure camp is constantly accused of trying to convert Euphoria
into "C++ with sequences" -- we're trying to take structure ideas from C++,
Visual Basic, Pascal, etc., and shoehorn them into Euphoria. It won't work.
The shoe doesn't fit.

We need something different. Something that carries out the same task as
structures, but does it in a much more elegant fashion. I believe *that*
should be the focus of our efforts here -- not to come up with clever ways
of writing structure definitions, but to come up with a way to incorporate
the *essential* features of structures into the Euphoria way of programming.
That, I believe, is what Robert Craig means when he says he wants "something
elegant and powerful."

Now, if only we knew what that "something" was!
