Structures

Hello all,

  It's been a while since I've been active on the list
here; work projects have left me little time to pursue
programming for fun. I've still followed the list
passively when time has permitted because I still find
Euphoria a fascinating topic.

  The subject of structures has been of particular
interest to me. It's one of those topics that manages
to resurrect itself every couple of months or so and
stir up heated discussions on both sides of the fence.

  I'm admittedly biased, but until recently I didn't
feel that the "anti-structure" camp had made many
compelling arguments for how the addition of
structures would "damage" Euphoria. Arguments centered
around the idea that if the programmer could emulate
something then there was no need to add a new feature,
regardless of the problems that the emulation incurred.
I'm glad this idea was never a dominant theory in
computer science or we'd all still be coding entirely
in assembly;)

  For me, a couple of the recent posts are the first to
really show that the ideas put forth so far concerning
structures haven't addressed all the issues they needed
to. Type checking has been one of the leading arguments
*for* adding structures to Euphoria, but not all of the
aspects concerning how the typing of structures should
be handled have been investigated.

  When I first became interested in the idea of adding
structures to Euphoria some time ago, I posted to the
list a bit of "conceptual" code to try to work out how
structures could be handled. It delves into a few more
details about type checking than many of the other
examples posted about structures, including the use of
structures as parameters, etc. You can find it in the
mail archives here:


  Gabriel Boehme's post made me aware that I had neglected
to think through an important area: sequences containing
structures (even though I had gone as far as pointing out
that they would be possible). Specifically I had missed
the question of how they could be accessed given that
the containing sequence might have mixed data within it.
-------------------------------------------------------------------------
-- here's the problem:

structure my_struct1 begin
  sequence a_member
end structure

structure my_struct2 begin
  integer a_member
end structure

sequence a_sequence
atom an_atom
my_struct1 a_struct1
my_struct2 a_struct2
integer some_var
.
.
.
a_sequence = {a_struct1, a_struct2, an_atom}
.
.
.
a_sequence[some_var].a_member = "foo" -- Bad idea!
-------------------------------------------------------------------------
  Jacques Guy's idea that the structure's declaration
should implicitly generate a type-checking function as well
would alleviate this problem, making the following possible:

if my_struct1(a_sequence[some_var]) then
  a_sequence[some_var].a_member = "foo"
end if

  Such an assignment would be quite slow though, because
the interpreter can't just assume you were thoughtful enough
to perform all the necessary type-checking. On every pass
it would have to:
  1. Obtain the type of a_sequence[some_var], and verify that
     it is a structure.
  2. Verify that a_sequence[some_var] has a member named a_member.
     (This will be very slow. It can't hard-code an index number
      because it has no idea which structure is being used,
      so it must look up the index number every time.)
  3. Perform the normal type-check to verify that the member
     can hold what's being assigned to it.
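
  To make the cost of those three steps concrete, here is a
rough model of them. It is written in Python rather than
Euphoria, since the structure syntax above is only a proposal
and won't run anywhere. StructType, StructValue and
assign_member are names I made up for the illustration; a real
interpreter would do this internally and no doubt differently.

```python
# Hypothetical model only: StructType, StructValue and assign_member
# are invented names, not part of Euphoria.

class StructType:
    """One structure declaration: member names mapped to slots and checks."""
    def __init__(self, name, members):
        # members is a list of (member_name, check_function) pairs
        self.name = name
        self.slot_of = {m: i for i, (m, _) in enumerate(members)}
        self.checks = [check for _, check in members]


class StructValue:
    """An instance: a reference to its declaration plus one slot per member."""
    def __init__(self, stype, values):
        self.stype = stype
        self.slots = list(values)


def is_sequence(x):
    return isinstance(x, (list, str))


def is_integer(x):
    return isinstance(x, int) and not isinstance(x, bool)


def assign_member(container, index, member, value):
    """Model of: container[index].member = value"""
    element = container[index]
    # Step 1: the element must be a structure at all.
    if not isinstance(element, StructValue):
        raise TypeError("element is not a structure")
    # Step 2: look the member name up at run time, on every assignment,
    # because the declaration is unknown until the element is inspected.
    slot = element.stype.slot_of.get(member)
    if slot is None:
        raise AttributeError("no member named " + member)
    # Step 3: the ordinary type check for the member being assigned.
    if not element.stype.checks[slot](value):
        raise TypeError("wrong type for member " + member)
    element.slots[slot] = value


my_struct1 = StructType("my_struct1", [("a_member", is_sequence)])
my_struct2 = StructType("my_struct2", [("a_member", is_integer)])

# Python indexes from 0, so element 0 here plays the role of a_sequence[1].
a_sequence = [StructValue(my_struct1, [""]), StructValue(my_struct2, [0]), 3.5]

assign_member(a_sequence, 0, "a_member", "foo")  # accepted: my_struct1 holds a sequence
```

  The dictionary lookup in step 2 is the part that a plain
variable of a known structure type would never need, since the
slot number could be fixed when the program is parsed.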

  The alternatives (that I've thought of) to this are
not very attractive...
One is to disallow access to the members of a structure
buried within a sequence, forcing the programmer to
shuffle the structure out to a temporary variable defined
of that type in order to read or modify its members.
The other is to treat the structure like a sequence and
access its members through the [] operator...
although I'll explain later why this should be allowed,
it's not the ideal solution to this particular problem
because it defeats the usefulness of having structures
to begin with... it reinstates the need for either
"magic-number" indices, or constants that are liable
to suffer from namespace conflicts.

  The automatic formation of the type check function for
the structure also addresses another of Mr. Boehme's
concerns: "What is the type of a structure?" Every
structure definition is its own new type...

  Using the declarations from the example above, an
assignment such as this would be illegal because they
have different types:

a_struct1 = a_struct2

  Even if the two structures contained identical member
types in the same order, it would not be safe to allow
this. The problem is that if the author later comes in
and adds something to my_struct1 without doing the same
to my_struct2, the code will break all over the place.
It's better to have the assignments take place explicitly
on a member-by-member basis.
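
  As a rough sketch of that rule (in Python, since the
structure syntax here is only proposed), two declarations with
identical members are still treated as different types.
MyStruct1, MyStruct2 and assign_struct are made-up names for
the illustration, and both structures are given an integer
member to show the "identical layout" case.

```python
from dataclasses import dataclass, fields


@dataclass
class MyStruct1:
    a_member: int = 0


@dataclass
class MyStruct2:
    a_member: int = 0


def assign_struct(target, source):
    """Model of whole-structure assignment: target = source.

    Allowed only when both were declared with the very same structure
    type, even if another type happens to have the same members.
    """
    if type(target) is not type(source):
        raise TypeError("cannot assign %s to %s"
                        % (type(source).__name__, type(target).__name__))
    for f in fields(target):
        setattr(target, f.name, getattr(source, f.name))


a_struct1 = MyStruct1()
a_struct2 = MyStruct2(a_member=5)

# The explicit member-by-member form is always available.
a_struct1.a_member = a_struct2.a_member
```

  With this, assign_struct(a_struct1, a_struct2) is rejected
while the member-by-member copy goes through, so a later
change to only one of the declarations shows up at the exact
line that depends on it.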

  Structures should be able to be *converted* to
sequences through a simple assignment:

a_sequence = a_struct1

  But the reverse shouldn't hold true for the same reason
that assigning structures of different types is a bad idea.
Also Mr. Boehme wrote that "functions like prepend(),
reverse(), and upper()" would cause some problems if
structures were treated as sequences, but this one-way
conversion policy prevents that.

some_sequence = append(some_struct, 12) -- perfectly ok.

some_struct = append(some_struct, 12)   -- illegal, the data returned
                                        -- by append() is a sequence
                                        -- and cannot be assigned to
                                        -- the structure.
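
  Here is a small model of that one-way policy, again in
Python because the proposed syntax can't be run. Struct,
SomeStruct, assign and this version of append are all
stand-ins I invented; the only point is the direction in which
conversion is allowed.

```python
class Struct:
    """Hypothetical base for structure values with ordered members."""
    members = ()

    def __init__(self, *values):
        if len(values) != len(self.members):
            raise TypeError("wrong number of members")
        self.values = list(values)

    def to_sequence(self):
        # Converting to a sequence copies the members in declaration order.
        return list(self.values)


class SomeStruct(Struct):
    members = ("x", "y")


def append(s, item):
    """Model of append(): the result is always a plain sequence."""
    seq = s.to_sequence() if isinstance(s, Struct) else list(s)
    return seq + [item]


def assign(declared_type, value):
    """Model of assigning value to a variable declared as declared_type."""
    if declared_type is list:
        # A sequence variable accepts sequences and converted structures.
        if isinstance(value, Struct):
            return value.to_sequence()
        if isinstance(value, list):
            return value
        raise TypeError("not a sequence")
    # A structure variable accepts only values of its own structure type.
    if not isinstance(value, declared_type):
        raise TypeError("cannot assign a sequence to a structure")
    return value


some_struct = SomeStruct(1, 2)
some_sequence = assign(list, append(some_struct, 12))   # perfectly ok
```

  Trying assign(SomeStruct, append(some_struct, 12)) raises an
error, which is the behaviour described in the comments above.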

  Remember earlier I said that structures should allow
you to access their members with the [] operator... This
is so that structures can be treated like a sequence when
the situation calls for it. This solves Mr. Boehme's
concern over what would happen if a structure were for
some reason shuffled through old code like this:

if sequence(x) then
  sum = 0
  for i = 1 to length(x) do
    sum = sum + x[i]
  end for
else
  -- x must be an atom
  sum = x
end if

  At first, you might balk at the idea of having a
structure act like a sequence in a situation like this,
but I'd argue that passing a structure through it
unintentionally is more of a logic error (read as
programmer stupidity;) than a type violation. In fact
there may be a few situations where it's beneficial...
Suppose x was a structure holding a list of golf
scores for several days.

  Treating a structure like a sequence also solves a
much bigger problem that would arise from old code...
updating old libraries. Allowing a structure to be
accessed in this manner allows authors of libraries
to redo/update their old libraries to use structures
independent of the old code that uses them. All that
would be required is that the library retain the myriad
of constants that were used to index the old emulated
structures...

-- old indices, must be maintained to support old code
constant BAR = 1
constant BAZ = 2

-- old foo structure
-- sequence foo

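-- new foo structure (the type is called foo_struct here, a
-- made-up name, so that the variable can keep its old name foo)
structure foo_struct begin
  integer bar
  atom baz
end structure
foo_struct foo
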
-- Allows both old and new syntax for accessing the structure
foo[BAR] = 17  -- works, although type checking isn't there
foo.baz = 12.3 -- new code benefits from reduced potential
               -- of namespace conflicts, and automatic type-checking

  With this allowance the user of the library can update their
old code at their leisure when updating to a new library instead
of having to redo it all at once.
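
  To show that the two access styles can sit on top of the
same storage, here is a hedged sketch in Python (the proposed
structure syntax isn't runnable). The class Foo and its helper
functions are invented for the example; subscripts are kept
1-based to mirror Euphoria.

```python
# Old index constants, kept so that old code keeps working.
BAR = 1
BAZ = 2


def _is_integer(x):
    return isinstance(x, int) and not isinstance(x, bool)


def _is_atom(x):
    return isinstance(x, (int, float)) and not isinstance(x, bool)


class Foo:
    """Hypothetical stand-in for the new foo structure."""

    def __init__(self, bar=0, baz=0.0):
        self._slots = [bar, baz]

    def _check_index(self, index):
        if not 1 <= index <= len(self._slots):
            raise IndexError("subscript out of bounds")

    def __len__(self):
        return len(self._slots)

    def __iter__(self):
        return iter(self._slots)

    # Old style: foo[BAR]. No member type check, just like a sequence.
    def __getitem__(self, index):
        self._check_index(index)
        return self._slots[index - 1]

    def __setitem__(self, index, value):
        self._check_index(index)
        self._slots[index - 1] = value

    # New style: foo.bar and foo.baz, with the member's type checked.
    @property
    def bar(self):
        return self._slots[BAR - 1]

    @bar.setter
    def bar(self, value):
        if not _is_integer(value):
            raise TypeError("bar must be an integer")
        self._slots[BAR - 1] = value

    @property
    def baz(self):
        return self._slots[BAZ - 1]

    @baz.setter
    def baz(self, value):
        if not _is_atom(value):
            raise TypeError("baz must be an atom")
        self._slots[BAZ - 1] = value


foo = Foo()
foo[BAR] = 17    # old code path, unchecked
foo.baz = 12.3   # new code path, checked
```

  Both forms read and write the same slots, so a library could
switch its internals over while callers migrate one access at
a time.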

  Lastly, Jacques Guy posted the following:

> Nothing prevents a sequence from containing routine id's. For instance:
> [Euphoria with records]
>
> record point
>   integer x,y
>   procedure draw()
>    -- your code here
>   end procedure
> end record
>
> point a
>
> a.x=1 a.y=1
> a.draw()

  This goes a little beyond just including the routine_id
into a structure. To do that you'd simply use an integer
to hold the id and use call_proc() or call_func() to
invoke the "method". This is really a simplistic beginning
to a class implementation. I'd actually love to see
Euphoria implement a class construct, with the features
that make classes very powerful... maybe along the lines
of something based on the way Python handles them, (with
improvements... Python still has a few quirks in this area)
but certainly friendlier than the way C++ handles them.
There would then be no need to have structures because
classes could handle what structures are intended to do.
I really can't foresee this coming to pass though; to
implement classes well would more than likely require a
complete re-write of Euphoria's core, and would probably
yield a slower overall execution speed than Euphoria
currently enjoys.
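
  Going back to the simpler approach mentioned at the start of
that paragraph (an integer member holding a routine id), here
is roughly what I mean, modelled in Python. The routine_id and
call_proc below are simplified stand-ins for the Euphoria
built-ins of the same names: the real routine_id takes the
routine's name as a string, and the table here is my own
invention.

```python
# Simplified stand-ins for Euphoria's routine_id() and call_proc().
_routine_table = []


def routine_id(proc):
    """Return a small integer that identifies proc."""
    if proc not in _routine_table:
        _routine_table.append(proc)
    return _routine_table.index(proc)


def call_proc(rid, args):
    """Call the procedure identified by rid with a list of arguments."""
    _routine_table[rid](*args)


# An emulated "point" record: two coordinates plus the id of its draw routine.
X, Y, DRAW = 0, 1, 2

drawn = []


def draw_point(point):
    # Stand-in for real drawing code: just record where we would draw.
    drawn.append((point[X], point[Y]))


a = [1, 1, routine_id(draw_point)]

# The "method call": look up the id stored in the record and invoke it.
call_proc(a[DRAW], [a])
```

  The record itself has to be passed in explicitly, which is
the main thing a real class construct would take care of for
you.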

Regards,
Christopher D. Hickman
