OpenEuphoria: Forum: Re: Problems with structures

Posted by "Boehme, Gabriel" <gboehme at MUSICLAND.COM> Feb 04, 1999
Robert Pilkington <pilking at BELLATLANTIC.NET> writes:

>>As you can see, it's very easy to understand what's going on here. You have
>>a deck of 52 cards, identified here by suit and value. You don't have to
>>define constants to use as subscripts (i.e. Suit = 1, Value = 2) -- you can
>>give the individual portions field names to uniquely identify them. Plus,
>>the compiler will *always* know if a field reference is valid or not, and
>>can tell you right away if you've made a mistake. If I were to try this...
>>
>>DIM Hand(1 TO 7) AS STRING * 9
>>Hand(1).Suit = "Spade"
>>
>>..the compiler would immediately flag it as an error -- the "Hand" array is
>>not defined to contain "Card"-type data, so the ".Suit" reference has no
>>meaning here. The same goes for all the other languages containing
>>structures -- since every data element's type *must* be defined beforehand,
>>the compiler can know instantly if a field reference is valid or not.
>
>
>Right. Argument #1 for structures: Type checking. Argument #2 hasn't been
>touched on in your message, so I'll point it out here: Speed. Structures, in
>theory, should be faster than the dynamic sequence Euphoria has.

I would agree with argument #1, on some level -- named references should be
available *somehow*. But it doesn't work within the context of Euphoria.
Neither does argument #2, not that I can see. Read on for more details...

>What about:
>
>Card Hand -- Define Hand as Card.
>sequence Card Hand -- Make it an "array"

Does Euphoria have arrays? No. Euphoria has *sequences*. Once again, we are
creating contradictory syntax for sequence definitions, and diluting the
*essential* features of generic sequences...

>Now, before you say the interpreter could see this:
>
>sequence Card
>Hand
>
>Remember: What would happen here:
>
>type Card(atom x)
>    return atom(x)
>end type
>
>sequence Card
>
>We get an error, right? That's because the interpreter *KNOWS* that Card has
>been defined as a type. Now, if it *KNOWS* that Card was defined as a
>structure, then it would know what to do with the next "line". The only
>problem is that Euphoria's syntax is usually pretty predictable. This is
>slightly less consistent.

This is true -- "sequence structure variable" just looks confusing. This
method of variable declaration could cause problems within routines, as
well. Please try this yourself -- create a Euphoria executable containing
just this code:

type five(integer x)
   return x = 5
end type

procedure test()
integer five
   five = 4
   ? five
end procedure

test()

Now run it. Hey, it works! Type-declarations are allowed to be used as
variable names within routines. Would structures be any different?

Sequences are, *by definition*, a collection of *any* Euphoria objects. The
only way to restrict this at present is to declare the variable with a
user-defined type instead of "sequence":

my_sequence Card

This way, it's obvious how to type-check Card -- just look at the
type-declaration of my_sequence() to see what's legal. With your
three-identifier declaration, it's unclear just how the type-checking takes
place. Argument #1 for structures should not *create* any type-check
confusion -- it should *solve* it.
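
To make the point concrete, here is a minimal sketch of what such a type-declaration might look like. The two-element {suit, value} layout is only an assumption for illustration; nothing above fixes what my_sequence() actually checks.

```euphoria
-- Hypothetical layout: a card is a two-element sequence {suit, value}
type my_sequence(object x)
    if not sequence(x) then
        return 0
    end if
    return length(x) = 2
end type

my_sequence Card
Card = {"Spade", 1}    -- passes the type-check
-- Card = 5            -- would fail the type-check at runtime
```

Everything that is legal for Card is spelled out in one place, and the declaration itself stays a plain two-identifier declaration.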

>Now, this next argument you provide, has one huge flaw in it:
>
>>Secondly, examples such as these don't deal adequately with generic Euphoria
>>objects. Most of the structure ideas I've seen show how we could define and
>>use the structures, but don't really examine how *normal* Euphoria would
>>work with them. What if I try this:
>>
>>object one_card
>>one_card = Deck[1]
>>? one_card.Suit
>>? one_card.Value
>>
>>It seems simple enough at first glance -- the object "one_card" is assigned
>>the value from "Deck[1]", which is in "Card" format. So naturally the field
>>references would make sense. But how does the Euphoria "compiler" know for
>>sure what's in "one_card"? How does it know that "one_card.Suit" is a valid
>>field reference? After all, "one_card" is a generic object. There's no
>>guarantee that it will *always* contain "Card"-structured data. In theory, I
>>suppose the Euphoria "compiler" could try to get really clever -- it knows
>>that "Deck[1]" contains "Card"-structured data, so it "allows" the field
>>references to be attached to the object "one_card" in this particular case.
>>In practice, however, this would be a *nightmare* to implement, especially
>>if the value is passed back via a function:
>
>Well, how does Euphoria know that an object is a sequence or atom? It would
>also know if it is a structure or not. Think about this:
>
>object a
>a = 1
>? a[1]  -- Error!
>
>Why on Earth would it do that? Could it be that Euphoria somehow magically
>knew that a was currently an atom? So why would it be so hard to keep this
>from happening?
>
>object a
>a = 1
>? a.Suit
>
>Same basic principle: Access a part of a that doesn't exist. Both cases, a
>crash.

Yes, but a crash at *runtime*. Structures in every other language can be
verified at "compile" time. Euphoria won't flag "? a[1]" as an error until
the statement is actually executed. Please, try this on your own -- create a
Euphoria executable containing *just* this code:

procedure try_this()
   object a
   a = 1
   ? a[1]
end procedure

Notice that we're *not* calling try_this() from anywhere, so we're not
executing anything. The only way "? a[1]" can be flagged as an error is if
the "compiler" catches it. Now run the executable. Well, what do you know,
no errors!

Again, every other language can verify its structure field references at
compile-time -- please refer back to the QuickBasic example given at the
beginning. Euphoria's "compiler" would be unable to do this. True, it lets
subscript references like "a[1]" go by, but subscripts at least potentially
apply to *any* sequence within the program. Structure field names would
*only* apply to their specifically-defined structures -- yet they, too, must
be allowed *anywhere* in the program. Programmers from other languages would
wonder why Euphoria is unable to do "compile"-time field reference checks.

>>object one_card
>>one_card = 1
>>? one_card.Suit
>>
>>As anyone can see, this is ridiculous. There is no way that "one_card" can
>>possibly contain "Card" data here. But "one_card" has the *potential* of
>>holding "Card"-structured data, so this *must* be allowed to pass. This will
>>cause an error at runtime, of course, but the point is that this shouldn't
>>be allowed past the "compile" step in the first place.
>
>
>Euphoria wouldn't allow one_card[1] to pass (at runtime). But it has the
>*potential* of holding sequence data, so it *must* be allowed to pass,
>right? It would crash at runtime. I don't see any problem here. Well, this
>problem already exists... So I don't see any NEW problem with the crashing
>with structures.

With "one_card[1]", Euphoria doesn't have to do any major type-check work to
determine if "one_card" is a sequence or atom. For "one_card.Suit", how will
Euphoria know if "one_card" contains "Card"-type data? Why, it will have to
do a type-check, of course -- EVERY TIME IT'S REFERENCED WITH A FIELD NAME.

object one_card
one_card = Deck[1]
? one_card.Suit  -- type-check required here
? one_card.Value  -- type-check required here, too

This would be a MAJOR performance-killer. Kinda goes against argument #2 for
structures, doesn't it?

True, the "without type_check" directive could turn this off. But this would
(again!) create major changes in the way Euphoria does business. Currently,
the only time it needs to do user-defined type-checks is when a value is
assigned to the variable -- now it needs to do them just to *reference* the
variable!


>And this is actually the only really good argument against structures: They
>just don't fit. As was pointed out, we need "something".
>
>Something that fits these three parameters, if possible:
>
>Is faster when not being resized. Remember, there is a lot of code that uses
>a sequence to emulate a structure. The 2nd level, usually, never grows or
>shrinks. Could the speed to access such a non-dynamic part of a sequence be
>sped up? Is it already like that?

This might be a possibility. However, Euphoria's current design allows for
changes to *any* of the sequences. Exceptions would require extra logic,
which would kill speed.

>Has type-checking: What if the data of a sequence, which somewhere has an
>atom where it should have a sequence, is sent to a generic save-to-disk
>routine? Then when it is loaded, the corrupted data will crash when sent as
>an argument to say pixel() as the coordinates. This could be a tough bug to
>track down, especially in a large program saved to disk by Edom. (Data isn't
>human-readable on disk) trace() would be the only hope, and it WOULD work,
>it would just take a bit of work.

Type check the data before you access it. I don't understand why
pro-structure people are so allergic to big type-checks. They're only
performance killers when type-checking is *on*. That should be perfectly
fine when you're testing your program. When you're confident your program is
stable, you can specify "without type_check" at the beginning of your
source. It's true that this *could* be a problem if you haven't plugged all
the code leaks in your program. That's why type-checks are so cool to begin
with. Structures are an attempt to "force" Euphoria to rigidly "lock" the
types of data it can use and pass around. But structures would *still*
require type-checks -- they'd just be hidden under the surface of the
"structure" syntax. Plus, they'd introduce all sorts of other exceptional
conditions that sequences don't have to deal with, so I don't see what we're
gaining with structures.

For problems as described above -- an incoming file that has somehow become
corrupted -- just type-check your incoming data *once* before passing it on
to the rest of your program. You know, call the type like a function to
verify that the data is legal. This way, if there's a problem, you get a
very easy-to-understand type-check error -- it's stopped before it gets too
far into your program, and doesn't become "tough to track down". You still
seem to be thinking in terms of other programming languages -- *they* don't
have anything like type-checks or structure verification, so it doesn't
occur to you that Euphoria's type-checking can catch this *before* it
becomes a major problem. (I hope I'm not being too presumptuous here...)
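
As a rough sketch of that "check it once" idea: the record layout {color, {x, y}} and the names saved_pixel and check_loaded below are made up for illustration and are not taken from any real program.

```euphoria
-- Hypothetical record loaded from disk: {color, {x, y}}
type saved_pixel(object x)
    if not sequence(x) then return 0 end if
    if length(x) != 2 then return 0 end if
    if not integer(x[1]) then return 0 end if
    if not sequence(x[2]) then return 0 end if
    if length(x[2]) != 2 then return 0 end if
    if not integer(x[2][1]) then return 0 end if
    return integer(x[2][2])
end type

function check_loaded(object data)
    -- Call the type like a function: 1 if the data is legal, 0 otherwise
    return saved_pixel(data)
end function

? check_loaded({15, {10, 20}})   -- prints 1
? check_loaded({15, 10})         -- prints 0 (coordinates were corrupted)
```

Bad data is rejected right where it is loaded, long before it could reach something like pixel().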

Euphoria type-checks are the best I've found in *any* language. You can
restrict any value to any structure or range you desire. Instead of being
tied to machine-types like other languages, Euphoria allows you to logically
define your types in any way you need to. Yes, this kind of type-checking is
slower. But the flexibility and runtime safety that's gained, IMHO, is worth
the exchange.

>Eliminates name-space conflicts. If two pseudo structures have the same
>field, LIFE, then that field has to be in the *SAME SPOT* in each structure,
>because we can't have two constant declarations of LIFE. This can be quite
>irritating, when you have to reorder a logically based constant order
>because you need another pseudo structure to have access to LIFE. I have run
>into this porting C code to Euphoria. It can be worked around, but it is
>*VERY* annoying.

I agree with this point. But I don't see this as a structure issue -- more
of a NAMESPACE issue instead. Perhaps solutions to namespace problems will
eliminate the need for structures?
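
For reference, the usual workaround today looks something like the sketch below. The PLAYER_ and MONSTER_ prefixes and the field layouts are purely hypothetical; the point is only that the clash comes from the constant names, not from the sequences themselves.

```euphoria
-- Hypothetical pseudo structures; each gets its own prefixed constants,
-- so LIFE no longer has to sit in the same spot in both
constant PLAYER_LIFE = 1, PLAYER_NAME = 2
constant MONSTER_NAME = 1, MONSTER_LIFE = 2

sequence player, monster
player = {100, "Hero"}
monster = {"Orc", 40}

? player[PLAYER_LIFE]     -- prints 100
? monster[MONSTER_LIFE]   -- prints 40
```

It works, but the prefixes are exactly the kind of clutter that a real namespace mechanism could remove.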

Gabriel Boehme