Re: Problems with structures
- Posted by Robert Pilkington <pilking at BELLATLANTIC.NET> Feb 04, 1999
>I agree that something *like* structures is needed and would be very useful
>in Euphoria, for very much the same reasons they're useful in other
>languages. An example from QuickBasic:

True. I'm glad the discussion on structures has matured past "We need
structures" and "Structures are from C, so I don't like them". I just hope
it stays that way.

>TYPE Card
>    Suit AS STRING * 9
>    Value AS INTEGER
>END TYPE
>
>DIM Deck(1 TO 52) AS Card
>Deck(1).Suit = "Club"
>Deck(1).Value = 2
>PRINT Deck(1).Suit, Deck(1).Value
>
>As you can see, it's very easy to understand what's going on here. You have
>a deck of 52 cards, identified here by suit and value. You don't have to
>define constants to use as subscripts (i.e. Suit = 1, Value = 2) -- you can
>give the individual portions field names to uniquely identify them. Plus,
>the compiler will *always* know if a field reference is valid or not, and
>can tell you right away if you've made a mistake. If I were to try this...
>
>DIM Hand(1 TO 7) AS STRING * 9
>Hand(1).Suit = "Spade"
>
>..the compiler would immediately flag it as an error -- the "Hand" array is
>not defined to contain "Card"-type data, so the ".Suit" reference has no
>meaning here. The same goes for all the other languages containing
>structures -- since every data element's type *must* be defined beforehand,
>the compiler can know instantly if a field reference is valid or not.

Right. Argument #1 for structures: Type checking. Argument #2 hasn't been
touched on in your message, so I'll point it out here: Speed. Structures,
in theory, should be faster than the dynamic sequence Euphoria has.

>Now, let's look at a Euphoria-style example, based on the above. [Note: in
>all my examples, I will use "?" to indicate general screen output, rather
>than the specific "?" statement.]
>
> -- we'll assume the string_9() type is already defined as a
>maximum-length-9 sequence of integer values
>
>structure Card
>    string_9 Suit
>    integer Value
>end structure
>
>sequence Deck of Card
>
> -- we'll ignore for the moment the problem of initialization...
>
>Deck[1].Suit = "Club"
>Deck[1].Value = 2
>? Deck[1]
>
>This seems just fine, doesn't it? Just throw in structure definitions and
>"of" statements and you're all set! It sure *seems* to simplify things --
>and if you don't like it, just don't use structures! That's pretty much the
>pro-structure viewpoint, and it seems quite reasonable at first glance.

Well, that's why it got so much support. Then a few of us thought about
it... There is a *LOT* of special syntax associated with structures in
other languages.

>string_9 Hand of Card
>
>What is the interpreter to do? Yes, the "string_9" type conflicts with the
>"Card" structure, but the interpreter won't know that until "Hand" is
>initialized -- at *runtime*. This can cause much confusion, since a
>definition like this won't cause a "compile" error. I suppose we could
>restrict the "of" clause to be used only with "sequence" declarations, but
>then what if we have our own user-defined sequence types that we *want* to
>use with "of"? Like this:

What about:

Card Hand            -- Define Hand as Card.
sequence Card Hand   -- Make it an "array"

Now, before you say the interpreter could see this:

sequence Card Hand

Remember: What would happen here:

type Card(atom x)
    return atom(x)
end type
sequence Card

We get an error, right? That's because the interpreter *KNOWS* that Card
has been defined as a type. Now, if it *KNOWS* that Card was defined as a
structure, then it would know what to do with the next "line". The only
problem is that Euphoria's syntax is usually pretty predictable. This is
slightly less consistent.

Now, this next argument you provide has one huge flaw in it:

>Secondly, examples such as these don't deal adequately with generic Euphoria
>objects.
>Most of the structure ideas I've seen show how we could define and
>use the structures, but don't really examine how *normal* Euphoria would
>work with them. What if I try this:
>
>object one_card
>one_card = Deck[1]
>? one_card.Suit
>? one_card.Value
>
>It seems simple enough at first glance -- the object "one_card" is assigned
>the value from "Deck[1]", which is in "Card" format. So naturally the field
>references would make sense. But how does the Euphoria "compiler" know for
>sure what's in "one_card"? How does it know that "one_card.Suit" is a valid
>field reference? After all, "one_card" is a generic object. There's no
>guarantee that it will *always* contain "Card"-structured data. In theory, I
>suppose the Euphoria "compiler" could try to get really clever -- it knows
>that "Deck[1]" contains "Card"-structured data, so it "allows" the field
>references to be attached to the object "one_card" in this particular case.
>In practice, however, this would be a *nightmare* to implement, especially
>if the value is passed back via a function:

Well, how does Euphoria know that an object is a sequence or atom? It would
also know if it is a structure or not. Think about this:

object a
a = 1
? a[1]   -- Error!

Why on Earth would it do that? Could it be that Euphoria somehow magically
knew that a was currently an atom? So why would it be so hard to keep this
from happening?

object a
a = 1
? a.Suit

Same basic principle: Access a part of a that doesn't exist. In both cases,
a crash.

>object one_card
>one_card = 1
>? one_card.Suit
>
>As anyone can see, this is ridiculous. There is no way that "one_card" can
>possibly contain "Card" data here. But "one_card" has the *potential* of
>holding "Card"-structured data, so this *must* be allowed to pass. This will
>cause an error at runtime, of course, but the point is that this shouldn't
>be allowed past the "compile" step in the first place.

Euphoria wouldn't allow one_card[1] to pass (at runtime).
But it has the *potential* of holding sequence data, so it *must* be allowed
to pass, right? It would crash at runtime. I don't see any problem here.
Well, this problem already exists... So I don't see any NEW problem with
structures crashing.

>This is just the tip of the iceberg. How about a generic sequence containing
>mixed data -- all kinds of very different structures. If we want to access
>their data, we've first got to type-check what's coming in; then, we're
>forced to define specific variables for *each* of these structures, just so
>that we can access their fields! The overhead of defining those extra
>variables will certainly make performance-conscious programmers mad.

Kinda works against that 'speed' theory of ours, eh? And initializing the
structures is also something that hasn't been worked out.

>I could go on, but I trust my point is clear. Structures will not "simplify"
>anything in Euphoria -- they will add a whole slew of new rules and
>exceptions to existing rules. Whether you *want* to use them or not will
>make no difference -- they *will* have an effect on how you write your
>programs. The kinds of structures described here *cannot* work within the
>context of Euphoria. The reason is simple -- they do not fit! These
>structure ideas all have their roots in rigid, every-data-type-defined
>programming languages, and there's no way we can introduce them into
>Euphoria -- not without a lot of fundamental changes to the language.
>
>Introducing structures into Euphoria would *not* turn it into C. But it
>would be very different from the Euphoria we all know and love.

And this is actually the only really good argument against structures: They
just don't fit. As was pointed out, we need "something". Something that fits
these three parameters, if possible:

Is faster when not being resized: Remember, there is a lot of code that uses
a sequence to emulate a structure. The 2nd level, usually, never grows or
shrinks.
Could access to such a non-dynamic part of a sequence be sped up? Is it
already like that?

Has type-checking: What if the data of a sequence, which somewhere has an
atom where it should have a sequence, is sent to a generic save-to-disk
routine? Then, when it is loaded, the corrupted data will cause a crash when
sent as an argument to, say, pixel() as the coordinates. This could be a
tough bug to track down, especially in a large program saved to disk by
Edom. (Data isn't human-readable on disk.) trace() would be the only hope,
and it WOULD work; it would just take a bit of effort.

Eliminates name-space conflicts: If two pseudo structures have the same
field, LIFE, then that field has to be in the *SAME SPOT* in each structure,
because we can't have two constant declarations of LIFE. This can be quite
irritating, when you have to reorder a logically based constant order
because you need another pseudo structure to have access to LIFE. I have run
into this porting C code to Euphoria. It can be worked around, but it is
*VERY* annoying.
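
To make the last two points concrete, here is a minimal sketch in plain
Euphoria of the pseudo-structure idiom. Every name and value in it
(PLAYER_LIFE, MONSTER_LIFE, player_data and so on) is made up for
illustration and does not come from any real program. Prefixing the
constants is only one possible workaround for the LIFE clash, and the
user-defined type is only one way to catch a malformed record early.

```euphoria
-- Hypothetical pseudo-structures: a sequence plus constants used as
-- subscripts. All names and values here are invented for illustration.

-- "Player" layout: {name, life}
constant PLAYER_NAME = 1,
         PLAYER_LIFE = 2

-- "Monster" layout: {life, speed}
-- Without the prefixes, both layouts would want a constant called LIFE,
-- equal to 2 for players and 1 for monsters. A constant can only be
-- declared once, so one of the layouts would have to be reordered.
constant MONSTER_LIFE  = 1,
         MONSTER_SPEED = 2

sequence player, monster
player  = {"Hero", 100}
monster = {30, 5}

-- A user-defined type that rejects a malformed player record before it
-- is saved to disk or handed to another routine.
type player_data(object x)
    if not sequence(x) then
        return 0
    end if
    if length(x) != 2 then
        return 0
    end if
    return sequence(x[PLAYER_NAME]) and integer(x[PLAYER_LIFE])
end type

player_data checked
checked = player            -- passes the type check

? checked[PLAYER_LIFE]
? monster[MONSTER_LIFE]

-- checked = {100, "Hero"}  -- would fail the type check at runtime
```

The check still happens at runtime, so this does not give the compile-time
guarantee the QuickBasic example has. It only moves the crash closer to the
place where the bad data was created.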