Structures
- Posted by ddhinc <ddhinc at ALA.NET> Feb 02, 1999
Hello all,

It's been a while since I've been active on the list here; work projects have left me little time to pursue programming for fun. I've still followed the list passively when time has permitted because I still find Euphoria a fascinating topic.

The subject of structures has been of particular interest to me. It's one of those topics that manages to resurrect itself every couple of months or so and stir up heated discussions on both sides of the fence. I'm admittedly biased, but until recently I didn't feel that the "anti-structure" camp had made many compelling arguments about how the addition of structures would "damage" Euphoria. Arguments centered around the idea that if the programmer could emulate something then there was no need to add a new feature, regardless of the problems that the emulation incurred. I'm glad this idea was never a dominant theory in computer science or we'd all still be coding entirely in assembly ;)

For me, a couple of the recent posts are the first to really show that the ideas put forth so far concerning structures haven't addressed all the issues they needed to. Type checking has been one of the leading arguments *for* adding structures to Euphoria, but not all aspects of how the typing of structures should be handled have been investigated.

When I first became interested in the idea of adding structures to Euphoria some time ago, I posted to the list a bit of "conceptual" code to try to work out how structures could be handled. It delves into a few more details about type checking than many of the other examples posted about structures, including the use of structures as parameters, etc. You can find it in the mail archives here:

Gabriel Boehme's post made me aware that I had neglected to think through an important area: sequences containing structures (even though I had gone as far as pointing out that they would be possible).
Specifically, I had missed the question of how they could be accessed, given that the containing sequence might have mixed data within it.

-------------------------------------------------------------------------
-- here's the problem:

structure my_struct1
begin
    sequence a_member
end structure

structure my_struct2
begin
    integer a_member
end structure

sequence a_sequence
atom an_atom
my_struct1 a_struct1
my_struct2 a_struct2
integer some_var
. . .
a_sequence = {a_struct1, a_struct2, an_atom}
. . .
a_sequence[some_var].a_member = "foo" -- Bad idea!
-------------------------------------------------------------------------

Jacques Guy's idea that the structure's declaration should implicitly generate a type-checking function as well would alleviate this problem, making the following possible:

    if my_struct1(a_sequence[some_var]) then
        a_sequence[some_var].a_member = "foo"
    end if

Such an assignment would be quite slow, though, because the interpreter can't just assume you were thoughtful enough to perform all the necessary type-checking. On every pass it would have to:

1. Obtain the type of a_sequence[some_var], and verify that it is a structure.
2. Verify that a_sequence[some_var] has a member named a_member (this will be very slow: it can't hard-code an index number because it has no idea which structure is being used, so it must look up the index number every time).
3. Perform the normal type-check to verify that the member can hold what's being assigned to it.

The alternatives (that I've thought of) to this are not very attractive... One is to disallow access to the members of a structure buried within a sequence, forcing the programmer to shuffle the structure out to a temporary variable of that type in order to read or modify its members. The other is to treat the structure like a sequence and access its members through the [] operator...
although I'll explain later why this should be allowed, it's not the ideal solution to this particular problem because it defeats the usefulness of having structures to begin with... it reinstates the need for either "magic-number" indices, or constants that are liable to suffer from namespace conflicts.

The automatic formation of the type-check function for the structure also addresses another of Mr. Boehme's concerns: "What is the type of a structure?" Every structure definition is its own new type... Using the declarations from the example above, an assignment such as this would be illegal because the two variables have different types:

    a_struct1 = a_struct2

Even if the two structures contained identical member types in the same order, it would not be safe to allow this. The problem is that if the author later comes in and adds something to my_struct1 without doing the same to my_struct2, the code will break all over the place. It's better to have the assignments take place explicitly on a member-by-member basis.

Structures should be able to be *converted* to sequences through a simple assignment:

    a_sequence = a_struct1

But the reverse shouldn't hold true, for the same reason that assigning structures of different types is a bad idea.

Also, Mr. Boehme wrote that "functions like prepend(), reverse(), and upper()" would cause some problems if structures were treated as sequences, but this one-way conversion policy prevents that.

    some_sequence = append(some_struct, 12) -- perfectly ok.
    some_struct = append(some_struct, 12)   -- illegal, the data returned
                                            -- by append() is a sequence
                                            -- and cannot be assigned to
                                            -- the structure.

Remember earlier I said that structures should allow you to access their members with the [] operator... This is so that structures can be treated like a sequence when the situation calls for it. This solves Mr.
Boehme's concern over what would happen if a structure were for some reason shuffled through old code like this:

    if sequence(x) then
        sum = 0
        for i = 1 to length(x) do
            sum = sum + x[i]
        end for
    else -- x must be an atom
        sum = x
    end if

At first, you might balk at the idea of having a structure act like a sequence in a situation like this, but I'd argue that passing a structure through it unintentionally is more of a logic error (read as programmer stupidity ;) than a type violation. In fact there may be a few situations where it's beneficial... Suppose x was a structure holding a list of golf scores for several days.

Treating a structure like a sequence also solves a much bigger problem that would arise from old code... updating old libraries. Allowing a structure to be accessed in this manner allows authors of libraries to redo/update their old libraries to use structures independent of the old code that uses them. All that would be required is that the library retain the myriad of constants that were used to index the old emulated structures...

    -- old indices, must be maintained to support old code
    constant BAR = 1
    constant BAZ = 2

    -- old foo structure
    -- sequence foo

    -- new foo structure
    structure foo
    begin
        integer bar
        atom baz
    end structure

    -- Allows both old and new syntax for accessing the structure
    foo[BAR] = 17  -- works, although type checking isn't there
    foo.baz = 12.3 -- new code benefits from reduced potential
                   -- of namespace conflicts, and automatic type-checking

With this allowance the user of the library can update their old code at their leisure when updating to a new library instead of having to redo it all at once.

Lastly, Jacques Guy posted the following:

> Nothing prevents a sequence from containing routine id's.
> For instance:
> [Euphoria with records]
>
> record point
>   integer x,y
>   procedure draw()
>     -- your code here
>   end procedure
> end record
>
> point a
>
> a.x=1 a.y=1
> a.draw()

This goes a little beyond just including the routine_id in a structure. To do that you'd simply use an integer to hold the id and use call_proc() or call_func() to invoke the "method". This is really a simplistic beginning to a class implementation.

I'd actually love to see Euphoria implement a class construct, with the features that make classes very powerful... maybe along the lines of something based on the way Python handles them (with improvements... Python still has a few quirks in this area), but certainly friendlier than the way C++ handles them. There would then be no need to have structures because classes could handle what structures are intended to do. I really can't foresee this coming to pass though; implementing classes well would more than likely require a complete rewrite of Euphoria's core, and would probably yield a slower overall execution speed than Euphoria currently enjoys.

Regards,
Christopher D. Hickman
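
As a rough sketch of the routine_id approach mentioned above (an integer element holding the id, invoked with call_proc()), here is how it might look with an emulated structure in plain Euphoria. The names POINT_X, POINT_Y, POINT_DRAW and draw_point are made up for illustration and don't come from any library; this is one possible arrangement, not a proposal for how structures themselves should work.

```euphoria
-- Sketch only: POINT_X, POINT_Y, POINT_DRAW and draw_point are
-- hypothetical names chosen for this example.
constant POINT_X = 1, POINT_Y = 2, POINT_DRAW = 3

-- The "method": an ordinary procedure that receives the emulated structure.
procedure draw_point(sequence p)
    printf(1, "point at (%d,%d)\n", {p[POINT_X], p[POINT_Y]})
end procedure

sequence a
-- The third element is a plain integer holding the routine id.
-- routine_id() must come after the procedure it names has been defined.
a = {1, 1, routine_id("draw_point")}

-- Rough equivalent of a.draw(): fetch the id and pass the structure itself.
call_proc(a[POINT_DRAW], {a})
```

Note that the index constants are still needed here, so this arrangement keeps the namespace and type-checking problems that the structure proposal is meant to remove.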