Feature requests for Eu 2.5.
- Posted by Christian.CUVIER at agriculture.gouv.fr Aug 19, 2003
Beware the long post...

The following is a commented list of features that I'd find desirable in Eu 2.5.

A/ Variable management

1/ Pass by reference in routines
Description: when some code calls a routine, the routine may update some variables whose references were passed to it.
Syntax:
    x=funcA(sequence s, update integer i,...)
Here, even if funcA modifies s, these changes will be lost on return, while the changes to i will be preserved.
By default, for routines that are not types, arguments are constant. A "const" keyword could be used to make things explicit.
By default, types should have their arguments passed by reference, and the keyword "const" can be used to override this.

2/ Variable sharing
Description: enable two or more routines to access a symbol that is neither local (file-wide) nor global.
Syntax:
    share x with rt1 [as y] [in file]
When this code is found in the declarations of routine rt2, the symbol x is also made available to the routine rt1.
x would be seen in rt1 as x if no "as" clause is there, or as y otherwise.
The routine rt1 may be in another file, in which case the "in file" clause is needed.
x must have been declared before being shared.

3/ Static variables in routines
Description: Allow routines to keep track of the values of private symbols between executions.
Syntax:
    static <type> <var-list>
In order for a static variable to be initialized before its first read, uninitialized variables should be handled.

4/ Allow uninitialized symbols to be passed to and returned from routines.
This is so that initialization routines can be used. Obviously, reading the value of an uninitialized symbol will raise an error.

5/ Arraying of symbols.
Description: allow symbols to be referenced by a sequence.
Syntax:
    array arrname v1[,v2,....]
This would create a sequence arrname whose first element aliases identifier v1, the second, if any, the symbol v2, and so on.
This statement is a declaration statement.
The idea of this, as well as its usefulness, comes from the SAS language.

6/ Namespace hierarchy
Description: Ensures that two symbols without a namespace don't collide when this was not meant.
When two or more global symbols are identified by the same string, Eu considers it an error, since it can't tell which one is meant. But such collisions may come from conflicting names in unrelated libraries.
A simple solution is to adopt rules for choice, issuing a warning for the programmer's information. The rules could be:
a/ A conflict between explicitly namespaced symbols is an error condition;
b/ Let the distance between two files be the minimal number of include statements that allows the two symbols to clash. Then, only consider the symbols which were defined in the closest file(s).
c/ In case of ties, a linear link will take precedence over a broken link. A linear link means that the symbol is defined in a file that directly or indirectly includes or is included by the one in which the reference is being resolved.
d/ In case of a tie, and if the links are of the same direction, an error must be raised. It means that two libraries define the same symbol, and more info is needed.
e/ The remaining case is when a symbol is defined both in a file including the current file and in a file included in the current one. Then, the downward link (the latter) is to be preferred.

7/ Scoped symbols
Description: some symbols may be defined in a portion of routine/main code only.
Syntax:
    scope
        <var decls>
        ...
    endscope
Note that C-style braces, much easier to type, can't be used in Eu.

B/ Routine management

1/ Forwarding.
Description: Allows a routine to be used before it is defined.
Syntax:
    forward function f([args])
    ....
    x=f(something)
    ...
    function f
        ... code defining the function ...
    end function
The alternative in 2.4 is to use call_proc/call_func, which obfuscates the code.
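To show what the 2.4 workaround costs in readability, here is a rough model of the indirect-call pattern. It is written in Python, not Eu, and only imitates the idea: routine_table, and the Python versions of routine_id and call_func, are stand-ins I made up for illustration (the real routine_id takes a routine name as a string, not a function object).

```python
from typing import Any, Callable, List

# Hypothetical stand-in for the interpreter's routine table.
routine_table: List[Callable[..., Any]] = []

def routine_id(fn: Callable[..., Any]) -> int:
    """Register fn and return its integer handle (model only)."""
    routine_table.append(fn)
    return len(routine_table) - 1

def call_func(rid: int, args: List[Any]) -> Any:
    """Call the routine behind handle rid with the given argument list."""
    return routine_table[rid](*args)

f_id = -1  # placeholder: f does not exist yet at this point

def caller(x: int) -> int:
    # f is "not yet defined", so the call has to go through its handle.
    return call_func(f_id, [x]) + 1

def f(x: int) -> int:
    return 2 * x

f_id = routine_id(f)   # only now can the handle be filled in
result = caller(5)
```

With a forward declaration, caller could simply contain f(x) + 1, and the f_id bookkeeping would disappear.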
2/ Allow discarding of function return values.
Description: It is sometimes useful to call a function as if it were a procedure.
Syntax:
    ~thefunc([args])
where thefunc is a function.

3/ Return of several symbols.
Description: Allow a function to simultaneously update several variables.
Syntax:
    {x,y,....z}=f([args])
f must return a sequence, each element of which is assigned to the corresponding variable. Extra returned values are to be ignored.

4/ Optional parameters.
Description: Allow the optional specification of parameters in routine calls.
Syntax:
    <routine> r([normal parms])([optional parms])
And, when called, the optional parameters may or may not be passed.

5/ Default values for arguments.
Description: Allow the most frequent value of a routine argument to be omitted.
Syntax:
    <routine> r(integer x=32767,sequence s)
        ....
    end <routine>
    ...
    y=r(,s) --argument x is 32767
Obviously, the comma is optional when no defaulted argument is followed by a non-defaulted argument.

6/ Named parameters
Description: Allow parameters to be passed as name=value. Named and unnamed arguments cannot be mixed in just any way.

7/ Nested routines
Description: For nested routines, the code of the routine they are defined in behaves exactly as the code outside a routine does for a non-nested routine.
Syntax:
    sequence t
    .....
    routine r1(...)
        integer i
        sequence s
        routine r11(....)
            integer i
            ...
        end routine
        ...
    end routine
In the example above, both routines r1 and r11 can see the sequence t as a public variable. If i is defined outside r1, r1 does not see this symbol. It defines another i that shadows the public i. Likewise, r11's i shadows r1's i.
Retrieving the value of a shadowed symbol may require a special syntax.
Normally, a nested routine is only called by the routine in which it is nested. This limitation might be relaxed in some cases.

8/ Routine_ids for builtins.
Description: routine_id() will return a value even for built-in routines.

9/ Routine redefinition.
Description: Coding a routine with the same name as a built-in or otherwise previously specified routine would not cause an error, but the new routine would be called instead of the former, hijacking its routine_id.
Syntax: No special syntax required.
    recover <routine decl>
would undo all previous redefinitions.

Note that nested routines, shared and static variables will greatly reduce the use of global symbols, hence alleviating the namespace collision problem.

C/ Instruction flow control

1/ Optional argument for the exit statement.
Description: Breaks out of several loop levels at the same time.
Syntax:
    exit [arg]
Arg may be:
a/ A positive number, which is the extra number of loop levels to leave.
b/ A negative number, counting the loop levels from the top down. Thus, exit -1 means "exit the topmost loop".
c/ exit 0 can be tolerated as a synonym for plain exit.
d/ A label name, provided loops can be labelled.
e/ A for loop index variable name.

2/ Next statement
Description: Skips the rest of the designated loop and starts a new iteration.
Syntax:
    next [arg]
Same choices of [arg] as for exit.

3/ Retry statement
Description: Restarts the current iteration of the designated for loop.
Syntax:
    retry [arg]
Same rules as above.
retry does not quite make sense for while loops, since it would duplicate next. So, when counting loop levels, it will count for loops only. This special behaviour might not be desirable if a repeat ... until construct is implemented, because retry and next would have different meanings there.

4/ Exif statement
Description: Same as exit, but applies to if blocks.

5/ Select statement
Description: Allows a decision to be made in more than two ways.
Syntax:
    select <expr>
        case x1:
            code to execute if the value of expr is x1
        case x2:
            ...
        case x3 thru x4:
        case <5
        case f(v,_)=0:
            ....
        [otherwise ....]
    end select
Each option inside the select statement is a case statement. A wide range of ways to specify conditions can be devised, including using the anonymous _ symbol in complex constructs.
The optional otherwise clause is executed if no case statement caused the program flow to break out of the select statement.
Flow goes from one case statement to the next, except when the break keyword causes control to be passed to the statement following the closing end select statement.

6/ xwhile loop
Description: same as a while loop, except that exit occurs as soon as the condition specified in the xwhile statement is no longer true, without the need for endlessly repeating tests.

7/ Exception handling
Description: When some condition occurs, execute a specific handler.
Syntax:
    setExcHandler(condition_code,handler_routine_id)
The handler may access and modify any variable in scope when the handler is invoked.
The handler may abort the program, reexecute the last instruction or return normally.
Condition codes need not relate only to errors.

8/ Guards
Description: Check for conditions inside a given scope and execute code when this happens.
Syntax:
    on/when/whenever <condition> do
        ...
    end do
The scope of the guards may vary:
- on: current block only
- when: current routine
- whenever: from now on

9/ Dynamic code execution.
Description: Allow execution of text generated or fetched somewhere.
Syntax:
    execute(string)
The string is interpreted as if it had been loaded in memory in the first place. Note that the include statement allows this, but only outside loops or routines.

10/ Selective type checking.
Description: Right now, either every assignment invokes type checking routines, or some supposedly minimal amount, not known to the programmer, is performed. The idea is to specify variables that will be checked even when "without type_check" is in effect.
Syntax:
    check <variable decl>
The check prefix is ignored when type_check is on.

11/ Additional type checking.
Description: On assignment, specified variables would go through a user-definable set of validity checks. The user can add or remove such checks, performed regardless of the type_check flag.
Syntax:
    check(i1,var name/id)
    add_check(i2,var name/id)
    del_check(i2,var name/id)
    uncheck(var name/id)
These do what they say, i1 and i2 being routine_ids of additional checking functions.
Additional type checks are coded just like ordinary type functions.

12/ Watch facility
Description: When a given variable is read, its watch function is invoked, returning the value the variable (possibly) holds.
Syntax:
    watch <variable decl>
The routine invoked will be a rtype routine, which follows the same patterns and rules as type functions.
Example:
    rtype integer (integer x)
        if x<0 then return 0
        else return x
        end if
    end rtype
would make all negative integers appear as 0.

D/ Sequences and slices

1/ Negative indexes
Description: inside any sequence index specification, a negative value might be used to count the elements backwards.
So, longsequencename[length(longsequencename)] could be coded longsequencename[-1].

2/ More slices
Description: slices might appear in index specifications in other places than just the last.
Thus, matrix[..][3] would be the 3rd row of a column-based matrix.

3/ Shorthands for "length(this)"
Description: Allow more flexible coding of slices that extend to the end of a sequence.
Syntax:
    matrix[..][3]   --see above
    stack[$]        --last element of stack
    word[2..]       --chop first letter off word

4/ Composition of sequences
Description: defines a subsequence using a sequence of indexes.
Syntax:
    t={3,-1,5,3}
    s=<some sequence>
    s1=søt
s1 is a sequence of length 4: {s[3],s[-1],s[5],s[3]}
Pairs in t could specify slices.
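To make the intended result of negative indexes and composition concrete, here is a small model in Python (not Eu, since none of this syntax exists yet). The helper names eu_index and compose are hypothetical, indexes are 1-based as in Eu, and the pair-as-slice extension is left out.

```python
from typing import Any, List, Sequence

def eu_index(s: Sequence[Any], i: int) -> Any:
    """1-based lookup; a negative i counts backwards, so -1 is the last element."""
    n = len(s)
    if 1 <= i <= n:
        return s[i - 1]
    if -n <= i <= -1:
        return s[n + i]
    raise IndexError(f"index {i} out of bounds for a sequence of length {n}")

def compose(s: Sequence[Any], t: Sequence[int]) -> List[Any]:
    """Model of the proposed composition: one element of s per index in t."""
    return [eu_index(s, i) for i in t]

s = [10, 20, 30, 40, 50]
t = [3, -1, 5, 3]
s1 = compose(s, t)  # {s[3], s[-1], s[5], s[3]} -> [30, 50, 50, 30]
```

Index 0 is rejected in this model, which seems the natural choice when positive indexes start at 1 and negative ones at -1.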

5/ Dynamic indexing
Description: Allow variable length index specification.
Syntax:
    s={1,3,2}
    t=u[[s]] --t=u[1][3][2]
Again, pairs could be used in s to specify slices.

6/ Sequence manipulation routines
Description: replace, insert and move elements around in a sequence.
Syntax:
    replace(seq,places_list,subst_list)
        --places_list is a list of pairs of indexes delimiting subsequences in seq.
        --Each of these will be replaced by the matching element in subst_list.
    insert(seq,where,x)
        --x is inserted at position where in seq, whose length increases by 1.
    inserts(seq,where,x)
        --if x is a sequence, its elements are inserted starting at position where in seq.
        --If x is an atom, same as insert.
    move(seq,start,end,where)
        --the subsequence seq[start..end] is moved so that it starts at where.
    remove(seq,where)
        --where is an index or a pair of indexes specifying element(s) to be removed from seq.
Note that all these could be just some functions in misc.e. Their being built-ins would probably increase speed.

E/ Object programming capabilities.

1/ Structures
Description: define some fixed layout for specified types of sequences.
Syntax:
    struct customer(sequence name,integer zipcode,integer lastamount,...)
This defines a type which could be used just as any type.

2/ Classes
Description: classes are structures with methods, which are routines applying to the underlying struct.
Classes are supposed to inherit from other classes, and they may have virtual methods.

3/ Dot notation
Methods apply to objects, which are just some special types. The most commonly known syntax to apply something to something is entity.method([args]). There is always a first hidden argument to any method, which is the entity it is applied to. This could be referenced by the keyword self.
The dot notation could be extended to ordinary routines.
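For readers who have not met the hidden first argument before, here is how it looks in Python, which happens to work this way already. This only illustrates the idea, not how Eu would implement it; the field names come from the struct example above, while the method owes_more_than is made up.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    # Fields taken from the "struct customer" example.
    name: str
    zipcode: int
    lastamount: int

    def owes_more_than(self, limit: int) -> bool:
        # self is the entity the method is applied to.
        return self.lastamount > limit

c = Customer("Smith", 75001, 120)

dotted = c.owes_more_than(100)           # entity.method(args)
plain = Customer.owes_more_than(c, 100)  # same call, entity passed explicitly
```

Both calls run the same routine with the same arguments, which is why extending the dot notation to ordinary routines would mostly be a matter of syntax.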

F/ Miscellaneous

1/ C-style pre- and post-increment/decrement operators.
2/ Allow concatenation of logical relations, such as 0<=x<=9.
3/ Allow assignments inside conditions. The symbol := could be used here, since = has a relational meaning.

I most likely forgot a few useful things, but these would make programming with Euphoria so much more comfortable and efficient, without breaking any existing code (oh, unless the namespace thing hurts EuGTK, don't know for sure).