Modification to Variable Scopes

Euphoria 4.0 allows variables to be defined within code blocks, with their scope limited to those blocks. However, although such variables cannot be referenced outside their block, they are not dereferenced when the scope is exited. This has implications both for copy-on-write, where a lingering reference can force an unnecessary copy and cost performance, and for automated cleanup, which relies on the dereferencing of Euphoria objects.

The following is an idea for resolving this issue.

New SymTab Type

In addition to temps, literals, variables, constants and subprograms, a scope entry could be used to track the smaller scopes with which variables are associated. Currently, symtab.e tracks these smaller scopes, and the variables in each of them, using the scope_stack variable. For each entry in the stack, a new SymTab entry of type M_SCOPE would be made.

Similar to the chain of symbols in the S_NEXT field of a SymTab entry, a new SymTab field for variables, S_NEXT_IN_SCOPE, could be added, with the effect of creating a linked list of all variables in the scope. The scope entry would point to the first symbol in the scope, and the last variable in the scope would have a 0 for that field.
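The chaining described above can be modelled in a few lines. This is only a sketch in Python, not front-end code: the symbol table is a plain list, index 0 is reserved so that 0 can mean "no symbol", and the helper names (new_symtab, add_scope, add_variable, variables_in_scope) and the three-field entry layout are assumptions made for illustration. Only S_NEXT_IN_SCOPE and M_SCOPE come from the proposal itself.

```python
# Hypothetical, simplified entry layout: [name, mode, next_in_scope]
S_NAME = 0
S_MODE = 1
S_NEXT_IN_SCOPE = 2   # proposed field: next variable in the same scope, 0 ends the chain

M_NORMAL = "M_NORMAL"  # ordinary variable
M_SCOPE = "M_SCOPE"    # proposed scope entry


def new_symtab():
    # Slot 0 is unused so that an index of 0 can stand for "no symbol".
    return [None]


def add_scope(symtab):
    """Create an M_SCOPE entry with an empty variable chain and return its index."""
    symtab.append(["<scope>", M_SCOPE, 0])
    return len(symtab) - 1


def add_variable(symtab, scope, name):
    """Create a variable and link it at the tail of the scope's chain."""
    symtab.append([name, M_NORMAL, 0])
    sym = len(symtab) - 1
    tail = scope
    while symtab[tail][S_NEXT_IN_SCOPE] != 0:
        tail = symtab[tail][S_NEXT_IN_SCOPE]
    symtab[tail][S_NEXT_IN_SCOPE] = sym
    return sym


def variables_in_scope(symtab, scope):
    """Follow the chain from the scope entry and collect variable names."""
    names = []
    sym = symtab[scope][S_NEXT_IN_SCOPE]
    while sym != 0:
        names.append(symtab[sym][S_NAME])
        sym = symtab[sym][S_NEXT_IN_SCOPE]
    return names
```

Appending at the tail keeps the variables in declaration order. Pushing each new variable onto the head of the chain would be cheaper and would work equally well if the order of cleanup does not matter.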


In order to use the new scopes, a new opcode, EXIT_SCOPE, could be created. It would take a pointer to the SymTab entry of the scope being exited. It would then loop through the variables in that scope, dereference them, and set their values to NOVALUE, in similar fashion to what is done at the end of a routine.
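As a rough illustration of what such an opcode would do, the following Python sketch walks a scope's chain, dereferences each value, and resets the slot. The Obj class, the deref helper, and the two-field entry layout are stand-ins invented for this example. They only mimic reference counting and are not the back end's actual data structures.

```python
NOVALUE = object()  # stand-in for the interpreter's "no value" marker

# Hypothetical, simplified entry layout: [value, next_in_scope]
S_OBJ = 0
S_NEXT_IN_SCOPE = 1


class Obj:
    """Stand-in for a reference-counted object such as a sequence."""

    def __init__(self, data):
        self.data = data
        self.ref = 1
        self.cleaned_up = False


def deref(obj):
    """Drop one reference; run cleanup when the count reaches zero."""
    obj.ref -= 1
    if obj.ref == 0:
        obj.cleaned_up = True  # automated cleanup would run at this point


def exit_scope(symtab, scope):
    """Model of EXIT_SCOPE: release every variable chained to the scope entry."""
    sym = symtab[scope][S_NEXT_IN_SCOPE]
    while sym != 0:
        entry = symtab[sym]
        value = entry[S_OBJ]
        if isinstance(value, Obj):  # plain integers carry no reference count here
            deref(value)
        entry[S_OBJ] = NOVALUE
        sym = entry[S_NEXT_IN_SCOPE]
```

In this model, a value that is also referenced from outside the block keeps a count above zero and survives, while a value held only by a block variable is cleaned up as soon as the scope is exited.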

IL Code Implications

Using the EXIT_SCOPE opcode requires the code emitter to know when a scope is being exited. At times, multiple scopes might be exited at once, as when an exit from a loop happens within an if block (or in any other nested block situation, or with a goto).

The EXIT_SCOPE opcode would probably need to be emitted at the end of an if/switch/for/while/loop block, so that under normal circumstances the variables are cleaned up. However, it might be better in these cases to handle that as part of the END_ opcodes. Consider a while loop that never executes: there is no need to dereference any variables defined within the while block in that case.

An alternative is to emit the opcode at the source of the exit, such as a break or exit. It might be necessary to emit multiple EXIT_SCOPEs if multiple scopes are exited.
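A sketch of that alternative, in Python: when the emitter sees an exit, it walks its stack of open scopes from the innermost outward and emits one EXIT_SCOPE per scope until it has covered the loop being left. The emit_exit function, the tuple representation of IL code, and the JUMP_OUT placeholder for the actual jump are all assumptions made for illustration.

```python
def emit_exit(code, scope_stack, loop_scope):
    """Emit cleanup for every scope left by an exit statement, then the jump.

    scope_stack lists the SymTab indexes of the open scopes, outermost first.
    loop_scope is the scope of the loop that the exit statement leaves.
    """
    if loop_scope not in scope_stack:
        raise ValueError("exit used outside of the target loop")
    for scope in reversed(scope_stack):
        code.append(("EXIT_SCOPE", scope))
        if scope == loop_scope:
            break
    code.append(("JUMP_OUT",))  # placeholder for the real jump opcode
    return code
```

For example, with a routine scope 1, a while scope 4 and an if scope 7 open, emit_exit([], [1, 4, 7], 4) yields EXIT_SCOPE for 7, then EXIT_SCOPE for 4, then the jump. The routine's own scope is left alone, since it is still live after the loop ends.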

Backend Considerations

A return from a routine would only dereference those variables in the routine's "top" scope, since any cleanup on smaller scopes would already have been done. We could embed logic in the end-of-loop ops to cause cleanup (when necessary). Basically, we'd put a label inside the EXIT_SCOPE opcode, and use a goto to clean up the scope. The caller would increment pc, set obj_ptr (or whatever EXIT_SCOPE uses to identify the scope in question), and goto the code that does the actual work.
