Scope

An identifier is a name given to an object (variable, constant, enum) or a routine (procedure, function, type). A declaration statement gives an object or routine its identifier.

The scope of an identifier is the part of the program within which the name can be used.

Euphoria uses lexical scope, meaning that scope can be determined by looking at the source-code before the program is run.

Each identifier name must be unique. Within any scope, attempting to use the same identifier in different declarations results in a nameclash error.

Nameclashing is easy to produce. Short meaningful names are in limited supply. It is easy to 'recycle' names; identifiers like 'x', 'pixel', 'file' will have multiple uses. Library files may have names you wanted to use. Programs written by multiple authors will often have naming conflicts.

A related problem occurs when the same identifier is used for different purposes. The resulting bugs are subtle and difficult to find because, as far as syntax goes, the program has been written properly.

Nameclash problems are reduced when: scopes are small, and scopes are isolated. The smaller the scope, the less likely there will be a problem with identifier use. When scopes are isolated, otherwise identical names can be used as identifiers.

Each file encapsulates a local scope -- invisible to the external scope of outside files. Within a file, each routine encapsulates a private scope -- invisible to the local scope of a file. Within a statement, blocks of code represent a nested scope, distinct from other scopes.

Using keywords, selected identifiers from an external scope can be made visible to selected files. The scope of a file may also be identified with a namespace qualifier. As a result, a programmer has fine control over how any identifier may be used.

Scope rules let you take a large program and view it as a collection of small, easy-to-manage parts.

Using Scope

Here is an overview of scope. Other sections will give examples and detailed explanations.

The scope of an identifier is the part of the program within which the name can be used.

Using scope involves two fundamental strategies:

  • Declare an identifier (object, routine) just before you use it
  • Restrict the scope as much as you can (as to extent and accessibility)

Before you even start:

  • Built-in identifiers are already defined and available from anywhere
  • Effectively they belong to a super, all-encompassing scope
  • Their namespace is eu

A declaration assigns an identifier, a 'name', to:

  • Objects (variable, constant, enum)
  • Routines (procedure, function, type)

There are four kinds of parts in source-code:

  • File (local)
  • Routine blocks (private)
  • Flow control blocks (nestable)
  • Outside files (external)

There are two kinds of identifiers to consider:

  • Top-level only (procedure, function, type, constant, enum)
  • Nestable (variables)

Variables can be declared in a nested scope:

  • Higher level identifiers are visible to lower level nested scopes
  • Lower level identifiers are not visible to higher level scopes

Variables can be declared in a private scope:

  • External identifiers will be hidden (shadowed)
  • External identifiers can be revealed with a namespace qualifier

Code from external files may be added to the local scope using the include keyword:

  • Top-level code will execute
  • All identifiers are hidden by default
  • Duplicated 'include' files are quietly ignored

External identifiers are made visible to the local scope using these keywords:

  • global
  • export
  • public

Included files themselves:

  • Have their own local scope
  • Follow the same rules about including other files
  • May be nested up to thirty levels deep

The keyword as is used to create a namespace qualifier which:

  • Can be used to distinguish between, otherwise, identical identifiers
  • Can reveal 'shadowed' identifiers
  • Is already provided for built-in identifiers, which belong to the eu namespace

The keyword override lets you replace an existing declaration with your own declaration.

As a last resort, the source-code will have to be re-written to eliminate all remaining nameclashes.

Built-in, Local, External

images/builtin-local-external.svg

Built-in identifiers are used like identifiers found in the standard library, except that they are part of the interpreter (for speed and convenience.) As part of the interpreter, they are available for use anywhere in a program. That is why their scope could be described as a super scope that covers all other scopes.

There is a list of built-in identifiers. Their names can still be used as identifier names in your own declarations. A new declaration will shadow the Euphoria built-in identifier; the new user-declared identifier will be used instead of the hidden Euphoria identifier.

A shadowed built-in can still be used when the eu namespace qualifier is added to an identifier name.
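As a minimal sketch of the eu qualifier (assuming Euphoria 4.x; the variable name s is only illustrative), a built-in such as length() can be called with or without the eu namespace when nothing shadows it:

```euphoria
-- main.ex
sequence s = "abc"

? length(s)      -- unqualified call to the built-in
? eu:length(s)   -- the same built-in, named through the eu namespace
```

Both lines should print 3. The qualified form is the one that stays available when a later declaration shadows the unqualified name.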

Each file represents a scope that is limited by the extent of the file itself. This is called local scope, and it finishes at the end of the file.

Each file views the scope of outside files as external -- which are all invisible by default.

Note: Each file follows the same rules for scope. (The main file (.ex) is only special in that it is the starting point of the program.)

Top-Level

images/local-private-nested.svg

Statements written into a file start out being written at the top-level. Some statements can only be written at the top-level (procedures, functions, types, constants, enums). Flow control statements (if, switch, while, loop) can be top-level or nested within other statements. Code written within another statement is at a nested-level, and nesting can continue down many levels. Variables can be declared at the top level or in a nested level.

A routine (procedure, function, type) is declared only at the top-level of a file. That means routines can not be nested inside other routines. The default scope of a routine is for the entire local scope of a file. That means you can use a routine from the first line of a file even if the routine is declared at the end of the file. The code inside a routine can call on other routines even if they are declared later in the file.
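The file-wide scope of routines can be sketched as follows (assuming Euphoria 4.x; the routine names first and second are hypothetical):

```euphoria
-- main.ex
procedure first()
    second()              -- second() is declared further down the file
end procedure

procedure second()
    puts(1, "second\n")
end procedure

first()                   -- top-level call, after both declarations
```

Here first() can call second() even though the declaration of second() comes later, because a routine is visible throughout the local scope of its file.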

Routine_id: Previous Euphoria versions did not have this convenience. A special calling mechanism based on routine_id() was used instead.

Each routine encloses a block of code (shown here as ... ). Thus: procedure ... end procedure, or function ... end function, enclose a block. The code within a routine is in a new private scope isolated from the local (top-level) scope of the file. Routine parameters and variables are invisible outside of the routine.

The default scope for a constant or enum is from the point of declaration to the end of the file. They can not be nested, and are always written into the local scope.
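A small sketch of this rule, with hypothetical names (MAX, RED, GREEN, BLUE), declares each constant and enum at the top level before its first use:

```euphoria
-- main.ex
constant MAX = 10          -- visible from here to the end of the file
enum RED, GREEN, BLUE      -- enum values start at 1 by default

? MAX      -- prints 10
? GREEN    -- prints 2
```

Placing these declarations near the top of a file keeps every later statement inside their scope.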

The default scope of a variable is from the point of declaration to the end of the scope it was declared in. For local scope this means to the end of a file. For private scope it means to the end of the routine. For nested scopes this means to the end of the code block.
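The nested case can be sketched like this (assuming Euphoria 4.x; the names n and inner are illustrative only):

```euphoria
-- main.ex
integer n = 1                -- local scope: visible to the end of the file

if n = 1 then
    integer inner = n + 1    -- nested scope: visible up to 'end if'
    ? inner                  -- prints 2
end if

-- ? inner   -- would fail here: 'inner' is out of scope after 'end if'
```

The outer variable n is visible inside the block, while inner ends with the block it was declared in.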

A code block is that part of a control flow statement between the keywords that form that statement. A single code block (shown here as ... ) is formed for these statements: while ... end while, loop ... end loop. A for ... end for statement is also written with only one code block.

A conditional statement may contain several code blocks.

  • In an 'if' statement the keywords then, elsif, else, and end if delimit blocks of code.
  • In a 'switch' statement each 'case' is a separate block of code, found between then ... case delimiters.

Example 1:

-- main.ex
atom x = 1
procedure foo()
    atom x = 222
    ? x
end procedure
foo()

Result: 222
Notes: 'x' in private scope shadows 'x' of local scope.

Example 2:

-- main.ex
namespace Top
atom x = 1
procedure foo()
    atom x = 200
    ? Top:x
end procedure
foo()

Result: 1
Notes: 'x' from 'Top' is now displayed.

In Example 2, the local scope of the file main.ex was given the namespace Top. It was used inside the private scope of a procedure to display the outside 'x' rather than the 'x' declared inside the procedure.

The keyword namespace can only be used to name the scope of a file (at the top level). Using namespace inside the private scope of a routine has no effect.

Rules Do Not Change

A reminder that rules for scope do not change--there are no special exemptions. A routine is visible for its entire valid scope. A variable is visible only from the point of declaration to the end of its scope. As programs become larger by the addition of more files, the fundamental scope rules between any two files remain the same. The good news is that while programs may get larger and more complicated, Euphoria remains simple and understandable.

Modules

From the viewpoint of any one file, all declarations in outside files are considered to be external. The default scope for any file is restricted to the file itself--that means all external declarations are invisible.

Writing an include statement has the effect of 'including' the contents of an external file at the place where the include statement is written. All top level statements from the external file will be executed. By default, all declarations will be invisible. Additional include statements for the same file will be quietly ignored (their top-level statements can only execute once.)

Any two files can have a relationship where one is "doing the including" and the other is "being included". The files being included are often just called "include files". These descriptions are a bit wordy, so we can equally consider a main file as the includer and a module file as the included. There is nothing special or exotic about 'main' and 'module'; they are just descriptions.

Modular programming: Modular programming is taking a large program and breaking it up into convenient smaller files, and then calling the smaller files modules. The scope rules being discussed then govern the visibility of identifiers between each module. Other than scope rules, modules have no special properties.

The common file extensions (.ex for main and .e for a module) have no special significance--they are just convenient labels. Euphoria programs can execute with no file extensions at all.

Example:

-- MAIN.EX
include module.e
include module.e

-- module.e
puts(1, "This is the include file" )

Result: This is the include file
Note: include happens once per file.
The top-level puts() statement from module.e is executed only once. By default the declarations in an include file are invisible:

Example 1 (MAIN.EX is doing the including, module.e is being included):

-- MAIN.EX
include module.e

-- module.e
atom x = 2
? x

Result: 2
Note: The top-level statement in the include file executes.

Example 2 (MAIN.EX is doing the including, m2.e is being included):

-- MAIN.EX
include m2.e
? x

-- m2.e
atom x = 2

Result: error, undeclared variable
Note: The 'x' declared in m2.e is invisible to MAIN.EX.

To make an identifier inside a module visible to the main (external) file, its declaration in the module must be preceded by one of these keywords: global, export, public.

The global keyword makes the declaration of an identifier visible everywhere in a program.

-- MAIN.EX
include module.e
? x

-- module.e
global atom x = 2

Result: 2
Note: 'x' is seen everywhere.

A global identifier will be seen everywhere, regardless of which file it was declared in. This is antagonistic to the idea that scope should be as small as possible. Using global variables is often considered to be bad programming style.

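Example 1:

-- MAIN.EX
include module.e
? x

-- module.e
global atom x = 2

Result: 2
Note: 'x' is seen everywhere.

Example 2:

-- MAIN.EX
global atom x = 1    -- declared and assigned before module.e runs
include module.e

-- module.e
? x

Result: 1
Note: 'x' is seen everywhere, including in the file being included.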

The export keyword makes an identifier from a module visible to the main file--but not further. The visibility of an export identifier can go only one level.

Example 1:

-- MAIN.EX
include moduleA.e

-- moduleA.e
include moduleB.e
? x

-- moduleB.e
export atom x = 30

Result: 30
Note: 'x' is seen in moduleA, which directly includes moduleB.

Example 2:

-- MAIN.EX
include moduleA.e
? x

-- moduleA.e
include moduleB.e

-- moduleB.e
export atom x = 30

Result: error, x undefined
Note: 'x' is NOT seen in MAIN.

The public keyword makes it possible for an identifier to be visible to several files. For a single main-to-module include, the public keyword works much like the export keyword. For a longer chain of includes, public include can be used to make the identifier visible along the chain of includes.

Example 1:

-- MAIN.EX
include moduleA.e

-- moduleA.e
include moduleB.e
? x

-- moduleB.e
public atom x = 300

Result: 300
Note: 'x' is seen in moduleA, which directly includes moduleB.

Example 2:

-- MAIN.EX
include moduleA.e
? x

-- moduleA.e
include moduleB.e

-- moduleB.e
public atom x = 30

Result: error, x undefined
Note: 'x' is NOT seen in MAIN.

Example 3:

-- MAIN.EX
public include moduleA.e
? x

-- moduleA.e
public include moduleB.e

-- moduleB.e
public atom x = 30

Result: 30
Note: 'x' is seen in MAIN, because moduleA passes it along with public include.

The rules for scope, and for how include files work, are always the same. The "trick" is that you must look at each file individually as 'main', and all other files as external. Only then will you understand how identifiers are shared between many files.

Example 1 (ONE <== alpha):

-- ONE.EX
include alpha.e
? x

-- alpha.e
export atom x = 2

Result: 2
Notes: 'x' is visible in main.

Example 2 (ONE ==> alpha):

-- ONE.EX
include alpha.e
atom y = 1

-- alpha.e
? y

Result: error, y undefined
Notes: 'y' is not visible; it has NOT been exported and included.

Example 3 (ONE ==> alpha):

-- ONE.EX
export atom y = 1

-- alpha.e
include ONE.EX
? y

Result: 1
Notes: 'y' is visible; it HAS been exported and included.

Example 1 is similar to the previous examples; x has to be 'exported' from the module and 'included' by the main file. In Example 2 the export-include relationship is broken. Example 3 is analogous to Example 1: y has to be 'exported' and then 'included'.

Hint: It is not .ex and .e that make a pair of files a main-module pair. The file doing the including is always "main", and the file being included is always the "module".

Hint: If sharing identifiers between several modules, an export-include will be required--in both directions--between each pair of files.

Hint: Use export to limit the scope of identifiers between files. Use public to extend the scope of identifiers between files.

Inventing truly unique identifier names is hard work and often a waste of time. Nameclashes are inevitable. The cure is to give a name to the local scope of the file or files being included; such an identifier is called a namespace. The namespace is created using the as keyword. A namespace is then used as a namespace qualifier, a prefix to an identifier, to make each identifier unique.

Example 1:

-- MAIN.EX
include module.e
atom x = 1
? x

-- module.e
export atom x = 2

Result: 1
Notes: Problem: which x? 'x' is declared in two places.

Example 2:

-- MAIN.EX
include module.e as M
atom x = 1
? x
? M:x

-- module.e
export atom x = 2

Result: 1, then 2
Notes: 'x' is still declared in two places, but each declaration now has its own identifier.

In Example 1 there is some ambiguity. The identifier x is used in two separate scopes to declare two separate variables. The value for x that was output ( 1 ) is the "most obvious" one; there is no way to determine if the intended output should have been ( 2 ), that of the module file.

In Example 2 there is no confusion. The module is included as the namespace M. It is used as a namespace qualifier to distinguish the x in the module as M:x, from the x in the main file. Now two unique values can be displayed.

Example:

It is possible to include two modules that each contain the same identifier. In this example a namespace qualifier for each included file solves the nameclash problem.

-- MAIN.EX
include module1.e as M1
include module2.e as M2
? M1:x
? M2:x

-- module1.e
export atom x = 1

-- module2.e
export atom x = 222

Result: 1, then 222
Notes: Two namespace qualifiers are needed.

Without the namespace qualifiers this example would crash.

Override

The keyword override lets you write a routine that shadows an existing identifier.

Example 1:

A module contains a routine that you wish worked differently. Rather than edit that module, an override can let you use your own routine.

-- main.ex 
include std/math.e 
 
override function sin( object x ) 
   return eu:sin( deg2rad( x ) ) 
end function 
 
? sin(30) 
--> 0.5 
-- correct answer only if angles measured in degrees 
-- incorrect if angles measured in 'expected' radians 

The sin() function requires angles measured in radians, but many people prefer to measure angles in degrees. To perform the actual calculation the original sin function is still needed. It is called using the eu namespace qualifier. The override keyword creates a new function that uses the old identifier, but now contains a degree-to-radian conversion.

This 'changes' the expected behavior of the sine function. This can be a dangerous programming technique.

Example 2:

In this example the puts() procedure is re-written to output values to two separate locations:

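-- log_file is assumed to be a handle to a file opened for writing;
-- the name "log.txt" is only illustrative
integer log_file = open("log.txt", "w")

override procedure puts(integer channel, sequence text)
    eu:puts(log_file, text)
    eu:puts(channel, text)
end procedure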

This example 'extends' the expected behavior of the puts() procedure. Extra things happen, but the original program does not change.

A warning will be issued when you use override; since one routine changes the behavior of another routine. Not only can this be confusing, but it is easy to break code. Code that was calling the former routine expects no difference in service, so there should not be any.

If an identifier is declared global, public or export (but not override) and there is a built-in of the same name, Euphoria will not assume an 'override', and will choose the built-in. A warning will be generated whenever this happens.
