4.2 Declarations

4.2.1 Identifiers

Identifiers, which encompass all explicitly declared variable, constant or routine names, may be of any length. Upper and lower case are distinct. Identifiers must start with a letter or underscore and then be followed by any combination of letters, digits and underscores. The following reserved words have special meaning in Euphoria and cannot be used as identifiers:

and             export           public
as              fallthru         retry  
break           for              return 
by              function         routine
case            global           switch 
constant        goto             then
continue        if               to 
do              ifdef            type 
else            include          until
elsedef         label            while
elsif           loop             with 
elsifdef        namespace        without
end             not              xor 
entry           or
enum            override
exit            procedure

The Euphoria editor displays these words in blue.

The following are Euphoria built-in routines. It is best if you do not use these for your own identifiers:

abort           getenv          peek4s          system
and_bits        gets            peek4u          system_exec
append          hash            peeks           tail
arctan          head            platform        tan
atom            include_paths   poke            task_clock_start
c_func          insert          poke2           task_clock_stop
c_proc          integer         poke4           task_create
call            length          position        task_list
call_func       log             power           task_schedule
call_proc       machine_func    prepend         task_self
clear_screen    machine_proc    print           task_status
close           match           printf          task_suspend
command_line    match_from      puts            task_yield
compare         mem_copy        rand            time
cos             mem_set         remainder       trace
date            not_bits        remove          xor_bits
delete          object          repeat          ?
delete_routine  open            replace         &
equal           option_switches routine_id      $
find            or_bits         sequence        
find_from       peek            sin             
floor           peek_string     splice          
get_key         peek2s          sprintf
getc            peek2u          sqrt

Identifiers can be used in naming the following:

  • procedures
  • functions
  • types
  • variables
  • constants
  • enums

4.2.1.1 procedures

These perform some computation and may contain a list of parameters, e.g.

procedure empty()
end procedure

procedure plot(integer x, integer y)
    position(x, y)
    puts(1, '*')
end procedure

There are a fixed number of named parameters, but this is not restrictive since any parameter could be a variable-length sequence of arbitrary objects. In many languages variable-length parameter lists are impossible. In C, you must set up strange mechanisms that are complex enough that the average programmer cannot do it without consulting a manual or a local guru.
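As a minimal sketch of the idea above, a single sequence parameter can stand in for a variable-length argument list. The routine name total_of is invented for this illustration.

```euphoria
-- total_of() is a hypothetical routine: it accepts any number of
-- values by taking them as the elements of one sequence parameter.
function total_of(sequence values)
    atom total = 0
    for i = 1 to length(values) do
        total += values[i]
    end for
    return total
end function

? total_of({1, 2, 3})   -- prints 6
? total_of({})          -- prints 0
? total_of({1.5, 2.5})  -- prints 4
```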

A copy of the value of each argument is passed in. The formal parameter variables may be modified inside the procedure but this does not affect the value of the arguments. Pass by reference can be achieved using indexes into some fixed sequence.
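One way to picture the index-based approach is the sketch below. The names cells and increment are assumptions made for this example; the routine receives a position in a shared sequence rather than the value itself.

```euphoria
-- 'cells' is a hypothetical module-level sequence acting as shared storage.
sequence cells = {10, 20, 30}

-- The parameter is an index into 'cells', not a copy of the element.
procedure increment(integer idx)
    cells[idx] += 1   -- the change is visible to every user of 'cells'
end procedure

increment(2)
? cells   -- prints {10,21,30}
```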

Performance Note:
The interpreter does not actually copy sequences or floating-point numbers unless it becomes necessary. For example,

y = {1,2,3,4,5,6,7,8.5,"ABC"}
x = y

The statement x = y does not actually cause a new copy of y to be created. Both x and y will simply "point" to the same sequence. If we later perform x[3] = 9, then a separate sequence will be created for x in memory (although there will still be just one shared copy of 8.5 and "ABC"). The same thing applies to "copies" of arguments passed in to subroutines.
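Whatever sharing happens internally, the two variables behave as independent values. A short sketch of the observable result:

```euphoria
sequence y = {1, 2, 3}
sequence x = y   -- no observable difference from a full copy

x[3] = 9         -- only x changes

? y   -- prints {1,2,3}
? x   -- prints {1,2,9}
```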

For a number of procedures or functions--see below--some parameters may have the same value in many cases. The most commonly expected value for such a parameter may be supplied as its default value. Omitting the value of such a parameter on a given call will cause its default value to be passed.

procedure foo(sequence s, integer n=1)
    ? n + length(s)
end procedure

foo("abc")    -- prints out 4 = 3 + 1. n was not specified, so was set to 1.
foo("abc", 3) -- prints out 6 = 3 + 3

This is not limited to the last parameter(s):

procedure bar(sequence s="abc", integer n, integer p=1)
    ? length(s)+n+p
end procedure

bar(, 2)      -- prints out 6 = 3 + 2 + 1
bar(2)        -- errors out, as 2 is not a sequence
bar(, 2,)     -- same as bar(,2)
bar(, 2, 3)   -- prints out 8 = 3 + 2 + 3
bar({}, 2, )  -- prints out 3 = 0 + 2 + 1
bar()         -- errors out, second parameter is omitted,
              -- but doesn't have a default value

Any expression may be used in a default value. Parameters that have been already mentioned may even be part of the expression:

procedure baz(sequence s, integer n=length(s))
    ? n
end procedure

baz("abcd") -- prints out 4

4.2.1.2 functions

These are just like procedures, but they return a value, and can be used in an expression, e.g.

function max(atom a, atom b)
    if a >= b then
        return a
    else
        return b
    end if
end function

4.2.1.3 return statement

Any Euphoria object can be returned. You can, in effect, have multiple return values, by returning a sequence of objects. e.g.

return {x_pos, y_pos}

However, Euphoria does not have variable lists. When you return a sequence, you still have to dispatch its contents to variables as needed. And you cannot pass a sequence of parameters to a routine, unless using call_func or call_proc, which carries a performance penalty.
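The sketch below illustrates both points: the caller unpacks a returned sequence by hand, and call_proc() accepts the arguments as a sequence. The routine names min_max and show_sum are invented for this example.

```euphoria
-- Hypothetical function returning two values packed in a sequence.
-- It assumes s is a non-empty sequence of atoms.
function min_max(sequence s)
    atom lo = s[1], hi = s[1]
    for i = 2 to length(s) do
        if s[i] < lo then lo = s[i] end if
        if s[i] > hi then hi = s[i] end if
    end for
    return {lo, hi}
end function

-- The caller dispatches the elements to variables itself.
sequence r = min_max({4, 9, 2, 7})
atom smallest = r[1]
atom largest = r[2]
? smallest   -- prints 2
? largest    -- prints 9

-- Passing a sequence of arguments requires an indirect call.
procedure show_sum(integer a, integer b)
    ? a + b
end procedure

call_proc(routine_id("show_sum"), {2, 3})   -- prints 5
```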

We will use the general term "subroutine", or simply "routine" when a remark is applicable to both procedures and functions.

Defaulted parameters can be used in functions exactly as they are in procedures. See the section above for a few examples.

4.2.1.4 types

These are special functions that may be used in declaring the allowed values for a variable. A type must have exactly one parameter and should return an atom that is either true (non-zero) or false (zero). Types can also be called just like other functions. See Specifying the Type of a variable.

Although there are no restrictions to using defaulted parameters with types, their use is so much constrained by a type having exactly one parameter that they are of little practical help there.

You cannot use a type to perform any adjustment to the value being checked, if only because this value may be the temporary result of an expression, not an actual variable.
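A small sketch of a type used both in a declaration and as an ordinary function call. The type name even_number is an assumption made for this illustration.

```euphoria
-- Hypothetical type: accepts only even integers.
type even_number(integer x)
    return remainder(x, 2) = 0
end type

even_number e = 4    -- checked on assignment

-- A type can also be called like any other function.
? even_number(6)     -- prints 1
? even_number(7)     -- prints 0
```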

4.2.1.5 variables

These may be assigned values during execution e.g.

-- x may only be assigned integer values
integer x
x = 25

-- a, b and c may be assigned *any* value
object a, b, c
a = {}
b = a
c = 0

When you declare a variable you name the variable (which protects you against making spelling mistakes later on) and you define which sort of values may legally be assigned to the variable during execution of your program.

The simple act of declaring a variable does not assign any value to it. If you attempt to read it before assigning any value to it, Euphoria will issue a run-time error as "variable xyz has never been assigned a value".

To guard against forgetting to initialize a variable, and also because it may make the code clearer to read, you can combine declaration and assignment:

integer n = 5

This is equivalent to

integer n
n = 5

It is not unusual to define a private variable that bears the same name as one already in scope. You can still use the value of the outer variable when initializing the new one, by qualifying it with the default namespace of the current file:

namespace app

integer n
n=5

procedure foo()
    integer n = app:n + 2
    ? n
end procedure

foo() -- prints out 7

4.2.1.6 constants

These are variables that are assigned an initial value that can never change e.g.

constant MAX = 100
constant Upper = MAX - 10, Lower = 5
constant name_list = {"Fred", "George", "Larry"}

The result of any expression can be assigned to a constant, even one involving calls to previously defined functions, but once the assignment is made, the value of the constant variable is "locked in".
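For instance, a constant may take its value from a call to a function defined earlier in the file. The names square, SIDE and AREA below are illustrative only.

```euphoria
-- Hypothetical helper, defined before the constants that use it.
function square(atom x)
    return x * x
end function

constant SIDE = 4, AREA = square(SIDE)

? AREA   -- prints 16
```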

Constants may not be declared inside a subroutine.

4.2.1.7 enum

An enumerated value is a special type of constant where the first value defaults to the number 1 and each item after that is incremented by 1 by default. An optional by keyword can be supplied to change the increment value. As with sequences, enums can also be terminated with a $ for ease of editing enum lists that may change frequently during development.

enum ONE, TWO, THREE, FOUR

-- ONE is 1, TWO is 2, THREE is 3, FOUR is 4
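As a sketch of the $ terminator mentioned above, the list below can gain or lose a line without touching its neighbours. The member names and the person record are invented for this example.

```euphoria
-- One member per line; the trailing $ ends the list.
enum
    NAME,
    AGE,
    EMAIL,
    $

? EMAIL   -- prints 3

-- Hypothetical record indexed by the enum members.
sequence person = {"Ann", 30, "ann@example.org"}
puts(1, person[NAME] & "\n")   -- prints Ann
```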

You can change the value of any one item by assigning it a numeric value. Enums can only take numeric values. You cannot set the starting value to an expression or other variable. Subsequent values are always the previous value plus one, unless they too are explicitly assigned a value.

enum ONE, TWO, THREE, ABC=10, DEF, XYZ

-- ONE is 1, TWO is 2, THREE is 3
-- ABC is 10, DEF is 11, XYZ is 12

Euphoria sequences use integer indexes, but with enum you may write code like this:

enum X, Y
sequence point = { 0,0 }
point[X] = 3
point[Y] = 4

There is also a special form of enum, an enum type. This is a simple way to write a user-defined type based on the set of values in a specific enum group. The type created this way can be used anywhere a normal user-defined type can be used.

For example,

enum type RGBA RED, GREEN, BLUE, ALPHA end type

-- Only allow values of RED, GREEN, BLUE, or ALPHA as parameters
function xyz( RGBA x, RGBA y)
    return {x, y} -- a function must return a value
end function

However there is one significant difference when it comes to enum types. For normal types, when calling the type function, it returns either 0 or 1. The enum type function returns 0 if the argument is not a member of the enum set, and it returns a positive integer when the argument is a member. The value returned is the ordinal number of the member in the enum's definition, regardless of what the member's value is. As an exception to this, if two enums share the same value, then they will share the same ordinal number. The ordinal numbers of enums surrounding these will continue to increment as if every enum had a unique ordinal number, causing some numbers to be skipped.

For example,

enum type color RED=4, GREEN=7, BLACK=1, BLUE=3 , PINK=10 end type

? color(RED)   --> 1
? color(GREEN) --> 2
? color(BLACK) --> 3
? color(BLUE)  --> 4
? color(PINK)  --> 5

constant color_names = {"rouge", "vert", "noir", "bleu", "rose"}

puts(1, color_names[color(BLUE)]) --> bleu

But with the exception,

enum type color RED, GREEN=7, BLACK=1, BLUE=3 , PINK=10 end type
? color(RED) --> 1
? color(GREEN) --> 2
? color(BLACK) --> 1
? color(BLUE) --> 4
? color(PINK) --> 5

Note that none of the enums have an ordinal number with a value of 3. This is simply skipped.

By default, unless an enum member is being specifically set to some value, its value will be one more than the previous member's value, with the first default value being 1. This default can be overridden. The syntax is:

enum by DELTA member1, member2, ... ,memberN

where 'DELTA' is a literal number with an optional operation code (*, +, -, /) preceding it.

Examples:

enum by 2 A,B,C=6,D      --> values are 1,3,6,8
enum by -2 A=10,B,C,D    --> values are 10,8,6,4
enum by * 2 A,B,C,D,E    --> values are 1,2,4,8,16
enum by / 3 A=81,B,C,D,E --> values are 81,27,9,3,1

Also note that enum members do not have to be integers.

enum by / 2 A=5,B,C --> values are 5, 2.5, 1.25

4.2.2 Specifying the type of a variable

So far you've already seen some examples of variable types but now we will define types more precisely.

Variable declarations have a type name followed by a list of the variables being declared. For example,

object a

global integer x, y, z

procedure fred(sequence q, sequence r)

The types: object, sequence, atom and integer are predefined. Variables of type object may take on any value. Those declared with type sequence must always be sequences. Those declared with type atom must always be atoms.

Variables declared with type integer must be atoms with integer values from -1073741824 to +1073741823 inclusive. You can perform exact calculations on larger integer values, up to about 15 decimal digits, but declare them as atom, rather than integer.

Note:
In a procedure or function parameter list like the one for fred() above, a type name may only be followed by a single parameter name.

Performance Note:
Calculations using variables declared as integer will usually be somewhat faster than calculations involving variables declared as atom. If your machine has floating-point hardware, Euphoria will use it to manipulate atoms that are not integers. If your machine doesn't have floating-point hardware (this may happen on old 386 or 486 PCs), Euphoria will call software floating-point arithmetic routines contained in euid.exe (or in Windows). You can force eui.exe to bypass any floating-point hardware, by setting an environment variable:

SET NO87=1

The slower software routines will be used, but this could be of some advantage if you are worried about the floating-point bug in some early Pentium chips.

4.2.2.1 User-defined types

To augment the predefined types, you can create user-defined types. All you have to do is define a single-parameter function, but declare it with type ... end type instead of function ... end function. For example,

type hour(integer x)
    return x >= 0 and x <= 23
end type

hour h1, h2

h1 = 10      -- ok
h2 = 25      -- error! program aborts with a message

Variables h1 and h2 can only be assigned integer values in the range 0 to 23 inclusive. After each assignment to h1 or h2 the interpreter will call hour(), passing the new value. The value will first be checked to see if it is an integer (because of "integer x"). If it is, the return statement will be executed to test the value of x (i.e. the new value of h1 or h2). If hour() returns true, execution continues normally. If hour() returns false then the program is aborted with a suitable diagnostic message.

"hour" can be used to declare subroutine parameters as well:

procedure set_time(hour h)

set_time() can only be called with a reasonable value for parameter h, otherwise the program will abort with a message.

A variable's type will be checked after each assignment to the variable (except where the compiler can predetermine that a check will not be necessary), and the program will terminate immediately if the type function returns false. Subroutine parameter types are checked each time that the subroutine is called. This checking guarantees that a variable can never have a value that does not belong to the type of that variable.

Unlike other languages, the type of a variable does not affect any calculations on the variable, nor the way its contents are displayed. Only the value of the variable matters in an expression. The type just serves as an error check to prevent any "corruption" of the variable. User-defined types can catch unexpected logical errors in your program. They are not designed to catch or correct user input errors. In particular, they cannot adjust a wrong value to some other, presumably legal, one.

Type checking can be turned off or on between subroutines using the with type_check or without type_check special statements (see Special Statements). It is initially on by default.
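A sketch of switching type checking off around one heavily used routine and back on afterwards. The routine sum_hours is hypothetical; hour is the type defined earlier in this section, repeated here so that the example is self-contained.

```euphoria
type hour(integer x)
    return x >= 0 and x <= 23
end type

without type_check   -- user-defined type checks may be skipped from here on

-- Hypothetical routine where the checks are not wanted.
function sum_hours(sequence hours)
    hour h
    integer total = 0
    for i = 1 to length(hours) do
        h = hours[i]
        total += h
    end for
    return total
end function

with type_check      -- checks are requested again for the code below

? sum_hours({1, 2, 3})   -- prints 6
```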

Note to Benchmarkers:
When comparing the speed of Euphoria programs against programs written in other languages, you should specify without type_check at the top of the file. This gives Euphoria permission to skip run-time type checks, thereby saving some execution time. All other checks are still performed, e.g. subscript checking, uninitialized variable checking etc. Even when you turn off type checking, Euphoria reserves the right to make checks at strategic places, since this can actually allow it to run your program faster in many cases. So you may still get a type check failure even when you have turned off type checking. Whether type checking is on or off, you will never get a machine-level exception. You will always get a meaningful message from Euphoria when something goes wrong. (This might not be the case when you poke directly into memory, or call routines written in C or machine code.)
Euphoria's method of defining types is simpler than what you will find in other languages, yet Euphoria provides the programmer with greater flexibility in defining the legal values for a type of data. Any algorithm can be used to include or exclude values. You can even declare a variable to be of type object which will allow it to take on any value. Routines can be written to work with very specific types, or very general types.

For many programs, there is little advantage in defining new types, and you may wish to stick with the four predefined types. Unlike other languages, Euphoria's type mechanism is optional. You don't need it to create a program.

However, for larger programs, strict type definitions can aid the process of debugging. Logic errors are caught close to their source and are not allowed to propagate in subtle ways through the rest of the program. Furthermore, it is easier to reason about the misbehavior of a section of code when you are guaranteed that the variables involved always had a legal value, if not the desired value.

Types also provide meaningful, machine-checkable documentation about your program, making it easier for you or others to understand your code at a later date. Combined with the subscript checking, uninitialized variable checking, and other checking that is always present, strict run-time type checking makes debugging much easier in Euphoria than in most other languages. It also increases the reliability of the final program since many latent bugs that would have survived the testing phase in other languages will have been caught by Euphoria.

Anecdote 1:
In porting a large C program to Euphoria, a number of latent bugs were discovered. Although this C program was believed to be totally "correct", we found: a situation where an uninitialized variable was being read; a place where element number "-1" of an array was routinely written and read; and a situation where something was written just off the screen. These problems resulted in errors that weren't easily visible to a casual observer, so they had survived testing of the C code.
Anecdote 2:
The Quick Sort algorithm presented on page 117 of Writing Efficient Programs by Jon Bentley has a subscript error! The algorithm will sometimes read the element just before the beginning of the array to be sorted, and will sometimes read the element just after the end of the array. Whatever garbage is read, the algorithm will still work - this is probably why the bug was never caught. But what if there isn't any (virtual) memory just before or just after the array? Bentley later modifies the algorithm such that this bug goes away--but he presented this version as being correct. Even the experts need subscript checking!
Performance Note:
When typical user-defined types are used extensively, type checking adds only 20 to 40 percent to execution time. Leave it on unless you really need the extra speed. You might also consider turning it off for just a few heavily-executed routines. Profiling can help with this decision.

4.2.2.2 integer

An Euphoria integer is a mathematical integer restricted to the range -1,073,741,824 to +1,073,741,823. As a result, a variable of the integer type, while allowing computations as fast as possible, cannot hold 32-bit machine addresses, even though the latter are mathematical integers. You must use the atom type for this purpose. Also, even though the product of two integers is a mathematical integer, it may not fit into an integer, and should be kept in an atom instead.
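The sketch below applies the built-in integer() type as a function to two such values. The expected results assume the integer range stated above; the variable names and the sample address are invented for this illustration.

```euphoria
integer a = 100000, b = 100000

-- The product, 10,000,000,000, is a mathematical integer,
-- but it is kept in an atom because it exceeds the range above.
atom product = a * b
? integer(product)   -- 0 under the range stated above

-- A hypothetical 32-bit address is likewise stored in an atom.
atom address = #FFFF0000
? integer(address)   -- 0 under the range stated above
```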

4.2.2.3 atom

An atom can hold three kinds of data:

  • Mathematical integers in the range -power(2,53) to +power(2,53)
  • Floating point numbers, in the range -power(2,1024)+1 to +power(2,1024)-1
  • Large mathematical integers in the same range, but with a fuzz that grows with the magnitude of the integer.

power(2,53) is slightly above 9 × 10^15; power(2,1024) is in the 10^308 range.

Because of these constraints, which arise in part from common hardware limitations, some care is needed for specific purposes:

  • The sum or product of two integers is an atom, but may not be an integer.
  • Memory addresses, or handles acquired from anything non-Euphoria, including the operating system, must be stored as an atom.
  • For large numbers, usual operations may yield strange results:
integer n = power(2, 27) -- ok
integer n_plus = n + 1, n_minus = n - 1 -- ok
atom a = n * n -- ok
atom a1 = n_plus * n_minus -- still ok
? a - a1 -- prints 0, should be 1 mathematically

This is not an Euphoria bug. The IEEE 754 standard for floating point numbers provides for 53 bits of precision for any real number, and an accurate computation of a-a1 would require 54 of them. Intel FPU chips do have 64 bit precision registers, but the low order 16 bits are only used internally, and Intel recommends against using them for high precision arithmetic. Their SIMD machine instruction set only uses the IEEE 754 defined format.

4.2.2.4 sequence

A sequence is a type that is a container. A sequence has elements that can be accessed through their index, as in my_sequence[3]. Sequences are generic enough to store all sorts of data structures: strings, trees, lists, anything. Accesses to sequences are always bounds-checked, so you can never read or write an element that does not exist. A large number of extraction and shape-changing operations on sequences are available, both as built-in operations and as library routines. The elements of a sequence can have any type.

Sequences are implemented very efficiently. Programmers used to pointers will soon notice that they can get most usual pointer operations done using sequence indexes. The loss in efficiency is usually hard to notice, and the gain in code safety and bug prevention far outweighs it.
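A brief sketch of a sequence holding mixed, nested data. The variable name s and its contents are arbitrary.

```euphoria
-- Elements may be atoms, strings or further sequences.
sequence s = {1, "two", {3.5, {4}}}

? length(s)      -- prints 3
? s[3][2][1]     -- prints 4

s = append(s, 5)     -- the sequence grows as needed
? length(s)          -- prints 4

s[2] = s[2] & "!"    -- concatenation onto the string element
puts(1, s[2] & "\n") -- prints two!
```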

4.2.2.5 object

This type can hold any data Euphoria can handle, both atoms and sequences.

The object() type returns 0 if a variable is not initialized, else 1.
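Assuming the behaviour stated above, a variable can be tested before its first assignment. The variable name maybe is invented for this sketch.

```euphoria
object maybe

? object(maybe)   -- 0: declared but not yet assigned

maybe = "hello"
? object(maybe)   -- 1: now holds a value
```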

4.2.3 Scope

4.2.3.1 Why scopes, and what are they?

The scope of an identifier is the portion of the program where its declaration is in effect, i.e. where that identifier is visible.

Euphoria has many pre-defined procedures, functions and types. These are defined automatically at the start of any program. The Euphoria editor shows them in magenta. These pre-defined names are not reserved. You can override them with your own variables or routines.

It is possible to use a user-defined identifier before it has been declared, provided that it will be declared at some point later in the program.

For example, procedures, functions and types can call themselves or one another recursively. Mutual recursion, where routine A calls routine B which directly or indirectly calls routine A, implies one of A or B being called before it is defined. This was traditionally the most frequent situation which required using the routine_id() mechanism, but is now supported directly. See Indirect Routine Calling for more details on the routine_id() mechanism.
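A minimal sketch of mutual recursion that relies on such a forward reference. The routine names is_even and is_odd are assumptions, and both routines expect a non-negative argument.

```euphoria
function is_even(integer n)
    if n = 0 then
        return 1
    end if
    return is_odd(n - 1)   -- is_odd() is declared further down
end function

function is_odd(integer n)
    if n = 0 then
        return 0
    end if
    return is_even(n - 1)
end function

? is_even(10)   -- prints 1
? is_odd(7)     -- prints 1
```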

4.2.3.2 Defining the scope of an identifier

The scope of an identifier is a description of what code can 'access' it. Code in the same scope of an identifier can access that identifier and code not in the same scope cannot access it.

The scope of a variable depends upon where and how it is declared.

  • If it is declared within a for, while, loop or switch, its scope starts at the declaration and ends at the respective end statement.
  • In an if statement, the scope starts at the declaration and ends at the next else, elsif or end if statement.
  • If a variable is declared within a routine (known as a private variable) and outside one of the structures listed above, the scope of the variable starts at the declaration and ends at the routine's end statement.
  • If a variable is declared outside of a routine (known as a module variable), and does not have a scope modifier, its scope starts at the declaration and ends at the end of the file it is declared in.
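The rules in the list above can be sketched as follows. All names are hypothetical, and the comments mark where each scope ends.

```euphoria
integer call_count = 0   -- module variable: visible to the end of this file

procedure demo()
    integer total = 0            -- private: visible until 'end procedure'
    call_count += 1

    for i = 1 to 3 do
        integer squared = i * i  -- visible only until 'end for'
        total += squared
    end for

    if total > 10 then
        sequence msg = "big"     -- visible only until 'else'
        puts(1, msg & "\n")
    else
        puts(1, "small\n")
    end if

    ? total
end procedure

demo()          -- prints big, then 14
? call_count    -- prints 1
```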

The scope of a constant that does not have a scope modifier starts at the declaration and ends at the end of the file it is declared in.

The scope of an enum that does not have a scope modifier starts at the declaration and ends at the end of the file it is declared in.

The scope of all procedures, functions and types, which do not have a scope modifier, starts at the beginning of the source file and ends at the end of the source file in which they are declared. In other words, these can be accessed by any code in the same file.

Constants, enums, module variables, procedures, functions and types that do not have a scope modifier are referred to as local. However, these identifiers can have a scope modifier preceding their declaration, which causes their scope to extend beyond the file they are declared in.

  • If the keyword global precedes the declaration, the scope of these identifiers extends to the whole application. They can be accessed by code anywhere in the application files.
  • If the keyword public precedes the declaration, the scope extends to any file that explicitly includes the file in which the identifier is declared, or to any file that includes a file that in turn public includes the file containing the public declaration.
  • If the keyword export precedes the declaration, the scope only extends to any file that directly includes the file in which the identifier is declared.
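A sketch of one library file using each kind of declaration. The file name and every identifier in it are invented for this illustration.

```euphoria
-- shapes.e (hypothetical library file)

-- No modifier: local, visible only inside shapes.e.
constant PI_APPROX = 3.14159

-- export: visible to files that directly include shapes.e.
export function circle_area(atom r)
    return PI_APPROX * r * r
end function

-- public: also passed on by a file that does "public include shapes.e".
public function circle_perimeter(atom r)
    return 2 * PI_APPROX * r
end function

-- global: visible throughout the application.
global integer shapes_debug = 0
```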

When you include a Euphoria file in another file, only the identifiers declared using a scope modifier are accessible to the file doing the include. The other declarations in the included file are invisible to the file doing the include, and you will get an error message, "Errors resolving the following references", if you try to use them.

There is a variant of the include statement, called public include, which will be discussed later and behaves differently on public symbols.

Note that constant and enum declarations must be outside of any subroutine.

Euphoria encourages you to restrict the scope of identifiers. If all identifiers were automatically global to the whole program, you might have a lot of naming conflicts, especially in a large program consisting of files written by many different programmers. A naming conflict might cause a compiler error message, or it could lead to a very subtle bug, where different parts of a program accidentally modify the same variable without being aware of it. Try to use the most restrictive scope that you can. Make variables private to one routine where possible, and where that is not possible, make them local to a file, rather than global to the whole program. And whenever an identifier needs to be known from a few files only, make it public or export so as to hide it from whoever does not need to see it -- and might some day define the same identifier.

For example:

-- sublib.e
export procedure bar()
    ? 0
end procedure

-- some_lib.e
include sublib.e
export procedure foo()
    ? 1
end procedure
bar() -- ok, declared in sublib.e

-- my_app.exw
include some_lib.e
foo() -- ok, declared in some_lib.e
bar() -- error! bar() is not declared here

Why not declare foo() as global, as it is meant to be used anywhere? Well, one could, but this will increase the risks of name conflicts. This is why, for instance, all public identifiers from the standard library have public scope. global should be used rarely, if ever. Because earlier versions of Euphoria didn't have public or export, it has to remain there for a while. One should be very sure of not polluting any foreign file's symbol table before using global scope. Built-in identifiers act as if declared as global -- but they are not declared in any Euphoria coded file.

4.2.3.3 Using namespaces

Identifiers marked as global, public or export are known as exposed identifiers because they can be used in files other than the one they were declared in.

All other identifiers can only be used within their own file. This information is helpful when maintaining or enhancing the file, or when learning how to use the file. You can make changes to the internal routines and variables, without having to examine other files, or notify other users of the include file.

Sometimes, when using include files developed by others, you will encounter a naming conflict. One of the include file authors has used the same name for an exposed identifier as one of the other authors. One way of fixing this, if you have the source, is to simply edit one of the include files to correct the problem; however, you would then have to repeat this process whenever a new version of the include file was released.

Euphoria has a simpler way to solve this. Using an extension to the include statement, you can say for example:

include johns_file.e as john
include bills_file.e as bill

john:x += 1
bill:x += 2

In this case, the variable x was declared in two different files, and you want to refer to both variables in your file. Using the namespace identifier of either john or bill, you can attach a prefix to x to indicate which x you are referring to. We sometimes say that john refers to one namespace, while bill refers to another distinct namespace. You can attach a namespace identifier to any user-defined variable, constant, procedure or function. You can do it to solve a conflict, or simply to make things clearer. A namespace identifier has local scope. It is known only within the file that declares it, i.e. the file that contains the include statement. Different files might define different namespace identifiers to refer to the same included file.

There is a special, reserved namespace, eu, for referring to built-in Euphoria routines. This can be useful when a built-in routine has been overridden:

procedure puts( integer fn, object text )
    eu:puts(fn, "Overloaded puts says: "& text )
end procedure

puts(1, "Hello, world!\n")
eu:puts(1, "Hello, world!\n")

Files can also declare a default namespace to be used with the file. When a file with a default namespace is included, if the include statement did not specify a namespace, then the default namespace will be automatically declared in that file. If the include statement declares a namespace for the newly included file, then the specified namespace will be available instead of the default. No two files can use the same namespace identifier. If two files with the same default namespace are included, at least one of the include statements must specify a different namespace.

To declare a default namespace in a file, the first token (whitespace and comments are ignored) should be 'namespace' followed by the desired name:

-- foo.e :  this file does some stuff
namespace foo

A namespace that is declared as part of an include statement is local to the file where the include statement is. A default namespace declared in a file is considered a public symbol in that file. Namespaces and other symbols (e.g., variables, functions, procedures and types) can have the same name without conflict. A namespace declared through an include statement will mask a default namespace declared in another file, just like a normal local variable will mask a public variable in another file. In this case, rather than using the default namespace, declare a new namespace through the include statement.

Note that declaring a namespace, either through the include statement or as a default namespace does not require that every symbol reference must be qualified with that namespace. The namespace simply allows the user to deconflict symbols in different files with the same name, or to allow the programmer to be explicit about where symbols are coming from for the purposes of clarity, or to avoid possible future conflicts.

A qualified reference does not absolutely restrict the reference to symbols that actually reside within the specified file. It can also apply to symbols included by that file. This is especially useful for multi-file libraries. Programmers can use a single namespace for the library, even though some of the visible symbols in that library are not declared in the main file:

-- lib.e
namespace lib

public include sublib.e

public procedure main()
...

-- sublib.e
public procedure sub()
...

-- app.ex
include lib.e

lib:main()
lib:sub()

Now, what happens if you do not use 'public include'?

-- lib2.e
include sublib.e
...

-- app2.ex
include lib2.e as lib
lib:main()
lib:sub() -- error.  sub() is visible in lib2.e but not in app2.ex

4.2.3.4 The visibility of public and export identifiers

When an included file needs to see the public or export identifiers of the file that includes it, the included file must itself include that parent file.

For example,

-- Parent file: foo.e --
public integer Foo = 1
include bar.e -- bar.e needs to see Foo
showit() -- execute a routine in bar.e

-- Included file: bar.e --
include foo.e -- included so I can see Foo
constant xyz = Foo + 1

public procedure showit()
? xyz
end procedure

Public symbols can only be seen by the file that explicitly includes the file where those public symbols are declared.

For example,

-- Parent file: foo.e --
include bar.e
showit() -- execute a public routine in bar.e

If, however, a file wants a third file to also see the symbols that it can see, it needs to do a public include.

For example,

-- Parent file: foo.e --
public include bar.e
showit() -- execute a public routine in bar.e

public procedure fooer()
   . . .
end procedure

-- Appl file: runner.ex --
include foo.e
showit() -- execute a public routine that foo.e can see in bar.e
fooer()  -- execute a public routine in foo.e

The public include facility is designed to make having a library composed of multiple files easy for an application to use. It allows the main library file to expose symbols in files that it includes as if the application had actually included them. That way, symbols meant for the end user can be declared in files other than the main file, and the library can still be organized however the author prefers without affecting the end user.

Another example
Given that we have two files LIBA.e and LIBB.e ...

-- LIBA.e --
public constant
    foo1 = 1,
    foo2 = 2

export function foobarr1()
    return 0
end function

export function foobarr2()
    return 0
end function

and

-- LIBB.e --
-- I want to pass on just the constants not
-- the functions from LIBA.e.
public include LIBA.e

The export scope modifier is used to limit the extent to which symbols can be accessed. It works just like public except that export symbols are only ever passed up one level. In other words, if a file wants to use an export symbol, that file must explicitly include the file that declares it.

In the example above, code in LIBB.e can see both the public and export symbols declared in LIBA.e (foo1, foo2, foobarr1 and foobarr2) because it explicitly includes LIBA.e. By using the public prefix on the include of LIBA.e, it also allows any file that includes LIBB.e to see the public symbols from LIBA.e, but those files will not see any export symbols declared in LIBA.e.

In short, a public include exposes the public symbols of the included file up one level, but not its export symbols.
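
To make the difference concrete, here is a minimal sketch of an application that includes only LIBB.e. The file name app.ex is hypothetical; the symbols are the ones declared in LIBA.e above:

-- app.ex (hypothetical)
include LIBB.e

? foo1   -- public constant from LIBA.e, passed on by LIBB.e's public include
? foo2   -- likewise visible here

-- ? foobarr1()  -- would not resolve: export symbols are not passed on by LIBB.e

If app.ex needed foobarr1() or foobarr2(), it would have to include LIBA.e itself.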

4.2.3.5 The complete set of resolution rules

Resolution is the process by which the interpreter determines which specific symbol will actually be used at any given point in the code. This is usually quite easy as most symbol names in a given scope are unique and so Euphoria does not have to choose between them. However, when the same symbol name is used in different but enclosing scopes, Euphoria has to make a decision about which symbol the coder is referring to.

When Euphoria sees an identifier name being used, it looks for the name's declaration starting from the current scope and moving outwards through the enclosing scopes until the name's declaration is found.

The hierarchy of scopes can be viewed like this ...

global/public/export
  file
     routine
        block 1
           block 2
           ...
              block n

So, if a name is used at a block level, Euphoria will first check for its declaration in the same block and, if it is not found there, will check the enclosing blocks until it reaches the routine level. It then checks the routine (including parameter names), then the file that the block is declared in, and finally the global/public/export symbols.
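
The outward search can be illustrated with a small sketch. All names here (level, show, n, doubled) are hypothetical and exist only for this illustration:

integer level = 0                  -- file scope

procedure show(integer n)          -- n belongs to the routine scope
    if n > 0 then
        integer doubled = n * 2    -- block scope
        ? doubled   -- found in the current block
        ? n         -- not in the block, found at the routine level
        ? level     -- not in the routine, found at the file level
    end if
end procedure

show(3)

Each reference is resolved at the innermost scope that declares the name, so the three lines print 6, 3 and 0 in that order.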

By the way, Euphoria will not allow a name to be declared if it is already declared in the same scope, or enclosing block or enclosing routine. Thus the following examples are illegal...

integer a
if x then
   integer a -- redefinition not allowed.
end if

if x then
   integer a
   if y then
      integer a -- redefinition not allowed.
   end if
end if

procedure foo(integer a)
   if x then
      integer a -- redefinition not allowed.
   end if
end procedure

But note that this below is valid ...

integer a = 1
procedure foo()
    integer a = 2
    ? a
end procedure
? a

In this situation, the second declaration of 'a' is said to shadow the first one. The output from this example will be ...

2
1

Symbols that are all declared in the same file (be they in blocks, routines or at the file level) are easy for Euphoria to check for scope clashes. However, a problem can arise when symbol names declared as global/public/export in different files are placed in the same scope during include processing. As it is quite possible for these files to come from independent developers who are not aware of each other's symbol names, the potential for name clashes is high. A name clash occurs when the same name is declared in the same scope but in different files. Euphoria cannot generally decide which name you were referring to when this happens, so it needs your help to resolve it. This is where the namespace concept is used.

A namespace is just a name that you assign to an include file so that your code can exactly specify where an identifier that your code is using actually comes from. Using a namespace with an identifier, for example:

include somefile.e as my_lib
include another.e
my_lib:foo()

enables Euphoria to resolve the identifier (foo) as explicitly coming from the file associated with the namespace "my_lib". This means that if foo was also declared as global/public/export in another.e, then that foo would be ignored and the foo in somefile.e would be used instead. Without that namespace, Euphoria would have complained (Errors resolving the following references:).

If you need to use both foo symbols you can still do that by using two different namespaces. For example:

include somefile.e as my_lib
include another.e  as her_ns
my_lib:foo() -- Calls the one in somefile.e
her_ns:foo() -- Calls the one in another.e

Note that there is a reserved namespace name that is always in use. The special namespace eu is used to let Euphoria know that you are accessing a built-in symbol rather than one of the same name declared in someone's file.

For example...

include somefile.e as my_lib
result = my_lib:find(something) -- Calls the 'find' in somefile.e
xy = eu:find(X, Y) -- Calls Euphoria's built-in 'find'

The controlling variable used in a for statement is special. It is automatically declared at the beginning of the loop block, and its scope ends at the end of the for-loop. If the loop is inside a function or procedure, the loop variable cannot have the same name as any other variable declared in the routine or enclosing block. When the loop is at the top level, outside of any routine, the loop variable cannot have the same name as any other file-scoped variable. You can use the same name in many different for-loops as long as the loops are not nested. You do not declare loop variables as you would other variables because they are automatically declared as atoms. The range of values specified in the for statement defines the legal values of the loop variable.
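
A short sketch of these rules, using a loop variable named i purely for illustration:

for i = 1 to 3 do
    ? i                    -- i is declared automatically by the for statement
end for
-- i is out of scope here

for i = 1 to 2 by 0.5 do   -- the name can be reused because the loops are not nested
    ? i                    -- i is an atom, so fractional values are allowed
end for

The first loop prints 1, 2 and 3. Referring to i between the two loops would be an error, because the variable only exists inside each loop.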

Variables declared inside other types of blocks, such as a loop, while, if or switch statement, use the same scoping rules as a for-loop index.
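
For instance, a variable declared inside a while block is visible only within that block. The following is a minimal sketch; count and squared are hypothetical names:

integer count = 0
while count < 2 do
    integer squared = count * count   -- scope is limited to the while block
    ? squared
    count += 1
end while
-- squared is not visible here; count still is

This prints 0 and then 1.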

4.2.3.6 The override qualifier

There are times when it is necessary to replace a global, public or export identifier. Typically, one would do this to extend the capabilities of a routine, or perhaps to supersede the user-defined type of some public, export or global variable, since the type itself may not be global.

This can be achieved by declaring the identifier as override:

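override procedure puts(integer channel, sequence text)
    eu:puts(log_file, text) -- also write to a log (log_file is assumed to be an open file handle)
    eu:puts(channel, text)  -- then do what the built-in puts does
end procedure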

A warning will be issued when you do this, because it can be very confusing, and would probably break code, for the new routine to change the behavior of the former routine. Code that was calling the former routine expects no difference in service, so there should not be any.

If an identifier is declared global, public or export, but not override, and there is a built-in of the same name, Euphoria will not assume an override, and will choose the built-in. A warning will be generated whenever this happens.
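
The following sketch illustrates that last rule under stated assumptions: mylib.e and app.ex are hypothetical files, and the replacement length() is deliberately trivial.

-- mylib.e (hypothetical)
public function length(object x)   -- public, but not declared as override
    return -1
end function

-- app.ex (hypothetical)
include mylib.e as mylib

? length("abc")         -- unqualified: the built-in length() is chosen, with a warning
? mylib:length("abc")   -- qualified: the version declared in mylib.e is used

Qualifying the call with the namespace is how the application reaches the library's version when it has not been declared as override.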