OpenEuphoria: Euphoria v4.0

4.1 Definition

4.1.1 Objects

4.1.1.1 Atoms and Sequences

All data objects in Euphoria are either atoms or sequences. An atom is a single numeric value. A sequence is a collection of objects, either atoms or sequences themselves. A sequence can contain any mixture of atoms and sequences; its elements do not all have to be of the same data type. Because the objects contained in a sequence can be an arbitrary mix of atoms and sequences, it is an extremely versatile data structure, capable of representing any sort of data.

A sequence is represented by a list of objects in brace brackets { }, separated by commas with an optional sequence terminator, $. Atoms can have any integer or double-precision floating point value. They can range from approximately -1e300 (minus one times 10 to the power 300) to +1e300 with 15 decimal digits of accuracy. Here are some Euphoria objects:

-- examples of atoms:
0
1000
98.6
-1e6
23_100_000
'x'
'$'

-- examples of sequences:
{2, 3, 5, 7, 11, 13, 17, 19}
{1, 2, {3, 3, 3}, 4, {5, {6}}}
{{"jon", "smith"}, 52389, 97.25}
{} -- the 0-element sequence

By default, number literals use base 10, but you can have integer literals written in other bases, namely binary (base 2), octal (base 8), and hexadecimal (base 16). To do this, the number is prefixed by a 2-character code that lets Euphoria know which base to use.

Code	Base
0b	2 = Binary
0t	8 = Octal
0d	10 = Decimal
0x	16 = Hexadecimal

For example:

0b101 --> decimal 5
0t101 --> decimal 65
0d101 --> decimal 101
0x101 --> decimal 257

Hexadecimal integers can also be written by prefixing the number with the '#' character.

For example:

#FE             -- 254
#A000           -- 40960
#FFFF00008      -- 68718428168
-#10            -- -16

Only digits and the letters A, B, C, D, E, F, in either uppercase or lowercase, are allowed in hexadecimal numbers. Hexadecimal numbers are always positive, unless you add a minus sign in front of the # character. So for instance, #FFFFFFFF is a huge positive number (4294967295), not -1, as some machine-language programmers might expect.

Sometimes, and especially with large numbers, numeric literals are easier to read when they have embedded grouping characters. We are familiar with using commas (periods in Europe) to group large numbers into three-digit subgroups. In Euphoria we use the underscore character to achieve the same thing, and we can group the digits in any way that is useful to us.

atom big = 32_873_787   -- Set 'big' to the value 32873787

atom salary = 56_110.66 -- Set salary to the value 56110.66

integer defflags = #0323_F3CD

object phone = 61_3_5536_7733

integer bits = 0b11_00010_1
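
The base prefixes and the underscore grouping described above can be tried out with a short sketch like the following. It assumes an interpreter that accepts every literal form listed in this section; the value 255 is only an illustrative choice.

```euphoria
-- Sketch: the same value, 255, written in each supported base.
-- Each '?' statement prints 255.
? 0b1111_1111   -- binary, with an underscore grouping the digits
? 0t377         -- octal
? 0d255         -- decimal, explicit prefix
? 0xFF          -- hexadecimal, 0x form
? #FF           -- hexadecimal, # form

atom big = 32_873_787
? big = 32873787   -- prints 1 (true): underscores do not change the value
```

Because the underscores are ignored, they can be placed wherever they make the literal easiest to read.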

Sequences can be nested to any depth, i.e. you can have sequences within sequences within sequences and so on (until you run out of memory). Brace brackets are used to construct sequences out of a list of expressions. These expressions can be constant or evaluated at run-time. e.g.

{ x+6, 9, y*w+2, sin(0.5) }

All sequences can include a special end-of-sequence marker, the $ character. This is a convenience when editing lists that may change often as development proceeds.

sequence seq_1 = { 10, 20, 30, $ }
sequence seq_2 = { 10, 20, 30 }

equal(seq_1, seq_2) -- TRUE

The "Hierarchical Objects" part of the Euphoria acronym comes from the hierarchical nature of nested sequences. This should not be confused with the class hierarchies of certain object-oriented languages.

Why do we call them atoms? Why not just "numbers"? Well, an atom is just a number, but we wanted to have a distinctive term that emphasizes that they are indivisible (that's what "atom" means in Greek). In the world of physics you can 'split' an atom into smaller parts, but you no longer have an atom--only various particles. You can 'split' a number into smaller parts, but you no longer have a number--only various digits.

Atoms are the basic building blocks of all the data that a Euphoria program can manipulate. With this analogy, sequences might be thought of as "molecules", made from atoms and other molecules. A better analogy would be that sequences are like directories, and atoms are like files. Just as a directory on your computer can contain both files and other directories, a sequence can contain both atoms and other sequences (and those sequences can contain atoms and sequences and so on).

.          object
.           /  \
.          /    \
.        atom  sequence

As you will soon discover, sequences make Euphoria very simple and very powerful. Understanding atoms and sequences is the key to understanding Euphoria.

Performance Note: Does this mean that all atoms are stored in memory as eight-byte floating-point numbers? No. The Euphoria interpreter usually stores integer-valued atoms as machine integers (four bytes) to save space and improve execution speed. When fractional results occur or integers get too big, conversion to IEEE eight-byte floating-point format happens automatically.

4.1.1.2 Character Strings and Individual Characters

A character string is just a sequence of characters. It may be entered in a number of ways ...

Using double-quotes e.g.

"ABCDEFG"

Using raw string notation e.g.

-- Using back-quotes
`ABCDEFG`

-- Using three double-quotes
"""ABCDEFG"""

Using binary strings e.g.

b"1001 00110110 0110_0111 1_0101_1010" -- ==> {#9,#36,#67,#15A}

Using hexadecimal byte strings e.g.

x"65 66 67 AE" -- ==> {#65,#66,#67,#AE}

When you put too many hex characters together they are split up appropriately for you:

x"656667AE"  -- 8-bit  ==> {#65,#66,#67,#AE}

The rules for double-quote strings are:

They begin and end with a double-quote character
They cannot contain a double-quote
They must be only on a single line
They cannot contain the TAB character
If they contain the back-slash '\' character, that character must immediately be followed by one of the special escape codes. The back-slash and escape code will be replaced by the appropriate single character equivalent.

If you need to include double-quote, end-of-line, back-slash, or TAB characters inside a double-quoted string, you need to enter them in a special manner.

e.g.

"Bill said\n\t\"This is a back-slash \\ character\".\n"

Which, when displayed should look like ...

Bill said
    "This is a back-slash \ character".

The rules for raw strings are:

Enclose with three double-quotes """...""" or with back-quotes `...`
The resulting string will never have any carriage-return characters in it.
If the resulting string begins with a new-line, the initial new-line is removed and any trailing new-line is also removed.
A special form is used to automatically remove leading whitespace from the source code text. You might code this form to align the source text for ease of reading. If the first line after the raw string start token begins with one or more underscore characters, the number of consecutive underscores signifies the maximum number of whitespace characters that will be removed from each line of the raw string text. The underscores represent an assumed left margin width. Note, these leading underscores do not form part of the raw string text.

e.g.

-- No leading underscores and no leading whitespace
`
Bill said
    "This is a back-slash \ character".
`

Which, when displayed should look like ...

Bill said
    "This is a back-slash \ character".

-- No leading underscores but leading whitespace
`
   Bill said
      "This is a back-slash \ character".
`

Which, when displayed should look like ...

   Bill said
      "This is a back-slash \ character".

-- Leading underscores and leading whitespace
`
_____Bill said
         "This is a back-slash \ character".
`

Which, when displayed should look like ...

Bill said
    "This is a back-slash \ character".

Extended string literals are useful when the string contains new-lines, tabs, or back-slash characters because they do not have to be entered in the special manner. The back-quote form can be used when the string literal contains a set of three double-quote characters, and the triple quote form can be used when the text literal contains back-quote characters. If a literal contains both a back quote and a set of three double-quotes, you will need to concatenate two literals.

object TQ, BQ, QQ
TQ = `This text contains """ for some reason.`
BQ = """This text contains a back quote ` for some reason."""
QQ = """This text contains a back quote ` """ & `and """ for some reason.`

The rules for binary strings are...

They begin with the pair b" and end with a double-quote (") character
They can only contain binary digits (0-1), and space, underscore, tab, newline, carriage-return. Anything else is invalid.
An underscore is simply ignored, as if it was never there. It is used to aid readability.
Each set of contiguous binary digits represents a single sequence element
They can span multiple lines
The non-digits are treated as punctuation and used to delimit individual values.

b"1 10 11_0100 01010110_01111000" == {0x01, 0x02, 0x34, 0x5678}

The rules for hexadecimal strings are:

They begin with the pair x" and end with a double-quote (") character
They can only contain hexadecimal digits (0-9 A-F a-f), and space, underscore, tab, newline, carriage-return. Anything else is invalid.
An underscore is simply ignored, as if it was never there. It is used to aid readability.
Each pair of contiguous hex digits represents a single sequence element with a value from 0 to 255
They can span multiple lines
The non-digits are treated as punctuation and used to delimit individual values.

x"1 2 34 5678_AbC" == {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}

Character strings may be manipulated and operated upon just like any other sequences. For example the string we first looked at "ABCDEFG" is entirely equivalent to the sequence:

{65, 66, 67, 68, 69, 70, 71}

which contains the corresponding ASCII codes. The Euphoria compiler will immediately convert "ABCDEFG" to the above sequence of numbers. In a sense, there are no "strings" in Euphoria, only sequences of numbers. A quoted string is really just a convenient notation that saves you from having to type in all the ASCII codes.

It follows that "" is equivalent to {}. Both represent the sequence of zero length, also known as the empty sequence. As a matter of programming style, it is natural to use "" to suggest a zero length sequence of characters, and {} to suggest some other kind of sequence.

An individual character is an atom. It must be entered using single quotes. There is a difference between an individual character (which is an atom), and a character string of length 1 (which is a sequence). e.g.

'B'   -- equivalent to the atom 66 - the ASCII code for B
"B"   -- equivalent to the sequence {66}

Again, 'B' is just a notation that is equivalent to typing 66. There are no "characters" in Euphoria, just numbers (atoms). However, it is possible to use characters without ever having to use their numerical representation.

Keep in mind that an atom is not equivalent to a one-element sequence containing the same value, although there are a few built-in routines that choose to treat them similarly.
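
A short sketch may make the equivalence between strings and sequences of numbers more concrete. The variable name word is just an illustration, and the case conversion by adding 32 assumes plain ASCII uppercase letters.

```euphoria
sequence word = "ABC"

? equal(word, {65, 66, 67})   -- prints 1: a string is a sequence of character codes
? 'B' = 66                    -- prints 1: a character literal is an atom
? atom('B')                   -- prints 1
? sequence("B")               -- prints 1: a one-character string is still a sequence
? equal("B", {66})            -- prints 1

-- Arithmetic works on strings because they are sequences of numbers.
word = word + 32              -- adds 32 to every character code
puts(1, word & "\n")          -- prints abc
```

Note that 'B' and "B" are different kinds of object: the first is the atom 66, the second is a sequence of length 1.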

4.1.1.3 Escaped Characters

Special characters may be entered using a back-slash:

Code	Meaning
\n	newline
\r	carriage return
\t	tab
\\	backslash
\"	double quote
\'	single quote
\0	null
\e	escape
\E	escape
\b/d..d/	A binary coded value, the \b is followed by 1 or more binary digits. Inside strings, use the space character to delimit or end a binary value.
\x/hh/	A 2-hex-digit value, e.g. "\x5F" ==> {95}
\u/hhhh/	A 4-hex-digit value, e.g. "\u2A7C" ==> {10876}
\U/hhhhhhhh/	An 8-hex-digit value, e.g. "\U8123FEDC" ==> {2166619868}

For example, "Hello, World!\n", or '\\'. The Euphoria editor displays character strings in green.

Note that you can use the underscore character '_' inside the \b, \x, \u, and \U values to aid readability, e.g. "\U8123_FEDC" ==> {2166619868}

4.1.2 Identifiers

An identifier is just the name you give something in your program. This can be a variable, constant, function, procedure, parameter, or namespace. An identifier must begin with either a letter or an underscore, then followed by zero or more letters, digits or underscore characters. There is no theoretical limit to how large an identifier can be but in practice it should be no more than about 30 characters.

Identifiers are case-sensitive. This means that "Name" is a different identifier from "name", or "NAME", etc...

Examples of valid identifiers:

n
color26
ShellSort
quick_sort
a_very_long_identifier_that_is_really_too_long_for_its_own_good
_alpha

Examples of invalid identifiers:

0n         -- must not start with a digit
Shell Sort -- cannot have spaces in identifiers
quick-sort -- must only consist of letters, digits or underscores

4.1.3 Comments

Comments are ignored by Euphoria and have no effect on execution speed. The editor displays comments in red.

There are three forms of comment text:

The line format comment is started by two dashes and extends to the end of the current line.

e.g.

-- This is a comment which extends to the end of this line only.

The multi-line format comment is started by /* and extends to the next occurrence of */, even if that occurs on a different line.

e.g.

/* This is a comment which
   extends over a number
   of text lines.
*/

On the first line only of your program, you can use a special comment beginning with the two character sequence #!. This is mainly used to tell Unix shells which program to execute the 'script' program with.

e.g.

#!/home/rob/euphoria/bin/eui

This informs the Linux shell that your file should be executed by the Euphoria interpreter, and gives the full path to the interpreter. If you make your file executable, you can run it just by typing its name, without the need to type "eui". On Windows this line is just treated as a comment (though the Apache web server on Windows does recognize it). If your file is a shrouded .il file, use eub.exe instead of eui.

Line comments are typically used to annotate a single line (or small section) of code, whereas multi-line comments are typically used to give larger pieces of documentation inside the source text.

4.1.4 Expressions

Like other programming languages, Euphoria lets you calculate results by forming expressions. However, in Euphoria you can perform calculations on entire sequences of data with one expression, where in most other languages you would have to construct a loop. In Euphoria you can handle a sequence much as you would a single number. It can be copied, passed to a subroutine, or calculated upon as a unit. For example,

{1,2,3} + 5

is an expression that adds the sequence {1,2,3} and the atom 5 to get the resulting sequence {6,7,8}.

We will see more examples later.

4.1.4.1 Relational Operators

The relational operators < > <= >= = != each produce a 1 (true) or a 0 (false) result.

8.8 < 8.7   -- 8.8 less than 8.7 (false)
-4.4 > -4.3 -- -4.4 greater than -4.3 (false)
8 <= 7      -- 8 less than or equal to 7 (false)
4 >= 4      -- 4 greater than or equal to 4 (true)
1 = 10      -- 1 equal to 10 (false)
8.7 != 8.8  -- 8.7 not equal to 8.8 (true)

As we will soon see you can also apply these operators to sequences.

4.1.4.2 Logical Operators

The logical operators and, or, xor, and not are used to determine the "truth" of an expression. e.g.

1 and 1     -- 1 (true)
1 and 0     -- 0 (false)
0 and 1     -- 0 (false)
0 and 0     -- 0 (false)

1 or  1     -- 1 (true)
1 or  0     -- 1 (true)
0 or  1     -- 1 (true)
0 or  0     -- 0 (false)

1 xor 1     -- 0 (false)
1 xor 0     -- 1 (true)
0 xor 1     -- 1 (true)
0 xor 0     -- 0 (false)

not 1       -- 0 (false)
not 0       -- 1 (true)

You can also apply these operators to numbers other than 1 or 0. The rule is: zero means false and non-zero means true. So for instance:

5 and -4    -- 1 (true)
not 6       -- 0 (false)

These operators can also be applied to sequences. See below.

In some cases short-circuit evaluation will be used for expressions containing 'and' or 'or'. Specifically, short-circuiting applies inside decision-making expressions. These are found in the if statement, the while statement and the loop until statement. More on this later.
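
As an illustration of why short-circuit evaluation matters in an if statement, here is a minimal sketch. The function name starts_with_x is hypothetical and is used only for this example.

```euphoria
function starts_with_x(sequence s)
    -- In an if-condition 'and' short-circuits: s[1] is evaluated only
    -- when length(s) > 0, so an empty sequence causes no subscript error.
    if length(s) > 0 and s[1] = 'x' then
        return 1
    else
        return 0
    end if
end function

? starts_with_x("xyz")   -- prints 1
? starts_with_x("abc")   -- prints 0
? starts_with_x("")      -- prints 0
```

Without short-circuiting, the call with the empty string would try to evaluate s[1] and fail with a subscript error.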

4.1.4.3 Arithmetic Operators

The usual arithmetic operators are available: add, subtract, multiply, divide, unary minus, unary plus.

3.5 + 3  -- 6.5
3 - 5    -- -2
6 * 2    -- 12
7 / 2    -- 3.5
-8.1     -- -8.1
+8       -- +8

Computing a result that is too big (i.e. outside of -1e300 to +1e300) will result in one of the special atoms +infinity or -infinity. These appear as inf or -inf when you print them out. It is also possible to generate nan or -nan. "nan" means "not a number", i.e. an undefined value (such as inf divided by inf). These values are defined in the IEEE floating-point standard. If you see one of these special values in your output, it usually indicates an error in your program logic, although generating inf as an intermediate result may be acceptable in some cases. For instance, 1/inf is 0, which may be the "right" answer for your algorithm.

Division by zero, as well as bad arguments to math library routines, e.g. square root of a negative number, log of a non-positive number etc. cause an immediate error message and your program is aborted.

The only reason that you might use unary plus is to emphasize to the reader of your program that a number is positive. The interpreter does not actually calculate anything for this.

4.1.4.4 Operations on Sequences

All of the relational, logical and arithmetic operators described above, as well as the math routines described in Language Reference, can be applied to sequences as well as to single numbers (atoms).

When applied to a sequence, a unary (one operand) operator is actually applied to each element in the sequence to yield a sequence of results of the same length. If one of these elements is itself a sequence then the same rule is applied again recursively. e.g.

x = -{1, 2, 3, {4, 5}}   -- x is {-1, -2, -3, {-4, -5}}

If a binary (two-operand) operator has operands which are both sequences then the two sequences must be of the same length. The binary operation is then applied to corresponding elements taken from the two sequences to get a sequence of results. e.g.

x = {5, 6, 7, 8} + {10, 10, 20, 100}
-- x is {15, 16, 27, 108}
x = {{1, 2, 3}, {4, 5, 6}} + {-1, 0, 1} -- ERROR: 2 != 3
-- but
x = {{1, 2, 3} + {-1, 0, 1}, {4, 5, 6} + {-1, 0, 1}} -- CORRECT
-- x is {{0, 2, 4}, {3, 5, 7}}

If a binary operator has one operand which is a sequence while the other is a single number (atom) then the single number is effectively repeated to form a sequence of equal length to the sequence operand. The rules for operating on two sequences then apply. Some examples:

y = {4, 5, 6}
w = 5 * y                          -- w is {20, 25, 30}

x = {1, 2, 3}
z = x + y                          -- z is {5, 7, 9}
z = x < y                          -- z is {1, 1, 1}

w = {{1, 2}, {3, 4}, {5}}
w = w * y                          -- w is {{4, 8}, {15, 20}, {30}}

w = {1, 0, 0, 1} and {1, 1, 1, 0}  -- {1, 0, 0, 0}
w = not {1, 5, -2, 0, 0}           -- w is {0, 0, 0, 1, 1}

w = {1, 2, 3} = {1, 2, 4}          -- w is {1, 1, 0}

-- note that the first '=' is assignment, and the
-- second '=' is a relational operator that tests
-- equality
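
The rules above mean that a formula written for a single number often works unchanged on a whole sequence. The following is a small sketch with made-up temperature data; the variable names are illustrative only.

```euphoria
sequence celsius = {-40, 0, 25, 100}

-- Each atom operand (9, 5, 32) is applied to every element of the sequence.
sequence fahrenheit = celsius * 9 / 5 + 32
? fahrenheit                  -- prints {-40,32,77,212}

-- A relational operator yields a sequence of 1s and 0s, one per element.
sequence is_warm = fahrenheit > 70
? is_warm                     -- prints {0,0,1,1}
```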

Note: When you wish to compare two strings (or other sequences), you should not (as in some other languages) use the '=' operator:

if "APPLE" = "ORANGE" then  -- ERROR!

'=' is treated as an operator, just like '+', '*' etc., so it is applied to corresponding sequence elements, and the sequences must be the same length. When they are equal length, the result is a sequence of ones and zeros. When they are not equal length, the result is an error. Either way you'll get an error, since an if-condition must be an atom, not a sequence. Instead you should use the equal() built-in routine:

if equal("APPLE", "ORANGE") then  -- CORRECT

In general, you can do relational comparisons using the compare() built-in routine:

if compare("APPLE", "ORANGE") = 0 then  -- CORRECT

You can use compare() for other comparisons as well:

if compare("APPLE", "ORANGE") < 0 then  -- CORRECT
-- enter here if "APPLE" is less than "ORANGE" (TRUE)

Especially useful is the idiom compare(x, "") = 1 to determine whether x is a non empty sequence. compare(x, "") = -1 would test for x being an atom, but atom(x) = 1 does the same faster and is clearer to read.
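
The results of equal() and compare() for the cases discussed above can be checked with a short sketch such as this one; the variable names a and b are arbitrary.

```euphoria
sequence a = "APPLE"
sequence b = "ORANGE"

? equal(a, b)            -- prints 0: the strings differ
? equal(a, "APPLE")      -- prints 1
? compare(a, b)          -- prints -1: "APPLE" is less than "ORANGE"
? compare(b, a)          -- prints 1
? compare(a, "APPLE")    -- prints 0
? compare(5, "")         -- prints -1: an atom is less than any sequence
? compare("x", "")       -- prints 1: a non-empty sequence is greater than ""
```

Both routines always return an atom, so their results are safe to use directly in an if-condition.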

4.1.4.5 Subscripting of Sequences

A single element of a sequence may be selected by giving the element number in square brackets. Element numbers start at 1. Non-integer subscripts are rounded down to an integer.

For example, if x contains {5, 7.2, 9, 0.5, 13} then x[2] is 7.2. Suppose we assign something different to x[2]:

x[2] = {11,22,33}

Then x becomes: {5, {11,22,33}, 9, 0.5, 13}. Now if we ask for x[2] we get {11,22,33} and if we ask for x[2][3] we get the atom 33. If you try to subscript with a number that is outside of the range 1 to the number of elements, you will get a subscript error. For example x[0], x[-99] or x[6] will cause errors. So will x[1][3] since x[1] is not a sequence. There is no limit to the number of subscripts that may follow a variable, but the variable must contain sequences that are nested deeply enough. The two dimensional array, common in other languages, can be easily represented with a sequence of sequences:

x = {
    {5, 6, 7, 8, 9},      -- x[1]
    {1, 2, 3, 4, 5},      -- x[2]
    {0, 1, 0, 1, 0}       -- x[3]
}

where we have written the numbers in a way that makes the structure clearer. An expression of the form x[i][j] can be used to access any element.

The two dimensions are not symmetric however, since an entire "row" can be selected with x[i], but you need to use vslice() in the Standard Library to select an entire column. Other logical structures, such as n-dimensional arrays, arrays of strings, structures, arrays of structures etc. can also be handled easily and flexibly:

3-D array:

y = {
    {{1,1}, {3,3}, {5,5}},
    {{0,0}, {0,1}, {9,1}},
    {{-1,9},{1,1}, {2,2}}
}

-- y[2][3][1] is 9

Array of strings:

s = {"Hello", "World", "Euphoria", "", "Last One"}

-- s[3] is "Euphoria"
-- s[3][1] is 'E'

A Structure:

employee = {
    {"John","Smith"},
    45000,
    27,
    185.5
}

To access "fields" or elements within a structure it is good programming style to make up an enum that names the various fields. This will make your program easier to read. For the example above you might have:

enum NAME, SALARY, AGE, WEIGHT
enum FIRST_NAME, LAST_NAME

employees = {
    {{"John","Smith"}, 45000, 27, 185.5},   -- employees[1]
    {{"Bill","Jones"}, 57000, 48, 177.2},   -- employees[2]
    -- .... etc.
}

-- employees[2][SALARY] would be 57000.
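
Putting the enum and the sequence of structures together, a complete runnable sketch might look like this. The data is the same illustrative data as above, and the raise of 1000 is an arbitrary example.

```euphoria
enum NAME, SALARY, AGE, WEIGHT
enum FIRST_NAME, LAST_NAME

sequence employees = {
    {{"John","Smith"}, 45000, 27, 185.5},
    {{"Bill","Jones"}, 57000, 48, 177.2}
}

-- Fields are read and written through the named subscripts.
employees[1][SALARY] = employees[1][SALARY] + 1000

for i = 1 to length(employees) do
    printf(1, "%s %s earns %d\n", {
        employees[i][NAME][FIRST_NAME],
        employees[i][NAME][LAST_NAME],
        employees[i][SALARY]
    })
end for
-- prints:
-- John Smith earns 46000
-- Bill Jones earns 57000
```

Using named constants instead of bare numbers such as 2 means the code does not have to change if the order of the fields is rearranged later.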

The length() built-in function will tell you how many elements are in a sequence. So the last element of a sequence s, is:

s[length(s)]

A short-hand for this is:

s[$]

Similarly,

s[length(s)-1]

can be simplified to:

s[$-1]

The $ may only appear between square brackets and it equals the length of the sequence that is being subscripted. Where there's nesting, e.g.:

s[$ - t[$-1] + 1]

The first $ above refers to the length of s, while the second $ refers to the length of t (as you'd probably expect). An example where $ can save a lot of typing, make your code clearer, and probably even faster is:

longname[$][$] -- last element of the last element

Compare that with the equivalent:

longname[length(longname)][length(longname[length(longname)])]

Subscripting and function side-effects:

In an assignment statement, with left-hand-side subscripts:

lhs_var[lhs_expr1][lhs_expr2]... = rhs_expr

The expressions are evaluated, and any subscripting is performed, from left to right. It is possible to have function calls in the right-hand-side expression, or in any of the left-hand-side expressions. If a function call has the side-effect of modifying the lhs_var, it is not defined whether those changes will appear in the final value of the lhs_var, once the assignment has been completed. To be sure about what is going to happen, perform the function call in a separate statement, i.e. do not try to modify the lhs_var in two different ways in the same statement. Where there are no left-hand-side subscripts, you can always assume that the final value of the lhs_var will be the value of rhs_expr, regardless of any side-effects that may have changed lhs_var.

Euphoria data structures are almost infinitely flexible.

Arrays in many languages are constrained to have a fixed number of elements, and those elements must all be of the same type. Euphoria eliminates both of those restrictions by defining all arrays (sequences) as a list of zero or more Euphoria objects whose element count can be changed at any time. You can easily add a new structure to the employee sequence above, or store an unusually long name in the NAME field and Euphoria will take care of it for you. If you wish, you can store a variety of different employee "structures", with different sizes, all in one sequence. However, when you retrieve a sequence element, it is not guaranteed to be of any type. You, as a programmer, need to check that the retrieved data is of the type you expect; Euphoria will not. The only thing it will check is whether an assignment is legal. For example, if you try to assign a sequence to an integer variable, Euphoria will complain at the time your code does the assignment.

Not only can a Euphoria program represent all conventional data structures but you can create very useful, flexible structures that would be hard to declare in many other languages.

Note that expressions in general may not be subscripted, just variables. For example: {5+2,6-1,7*8,8+1}[3] is not supported, nor is something like: date()[MONTH]. You have to assign the sequence returned by date() to a variable, then subscript the variable to get the month.

4.1.4.6 Slicing of Sequences

A sequence of consecutive elements may be selected by giving the starting and ending element numbers. For example if x is {1, 1, 2, 2, 2, 1, 1, 1} then x[3..5] is the sequence {2, 2, 2}. x[3..3] is the sequence {2}. x[3..2] is also allowed. It evaluates to the zero length sequence {}. If y has the value: {"fred", "george", "mary"} then y[1..2] is {"fred", "george"}.

We can also use slices for overwriting portions of variables. After x[3..5] = {9, 9, 9} x would be {1, 1, 9, 9, 9, 1, 1, 1}. We could also have said x[3..5] = 9 with the same effect. Suppose y is {0, "Euphoria", 1, 1}. Then y[2][1..4] is "Euph". If we say y[2][1..4] = "ABCD" then y will become {0, "ABCDoria", 1, 1}.

In general, a variable name can be followed by 0 or more subscripts, followed in turn by 0 or 1 slices. Only variables may be subscripted or sliced, not expressions.

We need to be a bit more precise in defining the rules for empty slices. Consider a slice s[i..j] where s is of length n. A slice from i to j, where j = i - 1 and i >= 1, produces the empty sequence, even if i = n + 1. Thus 1..0 and n + 1..n and everything in between are legal (empty) slices. Empty slices are quite useful in many algorithms. A slice from i to j where j < i - 1 is illegal, i.e. "reverse" slices such as s[5..3] are not allowed.

We can also use the $ shorthand with slices, e.g.

s[2..$]
s[5..$-2]
s[$-5..$]
s[$][1..floor($/2)] -- first half of the last element of s
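
The slicing rules, including the $ shorthand, empty slices and slice assignment, can be exercised with a small sketch like the one below. The string used is only an example.

```euphoria
sequence s = "Euphoria"

puts(1, s[1..4] & "\n")      -- prints Euph
puts(1, s[$-2..$] & "\n")    -- prints ria (the last three elements)
? length(s[3..2])            -- prints 0: a legal empty slice

s[1..4] = "EUPH"             -- overwrite a slice with a sequence of equal length
puts(1, s & "\n")            -- prints EUPHoria

s[$-2..$] = '!'              -- an atom is assigned to every element of the slice
puts(1, s & "\n")            -- prints EUPHo!!!
```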

4.1.4.7 Concatenation of Sequences and Atoms - The '&' Operator

Any two objects may be concatenated using the & operator. The result is a sequence with a length equal to the sum of the lengths of the concatenated objects. e.g.

{1, 2, 3} & 4              -- {1, 2, 3, 4}

4 & 5                      -- {4, 5}

{{1, 1}, 2, 3} & {4, 5}    -- {{1, 1}, 2, 3, 4, 5}

x = {}
y = {1, 2}
y = y & x                  -- y is still {1, 2}

You can delete element i of any sequence s by concatenating the parts of the sequence before and after i:

s = s[1..i-1] & s[i+1..length(s)]

This works even when i is 1 or length(s), since s[1..0] is a legal empty slice, and so is s[length(s)+1..length(s)].
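
The deletion idiom can be wrapped in a small function. This is only a sketch: delete_at is a hypothetical name chosen for the example, and the function assumes that i is between 1 and length(s).

```euphoria
function delete_at(sequence s, integer i)
    -- Assumes 1 <= i <= length(s).
    -- Both slices may be empty, which covers the first and last element.
    return s[1..i-1] & s[i+1..$]
end function

? delete_at({10, 20, 30}, 1)   -- prints {20,30}
? delete_at({10, 20, 30}, 2)   -- prints {10,30}
? delete_at({10, 20, 30}, 3)   -- prints {10,20}
```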

4.1.4.8 Sequence-Formation

Finally, sequence-formation, using braces and commas:

{a, b, c, ... }

is also an operator. It takes n operands, where n is 0 or more, and makes an n-element sequence from their values. e.g.

x = {apple, orange*2, {1,2,3}, 99/4+foobar}

The sequence-formation operator is listed at the bottom of the precedence chart.

4.1.4.9 Other Operations on Sequences

Some other important operations that you can perform on sequences have English names, rather than special characters. These operations are built-in to eui.exe/euiw.exe, so they'll always be there, and so they'll be fast. They are described in detail in the Language Reference, but are important enough to Euphoria programming that we should mention them here before proceeding. You call these operations as if they were subroutines, although they are actually implemented much more efficiently than that.

length(sequence s)

Returns the length of a sequence s.

This is the number of elements in s. Some of these elements may be sequences that contain elements of their own, but length just gives you the "top-level" count. Note however that the length of an atom is always 1. e.g.

length({5,6,7})             -- 3
length({1, {5,5,5}, 2, 3})  -- 4 (not 6!)
length({})                  -- 0
length(5)                   -- 1

repeat(object o1, integer count)

Returns a sequence that consists of an item repeated count times. e.g.

repeat(0, 100)         -- {0,0,0,...,0}   i.e. 100 zeros
repeat("Hello", 3)     -- {"Hello", "Hello", "Hello"}
repeat(99,0)           -- {}

The item to be repeated can be any atom or sequence.

append(sequence s1, object o1)

Returns a sequence by adding an object o1 to the end of a sequence s1.

append({1,2,3}, 4)         -- {1,2,3,4}
append({1,2,3}, {5,5,5})   -- {1,2,3,{5,5,5}}
append({}, 9)              -- {9}

The length of the new sequence is always 1 greater than the length of the original sequence. The item to be added to the sequence can be any atom or sequence.

prepend(sequence s1, object o1)

Returns a new sequence by adding an object o1 to the beginning of a sequence s1. e.g.

append({1,2,3}, 4)         -- {1,2,3,4}
prepend({1,2,3}, 4)        -- {4,1,2,3}

append({1,2,3}, {5,5,5})   -- {1,2,3,{5,5,5}}
prepend({}, 9)             -- {9}
append({}, 9)              -- {9}

The length of the new sequence is always one greater than the length of the original sequence. The item to be added to the sequence can be any atom or sequence.

These two built-in functions, append() and prepend(), have some similarities to the concatenate operator, &, but there are clear differences. e.g.

-- appending a sequence is different
append({1,2,3}, {5,5,5})   -- {1,2,3,{5,5,5}}
{1,2,3} & {5,5,5}          -- {1,2,3,5,5,5}

-- appending an atom is the same
append({1,2,3}, 5)         -- {1,2,3,5}
{1,2,3} & 5                -- {1,2,3,5}
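
The difference between append() and & shows up most clearly when a sequence is built up step by step. The following sketch uses illustrative variable names.

```euphoria
-- Building a list of atoms one element at a time.
sequence squares = {}
for i = 1 to 5 do
    squares = append(squares, i * i)
end for
? squares          -- prints {1,4,9,16,25}

-- append() adds each pair as ONE new element (a nested sequence).
sequence rows = {}
rows = append(rows, {1, 2})
rows = append(rows, {3, 4})
? length(rows)     -- prints 2

-- & joins the elements of each pair onto the end.
sequence flat = {}
flat &= {1, 2}
flat &= {3, 4}
? length(flat)     -- prints 4
```

Use append() when the new item should stay a single element, and & when its elements should be merged into the target sequence.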

insert(sequence in_what, object what, atom position)

This function takes a target sequence, in_what, shifts its tail one notch and plugs the object what in the hole just created. The modified sequence is returned. For instance:

s = insert("Joe",'h',3)     -- s is "Johe", another string
s = insert("Joe","h",3)     -- s is {'J','o',{'h'},'e'}, not a string
s = insert({1,2,3},4,-0.5)  -- s is {4,1,2,3}, like prepend()
s = insert({1,2,3},4,8.5)   -- s is {1,2,3,4}, like append()

The length of the returned sequence is one more than that of in_what. This is the same rule as for append() and prepend() above, which are actually special cases of insert().

splice(sequence in_what, object what, atom position)

If what is an atom, this is the same as insert(). But if what is a sequence, that sequence is inserted as successive elements into in_what at position. Example:

s = splice("Joe",'h',3)
    -- s is "Johe", like insert()
s = splice("Joe","hn Do",3)
    -- s is "John Doe", another string
s = splice("Joh","n Doe",9.3)
    -- s is "John Doe", like with the & operator
s = splice({1,2,3},4,-2)
    -- s is {4,1,2,3}, like with the & operator in reversed order

The length of splice(in_what, what, position) always is length(in_what) + length(what), like for concatenation using &.

4.1.5 Precedence Chart

When two or more operators follow one another in an expression, there must be rules to tell in which order they should be evaluated, as different orders usually lead to different results. It is common and convenient to use a precedence order on operators. Operators with the highest degree of precedence are evaluated first, then those with highest precedence among what remains, and so on.

The precedence of operators in expressions is as follows:

highest precedence:     function/type calls
                        unary-  unary+  not
                        *  /
                        +  -
                        &
                        <  >  <=  >=  =  !=
                        and  or  xor
lowest precedence:      { , , , }

Thus 2+6*3 means 2+(6*3) rather than (2+6)*3. Operators on the same line above have equal precedence and are evaluated left to right. You can force any order of operations by placing round brackets ( ) around an expression. For instance, 6/3*5 is 2*5, not 6/15.

Different languages or contexts may have slightly different precedence rules. You should be careful when translating a formula from one language to another; Euphoria is no exception. Adding superfluous parentheses to explicitly denote the exact order of evaluation does not cost much, and may help readers who are used to some other precedence chart, as well as anyone translating to or from a context with slightly different rules. Watch out for and and or, and for * and /.

The equals symbol '=' used in an assignment statement is not an operator, it's just part of the syntax of the language.
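
A few expressions evaluated according to the chart above are sketched below. The last two lines show a case where Euphoria may differ from languages that give 'and' a higher precedence than 'or'.

```euphoria
? 2 + 6 * 3          -- prints 20: * binds tighter than +
? (2 + 6) * 3        -- prints 24
? 6 / 3 * 5          -- prints 10: equal precedence, evaluated left to right
? 1 & 2 + 3          -- prints {1,5}: + binds tighter than &
? 1 + 2 < 4          -- prints 1: arithmetic is done before the comparison

-- 'and', 'or' and 'xor' share one precedence level, left to right:
? 1 or 0 and 0       -- prints 0, i.e. (1 or 0) and 0
? 1 or (0 and 0)     -- prints 1
```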
