== Definition
:<<LEVELTOC level=2 depth=4>>
=== Objects ===
@[predefinedtypes|]

==== Atoms and Sequences

All data **objects** in Euphoria are either **atoms** or **sequences**. An **atom** is
a single numeric value. A **sequence** is a collection of objects, each of which is
either an atom or another sequence. A sequence can contain any mixture of atoms and
sequences; its elements do not all have to be of the same data type. Because of this,
the sequence is an extremely versatile data structure, capable of representing any
sort of data.

A sequence is represented by a list of objects in brace brackets **{ }**, separated by
commas with an optional sequence terminator, ##$##. Atoms can have any integer or
double-precision floating point value. They can range from approximately -1e300 (minus
one times 10 to the power 300) to +1e300 with 15 decimal digits of accuracy. Here are
some Euphoria objects:

<eucode>
-- examples of atoms:
0
1000
98.6
-1e6
23_100_000
'x' -- a character literal is an atom
'$'

-- examples of sequences:
{2, 3, 5, 7, 11, 13, 17, 19}
{1, 2, {3, 3, 3}, 4, {5, {6}}}
{{"jon", "smith"}, 52389, 97.25}
{} -- the 0-element sequence
</eucode>

By default, number literals use //base 10//, but you can have integer literals
written in other bases, namely binary //(base 2)//, octal //(base 8)//, and
hexadecimal //(base 16)//. To do this, the number is prefixed by a 2-character
code that lets Euphoria know which base to use.
|= Code |= Base |
| 0b | 2 = **B**inary |
| 0t | 8 = Oc**t**al |
| 0d | 10 = **D**ecimal |
| 0x | 16 = He**x**adecimal |

For example:
<eucode>
0b101 --> decimal 5
0t101 --> decimal 65
0d101 --> decimal 101
0x101 --> decimal 257
</eucode>

Hexadecimal integers can also be written by prefixing the number with
the '#' character.

For example:
<eucode>
#FE -- 254
#A000 -- 40960
#FFFF00008 -- 68718428168
-#10 -- -16
</eucode>

Only digits and the letters A, B, C, D, E, F, in either uppercase or lowercase,
are allowed in hexadecimal numbers. Hexadecimal numbers are always positive,
unless you add a minus sign in front of the # character. So for instance
#FFFFFFFF is a huge positive number
(4294967295), **not** ##-1##, as some machine-language programmers might expect.

Sometimes, and especially with large numbers, it can make reading numeric
literals easier when they have embedded grouping characters. We are familiar
with using commas (periods in Europe) to group large numbers by three-digit
subgroups. In Euphoria we use the underscore character to achieve the same
thing, and we can group the digits in any way that is useful to us.

<eucode>
atom big = 32_873_787 -- Set 'big' to the value 32873787

atom salary = 56_110.66 -- Set salary to the value 56110.66

integer defflags = #0323_F3CD

object phone = 61_3_5536_7733

integer bits = 0b11_00010_1
</eucode>

**Sequences** can be nested to any depth, i.e. you can have sequences within
sequences within sequences, and so on (until you run out of
memory). Brace brackets are used to construct sequences out of a list of
expressions. These expressions can be constant or evaluated at run-time,
e.g.

<eucode>
{ x+6, 9, y*w+2, sin(0.5) }
</eucode>

All sequences can include a special //end of sequence// marker, which is the ##$##
character. This is for convenience when editing lists that may change often as development
proceeds.

<eucode>
sequence seq_1 = { 10, 20, 30, $ }
sequence seq_2 = { 10, 20, 30 }

equal(seq_1, seq_2) -- TRUE
</eucode>
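
The terminator is most useful when a list is written one item per line, because every item line then ends with a comma and items can be added, removed or reordered without touching their neighbours. A small sketch (the variable name and file names are made up for illustration):

<eucode>
-- Hypothetical list of file names, one per line.
sequence source_files = {
    "main.ex",
    "parser.e",
    "scanner.e",
    $
}

-- The terminator adds no element, so the length is 3.
? length(source_files)
</eucode>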

The **"Hierarchical Objects"** part of the Euphoria acronym comes from the
hierarchical nature of nested sequences. This should not be confused with the
class hierarchies of certain object-oriented languages.

Why do we call them atoms? Why not just "numbers"? Well, an ##atom## //is//
just a number, but we wanted to have a distinctive term that emphasizes that
they are indivisible (that's what "atom" means in Greek). In the world of
physics you can 'split' an atom into smaller parts, but you no
longer have an atom~--only various particles. You can 'split' a
number into smaller parts, but you no longer have a number~--only various
digits.

Atoms are the basic building blocks of all the
data that a Euphoria program can manipulate. With this analogy, **sequence**s
might be thought of as "molecules", made from atoms and other molecules. A
better analogy would be that sequences are like directories, and atoms are
like files. Just as a directory on your computer can contain both files and
other directories, a sequence can contain both atoms and other sequences
(and //those// sequences can contain atoms and sequences and so on).

{{{
. object
. / \
. / \
. atom sequence
}}}

As you will soon discover, sequences make Euphoria very simple //and// very
powerful. **Understanding atoms and sequences is the key to understanding
Euphoria.**

;**Performance Note~:**
:Does this mean that all atoms are stored in memory as eight-byte floating-point
numbers? No. The Euphoria interpreter usually stores integer-valued atoms as
machine integers (four bytes) to save space and improve execution speed. When
fractional results occur or integers get too big, conversion to IEEE
eight-byte floating-point format happens automatically.

==== Character Strings and Individual Characters

A **character string** is just a ##sequence## of characters. It may be
entered in a number of ways ...

* Using double-quotes e.g.

<eucode>
"ABCDEFG"
</eucode>

* Using raw string notation e.g.

<eucode>
-- Using back-quotes
`ABCDEFG`
</eucode>
or
<eucode>
-- Using three double-quotes
"""ABCDEFG"""
</eucode>

* Using binary strings e.g.

<eucode>
b"1001 00110110 0110_0111 1_0101_1010" -- ==> {#9,#36,#67,#15A}
</eucode>

* Using hexadecimal byte strings e.g.

<eucode>
x"65 66 67 AE" -- ==> {#65,#66,#67,#AE}
</eucode>

When more than two hex digits are run together, they are split into pairs for you:

<eucode>
x"656667AE" -- 8-bit ==> {#65,#66,#67,#AE}
</eucode>

**The rules for double-quote strings are:**
# They begin and end with a double-quote character
# They cannot contain a double-quote
# They must be only on a single line
# They cannot contain the TAB character
# If they contain the back-slash '\' character, that character must immediately
be followed by one of the special //escape// codes. The back-slash and escape
code will be replaced by the appropriate single character equivalent.
If you need to include double-quote, end-of-line, back-slash, or TAB characters
inside a double-quoted string, you need to enter them in a special manner.

e.g.

<eucode>
"Bill said\n\t\"This is a back-slash \\ character\".\n"
</eucode>
Which, when displayed, should look like:
{{{
Bill said
        "This is a back-slash \ character".
}}}


**The rules for raw strings are:**
# Enclose with three double-quotes {{{"""..."""}}} or with back-quotes {{{`...`}}}.
# The resulting string will never have any carriage-return characters in it.
# If the resulting string begins with a new-line, the initial new-line is
removed and any trailing new-line is also removed.
# A special form is used to automatically remove leading whitespace from the
source code text. You might code this form to align the source text for ease of
reading. If the first line after the raw string start token begins
with one or more underscore characters, the number of consecutive underscores
signifies the maximum number of whitespace characters that will be removed from
each line of the raw string text. The underscores represent an assumed left
margin width. **Note**, these leading underscores do not form part of the raw
string text.

e.g.

<eucode>
-- No leading underscores and no leading whitespace
`
Bill said
"This is a back-slash \ character".
`
</eucode>
Which, when displayed, should look like:
{{{
Bill said
"This is a back-slash \ character".
}}}

<eucode>
-- No leading underscores but leading whitespace
`
   Bill said
     "This is a back-slash \ character".
`
</eucode>
Which, when displayed, should look like:
{{{
   Bill said
     "This is a back-slash \ character".
}}}

<eucode>
-- Leading underscores and leading whitespace
`
_____Bill said
          "This is a back-slash \ character".
`
</eucode>
Which, when displayed, should look like:
{{{
Bill said
     "This is a back-slash \ character".
}}}

Raw string literals are useful when the string contains new-lines, tabs,
or back-slash characters because they do not have to be entered
in the special manner. The back-quote form can be used when the string literal
contains a set of three double-quote characters, and the triple quote form can
be used when the text literal contains back-quote characters. If a literal
contains both a back quote and a set of three double-quotes, you will need to
concatenate two literals.

<eucode>
object TQ, BQ, QQ
TQ = `This text contains """ for some reason.`
BQ = """This text contains a back quote ` for some reason."""
QQ = """This text contains a back quote ` """ & `and """ for some reason.`
</eucode>

**The rules for binary strings are:**
# They begin with the pair ##b"## and end with a double-quote (##"##) character
# They can only contain binary digits (0-1), and space, underscore,
tab, newline, carriage-return. Anything else is invalid.
# An underscore is simply ignored, as if it was never there. It is used to aid
readability.
# Each set of contiguous binary digits represents a single sequence element
# They can span multiple lines
# The non-digits are treated as punctuation and used to delimit individual
values.

<eucode>
b"1 10 11_0100 01010110_01111000" == {0x01, 0x02, 0x34, 0x5678}
</eucode>

**The rules for hexadecimal strings are:**
# They begin with the pair ##x"## and end with a double-quote (##"##) character
# They can only contain hexadecimal digits (0-9 A-F a-f), and space, underscore,
tab, newline, carriage-return. Anything else is invalid.
# An underscore is simply ignored, as if it was never there. It is used to aid
readability.
# Each pair of contiguous hex digits represents a single sequence element with a
value from 0 to 255
# They can span multiple lines
# The non-digits are treated as punctuation and used to delimit individual
values.
<eucode>
x"1 2 34 5678_AbC" == {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}
</eucode>


Character strings may be manipulated and operated upon just like any other
sequences. For example the string we first looked at "ABCDEFG" is entirely
equivalent to the sequence:

<eucode>
{65, 66, 67, 68, 69, 70, 71}
</eucode>

which contains the corresponding ASCII codes. The Euphoria compiler will
immediately convert "ABCDEFG" to the above sequence of numbers. In a sense,
there are no "strings" in Euphoria, only sequences of numbers. A quoted string
is really just a convenient notation that saves you from having to type in all
the ASCII codes.
@[emptyseq|]
It follows that "" is equivalent to {}. Both represent
the sequence of zero length, also known as the **empty sequence**. As
a matter of programming style, it is natural to use "" to suggest a zero length
sequence of characters, and {} to suggest some other kind of sequence.

An **individual character** is an **atom**. It must be entered using single
quotes. There is a difference between an individual character (which is an
atom), and a character string of length 1 (which is a sequence). e.g.

<eucode>
'B' -- equivalent to the atom 66 - the ASCII code for B
"B" -- equivalent to the sequence {66}
</eucode>

Again, ##'B'## is just a notation that is equivalent to typing ##66##. There
are no "characters" in Euphoria, just numbers (atoms). However, it is
possible to use characters without ever having to use their numerical
representation.

Keep in mind that an atom is //not// equivalent to a one-element sequence
containing the same value, although there are a few built-in routines that
choose to treat them similarly.
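
Because characters are just numbers, ordinary arithmetic works on them. The sketch below illustrates this for ASCII text; the variable names are arbitrary and not part of any library.

<eucode>
atom next_letter = 'A' + 1               -- 66, the code for 'B'
sequence shifted = "ABC" + 1             -- {66, 67, 68}, i.e. "BCD"
sequence lowered = "ABC" + ('a' - 'A')   -- "abc": each code is raised by 32

-- An atom and a one-element sequence are different objects:
? equal('B', "B")      -- prints 0 (false)
? equal({66}, "B")     -- prints 1 (true)
</eucode>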

====Escaped Characters====

Special characters may be entered using a back-slash:
|= Code |= Meaning |
| \n | newline |
| \r | carriage return |
| \t | tab |
| {{{\\}}} | backslash |
| \" | double quote |
| \' | single quote |
| \0 | null |
| \e | escape |
| \E | escape |
| \b/d..d/ | A binary coded value: the \b is followed by 1 or more binary digits. Inside strings, use the space character to delimit or end a binary value. |
| \x/hh/ | A 2-hex-digit value, e.g. "\x5F" ==> {95} |
| \u/hhhh/ | A 4-hex-digit value, e.g. "\u2A7C" ==> {10876} |
| \U/hhhhhhhh/ | An 8-hex-digit value, e.g. "\U8123FEDC" ==> {2166619868} |

For example, ##"Hello, World!\n"##, or ##'~\~\'##. The demonstration editor ##edx## displays character
strings in green.

Note that you can use the underscore character ##'_'## inside the
##\b##, ##\x##, ##\u##, and ##\U##
values to aid readability, e.g. ##"\U8123_FEDC" ==> {2166619868}##

=== Identifiers

An identifier is just the name you give something in your program. This can be
a variable, constant, function, procedure, parameter, or namespace. An identifier
must begin with either a letter or an underscore, followed by zero or more
letters, digits or underscore characters. There is no theoretical limit to how
long an identifier can be, but in practice it should be no more than about 30
characters.

Identifiers are **case-sensitive**. This means that ##Name## is a different
identifier from ##name## or ##NAME##.

Examples of valid identifiers~:
<eucode>
n
color26
ShellSort
quick_sort
a_very_long_identifier_that_is_really_too_long_for_its_own_good
_alpha
</eucode>

Examples of invalid identifiers~:
<eucode>
0n -- must not start with a digit
^color26 -- must not start with a punctuation character
Shell Sort -- must not contain spaces
quick-sort -- must consist only of letters, digits or underscores
</eucode>

@[source_comments|]
=== Comments
Comments are ignored by Euphoria and have no effect on execution speed.
The ##edx## editor displays comments in red.

There are three forms of comment text:

* The //line// format comment is started by two dashes and extends to the
end of the current line.

e.g.
<eucode>
-- This is a comment which extends to the end of this line only.
</eucode>

* The //multi-line// format comment is started by ##/*## and extends to the
next occurrence of ##*/##, even if that occurs on a different line.

e.g.
<eucode>
/* This is a comment which
extends over a number
of text lines.
*/
</eucode>

* On the first line only of your program, you can use a special comment
beginning with the two character sequence ###!##. This is mainly used to tell
//Unix// shells which program to execute the 'script' program with.

e.g.
<eucode>
#!/home/rob/euphoria/bin/eui
</eucode>

This informs the Linux shell that your file should be executed by the
Euphoria interpreter, and gives the full path to the interpreter. If you make
your file executable, you can run it just by typing its name, without the
need to type "##eui##". On //Windows// this line is just
treated as a comment (though the Apache web server on //Windows// does
recognize it). If your file is a shrouded ##.il## file, use ##eub.exe##
instead of ##eui##.

Line comments are typically used to annotate a single line (or small section) of
code, whereas multi-line comments are typically used to give larger pieces of
documentation inside the source text.

=== Expressions

Like other programming languages, Euphoria lets you calculate results by
forming expressions. However, in Euphoria you can perform calculations on
entire sequences of data with one expression, where in most other languages you
would have to construct a loop. In Euphoria you can handle a sequence much as
you would a single number. It can be copied, passed to a subroutine, or
calculated upon as a unit. For example,

<eucode>
{1,2,3} + 5
</eucode>

is an expression that adds the sequence ##{1,2,3}## and the atom ##5## to get
the resulting sequence ##{6,7,8}##.
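
As a sketch of what this replaces, here are a loop and a single expression that compute the same result. The variable names and temperature values are illustrative only.

<eucode>
sequence celsius = {0, 37, 100}

-- Element by element, as most languages require:
sequence fahrenheit_loop = repeat(0, length(celsius))
for i = 1 to length(celsius) do
    fahrenheit_loop[i] = celsius[i] * 9 / 5 + 32
end for

-- The whole sequence at once:
sequence fahrenheit = celsius * 9 / 5 + 32   -- {32, 98.6, 212}

? equal(fahrenheit, fahrenheit_loop)   -- prints 1 (true)
</eucode>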

We will see more examples later.

==== Relational Operators

The relational operators **##< > <= >= = !=##** each produce a ##1## (true) or
a ##0## (false) result.

<eucode>
8.8 < 8.7 -- 8.8 less than 8.7 (false)
-4.4 > -4.3 -- -4.4 greater than -4.3 (false)
8 <= 7 -- 8 less than or equal to 7 (false)
4 >= 4 -- 4 greater than or equal to 4 (true)
1 = 10 -- 1 equal to 10 (false)
8.7 != 8.8 -- 8.7 not equal to 8.8 (true)
</eucode>

As we will soon see, you can also apply these operators to sequences.

==== Logical Operators

The logical operators ##and##, ##or##, ##xor##, and ##not## are used to
determine the "truth" of an expression. e.g.

<eucode>
1 and 1 -- 1 (true)
1 and 0 -- 0 (false)
0 and 1 -- 0 (false)
0 and 0 -- 0 (false)

1 or 1 -- 1 (true)
1 or 0 -- 1 (true)
0 or 1 -- 1 (true)
0 or 0 -- 0 (false)

1 xor 1 -- 0 (false)
1 xor 0 -- 1 (true)
0 xor 1 -- 1 (true)
0 xor 0 -- 0 (false)

not 1 -- 0 (false)
not 0 -- 1 (true)
</eucode>

You can also apply these operators to numbers other than ##1## or ##0##. The
rule is: zero means false and non-zero means true. So for instance:

<eucode>
5 and -4 -- 1 (true)
not 6 -- 0 (false)
</eucode>

These operators can also be applied to sequences. See below.

In some cases [[:short_circuit]] evaluation will be used for expressions
containing ##and## or ##or##. Specifically, short circuiting applies inside decision making expressions. These are found in the [[:if statement]], [[:while statement]] and the [[:loop until statement]]. More on this later.
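
Short-circuit evaluation matters most when the right-hand operand is only safe to evaluate if the left-hand one is true. A minimal sketch, using a made-up variable ##line##:

<eucode>
sequence line = ""

-- In an if-condition, "and" stops as soon as the left side is false,
-- so line[1] is never evaluated for the empty sequence.
if length(line) > 0 and line[1] = '#' then
    puts(1, "comment line\n")
else
    puts(1, "not a comment line\n")
end if
</eucode>

Without short-circuiting, ##line[1]## would be evaluated on an empty sequence and cause a subscript error.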

==== Arithmetic Operators

The usual arithmetic operators are available: add, subtract, multiply, divide,
unary minus, unary plus.

<eucode>
3.5 + 3 -- 6.5
3 - 5 -- -2
6 * 2 -- 12
7 / 2 -- 3.5
-8.1 -- -8.1
+8 -- +8
</eucode>

Computing a result that is too big (i.e. outside of -1e300 to +1e300) will
result in one of the special atoms **+infinity** or **-infinity**. These appear
as ##inf## or ##-inf## when you print them out. It is also possible to generate
##nan## or ##-nan##. "nan" means "not a number", i.e. an undefined value (such
as ##inf## divided by ##inf##). These values are defined in the
IEEE floating-point standard. If you see one of these special values in your
output, it usually indicates an error in your program logic, although
generating inf as an intermediate result may be acceptable in some cases. For
instance, ##1/inf## is ##0##, which may be the "right" answer for your
algorithm.

Division by zero, as well as bad arguments to math library routines (e.g. the
square root of a negative number or the log of a non-positive number), causes an
immediate error message and your program is aborted.

The only reason that you might use unary plus is to emphasize to the reader of
your program that a number is positive. The interpreter does not actually
calculate anything for this.

==== Operations on Sequences

All of the relational, logical and arithmetic operators described above, as
well as the math routines described in [[:Language Reference]], can be applied
to sequences as well as to single numbers (atoms).

When applied to a sequence, a unary (one operand) operator is actually applied
to each element in the sequence to yield a sequence of results of the same
length. If one of these elements is itself a sequence then the same rule is
applied again recursively. e.g.

<eucode>
x = -{1, 2, 3, {4, 5}} -- x is {-1, -2, -3, {-4, -5}}
</eucode>

If a binary (two-operand) operator has operands which are both sequences then
the two sequences must be of the same length. The binary operation is then
applied to corresponding elements taken from the two sequences to get a
sequence of results. e.g.

<eucode>
x = {5, 6, 7, 8} + {10, 10, 20, 100}
-- x is {15, 16, 27, 108}
x = {{1, 2, 3}, {4, 5, 6}} + {-1, 0, 1} -- ERROR: 2 != 3
-- but
x = {{1, 2, 3} + {-1, 0, 1}, {4, 5, 6} + {-1, 0, 1}} -- CORRECT
-- x is {{0, 2, 4}, {3, 5, 7}}
</eucode>

If a binary operator has one operand which is a sequence while the other is a
single number (atom) then the single number is effectively repeated to form a
sequence of equal length to the sequence operand. The rules for operating on
two sequences then apply. Some examples:

<eucode>
y = {4, 5, 6}
w = 5 * y -- w is {20, 25, 30}

x = {1, 2, 3}
z = x + y -- z is {5, 7, 9}
z = x < y -- z is {1, 1, 1}

w = {{1, 2}, {3, 4}, {5}}
w = w * y -- w is {{4, 8}, {15, 20}, {30}}

w = {1, 0, 0, 1} and {1, 1, 1, 0} -- {1, 0, 0, 0}
w = not {1, 5, -2, 0, 0} -- w is {0, 0, 0, 1, 1}

w = {1, 2, 3} = {1, 2, 4} -- w is {1, 1, 0}

-- note that the first '=' is assignment, and the
-- second '=' is a relational operator that tests
-- equality
</eucode>

**Note:** When you wish to compare two strings (or other sequences), you
should **not** (as in some other languages) use the '=' operator:

<eucode>
if "APPLE" = "ORANGE" then -- ERROR!
</eucode>

'##=##' is treated as an operator, just like '##+##', '##*##' etc., so it is
applied to corresponding sequence elements, and the sequences must be the same length.
When they are of equal length, the result is a sequence of ones and zeros. When they
are not of equal length, the result is an error. Either way you'll get an error,
since an if-condition must be an atom, not a sequence. Instead you should use
the ##equal## built-in routine:

<eucode>
if equal("APPLE", "ORANGE") then -- CORRECT
</eucode>

In general, you can do relational comparisons using the ##compare## built-in
routine:

<eucode>
if compare("APPLE", "ORANGE") = 0 then -- CORRECT
</eucode>

You can use ##compare## for other comparisons as well:

<eucode>
if compare("APPLE", "ORANGE") < 0 then -- CORRECT
-- enter here if "APPLE" is less than "ORANGE" (TRUE)
</eucode>

Especially useful is the idiom ##compare(x, "") = 1## to determine whether ##x##
is a non-empty sequence. ##compare(x, "") = -1## would test for ##x## being an
atom, but ##atom(x) = 1## does the same faster and is clearer to read.
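
Putting ##equal## and ##compare## together, a three-way string comparison might be sketched as follows. The variables ##a## and ##b## are placeholders.

<eucode>
sequence a = "APPLE"
sequence b = "ORANGE"

if equal(a, b) then
    puts(1, "same\n")
elsif compare(a, b) < 0 then
    puts(1, "APPLE sorts before ORANGE\n")
else
    puts(1, "APPLE sorts after ORANGE\n")
end if

-- compare() returns -1, 0 or 1
? compare("abc", "abd")   -- prints -1
? compare("abc", "abc")   -- prints 0
? compare("b", "abc")     -- prints 1
</eucode>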

==== Subscripting of Sequences

A single element of a sequence may be selected by giving the element number in
square brackets. Element numbers start at 1. Non-integer subscripts are rounded
down to an integer.

For example, if ##x## contains ##{5, 7.2, 9, 0.5, 13}## then ##x[2]## is
##7.2##. Suppose we assign something different to ##x[2]##:

<eucode>
x[2] = {11,22,33}
</eucode>

Then ##x## becomes: ##{5, {11,22,33}, 9, 0.5, 13}##. Now if we ask for
##x[2]## we get ##{11,22,33}## and if we ask for ##x[2][3]## we get the
##atom## 33. If you try to subscript with a number that is outside of the range
##1## to the number of elements, you will get a subscript error. For example
##x[0]##, ##x[-99]## or ##x[6]## will cause errors. So will ##x[1][3]## since
##x[1]## is not a sequence. There is no limit to the number of subscripts that
may follow a variable, but the variable must contain sequences that are nested
deeply enough. The two dimensional array, common in other languages, can be
easily represented with a sequence of sequences:

<eucode>
x = {
{5, 6, 7, 8, 9}, -- x[1]
{1, 2, 3, 4, 5}, -- x[2]
{0, 1, 0, 1, 0} -- x[3]
}
</eucode>

where we have written the numbers in a way that makes the structure
clearer. An expression of the form x[i][j] can be used to access any element.

The two dimensions are not symmetric however, since an entire "row" can be
selected with x[i], but you need to use [[:vslice]] in the Standard Library
to select an entire column. Other logical structures, such as n-dimensional
arrays, arrays of strings, structures, arrays of structures etc. can also be
handled easily and flexibly:

3-D array:

<eucode>
y = {
{{1,1}, {3,3}, {5,5}},
{{0,0}, {0,1}, {9,1}},
{{-1,9},{1,1}, {2,2}}
}

-- y[2][3][1] is 9
</eucode>

Array of strings:

<eucode>
s = {"Hello", "World", "Euphoria", "", "Last One"}

-- s[3] is "Euphoria"
-- s[3][1] is 'E'
</eucode>

A Structure:

<eucode>
employee = {
{"John","Smith"},
45000,
27,
185.5
}
</eucode>

To access "fields" or elements within a structure it is good programming style
to make up an enum that names the various fields. This will make your program
easier to read. For the example above you might have:

<eucode>
enum NAME, SALARY, AGE, WEIGHT
enum FIRST_NAME, LAST_NAME

employees = {
{{"John","Smith"}, 45000, 27, 185.5}, -- employees[1]
{{"Bill","Jones"}, 57000, 48, 177.2}, -- employees[2]
-- .... etc.
}

-- employees[2][SALARY] would be 57000.
</eucode>
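
The same names work for reading and updating a single record. The following is a minimal sketch; the ##employee## variable and its values are illustrative.

<eucode>
enum NAME, SALARY, AGE, WEIGHT
enum FIRST_NAME, LAST_NAME

sequence employee = {{"Bill", "Jones"}, 57000, 48, 177.2}

-- Read fields by name instead of by number.
printf(1, "%s %s earns %d\n",
    {employee[NAME][FIRST_NAME], employee[NAME][LAST_NAME], employee[SALARY]})

-- Update a field in place.
employee[AGE] += 1     -- AGE is now 49
</eucode>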

The ##length## built-in function will tell you how many elements are in a
sequence. So the last element of a sequence ##s##, is:

<eucode>
s[length(s)]
</eucode>

A short-hand for this is:

<eucode>
s[$]
</eucode>

Similarly,

<eucode>
s[length(s)-1]
</eucode>

can be simplified to:

<eucode>
s[$-1]
</eucode>

The ##$## may only appear between square brackets, and it equals the length of the
sequence that is being subscripted. Subscripts can be nested, e.g.:

<eucode>
s[$ - t[$-1] + 1]
</eucode>

The first ##$## above refers to the length of ##s##, while the second ##$##
refers to the length of ##t## (as you'd probably expect). An example where
##$## can save a lot of typing, make your code clearer, and probably even faster
is:

<eucode>
longname[$][$] -- last element of the last element
</eucode>

Compare that with the equivalent:

<eucode>
longname[length(longname)][length(longname[length(longname)])]
</eucode>

**Subscripting and function side-effects:**

In an assignment statement,
with left-hand-side subscripts:

<eucode>
lhs_var[lhs_expr1][lhs_expr2]... = rhs_expr
</eucode>

The expressions are evaluated, and any subscripting is performed, from left
to right. It is possible to have function calls in the right-hand-side
expression, or in any of the left-hand-side expressions. If a function call
has the side-effect of modifying the lhs_var, it is not defined whether those
changes will appear in the final value of the lhs_var, once the assignment has
been completed. To be sure about what is going to happen, perform the function
call in a separate statement, i.e. do not try to modify the lhs_var in two
different ways in the same statement. Where there are no left-hand-side
subscripts, you can always assume that the final value of the lhs_var will be
the value of rhs_expr, regardless of any side-effects that may have changed
lhs_var.
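
A sketch of the safe pattern, assuming a hypothetical function ##grow## that modifies the variable being assigned to:

<eucode>
sequence table = {1, 2, 3}

-- Hypothetical function with a side effect on 'table'.
function grow()
    table = append(table, 0)
    return length(table)
end function

-- Unclear: table[1] = grow()   -- modifies 'table' twice in one statement
-- Clear: call first, then assign.
integer n = grow()     -- table is {1, 2, 3, 0}, n is 4
table[1] = n           -- table is now {4, 2, 3, 0}
</eucode>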

**Euphoria data structures are almost infinitely flexible.**

Arrays in many
languages are constrained to have a fixed number of elements, and those
elements must all be of the same type. Euphoria eliminates both of those
restrictions by defining all arrays (sequences) as a list of zero or more
Euphoria objects whose element count can be changed at any time.
You can easily add a new structure to the employee sequence
above, or store an unusually long name in the NAME field and Euphoria will take
care of it for you. If you wish, you can store a variety of different employee
"structures", with different sizes, all in one sequence. However, when you
retrieve a sequence element, it is not guaranteed to be of any particular type. You, as a
programmer, need to check that the retrieved data is of the type you'd expect;
Euphoria will not. The only thing it will check is whether an assignment is
legal. For example, if you try to assign a sequence to an integer variable,
Euphoria will complain at the time your code does the assignment.

Not only can a Euphoria program represent all conventional data
structures but you can create very useful, flexible structures that would be
hard to declare in many other languages.

Note that expressions in general may not be subscripted, just variables. For
example: ##{5+2,6-1,7*8,8+1}[3]## is //not// supported, nor is something like:
##date()[MONTH]##. You have to assign the sequence returned by ##date## to a
variable, then subscript the variable to get the month.
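
For instance, a minimal sketch using the built-in ##date##, whose result holds the month as its second element (the variable names are arbitrary):

<eucode>
-- date() returns {year, month, day, hour, minute, second,
--                 day of the week, day of the year}
sequence now = date()
integer month = now[2]
printf(1, "month number: %d\n", month)
</eucode>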

==== Slicing of Sequences

A sequence of consecutive elements may be selected by giving the starting and
ending element numbers. For example if ##x## is ##{1, 1, 2, 2, 2, 1, 1, 1}##
then ##x[3..5]## is the sequence ##{2, 2, 2}##. ##x[3..3]## is the sequence
##{2}##. ##x[3..2]## is also allowed. It evaluates to the zero length sequence
##{}##. If ##y## has the value: ##{"fred", "george", "mary"}## then
##y[1..2]## is ##{"fred", "george"}##.

We can also use slices for overwriting portions of variables. After
##x[3..5] = {9, 9, 9}## ##x## would be ##{1, 1, 9, 9, 9, 1, 1, 1}##. We could
also have said ##x[3..5] = 9## with the same effect. Suppose ##y## is
##{0, "Euphoria", 1, 1}##. Then ##y[2][1..4]## is ##"Euph"##. If we say
##y[2][1..4] = "ABCD"## then ##y## will become ##{0, "ABCDoria", 1, 1}##.

In general, a variable name can be followed by 0 or more subscripts, followed
in turn by 0 or 1 slices. Only variables may be subscripted or sliced, not
expressions.

We need to be a bit more precise in defining the rules for **empty
slices**. Consider a slice ##s[i..j]## where ##s## is of length ##n##. A slice
from ##i## to ##j##, where ##j = i - 1## and ##i >= 1## produces the
[[:emptyseq "empty sequence"]],
even if ##i = n + 1##. Thus
##1..0## and ##n + 1..n## and everything in between are legal
**(empty) slices**. Empty
slices are quite useful in many algorithms. A slice from ##i## to ##j## where
##j < i - 1## is illegal, i.e. "reverse" slices such as ##s[5..3]## are not
allowed.
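
The legal boundary cases can be sketched as follows, using an arbitrary five-element string:

<eucode>
sequence s = "abcde"       -- length 5
integer n = length(s)

? length(s[1..0])          -- prints 0: empty slice before the first element
? length(s[3..2])          -- prints 0: empty slice in the middle
? length(s[n+1..n])        -- prints 0: empty slice after the last element

-- s[5..3] would stop the program with a slice error.
</eucode>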

We can also use the ##$## shorthand with slices, e.g.

<eucode>
s[2..$]
s[5..$-2]
s[$-5..$]
s[$][1..floor($/2)] -- first half of the last element of s
</eucode>

==== Concatenation of Sequences and Atoms - The '&' Operator ====
@[amp concat|]
@[amp_concat|]

Any two objects may be concatenated using the **&** operator. The
result is a sequence with a length equal to the sum of the lengths of the
concatenated objects.
e.g.

<eucode>
{1, 2, 3} & 4 -- {1, 2, 3, 4}

4 & 5 -- {4, 5}

{{1, 1}, 2, 3} & {4, 5} -- {{1, 1}, 2, 3, 4, 5}

x = {}
y = {1, 2}
y = y & x -- y is still {1, 2}
</eucode>

You can delete element ##i## of any sequence ##s## by concatenating the parts of the
sequence before and after ##i##:

<eucode>
s = s[1..i-1] & s[i+1..length(s)]
</eucode>

This works even when ##i## is ##1## or ##length(s)##, since ##s[1..0]## is a
legal empty slice, and so is ##s[length(s)+1..length(s)]##.
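
Wrapped in a function, the idiom might look like the sketch below. The name ##delete_at## is hypothetical and not a built-in routine.

<eucode>
-- Hypothetical helper: returns s without its i-th element.
function delete_at(sequence s, integer i)
    return s[1..i-1] & s[i+1..length(s)]
end function

? equal(delete_at({10, 20, 30}, 1), {20, 30})   -- prints 1: first element removed
? equal(delete_at({10, 20, 30}, 2), {10, 30})   -- prints 1: middle element removed
? equal(delete_at({10, 20, 30}, 3), {10, 20})   -- prints 1: last element removed
</eucode>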

==== Sequence-Formation

Finally, sequence-formation, using braces and commas:

<eucode>
{a, b, c, ... }
</eucode>

is also an operator. It takes n operands, where ##n## is ##0## or more, and
makes an n-element sequence from their values. e.g.

<eucode>
x = {apple, orange*2, {1,2,3}, 99/4+foobar}
</eucode>

The sequence-formation operator is listed at the bottom of the
[[:precedence chart]].

==== Multiple Assignment

Special sequence notation on the left hand side of an assignment can be made to
assign to multiple variables with a single statement. This can be useful for
using functions that return multiple values in a sequence, such as ##[[:value]]##.

<eucode>
atom success, val

{ success, val } = value( "100" )

-- success = GET_SUCCESS
-- val = 100
</eucode>

It is also possible to ignore some of the values in the right hand side. Any
elements beyond the number supplied on the left hand side are ignored. Other
values can also be ignored by using a question mark ('##?##') instead of a variable
name:

<eucode>
{ ?, val } = value( "100" )
</eucode>

Variables may only appear once on the left hand side, however, they may appear
on both the left and right hand side. For instance, to swap the values of two
variables:

<eucode>
{ a, b } = { b, a }
</eucode>

==== Other Operations on Sequences

Some other important operations that you can perform on sequences have English
names, rather than special characters. These operations are built-in to
**eui.exe/euiw.exe**, so they'll always be there, and so they'll be fast. They
are described in detail in the [[:Language Reference]], but are
important enough to Euphoria programming that we should mention them here before
proceeding. You call these operations as if they were subroutines, although
they are actually implemented much more efficiently than that.

===== length(sequence s)

Returns the length of a sequence s.

This is the number of elements in s. Some of these elements may be
sequences that contain elements of their own, but ##length## just gives you the
"top-level" count. Note however that the length of an atom is always ##1##.
e.g.

<eucode>
length({5,6,7}) -- 3
length({1, {5,5,5}, 2, 3}) -- 4 (not 6!)
length({}) -- 0
length(5) -- 1
</eucode>
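
To make the "top-level" point concrete, the sketch below contrasts ##length## with a recursive count of every atom in a nested sequence. The function ##count_atoms## is a hypothetical helper written for this illustration, not a built-in:

<eucode>
-- Hypothetical helper: counts every atom at any nesting depth.
function count_atoms(object x)
    integer total = 0
    if atom(x) then
        return 1
    end if
    for i = 1 to length(x) do
        total += count_atoms(x[i])
    end for
    return total
end function

? length({1, {5,5,5}, 2, 3})      -- 4 (top-level elements only)
? count_atoms({1, {5,5,5}, 2, 3}) -- 6 (all atoms, at any depth)
</eucode>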

===== repeat(object o1, integer count)

Returns a sequence that consists of an item repeated count times.
e.g.

<eucode>
repeat(0, 100) -- {0,0,0,...,0} i.e. 100 zeros
repeat("Hello", 3) -- {"Hello", "Hello", "Hello"}
repeat(99,0) -- {}
</eucode>

The item to be repeated can be any atom or sequence.
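
Because the repeated item can itself be a sequence, nesting calls to ##repeat## is one way to build a table filled with an initial value. A small sketch, where the variable name ##grid## is only an example:

<eucode>
-- 2 rows of 3 zeros each
sequence grid = repeat(repeat(0, 3), 2)
-- grid is {{0,0,0},{0,0,0}}

grid[1][2] = 5
-- grid is {{0,5,0},{0,0,0}}; the second row is unchanged
</eucode>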

===== append(sequence s1, object o1)

Returns a sequence by adding an object o1 to the end of a sequence
s1.

<eucode>
append({1,2,3}, 4) -- {1,2,3,4}
append({1,2,3}, {5,5,5}) -- {1,2,3,{5,5,5}}
append({}, 9) -- {9}
</eucode>

The length of the new sequence is always 1 greater than the length of
the original sequence. The item to be added to the sequence can be any atom or
sequence.

===== prepend(sequence s1, object o1)

Returns a new sequence by adding an object o1 to the beginning of a
sequence s1. e.g.

<eucode>
append({1,2,3}, 4) -- {1,2,3,4}
prepend({1,2,3}, 4) -- {4,1,2,3}

append({1,2,3}, {5,5,5}) -- {1,2,3,{5,5,5}}
prepend({1,2,3}, {5,5,5}) -- {{5,5,5},1,2,3}

prepend({}, 9) -- {9}
append({}, 9) -- {9}
</eucode>

The length of the new sequence is always one greater than the length of
the original sequence. The item to be added to the sequence can be any atom or
sequence.

These two built-in functions, ##append## and
##prepend##, have some similarities to the concatenate operator,
##&##, but there are clear differences. e.g.

<eucode>
-- appending a sequence is different
append({1,2,3}, {5,5,5}) -- {1,2,3,{5,5,5}}
{1,2,3} & {5,5,5} -- {1,2,3,5,5,5}

-- appending an atom is the same
append({1,2,3}, 5) -- {1,2,3,5}
{1,2,3} & 5 -- {1,2,3,5}
</eucode>
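
The difference matters most when building a sequence up step by step. In the sketch below (variable names are illustrative only), ##append## collects whole strings as elements, while ##&## merges their characters into one string:

<eucode>
sequence names = {}
names = append(names, "Ann")
names = append(names, "Bob")
-- names is {"Ann", "Bob"}, length 2

sequence chars = ""
chars = chars & "Ann"
chars = chars & "Bob"
-- chars is "AnnBob", length 6
</eucode>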

===== insert(sequence in_what, object what, atom position)

This function takes a target sequence, ##in_what##, shifts the elements from
##position## onward one place to the right, and places the object ##what## in
the gap just created. The modified sequence is returned. For instance:

<eucode>
s = insert("Joe",'h',3) -- s is "Johe", another string
s = insert("Joe","h",3) -- s is {'J','o',{'h'},'e'}, not a string
s = insert({1,2,3},4,-0.5) -- s is {4,1,2,3}, like prepend()
s = insert({1,2,3},4,8.5) -- s is {1,2,3,4}, like append()
</eucode>

The length of the returned sequence is one more than the length of ##in_what##.
This is the same rule as for ##append## and ##prepend## above, which are
actually special cases of ##insert##.
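
The special-case relationship can be checked directly. This is a short sketch using the built-in ##equal##; the variable ##s## is just an example value:

<eucode>
sequence s = {1, 2, 3}

-- inserting at position 1 behaves like prepend()
? equal(prepend(s, 0), insert(s, 0, 1))            -- 1 (true)

-- inserting just past the last element behaves like append()
? equal(append(s, 0), insert(s, 0, length(s) + 1)) -- 1 (true)
</eucode>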

===== splice(sequence in_what, object what, atom position)

If ##what## is an atom, this is the same as ##insert##. But if ##what## is a
sequence, that sequence is inserted as successive elements into ##in_what##
at ##position##. Example:

<eucode>
s = splice("Joe",'h',3)
-- s is "Johe", like insert()
s = splice("Joe","hn Do",3)
-- s is "John Doe", another string
s = splice("Joh","n Doe",9.3)
-- s is "John Doe", like with the & operator
s = splice({1,2,3},4,-2)
-- s is {4,1,2,3}, like with the & operator in reversed order
</eucode>

The length of ##splice(in_what, what, position)## is always ##length(in_what)
+ length(what)##, like for concatenation using ##&##.
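
A brief sketch of the length rule and of the link to ##&##, using the strings from the examples above (the variable names are illustrative):

<eucode>
sequence base = "Joe"
sequence extra = "hn Do"
sequence joined = splice(base, extra, 3)
-- joined is "John Doe"

? length(joined) = length(base) + length(extra)                      -- 1 (true): 8 = 3 + 5
? equal(splice(base, extra, 1), extra & base)                        -- 1 (true)
? equal(splice(base, extra, length(base) + 1), base & extra)         -- 1 (true)
</eucode>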

=== Precedence Chart

When two or more operators follow one another in an expression, there must be
rules to tell in which order they should be evaluated, as different orders
usually lead to different results. It is common and convenient to use a
**precedence order** on operators. Operators with the highest degree of
precedence are evaluated first, then those with highest precedence
among what remains, and so on.

The precedence of operators in expressions is as follows:

**highest precedence**

{{{
function/type calls
unary- unary+ not
* /
+ -
&
< > <= >= = !=
and or xor
}}}

**lowest precedence**
{{{
{ , , , }
}}}

Thus ##2+6*3## means ##2+(6*3)## rather than ##(2+6)*3##. Operators on the same
line above have equal precedence and are evaluated left to right. For instance,
##6/3*5## is ##2*5##, not ##6/15##. You can force any order of operations by
placing round brackets ##( )## around an expression.

Different languages or contexts may have slightly different precedence rules.
You should be careful when translating a formula from one language to another;
Euphoria is no exception. Adding superfluous parentheses to make the order of
evaluation explicit does not cost much, and may help readers who are used to
some other precedence chart, or anyone translating to or from a context with
slightly different rules. Watch out in particular for ##and## and ##or##, which
have equal precedence in the chart above, and for ##*## and ##/##.
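
The following sketch applies the chart to a few expressions. Since ##and## and ##or## sit on the same line, they are evaluated left to right, which differs from languages where "and" binds more tightly than "or":

<eucode>
? 2 + 6 * 3      -- 20: * is evaluated before +
? 6 / 3 * 5      -- 10: / and * have equal precedence, left to right
? 1 or 0 and 0   -- 0: evaluated as (1 or 0) and 0
? 1 or (0 and 0) -- 1: parentheses force the other grouping
</eucode>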

The equals symbol ##'='## used in an [[:assignment statement]] is not an
operator; it is just part of the syntax of the language.
