OpenEuphoria: Wiki View

Values

integer atom boolean character string

integer atom sequence object user defined

integer		programming tip
range	`-1_073_741_824` to `+1_073_741_823`
use for	speed, reduced memory
not for	32-bit machine addresses	(use atom instead)
caution	additions and multiplications can exceed the size limit of an integer	(assign these values to an atom)

atom		programming tip
range	integers -`power(2,53)` to +`power(2,53)`	`power(2,53)` is slightly above 9×10¹⁵
	floating-point -`power(2,1024)+1` to +`power(2,1024)-1`	`power(2,1024)` is in the 10³⁰⁸ range
	large integers	(up to about 15 digits)
use for	general calculations
	integer values, when overflow may occur	(assign result of large integer additions and multiplications to atom)
	32-bit machine addresses	(never use integer type for this!)
	large valued integers	(rather than overflowing, large integers are converted to floating-point)
caution	floating-point values are limited by accuracy of computer hardware	(see: Floating-Point Math)

integer

An Euphoria integer is a mathematical integer restricted to the range -1,073,741,824 to +1,073,741,823. As a result, a variable of the integer type, while allowing computations as fast as possible, cannot hold 32-bit machine addresses, even though the latter are mathematical integers. You must use the atom type for this purpose. Also, even though the product of two integers is a mathematical integer, it may not fit into an integer, and should be assigned to an atom instead.
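
The overflow rule can be sketched as follows. This is a minimal illustration, assuming the 31-bit integer range described above; the variable names are made up for the example. The built-in type functions integer() and atom() report whether a value fits each type:

```euphoria
-- Largest value an integer can hold in the range described above.
integer big = 1073741823

-- big + 1 is still a mathematical integer, but it no longer fits
-- in an integer variable, so the result is stored in an atom.
atom total = big + 1

? integer(big)    -- 1: fits in an integer
? integer(total)  -- 0 under the 31-bit range assumed here
? atom(total)     -- 1: every integer value is also an atom
```

Declaring total as an integer instead would stop the program with a type-check failure under the same assumption.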

atom

An atom can hold three kinds of data:

Mathematical integers in the range -power(2,53) to +power(2,53)
Floating point numbers, in the range -power(2,1024)+1 to +power(2,1024)-1
Large mathematical integers in the same range as floating-point numbers, but with a "fuzz" (rounding error) that grows with the magnitude of the integer.

power(2,53) is slightly above 9×10¹⁵; power(2,1024) is in the 10³⁰⁸ range.
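
The "fuzz" can be seen right at the edge of the exact integer range. This is a small sketch, assuming atoms are stored as IEEE 754 double-precision values with 53 bits of precision; builds that store atoms differently may behave differently:

```euphoria
atom limit = power(2, 53)   -- 9007199254740992

-- Below the limit, consecutive integers are still distinct.
? (limit - 1) = limit       -- 0

-- Above the limit, adding 1 is lost to rounding under the
-- double-precision assumption stated above.
? (limit + 1) = limit       -- 1
```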

Floating-Point Math

Euphoria follows the IEEE 754 standard for floating-point arithmetic. There is an intrinsic limit to the accuracy of calculations on any computer.

Because of these constraints, which arise in part from common hardware limitations, some care is needed for specific purposes:

The sum or product of two integers is an atom, but may not be an integer.
Memory addresses, or handles acquired from anything non-Euphoria, including the operating system, must be stored as an atom.
For large numbers, ordinary operations may yield inexact results.
Calculations with numbers that differ by a tiny amount can yield inexact results.
Calculations requiring many computations may suffer from rounding and truncation errors.

This example, using algebra, has the exact result of one:

   n*n - (n+1)*(n-1) 
 = n*n - ( n*n - n + n - 1 ) 
 = n*n - n*n + 1 
 = 1

A computer can also be used to perform this calculation. For most values of n the correct answer is displayed. But n has a size limit; if n is too big, the precision of the computer hardware is exceeded:

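integer n = power(2, 27)       -- ok: 134,217,728 fits in an integer
integer n_plus = n + 1         -- ok
integer n_minus = n - 1        -- ok
atom a = n * n                 -- ok: power(2,54) is stored exactly
atom a1 = n_plus * n_minus     -- no overflow, but power(2,54)-1 needs 54 bits
? a - a1                       -- prints 0, should be 1 mathematically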

This is not an Euphoria bug. The IEEE 754 standard for floating point numbers provides for 53 bits of precision for any real number, and an accurate computation of a-a1 would require 54 of them. Intel FPU chips do have 64 bit precision registers, but the low order 16 bits are only used internally, and Intel recommends against using them for high precision arithmetic. Their SIMD machine instruction set only uses the IEEE 754 defined format.

Sequence

A sequence is a container type. A sequence has elements that can be accessed through their index, as in my_sequence[3]. Sequences are generic enough to store all sorts of data structures: strings, trees, lists, anything. Accesses to sequences are always bounds-checked, so you can never read or write an element that does not exist. A large number of extraction and shape-changing operations on sequences are available, both as built-in operations and as library routines. The elements of a sequence can have any type.

Euphoria sequences are implemented very efficiently. Programmers used to pointers will soon notice that they can get most usual pointer operations done using sequence indexes. The loss in efficiency is usually hard to notice, and the gain in code safety and bug prevention far outweighs it.
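
A brief sketch of the sequence operations described above: mixed element types, nesting, indexing, and growth. The variable name and values are illustrative only:

```euphoria
-- Elements may be atoms, strings, or other sequences.
sequence s = {1, 2.5, "three", {4, {5}}}

? length(s)     -- 4
? s[1]          -- 1
? s[4][2][1]    -- 5 (subscripts can be chained into nested sequences)

s[2] = "changed"    -- an element can be replaced by a value of any type
s = append(s, 6)    -- sequences grow as needed
? length(s)     -- 5
? s[$]          -- 6 ($ is shorthand for the last index)

-- s[6] would stop the program with a subscript error,
-- because accesses are always bounds-checked.
```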

Object

This type can hold any data Euphoria can handle, both atoms and sequences.

The object() type returns 0 if a variable is not initialized, else 1.
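
A short sketch of an object variable holding first an atom and then a sequence. The built-in atom() and sequence() type functions tell which kind of value is currently stored; the variable name is illustrative only:

```euphoria
object x

x = 42
? atom(x)       -- 1
? sequence(x)   -- 0

x = "forty-two" -- the same variable can later hold a sequence
? atom(x)       -- 0
? sequence(x)   -- 1
```

Declaring a variable as object is a reasonable choice when a routine has to accept either kind of value and decide at run time how to handle it.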

Boolean Values

Euphoria has no special "boolean data-type."

In Euphoria, a boolean value is just the interpretation of a number value that could be assigned to either an atom or integer. A value is false if it is zero. A value is true if it is non-zero.

In some libraries, there may be constants pre-defined as:

FALSE = 0 
TRUE = 1

Flow control statements depend on atomic values when a true or false result is needed; never a sequence. If a sequence is to be part of a conditional test, it must be used in a function that returns an atomic result.

In this documentation true and false are used to describe conditional and flow structures. The proper atomic value must actually be used for these tests.

A boolean expression is any expression that will evaluate to an atomic value, and is interpreted as true or false.
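
The rules above can be sketched as follows. The constants are defined locally for this example rather than taken from any library, and the variable names are illustrative. Note how each test involving a sequence goes through a function that returns an atom:

```euphoria
constant FALSE = 0, TRUE = 1    -- defined locally for this sketch

integer count = 3
sequence name = "Euphoria"

if count then                       -- any non-zero atom counts as true
    puts(1, "count is non-zero\n")
end if

if equal(name, "Euphoria") then     -- equal() returns an atom: 1 or 0
    puts(1, "names match\n")
end if

if length(name) = 0 then            -- length() also returns an atom
    puts(1, "name is empty\n")
else
    puts(1, "name is not empty\n")
end if

? (count > 2) = TRUE                -- 1
? equal(name, "") = FALSE           -- 1

-- Writing "if name then" would be an error:
-- the condition would be a sequence, not an atom.
```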

See also: relational operators and control flow.

Character and String

Euphoria is easy to use for programming with text based data.

All of the usual Euphoria routines and operations work on text just like they work on numbers. There are also libraries of routines designed to make working with text easy: string centric routines and regular expressions.

Character	String
'd' '3' '-' '&' 'f' 'G'	"Hello world" "this is a string"

Input	Output
`gets()`	`puts()`

Character

A character is one individual symbol such as a letter, digit, punctuation, dingbat, ..., that we use for written communications.

An individual character may be written using single quote ' delimiters:

'a'    'A'    '['    '#'

They may be assigned to either an integer or atom:

atom char = 'a' 
integer pound = '#'

There is no special "character data-type" in Euphoria. The standard ASCII chart assigns a number to each character. These number values are used in Euphoria to represent characters.

Euphoria converts all character values to their numeric equivalents; only number values are stored:

? 'a' 
    -- 97 appears, not 'a' 
? 'A' 
    -- 65 appears, not 'A'

It is easy to display character values using puts():

puts(1, 'a' ) 
    -- a    <-- appears         
puts(1, 'A' ) 
    -- A    <-- appears

There is no automatic way to distinguish the value 97 intended to be the number 'ninety-seven' and the value 'a' intended to be the character a. All values are numbers.

include std/console.e
display( 'a' )
display( 97  )
    -- output for both examples is:
    -- 97
    -- 97

Therefore 'B' is just a notation that is equivalent to typing 66. There are no "characters" in Euphoria, just numbers (atoms).

Values representing characters may be manipulated and operated on just like any other numerical value--they are numerical values.
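
Because characters are numbers, ordinary arithmetic works on them. The following sketch converts a lower-case letter to upper case by subtracting the distance between the two ASCII ranges; the variable names are illustrative only:

```euphoria
integer lower = 'a'                   -- stored as 97
integer upper = lower - ('a' - 'A')   -- 97 - 32 = 65

? upper             -- 65
puts(1, upper)      -- A
puts(1, '\n')

? 'c' - 'a'         -- 2: the distance between two letters
```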

Character atoms combine to make string sequences. Both examples represent the same Euphoria sequence:

{ 'H','e','l','l','o',' ','W','o','r','l','d' } 
"Hello World"

Escaped Characters

Special characters may be entered using a back-slash:

Code	Meaning
\n	newline
\r	carriage return
\t	tab
\\	backslash
\"	double quote
\'	single quote
\0	null
\e	escape
\E	escape
\b/d..d/	A binary coded value. The \b is followed by one or more binary digits. Inside strings, use a space character to delimit the end of a binary value.
\x/hh/	A 2-hex-digit value: "\x5F" ==> {95}
\u/hhhh/	A 4-hex-digit value: "\u2A7C" ==> {10876}
\U/hhhhhhhh/	An 8-hex-digit value: "\U8123FEDC" ==> {2166619868}

For example, "Hello, World!\n", or '\\'. The Euphoria editor displays character strings in green.

Sometimes the special characters are described as "non-printing characters" because, while they control the layout of a display, nothing appears when they are output.

Note that you can use the underscore character '_' inside the \b, \x, \u, and \U values to aid readability:

\U8123_FEDC   -- as written using spacer _  
{2166619868}  -- value as stored
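
A quick check of the most common escape codes from the table above. Each escape, however it is typed, is stored as a single character value:

```euphoria
? '\n'                  -- 10
? '\t'                  -- 9
? length("a\tb\n")      -- 4: each escape becomes a single character
? equal("\\", {92})     -- 1: one back-slash character
? equal("\"", {34})     -- 1: one double-quote character
```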

String

A string is a sequence of character values. There is no special "string data-type" in Euphoria.

A string sequence may be written using double-quote " delimiters:

"ABCDEFG"

A string is just like any other sequence in Euphoria. Each character in a string is converted to its numerical value. Strings may be manipulated and operated on the same way as all other sequences in Euphoria.

The string "ABCDEFG" is entirely equivalent to the sequence:

{65, 66, 67, 68, 69, 70, 71}

puts(1, "ABCDEFG" ) 
   -- ABCDEFG  <-- appears on output 
print(1, "ABCDEFG" ) 
   -- {65,66,67,68,69,70,71} <-- appears on output

A quoted string is really just a convenient notation that saves you from having to type in all the ASCII codes. It follows that "" is equivalent to {}. Both represent the sequence of zero length, also known as the empty sequence. As a matter of programming style, it is natural to use "" to suggest a zero length sequence of characters, and {} to suggest some other kind of sequence.

An individual character is an atom. It must be entered using single quotes. There is a difference between an individual character (which is an atom), and a character string of length one (which is a sequence):

'B'   -- equivalent to the atom 66 -- the ASCII code for B 
"B"   -- equivalent to the sequence {66}

Keep in mind that an atom is not equivalent to a one-element sequence containing the same value, although there are a few built-in routines that choose to treat them similarly.
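
The distinction between a character and a one-character string can be checked directly with the built-in atom(), sequence(), and equal() routines:

```euphoria
? atom('B')             -- 1
? sequence("B")         -- 1
? equal('B', "B")       -- 0: an atom is not a one-element sequence
? equal("B", {66})      -- 1
? equal("", {})         -- 1: both are the empty sequence
? length("B")           -- 1
```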

Some routines are able to make an intelligent guess as to whether a sequence is intended to be a string as opposed to a numerical sequence:

include std/console.e

? "Hello World"
    -- {72,101,108,108,111,32,87,111,114,108,100}  <-- appears
    -- remember that all sequences are numeric

display( "Hello World" )
    -- Hello World  <-- appears
    -- string appears as expected

display( {72,101,108,108,111,32,87,111,114,108,100} )
    -- Hello World  <-- appears
    -- appears as a string since all element values are character values

display( {72,101,108,108,111,32,87,111,114,108,100.1} )
    -- {72,101,108,108,111,32,87,111,114,108,100.1}  <-- appears
    -- last element in the sequence is not a character value,
    -- so the sequence is output as a numerical sequence

Hint

Escaped characters may be written directly into a string, for characters not available on the keyboard, and to control the layout of the string:

puts(1, "This sentence\nis displayed\nover three lines" ) 
-- 
-- the '\n' escaped character creates line breaks 
--  
--This sentence 
--is displayed 
--over three lines

In a real and practical program it is possible to input, create, manipulate, and then finally output strings, all without ever having to consider their numerical basis. Euphoria lets you think in terms of 'values' rather than specialized 'data-types'. String values can be manipulated just like any other sequence. This generic quality of Euphoria makes programming simple and easy.

Character Strings and Individual Characters

A string in Euphoria is just a sequence of characters. That means that individual characters may be indexed and manipulated just like the elements of any other sequence.
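
A minimal sketch of indexing, slicing, and concatenating a string with the ordinary sequence operations; the variable name is illustrative only:

```euphoria
sequence word = "hello"

? word[1]                       -- 104, the code for 'h'
word[1] = 'H'                   -- replace a single character
puts(1, word & '\n')            -- Hello

puts(1, word[2..4] & '\n')      -- ell (a slice is itself a string)
puts(1, word & ", world\n")     -- Hello, world
? length(word)                  -- 5
```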

To make working with strings easy, there are a variety of ways to enter string values into a sequence:

Delimiter		Notation	Example
Left	Right
`"`	`"`	double-quotes	"ABCDEFG"
`	`	back-quotes	`ABCDEFG`
`"""`	`"""`	three double-quotes	"""ABCDEFG"""
`b"`	`"`	binary byte strings	b"1001 00110110 0110_0111 1_0101_1010" -- ==> {#9,#36,#67,#15A}
`x"`	`"`	hexadecimal byte strings	x"65 66 67 AE" -- ==> {#65,#66,#67,#AE}

Double-Quote Strings

They begin and end with a double-quote " character
They cannot contain a double-quote
They must be only on a single line
They cannot contain the TAB character
If they contain the back-slash '\' character, that character must immediately be followed by one of the special escape codes. The back-slash and escape code will be replaced by the appropriate single character equivalent.

If you need to include double-quote, end-of-line, back-slash, or TAB characters inside a double-quoted string, you need to enter them using the escape codes \", \n, \\, and \t.

Examples:

"Bill said\n\t\"This is a back-slash \\ character\".\n"

Which, when displayed, should look like ...

Bill said 
    "This is a back-slash \ character".

Raw Strings

They are enclosed in three double-quotes """...""" or in back-quotes `...`
The resulting string will never have any carriage-return characters in it.
If the resulting string begins with a new-line, the initial new-line is removed and any trailing new-line is also removed.
A special form is used to automatically remove leading whitespace from the source code text. You might code this form to align the source text for ease of reading. If the first line after the raw string start token begins with one or more underscore characters, the number of consecutive underscores signifies the maximum number of whitespace characters that will be removed from each line of the raw string text. The underscores represent an assumed left margin width. Note, these leading underscores do not form part of the raw string text.

Examples:

-- No leading underscores and no leading whitespace 
` 
Bill said 
    "This is a back-slash \ character". 
`

Which, when displayed, should look like ...

Bill said 
    "This is a back-slash \ character".

-- No leading underscores but leading whitespace 
` 
   Bill said 
      "This is a back-slash \ character". 
`

Which, when displayed, should look like ...

   Bill said 
      "This is a back-slash \ character".

-- Leading underscores and leading whitespace 
` 
_____Bill said 
         "This is a back-slash \ character". 
`

Which, when displayed, should look like ...

Bill said 
    "This is a back-slash \ character".

Extended string literals are useful when the string contains new-lines, tabs, or back-slash characters because they do not have to be entered in the special manner. The back-quote form can be used when the string literal contains a set of three double-quote characters, and the triple quote form can be used when the text literal contains back-quote characters. If a literal contains both a back quote and a set of three double-quotes, you will need to concatenate two literals.

object TQ, BQ, QQ 
TQ = `This text contains """ for some reason.` 
BQ = """This text contains a back quote ` for some reason.""" 
QQ = """This text contains a back quote ` """ & `and """ for some reason.`
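
The practical difference between a double-quote string and a raw string shows up in how a back-slash is stored. A small sketch, with illustrative variable names:

```euphoria
sequence cooked = "a\nb"    -- \n is translated to one new-line character
sequence raw    = `a\nb`    -- back-slash and n are kept as typed

? length(cooked)    -- 3
? length(raw)       -- 4
? raw[2]            -- 92, the code for a back-slash
```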

Binary Strings

They begin with the pair b" and end with a double-quote " character
They can only contain binary digits (0-1), and space, underscore, tab, newline, carriage-return. Anything else is invalid.
An underscore is simply ignored, as if it was never there. It is used to aid readability.
Each set of contiguous binary digits represents a single sequence element
They can span multiple lines
The non-digits are treated as punctuation and used to delimit individual values.

Examples:

b"1 10 11_0100 01010110_01111000" == {0x01, 0x02, 0x34, 0x5678}

Hexadecimal Strings

They begin with the pair x" and end with a double-quote " character
They can only contain hexadecimal digits (0-9 A-F a-f), and space, underscore, tab, newline, carriage-return. Anything else is invalid.
An underscore is simply ignored, as if it was never there. It is used to aid readability.
Each pair of contiguous hex digits represents a single sequence element with a value from 0 to 255
They can span multiple lines
The non-digits are treated as punctuation and used to delimit individual values.

Examples:

x"1 2 34 5678_AbC" == {0x01, 0x02, 0x34, 0x56, 0x78, 0xAB, 0x0C}

When you put too many hex characters together they are split up appropriately for you:

x"656667AE"  -- 8-bit  ==> {#65,#66,#67,#AE}
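
The binary and hexadecimal examples above can be restated as comparisons against ordinary brace-list sequences. This is only a sketch: it assumes a Euphoria version that supports the b"..." and x"..." literal forms described here, and the variable names are illustrative:

```euphoria
-- Hexadecimal string: each pair of hex digits is one element.
sequence spaced = x"65 66 67 AE"
sequence packed = x"656667AE"

? equal(spaced, {#65, #66, #67, #AE})
? equal(packed, spaced)

-- Binary string: each run of binary digits is one element,
-- and underscores are ignored.
sequence bits = b"1 10 11_0100"

? equal(bits, {1, 2, #34})
```

If the rules in these two sections hold for the installed version, each comparison reports that the two forms are the same sequence.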
