OpenEuphoria: Wiki View

Historical Values, Revision 1

Data Objects

Predefined Types

Euphoria is a unique programming language because of the way it handles data. The Euphoria way is both powerful and easy to use.

The Euphoria data viewpoint:

All data is an object in the form of atoms or sequences.
An atom is a single numeric value.
A sequence is a collection of objects, either atoms or sequences themselves.

An atom (which is also an object) is any one value:

integer	character	boolean	float	with 'spacer'	exponential	other bases
0 100 -4	'x' '$' '\n'	0 1	98.6 -1	23_100_000	-1e6	0b101 #A000

A sequence (which is also an object) is any collection of objects:

empty	single	with atoms	as a string
{ }	{ 3.14 }	{2, 3, 5, 7, 11, 13, 17, 19}	"Euphoria"

mixed content	pairs	nested
{{"jon", "smith"}, 52389, 97.25}	{ {"blue", 5}, {"red",3} }	{ 1, 2, {3, 3, 3}, 4, { 5, {6} } }

Different data-types: An empty sequence { } is not the same as the atom 0. A sequence of one element { 3 } is not the same as the atom 3.
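
The distinction can be checked directly with the built-in tests. The following is a minimal sketch; the variable names are chosen only for illustration.

```euphoria
-- Illustrative names only; any identifiers would do.
object nothing = {}
object zero = 0
object one_item = {3}

? atom(zero)            -- 1: 0 is an atom
? sequence(nothing)     -- 1: {} is a sequence
? length(nothing)       -- 0: a sequence with no elements
? length(one_item)      -- 1: a sequence holding one atom
? equal(nothing, zero)  -- 0: an empty sequence is not the atom 0
? equal(one_item, 3)    -- 0: {3} is not the atom 3
? equal(one_item[1], 3) -- 1: the element inside it is the atom 3
```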

Predictable behavior

Sometimes it helps to view a sequence as a list of "elements." Then, for some operations, an atom behaves like a one-element sequence:

? { 2, 4 } & 5      -- one atom (one element) concatenated
? { 2, 4 } & {5}    -- one sequence (of one element) concatenated
-- the result is { 2, 4, 5 } for both examples

Euphoria is designed to "make sense"; there are no surprising behaviors.

Hint: When thinking about data just consider: atom or sequence, one or many, individual or collection. Remembering which Euphoria routine to use, or how to use a particular operation, all becomes simpler when you realize that there may be only two choices to consider.

Because of objects: Programming in Euphoria is simpler than programming with a conventional language that requires many data-types.

Data-Type

A data-type is an abstract description of values. "Abstract" means we can talk about the members of a data-type without having to describe each and every member. The "description" consists of an algorithm that returns true or false when any given value is tested for being a member of the data-type.
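
One way to picture "an algorithm that returns true or false" is a user-defined type. The sketch below assumes a hypothetical type named percent whose range (0 to 100) is invented for illustration; it is not one of the predefined types described on this page.

```euphoria
-- Hypothetical type: the name 'percent' and its limits are illustrative.
type percent(object x)
    if not atom(x) then
        return 0            -- sequences are never members
    end if
    return x >= 0 and x <= 100
end type

? percent(42.5)     -- 1: a member of the type
? percent(150)      -- 0: out of range
? percent("cat")    -- 0: a sequence, not an atom

percent score = 75  -- accepted: the membership test returns true
-- score = 101      -- would stop the program with a type-check failure
```

The parameter is declared as object so that any value at all can be passed in and tested.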

The way data-types are used in Euphoria is simple:

Every variable (really just a name for a value) has to be declared to be a specific data-type.
You are free to create and manipulate values at will in your programming. Euphoria is not going to inhibit your "creative" programming.
You cannot assign a value of the wrong data-type to a variable. Euphoria is safe.

For example:

? 3.14 * 10     -- first value is an atom, second value is an integer 
                -- you are free to mix types 
                -- calculations produce the "expected" value                   
--> 31.4        -- result is an atom value 
 
integer x       -- an integer variable is declared 
x = 3.14 * 10   -- cannot assign a fractional value (31.4) to an integer variable 
--> error       -- type-check failure when the program runs

Atom

An atom can be any integer or double-precision floating point value. Its value can range from approximately -1e300 (minus one times 10 to the power 300) to +1e300 with 15 decimal digits of accuracy.

To test if a value is an atom use the atom() function:

? atom( 1 )      -- 1 or 'true' 
? atom( -4e-10 ) -- 1 or 'true' 
? atom( "cat" )  -- 0 or 'false'

Why do we call them atoms? Why not just "numbers"? Well, an atom is just a number, but we wanted to have a distinctive term that emphasizes that they are indivisible (that's what "atom" means in Greek). In the world of physics you can 'split' an atom into smaller parts, but you no longer have an atom--only particles. You can 'split' a number into smaller parts, but you no longer have a number--only digits.

atom		programming tip
range	integers -`power(2,53)` to +`power(2,53)`	`power(2,53)` is slightly above 9×10¹⁵
	large integers `about 15 digits`	(automatic conversion to floating-point when size is exceeded)
	floating-point -`power(2,1024)+1` to +`power(2,1024)-1`	`power(2,1024)` is in the 10³⁰⁸ range
use for	general calculations
	integers, when overflow may occur	(assign result of large integer additions and multiplications to atom)
	32-bit machine addresses	(never use integer type for this!)
	large valued integers	(rather than overflowing, large integers are converted to floating-point)
caution	floating-point values are limited by accuracy of computer hardware	(see notes on floats)

By default, number literals use base 10. You can write integer literals in other bases, namely binary (base 2), octal (base 8), and hexadecimal (base 16). To do this, the number is prefixed by a 2-character code that specifies the base:

Prefix Code	Base	Name
`0b`	2	Binary
`0t`	8	Octal
`0d`	10	Decimal
`0x`	16	Hexadecimal

Examples of values written in various bases:

 
 0b101 --> decimal 5 
 0t101 --> decimal 65 
 0d101 --> decimal 101 
 0x101 --> decimal 257

Hexadecimal integers can also be written by prefixing the number with the # (pound) sign.

Examples of hexadecimal integers:

#FE             -- 254 
#A000           -- 40960 
#FFFF00008      -- 68718428168 
-#10            -- -16

Only digits and the letters A, B, C, D, E, F, in either uppercase or lowercase, are allowed in hexadecimal numbers. Hexadecimal numbers are always positive, unless written like -# with a minus sign in front of the pound sign. For instance #FFFFFFFF is a huge positive number (4294967295), not -1, as some machine-language programmers might expect.
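
The values listed above can be confirmed by comparing each literal with its decimal equivalent. This is a small sketch that assumes an interpreter accepting the prefix codes described in this section; every line prints 1 (true).

```euphoria
? 0b101 = 5                 -- binary
? 0t101 = 65                -- octal
? 0d101 = 101               -- decimal
? 0x101 = 257               -- hexadecimal, 0x form
? #FE = 0xFE                -- the # form and the 0x form agree
? -#10 = -16                -- minus sign in front of the pound sign
? #FFFFFFFF = 4294967295    -- a large positive number, not -1
```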

You can embed the underscore _ character when writing a numeric literal. This simulates the commas (or periods) sometimes used to "group" digits into sets of three. The underscore character can be located anywhere within a number, and used with any base:

atom big = 32_873_787   -- Set 'big' to the value 32873787 
 
atom salary = 56_110.66 -- Set salary to the value 56110.66 
 
integer defflags = #0323_F3CD 
 
object phone = 61_3_5536_7733 
 
integer bits = 0b11_00010_1

Integer

The integer is a subset of the atom data-type. It too represents a single numerical value.

Computer hardware works best with integer values; calculations with integers are the fastest and require the least amount of memory. Integer values can be exactly represented by the base 2 binary system. For this reason Euphoria has the built-in integer data-type. When in range, specify integer instead of atom for improved performance.

Integer		Programming Tip
range	`-1_073_741_824` to `+1_073_741_823`
use for	speed, reduced memory
not for	32-bit machine addresses	(use atom instead)
caution	additions and multiplications can exceed the size limit of an integer	(assign these values to an atom)

Calculations in Euphoria always produce the "correct" answer. Regardless of the data-types in the calculation, the result is always computed to the maximum accuracy available to the computer hardware. The integer data-type requires that values assigned to an integer variable be integers. Euphoria does not restrict how you calculate with these values.

A division between two integer values may produce a fractional value. Rather than truncating the fractional part (as integer division does in many languages), Euphoria computes the 'correct' fractional value. Assigning a fractional result to an integer produces an error, but the value may be assigned to an atom. Similarly, when two integers are added or multiplied together the 'correct' value is calculated. The result of a multiplication or addition may be too large to assign to an integer, but it can be assigned to an atom.

Euphoria allocates 31 bits for an integer, so an integer cannot hold a 32-bit machine address. An atom must be used for this purpose.
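
The rules for division and overflow can be sketched as follows. The variable names are illustrative, and the last line assumes an interpreter whose integers have the 31-bit range given in this section.

```euphoria
integer a = 7
integer b = 2

atom q = a / b                  -- 3.5: the exact quotient needs an atom
integer whole = floor(a / b)    -- 3: floor() rounds down to an integer
integer left = remainder(a, b)  -- 1: what is left after whole division

? q                 -- 3.5
? whole             -- 3
? integer(a / b)    -- 0: 3.5 is not an integer value

integer big = 1_073_741_823     -- upper limit given in the range table
atom total = big + 1            -- an atom can always hold the result
? integer(total)    -- 0 where integers have the 31-bit range described here
```

Using floor() (or testing with integer() first) is one way to keep a result assignable to an integer variable; storing the result in an atom is the other.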

Sequence

A sequence is represented by a list of objects within braces { }, separated by commas.

A sequence can contain any mixture of atom and sequence values; a sequence does not have to contain all the same data type. Because the objects contained in a sequence can be an arbitrary mix of atoms or sequences, it is an extremely versatile data structure, capable of representing any sort of data.

The other important feature of a sequence is that it is dynamic and limitless. A sequence may be dynamically changed as elements are added, removed, altered, or just shifted about. A single sequence could be as large as your computer memory allows.
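
As a rough illustration of adding, altering, and removing elements, the sketch below grows and shrinks one sequence using built-in operations. The variable name is arbitrary, and the comments show the value held after each step.

```euphoria
sequence s = {}
s = append(s, 10)        -- {10}
s &= {20, 30}            -- {10, 20, 30}
s = prepend(s, 5)        -- {5, 10, 20, 30}
s[2] = "ten"             -- {5, "ten", 20, 30}: an atom replaced by a sequence
s = s[1..2] & s[4..$]    -- {5, "ten", 30}: third element removed
? length(s)              -- 3
```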

Atoms are the basic building blocks of all the data that a Euphoria program can manipulate. With this analogy, sequences might be thought of as "molecules", made from atoms and other molecules. A better analogy would be that sequences are like directories, and atoms are like files. Just as a directory on your computer can contain both files and other directories, a sequence can contain both atoms and other sequences (and those sequences can contain atoms and sequences and so on).

The "Hierarchical Objects" part of the Euphoria acronym comes from the hierarchical nature of nested sequences. This should not be confused with the class hierarchies of certain object-oriented languages.

Sequences are implemented very efficiently, and automatic garbage collection takes care of memory allocations. As a result sequences are a fast and simple way of working with data. A sequence can represent any arrangement of data. That means that one, easy to use, data-type is all you need to learn to represent all forms of data structures--big or small, simple or complex.

In a conventional language expect to see a variety of data-types, each one different, and each one following its own special rules. Conventional languages force you to learn a lot.

Sequences can be nested to any depth. You can have sequences, within sequences, within sequences, and so on to any depth (until you run out of memory).
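
Elements inside nested sequences are reached by chaining subscripts. A short sketch using the nested example from the start of this page (the variable name is illustrative):

```euphoria
sequence nested = { 1, 2, {3, 3, 3}, 4, { 5, {6} } }

? length(nested)      -- 5: five top-level elements
? nested[3][2]        -- 3: second element of the third element
? length(nested[5])   -- 2: the fifth element holds 5 and {6}
? nested[5][2][1]     -- 6: three levels deep
```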

Braces are used to construct sequences out of a list of expressions. These expressions can be constant or evaluated at run-time:

{ x+6, 9, y*w+2, sin(0.5) }

Object

A variable that is declared to be of the object type may be either an atom or sequence, and may switch between the two dynamically during program execution.

This provides versatility for those situations when it is not possible to anticipate the data-type being assigned to a variable. For example, when the contents of a file are read into a variable, the result could be lines of text, hence a sequence. But the file could be empty, in which case the read operation returns an integer flag, which is an atom. A variable declared to be the object type can store anything.
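
The file-reading case might look like the sketch below. It assumes a hypothetical file named notes.txt and relies on the built-in gets(), which returns a line of text as a sequence or the atom -1 at end of file.

```euphoria
-- "notes.txt" is a hypothetical file name used only for illustration.
object line     -- may hold a sequence (a line of text) or an atom (-1)
integer fn = open("notes.txt", "r")

if fn = -1 then
    puts(1, "could not open file\n")
else
    while 1 do
        line = gets(fn)
        if atom(line) then      -- -1 signals end of file
            exit
        end if
        puts(1, line)           -- otherwise line is a sequence of characters
    end while
    close(fn)
end if
```

Had line been declared as a sequence, the assignment at end of file would fail the type check, which is why object is the natural choice here.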

As you will soon discover, sequences make Euphoria very simple and very powerful. Understanding atoms and sequences is the key to understanding Euphoria.

Note: Here the term object follows the fundamental definition as used by designers of interpreters and compilers. It is not the same as "object" used in a specialized way in "OOP" languages.
