Historical Values, Revision 1
Data Objects
Predefined Types
Euphoria is a unique programming language because of the way it handles data. The Euphoria way is both powerful and easy to use.
The Euphoria data viewpoint:
- All data is an object in the form of atoms or sequences.
- An atom is a single numeric value.
- A sequence is a collection of objects, either atoms or sequences themselves.
An atom (which is also an object) is any one value:
integer | character | boolean | float | with 'spacer' | exponential | other bases |
---|---|---|---|---|---|---|
0 100 -4 | 'x' '$' '\n' | 0 1 | 98.6 -1 | 23_100_000 | -1e6 | 0b101 #A000 |
A sequence (which is also an object) is any collection of objects:
empty | single | with atoms | as a string |
---|---|---|---|
{ } | { 3.14 } | {2, 3, 5, 7, 11, 13, 17, 19} | "Euphoria" |
mixed content | pairs | nested |
---|---|---|
{{"jon", "smith"}, 52389, 97.25} | { {"blue", 5}, {"red",3} } | { 1, 2, {3, 3, 3}, 4, { 5, {6} } } |
- Different data-types: An empty sequence { } is not the same as the atom 0. A sequence of one element { 3 } is not the same as the atom 3.
- Predictable behavior: Sometimes it helps to view a sequence as a list of "elements." Then, for some operations, an atom behaves like a one element sequence:

      ? { 2, 4 } & 5     -- one atom (one element) concatenated
      ? { 2, 4 } & {5}   -- one sequence (of one element) concatenated
      -- result is { 2, 4, 5 }
      -- the same for both examples

  Euphoria is designed to "make sense"; there are no surprising behaviors.
- Hint: When thinking about data just consider: atom or sequence, one or many, individual or collection. Remembering which Euphoria routine to use, or how to use a particular operation, all becomes simpler when you realize that there may be only two choices to consider.
- Because of objects: Programming in Euphoria is simpler than programming with a conventional language that requires many data-types.
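The atom and sequence distinctions described in this list can be checked with the built-in routines `atom()`, `sequence()`, `length()`, and `equal()`. The following is only a small sketch; the variable names are illustrative and do not come from the original page.

```euphoria
-- Illustrative names only.
object zero = 0
object none = {}

? atom(zero)          -- 1: the value 0 is an atom
? sequence(none)      -- 1: { } is a sequence, even with no elements
? length(none)        -- 0
? equal({3}, 3)       -- 0: a one-element sequence is not the atom 3

-- Concatenation treats an atom like a one-element sequence.
sequence a = {2, 4} & 5
sequence b = {2, 4} & {5}
? equal(a, b)         -- 1: both are {2, 4, 5}
```

`equal()` is used rather than `=` because it always returns a single true or false value, whatever the shape of its arguments.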
Data-Type
A data-type is an abstract description of values. "Abstract" means we can talk about the members of a data-type without having to describe each and every member. The "description" consists of an algorithm that returns true or false when any given value is tested for being a member of the data-type.
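Euphoria lets you write such a description directly: a user-defined `type` is a routine that receives a value and returns true or false. The sketch below assumes the default setting, in which type checking is enabled; the name `positive_int` is hypothetical and chosen only for illustration.

```euphoria
-- Hypothetical type: membership is decided by the routine's result.
type positive_int(object x)
    if not integer(x) then
        return 0          -- fractions and sequences are not members
    end if
    return x > 0          -- 1 (true) for members, 0 (false) otherwise
end type

? positive_int(7)         -- 1
? positive_int(-2)        -- 0
? positive_int("cat")     -- 0

positive_int count = 5    -- accepted
-- count = -1             -- would stop the program with a type-check error
```

Declaring the parameter as `object` lets the routine be called as an ordinary test on any value, including sequences.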
The way data-types are used in Euphoria is simple:
- Every variable (really just a name for a value) has to be declared to be a specific data-type.
- You are free to create and manipulate values at will in your programming. Euphoria is not going to inhibit your "creative" programming.
- You can not assign a value to a variable of the wrong data-type. Euphoria is safe.
For example:

    ? 3.14 * 10
        -- first value is an atom, second value is an integer
        -- you are free to mix types
        -- calculations produce the "expected" value
    --> 31.4
        -- result is an atom value

    integer x
        -- an integer variable is declared
    x = 3.14 * 10
        -- can not assign an atom value to an integer variable
    --> error

Atom
An atom can be any integer or double-precision floating point value. Its value can range from approximately -1e300 (minus one times 10 to the power 300) to +1e300 with 15 decimal digits of accuracy.
To test if a value is an atom use the atom() function:

    ? atom( 1 )       -- 1 or 'true'
    ? atom( -4e-10 )  -- 1 or 'true'
    ? atom( "cat" )   -- 0 or 'false'

atom | programming tip | |
---|---|---|
range | integers: -power(2,53) to +power(2,53) | power(2,53) is slightly above 9 × 10^15; large integers of about 15 digits (automatic conversion to floating-point when size is exceeded) |
range | floating-point: -power(2,1024)+1 to +power(2,1024)-1 | power(2,1024) is in the 10^308 range |
use for | general calculations | |
use for | integers, when overflow may occur | (assign result of large integer additions and multiplications to atom) |
use for | 32-bit machine addresses | (never use integer type for this!) |
use for | large valued integers | (rather than overflowing, large integers are converted to floating-point) |
caution | floating-point values are limited by accuracy of computer hardware | (see notes on floats) |
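The accuracy limits listed in the table can be observed directly. This is a rough sketch that assumes atoms are stored as IEEE double-precision values, as the ranges above suggest; the tolerance `1e-9` is an arbitrary choice made for this example.

```euphoria
-- Integers stored in an atom are exact only up to power(2, 53).
atom limit = power(2, 53)
? (limit + 1) = limit     -- 1: the extra 1 is lost to rounding

-- Fractions are stored in binary, so some decimal values are inexact.
? (0.1 + 0.2) = 0.3       -- 0: the two sides differ in the last bits

-- Comparing against a small tolerance is a common workaround.
atom diff = (0.1 + 0.2) - 0.3
if diff < 0 then
    diff = -diff
end if
? diff < 1e-9             -- 1
```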
By default, number literals use base 10. You can write integer literals in other bases, namely binary (base 2), octal (base 8), and hexadecimal (base 16). To do this, the number is prefixed by a 2-character code that specifies the base:
Prefix Code | Base | Name |
---|---|---|
0b | 2 | Binary |
0t | 8 | Octal |
0d | 10 | Decimal |
0x | 16 | Hexadecimal |
Examples of values written in various bases:

    0b101 --> decimal 5
    0t101 --> decimal 65
    0d101 --> decimal 101
    0x101 --> decimal 257

Additionally, hexadecimal integers can also be written by prefixing the number with the # pound sign.
Examples of hexadecimal integers:

    #FE         -- 254
    #A000       -- 40960
    #FFFF00008  -- 68718428168
    -#10        -- -16

Only digits and the letters A, B, C, D, E, F, in either uppercase or lowercase, are allowed in hexadecimal numbers. Hexadecimal numbers are always positive, unless written like -# with a minus sign in front of the pound sign. For instance #FFFFFFFF is a huge positive number (4294967295), not -1, as some machine-language programmers might expect.
You can embed the underscore _ character when writing a numeric literal. This simulates the commas (or periods) sometimes used to "group" digits into sets of three. The underscore character can be located anywhere within a number, and used with any base:

    atom big = 32_873_787           -- Set 'big' to the value 32873787
    atom salary = 56_110.66         -- Set salary to the value 56110.66
    integer defflags = #0323_F3CD
    object phone = 61_3_5536_7733
    integer bits = 0b11_00010_1

Integer
The integer is a subset of the atom data-type. It too represents a single numerical value.
Computer hardware works best with integer values; calculations with integers are the fastest and require the least amount of memory. Integer values can be exactly represented by the base 2 binary system. For this reason Euphoria has the built-in integer data-type. When in range, specify integer instead of atom for improved performance.
Integer | Programming Tip | |
---|---|---|
range | -1_073_741_824 to +1_073_741_823 | |
use for | speed, reduced memory | |
not for | 32-bit machine addresses | (use atom instead) |
caution | additions and multiplications can exceed the size limit of an integer | (assign these values to an atom) |
Calculations in Euphoria always produce the "correct" answer. Regardless of the data-types in the calculation, the result is always computed to the maximum accuracy available to the computer hardware. The integer data-type requires that values assigned to an integer variable be integers. Euphoria does not restrict how you calculate with these values.
A division between two integer values may produce a fractional value. Rather than truncating the fractional part (as integer division does in many conventional languages), Euphoria computes the 'correct' fractional value. Assigning a fractional result to an integer variable produces an error, but the value may be assigned to an atom. Similarly, when two integers are added or multiplied together the 'correct' value is calculated. The result may be too large to be assigned to an integer, but it can be assigned to an atom.
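A short sketch of these rules follows. It assumes the 31-bit integer range given earlier in this section; the variable names are illustrative only.

```euphoria
integer a = 7
integer b = 2

atom quotient = a / b             -- 3.5: the fractional result is kept
integer whole = floor(a / b)      -- 3: floor() discards the fraction
integer rest  = remainder(a, b)   -- 1

-- The largest value in the 31-bit integer range.
integer top = 1_073_741_823
atom total = top + 1              -- fine: an atom can hold 1073741824
-- integer bad = top + 1          -- would fail where integers are 31 bits
```

When a whole-number result is wanted, `floor()` and `remainder()` make the truncation explicit instead of leaving it to the division operator.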
Euphoria allocates 31 bits for an integer, so an integer cannot hold a 32-bit machine address. An atom must be used for this purpose.
Sequence
A sequence is represented by a list of objects within braces { }, separated by commas.
A sequence can contain any mixture of atom and sequence values; a sequence does not have to contain all the same data type. Because the objects contained in a sequence can be an arbitrary mix of atoms or sequences, it is an extremely versatile data structure, capable of representing any sort of data.
The other important feature of a sequence is that it is dynamic and limitless. A sequence may be dynamically changed as elements are added, removed, altered, or just shifted about. A single sequence could be as large as your computer memory allows.
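As a minimal sketch of this dynamic behavior, the lines below grow, alter, and shrink one sequence using only built-in operations. The variable name `s` and the values are illustrative.

```euphoria
sequence s = {}              -- start empty
s = append(s, 10)            -- {10}
s = s & {20, 30}             -- {10, 20, 30}
s = prepend(s, 5)            -- {5, 10, 20, 30}
s[2] = "ten"                 -- {5, "ten", 20, 30}: an element may change type
s = s[1..2] & s[4..$]        -- drop the third element: {5, "ten", 30}
? length(s)                  -- 3
```

In a slice, `$` stands for the index of the last element, so `s[4..$]` means "from the fourth element to the end".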
The "Hierarchical Objects" part of the Euphoria acronym comes from the hierarchical nature of nested sequences. This should not be confused with the class hierarchies of certain object-oriented languages.
Sequences are implemented very efficiently, and automatic garbage collection takes care of memory allocations. As a result sequences are a fast and simple way of working with data. A sequence can represent any arrangement of data. That means that one, easy to use, data-type is all you need to learn to represent all forms of data structures--big or small, simple or complex.
In a conventional language expect to see a variety of data-types, each one different, and each one following its own special rules. Conventional languages force you to learn a lot.
Sequences can be nested to any depth. You can have sequences, within sequences, within sequences, and so on to any depth (until you run out of memory).
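Elements inside nested sequences are reached by chaining subscripts, one per level. The sketch below reuses the nested example from the table near the top of this page; the variable name `nested` is illustrative.

```euphoria
sequence nested = { 1, 2, {3, 3, 3}, 4, { 5, {6} } }

? length(nested)          -- 5: only top-level elements are counted
? nested[3][2]            -- 3
? nested[5][2][1]         -- 6

nested[5][2] = {6, {7}}   -- add one more level of nesting in place
? nested[5][2][2][1]      -- 7
```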
Braces are used to construct sequences out of a list of expressions. These expressions can be constant or evaluated at run-time:

    { x+6, 9, y*w+2, sin(0.5) }

Object
A variable that is declared to be of the object type may be either an atom or sequence, and may switch between the two dynamically during program execution.
This provides versatility for those situations when it is not possible to anticipate the data-type being assigned to a variable. For example, when the contents of a file are read into a variable, the variable could contain lines of text, hence the sequence type. But the file could be empty, and the read operation would then return an integer flag, which is of the atom type. A variable declared to be the object type can store anything.
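The file-reading case can be sketched with the built-in `gets()`, which returns a line of text as a sequence, or the atom -1 when there is nothing left to read. The helper `read_lines` and the file name `example.txt` are hypothetical and exist only for this illustration.

```euphoria
-- Hypothetical helper: read every line of a text file into a sequence.
function read_lines(sequence path)
    integer fn
    sequence lines
    object line            -- holds either a line (sequence) or -1 (atom)

    fn = open(path, "r")
    if fn = -1 then
        return -1          -- atom result: the file could not be opened
    end if

    lines = {}
    while 1 do
        line = gets(fn)
        if atom(line) then
            exit           -- -1 signals the end of the file
        end if
        lines = append(lines, line)
    end while
    close(fn)
    return lines
end function

object result = read_lines("example.txt")   -- hypothetical file name
if atom(result) then
    puts(1, "could not open the file\n")
else
    printf(1, "%d lines read\n", length(result))
end if
```

Both `line` and `result` are declared as `object` because each may legitimately hold an atom at one moment and a sequence at another.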
As you will soon discover, sequences make Euphoria very simple and very powerful. Understanding atoms and sequences is the key to understanding Euphoria.
- Note: Here the term object follows the fundamental definition as used by designers of interpreters and compilers. It is not the same as "object" used in a specialized way in "OOP" languages.
- Last modified Dec 20, 2010 by _tom