8.13 Data type conversion

8.13.1 Routines

8.13.1.1 int_to_bytes

include std/convert.e
namespace convert
public function int_to_bytes(atom x)

Converts an atom that represents an integer to a sequence of 4 bytes.

Parameters:
  1. x : an atom, the value to convert.
Returns:

A sequence, of 4 bytes, least significant byte first.

Comments:

If the atom does not fit into a 32-bit integer, the result may still be usable:

  • If there is a fractional part, the first element in the returned value will carry it. If you poke the sequence to RAM, that fraction will be discarded anyway.
  • If x is simply too big, the first three bytes will still be correct, and the 4th element will be floor(x/power(2,24)). If this is not a byte sized integer, some truncation may occur, but usually no error.

The integer can be negative. Negative byte-values will be returned, but after poking them into memory you will have the correct (two's complement) representation for the 386+.

Example 1:
s = int_to_bytes(999)
-- s is {231, 3, 0, 0}
Example 2:
s = int_to_bytes(-999)
-- s is {-231, -4, -1, -1}
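The byte layout described above can be reproduced with plain arithmetic. The following is a minimal sketch, not the library's implementation: manual_bytes and rest are hypothetical names, and the sketch assumes a non-negative value that fits in 32 bits.

```euphoria
include std/convert.e

atom x = 999
sequence manual_bytes = repeat(0, 4)
atom rest = x
for i = 1 to 4 do
    manual_bytes[i] = remainder(rest, 256)  -- lowest remaining byte
    rest = floor(rest / 256)                -- drop that byte
end for

? manual_bytes                              -- {231,3,0,0}
? equal(manual_bytes, int_to_bytes(x))      -- 1 if the library agrees
```

Each pass peels off the lowest 8 bits, which is why the least significant byte ends up first.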
See Also:

bytes_to_int, int_to_bits, atom_to_float64, poke4

8.13.1.2 bytes_to_int

include std/convert.e
namespace convert
public function bytes_to_int(sequence s)

Converts a sequence of at most 4 bytes into an atom.

Parameters:
  1. s : the sequence to convert
Returns:

An atom, the value of the concatenated bytes of s.

Comments:

This performs the reverse operation from int_to_bytes.

An atom is being returned, because the converted value may be bigger than what can fit in an Euphoria integer.

Example 1:
atom int32

int32 = bytes_to_int({37,1,0,0})
-- int32 is 37 + 256*1 = 293
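As a quick check of the "reverse operation" claim, the sketch below sends a small non-negative value through int_to_bytes and back, and also computes the weighted sum by hand. The variable names are illustrative only.

```euphoria
include std/convert.e

atom original = 293
sequence s = int_to_bytes(original)   -- least significant byte first
atom restored = bytes_to_int(s)

-- The same value, computed from the byte weights 256^0 .. 256^3.
atom by_hand = s[1] + 256 * s[2] + 65536 * s[3] + 16777216 * s[4]

? s                        -- {37,1,0,0}
? (restored = original)    -- 1
? (by_hand = original)     -- 1
```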
See Also:

bits_to_int, float64_to_atom, int_to_bytes, peek, peek4s, peek4u, poke4

8.13.1.3 int_to_bits

include std/convert.e
namespace convert
public function int_to_bits(atom x, integer nbits = 32)

Extracts the lower bits from an integer.

Parameters:
  1. x : the atom to convert
  2. nbits : the number of bits requested. The default is 32.
Returns:

A sequence, of length nbits, made of 1's and 0's.

Comments:

x should have no fractional part. If it does, then the first "bit" will be an atom between 0 and 2.

The bits are returned lowest first.

For negative numbers the two's complement bit pattern is returned.

You can use subscripting, slicing, and/or/xor/not of entire sequences etc. to manipulate sequences of bits. Shifting of bits and rotating of bits are easy to perform.

Example 1:
s = int_to_bits(177, 8)
-- s is {1,0,0,0,1,1,0,1} -- "reverse" order
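Because the bits are stored lowest first, shifting and rotating reduce to slicing and concatenation. The following is a minimal sketch for an 8-bit value; shifted and rotated are hypothetical names, and bits_to_int is used only to show the numeric result.

```euphoria
include std/convert.e

sequence bits = int_to_bits(177, 8)        -- {1,0,0,0,1,1,0,1}

-- Shift left by one within 8 bits: a 0 enters at the low end
-- and the highest bit falls off.
sequence shifted = 0 & bits[1..$-1]

-- Rotate left by one: the highest bit wraps around to the low end.
sequence rotated = bits[$] & bits[1..$-1]

? bits_to_int(shifted)   -- 98, the same as remainder(177 * 2, 256)
? bits_to_int(rotated)   -- 99
```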
See Also:

bits_to_int, int_to_bytes, Relational operators, operations on sequences

8.13.1.4 bits_to_int

include std/convert.e
namespace convert
public function bits_to_int(sequence bits)

Converts a sequence of bits to an atom that has no fractional part.

Parameters:
  1. bits : the sequence to convert.
Returns:

A non-negative atom, whose machine representation was given by bits.

Comments:

An element in bits can be any atom. If nonzero, it counts for 1, else for 0.

The first elements in bits represent the bits with the least weight in the returned value. Only the 52 last bits will matter, as the PC hardware cannot hold an integer with more digits than this.

If you print bits, the bits will appear in "reverse" order, but it is convenient to have increasing subscripts access bits of increasing significance.

Example 1:
a = bits_to_int({1,1,1,0,1})
-- a is 23 (binary 10111)
See Also:

bytes_to_int, int_to_bits, operations on sequences

8.13.1.5 atom_to_float64

include std/convert.e
namespace convert
public function atom_to_float64(atom a)

Convert an atom to a sequence of 8 bytes in IEEE 64-bit format.

Parameters:
  1. a : the atom to convert.
Returns:

A sequence, of 8 bytes, which can be poked in memory to represent a.

Comments:

All Euphoria atoms have values which can be represented as 64-bit IEEE floating-point numbers, so you can convert any atom to 64-bit format without losing any precision.

Integer values will also be converted to 64-bit floating-point format.

Example 1:
fn = open("numbers.dat", "wb")
puts(fn, atom_to_float64(157.82)) -- write 8 bytes to a file
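The lossless round trip can be checked without touching a file. This is a minimal sketch that pairs atom_to_float64 with float64_to_atom; the value 157.75 is chosen because it is exactly representable in binary, so the comparison does not depend on how a decimal literal is rounded.

```euphoria
include std/convert.e

atom a = 157.75                       -- exactly representable in binary
sequence ieee64 = atom_to_float64(a)

? length(ieee64)                      -- 8
? (float64_to_atom(ieee64) = a)       -- 1: the round trip is exact
```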
See Also:

float64_to_atom, int_to_bytes, atom_to_float32

8.13.1.6 atom_to_float32

include std/convert.e
namespace convert
public function atom_to_float32(atom a)

Convert an atom to a sequence of 4 bytes in IEEE 32-bit format.

Parameters:
  1. a : the atom to convert.
Returns:

A sequence, of 4 bytes, which can be poked in memory to represent a.

Comments:

Euphoria atoms can have values which are 64-bit IEEE floating-point numbers, so you may lose precision when you convert to 32 bits (16 significant digits versus 7). The range of exponents is much larger in 64-bit format (10 to the 308, versus 10 to the 38), so some atoms may be too large or too small to represent in 32-bit format. In this case you will get one of the special 32-bit values: inf or -inf (infinity or -infinity). To avoid this, you can use atom_to_float64().

Integer values will also be converted to 32-bit floating-point format.

On modern computers, computations on 64-bit floats are no slower than on 32-bit floats. Internally, the PC stores them in 80-bit registers anyway. Euphoria does not support these so-called long doubles. Not all C compilers do.

Example 1:
fn = open("numbers.dat", "wb")
puts(fn, atom_to_float32(157.82)) -- write 4 bytes to a file
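The precision loss is easy to observe by converting to 32-bit format and back. In this minimal sketch, exact and inexact are hypothetical names: 157.75 needs only a few binary digits and survives the round trip, whereas 0.1 needs more precision than a 32-bit float provides.

```euphoria
include std/convert.e

atom exact   = 157.75   -- fits in a 32-bit float without rounding
atom inexact = 0.1      -- cannot be stored exactly in 32 bits

? (float32_to_atom(atom_to_float32(exact)) = exact)       -- 1
? (float32_to_atom(atom_to_float32(inexact)) = inexact)   -- 0
```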
See Also:

float32_to_atom, int_to_bytes, atom_to_float64

8.13.1.7 float64_to_atom

include std/convert.e
namespace convert
public function float64_to_atom(sequence_8 ieee64)

Convert a sequence of 8 bytes in IEEE 64-bit format to an atom.

Parameters:
  1. ieee64 : the sequence to convert.
Returns:

An atom, the same value as the FPU would see by peeking ieee64 from RAM.

Comments:

Any 64-bit IEEE floating-point number can be converted to an atom.

Example 1:
f = repeat(0, 8)
fn = open("numbers.dat", "rb")  -- read binary
for i = 1 to 8 do
    f[i] = getc(fn)
end for
a = float64_to_atom(f)
See Also:

float32_to_atom, bytes_to_int, atom_to_float64

8.13.1.8 float32_to_atom

include std/convert.e
namespace convert
public function float32_to_atom(sequence_4 ieee32)

Convert a sequence of 4 bytes in IEEE 32-bit format to an atom.

Parameters:
  1. ieee32 : the sequence to convert.
Returns:

An atom, the same value as the FPU would see by peeking ieee32 from RAM.

Comments:

Any 32-bit IEEE floating-point number can be converted to an atom.

Example 1:
f = repeat(0, 4)
fn = open("numbers.dat", "rb") -- read binary
f[1] = getc(fn)
f[2] = getc(fn)
f[3] = getc(fn)
f[4] = getc(fn)
a = float32_to_atom(f)
See Also:

float64_to_atom, bytes_to_int, atom_to_float32

8.13.1.9 hex_text

include std/convert.e
namespace convert
public function hex_text(sequence text)

Convert a text representation of a hexadecimal number to an atom.

Parameters:
  1. text : the text to convert.
Returns:

An atom, the numeric equivalent to text.

Comments:
  • The text can optionally begin with '#' which is ignored.
  • The text can have any number of underscores, all of which are ignored.
  • The text can have one leading '-', indicating a negative number.
  • Any other character in the text stops the parsing, and the value parsed thus far is returned.
Example 1:
atom h = hex_text("-#3_4FA.00E_1BD")
-- h is now -13562.003444492816925
h = hex_text("DEADBEEF")
-- h is now 3735928559
See Also:

value
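The parsing rules listed in the comments can be exercised one at a time. This is a small sketch under the assumption that the rules hold exactly as stated; in that case the first, second and fourth calls yield 255 and the third yields -255.

```euphoria
include std/convert.e

? hex_text("#FF")      -- a leading '#' is ignored
? hex_text("F_F")      -- underscores are ignored
? hex_text("-FF")      -- a leading '-' negates the result
? hex_text("FFxyz")    -- parsing stops at 'x'; the value so far is returned
```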

8.13.1.10 set_decimal_mark

include std/convert.e
namespace convert
public function set_decimal_mark(integer new_mark)

Gets, and possibly sets, the decimal mark that to_number() uses.

Parameters:
  1. new_mark : An integer: Either a comma (,), a period (.) or any other integer.
Returns:

An integer, the current value, before new_mark changes it.

Comments:
  • When new_mark is a period it will cause to_number() to interpret a dot (.) as the decimal point symbol. The pre-changed value is returned.
  • When new_mark is a comma it will cause to_number() to interpret a comma (,) as the decimal point symbol. The pre-changed value is returned.
  • Any other value does not change the current setting. Instead it just returns the current value.
  • The initial value of the decimal marker is a period.
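Since the function returns the previous setting, a caller can switch the decimal mark temporarily and restore it afterwards. The following is a minimal sketch of that pattern; old_mark is a hypothetical name, and the expected results (12.5 and 1234.5) follow from the to_number() rules for decimal marks and thousands separators.

```euphoria
include std/convert.e

-- Switch to a comma as the decimal mark, remembering the previous setting.
integer old_mark = set_decimal_mark(',')

? to_number("12,5")        -- the comma is read as the decimal point
? to_number("1.234,5")     -- '.' now acts as the thousands separator

-- Restore the previous setting; the value returned here is not needed.
old_mark = set_decimal_mark(old_mark)
```

Passing any value other than a comma or a period, for example set_decimal_mark(0), only reads the current setting.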

8.13.1.11 to_number

include std/convert.e
namespace convert
public function to_number(sequence text_in, integer return_bad_pos = 0)

Converts the text into a number.

Parameters:
  1. text_in : A string containing the text representation of a number.
  2. return_bad_pos : An integer.
    • If 0 (the default) then this will return a number based on the supplied text and it will not return any position in text_in that caused an incomplete conversion.
    • If return_bad_pos is -1, then the resulting number is returned if the conversion of text_in was complete; otherwise a single-element sequence is returned, containing the position within text_in where the conversion stopped.
    • If return_bad_pos is any other non-zero value, then this returns both the converted value up to the point of failure (if any) and the position in text_in that caused the failure. If that position is 0 then there was no failure.
Returns:
  • an atom, if return_bad_pos is zero: the number represented by text_in. If text_in contains invalid characters, zero is returned. An atom is also returned if return_bad_pos is -1 and the conversion was complete.
  • a sequence, if return_bad_pos is non-zero. If return_bad_pos is -1 and the conversion was incomplete, it is a 1-element sequence containing the position inside text_in where conversion stopped. For any other non-zero value, it is a 2-element sequence containing the number represented by text_in and either 0 or the position in text_in where conversion stopped.
Comments:
  1. You can supply Hexadecimal values if the value is preceded by a '#' character, Octal values if the value is preceded by a '@' character, and Binary values if the value is preceded by a '!' character. With hexadecimal values, the case of the digits 'A' - 'F' is not important. Also, any decimal marker embedded in the number is used with the correct base.
  2. Any underscore characters or thousands separators that are embedded in the text number are ignored. These can be used to help visual clarity for long numbers. The thousands separator is a ',' when the decimal mark is '.' (the default), or '.' if the decimal mark is ','. You can inspect and set the decimal mark using set_decimal_mark().
  3. You can supply a single leading or trailing sign. Either a minus (-) or plus (+).
  4. You can supply one or more trailing adjacent percentage signs. The first one causes the resulting value to be divided by 100, and each subsequent one divides the result by a further 10. Thus 3845% gives a value of (3845 / 100) ==> 38.45, and 3845%% gives a value of (3845 / 1000) ==> 3.845.
  5. You can have a single currency symbol before the first digit or after the last digit. A currency symbol is any character of the string: "$£¤¥€".
  6. You can have any number of whitespace characters before the first digit and after the last digit.
  7. The currency, sign and base symbols can appear in any order. Thus "$ -21.10" is the same as " -$21.10 ", which is also the same as "21.10$-", etc.
  8. This function can optionally return information about invalid numbers. If return_bad_pos is neither zero nor -1, a two-element sequence is returned. The first element is the converted number value, and the second is the position in the text where conversion stopped. If no errors were found then the second element is zero.
  9. When converting floating point text numbers to atoms, you need to be aware that many numbers cannot be accurately converted to the exact value expected due to the limitations of the 64-bit IEEE floating-point format.
Examples:
object val
val = to_number("12.34")      ---> 12.34 -- No errors and no error return needed.
val = to_number("12.34", 1)   ---> {12.34, 0} -- No errors.
val = to_number("12.34", -1)  ---> 12.34 -- No errors.
val = to_number("12.34a", 1)  ---> {12.34, 6} -- Error at position 6
val = to_number("12.34a", -1) ---> {6} -- Error at position 6
val = to_number("12.34a")     ---> 0 because it is not a valid number

val = to_number("#f80c")        --> 63500
val = to_number("#f80c.7aa")    --> 63500.47900390625
val = to_number("@1703")        --> 963
val = to_number("!101101")      --> 45
val = to_number("12_583_891")   --> 12583891
val = to_number("12_583_891%")  --> 125838.91
val = to_number("12,583,891%%") --> 12583.891
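With return_bad_pos set to -1, the type of the result tells a caller whether the conversion succeeded, which avoids confusing a failed parse with a genuine zero. The following is a minimal sketch of a wrapper built on that behavior; parse_or_default and its parameters are hypothetical names, not part of std/convert.e.

```euphoria
include std/convert.e

-- Returns the parsed number, or default_value if text is not a valid number.
function parse_or_default(sequence text, atom default_value)
    object result = to_number(text, -1)
    if sequence(result) then
        -- A one-element sequence signals that conversion stopped early.
        return default_value
    end if
    return result
end function

? parse_or_default("12.34", -1)    -- 12.34
? parse_or_default("12.34a", -1)   -- -1
? parse_or_default("0", -1)        -- 0, a valid number rather than a failure
```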

8.13.1.12 to_integer

include std/convert.e
namespace convert
public function to_integer(object data_in, integer def_value = 0)

Converts an object into an integer.

Parameters:
  1. data_in : Any Euphoria object.
  2. def_value : An integer. This is returned if data_in cannot be converted into an integer. If omitted, zero is returned.
Returns:

An integer, either the integer rendition of data_in or def_value if it has no integer value.

Comments:

The returned value is guaranteed to be a valid Euphoria integer.

Examples:
? to_integer(12)            --> 12
? to_integer(12.4)          --> 12
? to_integer("12")          --> 12
? to_integer("12.9")        --> 12

? to_integer("a12")         --> 0 (not a valid number)
? to_integer("a12",-1)      --> -1 (not a valid number)
? to_integer({"12"})        --> 0 (sub-sequence found)
? to_integer(#3FFFFFFF)     --> 1073741823
? to_integer(#3FFFFFFF + 1) --> 0 (too big for a Euphoria integer)

8.13.1.13 to_string

include std/convert.e
namespace convert
public function to_string(object data_in, integer string_quote = 0,
        integer embed_string_quote = '"')

Converts an object into a text string.

Parameters:
  1. data_in : Any Euphoria object.
  2. string_quote : An integer. If not zero, this will be used to enclose data_in, if it is already a string. The default is zero, which adds no quotes.
  3. embed_string_quote : An integer. This will be used to enclose any strings embedded inside data_in. The default is '"'
Returns:

A sequence. This is the string representation of data_in.

Comments:
  • The returned value is guaranteed to be a displayable text string.
  • string_quote is only used if data_in is already a string. In this case, all occurrences of string_quote already in data_in are prefixed with the '\' escape character, as are any preexisting escape characters. Then string_quote is added to both ends of data_in, resulting in a quoted string.
  • embed_string_quote is only used if data_in is a sequence that contains strings. In this case, it is used as the enclosing quote for embedded strings.
Examples:
include std/console.e
display(to_string(12))           --> 12
display(to_string("abc"))        --> abc
display(to_string("abc",'"'))    --> "abc"
display(to_string(`abc\"`,'"'))  --> "abc\\\""
display(to_string({12,"abc",{4.5, -99}}))    --> {12, "abc", {4.5, -99}}
display(to_string({12,"abc",{4.5, -99}},,0)) --> {12, abc, {4.5, -99}}