update internals 2

Documentation Version for Comments and Changes

You are invited to make any changes and add any comments.

Changes will `eventually` be merged into the official documentation.

Leave any comments here...



A Euphoria object is stored in a four-byte signed integer. Legal values for Euphoria integers lie between -1,073,741,824 (-2^30) and +1,073,741,823 (2^30 - 1). Read as unsigned hexadecimal numbers, C000_0000 to FFFF_FFFF are the negative integers and 0000_0000 to 3FFF_FFFF are the non-negative integers. The hexadecimal values not used as integers are thus 4000_0000 to BFFF_FFFF, and part of that range is used for encoded pointers.

Pointers are always 8-byte aligned, so their three low bits are always zero. A pointer can therefore be stored in 29 bits instead of 32 and fits in a hexadecimal range 0x2000_0000 long. The pointers are encoded in such a way that their encoded values never fall in the range of the integers. Pointers to sequence structures (struct s1) are encoded into the range 8000_0000 to 9FFF_FFFF. Pointers to structures for doubles (struct d) are encoded into the range A000_0000 to BFFF_FFFF. The range 4000_0000 to 7FFF_FFFF is unused.

A special value, NOVALUE, sits at the end of the range of encoded pointers (BFFF_FFFF). It signifies that no value has yet been assigned to a variable, and it also marks the end of a sequence. In C, values of this type are stored in the 'object' type.
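The ranges above can be checked with a small C sketch. It only classifies a 32-bit pattern by the range it falls in. The names `kind_t`, `classify` and `decode_integer` are hypothetical and are not taken from the interpreter source, and the sketch does not attempt to show how a pointer is actually encoded or decoded.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the interpreter's own macros. */
typedef enum {
    K_INTEGER,       /* 0000_0000..3FFF_FFFF and C000_0000..FFFF_FFFF */
    K_UNUSED,        /* 4000_0000..7FFF_FFFF */
    K_SEQUENCE_PTR,  /* 8000_0000..9FFF_FFFF */
    K_DOUBLE_PTR,    /* A000_0000..BFFF_FFFE */
    K_NOVALUE        /* BFFF_FFFF */
} kind_t;

/* Classify a raw 32-bit object pattern by the range it falls in. */
static kind_t classify(uint32_t bits)
{
    if (bits <= 0x3FFFFFFFu || bits >= 0xC0000000u) return K_INTEGER;
    if (bits == 0xBFFFFFFFu) return K_NOVALUE;
    if (bits >= 0xA0000000u) return K_DOUBLE_PTR;
    if (bits >= 0x80000000u) return K_SEQUENCE_PTR;
    return K_UNUSED;
}

/* Recover the signed value of a pattern that is a Euphoria integer.
   Written without relying on implementation-defined conversions. */
static int32_t decode_integer(uint32_t bits)
{
    assert(classify(bits) == K_INTEGER);
    if (bits >= 0xC0000000u)
        return (int32_t)(bits - 0xC0000000u) - 0x40000000;
    return (int32_t)bits;
}
```

For example, `classify(0xC0000000u)` yields `K_INTEGER` and `decode_integer(0xC0000000u)` yields -1,073,741,824, the smallest legal Euphoria integer.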

A double structure (struct d) can hold a value that lies within the legal range of a Euphoria integer. The interpreter still recognizes an encoded pointer to such a structure as an 'integer', but in this internals document 'Euphoria integer' means a value that is actually stored as a C integer in the legal Euphoria integer range.

The C Representations of a Euphoria Sequence and a Euphoria Atom

// Sequence Header
struct s1 {
 object_ptr base;     // base is such that base[1] is the first element
 long length;         // this is the sequence length
 long ref;            // ref is the number of virtual copies of this sequence
 long postfill;       // how many extra objects could fit at the end of base
 cleanup_ptr cleanup; // pointer to a Euphoria routine that is run
                      // just before the sequence is freed
};

However, we allocate more than this structure. Inside the allocated block, past the structure, there is also an area of pre fill space; the sequence data, base[1] to base[$], where $ is the length; a NOVALUE terminator for the sequence; and an area of post fill space. In memory, the following data is stored immediately after the structure:

 object pre_fill_space[]; // could have 0 (not exist) or more elements before used data 
 object base[1..$];       // sequence members pointed to by base 
 object base[$+1];        // a magic number terminating the sequence members (NOVALUE) 
 object post_fill_space[];// could have 0 (not exist) or more elements after used data 

Taken together these are what get represented in memory.

| base | length | ref | postfill | cleanup | pre fill space | base[1..$] | NOVALUE | post fill space |

By their nature, sequences are variable length, dynamic entities and so the C structure needs to cater for this. When a sequence is created, we allocate enough RAM for the combined header and the initial storage for the elements.
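The combined allocation can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the interpreter's allocator: `new_sequence` is a hypothetical name, `object` is modelled as `int32_t`, `cleanup_ptr` is a stand-in function pointer type, NOVALUE is written as the signed value whose 32-bit pattern is BFFF_FFFF, the initial reference count of 1 is assumed, and no pre fill or post fill space is reserved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int32_t object;              /* assumption: four-byte object */
typedef object *object_ptr;
typedef void (*cleanup_ptr)(void);   /* hypothetical stand-in type */

#define NOVALUE ((object)-0x40000001) /* bit pattern BFFF_FFFF */

struct s1 {
    object_ptr base;
    long length;
    long ref;
    long postfill;
    cleanup_ptr cleanup;
};

/* Allocate header + elements + terminator in one block (hypothetical helper). */
static struct s1 *new_sequence(long length)
{
    assert(length >= 0);
    size_t bytes = sizeof(struct s1) + ((size_t)length + 1) * sizeof(object);
    struct s1 *s = malloc(bytes);
    if (s == NULL)
        return NULL;
    /* Element storage starts right after the header; base is one element
       earlier so that base[1] is the first element. */
    s->base = (object_ptr)(s + 1) - 1;
    s->length = length;
    s->ref = 1;                       /* assumption: one initial reference */
    s->postfill = 0;                  /* no spare room in this sketch */
    s->cleanup = NULL;
    for (long i = 1; i <= length; i++)
        s->base[i] = 0;               /* Euphoria integer 0 */
    s->base[length + 1] = NOVALUE;    /* terminator after the last element */
    return s;
}
```

After `struct s1 *s = new_sequence(3);`, the elements are `s->base[1]` to `s->base[3]`, `s->base[4]` holds NOVALUE, and a single `free(s)` releases the header and the element storage together.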

Field     Description
base      Contains the address of the first element less the size of one element. Thus base[1] is the first element and base[0] is a fictitious element just before the first one, which is never used. Initially, base holds the address of the last member of the sequence header, but as the sequence is resized it can point to the last member or anywhere after it.
length    Contains the current number of elements in the sequence.
ref       Contains the count of references to this sequence. Only when this is zero can the RAM used by the sequence be returned to the system for reuse.
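The role of the ref field can be illustrated with a reduced sketch. The type `seq_hdr` and the helpers `add_ref` and `drop_ref` are hypothetical names used only here; the header is cut down to the one field that matters, and the cleanup routine that would run before the sequence is freed is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced header: only the reference count is modelled. */
struct seq_hdr {
    long ref;
};

/* Record one more reference to the sequence. */
static void add_ref(struct seq_hdr *s)
{
    s->ref++;
}

/* Drop one reference. The storage is released only when the count
   reaches zero. Returns 1 if the sequence was freed, 0 otherwise. */
static int drop_ref(struct seq_hdr *s)
{
    assert(s->ref > 0);
    if (--s->ref == 0) {
        free(s);
        return 1;
    }
    return 0;
}
```

With a count of 2, the first `drop_ref` returns 0 and leaves the storage alone; the second returns 1 and frees it.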