1. sequence storage space

I have several ways to arrange a large sequence. How can I have
Euphoria 2.0 DOS/6.21 tell me the memory space required by each candidate?
Thank you, Art Adamson
Arthur P. Adamson, The Engine Man, euclid at isoc.net

2. Re: sequence storage space

Art Adamson writes:
> I have several ways to arrange a large sequence.
> How can I have Euphoria 2.0 DOS/6.21 tell me
> the memory space required by each candidate?

function bytes_needed(object x)
-- estimates the number of bytes of storage needed for any
-- Euphoria 2.0 data object (atom or sequence).
    integer space

    if integer(x) then
        return 4
    elsif atom(x) then
        return 16
    else
        -- sequence
        space = 24 -- overhead
        for i = 1 to length(x) do
            space = space + bytes_needed(x[i])
        end for
        return space
    end if
end function

? bytes_needed(1)                    -- 4
? bytes_needed(1.5)                  -- 16
? bytes_needed({1,2,3})              -- 36
? bytes_needed({{1.5,2.5}, {1,2}})   -- 112

Keep in mind that Euphoria will often point to a
sequence or a (non-integer) atom, rather than make
a copy of it. For example:

x = repeat("Euphoria", 100)
y = 99.9
z = y

There will only be *one* copy of "Euphoria" in memory,
along with 100 4-byte pointers to it. There will only be
one copy of 99.9, pointed to by y and z.
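
As a rough illustration of why this matters, the arithmetic below compares what a recursive walk like bytes_needed() would add up for x with what sharing suggests is actually allocated. It is a back-of-envelope sketch using only the per-object costs quoted above. The variable names are made up for the example, and whether the 24-byte overhead already covers the pointer slot is an assumption the figures do not settle.

```euphoria
-- Rough arithmetic only, using the costs quoted in this thread:
-- 4 bytes per integer or pointer, 24 bytes of overhead per sequence.
integer one_string, counted_100_times, shared_once

one_string = 24 + 8 * 4                     -- "Euphoria": overhead + 8 characters
counted_100_times = 24 + 100 * one_string   -- what a recursive walk adds up
shared_once = 24 + 100 * 4 + one_string     -- 100 pointers plus one shared copy

? one_string          -- 56
? counted_100_times   -- 5624
? shared_once         -- 480
```

So for data built with repeat(), or by assigning the same value many times, the estimate is best read as an upper bound.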

Euphoria is primarily tuned for speed. It does not,
for instance, try to store small integers in less than
4 bytes. The space overhead on sequences
consists of the length, plus a bunch of information
that allows for fast manipulation (growing/shrinking)
of sequences, as well as fast allocation and
deallocation of storage.

Whenever you have dynamic allocation of storage,
you have to use some storage to keep track of
which blocks of memory are used vs. free, their
lengths, etc. That's included in the above figures.
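
To compare candidate layouts, the same figures can be applied by hand. The sketch below estimates three hypothetical ways of holding 1000 values. The layouts and variable names are invented for illustration, and it assumes every inner sequence is a distinct copy rather than a shared one.

```euphoria
-- Estimates for 1000 values, using 4 bytes per integer,
-- 16 bytes per non-integer atom and 24 bytes per sequence.
integer flat_integers, records_of_four, flat_atoms

flat_integers   = 24 + 1000 * 4             -- one flat sequence of integers
records_of_four = 24 + 250 * (24 + 4 * 4)   -- 250 inner sequences of 4 integers
flat_atoms      = 24 + 1000 * 16            -- one flat sequence of non-integer atoms

? flat_integers     -- 4024
? records_of_four   -- 10024
? flat_atoms        -- 16024
```

On these figures, nesting and non-integer values are what cost space. A flat sequence of integers is the cheapest of the three.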

Regards,
     Rob Craig
     Rapid Deployment Software

3. Re: sequence storage space

Hello,
I wrote a couple of functions, based on the storage costs in
Robert Craig's function, that achieve a fair amount of compression
for strings. By fair I mean not quite 2:1 for short strings,
and approaching 3:1 for longer ones. If saving space is really
critical, though, you'd be better off allocating memory yourself
and using peek and poke, or some other method.

The two functions are quite simple: str_pack() compresses the
string by putting 3 characters into each element of the sequence,
and str_unpack() decompresses it. You might wonder why I didn't use
4 characters instead of 3. Packing 4 proved less compact in
every case I tried, though possibly I just didn't try enough cases or
long enough strings. Of course, don't attempt to use this on
non-string sequences (i.e. sequences with any element greater than
255). Note also that str_unpack() strips trailing zeros, so a
string that ends in 0 bytes will not come back unchanged.
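
A likely explanation for the 3-versus-4 result is that Euphoria integers are 31-bit (-1073741824 to 1073741823). Three packed bytes always fit in that range, but four bytes usually do not, and such an element would then be stored as a 16-byte atom rather than a 4-byte integer. A minimal check, with variable names chosen just for this example:

```euphoria
-- Assumes the 31-bit integer range: -1073741824 to 1073741823.
atom three_chars, four_chars

three_chars = 255 * 65536 + 255 * 256 + 255   -- largest possible 3-byte value
four_chars  = 'E' * 16777216                  -- 4-byte value whose first byte is 'E' (69)

? integer(three_chars)   -- 1: fits in a 4-byte integer
? integer(four_chars)    -- 0: too large, so it is stored as a 16-byte atom
```

At 16 bytes per 4 characters, the packed form would cost the same 4 bytes per character as the unpacked string.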

-----------------------------------------------------------------------
function str_pack(sequence str)
    integer norm_strlen,
            packed_strlen,
            padding
    atom this_atom
    sequence packed_str

    norm_strlen = length(str)
    padding = remainder(norm_strlen, 3)
    packed_strlen = floor(norm_strlen / 3)
    packed_str = {}
    for cnt = 1 to packed_strlen do
        this_atom = 0
        for cnt2 = (3 * (cnt - 1)) + 1 to 3 * cnt do
            this_atom = this_atom * 256 + str[cnt2]
        end for
        packed_str = append(packed_str, this_atom)
    end for
    if padding then
        this_atom = 0
        for cnt = (3 * packed_strlen) + 1 to (3 * packed_strlen) + 3 do
            if padding then
                this_atom = this_atom * 256 + str[cnt]
                padding = padding - 1
            else
                this_atom = this_atom * 256
            end if
        end for
        packed_str = append(packed_str, this_atom)
    end if
    return packed_str
end function

function str_unpack(sequence pstr)
    sequence str,
             sub_str
    integer pstr_len
    atom this_atom

    pstr_len = length(pstr)
    str = {}
    sub_str = {0, 0, 0}
    for cnt = 1 to pstr_len do
        this_atom = pstr[cnt]
        for cnt2 = 1 to 3 do
            sub_str[4 - cnt2] = and_bits(this_atom, 255)
            this_atom = floor(this_atom / 256)
        end for
        str = str & sub_str
    end for
    -- strip the zero padding added by str_pack(); stop if str becomes empty
    while length(str) > 0 do
        if str[length(str)] != 0 then
            exit
        end if
        str = str[1..length(str) - 1]
    end while
    return str
end function
-----------------------------------------------------------------------
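
The quoted ratios can be sanity-checked against the storage figures given earlier in the thread (24 bytes per sequence, 4 bytes per integer element). The two helper functions below are hypothetical and exist only for this estimate. They do not call str_pack() itself.

```euphoria
-- Estimated bytes for a string of n characters stored one per element
function plain_bytes(integer n)
    return 24 + 4 * n
end function

-- Estimated bytes for the same string packed 3 characters per element
function packed_bytes(integer n)
    return 24 + 4 * floor((n + 2) / 3)
end function

? {plain_bytes(8), packed_bytes(8)}       -- {56,36}
? {plain_bytes(300), packed_bytes(300)}   -- {1224,424}
```

That is roughly 1.6:1 for an 8-character string and 2.9:1 for a 300-character one, since the fixed 24-byte overhead matters less as the string grows.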

Christopher D. Hickman
