1. sequence storage space
I have several ways to arrange a large sequence. How can I have
Euphoria 2.0 DOS/6.21 tell me the memory space required by each candidate?
Thank you, Art Adamson
Arthur P. Adamson, The Engine Man, euclid at isoc.net
2. Re: sequence storage space
Art Adamson writes:
> I have several ways to arrange a large sequence.
> How can I have Euphoria 2.0 DOS/6.21 tell me
> the memory space required by each candidate?
function bytes_needed(object x)
-- estimates the number of bytes of storage needed for any
-- Euphoria 2.0 data object (atom or sequence).
    integer space
    if integer(x) then
        return 4
    elsif atom(x) then
        return 16
    else
        -- sequence
        space = 24 -- overhead
        for i = 1 to length(x) do
            space = space + bytes_needed(x[i])
        end for
        return space
    end if
end function
? bytes_needed(1) -- 4
? bytes_needed(1.5) -- 16
? bytes_needed({1,2,3}) -- 36
? bytes_needed({{1.5,2.5}, {1,2}}) -- 112
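To compare candidate arrangements, the same estimator can be applied to each one in turn. The following is only a sketch: the two layouts (1000 separate pairs versus two parallel sequences) and the names pairs, xs, ys and columns are made up for illustration, and the estimator is repeated so the example runs on its own. The figures count every element as a separate copy, so they are an upper bound.

```euphoria
-- Same estimator as in the post, repeated so this runs standalone.
function bytes_needed(object x)
    integer space
    if integer(x) then
        return 4
    elsif atom(x) then
        return 16
    else
        space = 24 -- sequence overhead
        for i = 1 to length(x) do
            space = space + bytes_needed(x[i])
        end for
        return space
    end if
end function

sequence pairs, xs, ys, columns

-- Layout A (hypothetical): 1000 separate {x, y} pairs
pairs = {}
for i = 1 to 1000 do
    pairs = append(pairs, {i, i * 2})
end for

-- Layout B (hypothetical): two parallel 1000-element sequences
xs = repeat(0, 1000)
ys = repeat(0, 1000)
for i = 1 to 1000 do
    xs[i] = i
    ys[i] = i * 2
end for
columns = {xs, ys}

? bytes_needed(pairs)   -- 24 + 1000 * (24 + 8) = 32024
? bytes_needed(columns) -- 24 + 2 * (24 + 4000) = 8072
```

With these assumptions, the per-sequence overhead of 24 bytes dominates when the data is split into many small sequences.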
Keep in mind that Euphoria will often point to a
sequence or a (non-integer) atom, rather than make
a copy of it. For example:
x = repeat("Euphoria", 100)
y = 99.9
z = y
There will only be *one* copy of "Euphoria" in memory,
along with 100 4-byte pointers to it. There will only be
one copy of 99.9, pointed to by y and z.
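As a rough illustration of how much sharing can matter, the arithmetic for the repeat() example can be worked out from the figures quoted in this post (24 bytes of sequence overhead, 4 bytes per integer or pointer). This is a sketch, not a measurement; the names word, naive and shared are only for illustration, and the shared figure assumes none of the 100 elements has been modified since the repeat().

```euphoria
sequence word
integer naive, shared

word = "Euphoria" -- 8 characters, each stored as a 4-byte integer

-- Naive count: every one of the 100 elements is treated as its own copy
naive = 24 + 100 * (24 + 4 * length(word))

-- Shared count: 100 4-byte pointers plus a single copy of the string
shared = 24 + 100 * 4 + (24 + 4 * length(word))

? naive  -- 5624
? shared -- 480
```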
Euphoria is primarily tuned for speed. It does not,
for instance, try to store small integers in less than
4 bytes. The space overhead on sequences
consists of the length, plus a bunch of information
that allows for fast manipulation (growing/shrinking)
of sequences, as well as fast allocation and
deallocation of storage.
Whenever you have dynamic allocation of storage,
you have to use some storage to keep track of
which blocks of memory are used vs. free, their
length etc. That's included in the above figures.
Regards,
Rob Craig
Rapid Deployment Software
3. Re: sequence storage space
- Posted by MAP <ddhinc at ALA.NET>
Apr 17, 1998
Hello,
I wrote a couple of functions based on the info provided in
Robert Craig's function that achieve a fair amount of compression
for strings. By fair I mean not quite 2:1 for short strings,
and approaching 3:1 for longer ones. If saving space is really
critical, though, you'd be better off using memory allocation
with peek and poke or some other method.
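For reference, the peek/poke approach could look roughly like the sketch below, which stores one byte per character in a block obtained from allocate(). It assumes the standard machine.e include file is available; the variable names text, addr and restored are only for illustration.

```euphoria
include machine.e -- provides allocate() and free()

sequence text, restored
atom addr

text = "Euphoria stores each character in 4 bytes"

-- Reserve one byte per character and copy the string into raw memory
addr = allocate(length(text))
poke(addr, text)

-- Read the bytes back into an ordinary sequence when needed
restored = peek({addr, length(text)})
? compare(restored, text) -- 0 means the round trip preserved the string

free(addr)
```

The trade-off is that the block has to be freed by hand and the string must be read back into a sequence before it can be used with ordinary sequence operations.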
The two functions are quite simple: str_pack() compresses the
string by putting 3 characters into each element of the sequence,
and str_unpack() decompresses it. You might wonder why I didn't use
4 characters instead of 3. Packing 4 proved less compact in
every case I tried; possibly I just didn't try enough cases or
long enough strings. Of course, don't attempt to use this on
non-string sequences (i.e. if they contain values greater than
255 in any of the elements).
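A likely explanation for the 3-versus-4 result is that Euphoria integers are 31-bit values (up to 1073741823). Three packed bytes always fit in that range, but four packed bytes usually do not, and anything outside it is stored as a 16-byte atom according to the figures in Rob's estimator. The short sketch below shows the effect; the variable names three and four are only for illustration.

```euphoria
atom three, four

-- Pack three characters into one value, most significant first
three = ('a' * 256 + 'b') * 256 + 'c'

-- Pack a fourth character on top of that
four = three * 256 + 'd'

? three          -- 6382179
? integer(three) -- 1: fits in a 4-byte Euphoria integer
? integer(four)  -- 0: too large, so it is stored as a 16-byte atom
```

At 16 bytes for 4 characters, the 4-character packing costs the same 4 bytes per character as the unpacked string.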
-----------------------------------------------------------------------
function str_pack(sequence str)
    integer norm_strlen,
            packed_strlen,
            padding
    atom this_atom
    sequence packed_str
    norm_strlen = length(str)
    padding = remainder(norm_strlen, 3)
    packed_strlen = floor(norm_strlen / 3)
    packed_str = {}
    for cnt = 1 to packed_strlen do
        this_atom = 0
        for cnt2 = (3 * (cnt - 1)) + 1 to 3 * cnt do
            this_atom = this_atom * 256 + str[cnt2]
        end for
        packed_str = append(packed_str, this_atom)
    end for
    if padding then
        this_atom = 0
        for cnt = (3 * packed_strlen) + 1 to (3 * packed_strlen) + 3 do
            if padding then
                this_atom = this_atom * 256 + str[cnt]
                padding = padding - 1
            else
                this_atom = this_atom * 256
            end if
        end for
        packed_str = append(packed_str, this_atom)
    end if
    return packed_str
end function
function str_unpack(sequence pstr)
    sequence str,
             sub_str
    integer pstr_len
    atom this_atom
    pstr_len = length(pstr)
    str = {}
    sub_str = {0, 0, 0}
    for cnt = 1 to pstr_len do
        this_atom = pstr[cnt]
        -- peel off the three bytes, least significant first
        for cnt2 = 1 to 3 do
            sub_str[4 - cnt2] = and_bits(this_atom, 255)
            this_atom = floor(this_atom / 256)
        end for
        str = str & sub_str
    end for
    -- strip the zero padding added by str_pack(); the length check
    -- avoids a subscript error when the packed string is empty
    while length(str) > 0 do
        if compare(str[length(str)], 0) then
            exit
        end if
        str = str[1..length(str) - 1]
    end while
    return str
end function
-----------------------------------------------------------------------
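The compression ratios quoted at the top of this post can be checked against the storage figures from Rob's estimator (24 bytes of overhead plus 4 bytes per integer element). The sketch below is only that arithmetic; plain_bytes() and packed_bytes() are hypothetical helpers, not part of the functions above.

```euphoria
-- Estimated bytes for an ordinary string of n characters
function plain_bytes(integer n)
    return 24 + 4 * n
end function

-- Estimated bytes after packing 3 characters per element
function packed_bytes(integer n)
    return 24 + 4 * floor((n + 2) / 3)
end function

? {plain_bytes(8), packed_bytes(8)}     -- {56,36}, about 1.6:1
? {plain_bytes(300), packed_bytes(300)} -- {1224,424}, about 2.9:1
```

The fixed 24-byte overhead is what keeps short strings well below the 3:1 limit.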
Christopher D. Hickman