subsequence storage

new topic     » topic index » view thread      » older message » newer message

A thread a while back brought up the problem with sequence expansion:  when
a subsequence is appended to beyond the space buffered for it, the entire
sequence and all its branced are relocated.  I ran a speed check appending
first to ten sequences, then to ten subsequences and the difference was
startling:  what took the first set .74 secs on my computer took the second
set close to ten minutes!

I assume a complex sequence is stored to a single block of memory.  Why is
this?  I don't pretend to understand the inner workings of interpreters,
but I would think a sequence would be stored as a single strand, with
pointers representing substrings, so that {1,2,{3,4,5},6} would be stored
not as:

1,2,<begin_string>,3,4,5,<end_string>,6

but as:

memblock1: 1,2,<pointer_memblock2>,6
memblock2: 3,4,5

then, if I said

seq[3]=seq[3]&9

then you'd only need to copy memblock2 to memblock3: 3,4,5,9
and change the pointer in memblock1, without recopying the parent sequence
or any of the other subsequences.

How are sequences really stored? The freedom of the sequence structure is
one of my favorite features of euphoria, and I often take advantage of it
with very deep sequences, but now I'm almost afraid to use them.  greater
efficiency here is now on the top of my wish list.

In any case, I am curious as to how sequences are handled, but I understand
that Robert might not want to give away all his secrets.

Isaac



-- first number is time for independent sequences
-- second is for subseqences
-- third is ratio
-- currently set at 1000 repetitions; the test mentioned
-- in the message was with 10000
-- with 1000 I got .05 vs. 4.27

include machine.e
sequence q,w,e,r,t,y,u,i,o,p,seq
atom tick,t1,t2
constant reps=1000

tick_rate(100)
q={}
w={}
e={}
r={}
t={}
y={}
u={}
i={}
o={}
p={}
seq=repeat({},10)

tick=time()
for x=1 to reps do
    q=q&0
    w=w&0
    e=e&0
    r=r&0
    t=t&0
    y=y&0
    u=u&0
    i=i&0
    o=o&0
    p=p&0
end for
t1=(time()-tick)
?t1

tick=time()
    for x=1 to reps do
        seq[1]=seq[1]&0
        seq[2]=seq[2]&0
        seq[3]=seq[3]&0
        seq[4]=seq[4]&0
        seq[5]=seq[5]&0
        seq[6]=seq[6]&0
        seq[7]=seq[7]&0
        seq[8]=seq[8]&0
        seq[9]=seq[9]&0
        seq[10]=seq[10]&0
    end for
t2=(time()-tick)
?t2
?t2/t1

new topic     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu