subsequence storage
- Posted by isaac <isaaca at MINDSPRING.COM> Jul 06, 1998
- 599 views
A thread a while back brought up the problem with sequence expansion: when a subsequence is appended to beyond the space buffered for it, the entire sequence and all its branced are relocated. I ran a speed check appending first to ten sequences, then to ten subsequences and the difference was startling: what took the first set .74 secs on my computer took the second set close to ten minutes! I assume a complex sequence is stored to a single block of memory. Why is this? I don't pretend to understand the inner workings of interpreters, but I would think a sequence would be stored as a single strand, with pointers representing substrings, so that {1,2,{3,4,5},6} would be stored not as: 1,2,<begin_string>,3,4,5,<end_string>,6 but as: memblock1: 1,2,<pointer_memblock2>,6 memblock2: 3,4,5 then, if I said seq[3]=seq[3]&9 then you'd only need to copy memblock2 to memblock3: 3,4,5,9 and change the pointer in memblock1, without recopying the parent sequence or any of the other subsequences. How are sequences really stored? The freedom of the sequence structure is one of my favorite features of euphoria, and I often take advantage of it with very deep sequences, but now I'm almost afraid to use them. greater efficiency here is now on the top of my wish list. In any case, I am curious as to how sequences are handled, but I understand that Robert might not want to give away all his secrets. Isaac -- first number is time for independent sequences -- second is for subseqences -- third is ratio -- currently set at 1000 repetitions; the test mentioned -- in the message was with 10000 -- with 1000 I got .05 vs. 4.27 include machine.e sequence q,w,e,r,t,y,u,i,o,p,seq atom tick,t1,t2 constant reps=1000 tick_rate(100) q={} w={} e={} r={} t={} y={} u={} i={} o={} p={} seq=repeat({},10) tick=time() for x=1 to reps do q=q&0 w=w&0 e=e&0 r=r&0 t=t&0 y=y&0 u=u&0 i=i&0 o=o&0 p=p&0 end for t1=(time()-tick) ?t1 tick=time() for x=1 to reps do seq[1]=seq[1]&0 seq[2]=seq[2]&0 seq[3]=seq[3]&0 seq[4]=seq[4]&0 seq[5]=seq[5]&0 seq[6]=seq[6]&0 seq[7]=seq[7]&0 seq[8]=seq[8]&0 seq[9]=seq[9]&0 seq[10]=seq[10]&0 end for t2=(time()-tick) ?t2 ?t2/t1