Re: Builtin sequence operations?
- Posted by Robert Craig <rds at RapidEuphoria.com> Dec 07, 2005
- 505 views
Cuvier Christian wrote: > Question for RC mainly: > > Nowadays, the fastest way to insert an element inside a sequence appears to > be: > > }}} <eucode> > target_sequence &= 0 --add dummy element at the end to get some more > storage space > target_sequence[insertion_point+1..$] = > target_sequence[insertion_point..$-1] -- shift sequence tail 1 position > right > target_sequence[insertion_point] = new_value -- now set the right spot > </eucode> {{{ > > and similarly to remove an element. > Wouldn't it be desirable to increase code effiiciency as well as to spare > coders' fingers, to have a builtin > insert(target_sequence,insertion_point,new_value) and > remove(target_sequence,removal_point) ? I never do it that way unless performance is critical and the sequence might be very long. To insert x, I usually just do: if x is an atom: s = s[1..n] & x & s[n+1..$] if x could be an atom or a sequence: s = append(s[1..n], x) & s[n+1..$] Both are pretty readable, and fairly efficient. It's tempting to add insert() and delete() builtins, but I don't think the performance gain would be that great, and I somehow don't like cluttering up the manual with tiny, somewhat redundant routines. > If sequences are implemented as linked lists, this should be far more > efficient than the current best method, right? Sequences are not implemented as linked lists. Although, back in the pre-1.0 days, I actually did start off implementing them as linked lists. The performance on subscripting, and also memory consumption, were both terrible. It's very wasteful to have a 4-byte link for each 4-byte integer. In an effort to speed up deletions and other things, I made it a doubly-linked list with forward and backward pointers, thus wasting 8 bytes for every 4-byte integer. Then, in an effort to speed up subscripting, I kept track of the last subscript value accessed, and a pointer to the corresponding element. That way, if the same subscript were used again right away, I wouldn't have to go chasing through a bunch of pointers to find it. And if an adjacent element were accessed next (as often happens), I could use the saved subscript pointer to quickly move to the adjacent one. This wasn't too bad for sequential access of a sequence (for i=1 to length(s) do), but was terrible for random access. Anyway, I eventually "saw the light", and realized that subscripting was extremely important, and the only way to do it quickly was to implement sequences like growable arrays. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com