1. Builtin sequence operations?

Question for RC mainly:

Nowadays, the fastest way to insert an element inside a sequence appears to
be:

target_sequence &= 0      --add dummy element at the end to get some more
storage space
target_sequence[insertion_point+1..$] =
target_sequence[insertion_point..$-1] -- shift sequence tail 1 position
right
target_sequence[insertion_point] = new_value     -- now set the right spot


and similarly to remove an element.
Wouldn't it be desirable to increase code effiiciency as well as to spare
coders' fingers, to have a builtin
insert(target_sequence,insertion_point,new_value) and
remove(target_sequence,removal_point) ?

If sequences are implemented as linked lists, this should be far more
efficient than the current best method, right?

CChris

new topic     » topic index » view message » categorize

2. Re: Builtin sequence operations?

Cuvier Christian wrote:
> 
> Question for RC mainly:
> 
> Nowadays, the fastest way to insert an element inside a sequence appears to
> be:
> 
> }}}
<eucode>
> target_sequence &= 0      --add dummy element at the end to get some more
> storage space
> target_sequence[insertion_point+1..$] =
> target_sequence[insertion_point..$-1] -- shift sequence tail 1 position
> right
> target_sequence[insertion_point] = new_value     -- now set the right spot
> </eucode>
{{{

> 
> and similarly to remove an element.
> Wouldn't it be desirable to increase code effiiciency as well as to spare
> coders' fingers, to have a builtin
> insert(target_sequence,insertion_point,new_value) and
> remove(target_sequence,removal_point) ?
> 
> If sequences are implemented as linked lists, this should be far more
> efficient than the current best method, right?
> 
> CChris
> 
> 

In other languages that have linked lists, do they use a dynamic growing chain
of pointers to keep track of all allocated memory addresses reserved for each
element in the linked list?

Wouldn't you think that could cause extreme slow down when using a loop to scan
to a perticular list element? What if the list is 10 million elements long? To
seek to element #10 million, it would have to scan through all 10 million
pointers to retrieve the value stored in it?

Imagine trying to subscript and manuplate values in random elements, and on the
fly; that would be painful!


Regards,
Vincent

new topic     » goto parent     » topic index » view message » categorize

3. Re: Builtin sequence operations?

Cuvier Christian wrote:
> Question for RC mainly:
> 
> Nowadays, the fastest way to insert an element inside a sequence appears to
> be:
> 
> }}}
<eucode>
> target_sequence &= 0      --add dummy element at the end to get some more
> storage space
> target_sequence[insertion_point+1..$] =
> target_sequence[insertion_point..$-1] -- shift sequence tail 1 position
> right
> target_sequence[insertion_point] = new_value     -- now set the right spot
> </eucode>
{{{

> 
> and similarly to remove an element.
> Wouldn't it be desirable to increase code effiiciency as well as to spare
> coders' fingers, to have a builtin
> insert(target_sequence,insertion_point,new_value) and
> remove(target_sequence,removal_point) ?

I never do it that way unless performance is critical
and the sequence might be very long. To insert x,
I usually just do:

if x is an atom:

   s = s[1..n] & x & s[n+1..$]

if x could be an atom or a sequence:

   s = append(s[1..n], x) & s[n+1..$]

Both are pretty readable, and fairly efficient.

It's tempting to add insert() and delete() builtins,
but I don't think the performance gain would be that great,
and I somehow don't like cluttering up the manual with tiny,
somewhat redundant routines. 

> If sequences are implemented as linked lists, this should be far more
> efficient than the current best method, right?

Sequences are not implemented as linked lists.

Although, back in the pre-1.0 days, I actually did start off
implementing them as linked lists. The performance
on subscripting, and also memory consumption, were both terrible.
It's very wasteful to have a 4-byte link for each 4-byte integer.
In an effort to speed up deletions and other things, I made it a 
doubly-linked list with forward and backward pointers, thus wasting
8 bytes for every 4-byte integer. Then, in an effort to speed up
subscripting, I kept track of the last subscript value accessed, and
a pointer to the corresponding element. That way, if the same
subscript were used again right away, I wouldn't have to go chasing
through a bunch of pointers to find it. And if an adjacent element
were accessed next (as often happens), I could use the saved subscript
pointer to quickly move to the adjacent one. This wasn't too bad
for sequential access of a sequence (for i=1 to length(s) do), 
but was terrible for random access.

Anyway, I eventually "saw the light", and realized that subscripting
was extremely important, and the only way to do it quickly was to
implement sequences like growable arrays.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

4. Re: Builtin sequence operations?

On 12/8/05, christian cuvier <christian.cuvier at insee.fr> wrote:
> If sequences are implemented as linked lists, this should be far more
> efficient than the current best method, right?

I'm pretty sure they're not linked-lists... otherwise it'd be very
slow doing random access. (seq[n])

As for creating a function to do this, I don't think it could be made
much faster than it currently is - if speed is really important, find
a better algorithm.
--
MrTrick
----------

new topic     » goto parent     » topic index » view message » categorize

5. Re: Builtin sequence operations?

> 
> posted by: Robert Craig <rds at RapidEuphoria.com>
> 
> Cuvier Christian wrote:
> > Question for RC mainly:
> > 
> > Nowadays, the fastest way to insert an element inside a 
> sequence appears to
> > be:
> > 
> > }}}
<eucode>
> > target_sequence &= 0      --add dummy element at the end to 
> get some more
> > storage space
> > target_sequence[insertion_point+1..$] =
> > target_sequence[insertion_point..$-1] -- shift sequence 
> tail 1 position
> > right
> > target_sequence[insertion_point] = new_value     -- now set 
> the right spot
> > </eucode>
{{{

> > 
> > and similarly to remove an element.
> > Wouldn't it be desirable to increase code effiiciency as 
> well as to spare
> > coders' fingers, to have a builtin
> > insert(target_sequence,insertion_point,new_value) and
> > remove(target_sequence,removal_point) ?
> 
> I never do it that way unless performance is critical
> and the sequence might be very long. To insert x,

That was my concern. Otherwise, the gains, if any, are just not noticeable
enough.

> I usually just do:
> 
> if x is an atom:
> 
>    s = s[1..n] & x & s[n+1..$]
> 
> if x could be an atom or a sequence:
> 
>    s = append(s[1..n], x) & s[n+1..$]
> 
> Both are pretty readable, and fairly efficient.
> 
> It's tempting to add insert() and delete() builtins,
> but I don't think the performance gain would be that great,
> and I somehow don't like cluttering up the manual with tiny,
> somewhat redundant routines.

I wouldn't consider this to be clutter, but just standard reusable code for
a frequent task, without the overhead of an include file and the nagging
issues of the current namespace system - another, thorny but different,
topic. Having these available 
would spare the coder either another include file or reinventing the wheel.

Well, unless you consider introducing macros in the language. They would be
really adequate for this sort of simple situations. They can be misused just
as about anything, and can simplify coding, make code more readablea s well.
 
> 
> > If sequences are implemented as linked lists, this should 
> be far more
> > efficient than the current best method, right?
> 
> Sequences are not implemented as linked lists.
> 
> Although, back in the pre-1.0 days, I actually did start off
> implementing them as linked lists. The performance
> on subscripting, and also memory consumption, were both terrible.
> It's very wasteful to have a 4-byte link for each 4-byte integer.
> In an effort to speed up deletions and other things, I made it a 
> doubly-linked list with forward and backward pointers, thus wasting
> 8 bytes for every 4-byte integer. Then, in an effort to speed up
> subscripting, I kept track of the last subscript value accessed, and
> a pointer to the corresponding element. That way, if the same
> subscript were used again right away, I wouldn't have to go chasing
> through a bunch of pointers to find it. And if an adjacent element
> were accessed next (as often happens), I could use the saved subscript
> pointer to quickly move to the adjacent one. This wasn't too bad
> for sequential access of a sequence (for i=1 to length(s) do), 
> but was terrible for random access.
> 
> Anyway, I eventually "saw the light", and realized that subscripting
> was extremely important, and the only way to do it quickly was to
> implement sequences like growable arrays.
>

Thanks for the precise argumentation.

CChris
 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com
>

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu