1. Re: type string

ddhinc <ddhinc at ALA.NET> wrote:

>global type string13(object x)
>
>   if atom(x) then
>      return 0
>   end if
>
>   -- ensure all elements are in the range 0 to 255
>   if compare(x, and_bits(255, x)) then
>      return 0
>   end if
>
>   -- ensure there's no sub-sequences
>   for i = 1 to length(x) do
>      if not integer(x[i]) then
>         return 0
>      end if
>   end for
>   return 1
>end type


Chris, I admit I was quite skeptical at first about this routine being
faster than string12.  The "and_bits(255, x)" operation has to be applied to
the *entire* sequence -- including subsequences, if any -- before a result
can be returned, which would lead me to believe it would be slower.  When I
read your benchmark code and saw your program's results, I was stunned.  How
could this be faster?

I had already reached some of Rod Jackson's conclusions by the time I
received his post -- namely, that it takes *really* long sequences (like
those in your test program) for string13 to come out on top.  I also figured
out another reason why your benchmark program favored string13.  You have the
non-integer, non-byte values situated near the ends of those loooooong
sequence values.  If you put those non-byte values at the very beginning,
string12 comes out *much* faster, since it can exit immediately when it hits
the first non-byte value, while string13 has to and_bits() the whole shebang
first.
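
To make the early-exit point concrete, here is a minimal sketch of a
loop-based check.  It is NOT the actual string12 (which isn't quoted in this
post); the name string_loop and its layout are made up for illustration.  The
only point is that it can return at the first bad element, without touching
the rest of the sequence.

```euphoria
-- Hypothetical loop-based check (not the real string12): each element is
-- tested in turn, and the routine returns at the first failure.
global type string_loop(object x)
   object e

   if atom(x) then
      return 0
   end if

   for i = 1 to length(x) do
      e = x[i]
      -- rejects sub-sequences and fractional values
      if not integer(e) then
         return 0
      end if
      -- nested ifs, so nothing here depends on short-circuit evaluation
      if e < 0 then
         return 0
      end if
      if e > 255 then
         return 0
      end if
   end for
   return 1
end type
```

With a value like {-1, 65, 66, ...}, a routine of this shape is done after one
element no matter how long the sequence is, whereas string13 has already
built the whole and_bits() result before it compares anything.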

That's not to say your test conditions were unrealistic -- we can't always
expect to use strings shorter than 80 chars, or to have non-string
sequences beginning with an obviously non-byte value.  But I think Daniel
Berstein's benchmark program uses a more "average" default test value.  (Too
bad it doesn't work in Euphoria 2.0, though...)

Still, I *really* liked the idea of using and_bits() to verify the range.  I
even came up with a version of my string12 which used and_bits() to verify
the range, rather than the "if" statements.  However, in both your benchmark
and Daniel's, the times for this routine were consistently just a bit slower
than string12.
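
The per-element and_bits() idea can be sketched as follows.  Again, this is
an illustrative reconstruction under my own assumptions, not the routine that
was actually benchmarked, and the name string_mask is hypothetical.  The two
range comparisons are replaced by a single mask-and-compare on each element.

```euphoria
-- Hypothetical variation: and_bits() on each element replaces the
-- "< 0" and "> 255" tests.
global type string_mask(object x)
   object e

   if atom(x) then
      return 0
   end if

   for i = 1 to length(x) do
      e = x[i]
      -- must come first: and_bits() on a sub-sequence would return a
      -- sequence, which cannot be used as an "if" condition
      if not integer(e) then
         return 0
      end if
      -- masking leaves 0..255 unchanged; anything else comes back different
      if e != and_bits(e, 255) then
         return 0
      end if
   end for
   return 1
end type
```

A routine like this still exits early, but it pays for one built-in call per
element, which may be part of why such a version would not beat plain
comparisons.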

All of this leads me to suspect that string13's speed advantage for
loooooong sequences relies on the and_bits() function being *very*
well-optimized for loooooong sequences.  Its disadvantage, then, would
similarly result from and_bits() being not quite so well-optimized for the
shorter sequences.  (I'd have to play around with this some more to be sure
-- perhaps this is true for all of the built-in functions?  Perhaps this can
be fixed?)
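
One way to poke at this hunch is a rough timing sketch like the one below.
It is only a sketch: the TOTAL figure and the test lengths are arbitrary, the
name time_and_bits is made up, and time() can be fairly coarse on some
platforms, so TOTAL may need raising before the numbers mean anything.  Each
run pushes about the same number of elements through and_bits(), split into
short or long sequences.

```euphoria
-- Rough timing sketch: the same total number of elements goes through
-- and_bits() in each run, in pieces of different lengths.
constant TOTAL = 1000000   -- elements per run (arbitrary choice)

function time_and_bits(integer len)
   sequence s, r
   atom t

   s = repeat(65, len)
   t = time()
   for i = 1 to floor(TOTAL / len) do
      r = and_bits(255, s)   -- whole-sequence operation, as in string13
   end for
   return time() - t
end function

printf(1, "length %d: %.2f seconds\n", {10, time_and_bits(10)})
printf(1, "length %d: %.2f seconds\n", {80, time_and_bits(80)})
printf(1, "length %d: %.2f seconds\n", {10000, time_and_bits(10000)})
```

If the short runs come out noticeably slower per element, that would point to
some fixed cost per call rather than anything about the masking itself.  Note
that this does not separate the cost of the "for" loop from the cost of the
call, so it is a starting point at best.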

Thanks for the alternate view! :)


Be seeing you,
   Gabriel Boehme


 -------
The modern world is a crowd of very rapid racing cars all brought to a
standstill and stuck in a block of traffic.

G.K. Chesterton
 -------
