1. Re: type string
- Posted by "Boehme, Gabriel" <gboehme at MUSICLAND.COM>, Mar 17, 1999 (last edited Mar 18, 1999)
ddhinc <ddhinc at ALA.NET> wrote:
>global type string13(object x)
>
>   if atom(x) then
>      return 0
>   end if
>
>   -- ensure all elements are in the range 0 to 255
>   if compare(x, and_bits(255, x)) then
>      return 0
>   end if
>
>   -- ensure there's no sub-sequences
>   for i = 1 to length(x) do
>      if not integer(x[i]) then
>         return 0
>      end if
>   end for
>   return 1
>end type
Chris, I admit I was quite skeptical at first about this routine being
faster than string12. The "and_bits(255, x)" operation has to be applied to
the *entire* sequence -- including subsequences, if any -- before a result
can be returned, which would lead me to believe it would be slower. When I
read your benchmark code and saw your program's results, I was stunned. How
could this be faster?
By the time I received Rod Jackson's post, I had already reached some of his
conclusions -- chiefly that it takes *really* long sequences (like those in
your test program) for string13 to come out on top. I also figured out
another reason why your benchmark program favored string13: you have the
non-integer, non-byte values situated near the ends of those loooooong
sequence values. If you put those non-byte values at the very beginning,
string12 comes out *much* faster, since it can exit immediately when it hits
the first non-byte value, while string13 has to and_bits() the whole shebang
first.
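
To make the position effect concrete, here is a minimal sketch. The routine
string_loop is a hypothetical stand-in for an element-by-element check (the
real string12 is not shown in this post), and the sequence length and repeat
count are arbitrary assumptions. It times the same check on two inputs that
differ only in where the bad value sits:

```euphoria
-- string_loop: hypothetical element-by-element check, NOT the real string12
type string_loop(object x)
    object e
    if atom(x) then
        return 0
    end if
    for i = 1 to length(x) do
        e = x[i]
        if not integer(e) then
            return 0        -- sub-sequence or fraction: exit at once
        end if
        if e < 0 then
            return 0
        end if
        if e > 255 then
            return 0        -- out of byte range: exit at once
        end if
    end for
    return 1
end type

sequence bad_first, bad_last
atom t
integer r

-- two long test values (length is an assumption), one bad element each
bad_first = repeat('a', 10000)
bad_last = bad_first
bad_first[1] = 1000         -- non-byte value at the very beginning
bad_last[10000] = 1000      -- non-byte value at the very end

t = time()
for n = 1 to 1000 do
    r = string_loop(bad_first)
end for
printf(1, "bad value first: %.2f sec\n", time() - t)

t = time()
for n = 1 to 1000 do
    r = string_loop(bad_last)
end for
printf(1, "bad value last:  %.2f sec\n", time() - t)
```

Timing string13 from the quoted post on the same two values should show
little difference between them, since and_bits() processes the entire
sequence before compare() gets a chance to reject it.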
That's not to say your test conditions were unrealistic -- we can't always
expect to use strings shorter than 80 chars, or to have non-string
sequences beginning with an obviously non-byte value. But I think Daniel
Berstein's benchmark program uses a more "average" default test value. (Too
bad it doesn't work in Euphoria 2.0, though...)
Still, I *really* liked the idea of using and_bits() to verify the range. I
even came up with a version of my string12 which used and_bits() to verify
the range, rather than the "if" statements. However, in both your benchmark
and Daniel's, the times for this routine were consistently just a bit slower
than string12.
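
For illustration only, a per-element check that uses and_bits() for the
range test might look something like the sketch below. The name
string_loop_bits and the exact structure are assumptions; this is not the
actual routine that was benchmarked.

```euphoria
-- string_loop_bits: hypothetical sketch, not the routine that was timed
type string_loop_bits(object x)
    object e
    if atom(x) then
        return 0
    end if
    for i = 1 to length(x) do
        e = x[i]
        if not integer(e) then
            return 0        -- rejects sub-sequences and fractions
        end if
        -- masking to the low 8 bits leaves only 0..255 unchanged,
        -- so any negative or too-large value fails this test
        if e != and_bits(e, 255) then
            return 0
        end if
    end for
    return 1
end type
```

This trades two comparisons per element for one built-in call plus one
comparison, while keeping the early exit on the first bad element.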
All of this leads me to suspect that string13's speed advantage for
loooooong sequences relies on the and_bits() function being *very*
well-optimized for loooooong sequences. Its disadvantage on shorter
sequences, then, would similarly result from and_bits() not being quite so
well-optimized for them. (I'd have to play around with this some more to be
sure -- perhaps this is true for all of the built-in functions? Perhaps this
can be fixed?)
Thanks for the alternate view! :)
Be seeing you,
Gabriel Boehme
-------
The modern world is a crowd of very rapid racing cars all brought to a
standstill and stuck in a block of traffic.
G.K. Chesterton
-------