Re: ABS fastest thus far
BABOR, JIRI wrote:
> Hi all,
> I am very happy to report that the clumsy looking recursive version of
> abs() below beat all the other contenders in all my benchmarks on a 266
> MHz pentium II machine, both for single atoms as well as long
> sequences. jiri
I'm ever so terribly sowwy jiri... that one isn't the fastest...
This is the finalist set of abs functions I tested, including yours,
carl's, some variations on a theme proposed by ralf, and optimizations
and code provided by myself...
A notable optimization is abs2, an optimization to your last
entry jiri(abs1). You gain more speed by jumping out after
testing for atoms first...
However, for sequences, abs4 is the winner, with a 25% improvement
over abs1.
****here is the code, and benchmark prog*****
include machine.e
tick_rate(1000)
atom now, temp
object answer
sequence data
integer seqlen
--jiri, recursive
function abs1(object x)
--this should be the other way, test atoms first
if sequence(x) then
for i=1 to length(x) do
x[i]=abs1(x[i])
end for
return x
elsif x<0 then return -x
else return x
end if
end function
--jiri, recursive, optimized
function abs2(object x)
if atom(x) then
if x<0 then return -x else return x end if
end if
for i=1 to length(x) do
x[i] = abs2(x[i])
end for
return x
end function
--carl, perhaps the most elegant, but a little wastefull...
function abs3(object x)
if atom(x) then
if x<0 then return -x else return x end if
end if
return x * ((x>0)-(x<0))
end function
--mine, nonrecursive, only sign changes what is needed
function abs4(object x)
if atom(x) then
if x<0 then return -x else return x end if
end if
for index=1 to length(x) do
temp = x[index] --this assignment helped vs element lookup
if temp<0 then x[index]=-temp end if
end for
return x
end function
--build a big sequence
function RandSeq(integer len)
sequence randseq
integer sign
randseq = repeat(0,len)
for num = 1 to len do
sign = rand(2) --makes 1 or 2
if sign = 2 then sign = - 1 end if --makes 1 or -1
randseq[num] = rand(500) * sign
end for
return randseq
end function
procedure doatoms(integer which)
now = time()
for junk= -1000000 to 1000000 do
if which = 1 then answer = abs1(junk) --this may be why
elsif which = 2 then answer = abs2(junk) --2,3,4 don't match
elsif which = 3 then answer = abs3(junk) --in benchmark speed
elsif which = 4 then answer = abs4(junk) --vs individual tests
end if
end for
now = time() - now
printf(1,"time abs%d using atoms:%6.3f\n",{which,now})
end procedure
procedure doseq(integer which, integer seqlen)
now = time()
for junk= 1 to 5000 do
data= RandSeq(seqlen)
if which = 1 then answer = abs1(data) --doesn't seem to
elsif which = 2 then answer = abs2(data) --affect results
elsif which = 3 then answer = abs3(data) --with seq's tho...
elsif which = 4 then answer = abs4(data) --
end if
end for
now = time() - now
printf(1,"time abs%d using seq's:%6.3f\n",{which,now})
end procedure
for test = 1 to 4 do doatoms(test) end for
for test = 1 to 4 do doseq(test,500) end for
for test = 1 to 4 do doseq(test,2000) end for
*************************************************
and here are the results:
=========================
time abs1 using atoms: 2.478
time abs2 using atoms: 2.479 --\
time abs3 using atoms: 2.355 =====
time abs4 using atoms: 2.573 --/
those three (abs2,3,4) *should* actually all be the same...
why they aren't i'm not sure, might be function calling
overhead, or my cache getting thrased for some reason
if you use individual testing, they do come out
much closer to each other... within less than a tenth
of a second per million atoms tested...
and 2,3,4 are all 4-5% faster than 1 because of that
testing of atoms first...
also, the culprit might be the if which... statement
since there are 2million iterations there...as opposed
to a few thousand for sequences...
length 500 sequence, 5000 times:
time abs1 using seq's:10.548 --jiri's last post
time abs2 using seq's:10.479 --jiri's optimized
time abs3 using seq's: 8.118 --carl's
time abs4 using seq's: 7.869 --mine, sign flipping only what's needed
length 2000 sequence, 5000 times:
time abs1 using seq's:42.508
time abs2 using seq's:42.335
time abs3 using seq's:34.074
time abs4 using seq's:31.815
as far as the one that will prolly end up in my toolbox??
for speed, carl's is second, and is so elegant, and
since atom speed can never be faster than
if x<0 return -x else return x
i'm thinking the slight speed loss over small
to medium sequences is worth the readability
and elegance. how often are you going to
abs a really really big sequence? usually
it's atoms or small lists of coordinates
(less than a few thousand) and for that
abs3 is a few thousandths/sec slower,
maybe a frame/sec slower as a result...
liveable.
enjoy--Hawke'
|
Not Categorized, Please Help
|
|