Re: ABS fastest thus far

new topic     » goto parent     » topic index » view thread      » older message » newer message

BABOR, JIRI wrote:
> Hi all,
> I am very happy to report that the clumsy looking recursive version of
> abs() below beat all the other contenders in all my benchmarks on a 266
> MHz pentium II machine, both for single atoms as well as long
> sequences. jiri

I'm ever so terribly sowwy jiri... that one isn't the fastest...
This is the finalist set of abs functions I tested, including yours,
carl's, some variations on a theme proposed by ralf, and optimizations
and code provided by myself...
A notable optimization is abs2, an optimization to your last
entry jiri(abs1). You gain more speed by jumping out after
testing for atoms first...
However, for sequences, abs4 is the winner, with a 25% improvement
over abs1.

****here is the code, and benchmark prog*****
include machine.e
tick_rate(1000)
atom now, temp
object answer
sequence data
integer seqlen

--jiri, recursive
function abs1(object x)
   --this should be the other way, test atoms first
   if sequence(x) then
      for i=1 to length(x) do
         x[i]=abs1(x[i])
      end for
      return x
   elsif x<0 then return -x
   else return x
   end if
end function

--jiri, recursive, optimized
function abs2(object x)
   if atom(x) then
      if x<0 then return -x else return x end if
   end if
   for i=1 to length(x) do
      x[i] = abs2(x[i])
   end for
   return x
end function

--carl, perhaps the most elegant, but a little wastefull...
function abs3(object x)
   if atom(x) then
      if x<0 then return -x else return x end if
   end if
   return x * ((x>0)-(x<0))
end function

--mine, nonrecursive, only sign changes what is needed
function abs4(object x)
   if atom(x) then
        if x<0 then return -x else return x end if
   end if
   for index=1 to length(x) do
     temp = x[index] --this assignment helped vs element lookup
     if temp<0 then x[index]=-temp end if
   end for
   return x
end function

--build a big sequence
function RandSeq(integer len)
sequence randseq
integer sign
   randseq = repeat(0,len)
   for num = 1 to len do
      sign = rand(2)                     --makes 1 or 2
      if sign = 2 then sign = - 1 end if --makes 1 or -1
      randseq[num] = rand(500) * sign
   end for
   return randseq
end function

procedure doatoms(integer which)
   now = time()
   for junk= -1000000 to 1000000 do
      if    which = 1 then answer = abs1(junk) --this may be why
      elsif which = 2 then answer = abs2(junk) --2,3,4 don't match
      elsif which = 3 then answer = abs3(junk) --in benchmark speed
      elsif which = 4 then answer = abs4(junk) --vs individual tests
      end if
   end for
   now = time() - now
   printf(1,"time abs%d using atoms:%6.3f\n",{which,now})
end procedure

procedure doseq(integer which, integer seqlen)
   now = time()
   for junk= 1 to 5000 do
      data= RandSeq(seqlen)
      if    which = 1 then answer = abs1(data) --doesn't seem to
      elsif which = 2 then answer = abs2(data) --affect results
      elsif which = 3 then answer = abs3(data) --with seq's tho...
      elsif which = 4 then answer = abs4(data) --
      end if
   end for
   now = time() - now
   printf(1,"time abs%d using seq's:%6.3f\n",{which,now})
end procedure

for test = 1 to 4 do doatoms(test)    end for
for test = 1 to 4 do doseq(test,500)  end for
for test = 1 to 4 do doseq(test,2000) end for
*************************************************

and here are the results:
=========================

time abs1 using atoms: 2.478
time abs2 using atoms: 2.479 --\
time abs3 using atoms: 2.355 =====
time abs4 using atoms: 2.573 --/
those three (abs2,3,4) *should* actually all be the same...
why they aren't i'm not sure, might be function calling
overhead, or my cache getting thrased for some reason
if you use individual testing, they do come out
much closer to each other... within less than a tenth
of a second per million atoms tested...
and 2,3,4 are all 4-5% faster than 1 because of that
testing of atoms first...
also, the culprit might be the if which... statement
since there are 2million iterations there...as opposed
to a few thousand for sequences...

length 500 sequence, 5000 times:
time abs1 using seq's:10.548 --jiri's last post
time abs2 using seq's:10.479 --jiri's optimized
time abs3 using seq's: 8.118 --carl's
time abs4 using seq's: 7.869 --mine, sign flipping only what's needed

length 2000 sequence, 5000 times:
time abs1 using seq's:42.508
time abs2 using seq's:42.335
time abs3 using seq's:34.074
time abs4 using seq's:31.815


as far as the one that will prolly end up in my toolbox??
for speed, carl's is second, and is so elegant, and
since atom speed can never be faster than
if x<0 return -x else return x
i'm thinking the slight speed loss over small
to medium sequences is worth the readability
and elegance.  how often are you going to
abs a really really big sequence? usually
it's atoms or small lists of coordinates
(less than a few thousand) and for that
abs3 is a few thousandths/sec slower,
maybe a frame/sec slower as a result...
liveable.

enjoy--Hawke'

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu