1. new new challenge

This is a new and (I think) simpler challenge:
Find the fastest algorithm to compute the largest power of 2 that exactly
divides an integer from 1 to 2^31 - 1.
The following is the test program I propose:

function dummy(integer x)
    return x
end function

function best(integer x)
    -- ?????????
    return -- ?????????
end function

procedure test(integer iter, integer top)
    integer n, k
    atom sum, t0, t1
    sum = 0
    for i = 1 to top do
     for j = 0 to 31 do
         n = power(2, j)
         if remainder(i,n) != 0 then
      exit
     end if
 end for
 n = n / 2
 t1 = time()
 for j = 1 to iter do
     k = dummy(i)
 end for
 t1 = time() - t1
 t0 = time()
 for j = 1 to iter do
     k = best(i)
 end for
 sum += time() - t0 - t1
 if n != k then
     printf(1, "%s %d %s %d \n", {"Error! ", n, "not equal", k})
 end if
end for
printf(1, "%s %f \n", {"Time:", sum})
end procedure

test(100000, 100)

Of course, there is a trick I know.

new topic     » topic index » view message » categorize

2. Re: new new challenge

> This is a new and (I think) simpler challenge:
> Find the fastest algorithm to compute the largest power of 2 that
exactly
> divides an integer from 1 to 2^31 - 1.

Hi,
The solution boils down to finding the value of the least significant
bit.
Here are two alternatives... unfortunately the ugly one is faster.
Ross


-- This solution is elegant but slower due to the divide by 2.
-- If Euphoria had a bit shift operator or integer division
-- this would probably be the fastest method.

function best2(integer x)
    if and_bits(x,1) then  -- quickly eliminate odd numbers
        return 1
    end if
    return floor(xor_bits(x,x-1)/2) + 1
end function

-- Cruder and faster solution...
-- You could make it even faster by 'jumping' depending on the
-- value of x and hence reduce the number of comparisons.
-- But why bother... this method stinks already.
function best(integer x)
    if and_bits(x,1) then
        return 1
    end if
    if and_bits(x,2) then
        return 2
    end if
    if and_bits(x,4) then
        return 4
    end if
    if and_bits(x,8) then
        return 8
    end if
    if and_bits(x,16) then
        return 16
    end if
    if and_bits(x,32) then
        return 32
    end if
    if and_bits(x,64) then
        return 64
    end if
    if and_bits(x,128) then
        return 128
    end if
    if and_bits(x,256) then
        return 256
    end if
    if and_bits(x,512) then
        return 512
    end if
    if and_bits(x,1024) then
        return 1024
    end if
    if and_bits(x,2048) then
        return 2048
    end if
    if and_bits(x,4096) then
        return 4096
    end if
    if and_bits(x,8192) then
        return 8192
    end if
    if and_bits(x,16384) then
        return 16384
    end if
    if and_bits(x,32768) then
        return 32768
    end if
    if and_bits(x,65536) then
        return 65536
    end if
    if and_bits(x,131072) then
        return 131072
    end if
    if and_bits(x,262144) then
        return 262144
    end if
    if and_bits(x,524288) then
        return 524288
    end if
    if and_bits(x,1048576) then
        return 1048576
    end if
    if and_bits(x,2097152) then
        return 2097152
    end if
    if and_bits(x,4194304) then
        return 4194304
    end if
    if and_bits(x,8388608) then
        return 8388608
    end if
    if and_bits(x,16777216) then
        return 16777216
    end if
    if and_bits(x,33554432) then
        return 33554432
    end if
    if and_bits(x,67108864) then
        return 67108864
    end if
    if and_bits(x,134217728) then
        return 134217728
    end if
    if and_bits(x,268435456) then
        return 268435456
    end if
    if and_bits(x,536870912) then
        return 536870912
    end if
    if and_bits(x,1073741824) then
        return 1073741824
    end if
    return 0
end function

procedure test(integer iter, integer top)
    integer n, k
    atom sum, t0, t1
    sum = 0
    for i = 1 to top do
        for j = 0 to 31 do
            n = power(2, j)
            if remainder(i,n) != 0 then
                exit
            end if
        end for
        n = n / 2
        t1 = time()
        for j = 1 to iter do
            k = dummy(i)
        end for
        t1 = time() - t1
        t0 = time()
        for j = 1 to iter do
            k = best(i)
        end for
        sum += time() - t0 - t1
        if n != k then
            printf(1, "%s %d %s %d \n", {"Error! ", n, "not equal", k})
        end if
    end for
    printf(1, "%s %f \n", {"Time:", sum})
end procedure

test(100000, 100)

if wait_key() then end if

new topic     » goto parent     » topic index » view message » categorize

3. Re: new new challenge

You were really close to what I think is an unbeatable solution, namely:

function best(integer x)
    return xor_bits(x, and_bits(x, x - 1))
end function

The two functions may be reversed.

----- Original Message ----- 
From: <rossboyd at ihug.com.au>
To: "EUforum" <EUforum at topica.com>
Subject: Re: new new challenge


> 
> > This is a new and (I think) simpler challenge:
> > Find the fastest algorithm to compute the largest power of 2 that
> exactly
> > divides an integer from 1 to 2^31 - 1.
> 
> Hi,
> The solution boils down to finding the value of the least significant
> bit.
> Here are two alternatives... unfortunately the ugly one is faster.
> Ross
> 
> 
> -- This solution is elegant but slower due to the divide by 2.
> -- If Euphoria had a bit shift operator or integer division
> -- this would probably be the fastest method.
> 
> function best2(integer x)
>     if and_bits(x,1) then  -- quickly eliminate odd numbers
>         return 1
>     end if
>     return floor(xor_bits(x,x-1)/2) + 1
> end function
> 
> -- Cruder and faster solution...
> -- You could make it even faster by 'jumping' depending on the
> -- value of x and hence reduce the number of comparisons.
> -- But why bother... this method stinks already.
> function best(integer x)
>     if and_bits(x,1) then
>         return 1
>     end if
>     if and_bits(x,2) then
>         return 2
>     end if
>     if and_bits(x,4) then
>         return 4
>     end if
>     if and_bits(x,8) then
>         return 8
>     end if
>     if and_bits(x,16) then
>         return 16
>     end if
>     if and_bits(x,32) then
>         return 32
>     end if
>     if and_bits(x,64) then
>         return 64
>     end if
>     if and_bits(x,128) then
>         return 128
>     end if
>     if and_bits(x,256) then
>         return 256
>     end if
>     if and_bits(x,512) then
>         return 512
>     end if
>     if and_bits(x,1024) then
>         return 1024
>     end if
>     if and_bits(x,2048) then
>         return 2048
>     end if
>     if and_bits(x,4096) then
>         return 4096
>     end if
>     if and_bits(x,8192) then
>         return 8192
>     end if
>     if and_bits(x,16384) then
>         return 16384
>     end if
>     if and_bits(x,32768) then
>         return 32768
>     end if
>     if and_bits(x,65536) then
>         return 65536
>     end if
>     if and_bits(x,131072) then
>         return 131072
>     end if
>     if and_bits(x,262144) then
>         return 262144
>     end if
>     if and_bits(x,524288) then
>         return 524288
>     end if
>     if and_bits(x,1048576) then
>         return 1048576
>     end if
>     if and_bits(x,2097152) then
>         return 2097152
>     end if
>     if and_bits(x,4194304) then
>         return 4194304
>     end if
>     if and_bits(x,8388608) then
>         return 8388608
>     end if
<snip>

> 
>

new topic     » goto parent     » topic index » view message » categorize

4. Re: new new challenge

> You were really close to what I think is an unbeatable solution,
namely:
>
> function best(integer x)
>     return xor_bits(x, and_bits(x, x - 1))
> end function
>
> The two functions may be reversed.


That's a brilliant solution. In C or ASM it would be unbeatable.

But strangely,  in Euphoria my crude version  is still ~50% faster.
I ran a 1,000,000 iterations and halved the timing errors (by putting in
2 wait loops.)
eg.
----------------------
t1 = time()
while t1 = time() do
end while
t1 = time()

-- insert code to test

t1 = time() - t1
----------------------

Your solution averaged 6.50 secs and my ugly one below was 3.85 secs on
a P3-450.
But I still much prefer your solution and will put it to use in my C
programs.
It will definitely be fastest in a compiled environment.
Thanks for sharing your algorithm!

Cheers,
Ross

ps. I tried the following using elsif 's and it was a LOT slower.
Which might be of interest to the speed freaks out there.

-- Crude but Fastest...
function best(integer x)
    if and_bits(x,1) then return 1 end if
    if and_bits(x,2) then return 2 end if
    if and_bits(x,4) then return 4 end if
    if and_bits(x,8) then return 8 end if
    if and_bits(x,16) then return 16 end if
    if and_bits(x,32) then return 32 end if
    if and_bits(x,64) then return 64 end if
    if and_bits(x,128) then return 128 end if
    if and_bits(x,256) then return 256 end if
    if and_bits(x,512) then return 512 end if
    if and_bits(x,1024) then return 1024 end if
    if and_bits(x,2048) then return 2048 end if
    if and_bits(x,4096) then return 4096 end if
    if and_bits(x,8192) then return 8192 end if
    if and_bits(x,16384) then return 16384 end if
    if and_bits(x,32768) then return 32768 end if
    if and_bits(x,65536) then return 65536 end if
    if and_bits(x,131072) then return 131072 end if
    if and_bits(x,262144) then return 262144 end if
    if and_bits(x,524288) then return 524288 end if
    if and_bits(x,1048576) then return 1048576 end if
    if and_bits(x,2097152) then return 2097152 end if
    if and_bits(x,4194304) then return 4194304 end if
    if and_bits(x,8388608) then return 8388608 end if
    if and_bits(x,16777216) then return 16777216 end if
    if and_bits(x,33554432) then return 33554432 end if
    if and_bits(x,67108864) then return 67108864 end if
    if and_bits(x,134217728) then return 134217728 end if
    if and_bits(x,268435456) then return 268435456 end if
    if and_bits(x,536870912) then return 536870912 end if
    if and_bits(x,1073741824) then return 1073741824 end if
    return 0
end function

new topic     » goto parent     » topic index » view message » categorize

5. Re: new new challenge

Ross Boyd writes:
>Your solution averaged 6.50 secs and
> my ugly one below was 3.85 secs on a P3-450.

It depends on the distribution of numbers that you feed in. 
I find the two functions are about equal,
given a series of numbers 1 to 100,000,000. (To be really
fair you might go to 1.07 billion, but that takes too long).

The long chain of if's looks bad until you realize
that 50% of the time you will quit after just one and_bits() test,
75% after just two, 87.5% after three etc. 
The simple function *always* does one and_bits(), 
one xor_bits(), and one subtraction.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

6. Re: new new challenge

> Ross Boyd writes:
> >Your solution averaged 6.50 secs and
> > my ugly one below was 3.85 secs on a P3-450.

Rob Craig wrote:
> It depends on the distribution of numbers that you feed in.
> I find the two functions are about equal,
> given a series of numbers 1 to 100,000,000. (To be really
> fair you might go to 1.07 billion, but that takes too long).

You're quite right. The performance really slows down on the ugly
multi-if solution when you apply a #7F000000 mask to the input.
It degrades when processing large numbers.

I just timed the 3 algorithms in C  using 1,000,000,000 iterations

These are the times not including the loop and function call overhead.
(16.65 secs)
4.89   secs  ..... "return x ^ (x & (x - 1));"
11.91 secs  ..... "return ((x ^ (x-1)) >> 1) + 1 ;"
12.20 secs  ..... "if (x & ?) return ?;" etc. etc.

So, "nforno"s algorithm is by far the best....

new topic     » goto parent     » topic index » view message » categorize

7. Re: new new challenge

Ross Boyd writes:
>Your solution averaged 6.50 secs and
> my ugly one below was 3.85 secs on a P3-450.

Rob Craig wrote:
It depends on the distribution of numbers that you feed in.
I find the two functions are about equal,
given a series of numbers 1 to 100,000,000. (To be really
fair you might go to 1.07 billion, but that takes too long).

You're quite right. The performance really slows down on the ugly
multi-if solution when you apply a #7F000000 mask to the input.
It degrades when processing large numbers.

I just timed the 3 algorithms in C  using 1,000,000,000 iterations

These are the times not including the loop and function call overhead.
(16.65 secs)
4.89   secs  ..... "return x ^ (x & (x - 1));"
11.91 secs  ..... "return ((x ^ (x-1)) >> 1) + 1 ;"
12.20 secs  ..... "if (x & ?) return ?;" etc. etc.

So, "nforno"s algorithm is by far the best....

new topic     » goto parent     » topic index » view message » categorize

8. Re: new new challenge

Ross Boyd writes:
>Your solution averaged 6.50 secs and
> my ugly one below was 3.85 secs on a P3-450.

Rob Craig wrote:
It depends on the distribution of numbers that you feed in.
I find the two functions are about equal,
given a series of numbers 1 to 100,000,000. (To be really
fair you might go to 1.07 billion, but that takes too long).

You're quite right. The performance really slows down on the ugly
multi-if solution when you apply a #7F000000 mask to the input.
It degrades when processing large numbers.

I just timed the 3 algorithms in C  using 1,000,000,000 iterations

These are the times not including the loop and function call overhead.
(16.65 secs)
4.89   secs  ..... "return x ^ (x & (x - 1));"
11.91 secs  ..... "return ((x ^ (x-1)) >> 1) + 1 ;"
12.20 secs  ..... "if (x & ?) return ?;" etc. etc.

So, "nforno"s algorithm is by far the best....

new topic     » goto parent     » topic index » view message » categorize

9. Re: new new challenge

Ross Boyd writes:
>Your solution averaged 6.50 secs and
> my ugly one below was 3.85 secs on a P3-450.

Rob Craig wrote:
It depends on the distribution of numbers that you feed in.
I find the two functions are about equal,
given a series of numbers 1 to 100,000,000. (To be really
fair you might go to 1.07 billion, but that takes too long).

You're quite right. The performance really slows down on the ugly
multi-if solution when you apply a #7F000000 mask to the input.
It degrades when processing large numbers.

I just timed the 3 algorithms in C  using 1,000,000,000 iterations

These are the times not including the loop and function call overhead.
(16.65 secs)
4.89   secs  ..... "return x ^ (x & (x - 1));"
11.91 secs  ..... "return ((x ^ (x-1)) >> 1) + 1 ;"
12.20 secs  ..... "if (x & ?) return ?;" etc. etc.

So, "nforno"s algorithm is by far the best....

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu