1. [rant][benchmark] Sieve of Eratosthenes

The Sieve of Eratosthenes is "a classic algorithm for finding prime numbers."

I found a Python algorithm that uses "list comprehension" which results in a compact program. [ Python without Fear Brian Overland, 2018 Pearson Education ]

The program outputs the largest primes in the first 20_000 numbers.

# Python code 
 
n = 20000 
comp_list = [j for i in range(2,n) 
               for j in range(i*i, n, i )] 
prime_list = [i for i in range(2,n) 
                      if i not in comp_list] 
print( prime_list[-10:] ) 

The Euphoria version looks like this:

-- Euphoria Code 
 
integer n = 20000 
 
sequence b_lst = repeat( 1, n ) 
sequence prime_lst = {} 
 
for prime = 2 to n do 
	if b_lst[prime] then 
		prime_lst = append(prime_lst, prime ) 
		for i = prime * 2  to n by prime do 
			b_lst[i] = 0 
		end for 
	end if 
end for 
 
? prime_lst[$-9..$] 

My benchmark testing program is:

atom t 
 
puts(1, "phix\n" ) 
t = time() 
? system_exec( "p sieve.ex" ) 
printf(1, "%g time\n\n", time() - t ) 
 
 
puts(1, "oE\n" ) 
t = time() 
? system_exec( "eui sieve.ex" ) 
printf(1, "%g time\n\n", time() - t ) 
 
 
puts(1, "python\n" )  
t = time() 
? system_exec( "python sieve.py" ) 
printf(1, "%g time\n\n", time() - t ) 
 
 
puts(1, "phix compiled\n" )  
t = time() 
? system_exec( "./Sieve" ) 
printf(1, "%g time\n\n", time() - t ) 
 
 
puts(1, "oE compiled\n" )  
t = time() 
? system_exec( "./sieve" ) 
printf(1, "%g time\n\n", time() - t ) 

The results (netbook) are:

phix 
{19927,19937,19949,19961,19963,19973,19979,19991,19993,19997} 
0 
0.126013 time 
 
oE 
{19927,19937,19949,19961,19963,19973,19979,19991,19993,19997} 
0 
0.0479414 time 
 
python 
[19927, 19937, 19949, 19961, 19963, 19973, 19979, 19991, 19993, 19997] 
0 
33.448 time 
 
phix compiled 
{19927,19937,19949,19961,19963,19973,19979,19991,19993,19997} 
0 
0.00570715 time 
 
oE compiled 
{19927,19937,19949,19961,19963,19973,19979,19991,19993,19997} 
0 
0.0102074 time 

In conclusion:

  • I found a benchmark that is extremely unfair to Python
  • I still do not understand the advantage of "list comprehension"
  • as long as I use oE|Phix my netbook is fast enough

_tom

new topic     » topic index » view message » categorize

2. Re: [rant][benchmark] Sieve of Eratosthenes

Have fun: http://www.rosettacode.org/wiki/Sieve_of_Eratosthenes#Python

I recently got a bit of a slapping for not implementing wheels at http://www.rosettacode.org/wiki/Extensible_prime_generator#Phix - oh, wait, there's more on that page too smile.

The phix docs include this: http://phix.x10.mx/docs/html/phixvsfl.htm

To be fair, list comprehensions are probably more useful for lazy evaluation (not that I'm a fan of that either).

new topic     » goto parent     » topic index » view message » categorize

3. Re: [rant][benchmark] Sieve of Eratosthenes

Hi

So you can tell I'm not a professional programmer.

How on earth does this calculate primes?

-- Euphoria Code  
  
integer n = 20000  
  
sequence b_lst = repeat( 1, n )  
sequence prime_lst = {}  
  
for prime = 2 to n do  
	if b_lst[prime] then  
		prime_lst = append(prime_lst, prime )  
		for i = prime * 2  to n by prime do  
			b_lst[i] = 0  
		end for  
	end if  
end for  
  
? prime_lst[$-9..$]  

Chris

new topic     » goto parent     » topic index » view message » categorize

4. Re: [rant][benchmark] Sieve of Eratosthenes

ChrisB said...

How on earth does this calculate primes?

b_lst (the "sieve") is initially an array of 20,000 "true".
For every [prime] number it marks all multiples of that number (prime*2 by prime) as false.
Any b_lst[i] still true by the time we get to it means that i must be prime, since by then we've processed every number smaller that it.
Invented by some ancient Greek bloke a very very long time ago.

Pete

new topic     » goto parent     » topic index » view message » categorize

5. Re: [rant][benchmark] Sieve of Eratosthenes

A factor is "an integer value that exactly divides the test number `i`."

3  is  (1)(3)     -- a prime number 
 
6  is  (1)(2)(3)  -- a composite number 
                  -- because `2` is an extra factor 

A prime "number `i` has only one `1` and itself `i` as factors." A composite number "has a factor (or more) other than the numbers one `1` and `i`."

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
| | |   |   |        |     |           |     |  primes  

For example the value two `2` is prime. That means any multiple of two `2` is not prime.

  ┌───┬───┬───┬───┬─────┬─────┬─────┬─────┬─────┬──── composite values to ignore 
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 
  |  
  prime 

We use False `0` as a marker.

the revised number line to search for primes 
 
  ┌───┬───┬───┬───┬─────┬─────┬─────┬─────┬─────┬──── composite values to ignore 
1 2 3 0 5 0 7 0 9 0  11 0  13 0  15 0  17 0  19 0 
  |  
  prime 

    ┌─────┬─────┬───────┬────────┬────────┬───────────  possible composite values to ignore 
1 2 3 0 5 0 7 0 9 0  11 0  13 0  15 0  17 0  19 0 
    |  
    prime 
 

The revised number line to search for primes is now simpler.

1 2 3 0 5 0 7 0 0 0  11 0  13 0  0  0  17 0  19 0 
        |  
        prime 
 

As you locate prime numbers you also increasingly eliminate future values that you will not have to test.

As a Euphoria program

We will search for prime numbers in the first `n` numbers.

integer n = 2000 

We mark all values as prime to start with. We make a list of numbers from one to n and mark them as True. Thus we have a "boolean" list of values which I call `b_lst`.

The "trick" here is that the index value is a substitute for having a separate sequence of numbers ( 1,2,3,4,5,6,...,n ).

sequence b_lst = repeat(1, n ) 
 
--> { 1,1,1,1,1, ...  

We save prime numbers as we locate them.

sequence prime_lst = {} 

We know in advance that one `1` is a prime number (so we ignore this value). We will test every number from two to `n` to see if it is prime.

for prime = 2 to n do 
    ... 
end for 

We look at the `b_lst` to discover if a value is prime. If a value is marked as True `1` then we add it to our `prime_lst`. If a value is marked as False `0` then we ignore that value. The index value `prime` is our connection between the number and its T|F status.

if b_lst[prime] then 
    ... 

So if the value is prime we add it to our `prime_lst`

prime_lst = append(prime_lst, prime ) 

We must eliminate all multiples of this prime number from future searches. The next number we know is not prime is larger by the factor two `2`; so we start with prime*2 and go to the end of the list of numbers to `n`; we mark known composite numbers as False `0`.

for i = prime * 2 to n by prime do 
    b_lst[i] = 0 
end for 
  • If prime is `2` this series looks like 4, 6, 8, 10, 12, ...; these are all numbers we can eliminate as not prime numbers.
  • If prime is `3` this series looks like 6, 9, 12, 15, 18, ... these are all numbers we can eliminate as not prime numbers.

The algorithm moves along; we find a prime number and then eliminate all future values that we can ignore as composite values. The more primes we find the more values are eliminated from our brute force search.

Finally, we display the last ten prime values

? prime_lst[$-9..$] 

This output has to be the same as that for the python algorithm.

The Python without Fear book told me that "The sieve is frequently used as a benchmark to demonstrate how fast a computer program is." I called their bluff. Speed measured on a netbook; the slow atom chip magnifies speed differences between algorithms.

Language Time (seconds) Notice This
Phix 0.13
oE 0.05
Python 33 *
Phix compiled 0.006
oE compiled 0.01

The speed differences between Phix and oE are a cat and mouse game. When the differences are small you can even argue that they are not significant because this is not a rigorous test. However, Python is dramatically slower in this benchmark.

_tom

new topic     » goto parent     » topic index » view message » categorize

Search



Quick Links

User menu

Not signed in.

Misc Menu