Re: [rant][benchmark] Sieve of Eratosthenes
- Posted by _tom (admin) Jan 09, 2019
- 1138 views
A factor is "an integer value that exactly divides the test number `i`."
3 is (1)(3) -- a prime number 6 is (1)(2)(3) -- a composite number -- because `2` is an extra factor
A prime "number `i` has only one `1` and itself `i` as factors." A composite number "has a factor (or more) other than the numbers one `1` and `i`."
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | | | | | | | | | primes
For example the value two `2` is prime. That means any multiple of two `2` is not prime.
┌───┬───┬───┬───┬─────┬─────┬─────┬─────┬─────┬──── composite values to ignore 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | prime
We use False `0` as a marker.
the revised number line to search for primes ┌───┬───┬───┬───┬─────┬─────┬─────┬─────┬─────┬──── composite values to ignore 1 2 3 0 5 0 7 0 9 0 11 0 13 0 15 0 17 0 19 0 | prime
┌─────┬─────┬───────┬────────┬────────┬─────────── possible composite values to ignore 1 2 3 0 5 0 7 0 9 0 11 0 13 0 15 0 17 0 19 0 | prime
The revised number line to search for primes is now simpler.
1 2 3 0 5 0 7 0 0 0 11 0 13 0 0 0 17 0 19 0 | prime
As you locate prime numbers you also increasingly eliminate future values that you will not have to test.
As a Euphoria program
We will search for prime numbers in the first `n` numbers.
integer n = 2000
We mark all values as prime to start with. We make a list of numbers from one to n and mark them as True. Thus we have a "boolean" list of values which I call `b_lst`.
The "trick" here is that the index value is a substitute for having a separate sequence of numbers ( 1,2,3,4,5,6,...,n ).
sequence b_lst = repeat(1, n ) --> { 1,1,1,1,1, ...
We save prime numbers as we locate them.
sequence prime_lst = {}
We know in advance that one `1` is a prime number (so we ignore this value). We will test every number from two to `n` to see if it is prime.
for prime = 2 to n do ... end for
We look at the `b_lst` to discover if a value is prime. If a value is marked as True `1` then we add it to our `prime_lst`. If a value is marked as False `0` then we ignore that value. The index value `prime` is our connection between the number and its T|F status.
if b_lst[prime] then ...
So if the value is prime we add it to our `prime_lst`
prime_lst = append(prime_lst, prime )
We must eliminate all multiples of this prime number from future searches. The next number we know is not prime is larger by the factor two `2`; so we start with prime*2 and go to the end of the list of numbers to `n`; we mark known composite numbers as False `0`.
for i = prime * 2 to n by prime do b_lst[i] = 0 end for
- If prime is `2` this series looks like 4, 6, 8, 10, 12, ...; these are all numbers we can eliminate as not prime numbers.
- If prime is `3` this series looks like 6, 9, 12, 15, 18, ... these are all numbers we can eliminate as not prime numbers.
The algorithm moves along; we find a prime number and then eliminate all future values that we can ignore as composite values. The more primes we find the more values are eliminated from our brute force search.
Finally, we display the last ten prime values
? prime_lst[$-9..$]
This output has to be the same as that for the python algorithm.
The Python without Fear book told me that "The sieve is frequently used as a benchmark to demonstrate how fast a computer program is." I called their bluff. Speed measured on a netbook; the slow atom chip magnifies speed differences between algorithms.
Language | Time (seconds) | Notice This |
---|---|---|
Phix | 0.13 | |
oE | 0.05 | |
Python | 33 | * |
Phix compiled | 0.006 | |
oE compiled | 0.01 |
The speed differences between Phix and oE are a cat and mouse game. When the differences are small you can even argue that they are not significant because this is not a rigorous test. However, Python is dramatically slower in this benchmark.
_tom