logo le blog invivoo blanc

The Sieve of Eratosthenes

9 July 2018 | Python | 0 comments

1. Who is Eratosthenes?

Eratosthenes of Cyrène was a Greek mathematician, geographer, poet, astronomer and music theorist. He was the chief librarian at the Library of Alexandria. He is commonly called the “father of geography”. He is also the first person who made the first measurement of the size of Earth for which any details are known. And in mathematical research, he created the sieve which is an efficient way to find all prime Numbers between 1 and N.

 

2. Sieve

Sieve is a very simple and ancient algorithm. During a long time, it was the best algorithm for finding all prime numbers between 1 and N. And, it’s the reason why it’s a very good algorithm for training session.

Now Atkin’s sieve is the best algorithm but this one is more complex to implement.

Principes of sieve are very simple. Let’s see ! We just need a grid with number from 0 to N

  •   At beginning all numbers are present
  •   Then we remove the 0. (For example, we change the background color to gray)
    Grid 1
  • For the example, we will change the background of all prime Numbers to green.

Grid 2

  • We start the algorithm at 2. We can see that 2 is in the grid so it’s a prime number.
    grid 3
  • Then, we have to remove all multiples of 2
    grid 4
  • After 2, we try 3. The number 3 is in the grid so it’s a primer number

  • We have to remove all multiples of 3

grid 7

  • We continue on 4. As we can see 4 has been removed from the grid so it’s not a prime number.
  • We continue with 5. 5 is in the grid, so it’s a prime number.

grid 8

  • Then, we remove all the multiples of 5

grid 9

  • We continue with 6 which is not in the grid.
  • We continue with 7. 7 is in the grid so it’s a prime number.
    grid 10
  • Then we remove all the multiples of 7
    grid 11
  • We have to continue until finishing with that situation.

grid 12

3. Python and the Sieve

Sieve is a simple algorithm and it’s very adapted to train beginners who want to learn Python basis :

  • Work on naive implementation allows the developer to work on list ;
  • To minimize the memory consumption, we can work on large integers
  • Or we can use “set” objects

Each way to do will help the beginners to develop new skills and improve his understanding of Python. For each case, we’ll create a specific python function to contain the algorithm.

Naive implementation

Naive implementation allows developer to manipulate list / array (creation, access through index), to use loops and conditions. It’s a good way to start !

In fact, the grid is an array where indices are the Numbers and values are True (It’s a prime number) or False (it isn’t a prime number). The first step consists in creating an array of (n+1) elements all initialized at True : (n+1) is needed to have all numbers from 0 to n included. There are several ways to do that : 

  •  A loop
t = []
for i in range( n + 1 ) :
     t.append( True )
  • In using generator
t = [ True for i in range( n + 1 ) ]
  • The best method
t = [ True ] * ( n + 1 )

The second step consists in wirting the main loop. Then extract the list of prime Numbers :

def primes_naive_1( n ) :
    N = n + 1
    t = [ True ] * N
    for i in range( 2, N ) :
        if t[ i ] :
            # ‘i’ is a prime number
            for j in range( i+i, N, i ) : # now we remove all ‘i’ multiples
                t[ j ] = False

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if t[ i ] ]

Improve the naive implementation

That part allows the developer to improve its algorithmic skills.

When we test each index one after other, we try all even Numbers but, if 2 is a prime number, all other even Numbers are multiples of 2. So testing the even number is useless. We can adapt the algorithm to separate the odd and even Numbers.

def primes_naive_2( n ) :
    N = n + 1
    t = [ True ] * N

    # remove multiple of 2
    for i in range( 4, N, 2 ) :
        t[ i ] = False

    # treat the odd numbers
    for i in range( 3, N, 2 ) :
        if t[ i ] :
            # ‘i’ is a prime number
            for j in range( i+i, N, i ) : # now we remove all ‘i’ multiples
                t[ j ] = False

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if t[ i ] ]

Now, we’ll try to improve the internal loop. In trying 5, which is a prime number, the internal loop starts from 10 to N with a step of 5. First values to set to False are 10 (5×2), 15 (5×3) and 20 (5×4) which were already removed when we selected 2 and 3 as prime Numbers. So we can start from 5×5. And in a general case, we can start at ‘i*i’ wich is an odd number.

def primes_naive_3( n ) :
    N = n + 1
    t = [ True ] * N

    # remove multiple of 2
    for i in range( 4, N, 2 ) :
        t[ i ] = False

    # treat the odd numbers
    for i in range( 3, N, 2 ) :
        if t[ i ] :
            # ‘i’ is a prime number
            for j in range( i * i, N, i ) : # now we remove all ‘i’ multiples
                t[ j ] = False

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if t[ i ] ]

‘ i * i ‘ is an odd number but ‘ i * i + i ‘ is an even number. So we can update the algorithm to avoid the even Numbers :

def primes_naive_4( n ) :
    N = n + 1
    t = [ True ] * N

    # remove multiple of 2
    for i in range( 4, N, 2 ) :
        t[ i ] = False

    # treat the odd numbers
    for i in range( 3, N, 2 ) :
        if t[ i ] :
            # ‘i’ is a prime number
            for j in range( i * i, N, i+i ) :
                t[ j ] = False

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if t[ i ] ]

As we removed ‘not prime number’ from ‘i * i’ to ‘N’, if ‘i’ is greater than square root of ‘N’ it’s not necessary to continue because there is no values which can be removed.

def primes_naive_5( n ) :
    from math import ceil, sqrt
    N = n + 1
    L = int( ceil( sqrt( N ) ) )
    t = [ True ] * N

    # remove multiple of 2
    for i in range( 4, N, 2 ) :
        t[ i ] = False

    # treat the odd numbers
    for i in range( 3, L, 2 ) :
        if t[ i ] :
            # ‘i’ is a prime number
            for j in range( i * i, N, i+i ) :
                t[ j ] = False

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if t[ i ] ]

One set else nothing

At this step, the developer will manipulate sets and difference operator. Often, sets are unused in algorithms where sats could be useful. We’ll try to convert the primes_naives_5.

To replace the concept of array of booleans, we will use a set which contains the numbers. And to replace the updates of values to False, we will use the difference operator. And to fill the sets, we’ll use generators and it’ll simplify some loops.

def primes_set_1( n ) :
    from math import ceil, sqrt
    N = n + 1
    L = int( ceil( sqrt( N ) ) )
    t = set( i for i in range( 1, N ) )

    # remove multiple of 2
    t -= set( i for i in range( 4, N, 2 ) )

    # treat the odd numbers
    for i in range( 3, L, 2 ) :
        if i in t:
            # ‘i’ is a prime number
            t -= set( j for j in range( i * i, N, i+i ) )

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if i in t ]

Only integers

Through binary representation, an integer is an array of bits with each bit which is 0 or 1 (False or True). So we can replace an array of booleans by an array of bits. This exercise will allow the student to work in integer operations (shifting, bit and bit operators).

Create an array with n+1 True values, we will call « 2^n+1-1 ». To remove a value from the integer, there are two ways to solve :

  • The exclusive or operator
  • And operator on inversed bits

But be careful the risk with “exclusive or operator” is that if we Apply it twice on same value ( we remove twice the same value), we will cancel the first removing in adding the value. So the only right way to remove the value i is to apply : t &= ~( 1 << i ).

def primes_integer( n ) :
    from math import ceil, sqrt
    N        = n + 1
    L        = int( ceil( sqrt( N ) ) )
    t        = ( 1 << N ) – 1

    # remove multiple of 2
    for i in range( 4, N, 2 ) :
        t &= ~( 1 << i )

    # treat the odd numbers
    for i in range( 3, L, 2 ) :
        if t & ( 1 << i ) != 0:
            # ‘i’ is a prime number
            for j in range( i * i, N, i+i ) :
                t &= ~( 1 << j )

    # extract the list of prime numbers
    return [ i for i in range( 1, N ) if t & ( 1 << i ) != 0 ]

Conclusion

Simple, smart and efficient to help student in Learning Python and algorithmic. We can continue in using bitarray.bitarray or numpy.array (bit field option or boolean).