Example code - Prime numbers optimization
This is going to be a long, but quite instructive rant about various optimization ideas and techniques.
Let us start with my "classic" prime number generator. The idea is to see how long time is takes to find/generate the primes between 0 and 100000000 (100 millions) and use that as a measure for performance. I need to point out that the basic speed of the computer plays a big role in getting these numbers.
We know that primes are integers which can only be divided by 1 and the primes itself. Therefore it is an obvious idea to evaluate if a number is a prime by starting to divide it by 2, 3, 4 ...n all the way up to the number itself and if it can be divided at any point, it is not a prime.
This basic idea can be improved in various ways.
- It is not necessary to divide with all the numbers up to n (the prospective prime). It is sufficient to divide with the numbers up to sqrt(n). This is because for any number where bignum * smallnum = n it also goes that smallnum * bignum = n, which means we only need to find the small number which is a divisor, to identify the number as not-prime. The small number will always be less or equal to sqrt(n), and the big number will always be greater or equal to sqrt(n). sqrt(n) is the equilibrium.
- All even numbers do not need to be checked. They are all not-primes, except 2. This means the sequence of numbers you test for division of n are odd, i.e. 3, 5, 7, 9, 11 ... sqrt(n).
- Actually, this can be improved as you only have to divide n by the primes in the testing, not all odd numbers. If you can divide n with an odd number, which is not a prime, then the odd number is a composite of at least two other smaller numbers, which you have tested for earlier.
Math definition: All non-primes are called composite numbers because they can be calculated as a multiplication of a number of primes.
In this PrimeGenerator class I am using the above tricks, remembering every prime as it is being calculated in order to efficiently calculate the following primes.
Time: 953 sec.
#!/usr/bin/env python3
# Prime number generator
class PrimeGenerator:
    # Class varible, known primes in consecutive order, can be extended, but must contain these
    knownprimes = [2, 3]
    # Highest tested number for prime
    highesttested = 3
    # Instatiation
    def __init__(self, number=None):
        if number is not None:
            if not isinstance(number, int):
                raise ValueError("Integer expected")
        self.target = number                
    
    # Initializing iteration
    def __iter__(self):
        if self.target is None:
            raise ValueError("No number specified")
        self.pos = 0
        return self
    
    # Find next prime
    def __next__(self):
        # Can we use the list of known primes to find the next?
        if self.pos < len(self.knownprimes):
            nextprime = self.knownprimes[self.pos]
            if nextprime >= self.target:
                raise StopIteration
            self.pos += 1
            return nextprime
        # No, start computing the next prime
        while self.target > PrimeGenerator.highesttested+1:
            PrimeGenerator.highesttested += 1
            if self._isprime(PrimeGenerator.highesttested):
                self.knownprimes.append(PrimeGenerator.highesttested)
                self.pos += 1
                return self.highesttested
        raise StopIteration
    # Private method for identifying a prime
    def _isprime(self, number):
        factor = 0
        pos = 0
        while factor*factor <= number:
            # find next potential factor either in known primes or odd numbers above last known prime
            if pos < len(self.knownprimes):
                factor = self.knownprimes[pos]
                pos += 1
            else:
                factor += 2
            # test if it truly is a factor
            if number % factor == 0:
                return False
        return True
    # It is nice be able to ask if an number is a prime
    def isprime(self, number=None):
        if number is None:
            number = self.target
        if not isinstance(number, int):
            raise ValueError("Integer expected")
        if number in PrimeGenerator.knownprimes:
            return True
        if number <= PrimeGenerator.highesttested:
            return False
        return self._isprime(number)
if __name__ == "__main__":
    for i in PrimeGenerator(100000000):
        print(i)
    # Big prime, don't try
    # 999296950101072104250052714631
The sieve of Eratosthanes
This old Greek figured out another method for computing the primes until "some number". You initialize a list n long with just True on every position. The True means 'this is a prime' for the number given by the position. Initially position 0 and 1 is set to False, as they are obviously not primes. Now you find the next True in the list, which would be a position 2. When a True is found, then go through the entire list in jumps of the number/position and the the value to False. The first time this happens is with 2, so position 4, 6, 8... will be set to False. These are the even numbers or more importantly the numbers that can be divided by 2 and are therefore not primes - they are composite numbers. The next position/number in the list is 3, so run through the list in jumps of 3 and set the value to False at the position, i.e. 6 (already False), 9, 12 (already False), 15 ...
Continue this pattern until the next True position is more than sqrt(n), at which point you can stop for the reasons already stated for the previous method.
The sieve is really about eliminating all composite numbers in the list more than finding primes. It is of course the same thing, but from different viewpoints.
Part of what makes the sieve efficient is that we get rid of all the divisions, which take a lot of time.
Time: 18.6 sec.
def eratosthenes(size):
    size += 1  # Adjust for zero based list
    array = [True] * size
    array[0] = array[1] = False
    for i in range(int(size**0.5)):
        if array[i]:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i+i, size, i):
                array[j] = False
    return [i for i in range(size) if array[i]]
for i in eratosthenes(100000000):
    print(i)
This was a big performance boost. This is due to a simpler idea, which leads to simpler code.
What happens if we use a simpler bytearray instead of a list? The default init of bytearray is 0, so the logic must be reverted.
Time: 12.5 sec.
def eratosthenes2(size):
    size += 1  # Adjust for zero based list
    array = bytearray(size)
    array[0] = array[1] = 1
    for i in range(int(size**0.5)):
        if array[i] == 0:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i+i, size, i):
                array[j] = 1
    return [i for i in range(size) if array[i] == 0]
for i in eratosthenes2(100000000):
    print(i)
The return statement contains a comparison, which can be eliminated with an enumeration, or removed by a negation.
- return [i for i, is_prime in enumerate(array) if not is_prime]
- return [i for i in range(size) if not array[i]]
It comes out to approximately the same improvement.
Time: 12.2 sec.
Code not shown.
A common optimization is replacing
for j in range(i+i, size, i):
with
for j in range(i*i, size, i):
This is not extremely obvious. If, say 7, is the prime we found, then we start with eliminating 14, 21, 28, 35, 42, i.e. 7*n where 1 < n < 7. However, these numbers have already been eliminated earlier by other primes; 2, 3 & 5.
Time: 11.9 sec.
def eratosthenes3(size):
    size += 1  # Adjust for zero based list
    array = bytearray(size)
    array[0] = array[1] = 1
    for i in range(int(size**0.5)):
        if array[i] == 0:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i*i, size, i):
                array[j] = 1
    return [i for i in range(size) if not array[i]]
for i in eratosthenes3(100000000):
    print(i)
When we have lists or bytearrays, we can use slicing, or more specifically slice assignment. In other words, the inner loop can be replaced with a slice assignment. This trick works only because it is a very regular piece of the bytearray we replace. Good trick, nonetheless.
Time: 7.4 sec.
def eratosthenes4(size):
    size += 1  # Adjust for zero based list
    array = bytearray(size)
    array[0] = array[1] = 1
    for i in range(int(size**0.5)):
        if array[i] == 0:
            # A prime, now get rid of all composites where this prime is a divisor
            array[i*i:size:i] = [1] * (((size-1) // i) - i +1)
    return [i for i in range(size) if not array[i]]