Example code - Prime numbers optimization
This is going to be a long, but quite instructive rant about various optimization ideas and techniques.
Let us start with my "classic" prime number generator. The idea is to see how long it takes to find/generate the primes between 0 and 100000000 (100 million) and use that as a measure of performance. I need to point out that the basic speed of the computer plays a big role in these numbers, so the absolute timings will differ from machine to machine.
We know that primes are integers greater than 1 which can only be divided by 1 and the number itself. Therefore an obvious idea is to evaluate if a number n is a prime by dividing it by 2, 3, 4 ... all the way up to n-1; if any of these divisions leaves no remainder, n is not a prime.
This basic idea can be improved in various ways.
- It is not necessary to divide by all the numbers up to n (the prospective prime). It is sufficient to divide by the numbers up to sqrt(n). This is because whenever bignum * smallnum = n, we only need to find the small number to identify n as not-prime. The small number will always be less than or equal to sqrt(n), and the big number will always be greater than or equal to sqrt(n). sqrt(n) is the equilibrium.
- Even numbers do not need to be checked. They are all not-primes, except 2. This means the numbers you test as divisors of n are odd, i.e. 3, 5, 7, 9, 11 ... sqrt(n).
- Actually, this can be improved further, as you only have to divide n by the primes, not by all odd numbers. If n can be divided by an odd number which is not a prime, then that odd number is a product of smaller primes, and n can be divided by those as well. You have already tested them earlier.
Math definition: All non-primes are called composite numbers because they can be calculated as a multiplication of a number of primes.
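The three tricks above can be sketched in a few lines before looking at the full class. This is only a minimal illustration, not the PrimeGenerator itself; the function name primes_below is made up for this example.

```python
def primes_below(limit):
    """Return all primes below limit by trial division (illustrative sketch)."""
    if limit <= 2:
        return []
    primes = [2]                          # 2 is the only even prime
    for candidate in range(3, limit, 2):  # trick 2: odd candidates only
        is_prime = True
        for p in primes:                  # trick 3: divide by known primes only
            if p * p > candidate:         # trick 1: stop at sqrt(candidate)
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
    return primes


print(primes_below(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Remembering the primes found so far is what makes trick 3 possible, and that is the reason the class below keeps a list of known primes.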
In this PrimeGenerator class I am using the above tricks, remembering every prime as it is being calculated in order to efficiently calculate the following primes.
Time: 953 sec.
#!/usr/bin/env python3
# Prime number generator
class PrimeGenerator:
    # Class variable, known primes in consecutive order, can be extended, but must contain these
    knownprimes = [2, 3]
    # Highest number tested for being a prime
    highesttested = 3

    # Instantiation
    def __init__(self, number=None):
        if number is not None and not isinstance(number, int):
            raise ValueError("Integer expected")
        # Always set the target, so __iter__ can detect a missing number
        self.target = number

    # Initializing iteration
    def __iter__(self):
        if self.target is None:
            raise ValueError("No number specified")
        self.pos = 0
        return self

    # Find next prime
    def __next__(self):
        # Can we use the list of known primes to find the next?
        if self.pos < len(self.knownprimes):
            nextprime = self.knownprimes[self.pos]
            if nextprime >= self.target:
                raise StopIteration
            self.pos += 1
            return nextprime
        # No, start computing the next prime
        while self.target > PrimeGenerator.highesttested+1:
            PrimeGenerator.highesttested += 1
            if self._isprime(PrimeGenerator.highesttested):
                self.knownprimes.append(PrimeGenerator.highesttested)
                self.pos += 1
                return self.highesttested
        raise StopIteration

    # Private method for identifying a prime
    def _isprime(self, number):
        factor = 0
        pos = 0
        while factor*factor <= number:
            # find next potential factor either in known primes or odd numbers above last known prime
            if pos < len(self.knownprimes):
                factor = self.knownprimes[pos]
                pos += 1
            else:
                factor += 2
            # test if it truly is a factor
            if number % factor == 0:
                return False
        return True

    # It is nice to be able to ask if a number is a prime
    def isprime(self, number=None):
        if number is None:
            number = self.target
        if not isinstance(number, int):
            raise ValueError("Integer expected")
        if number in PrimeGenerator.knownprimes:
            return True
        if number <= PrimeGenerator.highesttested:
            return False
        return self._isprime(number)

if __name__ == "__main__":
    for i in PrimeGenerator(100000000):
        print(i)

# Big prime, don't try
# 999296950101072104250052714631
The sieve of Eratosthenes
This old Greek figured out another method for computing the primes up to "some number" n. You initialize a list with True on every position from 0 to n. The True means 'this is a prime' for the number given by the position. Initially positions 0 and 1 are set to False, as they are obviously not primes. Now you find the next True in the list, which will be position 2. When a True is found, go through the rest of the list in jumps of that number/position and set the value to False. The first time this happens is with 2, so positions 4, 6, 8... will be set to False. These are the even numbers or more importantly the numbers that can be divided by 2 and are therefore not primes - they are composite numbers. The next True position/number in the list is 3, so run through the list in jumps of 3 and set the value to False at each position, i.e. 6 (already False), 9, 12 (already False), 15 ...
Continue this pattern until the next True position is more than sqrt(n), at which point you can stop for the reasons already stated for the previous method.
The sieve is really about eliminating all composite numbers in the list more than finding primes. It is of course the same thing, but from different viewpoints.
Part of what makes the sieve efficient is that we get rid of all the divisions, which take a lot of time.
Time: 18.6 sec.
def eratosthenes(size):
    size += 1   # Adjust for zero based list
    array = [True] * size
    array[0] = array[1] = False
    # The +1 includes int(sqrt(size)) itself, otherwise squares of primes (9, 25, ...) survive
    for i in range(int(size**0.5) + 1):
        if array[i]:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i+i, size, i):
                array[j] = False
    return [i for i in range(size) if array[i]]

for i in eratosthenes(100000000):
    print(i)
This was a big performance boost. This is due to a simpler idea, which leads to simpler code.
What happens if we use a simpler bytearray instead of a list? A bytearray is initialized to 0 on every position, so the logic must be inverted: 0 now means 'this is a prime' and 1 means 'composite'.
Time: 12.5 sec.
def eratosthenes2(size):
    size += 1   # Adjust for zero based list
    array = bytearray(size)
    array[0] = array[1] = 1
    # The +1 includes int(sqrt(size)) itself, otherwise squares of primes (9, 25, ...) survive
    for i in range(int(size**0.5) + 1):
        if array[i] == 0:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i+i, size, i):
                array[j] = 1
    return [i for i in range(size) if array[i] == 0]

for i in eratosthenes2(100000000):
    print(i)
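One likely reason the bytearray helps is memory: a list stores one pointer per position, while a bytearray stores one byte per position. The sketch below just compares the two containers with sys.getsizeof. It assumes CPython, and the exact figures depend on the build, so no numbers are claimed here.

```python
import sys

n = 1_000_000

flags_list = [True] * n      # one pointer per position (typically 8 bytes on 64-bit builds)
flags_bytes = bytearray(n)   # one byte per position

list_size = sys.getsizeof(flags_list)
bytes_size = sys.getsizeof(flags_bytes)

print("list     :", list_size, "bytes")
print("bytearray:", bytes_size, "bytes")
print("ratio    :", round(list_size / bytes_size, 1))
```

A smaller array fits better in the CPU caches, which probably explains part of the speed-up, but that is an assumption and not something measured here.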
The return statement contains a comparison, which can be eliminated with an enumeration, or removed by a negation.
- return [i for i, composite in enumerate(array) if not composite]
- return [i for i in range(size) if not array[i]]
It comes out to approximately the same improvement.
Time: 12.2 sec.
Code not shown.
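For completeness, here is a sketch of what that version could look like. It assumes the only change from eratosthenes2 is the return statement. The name eratosthenes2b is made up for this example, and the outer loop includes the integer square root so that squares of primes are crossed out.

```python
def eratosthenes2b(size):
    size += 1   # Adjust for zero based list
    array = bytearray(size)      # 0 means prime, 1 means composite
    array[0] = array[1] = 1
    for i in range(2, int(size**0.5) + 1):
        if not array[i]:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i + i, size, i):
                array[j] = 1
    # enumerate yields (position, flag) pairs, so no indexing or comparison is needed
    return [i for i, composite in enumerate(array) if not composite]


print(eratosthenes2b(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```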
A common optimization is replacing
for j in range(i+i, size, i):
with
for j in range(i*i, size, i):
This is not extremely obvious. If, say, 7 is the prime we found, then we start by eliminating 14, 21, 28, 35, 42, i.e. 7*n where 1 < n < 7. However, these numbers have already been eliminated earlier by the smaller primes 2, 3 and 5, so we can just as well start at 7*7 = 49.
Time: 11.9 sec.
def eratosthenes3(size):
    size += 1   # Adjust for zero based list
    array = bytearray(size)
    array[0] = array[1] = 1
    # The +1 includes int(sqrt(size)) itself, otherwise squares of primes (9, 25, ...) survive
    for i in range(int(size**0.5) + 1):
        if array[i] == 0:
            # A prime, now get rid of all composites where this prime is a divisor
            for j in range(i*i, size, i):
                array[j] = 1
    return [i for i in range(size) if not array[i]]

for i in eratosthenes3(100000000):
    print(i)
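The claim behind starting the inner loop at i*i can be checked on a few small primes. The sketch below is only a sanity check, not part of the sieve; the helper smallest_factor is made up for this example. For each prime p it lists the multiples p*k with 1 < k < p that the optimized loop skips, and verifies that each of them has a prime factor smaller than p, which means an earlier pass already crossed it out.

```python
def smallest_factor(n):
    """Return the smallest factor of n that is greater than 1."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n   # n itself is a prime


for p in (3, 5, 7, 11, 13):
    skipped = [p * k for k in range(2, p)]   # the multiples below p*p
    # Every skipped multiple has a factor smaller than p
    assert all(smallest_factor(m) < p for m in skipped)
    print(p, "skips", skipped, "and starts at", p * p)
```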
When we have lists or bytearrays, we can use slicing, or more specifically slice assignment. In other words, the inner loop can be replaced with a slice assignment. This trick works only because it is a very regular piece of the bytearray we replace. Good trick, nonetheless.
Time: 7.4 sec.
def eratosthenes4(size):
    size += 1   # Adjust for zero based list
    array = bytearray(size)
    array[0] = array[1] = 1
    # The +1 includes int(sqrt(size)) itself, otherwise squares of primes (9, 25, ...) survive
    for i in range(int(size**0.5) + 1):
        if array[i] == 0:
            # A prime, now get rid of all composites where this prime is a divisor.
            # The slice i*i, i*i+i, ... holds ((size-1) // i) - i + 1 positions.
            array[i*i:size:i] = [1] * (((size-1) // i) - i + 1)
    return [i for i in range(size) if not array[i]]
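The slice assignment only works if the right-hand side has exactly as many elements as the slice, so the length formula is the delicate part. The sketch below checks the formula ((size-1) // i) - i + 1 against len(range(i*i, size, i)) and shows one slice assignment on a small bytearray. Using b"\x01" * count instead of a list is a variation introduced here, not taken from the code above.

```python
size = 31                    # a sieve covering the numbers 0..30
array = bytearray(size)

i = 3
count = len(range(i * i, size, i))           # positions 9, 12, ..., 30
assert count == ((size - 1) // i) - i + 1    # same value as the formula
array[i * i:size:i] = b"\x01" * count        # mark all of them in one step

marked = [j for j in range(size) if array[j]]
print(marked)  # [9, 12, 15, 18, 21, 24, 27, 30]

# The formula agrees with len(range(...)) for every i the sieve can reach
for n in range(2, 500):
    for k in range(2, int(n**0.5) + 1):
        assert len(range(k * k, n, k)) == ((n - 1) // k) - k + 1
```

Writing len(range(i*i, size, i)) directly is a little slower than the arithmetic formula, but it is harder to get wrong.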