Example code - Prime numbers optimization: Difference between revisions
No edit summary |
No edit summary |
||
(11 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
This is going to be a long, but quite instructive rant about various optimization ideas and techniques.<br> | This is going to be a long, but quite instructive rant about various optimization ideas and techniques.<br> | ||
Let us start with my "classic" prime number generator. The idea is to see how long time | Let us start with my "classic" prime number generator. The idea is to see how long time it takes to find/generate the primes between 0 and 100000000 (100 millions) and use that as a measure for performance. I need to point out that the basic speed of the computer plays a big role in getting these numbers. | ||
We know that primes are integers which can only be divided by 1 and the primes itself. Therefore it is an obvious idea to evaluate if a number is a prime by starting to divide it by 2, 3, 4 ...n all the way up to the number itself and if it can be divided at any point, it is not a prime.<br> | We know that primes are integers which can only be divided by 1 and the primes itself and are larger than 1. Therefore it is an obvious idea to evaluate if a number is a prime by starting to divide it by 2, 3, 4 ...n all the way up to the number itself and if it can be divided at any point, it is not a prime.<br> | ||
This basic idea can be improved in various ways. | This basic idea can be improved in various ways. | ||
* It is not necessary to divide | * It is not necessary to divide by all the numbers up to n (the prospective prime). It is sufficient to divide with the numbers up to sqrt(n). This is because for any number where '''bignum * smallnum = n''' it also goes that '''smallnum * bignum = n''', which means we only need to find the small number that is a divisor, to identify n as not-prime. When considering pairs of numbers that multiply to n, once you reach sqrt(n) the factors starts repeating in reverse order. This means for every factor pair, at least one of them needs to be less or equal to sqrt(n). | ||
* All even numbers do not need to be checked. They are all not-primes, except 2. This means the sequence of numbers you test for division of n are odd, i.e. 3, 5, 7, 9, 11 ... sqrt(n). | * All even numbers do not need to be checked. They are all not-primes, except 2. This means the sequence of numbers you test for division of n are odd, i.e. 3, 5, 7, 9, 11 ... sqrt(n). | ||
* Actually, this can be improved as you only have to divide n by the primes in the testing, not all odd numbers. If you can divide n with an odd number, which is not a prime, then the odd number is a composite of at least two other smaller numbers, which you have tested for earlier. | * Actually, this can be improved as you only have to divide n by the primes in the testing, not all odd numbers. If you can divide n with an odd number, which is not a prime, then the odd number is a composite of at least two other smaller numbers, which you have tested for earlier. | ||
Line 88: | Line 88: | ||
</pre> | </pre> | ||
== The sieve of Eratosthanes == | |||
This old Greek figured out another method for computing the primes until "some number". You initialize a list n long with just True on every position. The True means 'this is a prime' for the number given by the position. Initially position 0 and 1 is set to False, as they are obviously not primes. Now you find the next True in the list, which would be a position 2. When a True is found, then go through the entire list in jumps of the number/position and | This old Greek figured out another method for computing the primes until "some number". You initialize a list n long with just True on every position. The True means 'this is a prime' for the number given by the position. Initially position 0 and 1 is set to False, as they are obviously not primes. Now you find the next True in the list, which would be a position 2. When a True is found, then go through the entire list in jumps of the number/position and the value to False. The first time this happens is with 2, so position 4, 6, 8... will be set to False. These are the even numbers or more importantly the numbers that can be divided by 2 and are therefore not primes - they are composite numbers. The next position/number in the list is 3, so run through the list in jumps of 3 and set the value to False at the position, i.e. 6 (already False), 9, 12 (already False), 15 ...<br> | ||
Continue this pattern until the next True position is more than sqrt(n), at which point you can stop for the reasons already stated for the previous method.<br> | Continue this pattern until the next True position is more than sqrt(n), at which point you can stop for the reasons already stated for the previous method.<br> | ||
The sieve is really about eliminating all composite numbers in the list more than finding primes. It is of course the same thing, but from different viewpoints. | The sieve is really about eliminating all composite numbers in the list more than finding primes. It is of course the same thing, but from different viewpoints. | ||
Part of what makes the sieve efficient is that we get rid of all the divisions, which take a lot of time. | |||
Time: 18.6 sec. | Time: 18.6 sec. | ||
Line 99: | Line 100: | ||
array = [True] * size | array = [True] * size | ||
array[0] = array[1] = False | array[0] = array[1] = False | ||
for i in range(int(size**0.5)): | for i in range(int(size**0.5)+1): | ||
if array[i]: | if array[i]: | ||
# A prime, now get rid of all composites where this prime is a divisor | # A prime, now get rid of all composites where this prime is a divisor | ||
Line 119: | Line 120: | ||
array = bytearray(size) | array = bytearray(size) | ||
array[0] = array[1] = 1 | array[0] = array[1] = 1 | ||
for i in range(int(size**0.5)): | for i in range(int(size**0.5)+1): | ||
if array[i] == 0: | if array[i] == 0: | ||
# A prime, now get rid of all composites where this prime is a divisor | # A prime, now get rid of all composites where this prime is a divisor | ||
Line 129: | Line 130: | ||
print(i) | print(i) | ||
</pre> | </pre> | ||
The return statement contains a comparison, which can be eliminated with an enumeration, or removed by a negation. | |||
* return [i for i, is_prime in enumerate(array) if not is_prime] | |||
* return [i for i in range(size) if not array[i]] | |||
It comes out to approximately the same improvement. | |||
Time: 12.2 sec.<br> | |||
Code not shown. | |||
A common optimization is replacing | |||
for j in range(i+i, size, i): | |||
with | |||
for j in range(i*i, size, i): | |||
This is not extremely obvious. If, say 7, is the prime, p, we found, then we start with eliminating 14, 21, 28, 35, 42, i.e. p*n where 1 < n < p. However, these numbers have already been eliminated earlier by other primes; 2, 3 & 5. The insight is that all the numbers from 2 (0 & 1 does not count here) up to - but not including - p has already been through the sieve. The numbers 1 < n < p are simply primes or composite primes (2 or more primes multiplied together) and as they are smaller than p, they have already been through the procedure/sieve.<br> | |||
Time: 11.9 sec. | |||
<pre> | |||
def eratosthenes3(size): | |||
size += 1 # Adjust for zero based list | |||
array = bytearray(size) | |||
array[0] = array[1] = 1 | |||
for i in range(int(size**0.5)+1): | |||
if array[i] == 0: | |||
# A prime, now get rid of all composites where this prime is a divisor | |||
for j in range(i*i, size, i): | |||
array[j] = 1 | |||
return [i for i in range(size) if not array[i]] | |||
for i in eratosthenes3(100000000): | |||
print(i) | |||
</pre> | |||
When we have lists or bytearrays, we can use slicing, or more specifically slice assignment. In other words, the inner loop can be replaced with a slice assignment. This trick works only because it is a very regular piece of the bytearray we replace. Good trick, nonetheless. | |||
Time: 7.5 sec. | |||
<pre> | |||
def eratosthenes4(size): | |||
array = bytearray(size+1) # Adjust for zero based list | |||
array[0] = array[1] = 1 | |||
for i in range(int(size**0.5)+1): | |||
if array[i] == 0: | |||
# A prime, now get rid of all composites where this prime is a divisor | |||
array[i*i::i] = [1] * ((size // i) - i +1) | |||
return [i for i in range(size+1) if not array[i]] | |||
for i in eratosthenes4(100000000): | |||
print(i) | |||
</pre> | |||
A memory optimization could be to use the entire bytearray as a bit vector, instead of the 0/1 byte vector use now. This will expand memory capacity, i.e. how many primes we could find, but at the cost of performance; semi-complicated position calculation and bit modification would have to be done, and slicing could not be used. | |||
Please do not think that code you read here was something I "just made". I spend a long time trying variations of these ideas, making mistakes underway. What you see is the best version of a particular optimization idea. | |||
The code was run like: | |||
time python3 mycode.py > resultfile.txt |
Latest revision as of 10:55, 20 June 2024
This is going to be a long, but quite instructive rant about various optimization ideas and techniques.
Let us start with my "classic" prime number generator. The idea is to see how long time it takes to find/generate the primes between 0 and 100000000 (100 millions) and use that as a measure for performance. I need to point out that the basic speed of the computer plays a big role in getting these numbers.
We know that primes are integers which can only be divided by 1 and the primes itself and are larger than 1. Therefore it is an obvious idea to evaluate if a number is a prime by starting to divide it by 2, 3, 4 ...n all the way up to the number itself and if it can be divided at any point, it is not a prime.
This basic idea can be improved in various ways.
- It is not necessary to divide by all the numbers up to n (the prospective prime). It is sufficient to divide with the numbers up to sqrt(n). This is because for any number where bignum * smallnum = n it also goes that smallnum * bignum = n, which means we only need to find the small number that is a divisor, to identify n as not-prime. When considering pairs of numbers that multiply to n, once you reach sqrt(n) the factors starts repeating in reverse order. This means for every factor pair, at least one of them needs to be less or equal to sqrt(n).
- All even numbers do not need to be checked. They are all not-primes, except 2. This means the sequence of numbers you test for division of n are odd, i.e. 3, 5, 7, 9, 11 ... sqrt(n).
- Actually, this can be improved as you only have to divide n by the primes in the testing, not all odd numbers. If you can divide n with an odd number, which is not a prime, then the odd number is a composite of at least two other smaller numbers, which you have tested for earlier.
Math definition: All non-primes are called composite numbers because they can be calculated as a multiplication of a number of primes.
In this PrimeGenerator class I am using the above tricks, remembering every prime as it is being calculated in order to efficiently calculate the following primes.
Time: 953 sec.
#!/usr/bin/env python3 # Prime number generator class PrimeGenerator: # Class varible, known primes in consecutive order, can be extended, but must contain these knownprimes = [2, 3] # Highest tested number for prime highesttested = 3 # Instatiation def __init__(self, number=None): if number is not None: if not isinstance(number, int): raise ValueError("Integer expected") self.target = number # Initializing iteration def __iter__(self): if self.target is None: raise ValueError("No number specified") self.pos = 0 return self # Find next prime def __next__(self): # Can we use the list of known primes to find the next? if self.pos < len(self.knownprimes): nextprime = self.knownprimes[self.pos] if nextprime >= self.target: raise StopIteration self.pos += 1 return nextprime # No, start computing the next prime while self.target > PrimeGenerator.highesttested+1: PrimeGenerator.highesttested += 1 if self._isprime(PrimeGenerator.highesttested): self.knownprimes.append(PrimeGenerator.highesttested) self.pos += 1 return self.highesttested raise StopIteration # Private method for identifying a prime def _isprime(self, number): factor = 0 pos = 0 while factor*factor <= number: # find next potential factor either in known primes or odd numbers above last known prime if pos < len(self.knownprimes): factor = self.knownprimes[pos] pos += 1 else: factor += 2 # test if it truly is a factor if number % factor == 0: return False return True # It is nice be able to ask if an number is a prime def isprime(self, number=None): if number is None: number = self.target if not isinstance(number, int): raise ValueError("Integer expected") if number in PrimeGenerator.knownprimes: return True if number <= PrimeGenerator.highesttested: return False return self._isprime(number) if __name__ == "__main__": for i in PrimeGenerator(100000000): print(i) # Big prime, don't try # 999296950101072104250052714631
The sieve of Eratosthanes
This old Greek figured out another method for computing the primes until "some number". You initialize a list n long with just True on every position. The True means 'this is a prime' for the number given by the position. Initially position 0 and 1 is set to False, as they are obviously not primes. Now you find the next True in the list, which would be a position 2. When a True is found, then go through the entire list in jumps of the number/position and the value to False. The first time this happens is with 2, so position 4, 6, 8... will be set to False. These are the even numbers or more importantly the numbers that can be divided by 2 and are therefore not primes - they are composite numbers. The next position/number in the list is 3, so run through the list in jumps of 3 and set the value to False at the position, i.e. 6 (already False), 9, 12 (already False), 15 ...
Continue this pattern until the next True position is more than sqrt(n), at which point you can stop for the reasons already stated for the previous method.
The sieve is really about eliminating all composite numbers in the list more than finding primes. It is of course the same thing, but from different viewpoints.
Part of what makes the sieve efficient is that we get rid of all the divisions, which take a lot of time.
Time: 18.6 sec.
def eratosthenes(size): size += 1 # Adjust for zero based list array = [True] * size array[0] = array[1] = False for i in range(int(size**0.5)+1): if array[i]: # A prime, now get rid of all composites where this prime is a divisor for j in range(i+i, size, i): array[j] = False return [i for i in range(size) if array[i]] for i in eratosthenes(100000000): print(i)
This was a big performance boost. This is due to a simpler idea, which leads to simpler code.
What happens if we use a simpler bytearray instead of a list? The default init of bytearray is 0, so the logic must be reverted.
Time: 12.5 sec.
def eratosthenes2(size): size += 1 # Adjust for zero based list array = bytearray(size) array[0] = array[1] = 1 for i in range(int(size**0.5)+1): if array[i] == 0: # A prime, now get rid of all composites where this prime is a divisor for j in range(i+i, size, i): array[j] = 1 return [i for i in range(size) if array[i] == 0] for i in eratosthenes2(100000000): print(i)
The return statement contains a comparison, which can be eliminated with an enumeration, or removed by a negation.
- return [i for i, is_prime in enumerate(array) if not is_prime]
- return [i for i in range(size) if not array[i]]
It comes out to approximately the same improvement.
Time: 12.2 sec.
Code not shown.
A common optimization is replacing
for j in range(i+i, size, i):
with
for j in range(i*i, size, i):
This is not extremely obvious. If, say 7, is the prime, p, we found, then we start with eliminating 14, 21, 28, 35, 42, i.e. p*n where 1 < n < p. However, these numbers have already been eliminated earlier by other primes; 2, 3 & 5. The insight is that all the numbers from 2 (0 & 1 does not count here) up to - but not including - p has already been through the sieve. The numbers 1 < n < p are simply primes or composite primes (2 or more primes multiplied together) and as they are smaller than p, they have already been through the procedure/sieve.
Time: 11.9 sec.
def eratosthenes3(size): size += 1 # Adjust for zero based list array = bytearray(size) array[0] = array[1] = 1 for i in range(int(size**0.5)+1): if array[i] == 0: # A prime, now get rid of all composites where this prime is a divisor for j in range(i*i, size, i): array[j] = 1 return [i for i in range(size) if not array[i]] for i in eratosthenes3(100000000): print(i)
When we have lists or bytearrays, we can use slicing, or more specifically slice assignment. In other words, the inner loop can be replaced with a slice assignment. This trick works only because it is a very regular piece of the bytearray we replace. Good trick, nonetheless.
Time: 7.5 sec.
def eratosthenes4(size): array = bytearray(size+1) # Adjust for zero based list array[0] = array[1] = 1 for i in range(int(size**0.5)+1): if array[i] == 0: # A prime, now get rid of all composites where this prime is a divisor array[i*i::i] = [1] * ((size // i) - i +1) return [i for i in range(size+1) if not array[i]] for i in eratosthenes4(100000000): print(i)
A memory optimization could be to use the entire bytearray as a bit vector, instead of the 0/1 byte vector use now. This will expand memory capacity, i.e. how many primes we could find, but at the cost of performance; semi-complicated position calculation and bit modification would have to be done, and slicing could not be used.
Please do not think that code you read here was something I "just made". I spend a long time trying variations of these ideas, making mistakes underway. What you see is the best version of a particular optimization idea.
The code was run like:
time python3 mycode.py > resultfile.txt