Runtime evaluation of algorithms
Required course material for the lesson
Powerpoint: Runtime evaluation of algorithms
Video: Runtime Analysis (watch this first)
Subjects covered
Runtime evaluation of algorithms
Big O notation
Big O notation
When you "calculate" the running time of a program or an algorithm you write your result in the Big O notation. The running time is the complexity of the algorithm. The hardest part about Big O is understanding the concept - performing the actual calculation is rather easy in most cases. Especially since there is no real calculation to do, but mostly just analysis of the code.
Some loosely explained theory
Any program performs operations while it is running. This could be adding a number to a variable, asking which of two numbers is bigger, or checking whether two strings are identical. You have been programming: essentially every statement is an operation.
Calculating Big O means estimating the number of operations. This is not the number of statements; some statements are executed several times in loops. Big O is closer to the accumulated number of times each statement is executed during a run of the program. While it is possible to calculate this number exactly for a given input (assuming randomness is not part of the program), doing so is unnecessary, rather uninteresting and possibly difficult.
What is needed is an estimate that explains how the program's running time scales with the size of the input; that is Big O. Very often - almost always - a program contains loops that loop over data. Let's take a program that calculates the average of a list/file of numbers as an example.
# pseudo code
open file
for each line in file
    sum = sum + number(line)
    increment(line_counter)
close file
print sum/line_counter
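For comparison, here is a minimal runnable Python sketch of the same idea (the file name "numbers.txt" and the one-number-per-line format are assumptions made for the sake of the example):

total = 0
line_counter = 0
with open("numbers.txt") as infile:   # opening the file: done once
    for line in infile:               # runs once per line in the file
        total += float(line)          # one addition per line
        line_counter += 1             # one increment per line
print(total / line_counter)           # done once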
The opening and closing of the file are each done once, and likewise the printing of the result. The summing is done once per number/line. This means the more numbers there are in the file, the more additions are made; it scales with the number of numbers. The line_counter is also incremented once per line, so it obviously scales in the same way as the summing.
To put this in Big O notation: O(1 + n + n + 1 + 1) = O(2n + 3).
This expression is still too detailed. The "small stuff" is eliminated, which means removing the constants; they say nothing about how the running time scales anyway.
The final Big O notation: O(n), where n is the number of numbers.
If you have a more complex expression like O(n*n + n), then the single n is thrown away because it is insignificant compared to n*n, so O(n*n) is the final result.
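To see why the constants can be dropped, here is a small illustrative calculation (not part of the original example): as n grows, the detailed count 2n + 3 behaves just like n, only scaled by a constant factor.

for n in (10, 1_000, 1_000_000):
    exact = 2 * n + 3             # the detailed operation count from above
    print(n, exact, exact / n)    # the ratio settles around the constant 2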
A collection of internet resources on Big O
Big O in plain English
A Beginner's Guide to Big O Notation
Big O Notation
Big-O Algorithm Complexity Cheat Sheet
Big O notation explained
How to calculate Big O
Big O notations - a youtube video
Wikipedia's entry on Big O is particularly useless. It is most likely correct, but impossible to understand. Don't go there for information unless you like pain.
If all you want to do is to measure how long your program takes to run in real time (seconds and minutes), then you should just use the time command in unix. Simply put it in front of your program.
# Normally you call your program like
./mypythonprogram.py fastafile

# To see how long it takes to execute, use
time ./mypythonprogram.py fastafile
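If you prefer to measure from inside the program instead, the standard library's time module can do something similar. A sketch, where the sum over a range simply stands in for whatever work you want to time:

import time

start = time.perf_counter()            # wall-clock time before the work
total = sum(range(10_000_000))         # stand-in for the work you want to time
elapsed = time.perf_counter() - start
print(f"Elapsed: {elapsed:.3f} seconds")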
How to calculate Big O in the project
When a program or algorithm is small, it is often trivial to perform this "calculation". In bigger pieces of software you need to be more systematic about it. Since we know that the loops are what matters in Big O, keeping track of the loops in the code as comments is a great way of being systematic. In this example, I have a list of numbers where all numbers must be multiplied in pairs and the products summed.
The code:
numbers = [2,3,5,7,11,13]
sum = 0
for i in range(len(numbers)-1):
    for j in range(i+1, len(numbers)):
        sum += numbers[i] * numbers[j]
print("Result:", sum)
I start by adding a comment about Big O to the inner loop. I discard the minor issues, such as the fact that the inner loop runs at most n-1 times and at least once; the worst case is what we use in these situations.
numbers = [2,3,5,7,11,13]
sum = 0
for i in range(len(numbers)-1):
    # O(n), where n is the number of numbers, worst case
    for j in range(i+1, len(numbers)):
        sum += numbers[i] * numbers[j]
print("Result:", sum)
Then I add a Big O comment to the outer loop. Since I already have the Big O for the inner loop, it is easy.
numbers = [2,3,5,7,11,13]
sum = 0
# O(n*n), where n is the number of numbers
for i in range(len(numbers)-1):
    # O(n), where n is the number of numbers, worst case
    for j in range(i+1, len(numbers)):
        sum += numbers[i] * numbers[j]
print("Result:", sum)
Now I have the result, O(n*n), easily argued and explained, and shown directly in the code. I should still add some reasoning in words, but the analysis in the code supports it.
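If you want an empirical check of the analysis, you can count how many times the innermost statement runs for growing n (a sketch, not part of the hand-in requirements):

for n in (10, 100, 1000):
    operations = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            operations += 1                          # count one inner-loop execution
    print(n, operations, operations / (n * n))       # ratio approaches 1/2, i.e. the count grows like n*n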
Exercises to be handed in
"Calculate" can mean many things here; Analyze, argue, explain, show, or even demonstrate. For some algorithms (not these) it even means "do a lot of math". The calculation must be convincing and explained. Just showing the result is insufficient.
- Calculate Big O of exercise 2 in Python Recap and Objects
- Calculate Big O of exercise 1 in Regular Expressions
- Calculate Big O of exercise 2 in Regular Expressions
- Calculate Big O of exercise 5 in Regular Expressions
- Calculate Big O of exercise 9 in Regular Expressions
- Calculate Big O of exercise 9 in Making Functions
- Calculate Big O of exercise 1 in Advanced Data Structures and New Data Types
- Calculate Big O of exercise 4 in Advanced Data Structures and New Data Types
- Calculate Big O of exercise 6 in Comprehension, Generators, Functions and Methods