Functions, namespace, memory management
Required course material for the lesson
PowerPoint: Functions (short, since this is a reminder).
PowerPoint: Namespace.
PowerPoint: Memory management.
Video: Functions in Python
Video: Examples and identifying data types
Resource: Example code - Functions
Video: Live Coding
Subjects covered
- A short recap of functions
- Namespace in Python
- Memory management
Exercises to be handed in
The first two exercises are reused from earlier. There is a good reason for this, as you will see later.
You return to solo git use to maintain your git skills. Commit every exercise to your private exercise repository.
For many of the exercises you need to make a small program that uses your function in order to test it.
- Make a function fastaread(filename) which takes a filename as a parameter and returns two lists: the first contains the headers, the second contains the sequences (as single strings without whitespace). Add appropriate error handling to the function. You can test your function on the file dna7.fsa.
- Make a function fastawrite(filename, headers, sequences) which takes a filename, a list of headers and a corresponding list of sequences as parameters and writes the fasta file. Add appropriate error handling to the function. You can test your function on the file dna7.fsa. If you first read the file with fastaread and then write a new file with fastawrite, and the two files are identical, you know you have done it right.
- Make a function normalize that takes a list of numbers as its argument. The function rescales the numbers to the range 0 to 1 and returns the normalized list. Normalization in this context is a linear transformation (min-max rescaling).
- Make a new normalize function based on the above. This time the function takes three arguments: the list, a min, and a max that the values should be scaled/normalized between. The default values for min and max are 0 and 1.
- Make a program that reads the ex1.dat file and counts how many positive and negative numbers there are in each column. Display the result. Then use your latest normalize function to normalize the numbers in each column between -1 and 1, count the positive and negative numbers in each column again, and display the new result. All of this should be done in one program.
- This is the first part of a larger program; you may also want to read the next 3 exercises to get the full picture. Column-based files can use different delimiters. The typical example is a tab-separated file, where the tab separates the columns. Other classic delimiters are comma, colon, semicolon and the pipe sign. Make a function determineDelimiter that takes a line as its argument, inspects it, and determines whether the delimiter is tab, comma, colon, semicolon or pipe sign, checked in that order of preference. The function returns a single character, which is the delimiter. The delimiter must be present at least once in the line. If no delimiter can be found, return None.
- This is the second part. Sometimes column-based files use the first line as an identifying headline, where each column gets a name that describes the data in the column. See the employee-data.csv file as an example. Make a function identifyColumn that takes three arguments (a delimiter, a headline and a column name) and returns the number of the column that the column name belongs to. Return None if the column cannot be identified.
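As a reminder of what min-max rescaling means in the normalize exercises above, here is a minimal sketch of the linear transformation. The function name rescale, its parameter names and its error handling are assumptions made for illustration only; they are not the required solution, and your own normalize function may be structured differently.

```python
def rescale(values, new_min=0.0, new_max=1.0):
    """Linearly map values so that min(values) becomes new_min
    and max(values) becomes new_max (min-max rescaling).

    Hypothetical helper for illustration, not the course's reference solution.
    """
    if not values:
        raise ValueError("cannot rescale an empty list")
    lo, hi = min(values), max(values)
    if lo == hi:
        # Assumption: with no spread in the data the mapping is undefined,
        # so this sketch raises an error instead of guessing a result.
        raise ValueError("all values are equal; rescaling is undefined")
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


print(rescale([2, 4, 6]))         # [0.0, 0.5, 1.0]
print(rescale([2, 4, 6], -1, 1))  # [-1.0, 0.0, 1.0]
```

The default arguments let the same function cover both the 0 to 1 case and an arbitrary target range. How to treat a list where all numbers are equal is a design choice; raising an error is only one option.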