Parallel programming
Material for the lesson
Video: Parallel Programming - problems
Video: Libraries to use and how they work
Video: Parent/Child relationships
Powerpoint: Programming
Video: What is expected from exercises
Exercises
Read a FASTA file, find the complement strand of each entry, and save the result in a new FASTA file.
Note: The Python you should use on the pupil1 server is /opt/anaconda3_2021.11/bin/python3. There are two Python installations on the server, so it is possible to pick the wrong one.
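As a reference for the core task, here is a minimal sequential sketch. It assumes that "complement" means the plain base-by-base complement (no reversal), that headers are copied unchanged, and that any symbol other than A, C, G, T passes through untouched. The function name complement_fasta is only illustrative.

```python
# Translation table: each base maps to its complement, case is preserved.
# Characters not in the table (N, newline, ...) pass through unchanged.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement_fasta(in_path, out_path):
    """Copy a FASTA file, complementing every sequence line (illustrative name)."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            if line.startswith(">"):
                outfile.write(line)  # header lines are kept as they are
            else:
                outfile.write(line.translate(COMPLEMENT))
```

A sequential version like this is also useful as a baseline when you time the parallel solutions.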
1)
In lesson 6, Distributed computing, you created 2-3 programs and used the Queueing System (QS) to submit your subtasks. You are going to repeat this exercise, but this time do not use the QS to run the subtasks; use subprocess and joblib instead, as shown in the PowerPoint example. The code is the same as in the second exercise, except for the part that launches the subtasks and waits for them to finish.
You must still use the QS to submit the initial main job.
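One possible way to combine subprocess and joblib is sketched below. It is not necessarily identical to the PowerPoint example. The names run_worker, run_all, worker_script and piece_files are hypothetical, and the sketch assumes the worker program takes the name of its piece file as its only argument.

```python
import subprocess
import sys

from joblib import Parallel, delayed


def run_worker(worker_script, piece_file):
    """Run one worker program on one piece and return its exit code."""
    # sys.executable is the Python running this script, so the workers
    # use the same interpreter as the administrator.
    result = subprocess.run([sys.executable, worker_script, piece_file])
    return result.returncode


def run_all(worker_script, piece_files, max_workers=8):
    """Start one worker per piece, at most max_workers at a time."""
    # Threads are enough here: each thread only waits for its child
    # process, and the real work happens in the child processes.
    # Parallel returns when every task has finished, which is the waiting step.
    return Parallel(n_jobs=max_workers, prefer="threads")(
        delayed(run_worker)(worker_script, piece) for piece in piece_files
    )
```

The returned list holds one exit code per piece, in the same order as piece_files, so the administrator can check that every worker succeeded before collecting the results.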
2)
If you feel this was too easy, have the administrator only index the FASTA file and then launch the workers with the index information, using joblib and an internal worker function.
Each worker writes its complement strand entry to a file, and the administrator collects the pieces as above. This uses techniques we have already covered. The result is the same as above, but it is produced faster.
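The index-then-dispatch idea could look roughly like the sketch below. It assumes the index is a list of byte offsets (start, end) per entry, that the file is read in binary mode so the offsets are exact, and that each piece is a temporary file named after the output file. All function names and the piece naming scheme are hypothetical.

```python
import os

from joblib import Parallel, delayed

COMPLEMENT = bytes.maketrans(b"ACGTacgt", b"TGCAtgca")


def index_fasta(path):
    """Return a list of (start, end) byte offsets, one pair per entry."""
    starts = []
    pos = 0
    with open(path, "rb") as infile:
        for line in infile:
            if line.startswith(b">"):
                starts.append(pos)
            pos += len(line)
    ends = starts[1:] + [pos]  # an entry ends where the next one starts
    return list(zip(starts, ends))


def worker(path, start, end, piece_path):
    """Internal worker: complement one entry and write it to its own file."""
    with open(path, "rb") as infile:
        infile.seek(start)
        entry = infile.read(end - start)
    header, _, sequence = entry.partition(b"\n")  # header is left untouched
    with open(piece_path, "wb") as outfile:
        outfile.write(header + b"\n" + sequence.translate(COMPLEMENT))
    return piece_path


def administrator(in_path, out_path, max_workers=8):
    """Index the file, run the workers, then collect the pieces in order."""
    index = index_fasta(in_path)
    pieces = Parallel(n_jobs=max_workers)(
        delayed(worker)(in_path, start, end, f"{out_path}.{i}.tmp")
        for i, (start, end) in enumerate(index)
    )
    with open(out_path, "wb") as outfile:
        for piece in pieces:
            with open(piece, "rb") as infile:
                outfile.write(infile.read())
            os.remove(piece)
```

Only the small index tuples are sent to the workers; each worker reads its own part of the file, so the administrator never holds the sequences in memory.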
3)
For an extra challenge, create a different worker function that writes the complement strand directly back into the file at the right place, as discussed earlier.
Here you need your own "private" FASTA file, because you cannot write to mine in /home/projects/pr_course.
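Writing back in place works because the complement has exactly the same length as the original sequence, so no bytes need to move. A minimal sketch of such a worker follows. It assumes the byte offsets of the sequence part of an entry (after the header line) are already known from indexing, and the name complement_in_place is hypothetical.

```python
COMPLEMENT = bytes.maketrans(b"ACGTacgt", b"TGCAtgca")


def complement_in_place(path, seq_start, seq_end):
    """Overwrite the bytes seq_start..seq_end-1 of the file with their complement."""
    # "r+b" opens for reading and writing without truncating the file.
    with open(path, "r+b") as fasta:
        fasta.seek(seq_start)
        sequence = fasta.read(seq_end - seq_start)
        fasta.seek(seq_start)  # go back to where the sequence began
        # Newlines are not in the table, so the line layout is preserved.
        fasta.write(sequence.translate(COMPLEMENT))
```

Since every worker touches a different byte range, several workers can modify the same file at once. Remember that this destroys the original sequences, which is another reason to work on your own copy.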
Time everything to see the difference. Don't use more than 8 workers at a time - joblib helps with that.
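For the timing, a small helper like the following may be enough. The name timed is hypothetical. The limit on workers is set with joblib's n_jobs parameter, for example Parallel(n_jobs=8).

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Wrap each of your solutions in a call to this helper and compare the elapsed times on the same input file.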