Parallel programming

From 22112
Revision as of 10:28, 17 June 2024 by WikiSysop (talk | contribs) (→‎Material for the lesson)
Previous: Algorithms Next: More parallelism

Material for the lesson

Video: Parallel Programming - problems
Video: Libraries to use and how they work
Video: Parent/Child relationships
Powerpoint: Programming
Video: What is expected from exercises

Exercises

Read a fasta file, find the complement strand of each entry, and save the result in a new fasta file.
Note: The Python you should use on the pupil1 server is /opt/anaconda3_2021.11/bin/python3. There are two Python installations on the server, so it is easy to pick the wrong one.
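
As a baseline before any parallelism, the task itself can be sketched in a few lines. This is a minimal sketch, assuming that "complement" means base-by-base substitution without reversing the sequence and that only A, C, G, T and N occur; the function name and translation table are illustrative, not taken from the course material.

```python
# Translation table mapping each base to its complement (upper and lower case).
# Assumption: characters outside this table are passed through unchanged.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def complement_fasta(in_path, out_path):
    """Copy header lines as they are and write the complement of sequence lines."""
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            if line.startswith(">"):
                outfile.write(line)
            else:
                # The newline is not in the table, so it is kept as is.
                outfile.write(line.translate(COMPLEMENT))
```

If the course defines the complement strand as the reverse complement, the sequence of each entry would also have to be reversed before it is written.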
1)
In lesson 6, Distributed computing, you created 2-3 programs and used the Queueing System (QS) to submit your sub tasks. Repeat that exercise, but this time do not use the QS for the sub tasks; use subprocess and joblib instead, as shown in the powerpoint example. The code is the same as in the second exercise, except for the part that runs the sub tasks and waits for them to finish.
You must still use the QS to submit the initial main job.
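
One possible way to combine subprocess and joblib is sketched below; it is not necessarily the same as the powerpoint example. The commands are hypothetical placeholders (each one just starts a Python process that prints a message); in the exercise each command would be your own worker program with its input file. Using prefer="threads" is a design choice made here: each thread only waits for its child process, so threads are enough.

```python
import subprocess
import sys

from joblib import Parallel, delayed


def run_worker(command):
    """Run one external command, wait for it, and return its exit code."""
    return subprocess.run(command, check=False).returncode


# Hypothetical sub tasks; replace with the real worker program and arguments.
commands = [[sys.executable, "-c", f"print('chunk {i} done')"] for i in range(4)]

# n_jobs caps how many commands run at the same time.
# The call returns only when every command has finished.
exit_codes = Parallel(n_jobs=8, prefer="threads")(
    delayed(run_worker)(command) for command in commands
)
```

The list of exit codes comes back in the same order as the commands, so a failed sub task can be traced to its input.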

2)
If you feel this was too easy, have the administrator only index the fasta file and launch workers with the indexed information, using joblib and an internal worker function. Each worker creates the complement strand entry as a file, and the administrator collects the pieces as above. This uses techniques we have already covered. The result will be the same as above, but faster.
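
The administrator/worker split described here could look roughly like the sketch below. All names (index_fasta, worker, administrator, the piece file names) are hypothetical, the index is assumed to be a list of byte offsets per entry, and the complement is again assumed to be a plain base-by-base substitution.

```python
import os

from joblib import Parallel, delayed

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def index_fasta(path):
    """Return a list of (start, end) byte offsets, one pair per fasta entry."""
    starts = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            if line.startswith(b">"):
                starts.append(pos)
            pos += len(line)
    ends = starts[1:] + [pos]
    return list(zip(starts, ends))


def worker(path, start, end, out_path):
    """Read one entry by its offsets and write its complement to out_path."""
    with open(path, "rb") as f:
        f.seek(start)
        entry = f.read(end - start).decode("ascii")
    header, _, sequence = entry.partition("\n")
    with open(out_path, "w") as out:
        out.write(header + "\n" + sequence.translate(COMPLEMENT))
    return out_path


def administrator(fasta_path, result_path, workdir, n_jobs=8):
    """Index the file, run the workers in parallel, then collect the pieces."""
    index = index_fasta(fasta_path)
    pieces = Parallel(n_jobs=n_jobs)(
        delayed(worker)(fasta_path, start, end, os.path.join(workdir, f"piece{i}.fsa"))
        for i, (start, end) in enumerate(index)
    )
    # Parallel returns results in submission order, so entry order is preserved.
    with open(result_path, "w") as result:
        for piece in pieces:
            with open(piece) as f:
                result.write(f.read())
            os.remove(piece)
```

Only the offsets are passed to the workers, not the sequences themselves, so the administrator never has to hold the whole file in memory.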

3)
For an extra challenge, create a different worker function that writes the complement strand directly back into the file in the right place, as discussed earlier. Here you need your own "private" fasta file, because you cannot write in mine in /home/projects/pr_course.
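
Writing back in place is possible because a base-by-base complement has exactly the same length as the original, so no bytes move. A minimal sketch of such a worker follows, assuming the caller supplies the byte offsets of the sequence part of one entry (header excluded) and that the file is your own private copy; the function name and the offset convention are illustrative.

```python
# Byte-level translation table; bytes not listed (such as newlines) are unchanged.
COMPLEMENT = bytes.maketrans(b"ACGTNacgtn", b"TGCANtgcan")


def complement_in_place(path, seq_start, seq_end):
    """Overwrite bytes seq_start..seq_end of the file with their complement."""
    with open(path, "r+b") as f:  # read and write without truncating the file
        f.seek(seq_start)
        data = f.read(seq_end - seq_start)
        f.seek(seq_start)
        f.write(data.translate(COMPLEMENT))
```

As long as every worker gets a different, non-overlapping byte range, the workers do not overwrite each other's results.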


Time everything to see the difference. Do not use more than 8 workers at a time; joblib helps with that through its n_jobs setting.
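
One simple way to time each variant from inside Python is sketched below; the helper name timed is hypothetical. It measures wall-clock time, which is the relevant number when comparing a serial run with a parallel one.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start
```

Wrapping the same top-level function of each solution in this helper gives numbers that can be compared directly.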