Queueing System: Difference between revisions

Latest revision as of 13:19, 5 November 2025

Material for the lesson

Powerpoint: Queueing System
IUPAC nucleotide codes: Read before doing exercise

Exercises

The exercises must work with the human genome in the FASTA file human.fsa, and you can develop/test your program against the small-scale humantest.fsa file. You must use the Queueing System and sbatch.

1. Make (or reuse) a program that reads a fasta file and finds the complement strand for each entry, and saves the result in a new file. Keep it simple. Make sure it works. Speed is not important in this step.
2. Speed is still not important. Add this functionality to your program: Count the bases and unknowns in the entry and add the counts to the header line, like >seq01 A:3450 T:45665 C:34576 G:142345 N:5462
3. You need to increase the performance of your program. Experiment with various ideas of how to increase the speed. You got some ideas in the last lecture and you should also be using your Python knowledge. Document your experiments with a line or two as comments in your program. This exercise is likely the one that takes the most time to complete and the shortest to run.
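
The first two steps can be sketched as below. This is only one possible starting point, not a reference solution. The function names are made up for illustration, and the sketch assumes that every symbol other than A, T, C and G counts as an unknown (N) and that the complement keeps the original order (it is not reversed). The translation table covers the IUPAC ambiguity codes in both upper and lower case.

```python
import io

# Complement table for the IUPAC nucleotide codes, upper and lower case.
# Assumption: the input is DNA and the strand is complemented, not reversed.
_SRC = "ACGTRYSWKMBDHVN"
_DST = "TGCAYRSWMKVHDBN"
COMPLEMENT = str.maketrans(_SRC + _SRC.lower(), _DST + _DST.lower())


def count_bases(seq):
    """Count A, T, C and G; everything else is counted as N (an assumption)."""
    upper = seq.upper()
    counts = {base: upper.count(base) for base in "ATCG"}
    counts["N"] = len(upper) - sum(counts.values())
    return counts


def format_entry(header, seq_lines):
    """Return one entry with the counts added to the header and the sequence complemented."""
    counts = count_bases("".join(seq_lines))
    stats = " ".join(f"{base}:{counts[base]}" for base in "ATCGN")
    body = "".join(line.translate(COMPLEMENT) + "\n" for line in seq_lines)
    return f"{header} {stats}\n{body}"


def complement_fasta(infile, outfile):
    """Read fasta entries from infile and write the complemented entries to outfile."""
    header, seq_lines = None, []
    for line in infile:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                outfile.write(format_entry(header, seq_lines))
            header, seq_lines = line, []
        elif header is not None and line:
            seq_lines.append(line)
    if header is not None:
        outfile.write(format_entry(header, seq_lines))


if __name__ == "__main__":
    # Tiny in-memory demo; real runs would pass opened files instead.
    demo_in = io.StringIO(">seq01\nACGTN\n")
    demo_out = io.StringIO()
    complement_fasta(demo_in, demo_out)
    print(demo_out.getvalue(), end="")
```

For a real run, the two arguments would be file objects, for example `open("humantest.fsa")` for input and an output file of your own choosing opened for writing. The sketch keeps one entry's lines in memory at a time, which is simple but not necessarily fast; that trade-off is what step 3 asks you to experiment with.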

I solved this problem in 222 seconds on a server; however, times vary: another run with the same code took 527 seconds, because IO from other users can really affect your time. A "slow" version took over 1000 seconds on a server. On my laptop with an SSD, I solved the problem in 100 seconds.
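
Because run times can differ this much between identical runs, it may help to time each experiment more than once before deciding which idea is faster. The following is a minimal sketch of such a timing helper; `time_runs` and `summarize` are hypothetical names, and the function being timed is whatever version of your program you want to compare.

```python
import time


def time_runs(func, repeats=3):
    """Call func() `repeats` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Return the best, worst and mean of a list of timings."""
    return {
        "best": min(timings),
        "worst": max(timings),
        "mean": sum(timings) / len(timings),
    }
```

The best time is usually the least disturbed by other users' IO, while the spread between best and worst gives a feel for how noisy the machine is.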

@@ Line 4: / Line 4: @@
 |}
 == Material for the lesson ==
+Powerpoint: [https://teaching.healthtech.dtu.dk/material/22112/22112_05-Queue.ppt Queueing System]<br>
+IUPAC nucleotide codes: [http://www.dnabaser.com/articles/IUPAC%20ambiguity%20codes.html Read before doing exercise]<br>
+<!--
 Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=d679e5c8-9372-4af7-9260-af270124ddf6 Introduction to the Queueing System]<br>
 Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=bca57da6-2c60-4218-96ac-af270124b673 Submitting jobs]<br>
 Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=e6e59e86-2928-4caf-b5c3-af2701248a3d Queue control and practical advice]<br>
-Powerpoint: [https://teaching.healthtech.dtu.dk/material/22112/22112_05-Queue.ppt Queueing System]<br>
-IUPAC nucleotide codes: [http://www.dnabaser.com/articles/IUPAC%20ambiguity%20codes.html Read before doing exercise]<br>
 Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=c82ed125-4664-4c5b-b23b-af1700719969 Exercises]
+-->
 == Exercises ==
 The exercises must work with the human genome in the fastafile human.fsa and you can develop/test your program against the small scale humantest.fsa file.
-Use/copy the jobscript-template.sh as a template for your runs with the big file. ''You must use the Queueing System and '''qsub'''''.
+''You must use the Queueing System and '''sbatch'''''.
 # Make (or reuse) a program that reads a fasta file and finds the complement strand for each entry, and saves the result in a new file. Keep it simple. Make sure it works. Speed is not important in this step.
@@ Line 19: / Line 21: @@
 # You need to increase the performance of your program. Experiment with various ideas of how to increase the speed. You got some ideas last lecture and you should also be using your Python knowledge. Document your experiments with a line or two as comments in your program. This exercise is likely the one that takes the most time to complete and the shortest to run.
-I solved this problem in 222 seconds on Computerome, however time vary, as I had also a run using 527 seconds with the same code - IO from other users can really affect your time. I had a "slow" version run on computerome in over 1000 seconds. On my laptop with SSD I solved the problem in 100 seconds.
+I solved this problem in 222 seconds on a server, however time vary, as I had also a run using 527 seconds with the same code - IO from other users can really affect your time. I had a "slow" version run on a server in over 1000 seconds. On my laptop with SSD I solved the problem in 100 seconds.
