Microbial genomics exercise

Revision as of 14:31, 6 January 2026


Dear course participants,

In this exercise you will analyse microbial genome sequences using bioinformatics tools that are commonly used for microbial diagnostics and research.

The tools are available on the server as Apptainer container images. Although the image files are located in /home/projects/microbial_genomics/singularity_image_files, you do not need to call them directly. Instead, you can run the tools using the provided Bash executables in /home/ctools/bin, which is already on your $PATH.

gtdbtk.sh
mlst.sh
parsnp.sh
abricate.sh
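
Before starting, it can be useful to confirm that the four executables listed above are actually found on your $PATH. The following is a minimal sketch, not part of the course material; the helper name check_tools is hypothetical and it relies only on the standard shell built-in command -v.

```shell
# Report whether each named command can be found on $PATH.
# check_tools is a hypothetical helper name used for illustration only.
check_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

# Check the four executables named in this exercise.
check_tools gtdbtk.sh mlst.sh parsnp.sh abricate.sh
```

If any tool is reported as missing, check that /home/ctools/bin appears in the output of echo $PATH.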

You can use


Background

We imagine that we are employed at a hospital to provide diagnostics for patient care.

The laboratory has sequenced genomic DNA from a clinical specimen. The sequence reads are stored in FASTQ files that are compressed with gzip:

  • X
  • Y

Use the following command to read the first lines of one of the files and inspect its content:

zcat filename.fastq.gz | head
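
Beyond looking at the first lines, you may want to know how many reads a file contains. The sketch below assumes the common FASTQ layout of exactly four lines per read (header, sequence, separator, qualities); the function name count_reads is hypothetical, and gzip -dc is used as a portable equivalent of zcat.

```shell
# Count reads in a gzip-compressed FASTQ file.
# Assumption: every record spans exactly four lines.
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}
```

For example, count_reads filename.fastq.gz prints the number of reads in that file. A non-integer result suggests the file is truncated or does not follow the four-line layout.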

EX02: What species is the genome?

The laboratory has sequenced genomic DNA from single-colony isolates of bacteria cultivated from a clinical specimen. The sequence reads have been de novo assembled, and the genome assemblies are stored in FASTA-formatted files available in /home/projects/microbial_genomics/genome_assemblies.

Your goal is to determine the bacterial species of the assembled genomes.

We can use the GTDB-Tk tool to assign taxonomic classifications to bacterial genomes based on the Genome Taxonomy Database (GTDB).

Run gtdbtk.sh -h to get help information on how to use GTDB-Tk.

Use GTDB-Tk to determine the species of the genomes:

gtdbtk.sh classify_wf --extension .fna --cpus 10 --genome_dir /home/projects/microbial_genomics/ex02_assemblies --out_dir $HOME/output
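
Once the run has finished, the species assignments can be pulled out of the tab-separated summary table that GTDB-Tk writes to the output directory. The sketch below is an illustration under assumptions: it expects a header row with columns named user_genome and classification, and a semicolon-separated lineage in which the species rank is prefixed with s__. The function name extract_species is hypothetical, and the exact file name and columns may differ between GTDB-Tk versions.

```shell
# Print "genome<TAB>species" for each row of a GTDB-Tk style summary table.
# Assumptions: tab-separated input with a header row containing the columns
# "user_genome" and "classification"; lineage ranks separated by ";".
extract_species() {
    awk -F '\t' '
        NR == 1 {
            # Locate the two columns by header name rather than by position.
            for (i = 1; i <= NF; i++) {
                if ($i == "user_genome")    g = i
                if ($i == "classification") c = i
            }
            next
        }
        g && c {
            n = split($c, ranks, ";")
            species = "unclassified"
            for (j = 1; j <= n; j++) {
                # Keep the species rank only if a name follows the prefix.
                if (ranks[j] ~ /^s__./) species = substr(ranks[j], 4)
            }
            print $g "\t" species
        }
    ' "$1"
}
```

Assuming the summary file for bacterial genomes is named gtdbtk.bac120.summary.tsv, a call such as extract_species $HOME/output/gtdbtk.bac120.summary.tsv lists one genome and its species per line.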

Question: What species are the genomes?


EX03: What sequence type is the genome?

Supplementary files

Article describing GTDB-Tk can be found here