2024 Program

Block 1: Bioinformatics for genomic medicine - variant calling and classification

Day 1: DNA sequencing and germline variant calling

When: April 22, 10.00-17.00

Where: Building 202, room R1014

Program:
10.00 - 10.15: Welcome (Carolina Barra Quaglia)
10.15 - 11.00: Bioinformatics in Genomic Medicine (Frederik Otzen Bagger)
11.00 - 11.15: Coffee Break
11.15 - 12.00: Somatic variant calling and RNA sequencing (Frederik Otzen Bagger)
12.00 - 13.00: Lunch
13.00 - 13.30: Introduction to germline variant characterization (Carolina Barra Quaglia)
13.30 - 17.00: Exercises in germline variant classification (Carolina Barra Quaglia)

Preparation:
Whole genome sequencing in clinical practice

Slides:
Welcome
Bioinformatics pipelines + Burrows Wheeler transformation
Variant Not Found

Material:
Burrows-Wheeler transformation explained (youtube video)

Extra Material on Protein Structure Visualisation:
Pymol software
Pymol basics
Pymol wiki
Pymol license file

Bioinformatics resources and databases:
ClinVar
Virtual Ribosome
UniProt
Variant effect predictor
AlphaFold
String

Exercise:
Burrows-Wheeler Transformation and GATK
Variant not found

Curriculum summary
Out of the material covered today, the following is in the curriculum for the exam:
* The following file formats: fastq, bam, sam, cram, vcf, vcf.gz (what’s in them)
* Of the file formats above, which should you store for your own data and why?
* The bioinformatics databases and web services listed above (from a basic usage perspective)
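
As a rough illustration of "what's in them" for the fastq format, the sketch below reads a four-line FASTQ record (header, sequence, separator, quality string) and decodes the quality characters. The read is made up for this example, and the Phred+33 quality encoding is an assumption; older data may use a different offset.

```python
# Hypothetical single-read FASTQ text, invented for illustration only.
FASTQ_TEXT = """@read1
GATTACA
+
IIIIII#
"""

def parse_fastq(text):
    """Yield (read_id, sequence, phred_scores) for each 4-line record."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines), 4):
        header, seq, _separator, qual = lines[i:i + 4]
        # Assumed Phred+33 encoding: quality score = ASCII code - 33.
        scores = [ord(ch) - 33 for ch in qual]
        # Drop the leading "@" from the header line.
        yield header[1:], seq, scores

records = list(parse_fastq(FASTQ_TEXT))
for read_id, seq, scores in records:
    print(read_id, seq, scores)
```

Each base has exactly one quality character, so the sequence and quality lines always have the same length.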

Day 2: Clustering and RNA classifiers

When: April 23, 9.00-17.00

Where: Building 202, room R1014

Program:
9.00 - 10.00: Exercise summary + clinical variant interpretation (Majbritt Busk Madsen)
10.00 - 11.00: Molecular subtyping (Lars Rønn Olsen)
11.00 - 11.15: Coffee Break
11.15 - 12.00: Clustering + Classification (Lars Rønn Olsen)
12.00 - 13.00: Lunch
13.00 - 17.00: Exercises in clustering and classification (Carolina Barra Quaglia)

Slides:
Clinical variant classification
Clustering
Classification

Material:
Principal component analysis explained
Paper describing bioinformatics pipelines for molecular subtyping of cancer
Paper: “Delivering precision oncology to patients with cancer”
Paper describing machine learning algorithms for breast cancer classification (very technical paper, but it provides a good overview - voluntary reading)

Exercise:
Clustering and classification
R script
R data

Curriculum summary
Out of the material covered today, the following is in the curriculum for the exam:
* Euclidean distance (and knowing that there are other distance metrics out there)
* Basic concept of data distributions
* Hierarchical agglomerative clustering
* knn classification
* Distance to centroid classification
* The fact that classifiers have tweakable parameters (e.g. the “k” in knn) and how to decide on the best settings
* Principal component analysis (in broad strokes, not the math of it)
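
Several of the points above (Euclidean distance, knn classification, and choosing "k") can be sketched in a few lines of Python. This is a minimal illustration, not the course's R exercise: the toy data set is invented, and leave-one-out accuracy is just one assumed way of comparing settings.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k):
    """Majority label among the k training points closest to query."""
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each point using all the others."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)

# Invented two-class toy data: three samples per class, two features each.
DATA = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.3), "A"),
    ((4.0, 4.2), "B"), ((4.1, 3.9), "B"), ((3.8, 4.0), "B"),
]

for k in (1, 3, 5):
    print(k, loo_accuracy(DATA, k))
```

On this tiny data set k = 5 performs worst, because each held-out point then has more neighbours from the other class than from its own. That is the kind of effect to look for when deciding on the best setting.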

Block 2: Bioinformatics for precision therapeutics - cancer immunotherapy

Day 3: Assessing chimeric antigen receptor therapy targets

When: April 24, 10.00-17.00

Where: Building 202, room R1014

Program:
Homework: Working with biological data in a digital format
…or an alternative approach
10.00 - 11.00: Bioinformatics tools for assessing CAR targets (Lars Rønn Olsen)
11.00 - 11.15: Coffee Break
11.15 - 12.00: Exercises in CAR target assessment (Lars Rønn Olsen)
12.00 - 13.00: Lunch
13.00 - 17.00: Exercises in CAR target assessment (Lars Rønn Olsen)

Slides:
Online bioinformatics tools for assessing CAR therapy targets

Material:
Blog post describing the prediction of signal peptides
Overview of post-translational modifications
Paper describing the importance of target isoforms in CAR cell therapy
Introduction to Xenabrowser

Bioinformatics resources and databases:
Xenabrowser
Clustal Omega
(Protein) BLAST
DeepLoc
SignalP
TopCons
BepiPred
NetPhosP
UniProt

Exercise:
Evaluating targets for chimeric antigen receptor therapy

Curriculum summary
Out of the material covered today, the following is in the curriculum for the exam:
* The FASTA format and why we use it
* The purpose of the web servers and databases listed above (if you use any of them for your final project, you will also need to be able to provide a high level explanation of how it works)
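
To make the FASTA format concrete, here is a minimal parsing sketch. A record is a header line starting with ">" followed by one or more sequence lines. The two records are made up for illustration, and taking the first word of the header as the identifier is an assumption that matches common usage rather than a rule of the format.

```python
# Hypothetical FASTA text with two invented records.
FASTA_TEXT = """>seq1 hypothetical example protein
MKTAYIAKQR
QISFVKSHFS
>seq2 another made-up entry
MVLSPADKTN
"""

def parse_fasta(text):
    """Return {identifier: sequence}, joining wrapped sequence lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Assumed convention: identifier is the first word after ">".
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}

for name, sequence in parse_fasta(FASTA_TEXT).items():
    print(name, len(sequence), sequence)
```

Because the format is this simple, most of the web servers listed above accept it directly as pasted text.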

Day 4: Predicting T cell epitopes

When: April 25, 9.00-15.00

Where: Building 202, room R1014

Program:
9.00 - 10.00: MHC binding (Carolina Barra Quaglia)
10.00 - 11.00: Neural networks and MHC binding predictions (Carolina Barra Quaglia)
11.00 - 12.00: Immunopeptidomics and the IEDB (Carolina Barra Quaglia)
12.00 - 13.00: Lunch
13.00 - 15.00: Exercises in prediction of T cell Epitopes (Carolina Barra Quaglia)

Slides:
Predicting T cell epitopes

Material:
T cell epitope prediction book chapter

Bioinformatics resources and databases:
Sequence Logo generator
Predictor of MHC-I presented peptides
Predictor of MHC-II presented peptides
Allele Frequencies

Exercise:
Making sequence logo
Neural network exercise
Immune Epitope Database exercise
Prediction of T cell epitopes - answers

Curriculum summary
Out of the material covered today, the following is in the curriculum for the exam:
* What is visualized using logo plots, and what do the axes of the plot mean?
* How are HLA binders predicted? (General method and training data)
* What is the content of the immune epitope database and what is the source of these data?
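
For the logo plot question, the sketch below computes a simple Shannon-style information content per position, which is what the y-axis (in bits) of a basic sequence logo shows, with the x-axis being the position in the peptide. It assumes a uniform background over 20 amino acids and no small-sample correction; logo tools may use other definitions. The peptides are invented.

```python
import math
from collections import Counter

def information_content(peptides, alphabet_size=20):
    """Bits of information per position for equal-length peptides."""
    length = len(peptides[0])
    max_bits = math.log2(alphabet_size)  # about 4.32 bits for 20 amino acids
    bits = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts.values())
        # Shannon entropy of the observed amino acid frequencies.
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in counts.values())
        bits.append(max_bits - entropy)
    return bits

# Invented peptides: positions 1 and 2 are conserved, position 3 varies.
PEPTIDES = ["ALK", "ALR", "ALE", "ALD"]
BITS = information_content(PEPTIDES)
print([round(b, 2) for b in BITS])
```

Conserved positions get tall stacks (high information), while variable positions get short ones.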

Block 3: Fundamental bioinformatics algorithms: BLAST and MSA

Day 5: Pairwise sequence alignment

When: April 29, 10.00-17.00

Where: Self study and Zoom

Program:
10.00 - 13.00: Video lecture and exercises
16.00 - 17.00: Q&A time on Zoom (Carolina Barra Quaglia)

Exercises:
Pairwise alignment - answers

Curriculum summary
Out of the material covered today, the following is in the curriculum for the exam:
* What is achieved using pairwise sequence alignment
* How is the output interpreted - particularly the presence of gaps
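
As an illustration of what a pairwise alignment produces and where gaps come from, here is a minimal global alignment sketch in the style of Needleman-Wunsch. The scoring values (match +1, mismatch -1, gap -2) are arbitrary assumptions for this example; real tools use substitution matrices and usually separate gap-opening and gap-extension penalties.

```python
def pair_score(x, y, match, mismatch):
    return match if x == y else mismatch

def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (aligned_a, aligned_b, score) for a global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    # Fill the table: best of diagonal (align), up (gap in b), left (gap in a).
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1], match, mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j] ==
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1], match, mismatch)):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

aligned_a, aligned_b, total = needleman_wunsch("GATTACA", "GATACA")
print(aligned_a)
print(aligned_b)
print("score:", total)
```

A "-" in one sequence marks a gap: a position where the other sequence has a residue with no counterpart, which is usually read as an insertion or deletion.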

Day 6: BLAST

When: April 30, 9.00-14.00

Where: Self study and Zoom

Program:
9.00 - 12.00: Video lecture on BLAST
13.00 - 14.00: Q&A time on Zoom (Carolina Barra Quaglia)

Exercises:
BLAST
Answers

Curriculum summary
Out of the material covered today, the following is in the curriculum for the exam:
* What is achieved using BLAST?
* What is an E-value?
* What is % identity?
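
The two quantities above can be sketched as follows. Percent identity is computed here as identical columns divided by alignment length, which is one common definition. The E-value function uses the standard form E = K * m * n * exp(-lambda * S); the values of K and lambda below are arbitrary illustrative numbers, not the parameters BLAST uses for any particular scoring system.

```python
import math

def percent_identity(aligned_a, aligned_b):
    """Identical (non-gap) columns as a percentage of alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)

def expected_hits(raw_score, query_len, db_len, lam, k):
    """E = K * m * n * exp(-lambda * S): hits expected by chance alone."""
    return k * query_len * db_len * math.exp(-lam * raw_score)

print(round(percent_identity("GATTACA", "GAT-ACA"), 1))

# Illustrative parameters only (assumed, not taken from BLAST).
LAM, K = 0.3, 0.1
small_db = expected_hits(50, 100, 1_000_000, LAM, K)
large_db = expected_hits(50, 100, 2_000_000, LAM, K)
print(small_db, large_db)
```

The same alignment score gives a larger E-value in a larger database, because more chance hits are expected. This is why an E-value should be read together with the database that was searched.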

Block 4: Project work

Introduction to project work

When: May 1 - May 30

Where: Self-organized

Program:
Q&A times on Zoom (Carolina Barra Quaglia):
TBD

Exam

When: Submission of written assignment in Digital Eksamen: 12:00 noon, Monday, May 27, 2024

When: Oral exam May 31, 9.00-12.00 (exact program TBD)

Where: DTU, building 202