Pairwise alignment

From 22113
Revision as of 16:09, 6 March 2024 by WikiSysop

Description

Aligning sequences is of great importance in bioinformatics. Many discoveries are based on finding sequences that align to each other, and evolutionary theory and phylogenetics rely on sequence alignments. This project is about implementing a well-known algorithm for aligning two sequences, i.e. finding where they match in an optimal fashion.

You must choose to implement either:

  1. Smith-Waterman alignment, where the goal is to find the best local alignment of the two input sequences, i.e. the highest-scoring pair of regions, one from each sequence.
  2. Needleman-Wunsch alignment, where the goal is to find the best global alignment of the two input sequences, i.e. the optimal alignment that covers both sequences from end to end.
  3. Or both if you are cool :-)
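
Both algorithms fill the same dynamic-programming table and differ mainly in initialisation, in whether scores are clamped at zero, and in where the traceback starts and stops. The sketch below is one possible way to organise that, not a reference solution. The function name `align`, the simple match/mismatch/gap scores, and the linear gap penalty are assumptions made for illustration; a real solution would likely use a substitution matrix.

```python
def align(a, b, local=False, match=1, mismatch=-1, gap=-1):
    """Align a and b; return (score, aligned_a, aligned_b, start_a, start_b).

    local=False: Needleman-Wunsch (global); local=True: Smith-Waterman (local).
    start_a and start_b are 0-based offsets where the alignment begins.
    The scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] is the best score of an alignment ending at a[:i] and b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    if not local:
        # Global alignment pays for leading gaps; local alignment starts at 0
        for i in range(1, n + 1):
            score[i][0] = i * gap
        for j in range(1, m + 1):
            score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cell = max(score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                       score[i - 1][j] + gap,            # gap in b
                       score[i][j - 1] + gap)            # gap in a
            if local:
                cell = max(cell, 0)                      # a local alignment may restart
            score[i][j] = cell
            if local and cell > best:
                best, best_pos = cell, (i, j)

    # Traceback: from the best cell (local) or the bottom-right corner (global)
    i, j = best_pos if local else (n, m)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if local and score[i][j] == 0:
            break
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    final = best if local else score[n][m]
    return final, "".join(reversed(out_a)), "".join(reversed(out_b)), i, j
```

With these assumed scores, `align("ACGT", "AGT")` places one gap in the second sequence, while `align("TTTACGTTT", "GGACGGG", local=True)` reports only the shared `ACG` region together with its offsets in both inputs.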

Input and output

The input is a FASTA file containing the two sequences to be aligned.
The output should be the best alignment, with clear notation showing where it lies in each input sequence.
Note: Pairwise alignment works for both DNA and protein sequences.
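
As a rough illustration of the input and output handling, the sketch below reads FASTA records and prints an alignment with 1-based coordinates. The function names `read_fasta` and `format_alignment` and the output layout are assumptions; the assignment does not prescribe a particular notation.

```python
def read_fasta(lines):
    """Parse FASTA text given as an iterable of lines.

    Returns a list of (header, sequence) tuples. The header is the text
    after '>'; sequence lines are concatenated and upper-cased.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_alignment(name_a, name_b, aligned_a, aligned_b, start_a, start_b):
    """Render an alignment with 1-based, inclusive coordinates per sequence.

    start_a and start_b are 0-based offsets of the first aligned residue.
    """
    end_a = start_a + len(aligned_a.replace("-", ""))
    end_b = start_b + len(aligned_b.replace("-", ""))
    # '|' marks identical residues, a space marks mismatches and gaps
    bars = "".join("|" if x == y and x != "-" else " "
                   for x, y in zip(aligned_a, aligned_b))
    return "\n".join([
        f"{name_a}: {start_a + 1}-{end_a}",
        f"{name_b}: {start_b + 1}-{end_b}",
        aligned_a,
        bars,
        aligned_b,
    ])
```

An open file object can be passed directly, e.g. `read_fasta(open("fastafile.fsa"))`; a program for this project would probably also check that exactly two records were found.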

Examples of program execution:

align.py <fastafile>
align.py fastafile.fsa
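
One way to handle this command line is with the standard `argparse` module, sketched below. The `--mode` option is a hypothetical addition for a program that implements both algorithms; it is not part of the assignment text.

```python
import argparse


def build_parser():
    """Build the command-line parser for align.py (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="align.py",
        description="Pairwise alignment of the two sequences in a FASTA file.",
    )
    parser.add_argument("fastafile",
                        help="FASTA file containing the two sequences")
    # Hypothetical option, only relevant if both algorithms are implemented
    parser.add_argument("--mode", choices=["local", "global"], default="local",
                        help="Smith-Waterman (local) or Needleman-Wunsch (global)")
    return parser
```

Calling `build_parser().parse_args()` in the script reads `sys.argv` and prints a usage message if the file name is missing.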

Details

FASTA file: Similar DNA sequences coding for insulin.
Wikipedia: Smith-Waterman alignment.
Wikipedia: Needleman-Wunsch alignment.
Google: book on alignment.
Note: Investigate substitution matrices, https://en.wikipedia.org/wiki/BLOSUM and https://en.wikipedia.org/wiki/Point_accepted_mutation
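
A substitution matrix replaces the single match/mismatch score with a score for each pair of residues. The sketch below uses a toy DNA matrix to show the idea. The numbers are made up for illustration (they are not BLOSUM or PAM values), and `toy_dna_matrix` and `ungapped_score` are hypothetical helper names.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def toy_dna_matrix(match=2, transition=-1, transversion=-2):
    """Build a made-up DNA substitution matrix as a dict keyed by (x, y).

    Transitions (purine<->purine, pyrimidine<->pyrimidine) are penalised
    less than transversions. The values are illustrative assumptions.
    """
    matrix = {}
    for x in "ACGT":
        for y in "ACGT":
            if x == y:
                matrix[x, y] = match
            elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
                matrix[x, y] = transition
            else:
                matrix[x, y] = transversion
    return matrix


def ungapped_score(a, b, matrix):
    """Sum the substitution scores of two equal-length, gap-free sequences."""
    return sum(matrix[x, y] for x, y in zip(a, b))
```

In an alignment algorithm, the lookup `matrix[x, y]` would take the place of the fixed match/mismatch score; for proteins, the dictionary would instead hold the values of a published matrix such as BLOSUM or PAM.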