Link collection
Taxonomy
- Tree of Life http://www.tolweb.org/
- (Good descriptive taxonomy database, but it covers a limited range of organisms.)
- NCBI Taxonomy http://www.ncbi.nlm.nih.gov/Taxonomy/
- (Somewhat "technical" but very exhaustive taxonomic database. TaxIDs are also used in GenBank and UniProt.)
- The "Common Tree" function can be used to investigate how closely related two or more organisms are: http://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi
- NCBI search with "Token set" can be used if you do not know the Latin name: http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi
DNA Databases
- GenBank
- Search page: https://www.ncbi.nlm.nih.gov/nucleotide
- SGD (Saccharomyces Genome Database) http://www.yeastgenome.org
- (The baker's yeast genome)
- Gene https://www.ncbi.nlm.nih.gov/gene/
- Database of genes in completely sequenced genomes and their phenotypes.
Translation
- Virtual Ribosome
- https://services.healthtech.dtu.dk/services/VirtualRibosome-2.0/
- "The Genetic Codes" (NCBI) https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c
- Information about translation codes
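To illustrate what a translation table encodes, here is a minimal sketch of translating a DNA sequence with the standard genetic code (NCBI translation table 1). It is not how Virtual Ribosome is implemented; the names `CODON_TABLE` and `translate` are made up for this example, and only reading frame 1 on the forward strand is handled.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1). The amino-acid string
# follows the T, C, A, G base ordering used on the NCBI "Genetic Codes" page.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every codon (TTT, TTC, TTA, ...) to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, stop_at_stop=True):
    """Translate reading frame 1 of a DNA (or RNA) sequence.

    Hypothetical helper for illustration: unknown codons become 'X',
    and a trailing incomplete codon is ignored.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAGTAA"))
```

Other translation tables (for example, mitochondrial codes) can be sketched the same way by swapping in the corresponding amino-acid string from the NCBI page.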
Protein databases
Protein sequence and annotations
- UniProt https://www.uniprot.org
Protein 3D structure
- PDB (Protein Data Bank) http://www.rcsb.org/
Protein domains
- InterPro https://www.ebi.ac.uk/interpro/
Alignment
Pairwise alignment
- Pairwise alignment (global and local) http://www.ebi.ac.uk/emboss/
- Use "Needle" for global alignment and "Water" for local alignment.
- Shuffle a sequence in random order (to get a null model)
- Protein: http://www.bioinformatics.org/sms2/shuffle_protein.html
- DNA: http://www.bioinformatics.org/sms2/shuffle_dna.html
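The shuffling step can also be done locally. The sketch below shows one simple way to do it, assuming a plain random permutation of the residues is an acceptable null model (it preserves length and composition but destroys the order). The function name `shuffle_sequence` is hypothetical, and this is not necessarily the procedure used by the web tools linked above.

```python
import random


def shuffle_sequence(seq, rng=None):
    """Return a random permutation of the residues in seq.

    Pass a seeded random.Random instance as rng to get reproducible output.
    """
    rng = rng or random.Random()
    residues = list(seq)
    rng.shuffle(residues)  # in-place uniform shuffle
    return "".join(residues)


original = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up example peptide
rng = random.Random(42)             # fixed seed, for reproducibility only

# Several shuffled copies can be aligned in place of the real sequence
# to see what alignment scores arise by chance.
for _ in range(3):
    print(shuffle_sequence(original, rng))
```

Aligning a set of such shuffled sequences with the same program and parameters as the real pair gives a rough baseline for how high a score can get by chance.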
Multiple alignment
The multiple alignment programs MUSCLE and Clustal Omega are built into Seaview, which should be installed on your computer.
- Other multiple alignment methods on EBI's server
- T-Coffee https://www.ebi.ac.uk/jdispatcher/msa/tcoffee
- MAFFT https://www.ebi.ac.uk/jdispatcher/msa/mafft
- Kalign https://www.ebi.ac.uk/jdispatcher/msa/kalign/
- RevTrans
- Special method for aligning coding DNA. https://services.healthtech.dtu.dk/services/RevTrans-2.0/
Phylogenetic trees
Seaview can draw simple trees, but if you need more options and annotations, go to:
- interactive Tree Of Life (iTOL)
- https://itol.embl.de/
BLAST
Note: Most sequence databases, including UniProt and RCSB PDB, offer an option for doing BLAST searches. In the course we have used NCBI's BLAST, since NCBI has the largest selection of databases and is the home of GenBank.
- NCBI BLAST
- https://blast.ncbi.nlm.nih.gov/Blast.cgi
- BLASTN: Choose "nucleotide blast" and "blastn" on the next page.
- NB: We do not use "megablast" in this course (it is designed for finding highly similar sequences).
- BLASTP: Choose "protein blast" and "blastp" on the next page.
- Note the information about conserved protein domains near the top of the results page. Click the domain to see further information.
For both BLASTN and BLASTP, remember to choose a relevant database: use NR/NT for the broadest overview, use PDB for structures, or restrict the search to an organism or taxonomic group under "Organism" if that makes sense for your task.
- PSI-BLAST
- Go to NCBI BLAST (see above) and choose "Protein blast"; on the next page you can then choose PSI-BLAST.
Weight matrices and sequence logos
- WebLogo http://weblogo.berkeley.edu/
- A good general-purpose logo generator for BOTH DNA and peptide sequences.
- Alternate link to version 3 (lacks some options): http://weblogo.threeplusone.com/
- Seq2Logo
- A more advanced method for working with peptide sequences. https://services.healthtech.dtu.dk/services/Seq2Logo-2.0/
- EasyPred
- Make a logo AND train a weight matrix using clustering and pseudocounts. https://services.healthtech.dtu.dk/services/EasyPred-1.0/
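As a rough illustration of what these tools compute, the sketch below builds a log-odds weight matrix from a set of equal-length, pre-aligned sequences and calculates the per-position information content that determines column height in a logo. It is a deliberate simplification: it uses a flat add-one pseudocount, no sequence clustering, and no small-sample correction, so it does not reproduce the methods of WebLogo, Seq2Logo, or EasyPred. All function names and the example sequences are made up for this sketch.

```python
import math
from collections import Counter


def column_frequencies(seqs, alphabet, pseudocount=1.0):
    """Per-position residue frequencies, with a flat pseudocount per symbol.

    Assumes every sequence has the same length and uses only symbols
    from alphabet.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + pseudocount * len(alphabet)
    freqs = []
    for pos in range(length):
        counts = Counter(s[pos] for s in seqs)
        freqs.append({a: (counts[a] + pseudocount) / total for a in alphabet})
    return freqs


def weight_matrix(freqs, background):
    """Log-odds scores (in bits) of observed versus background frequency."""
    return [
        {a: math.log2(p / background[a]) for a, p in col.items()}
        for col in freqs
    ]


def information_content(col):
    """Column information in bits: log2(alphabet size) minus Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in col.values() if p > 0)
    return math.log2(len(col)) - entropy


def score(seq, matrix):
    """Sum the matrix weights along a sequence of matching length."""
    return sum(col[residue] for residue, col in zip(seq, matrix))


alphabet = "ACGT"
sites = ["ACGT", "ACGA", "ACGC", "ACGG"]      # made-up example sites
background = {a: 0.25 for a in alphabet}      # assumed uniform background

freqs = column_frequencies(sites, alphabet)
matrix = weight_matrix(freqs, background)
print([round(information_content(col), 2) for col in freqs])
print(round(score("ACGT", matrix), 2))
```

A nonzero pseudocount keeps every frequency above zero, so the log-odds scores stay finite even for residues never observed at a position.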