ExUniProt-answers

Revision as of 12:13, 5 March 2024 by WikiSysop

Answers to "Exercise: Protein databases"

The numbers are found using UniProt on Feb 10, 2017 (release 2017_01).

Simple text mining

QUESTION 1:

  1. How many hits do you find?
    3150
  2. How many of these hits are from Swiss-Prot?
    1254
  3. Can you identify the correct hit (i.e. see which one is actually human insulin and not something else)?
    It's P01308 / INS_HUMAN (among the first ten hits).

QUESTION 2: How many hits are now left? How many of these are from Swiss-Prot?
1298 and 895

QUESTION 3: How many hits are now left? How many of these are from Swiss-Prot?
195 and 60

QUESTION 4: How many hits are now left?
100

QUESTION 5:

  1. How did you do this?
    By adding NOT name:receptor to the query box.
  2. How many hits are now left?
    48

The contents of UniProt

QUESTION 6:

  1. How many references are there in the insulin entry?
    36
  2. Why do you think insulin is such a highly investigated protein?
    Because it is linked to a common and serious disease (diabetes) and used as a drug.

QUESTION 7:

  1. Where do you find insulin?
    It is secreted from the cell; this is stated just below the section heading. Under GO - Cellular component you can find additional locations, such as the endoplasmic reticulum lumen, but these are temporary stages on the way to secretion.
  2. Why do you think it is found there?
    Because it is a hormone - it has to travel through the bloodstream to influence other cells.

QUESTION 8: How long are the signal peptide and the propeptide, respectively?
24 and 31 amino acids.
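
Both lengths follow from the feature coordinates by inclusive counting (end - start + 1). A minimal sketch, assuming the signal peptide is annotated at positions 1-24 and the propeptide at positions 57-87 in the P01308 entry; these coordinates are not given in the answer above, so check them against the entry you are viewing. The function name is only illustrative.

```python
def feature_length(start, end):
    """Length of a sequence feature given 1-based, inclusive coordinates."""
    return end - start + 1

# Assumed coordinates (read them from the entry's feature table).
signal_peptide = (1, 24)
propeptide = (57, 87)

print(feature_length(*signal_peptide), feature_length(*propeptide))
```

The "+ 1" matters because both the first and the last position belong to the feature.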

QUESTION 9: Which positions are in β-sheet conformation in insulin?
Positions 26-29, 48-50, 74-76, and 98-101.

Other databases linked from Swiss-Prot

No questions asked here.

Advanced search

QUESTION 10: How many proteins did you find, and what was the search string (the text in the search field)?
5,186,371
annotation:(type:signal)

QUESTION 11: How many proteins do you find now, and what has the search string changed into?
3486
annotation:(type:signal evidence:experimental)

QUESTION 12: How many proteins do you find now, and what is the search string?
707
annotation:(type:signal evidence:experimental) AND organism:"Homo sapiens (Human) [9606]"
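
The search strings in Questions 10-12 grow by adding one restriction at a time. The sketch below only assembles the text of the final string from its parts; it does not contact UniProt. The helper name `combine` is made up for illustration, and the field syntax is the one shown by the website in release 2017_01, so later versions of the site may use different field names.

```python
def combine(*clauses):
    """Join search clauses with an explicit AND, skipping empty ones."""
    return " AND ".join(c for c in clauses if c)

# Clauses copied from the answers to Questions 11 and 12.
signal_experimental = "annotation:(type:signal evidence:experimental)"
human = 'organism:"Homo sapiens (Human) [9606]"'

query = combine(signal_experimental, human)
print(query)
```

Each added clause can only keep or reduce the number of hits, which matches the counts above (5,186,371, then 3486, then 707).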

QUESTION 13 a: How many proteins are there in UniProt from Neisseria gonorrhoeae with the default TaxID [485]?
9203

QUESTION 13 b: How many proteins are there in UniProt from Neisseria gonorrhoeae in total (all strains and subspecies)?
18,596 (roughly twice as many)

QUESTION 13 c: What does the search string look like now?
taxonomy:"Neisseria gonorrhoeae [485]"

QUESTION 14: How many proteins of maximum length 10 do you find?
32,090
length:[1 TO 10]

QUESTION 15: How many proteins are now left?
1280
length:[1 TO 10] existence:"evidence at protein level"

QUESTION 16: How many proteins are now left?
830
length:[1 TO 10] existence:"evidence at protein level" fragment:no

QUESTION 17: How many human non-fragment proteins of maximum length 10 do you find in UniProt?
5
length:[1 TO 10] existence:"evidence at protein level" fragment:no AND organism:"Human [9606]"
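
To see how strongly each restriction in Questions 14-17 narrows the result set, the hit counts can be compared step by step. A small sketch using only the counts reported above (release 2017_01); the labels are shorthand for the clause added at each step, and the function name is illustrative.

```python
# Hit counts copied from the answers to Questions 14-17.
steps = [
    ("length:[1 TO 10]", 32090),
    ('+ existence:"evidence at protein level"', 1280),
    ("+ fragment:no", 830),
    ('+ organism:"Human [9606]"', 5),
]

def retained_fractions(counts):
    """Fraction of hits kept by each filter, relative to the previous step."""
    return [after / before for before, after in zip(counts, counts[1:])]

counts = [n for _, n in steps]
fractions = retained_fractions(counts)

for (label, n), frac in zip(steps[1:], fractions):
    print(f"{label:42s} {n:6d} hits ({frac:.2%} of previous step)")
```

The largest relative drops come from requiring evidence at protein level and from restricting to human.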

QUESTION 18: Here they are in FASTA format:

>sp|P01358|GAJU_HUMAN Gastric juice peptide 1 OS=Homo sapiens PE=1 SV=1
LAAGKVEDSD
>sp|P02728|GLEM_HUMAN Erythrocyte membrane glycopeptide OS=Homo sapiens PE=1 SV=1
CEGHSHDHGA
>sp|P02729|GLUR_HUMAN Urine glycopeptide OS=Homo sapiens PE=1 SV=1
CEHSHDGA
>sp|P22103|PNEU_HUMAN Pneumadin OS=Homo sapiens PE=1 SV=1
AGEPKLDAGV
>sp|P01858|TUFT_HUMAN Phagocytosis-stimulating peptide OS=Homo sapiens PE=1 SV=1
TKPR
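
As a quick check, the five records above can be parsed and their lengths confirmed to be at most 10. This is a minimal, dependency-free sketch; `parse_fasta` is an illustrative helper written for this page, not part of any UniProt tool.

```python
fasta = """\
>sp|P01358|GAJU_HUMAN Gastric juice peptide 1 OS=Homo sapiens PE=1 SV=1
LAAGKVEDSD
>sp|P02728|GLEM_HUMAN Erythrocyte membrane glycopeptide OS=Homo sapiens PE=1 SV=1
CEGHSHDHGA
>sp|P02729|GLUR_HUMAN Urine glycopeptide OS=Homo sapiens PE=1 SV=1
CEHSHDGA
>sp|P22103|PNEU_HUMAN Pneumadin OS=Homo sapiens PE=1 SV=1
AGEPKLDAGV
>sp|P01858|TUFT_HUMAN Phagocytosis-stimulating peptide OS=Homo sapiens PE=1 SV=1
TKPR
"""

def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

records = list(parse_fasta(fasta))
# The accession is the second "|"-separated field of a UniProt FASTA header.
lengths = {header.split("|")[1]: len(seq) for header, seq in records}

for accession, n in lengths.items():
    print(accession, n)
```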