<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22111/index.php?action=history&amp;feed=atom&amp;title=Exercise%3A_The_protein_database_UniProt</id>
	<title>Exercise: The protein database UniProt - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22111/index.php?action=history&amp;feed=atom&amp;title=Exercise%3A_The_protein_database_UniProt"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;action=history"/>
	<updated>2026-05-03T20:47:21Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;diff=641&amp;oldid=prev</id>
		<title>Carol: /* The contents of UniProt */</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;diff=641&amp;oldid=prev"/>
		<updated>2025-10-30T12:04:37Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;The contents of UniProt&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 14:04, 30 October 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l108&quot;&gt;Line 108:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 108:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:How long is the signal peptide and the propeptide, respectively?&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:How long is the signal peptide and the propeptide, respectively?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*Under &amp;lt;u&amp;gt;Structure&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Features&amp;lt;/u&amp;gt; shows the secondary structure elements &amp;quot;Helix&amp;quot; (&amp;amp;alpha;-helix), &amp;quot;Beta strand&amp;quot; (part of a &amp;amp;beta;-pleated sheet) or &amp;quot;Turn&amp;quot;. The regions without specified secondary structure are often called &amp;quot;Loop&amp;quot; or &amp;quot;Coil&amp;quot;. &amp;#039;&amp;#039;&amp;#039;CORRECTION 2024&amp;#039;&amp;#039;&amp;#039;: With the latest update of the UniProt interface, you need to go to the top of the window and select &amp;lt;u&amp;gt;Feature viewer&amp;lt;/u&amp;gt; to see the secondary structure annotations! Click &amp;lt;u&amp;gt;Structural features&amp;lt;/u&amp;gt; on the left to see helices, strands, and turns. Click each coloured box to see positions.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*Under &amp;lt;u&amp;gt;Structure&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Features&amp;lt;/u&amp;gt; shows the secondary structure elements &amp;quot;Helix&amp;quot; (&amp;amp;alpha;-helix), &amp;quot;Beta strand&amp;quot; (part of a &amp;amp;beta;-pleated sheet) or &amp;quot;Turn&amp;quot;. The regions without specified secondary structure are often called &amp;quot;Loop&amp;quot; or &amp;quot;Coil&amp;quot;. 
&amp;#039;&amp;#039;&amp;#039;CORRECTION 2024&amp;#039;&amp;#039;&amp;#039;: With the latest update of the UniProt interface, you need to go to the top of the window and select &amp;lt;u&amp;gt;Feature viewer&amp;lt;/u&amp;gt; to see the secondary structure annotations! Click &amp;lt;u&amp;gt;Structural features&amp;lt;/u&amp;gt; on the left to see helices, strands, and turns. Click each coloured box to see positions.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Image:Office-notes-line_drawing.png|30px|left]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Image:Office-notes-line_drawing.png|30px|left]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:&amp;#039;&amp;#039;&amp;#039;QUESTION 2.4:&amp;#039;&amp;#039;&amp;#039;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:&amp;#039;&amp;#039;&amp;#039;QUESTION 2.4:&amp;#039;&amp;#039;&amp;#039;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:Which positions are in &amp;amp;beta;-sheet conformation in insulin?&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:Which positions are in &amp;amp;beta;-sheet conformation in insulin?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;--&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Other databases linked from UniProt===&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===Other databases linked from UniProt===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Carol</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;diff=294&amp;oldid=prev</id>
		<title>Henni: /* The contents of UniProt */</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;diff=294&amp;oldid=prev"/>
		<updated>2024-09-17T13:06:50Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;The contents of UniProt&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 15:06, 17 September 2024&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l108&quot;&gt;Line 108:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 108:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:How long is the signal peptide and the propeptide, respectively?&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;:How long is the signal peptide and the propeptide, respectively?&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*Under &amp;lt;u&amp;gt;Structure&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Features&amp;lt;/u&amp;gt; shows the secondary structure elements &quot;Helix&quot; (&amp;amp;alpha;-helix), &quot;Beta strand&quot; (part of a &amp;amp;beta;-pleated sheet) or &quot;Turn&quot;. The regions without specified secondary structure are often called &quot;Loop&quot; or &quot;Coil&quot;.  &lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;*Under &amp;lt;u&amp;gt;Structure&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Features&amp;lt;/u&amp;gt; shows the secondary structure elements &quot;Helix&quot; (&amp;amp;alpha;-helix), &quot;Beta strand&quot; (part of a &amp;amp;beta;-pleated sheet) or &quot;Turn&quot;. The regions without specified secondary structure are often called &quot;Loop&quot; or &quot;Coil&quot;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;. &#039;&#039;&#039;CORRECTION 2024&#039;&#039;&#039;: With the latest update of the UniProt interface, you need to go to the top of the window and select &amp;lt;u&amp;gt;Feature viewer&amp;lt;/u&amp;gt; to see the secondary structure annotations! Click &amp;lt;u&amp;gt;Structural features&amp;lt;/u&amp;gt; on the left to see helices, strands, and turns. Click each coloured box to see positions&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Image:Office-notes-line_drawing.png|30px|left]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Image:Office-notes-line_drawing.png|30px|left]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Henni</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;diff=76&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;Exercise written by: Henrik Nielsen - updated by Morten Nielsen and Rasmus Wernersson  UniProt logo as of 2010 - source: http://www.uniprot.org __TOC__ In this exercise, we shall extract information from the protein database, Uniprot. This database is administrated in collaboration between [http://www.isb-sib.ch/ Swiss Institute of Bioinformatics (SIB)], [http://www.ebi.ac.uk/ European Bioinformatics Institute (EBI)], England, and [ht...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=Exercise:_The_protein_database_UniProt&amp;diff=76&amp;oldid=prev"/>
		<updated>2024-03-13T16:03:59Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Exercise written by: Henrik Nielsen - updated by Morten Nielsen and Rasmus Wernersson  &lt;a href=&quot;/22111/index.php/File:Uniprotlogo.gif&quot; title=&quot;File:Uniprotlogo.gif&quot;&gt;right|frame|UniProt logo as of 2010 - source: http://www.uniprot.org&lt;/a&gt; __TOC__ In this exercise, we shall extract information from the protein database, Uniprot. This database is administrated in collaboration between [http://www.isb-sib.ch/ Swiss Institute of Bioinformatics (SIB)], [http://www.ebi.ac.uk/ European Bioinformatics Institute (EBI)], England, and [ht...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Exercise written by: Henrik Nielsen - updated by Morten Nielsen and Rasmus Wernersson&lt;br /&gt;
&lt;br /&gt;
[[File:Uniprotlogo.gif|right|frame|UniProt logo as of 2010 - source: http://www.uniprot.org]]&lt;br /&gt;
__TOC__&lt;br /&gt;
In this exercise, we shall extract information from the protein database UniProt. This database is administered as a collaboration between the [http://www.isb-sib.ch/ Swiss Institute of Bioinformatics (SIB)], the [http://www.ebi.ac.uk/ European Bioinformatics Institute (EBI)], England, and [http://www.georgetown.edu/ Georgetown University], Washington DC, USA.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
UniProt, http://www.uniprot.org/,  consists of three parts:&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;UniProt Knowledge-base&amp;#039;&amp;#039;&amp;#039; (UniProtKB) &lt;br /&gt;
**protein sequences with annotation and references&lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;UniProt Reference Clusters&amp;#039;&amp;#039;&amp;#039; (UniRef) &lt;br /&gt;
**homology-reduced database, where similar sequences (having a certain percentage identity) are merged into clusters, each with a representative sequence &lt;br /&gt;
*&amp;#039;&amp;#039;&amp;#039;UniProt Archive&amp;#039;&amp;#039;&amp;#039; (UniParc) &lt;br /&gt;
**an archive containing all versions of UniProt without annotations&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Image:Emblem-important_tiny.png|left]]&lt;br /&gt;
Of these databases, &amp;#039;&amp;#039;&amp;#039;UniProt Knowledge-base is the most useful&amp;#039;&amp;#039;&amp;#039;, and this is the database we shall be using today. UniProt Knowledge-base consists of two parts:&lt;br /&gt;
*UniProtKB/&amp;#039;&amp;#039;&amp;#039;Swiss-Prot&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
**a manually annotated (reviewed) protein-database.&lt;br /&gt;
*UniProtKB/&amp;#039;&amp;#039;&amp;#039;TrEMBL&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
**a computer-annotated supplement to Swiss-Prot that contains all translations of EMBL nucleotide sequences not yet included in Swiss-Prot.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Simple text mining==&lt;br /&gt;
First, we will find some UniProt entries using simple text mining. You are supposed to find the entry for human insulin.&lt;br /&gt;
&lt;br /&gt;
*Open the UniProt home-page https://www.uniprot.org/&lt;br /&gt;
*Type &amp;#039;&amp;#039;&amp;#039;&amp;lt;tt&amp;gt;human insulin&amp;lt;/tt&amp;gt;&amp;#039;&amp;#039;&amp;#039; in the search field at the top of the page. Leave the search menu on &amp;quot;&amp;lt;u&amp;gt;UniProtKB&amp;lt;/u&amp;gt;&amp;quot;, which is the default. Press Enter or click the &amp;lt;u&amp;gt;Search&amp;lt;/u&amp;gt; button.&lt;br /&gt;
*If you are new to UniProt, you will be asked whether you want to view your results as &amp;quot;Cards&amp;quot; or &amp;quot;Table&amp;quot;. Choose &amp;quot;Table&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 1.1:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
:#How many hits do you find? (tip: See the number above the results list)&lt;br /&gt;
:#How many of these hits are from Swiss-Prot? (tip: See under &amp;quot;&amp;lt;u&amp;gt;Reviewed&amp;lt;/u&amp;gt;&amp;quot; at the top left)&lt;br /&gt;
:#Can you identify the correct hit (&amp;#039;&amp;#039;i.e.&amp;#039;&amp;#039;, see which one is actually human insulin and not something else)? If yes, write down its Accession code and Entry name (also called ID).&lt;br /&gt;
&lt;br /&gt;
In this case, it was relatively easy to spot the correct hit, but sometimes it is more difficult. If you do not identify the correct hit immediately, it will often help to narrow down the search, and that is exactly what we ask you to do in the next four questions. &lt;br /&gt;
&lt;br /&gt;
The first step is searching for proteins that actually come from the &amp;#039;&amp;#039;organism&amp;#039;&amp;#039; &amp;quot;human&amp;quot; and are &amp;#039;&amp;#039;named&amp;#039;&amp;#039; something containing the word &amp;quot;insulin&amp;quot;, as opposed to just containing the words &amp;quot;human&amp;quot; and &amp;quot;insulin&amp;quot; somewhere in the entry. &lt;br /&gt;
&amp;lt;!-- This can be done very easily: To the left of the results list under &amp;lt;u&amp;gt;Search terms&amp;lt;/u&amp;gt; you find a list of links that allow you to restrict the search to specific fields (you may have to scroll down a bit). --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
On the left, you can see a list of &amp;quot;Model organisms&amp;quot;. Try to click &amp;quot;Human&amp;quot;.&lt;br /&gt;
&amp;lt;!-- *Under &amp;lt;u&amp;gt;Filter &amp;quot;human&amp;quot; as:&amp;lt;/u&amp;gt; click on: &amp;lt;u&amp;gt;organism&amp;lt;/u&amp;gt;. --&amp;gt;&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 1.2:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many hits are now left? How many of these are from Swiss-Prot? &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- *Under &amp;lt;u&amp;gt;Filter &amp;quot;insulin&amp;quot; as:&amp;lt;/u&amp;gt; click on: &amp;lt;u&amp;gt;protein name&amp;lt;/u&amp;gt;.   --&amp;gt;&lt;br /&gt;
However, to really solve the problem, we have to enter Advanced mode. Click on &amp;lt;u&amp;gt;Advanced&amp;lt;/u&amp;gt; in the right part of the search field. Search for &amp;lt;tt&amp;gt;human&amp;lt;/tt&amp;gt; in the &amp;lt;u&amp;gt;Organism [OS]&amp;lt;/u&amp;gt; field, then click &amp;lt;u&amp;gt;Add field&amp;lt;/u&amp;gt; and search for &amp;lt;tt&amp;gt;insulin&amp;lt;/tt&amp;gt; in the &amp;lt;u&amp;gt;Protein Name [DE]&amp;lt;/u&amp;gt; field.&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 1.3:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
:How many hits are now left? How many of these are from Swiss-Prot? And what has the search string in the text box at the top of the page now turned into? &lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
Note that all selections made with the mouse are shown in text format in the search field at the top of the page. It is possible to edit the search criteria manually in this field to make them broader or more narrow. &lt;br /&gt;
*Try for instance to exclude proteins that are not insulin, but only insulin-like. You do this by adding the following text in the search field: &amp;#039;&amp;#039;&amp;#039;&amp;lt;tt&amp;gt;NOT name:insulin-like&amp;lt;/tt&amp;gt;&amp;#039;&amp;#039;&amp;#039; and click on the &amp;lt;u&amp;gt;Search&amp;lt;/u&amp;gt; button. &lt;br /&gt;
--&amp;gt;&lt;br /&gt;
Now, you should exclude proteins that are not insulin, but only insulin-like. Open the &amp;lt;u&amp;gt;Advanced&amp;lt;/u&amp;gt; menu again, add a field, make sure it is combined by &amp;lt;u&amp;gt;NOT&amp;lt;/u&amp;gt; instead of &amp;lt;u&amp;gt;AND&amp;lt;/u&amp;gt;, and remove hits that have &amp;lt;tt&amp;gt;insulin-like&amp;lt;/tt&amp;gt; in the protein name.&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 1.4:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
:How many hits are now left? How many of these are from Swiss-Prot? And what is the search string?&lt;br /&gt;
&lt;br /&gt;
Note that you can also edit the search string directly, instead of going through the Advanced menu every time.&lt;br /&gt;
*Try now to exclude proteins that are insulin receptors (or substrates for insulin receptors). &lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 1.5:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:#How did you do this? &lt;br /&gt;
:#How many hits are now left? How many of these are from Swiss-Prot?&lt;br /&gt;
&lt;br /&gt;
==The contents of UniProt==&lt;br /&gt;
&lt;br /&gt;
We shall now see what information is contained in a UniProt entry, and what further information is available as links in each entry.&lt;br /&gt;
&lt;br /&gt;
Click on the accession code or ID for insulin. This will take you to the insulin entry in the UniProtKB/Swiss-Prot database. Spend some time getting an overview of the page and the information it contains.&lt;br /&gt;
&lt;br /&gt;
*Note that you can click on the headings on the left side of the page to scroll to different sections of the page. Try it!&lt;br /&gt;
&lt;br /&gt;
*Note also that every time there is a small &amp;quot;&amp;lt;u&amp;gt;i&amp;lt;/u&amp;gt;&amp;quot; after a term on the page, you can click it to get information about the term. Try it!&lt;br /&gt;
&lt;br /&gt;
Now click on &amp;lt;u&amp;gt;Publications&amp;lt;/u&amp;gt; in the top part of the window. Click on &amp;lt;u&amp;gt;UniProtKB/Swiss-Prot&amp;lt;/u&amp;gt; under &amp;lt;u&amp;gt;Source&amp;lt;/u&amp;gt; to show only those references that are part of the entry and exclude those that are &amp;quot;computationally mapped&amp;quot;. Note that each reference indicates what it has contributed (&amp;quot;&amp;lt;u&amp;gt;Cited for&amp;lt;/u&amp;gt;&amp;quot;). You can get to the PubMed literature database at NCBI by clicking on the link &amp;quot;&amp;lt;u&amp;gt;PubMed&amp;lt;/u&amp;gt;&amp;quot; for a reference; try this. The abstract of a publication can be read there (or directly in UniProt using the &amp;quot;&amp;lt;u&amp;gt;View abstract&amp;lt;/u&amp;gt;&amp;quot; link) if the work is an actual published article and not a &amp;quot;direct submission&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 2.1:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:#How many references are there in the insulin entry? &lt;br /&gt;
:#Why do you think insulin is such a highly investigated protein? (Hint: see other sections of the entry, &amp;#039;&amp;#039;e.g.&amp;#039;&amp;#039; &amp;lt;u&amp;gt;Function&amp;lt;/u&amp;gt; and &amp;lt;u&amp;gt;Disease &amp;amp; Drugs&amp;lt;/u&amp;gt;, especially the subsections &amp;lt;u&amp;gt;Involvement in disease&amp;lt;/u&amp;gt; and &amp;lt;u&amp;gt;Pharmaceutical&amp;lt;/u&amp;gt;)&lt;br /&gt;
&lt;br /&gt;
*Scroll back to &amp;lt;u&amp;gt;Function&amp;lt;/u&amp;gt; and read the free-text description at the top of the section. Also have a look at the controlled vocabulary annotations: &amp;quot;Gene Ontology&amp;quot; (&amp;lt;u&amp;gt;GO&amp;lt;/u&amp;gt;) and &amp;lt;u&amp;gt;Keywords&amp;lt;/u&amp;gt;. Note that both of these are split into two different aspects: &amp;lt;u&amp;gt;Molecular function&amp;lt;/u&amp;gt; and &amp;lt;u&amp;gt;Biological process&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
*Now scroll to &amp;lt;u&amp;gt;Subcellular Location&amp;lt;/u&amp;gt; and read what is written there. Note that you find another set of &amp;quot;Gene Ontology&amp;quot; (&amp;lt;u&amp;gt;GO&amp;lt;/u&amp;gt;) and &amp;lt;u&amp;gt;Keywords&amp;lt;/u&amp;gt; annotations here; this time labelled &amp;lt;u&amp;gt;Cellular component&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 2.2:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:#Where in the cell / outside the cell do you find insulin? &lt;br /&gt;
:#Why do you think it is found there? (Hint: consider the function)&lt;br /&gt;
&lt;br /&gt;
Just like in GenBank, a UniProt entry has a &amp;#039;&amp;#039;Feature Table&amp;#039;&amp;#039; containing annotations that are coupled to specific parts of the sequence. In the default view, the Feature Table is not so easy to spot, since it is split up under different sections corresponding to the biological significance of the various annotations. However, in the top part of the window you can click on &amp;lt;u&amp;gt;Feature viewer&amp;lt;/u&amp;gt;, which shows the feature table information in a graphical form. Try it. Then click on &amp;lt;u&amp;gt;Molecule processing&amp;lt;/u&amp;gt; to show the signal peptide and the propeptide.&lt;br /&gt;
&lt;br /&gt;
Now switch back to the default (&amp;lt;u&amp;gt;Entry&amp;lt;/u&amp;gt;) view. In the following, you will see some examples of Feature Table annotations.&lt;br /&gt;
&lt;br /&gt;
*Under &amp;lt;u&amp;gt;Disease &amp;amp; Drugs&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Variants&amp;lt;/u&amp;gt; lists the variants (mutations) of insulin that have been described in the literature. Under the heading &amp;lt;u&amp;gt;Change&amp;lt;/u&amp;gt;, it is indicated which amino acid is changed into which other amino acid. If the variant is known to be associated with a disease, this is indicated under the heading &amp;lt;u&amp;gt;Description&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
*Under &amp;lt;u&amp;gt;PTM/Processing&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Features&amp;lt;/u&amp;gt; shows that insulin has both a signal peptide and a propeptide. These are both cleaved off before secretion. The mature insulin (the A and B chains) is therefore much shorter than the sequence shown under &amp;lt;u&amp;gt;Sequences&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 2.3:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
:How long are the signal peptide and the propeptide, respectively?&lt;br /&gt;
&lt;br /&gt;
*Under &amp;lt;u&amp;gt;Structure&amp;lt;/u&amp;gt;, the subsection &amp;lt;u&amp;gt;Features&amp;lt;/u&amp;gt; shows the secondary structure elements &amp;quot;Helix&amp;quot; (&amp;amp;alpha;-helix), &amp;quot;Beta strand&amp;quot; (part of a &amp;amp;beta;-pleated sheet) or &amp;quot;Turn&amp;quot;. The regions without specified secondary structure are often called &amp;quot;Loop&amp;quot; or &amp;quot;Coil&amp;quot;. &lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 2.4:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
:Which positions are in &amp;amp;beta;-sheet conformation in insulin?&lt;br /&gt;
&lt;br /&gt;
===Other databases linked from UniProt===&lt;br /&gt;
&lt;br /&gt;
UniProt has many useful links to other databases. In the graphical view, the cross-references are spread among several different headings, just like the feature table is. &lt;br /&gt;
&lt;br /&gt;
Under the heading &amp;lt;u&amp;gt;Sequence &amp;amp; Isoform&amp;lt;/u&amp;gt;, there is a sub-heading named &amp;lt;u&amp;gt;Sequence databases&amp;lt;/u&amp;gt;. Here you can find, e.g., links to nucleotide sequences in the databases EMBL / GenBank / DDBJ. Try clicking one of the GenBank links marked &amp;quot;Genomic DNA&amp;quot;; that should take you to a page that looks like something you saw last week.&lt;br /&gt;
&lt;br /&gt;
Under the heading &amp;lt;u&amp;gt;Structure&amp;lt;/u&amp;gt;, there is an interactive window showing a three-dimensional structure of insulin. Note that you can rotate the structure with your mouse. This structure is not actually part of UniProt itself; it is a cross-link to the protein structure database PDB. Below the interactive window, you can see the actual cross-links to PDB. Note that PDB is not one single database: just as was the case for the nucleotide databases, there is a European version (PDBe), an American version (RCSB-PDB), and a Japanese version (PDBj), but luckily, they contain the same data. We will work with the American version of PDB later in the course. As you can see, there are many PDB structures of insulin; in other words, the 3D structure of insulin has been determined several times. &lt;br /&gt;
&lt;br /&gt;
Under the heading &amp;lt;u&amp;gt;Family &amp;amp; Domains&amp;lt;/u&amp;gt;, there is a subsection named &amp;lt;u&amp;gt;Family and domain databases&amp;lt;/u&amp;gt;. It has links to databases containing proteins that are similar (protein families). These have been collected using various techniques that you will hear about later in the course (multiple alignment). In some cases, the proteins are similar only in smaller parts (domains) but not in other parts, and in some cases the databases can tell which parts of the actual protein are known in other species. Some large proteins (not small ones like insulin) can contain several different parts (domains), each with its own evolutionary history. The most important of these databases is InterPro, because it collects the information from most of the other databases. Try clicking one of the InterPro links. This will take you to the InterPro page with lots of information about the protein family that insulin belongs to.&lt;br /&gt;
&lt;br /&gt;
===Text format===&lt;br /&gt;
&lt;br /&gt;
Until now, we have been working with the graphical user interface to UniProt. However, all the information is also available in plain text format, and that&amp;#039;s what you will be working with if you are going to analyze larger amounts of UniProt data later in your studies. For now, let&amp;#039;s just have a look at it. &lt;br /&gt;
&lt;br /&gt;
Scroll to the top of the Human Insulin page and find the menu labeled &amp;lt;u&amp;gt;Download&amp;lt;/u&amp;gt;. It looks like this: [[File:Download.png]]. Click it, and then &amp;#039;&amp;#039;right-click&amp;#039;&amp;#039; the option &amp;lt;u&amp;gt;Text&amp;lt;/u&amp;gt; and open it in a new tab. What you see here basically contains all the information you have seen in the graphical interface.&lt;br /&gt;
&lt;br /&gt;
Scroll through the plain text file and see if you can find the same information that you just found in the graphical interface. Note that every line starts with a two-letter code specifying the type of the information in the line. Here are some examples:&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;ID&amp;#039;&amp;#039;&amp;#039;: Entry name (ID). There is only one ID.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;AC&amp;#039;&amp;#039;&amp;#039;: Accession code. There may be more than one.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;DE&amp;#039;&amp;#039;&amp;#039;: Description (protein names).&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;GN&amp;#039;&amp;#039;&amp;#039;: Gene name.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;OS&amp;#039;&amp;#039;&amp;#039;: Organism/Species.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;OC&amp;#039;&amp;#039;&amp;#039;: Organism Classification.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;OX&amp;#039;&amp;#039;&amp;#039;: TaxID (as defined in the NCBI Taxonomy database).&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;RN&amp;#039;&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;RP&amp;#039;&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;RX&amp;#039;&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;RA&amp;#039;&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;RT&amp;#039;&amp;#039;&amp;#039;, &amp;#039;&amp;#039;&amp;#039;RL&amp;#039;&amp;#039;&amp;#039;: References.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;CC&amp;#039;&amp;#039;&amp;#039;: Comments (annotations pertaining to the whole protein).&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;DR&amp;#039;&amp;#039;&amp;#039;: Cross-references to other databases.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;KW&amp;#039;&amp;#039;&amp;#039;: Keywords.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;FT&amp;#039;&amp;#039;&amp;#039;: Feature Table (annotations pertaining to specified parts of the sequence).&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;SQ&amp;#039;&amp;#039;&amp;#039;: Sequence header line.&lt;br /&gt;
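&lt;br /&gt;
The line codes make the text format easy to process with a short script. The following is only a minimal sketch in Python, assuming a made-up and heavily shortened entry: the names EXAMPLE_HUMAN, X00001 and X00002 are hypothetical, and so are the helper functions. It groups the lines by their two-letter code and computes feature lengths from the FT lines. Real entries also contain qualifier lines and uncertain feature positions, which this sketch does not handle.&lt;br /&gt;

```python
from collections import defaultdict

# Hypothetical, heavily shortened entry used only for illustration.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         30 AA.
AC   X00001; X00002;
DE   RecName: Full=Example protein;
GN   Name=EXA;
OS   Homo sapiens (Human).
OX   NCBI_TaxID=9606;
KW   Signal.
FT   SIGNAL          1..5
FT   CHAIN           6..30
SQ   SEQUENCE   30 AA;
     MKTAYIAKQR QISFVKSHFS RQLEERLGLI
//
"""


def group_by_line_code(text):
    """Collect the content of each line under its two-letter line code."""
    groups = defaultdict(list)
    for line in text.splitlines():
        code = line[:2]
        # Sequence lines start with spaces and "//" ends the entry; skip both.
        if code.strip() == "" or code == "//":
            continue
        # The content starts after the code and three spaces.
        groups[code].append(line[5:].rstrip())
    return dict(groups)


def accessions(groups):
    """Split the AC lines into individual accession codes."""
    joined = " ".join(groups.get("AC", []))
    return [part.strip() for part in joined.split(";") if part.strip()]


def feature_length(ft_line):
    """Length of a feature written as 'start..end' (both ends inclusive)."""
    location = ft_line.split()[1]
    start, end = location.split("..")
    return int(end) - int(start) + 1


groups = group_by_line_code(SAMPLE_ENTRY)
print(groups["ID"][0].split()[0])   # entry name
print(accessions(groups))           # all accession codes
for ft in groups["FT"]:
    print(ft.split()[0], feature_length(ft))
```

Note that the length of a feature is end - start + 1, because both the first and the last position belong to the feature.&lt;br /&gt;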
&lt;br /&gt;
==Advanced search==&lt;br /&gt;
&lt;br /&gt;
The UniProt interface allows you to use most of the fields in the database for searching, not only the fields like name and organism, as we did previously, but also the functional and structural annotations. We shall now try a few of these. &lt;br /&gt;
&lt;br /&gt;
* Go back to UniProt&amp;#039;s main page, http://www.uniprot.org/. &amp;lt;!-- Go back to the main page of UniProt&amp;#039;s beta website, https://beta.uniprot.org/ .--&amp;gt; &amp;#039;&amp;#039;&amp;#039;Important:&amp;#039;&amp;#039;&amp;#039; If the search string from the previous search is still shown in the search field, clear it. Then click &amp;lt;u&amp;gt;Advanced&amp;lt;/u&amp;gt; to the right of the search field. This brings up a box with a new interface.&lt;br /&gt;
&lt;br /&gt;
* Now we will find out how many proteins have signal peptides (just like insulin has). In the drop-down menu that appears in the box, select &amp;lt;u&amp;gt;PTM/Processing&amp;lt;/u&amp;gt;, then select &amp;lt;u&amp;gt;Molecule Processing&amp;lt;/u&amp;gt;, then select &amp;lt;u&amp;gt;Signal peptide&amp;lt;/u&amp;gt;. In the empty field that now appears to the right of the word &amp;lt;u&amp;gt;Signal&amp;lt;/u&amp;gt;, type a &amp;lt;tt&amp;gt;*&amp;lt;/tt&amp;gt; (otherwise, it will not work). Click the &amp;lt;u&amp;gt;Search&amp;lt;/u&amp;gt; button.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.1:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins did you find, how many of them are from Swiss-Prot, and what was the search string (the text that appeared in the search field)?&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;Evidence:&amp;#039;&amp;#039; The proteins we find in this way include proteins that are &amp;#039;&amp;#039;predicted&amp;#039;&amp;#039; to have signal peptides, without necessarily having any experimental evidence for the signal peptides. We will now limit the search to experimentally confirmed signal peptides. Click &amp;lt;u&amp;gt;Advanced&amp;lt;/u&amp;gt; again (without erasing your previous search) and change the &amp;lt;u&amp;gt;Evidence&amp;lt;/u&amp;gt; menu to &amp;lt;u&amp;gt;Any experimental assertion&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.2:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins do you find now, how many of them are from Swiss-Prot, and what has the search string changed into? &lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;Combining fields:&amp;#039;&amp;#039; How many experimentally confirmed signal peptides are found in humans? Click &amp;lt;u&amp;gt;Advanced&amp;lt;/u&amp;gt; again and click &amp;lt;u&amp;gt;Add field&amp;lt;/u&amp;gt; to get a second search line. Leave the menu to the left on &amp;lt;u&amp;gt;AND&amp;lt;/u&amp;gt;, select &amp;lt;u&amp;gt;Organism [OS]&amp;lt;/u&amp;gt; in the drop-down menu, type &amp;lt;tt&amp;gt;human&amp;lt;/tt&amp;gt; in the field &amp;lt;u&amp;gt;Term&amp;lt;/u&amp;gt;, accept the suggestion &amp;quot;Homo sapiens (Human) [9606]&amp;quot; and click the &amp;lt;u&amp;gt;Search&amp;lt;/u&amp;gt; button.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.3:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins do you find now, and what is the search string? (Note that you can always perform the search by editing the text in the search field; however, to do this you need to know the names of the fields.)&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Important note&amp;#039;&amp;#039;&amp;#039; about the organism field: when you type some letters, a drop-down list with suggestions will come up. Each has a number in brackets — this is the TaxID, which you can also find in [http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi the NCBI Taxonomy Browser]. If you search for &amp;#039;&amp;#039;e.g.&amp;#039;&amp;#039; Human proteins, it is a good idea to include the TaxID; if you omit it and just write &amp;quot;human&amp;quot;, you will also find proteins from organisms like Human immunodeficiency virus (try it!). &lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
=== About strains and subspecies ===&lt;br /&gt;
Let us now try something different. If you search for proteins from a microbial species, you may run into trouble, because each subspecies or strain has its own TaxID, and you probably want all possible strains. Let&amp;#039;s try an example (&amp;#039;&amp;#039;&amp;#039;first, clear the previous search&amp;#039;&amp;#039;&amp;#039;): Say you want all proteins from the bacterium &amp;#039;&amp;#039;Bacillus subtilis&amp;#039;&amp;#039; — a very important production organism in biotechnology. Try to type &amp;lt;tt&amp;gt;Bacillus subtilis&amp;lt;/tt&amp;gt; in the &amp;lt;u&amp;gt;Organism [OS]&amp;lt;/u&amp;gt; field: you will see a suggestion named &amp;quot;Bacillus subtilis [1423]&amp;quot; – accept that.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.4:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins are there in UniProt from &amp;#039;&amp;#039;Bacillus subtilis&amp;#039;&amp;#039; with the default TaxID [1423]? How many of these are from Swiss-Prot? And what is the search string?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- Now note the line above the results that says &amp;quot;Expand search &amp;quot;Bacillus subtilis [1423]&amp;quot; to include lower taxonomic ranks&amp;quot; – click it. --&amp;gt;&lt;br /&gt;
The number of entries in Swiss-Prot may seem low for such a well-studied organism. In addition, you may note that there is a link next to the total number of results saying &amp;quot;&amp;lt;u&amp;gt;or expand search to &amp;quot;1423&amp;quot; to include lower taxonomic ranks&amp;lt;/u&amp;gt;&amp;quot;. Click it. &lt;br /&gt;
&amp;lt;!-- Now do the same thing, but in the field &amp;lt;u&amp;gt;Taxonomy [OC]&amp;lt;/u&amp;gt; instead. --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.5:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins are there in UniProt from &amp;#039;&amp;#039;Bacillus subtilis&amp;#039;&amp;#039; in total (all strains and subspecies)? How many of these are from Swiss-Prot? And what is the search string?&lt;br /&gt;
&lt;br /&gt;
[[Image:Emblem-important_tiny.png‎|left]] In conclusion, use the field &amp;lt;u&amp;gt;Taxonomy [OC]&amp;lt;/u&amp;gt; instead of &amp;lt;u&amp;gt;Organism [OS]&amp;lt;/u&amp;gt; when working with microbial species where you want all strains.&lt;br /&gt;
&lt;br /&gt;
&amp;amp;nbsp;&lt;br /&gt;
&lt;br /&gt;
===Searching for short proteins===&lt;br /&gt;
&lt;br /&gt;
*&amp;#039;&amp;#039;Numerical field:&amp;#039;&amp;#039; Now we will try to answer a completely different question: Which extremely short proteins are present in UniProt? Clear the previous search. In the advanced drop-down menu, select &amp;lt;u&amp;gt;Sequence&amp;lt;/u&amp;gt; and then &amp;lt;u&amp;gt;Sequence length&amp;lt;/u&amp;gt;. Now two new fields appear where you can define the lower and upper limits for the search. Type &amp;lt;tt&amp;gt;1&amp;lt;/tt&amp;gt; and &amp;lt;tt&amp;gt;10&amp;lt;/tt&amp;gt; and search. &amp;#039;&amp;#039;&amp;#039;Note:&amp;#039;&amp;#039;&amp;#039; in your answers to the questions below, include the search string just like you did in the questions above!&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.6:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins of maximum length 10 do you find?&lt;br /&gt;
&lt;br /&gt;
*Extremely short proteins are often artifacts: sequences translated directly from nucleotide data with no evidence that they are actually protein-coding. Limit your search to proteins that actually have evidence for their existence at the protein level (add a field, set the drop-down menu to &amp;lt;u&amp;gt;Protein existence [PE]&amp;lt;/u&amp;gt; and select &amp;lt;u&amp;gt;Evidence at protein level&amp;lt;/u&amp;gt;). &lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.7:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins are now left?&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
&amp;lt;blockquote style=&amp;quot;background-color: lightyellow; border: solid thin grey;&amp;quot;&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;NOTE (February 2020):&amp;#039;&amp;#039;&amp;#039; UniProt currently has a bug related to the &amp;lt;u&amp;gt;Protein existence [PE]&amp;lt;/u&amp;gt; field. When you have made a search using &amp;lt;u&amp;gt;Protein existence&amp;lt;/u&amp;gt; and then click &amp;lt;u&amp;gt;Advanced&amp;lt;/u&amp;gt; again, it will convert the search to a search for &amp;quot;Evidence at protein level [1]&amp;quot; in &amp;lt;u&amp;gt;All&amp;lt;/u&amp;gt; fields. This will produce unexpected results. Therefore, you have to convert it &amp;#039;&amp;#039;manually&amp;#039;&amp;#039; back to a &amp;lt;u&amp;gt;Protein existence [PE]&amp;lt;/u&amp;gt; search before adding another criterion.&amp;lt;br&amp;gt; The bug has been reported to UniProt.&lt;br /&gt;
&amp;lt;/blockquote&amp;gt; &lt;br /&gt;
--&amp;gt;&lt;br /&gt;
*A large fraction of the proteins identified in this way are fragments. Try to exclude fragments from the search. Add a field. In the drop-down menu, choose &amp;lt;u&amp;gt;Sequence&amp;lt;/u&amp;gt;, then &amp;lt;u&amp;gt;Fragment&amp;lt;/u&amp;gt;, then &amp;lt;u&amp;gt;No&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.8:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many proteins are now left?&lt;br /&gt;
&lt;br /&gt;
*And how many of these proteins are found in humans? Add an organism field as before.&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.9:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:How many human non-fragment proteins of maximum length 10 do you find in UniProt?&lt;br /&gt;
&lt;br /&gt;
*Finally, you can save the results of your search. First, sort them by length by clicking on the header of the length column. Then, click on &amp;lt;u&amp;gt;Download&amp;lt;/u&amp;gt; above the list of results. You can now save the search results in the format you prefer (try &amp;lt;u&amp;gt;FASTA (canonical)&amp;lt;/u&amp;gt; and click &amp;lt;u&amp;gt;Preview&amp;lt;/u&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]]&lt;br /&gt;
:&amp;#039;&amp;#039;&amp;#039;QUESTION 3.10:&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
:Copy the FASTA sequences to your report.&lt;br /&gt;
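&lt;br /&gt;
If you save the FASTA output to a file, you can also check the lengths with a script. The following is only a minimal sketch in Python, assuming made-up records: the accession codes, entry names and sequences are hypothetical, and the function read_fasta is our own helper, not part of any library. It reads FASTA-formatted text, sorts the records by sequence length and prints the accession code taken from the header line.&lt;br /&gt;

```python
# Hypothetical FASTA text; the identifiers and sequences are made up.
SAMPLE_FASTA = """\
>sp|X00001|EXA_HUMAN Example peptide A OS=Homo sapiens OX=9606
MKTAYIAK
>sp|X00002|EXB_HUMAN Example peptide B OS=Homo sapiens OX=9606
GLY
>tr|X00003|X00003_HUMAN Example peptide C OS=Homo sapiens OX=9606
ACDEFGHIKL
"""


def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            # A sequence may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Sort by sequence length, shortest first.
records = sorted(read_fasta(SAMPLE_FASTA), key=lambda rec: len(rec[1]))
for header, seq in records:
    # The header has the form database|accession|entry name, then free text.
    accession = header.split("|")[1]
    print(accession, len(seq), seq)
```

In the header line, &amp;quot;sp&amp;quot; marks an entry from Swiss-Prot and &amp;quot;tr&amp;quot; an entry from TrEMBL.&lt;br /&gt;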
&lt;br /&gt;
== On your own ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Image:Office-notes-line_drawing.png|30px|left]] &amp;#039;&amp;#039;&amp;#039;QUESTION 4&amp;#039;&amp;#039;&amp;#039;: Now that you are proficient in UniProt searches, try the following:&lt;br /&gt;
&lt;br /&gt;
(As always, remember to write your search string in the answer).&lt;br /&gt;
# Find out how many proteins from &amp;#039;&amp;#039;Escherichia coli&amp;#039;&amp;#039; (all strains) there are in UniProt.&lt;br /&gt;
# How many of these are from the notorious pathogenic serotype [https://en.wikipedia.org/wiki/Escherichia_coli_O157:H7 O157:H7] (including its sub-strains)?&lt;br /&gt;
# Find insulin from as many organisms as possible, without including entries that are not insulin (&amp;#039;&amp;#039;&amp;#039;Hint&amp;#039;&amp;#039;&amp;#039;: If you attempt to do this with the Protein Name field only, it will require an unwieldy amount of kill-words. Therefore, take the gene name into account).&lt;br /&gt;
# Find alpha-globin (the alpha subunit of hemoglobin) from as many ruminants as possible (see the GenBank exercise).&lt;br /&gt;
# Find alpha-A globin and alpha-D globin from &amp;#039;&amp;#039;Columba livia&amp;#039;&amp;#039; (&amp;#039;&amp;#039;&amp;#039;Hint&amp;#039;&amp;#039;&amp;#039;: You can use a &amp;quot;*&amp;quot; to perform the search with one search string).&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>