<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22111/index.php?action=history&amp;feed=atom&amp;title=ExSeqLogosAnswers</id>
	<title>ExSeqLogosAnswers - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22111/index.php?action=history&amp;feed=atom&amp;title=ExSeqLogosAnswers"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=ExSeqLogosAnswers&amp;action=history"/>
	<updated>2026-04-14T05:20:47Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22111/index.php?title=ExSeqLogosAnswers&amp;diff=141&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;Answers to the Sequence logo exercise  &#039;&#039;&#039;Written by:&#039;&#039;&#039; [http://www.dtu.dk/service/telefonbog/person?id=18103&amp;cpid=214039&amp;tab=2&amp;qt=dtupublicationquery Rasmus Wernersson], April 2015 (latest update: April 2018)  = Q1 = Image:Q1_logo.png  * The &quot;GT&quot; consensus site is clearly seen (completely conserved: 2 bits of information), and it appears that there is also some signal on the exon side (preference for &quot;G&quot; on the position before) and on the intron side...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=ExSeqLogosAnswers&amp;diff=141&amp;oldid=prev"/>
		<updated>2024-03-15T09:31:26Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Answers to the &lt;a href=&quot;/22111/index.php/ExSeqLogos&quot; title=&quot;ExSeqLogos&quot;&gt;Sequence logo exercise&lt;/a&gt;  &amp;#039;&amp;#039;&amp;#039;Written by:&amp;#039;&amp;#039;&amp;#039; [http://www.dtu.dk/service/telefonbog/person?id=18103&amp;amp;cpid=214039&amp;amp;tab=2&amp;amp;qt=dtupublicationquery Rasmus Wernersson], April 2015 (latest update: April 2018)  = Q1 = &lt;a href=&quot;/22111/index.php/File:Q1_logo.png&quot; title=&quot;File:Q1 logo.png&quot;&gt;Image:Q1_logo.png&lt;/a&gt;  * The &amp;quot;GT&amp;quot; consensus site is clearly seen (completely conserved: 2 bits of information), and it appears that there is also some signal on the exon side (preference for &amp;quot;G&amp;quot; on the position before) and on the intron side...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Answers to the [[ExSeqLogos|Sequence logo exercise]]&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Written by:&amp;#039;&amp;#039;&amp;#039; [http://www.dtu.dk/service/telefonbog/person?id=18103&amp;amp;cpid=214039&amp;amp;tab=2&amp;amp;qt=dtupublicationquery Rasmus Wernersson], April 2015 (latest update: April 2018)&lt;br /&gt;
&lt;br /&gt;
= Q1 =&lt;br /&gt;
[[Image:Q1_logo.png]]&lt;br /&gt;
&lt;br /&gt;
* The &amp;quot;GT&amp;quot; consensus site is clearly seen (completely conserved: 2 bits of information), and it appears that there is also some signal on the exon side (preference for &amp;quot;G&amp;quot; on the position before) and on the intron side.&lt;br /&gt;
* The intron starts at position 11, so positions 1-10 are EXON and positions 11-20 are INTRON.&lt;br /&gt;
&lt;br /&gt;
= Q2 - pretty LOGO =&lt;br /&gt;
[[Image:Q2_logo.png]]&lt;br /&gt;
&lt;br /&gt;
= Q3 - frequency LOGO =&lt;br /&gt;
[[Image:Q3_logo.png]]&lt;br /&gt;
&lt;br /&gt;
= Q4 - cross species comparison =&lt;br /&gt;
[[Image:Emblem-important_tiny.png]] &amp;#039;&amp;#039;&amp;#039;IMPORTANT:&amp;#039;&amp;#039;&amp;#039; This question is easiest to answer if you compare the DONOR and ACCEPTOR sites separately across the 5 species (it is simply easier to spot the differences this way).&lt;br /&gt;
&lt;br /&gt;
== Donor sites ==&lt;br /&gt;
[[Image:Q4_hs_donor.png]]&lt;br /&gt;
[[Image:Q4_dmel_donor.png]]&lt;br /&gt;
[[Image:Q4_Athal_donor.png]]&lt;br /&gt;
[[Image:Q4_scer_donor.png]]&lt;br /&gt;
[[Image:Q4_spombe.png]]&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Observations:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
* The animal (human + fruit fly) DONOR sites contain ~1 bit of information in the very last position of the exon (with a preference for &amp;quot;G&amp;quot;), and some information (&amp;lt; 1 bit) in the 4 positions following &amp;quot;GT&amp;quot; in the intron (i.e. 6 intron positions in all carry signal, counting the &amp;quot;GT&amp;quot; itself).&lt;br /&gt;
* The plant (&amp;#039;&amp;#039;Arabidopsis&amp;#039;&amp;#039;) shows a similar pattern, with some signal in the final 2 exon positions, but the signal in the intron after the &amp;quot;GT&amp;quot; is very weak.&lt;br /&gt;
* The two fungal species show next to no signal in the exon, but a &amp;#039;&amp;#039;&amp;#039;very strong&amp;#039;&amp;#039;&amp;#039; signal on the intron side beyond the &amp;quot;GT&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== Acceptor sites ==&lt;br /&gt;
[[Image:Q4_hs_acceptor.png]]&lt;br /&gt;
[[Image:Q4_dmel_acceptor.png]]&lt;br /&gt;
[[Image:Q4_Athal_acceptor.png]]&lt;br /&gt;
[[Image:Q4_scer_acceptor.png]]&lt;br /&gt;
[[Image:Q4_spombe_acceptor.png]]&lt;br /&gt;
&lt;br /&gt;
* Overall observation: the ACCEPTOR site motif is much more similar across all 5 species than the DONOR site motif.&lt;br /&gt;
** In all cases there is next to no signal on the EXON side (after the &amp;quot;AG&amp;quot;), and there is a strong preference for T or C (~1 bit, as strong as it can get for a two-letter preference) immediately before the &amp;quot;AG&amp;quot;.&lt;br /&gt;
** There is a diffuse preference for Ts in the region before AG in the animals + the plant.&lt;br /&gt;
** This preference for Ts is clearly centered around the -9 position in the fungi.&lt;br /&gt;
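&lt;br /&gt;
The bit values quoted in these answers (2 bits for the completely conserved &amp;quot;GT&amp;quot;, ~1 bit for an even two-letter preference) can be checked with a minimal Python sketch. It assumes the standard definition of the information content of a DNA LOGO position, log2(4) minus the Shannon entropy of the base frequencies, and it leaves out the small-sample correction that LOGO tools may apply. The function name and the example frequencies are illustrative assumptions, not values read from the plots.&lt;br /&gt;

```python
import math


def information_content(freqs, alphabet_size=4):
    """Information content in bits: log2(alphabet_size) minus the Shannon entropy.

    Assumption: freqs are the observed base frequencies at one position and
    sum to 1. No small-sample correction is applied.
    """
    entropy = -sum(p * math.log2(p) for p in freqs if p != 0)
    return math.log2(alphabet_size) - entropy


# One base only (like the G or the T of the conserved GT)
conserved = information_content([1.0, 0.0, 0.0, 0.0])
# Two bases used equally often (like T or C just before AG)
two_letter = information_content([0.5, 0.5, 0.0, 0.0])
# All four bases equally likely: no signal
uniform = information_content([0.25, 0.25, 0.25, 0.25])

print(conserved, two_letter, uniform)
```

A position reaches the full 2 bits only when a single base is used, which is why an even split between two bases cannot exceed 1 bit.&lt;br /&gt;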
&lt;br /&gt;
= Q5 - &amp;#039;&amp;#039;E. coli&amp;#039;&amp;#039; - Shine-Dalgarno =&lt;br /&gt;
[[Image:Q5_ecoli_logo.png]]&lt;br /&gt;
* The START codon is mostly ATG, but GTG is common enough to be seen at the first position. If you zoom in on positions 51-53, it is possible to see that a small number of other bases are also used in rare cases.&lt;br /&gt;
* A region with As and Gs can be seen at positions 40-44, which could potentially be part of the SD (Shine-Dalgarno) sequence.&lt;br /&gt;
&lt;br /&gt;
= Q6 - SD zoom =&lt;br /&gt;
[[Image:Q6_ecoli_logo.png]]&lt;br /&gt;
&lt;br /&gt;
The LOGO is consistent with the consensus sequence AGGAGG in the sense that it does not strongly disagree with it. Judging from the LOGO, it is a bit of a stretch to claim A over G in any of the positions, but an overrepresentation of As OR Gs is clearly seen.&lt;br /&gt;
&lt;br /&gt;
= Q7 - Kozak sequence (Yeast) =&lt;br /&gt;
[[Image:Q7_yeast_25-75.png|center]]&lt;br /&gt;
There appears to be a weak signal in the positions immediately before the START codon (especially the -3 position = coordinate 48).&lt;br /&gt;
&lt;br /&gt;
[[Image:Q7_yeast_40-50_zoom1.png|center|Zoom of the 40-50 region]]&lt;br /&gt;
[[Image:Q7_yeast_40-50_zoom2.png|center|Zoom + Y axis rescale of the 40-50 region]]&lt;br /&gt;
&amp;lt;br style=&amp;quot;clear: both&amp;quot; /&amp;gt;&lt;br /&gt;
&lt;br /&gt;
It can now clearly be seen that only position 48 (= 3 before the ATG) has information content above 0.2 bits.&lt;br /&gt;
&lt;br /&gt;
[[Image:Q7_yeast_40-50_zoom_freq.png|center]]&lt;br /&gt;
&lt;br /&gt;
By plotting a &amp;#039;&amp;#039;&amp;#039;frequency plot&amp;#039;&amp;#039;&amp;#039; of the same region, it can be seen that &amp;gt;50% of the sequences have A in position 48.&lt;br /&gt;
&lt;br /&gt;
= Q8 - Signal peptides comparison =&lt;br /&gt;
[[Image:Q8_EUK.png|center]]&lt;br /&gt;
[[Image:Q8_gram_neg.png|center]]&lt;br /&gt;
[[Image:Q8_gram_pos.png|center]]&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Similarities:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** It is clearly seen that positions -1 (just before the cleavage site) and -3 are important, and A (alanine) is preferred here (especially in the prokaryotes).&lt;br /&gt;
** In all three cases there is a stretch of &amp;#039;&amp;#039;&amp;#039;hydrophobic&amp;#039;&amp;#039;&amp;#039; (color = black) amino acids (L, V, A, I) in the middle of the signal peptide.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Differences:&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
** The preference for A (alanine) at the -1 position is much stronger in the prokaryotic sequences.&lt;br /&gt;
** The hydrophobic stretch is longer in Gram positive bacteria.&lt;br /&gt;
** There is a preference for S/A at position -6 in Gram negatives that is not seen elsewhere.&lt;br /&gt;
** There is no signal after the cleavage site in eukaryotes, whereas there is some signal in the first few positions after it in both prokaryotic groups.&lt;br /&gt;
&lt;br /&gt;
= Q9 - seq2logo =&lt;br /&gt;
[[Image:Q9_EUK.png]]&lt;br /&gt;
* Yes - it clearly shows the same overall motif as above. Note that, unlike WebLogo, Seq2logo indicates positions with gaps by making the stack of letters narrower.&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
= Q10 - Kullback-Leibler LOGO =&lt;br /&gt;
[[Image:Q10_KL_logo.png]]&lt;br /&gt;
* Polar amino acids (color = green) are under-represented in the appx. 10 aa region where hydrophobic (color = black) amino acids are over-represented.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
= Q10 - small data sets =&lt;br /&gt;
[[Image:Emblem-important_tiny.png]] &amp;#039;&amp;#039;&amp;#039;IMPORTANT:&amp;#039;&amp;#039;&amp;#039; Compare the LOGOs from the small data set to the LOGO we got from the large data set (Question 9+10) and investigate if you can see the same pattern.&lt;br /&gt;
&lt;br /&gt;
[[Image:Q11_pseudo_00.png]]&lt;br /&gt;
[[Image:Q11_pseudo_200.png]]&lt;br /&gt;
&lt;br /&gt;
* The first plot (without pseudo-counts) is very noisy, and only the broadest trends can be seen: the tendency to have an &amp;quot;A&amp;quot; at the -1 position and a somewhat diffuse hydrophobic region.&lt;br /&gt;
* In the second plot (with the pseudo-counts) the picture looks a lot more like what we saw in the big data sets: a specific pattern at the -1 and -3 positions and a hydrophobic region whose shape is much closer to what we saw before.&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>