<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22111/index.php?action=history&amp;feed=atom&amp;title=ExGeany-Answers</id>
	<title>ExGeany-Answers - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22111/index.php?action=history&amp;feed=atom&amp;title=ExGeany-Answers"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=ExGeany-Answers&amp;action=history"/>
	<updated>2026-05-03T12:51:37Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22111/index.php?title=ExGeany-Answers&amp;diff=222&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;=Answers to the exercise in Plain text files and Geany= Answers by: Rasmus Wernersson and Henrik Nielsen   == Question 1:== The file sizes are:    453 bytes: alpha_globin_OldMac.fsa  453 bytes: alpha_globin_Unix.fsa  461 bytes: alpha_globin_Windows.fsa   The important thing to notice here is that DOS/Windows newlines actually consists of two bytes (CR + LF), whereas UNIX and the old Mac standard only use  one byte.   The 8 byte difference corresponds to the 8 lines of te...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22111/index.php?title=ExGeany-Answers&amp;diff=222&amp;oldid=prev"/>
		<updated>2024-03-15T11:19:37Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;=Answers to the exercise in Plain text files and Geany= Answers by: Rasmus Wernersson and Henrik Nielsen   == Question 1:== The file sizes are:    453 bytes: alpha_globin_OldMac.fsa  453 bytes: alpha_globin_Unix.fsa  461 bytes: alpha_globin_Windows.fsa   The important thing to notice here is that DOS/Windows newlines actually consists of two bytes (CR + LF), whereas UNIX and the old Mac standard only use  one byte.   The 8 byte difference corresponds to the 8 lines of te...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;=Answers to the exercise in Plain text files and Geany=&lt;br /&gt;
Answers by: Rasmus Wernersson and Henrik Nielsen&lt;br /&gt;
 &lt;br /&gt;
== Question 1:==&lt;br /&gt;
The file sizes are:&lt;br /&gt;
 &lt;br /&gt;
 453 bytes: alpha_globin_OldMac.fsa&lt;br /&gt;
 453 bytes: alpha_globin_Unix.fsa&lt;br /&gt;
 461 bytes: alpha_globin_Windows.fsa&lt;br /&gt;
 &lt;br /&gt;
The important thing to notice here is that DOS/Windows newlines actually&lt;br /&gt;
consist of two bytes (CR + LF), whereas the UNIX (LF) and old Mac (CR) standards only use &lt;br /&gt;
one byte.&lt;br /&gt;
 &lt;br /&gt;
The 8-byte difference corresponds to the 8 lines of text within the file (one extra byte per line):&lt;br /&gt;
 &lt;br /&gt;
 001 &amp;gt;pigeon_alpha-globin-D&lt;br /&gt;
 002 ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCACCCAGACTGTG&lt;br /&gt;
 003 GAGCCGAGGCCCTGGAGAGGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACTT&lt;br /&gt;
 004 GCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGGCCGCCTTGGGCAACGCTGTCAAG&lt;br /&gt;
 005 AGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCAGCGACCTGCATGCCTACAACCTGCGTGTCGACC&lt;br /&gt;
 006 CTGTCAACTTCAAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTGGCCACACACCTGGGCAACGACTACAC&lt;br /&gt;
 007 CCCGGAGGCACATGCTGCCTTCGACAAGTTCCTGTCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGA&lt;br /&gt;
 008 TAA&lt;br /&gt;
&lt;br /&gt;
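The arithmetic behind these sizes can be checked with a short Python sketch.&lt;br /&gt;
It assumes one byte per character (plain ASCII) and that every line, including the last, ends with a newline.&lt;br /&gt;
The names LINE_LENGTHS, NEWLINES and file_size are made up for this illustration.&lt;br /&gt;

```python
# Line lengths taken from the listing above: a 22-character header,
# six 70-base sequence lines, and the final 3-base line.
LINE_LENGTHS = [22, 70, 70, 70, 70, 70, 70, 3]

# Byte sequences used as line terminators by each convention.
NEWLINES = {
    "OldMac": b"\r",     # CR
    "Unix": b"\n",       # LF
    "Windows": b"\r\n",  # CR + LF
}


def file_size(line_lengths, newline):
    """Return the size in bytes of a file whose lines each end with `newline`."""
    return sum(line_lengths) + len(line_lengths) * len(newline)


sizes = {name: file_size(LINE_LENGTHS, nl) for name, nl in NEWLINES.items()}
for name, size in sizes.items():
    print(name, size)
```

Under these assumptions the sketch prints 453, 453 and 461 bytes, matching the observed file sizes.&lt;br /&gt;
&lt;br /&gt;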
== Question 2:==&lt;br /&gt;
Yes - inspecting the files in the associated programs (e.g. Word and Firefox)&lt;br /&gt;
reveals the _textual_ contents to be the same.&lt;br /&gt;
 &lt;br /&gt;
The file sizes differ dramatically:&lt;br /&gt;
 &lt;br /&gt;
 29184 bytes: alpha_globin.doc&lt;br /&gt;
   667 bytes: alpha_globin.html&lt;br /&gt;
   855 bytes: alpha_globin.rtf&lt;br /&gt;
 &lt;br /&gt;
== Question 3:==&lt;br /&gt;
The &amp;lt;tt&amp;gt;alpha_globin.doc&amp;lt;/tt&amp;gt; file cannot be opened in Geany, because it is not a text file. In other words, not every byte in the file can be interpreted as a character.&lt;br /&gt;
&lt;br /&gt;
The HTML and RTF files also contain some extra information, but unlike the DOC file, the extra information is text based.&lt;br /&gt;
&lt;br /&gt;
Contents of the HTML file: &lt;br /&gt;
 &amp;lt;!DOCTYPE html PUBLIC &amp;quot;-//W3C//DTD HTML 4.01 Transitional//EN&amp;quot;&amp;gt;&lt;br /&gt;
 &amp;lt;html&amp;gt;&lt;br /&gt;
 &amp;lt;head&amp;gt;&lt;br /&gt;
   &amp;lt;meta content=&amp;quot;text/html;charset=ISO-8859-1&amp;quot; http-equiv=&amp;quot;Content-Type&amp;quot;&amp;gt;&lt;br /&gt;
   &amp;lt;title&amp;gt;&amp;lt;/title&amp;gt;&lt;br /&gt;
 &amp;lt;/head&amp;gt;&lt;br /&gt;
 &amp;lt;body&amp;gt;&lt;br /&gt;
 &amp;lt; PRE&amp;gt;&lt;br /&gt;
 &amp;gt;pigeon_alpha-globin-D&lt;br /&gt;
 ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCACCCAGACTGTG&lt;br /&gt;
 GAGCCGAGGCCCTGGAGAGGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACTT&lt;br /&gt;
 GCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGGCCGCCTTGGGCAACGCTGTCAAG&lt;br /&gt;
 AGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCAGCGACCTGCATGCCTACAACCTGCGTGTCGACC&lt;br /&gt;
 CTGTCAACTTCAAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTGGCCACACACCTGGGCAACGACTACAC&lt;br /&gt;
 CCCGGAGGCACATGCTGCCTTCGACAAGTTCCTGTCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGA&lt;br /&gt;
 TAA&lt;br /&gt;
 &amp;lt; /PRE&amp;gt;&lt;br /&gt;
 &amp;lt;/body&amp;gt;&lt;br /&gt;
 &amp;lt;/html&amp;gt;&lt;br /&gt;
&lt;br /&gt;
In this case (cleanly formatted HTML) it&amp;#039;s easy to locate the original DNA &lt;br /&gt;
sequence.&lt;br /&gt;
 &lt;br /&gt;
To some degree it&amp;#039;s possible to figure out what&amp;#039;s going on in the RTF file -&lt;br /&gt;
the codes are basically about formatting:&lt;br /&gt;
 &lt;br /&gt;
Snippet from the file:&lt;br /&gt;
 \f0\b\fs24 \cf0 &amp;gt;pigeon_alpha-globin-D\&lt;br /&gt;
 &lt;br /&gt;
 \f1\b0 ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCACCCAGACTGTG\&lt;br /&gt;
 GAGCCGAGGCCCTGGAGAGGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACTT\&lt;br /&gt;
 GCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGGCCGCCTTGGGCAACGCTGTCAAG\&lt;br /&gt;
 AGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCAGCGACCTGCATGCCTACAACCTGCGTGTCGACC\&lt;br /&gt;
 CTGTCAACTTCAAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTGGCCACACACCTGGGCAACGACTACAC\&lt;br /&gt;
 CCCGGAGGCACATGCTGCCTTCGACAAGTTCCTGTCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGA\&lt;br /&gt;
 &lt;br /&gt;
The Word file contains a HUGE amount of additional information in BINARY&lt;br /&gt;
form, which is why Geany refuses to open it. Opening other non-text files such as a JPG image&lt;br /&gt;
or an MP3 sound file will also fail in Geany.&lt;br /&gt;
Certain text editors are less strict about &lt;br /&gt;
the files they open, but when the file is binary, the results will look very strange.&lt;br /&gt;
&lt;br /&gt;
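The difference between a text file and a binary file can be sketched in Python.&lt;br /&gt;
This is only a rough heuristic and not the check Geany actually performs: it treats data as text if it contains no NUL bytes and decodes cleanly.&lt;br /&gt;
The function name looks_like_text, the UTF-8 default and the sample bytes are assumptions made for this illustration.&lt;br /&gt;

```python
def looks_like_text(data, encoding="utf-8"):
    """Heuristic: no NUL bytes, and the bytes decode cleanly in `encoding`."""
    if b"\x00" in data:
        return False
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical samples: a tiny sequence fragment and a few binary bytes.
fasta_sample = b"ATGCTGACC\nTAA\n"
binary_sample = b"\x00\x00\x00D\x01\x00\x00\x0c"

print(looks_like_text(fasta_sample))
print(looks_like_text(binary_sample))
```

The first sample passes the check, while the second fails because it contains NUL bytes.&lt;br /&gt;
&lt;br /&gt;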
Here is a snippet of the alpha_globin.doc file as displayed by the Unix editor vim:&lt;br /&gt;
 ^@^@^@D^A^@^@^L^@^@^@P^A^@^@^M^@^@^@\^A^@^@^N^@^@^@h^A^@^@^O^@^@^@p^A^@^@^P^@^@^@&lt;br /&gt;
 x^A^@^@^S^@^@^@&amp;lt;80&amp;gt;^A^@^@^Q^@^@^@&amp;lt;88&amp;gt;^A^@^@^B^@^@^@^P&amp;#039;^@^@^^^@^@^@^X^@^@^@&amp;gt;pigeon&lt;br /&gt;
 _alpha-globin-D^@^@^^^@^@^@^D^@^@^@^@^@^@^@^^^@^@^@^T^@^@^@Rasmus Wernersson^@^@&lt;br /&gt;
 ^@^^^@^@^@^D^@^@^@^@^@^@^@^^^@^@^@^H^@^@^@Normal^@^@^^^@^@^@^T^@^@^@Rasmus Werner&lt;br /&gt;
 sson^@^@^@^^^@^@^@^D^@^@^@1^@^@^@^^^@^@^@^X^@^@^@Microsoft Word 11.5.0^@^@^@@^@^@&lt;br /&gt;
 ^@^@FÃ#^@^@^@^@@^@^@^@^@âÄò&amp;lt;91&amp;gt;&amp;lt;81&amp;gt;É^A@^@^@^@^@(&amp;lt;88&amp;gt;^V&amp;lt;92&amp;gt;&amp;lt;81&amp;gt;É^A^C^@^@^@^A^@^@^@&lt;br /&gt;
 ^C^@^@^@^@^@^@^@^C^@^@^@^@^@^@^@^C^@^@^@^@^@^@^@G^@^@^@82^@^@þÿÿÿPICT20^@^@^@^@^C&lt;br /&gt;
 I^BR^@^Q^Bÿ^L^@ÿþ^@^@^A,^@^@^A,^@^@^@^@^@^@^M´    ¯^@^@^@^@^@¡^Aò^@^DMSWD^@^^^@^A&lt;br /&gt;
 ^@^@^@^@^@^M´     ¯^@,^@^N÷@^KCourier New^@^C÷@^@^M^@%^@.^@^D^@^@^@^@^@(^AK^Aw^A&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Interestingly, it is actually possible to get a glimpse of a few text strings within &lt;br /&gt;
the mess of symbols, including the sequence name and the name (Rasmus Wernersson) of the &lt;br /&gt;
person who created the file.&lt;br /&gt;
&lt;br /&gt;
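Readable strings like these can also be pulled out of binary data programmatically, similar in spirit to the Unix strings tool.&lt;br /&gt;
The sketch below is a minimal illustration: the function name printable_runs, the minimum length of 4 and the hand-built sample bytes (loosely modelled on the vim display above) are all assumptions.&lt;br /&gt;

```python
import re


def printable_runs(data, min_length=4):
    """Return runs of at least `min_length` printable ASCII bytes found in `data`."""
    # The byte range from space to tilde covers the printable ASCII characters.
    pattern = rb"[ -~]{%d,}" % min_length
    return re.findall(pattern, data)


# Hand-built sample: control bytes surrounding two readable strings.
sample = (
    b"\x00\x00\x1e\x00\x00\x00\x14\x00\x00\x00Rasmus Wernersson"
    b"\x00\x00\x00\x1e\x00\x00\x00\x08\x00\x00\x00Normal\x00\x00"
)

print(printable_runs(sample))
```

For this sample the function returns the two readable strings and skips the control bytes around them.&lt;br /&gt;
&lt;br /&gt;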
== Question 4:==&lt;br /&gt;
Cleaned-up sequence:&lt;br /&gt;
 &lt;br /&gt;
 AACGGGCACGGGACGCATGTAGCTGGAACAGTGGCAGCCGTAAATAATAATGGTATCGGA&lt;br /&gt;
 GTTGCCGGGGTTGCAGGAGGAAACGGCTCTACCAATAGTGGAGCAAGGTTAATGTCCACA&lt;br /&gt;
 CAAATTTTTAATAGTGATGGGGATTATACAAATAGCGAAACTCTTGTGTACAGAGCCATT&lt;br /&gt;
 GTTTATGGTGCAGATAACGGAGCTGTGATCTCGCAAAATAGCTGGGGTAGTCAGTCTCTG&lt;br /&gt;
 ACTATTAAGGAGTTGCAGAAAGCTGCGATCGACTATTTCATTGATTATGCAGGAATGGAC&lt;br /&gt;
 GAAACAGGAGAAATACAGACAGGCCCTATGAGGGGAGGTATATTTATAGCTGCCGCCGGA&lt;br /&gt;
 AACGATAACGTTTCCACTCCAAATATGCCTTCAGCTTATGAACGGGTTTTAGCTGTGGCC&lt;br /&gt;
 TCAATGGGACCAGATTTTACTAAGGCAAGCTATAGCACTTTTGGAACATGGACTGATATT&lt;br /&gt;
 ACTGCTCCTGGCGGAGATATTGACAAATTTGATTTGTCAGAATACGGAGTTCTCAGCACT&lt;br /&gt;
 TATGCCGATAATTATTATGCTTATGGAGAGGGAACATCCATGGCTTGTCCACATGTCGCC&lt;br /&gt;
 GGCGCCGCC&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>