<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?action=history&amp;feed=atom&amp;title=Stateful_Parsing</id>
	<title>Stateful Parsing - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?action=history&amp;feed=atom&amp;title=Stateful_Parsing"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;action=history"/>
	<updated>2026-04-17T06:50:42Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;diff=79&amp;oldid=prev</id>
		<title>WikiSysop: /* Exercises for extra practice */</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;diff=79&amp;oldid=prev"/>
		<updated>2024-10-30T09:21:04Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Exercises for extra practice&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 11:21, 30 October 2024&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l24&quot;&gt;Line 24:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 24:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Exercises for extra practice ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Exercises for extra practice ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Count the number of RA (Author) lines in the &#039;&#039;sprot1-4.dat&#039;&#039; files.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Count the number of RA (Author) lines in the &#039;&#039;sprot1-4.dat&#039;&#039; files&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;. &#039;&#039;sprot2.dat&#039;&#039; contains 25 RA lines&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Extract the author names from the RA lines in the &amp;#039;&amp;#039;sprot1-4.dat&amp;#039;&amp;#039; files. Display the names - only the names.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Extract the author names from the RA lines in the &amp;#039;&amp;#039;sprot1-4.dat&amp;#039;&amp;#039; files. Display the names - only the names.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Continuing previous exercise: Now also extract the title (RT lines). Display like title first on one line, followed by authors on the next. Then empty line followed by the next title and authors and so forth, until no more authors/titles.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Continuing previous exercise: Now also extract the title (RT lines). Display like title first on one line, followed by authors on the next. Then empty line followed by the next title and authors and so forth, until no more authors/titles.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;diff=73&amp;oldid=prev</id>
		<title>WikiSysop: /* Exercises to be handed in */</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;diff=73&amp;oldid=prev"/>
		<updated>2024-10-30T08:53:42Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Exercises to be handed in&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 10:53, 30 October 2024&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l16&quot;&gt;Line 16:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 16:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Exercises to be handed in ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Exercises to be handed in ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The following 5 exercises deal with SwissProt. The file sprot1.dat is a SwissProt database entry. Study it carefully. Locate the SwissProt ID (SP96_DICDI), the accession number (P14328) and the amino acid sequence (MRVLLVLVAC....TTTATTTATS). There are other entries ( &#039;&#039;sprot2.dat&#039;&#039;, &#039;&#039;sprot3.dat&#039;&#039;, &#039;&#039;sprot4.dat&#039;&#039;). Your programs should work on those, too. Also your programs must solve all the problems in ONE reading of the file. It is acceptable if you just hand in one program that solves 1 to 4. 5 is separate. &lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;To be clear - this exercise is only &lt;/del&gt;about studying and understanding the file format.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The following 5 exercises deal with SwissProt. The file sprot1.dat is a SwissProt database entry. Study it carefully. Locate the SwissProt ID (SP96_DICDI), the accession number (P14328) and the amino acid sequence (MRVLLVLVAC....TTTATTTATS). There are other entries ( &#039;&#039;sprot2.dat&#039;&#039;, &#039;&#039;sprot3.dat&#039;&#039;, &#039;&#039;sprot4.dat&#039;&#039;). Your programs should work on those, too. Also your programs must solve all the problems in ONE reading of the file. It is acceptable if you just hand in one program that solves 1 to 4. 5 is separate. &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;These exercises are &lt;/ins&gt;about studying and understanding the file format.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Make a program that reads the ID and prints it.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Make a program that reads the ID and prints it.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Add the following functionality to the program: Read the accession number and print it.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Add the following functionality to the program: Read the accession number and print it.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;diff=29&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;__NOTOC__ {| width=500  style=&quot;float:right; margin-left: 10px; margin-top: -56px;&quot; |Previous: Exceptions and Bug Handling |Next: Lists/Sequences |} == Required course material for the lesson == Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_08-StatefulParsing.ppt Stateful Parsing]&lt;br&gt; Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=fe50507e-9009-49b1-a3a8-af27012d8cdf Stateful Parsing]&lt;br&gt; Video: [https://panopto.dtu.dk/Panopto/P...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Stateful_Parsing&amp;diff=29&amp;oldid=prev"/>
		<updated>2024-03-01T15:58:04Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;__NOTOC__ {| width=500  style=&amp;quot;float:right; margin-left: 10px; margin-top: -56px;&amp;quot; |Previous: &lt;a href=&quot;/22101/index.php/Exceptions_and_Bug_Handling&quot; title=&quot;Exceptions and Bug Handling&quot;&gt;Exceptions and Bug Handling&lt;/a&gt; |Next: &lt;a href=&quot;/22101/index.php/Lists/Sequences&quot; title=&quot;Lists/Sequences&quot;&gt;Lists/Sequences&lt;/a&gt; |} == Required course material for the lesson == Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_08-StatefulParsing.ppt Stateful Parsing]&amp;lt;br&amp;gt; Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=fe50507e-9009-49b1-a3a8-af27012d8cdf Stateful Parsing]&amp;lt;br&amp;gt; Video: [https://panopto.dtu.dk/Panopto/P...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;__NOTOC__&lt;br /&gt;
{| width=500  style=&amp;quot;float:right; margin-left: 10px; margin-top: -56px;&amp;quot;&lt;br /&gt;
|Previous: [[Exceptions and Bug Handling]]&lt;br /&gt;
|Next: [[Lists/Sequences]]&lt;br /&gt;
|}&lt;br /&gt;
== Required course material for the lesson ==&lt;br /&gt;
Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_08-StatefulParsing.ppt Stateful Parsing]&amp;lt;br&amp;gt;&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=fe50507e-9009-49b1-a3a8-af27012d8cdf Stateful Parsing]&amp;lt;br&amp;gt;&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=ce785005-14cd-424a-a406-af27012b3e1b Finding errors with Stateful Parsing]&amp;lt;br&amp;gt;&lt;br /&gt;
&amp;lt;!--&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=acd48c9a-cacb-4832-9184-af27012c7021 Live coding]&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Subjects covered ==&lt;br /&gt;
Using stateful parsing to extract data spanning several lines, by recognizing keywords.&lt;br /&gt;
&lt;br /&gt;
== Exercises to be handed in ==&lt;br /&gt;
The following 5 exercises deal with SwissProt. The file sprot1.dat is a SwissProt database entry. Study it carefully. Locate the SwissProt ID (SP96_DICDI), the accession number (P14328) and the amino acid sequence (MRVLLVLVAC....TTTATTTATS). There are other entries ( &amp;#039;&amp;#039;sprot2.dat&amp;#039;&amp;#039;, &amp;#039;&amp;#039;sprot3.dat&amp;#039;&amp;#039;, &amp;#039;&amp;#039;sprot4.dat&amp;#039;&amp;#039;). Your programs should work on those, too. Also your programs must solve all the problems in ONE reading of the file. It is acceptable if you just hand in one program that solves 1 to 4. 5 is separate. To be clear - this exercise is only about studying and understanding the file format.&lt;br /&gt;
# Make a program that reads the ID and prints it.&lt;br /&gt;
# Add the following functionality to the program: Read the accession number and print it.&lt;br /&gt;
# Add the following functionality to the program: Read the amino acid sequence and print it. You really should use Stateful Parsing in this exercise. Maybe check the video.&lt;br /&gt;
# Add the following functionality to the program: Verification of amino acid number. This means extract the number from the SQ line (example: SQ SEQUENCE 629 AA;) and check that the amino acid sequence has that number of residues. It should be the program that determines if something is wrong - not the user. Imagine that before you go home, you set the computer to run through a million swisprot entries. The next day, you must be able to see what failed. In a sense you don&amp;#039;t care about what succeeded, as that is the common case. You care about what failed, because it is here you must take action.&lt;br /&gt;
# Now that you have the ID, accession number and AA sequence save it to a file &amp;#039;&amp;#039;sprot.fsa&amp;#039;&amp;#039; in FASTA format. Look in the file &amp;#039;&amp;#039;dna.fsa&amp;#039;&amp;#039; for an example of FASTA. Notice the first line starts with &amp;gt; and immediately after comes an unique identifier, like an accession number or a SwissProt ID. Any other data must be on the header line only, but in free format. Sequence data is on the following lines. &amp;lt;br&amp;gt;Notice that this exercise incorporates the previous 4, but uses the result in a slightly different way.&lt;br /&gt;
&lt;br /&gt;
== Exercises for extra practice ==&lt;br /&gt;
* Count the number of RA (Author) lines in the &amp;#039;&amp;#039;sprot1-4.dat&amp;#039;&amp;#039; files.&lt;br /&gt;
* Extract the author names from the RA lines in the &amp;#039;&amp;#039;sprot1-4.dat&amp;#039;&amp;#039; files. Display the names - only the names.&lt;br /&gt;
* Continuing previous exercise: Now also extract the title (RT lines). Display like title first on one line, followed by authors on the next. Then empty line followed by the next title and authors and so forth, until no more authors/titles.&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>