<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22113/index.php?action=history&amp;feed=atom&amp;title=Regular_Expressions</id>
	<title>Regular Expressions - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22113/index.php?action=history&amp;feed=atom&amp;title=Regular_Expressions"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22113/index.php?title=Regular_Expressions&amp;action=history"/>
	<updated>2026-04-08T00:40:18Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22113/index.php?title=Regular_Expressions&amp;diff=113&amp;oldid=prev</id>
		<title>WikiSysop: /* Exercises to be handed in */</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22113/index.php?title=Regular_Expressions&amp;diff=113&amp;oldid=prev"/>
		<updated>2025-03-17T17:06:11Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Exercises to be handed in&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 19:06, 17 March 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l21&quot;&gt;Line 21:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 21:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;If you don&amp;#039;t know what &amp;#039;&amp;#039;&amp;#039;stateful parsing&amp;#039;&amp;#039;&amp;#039; is, [https://teaching.healthtech.dtu.dk/22101/index.php/Stateful_Parsing look here]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;If you don&amp;#039;t know what &amp;#039;&amp;#039;&amp;#039;stateful parsing&amp;#039;&amp;#039;&amp;#039; is, [https://teaching.healthtech.dtu.dk/22101/index.php/Stateful_Parsing look here]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Make a program that accepts a string as input from the keyboard. Use regular expressions (RE) to determine if the input is a number. The goal is to do this with a SINGLE regex.&amp;lt;br&amp;gt; These should all be considered numbers: &amp;quot;4&amp;quot;   &amp;quot;-7&amp;quot;   &amp;quot;0.656&amp;quot;   &amp;quot;-67.35555&amp;quot;&amp;lt;br&amp;gt; These are not numbers: &amp;quot;5.&amp;quot;  &amp;quot;56F&amp;quot;  &amp;quot;.32&amp;quot;  &amp;quot;-.04&amp;quot;  &amp;quot;1+1&amp;quot;&amp;lt;br&amp;gt; Note: The program is very simple, but this is likely the most difficult regular expression you will have to write in this set of exercises. Consider doing the following exercises before attempting this one, just to get some experience first.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Make a program that accepts a string as input from the keyboard. Use regular expressions (RE) to determine if the input is a number. The goal is to do this with a SINGLE regex.&amp;lt;br&amp;gt; These should all be considered numbers: &amp;quot;4&amp;quot;   &amp;quot;-7&amp;quot;   &amp;quot;0.656&amp;quot;   &amp;quot;-67.35555&amp;quot;&amp;lt;br&amp;gt; These are not numbers: &amp;quot;5.&amp;quot;  &amp;quot;56F&amp;quot;  &amp;quot;.32&amp;quot;  &amp;quot;-.04&amp;quot;  &amp;quot;1+1&amp;quot;&amp;lt;br&amp;gt; Note: The program is very simple, but this is likely the most difficult regular expression you will have to write in this set of exercises. Consider doing the following exercises before attempting this one, just to get some experience first.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Make a program that can read and verify a fasta file. Test with &#039;&#039;dna7.fsa&#039;&#039; and &#039;&#039;dnanoise.fsa&#039;&#039;. Verification here means that the program prints &quot;DNA fasta&quot; or &quot;Protein fasta&quot; if the file is successfully verified as either DNA or protein sequence, and &quot;Not fasta&quot; if verification fails. You can find a description of the fasta format in [[Biological knowledge needed in the course]]. You are expected to know which symbols are used for DNA and protein sequences, or to be able to look them up.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Make a program that can read and verify a fasta file. Test with &#039;&#039;dna7.fsa&#039;&#039; and &#039;&#039;dnanoise.fsa&#039;&#039;. Verification here means that the program prints &quot;DNA fasta&quot; or &quot;Protein fasta&quot; if the file is successfully verified as either DNA or protein sequence, and &quot;Not fasta&quot; if verification fails. You can find a description of the fasta format in [[Biological knowledge needed in the course]]. You are expected to know which symbols are used for DNA and protein sequences, or to be able to look them up&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;. Hint: If you have made a program before (in a previous course) that reads a fasta file, this and the following exercise are not too hard; otherwise, consider doing them last&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Change exercise 2 in the following way: Make the program discard entries that cannot conform to DNA or protein sequence, and write the acceptable entries to the output file &amp;#039;&amp;#039;fastaout.fsa&amp;#039;&amp;#039;, formatted with the normal 60 characters per line and no spaces in between. The program must inform the user how many entries were kept and how many were discarded. Test on &amp;#039;&amp;#039;dnanoise.fsa&amp;#039;&amp;#039;, which contains 3 entries that should be discarded - this is a strong hint.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# Change exercise 2 in the following way: Make the program discard entries that cannot conform to DNA or protein sequence, and write the acceptable entries to the output file &amp;#039;&amp;#039;fastaout.fsa&amp;#039;&amp;#039;, formatted with the normal 60 characters per line and no spaces in between. The program must inform the user how many entries were kept and how many were discarded. Test on &amp;#039;&amp;#039;dnanoise.fsa&amp;#039;&amp;#039;, which contains 3 entries that should be discarded - this is a strong hint.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# The last exercises all concern the files &amp;#039;&amp;#039;data1-4.gb&amp;#039;&amp;#039;, which are various Genbank entries of genes. First, study the files and notice the structure of the data. In all exercises you will have to parse the files (read them and find the wanted data) using REs, which are well suited to that purpose. This is a build-up process: every exercise adds to the previous ones, so the final program can do a lot. Your program should be able to handle all files (so test them), but just one at a time.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;# The last exercises all concern the files &amp;#039;&amp;#039;data1-4.gb&amp;#039;&amp;#039;, which are various Genbank entries of genes. First, study the files and notice the structure of the data. In all exercises you will have to parse the files (read them and find the wanted data) using REs, which are well suited to that purpose. This is a build-up process: every exercise adds to the previous ones, so the final program can do a lot. Your program should be able to handle all files (so test them), but just one at a time.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22113/index.php?title=Regular_Expressions&amp;diff=24&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;__NOTOC__ {| width=500  style=&quot;font-size: 10px; float:right; margin-left: 10px; margin-top: -56px;&quot; |Previous: Python Recap and Objects |Next: Making Functions |} == Required course material for the lesson == Powerpoint: [https://teaching.healthtech.dtu.dk/material/22113/22113_03-Regex.ppt Regular expressions in Python]&lt;br&gt; Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=04cb2b80-d941-42a5-a632-af27012cd0d7 Regular Expressions] Monday&lt;br&gt; Video: [http...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22113/index.php?title=Regular_Expressions&amp;diff=24&amp;oldid=prev"/>
		<updated>2024-03-06T13:37:45Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;__NOTOC__ {| width=500  style=&amp;quot;font-size: 10px; float:right; margin-left: 10px; margin-top: -56px;&amp;quot; |Previous: &lt;a href=&quot;/22113/index.php/Python_Recap_and_Objects&quot; title=&quot;Python Recap and Objects&quot;&gt;Python Recap and Objects&lt;/a&gt; |Next: &lt;a href=&quot;/22113/index.php/Making_Functions&quot; title=&quot;Making Functions&quot;&gt;Making Functions&lt;/a&gt; |} == Required course material for the lesson == Powerpoint: [https://teaching.healthtech.dtu.dk/material/22113/22113_03-Regex.ppt Regular expressions in Python]&amp;lt;br&amp;gt; Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=04cb2b80-d941-42a5-a632-af27012cd0d7 Regular Expressions] Monday&amp;lt;br&amp;gt; Video: [http...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;__NOTOC__&lt;br /&gt;
{| width=500  style=&amp;quot;font-size: 10px; float:right; margin-left: 10px; margin-top: -56px;&amp;quot;&lt;br /&gt;
|Previous: [[Python Recap and Objects]]&lt;br /&gt;
|Next: [[Making Functions]]&lt;br /&gt;
|}&lt;br /&gt;
== Required course material for the lesson ==&lt;br /&gt;
Powerpoint: [https://teaching.healthtech.dtu.dk/material/22113/22113_03-Regex.ppt Regular expressions in Python]&amp;lt;br&amp;gt;&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=04cb2b80-d941-42a5-a632-af27012cd0d7 Regular Expressions] Monday&amp;lt;br&amp;gt;&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=738d525b-a6c1-4973-b6cc-af27012ca86e An (unfortunately) true story] Monday&amp;lt;br&amp;gt;&lt;br /&gt;
Resource: [[Example code - Regex]]&amp;lt;br&amp;gt;&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=2952d382-7059-4691-8170-af27012bc9a1 Live Coding]&amp;lt;br&amp;gt;&lt;br /&gt;
PDF: [https://teaching.healthtech.dtu.dk/material/22113/regular-expressions-cheat-sheet-v2.pdf Regular Expressions Cheat Sheet]&amp;lt;br&amp;gt;&lt;br /&gt;
WWW: [http://regex101.com/ Web page where you can test your regular expressions]&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Subjects covered ==&lt;br /&gt;
* Regular expressions, duh.&lt;br /&gt;
* Patterns, how to design and use them.&lt;br /&gt;
&lt;br /&gt;
== Exercises to be handed in ==&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;You might recognize some of these exercises. You must ONLY use regex for your pattern recognition and extraction of single data points (like an accession number).&amp;#039;&amp;#039;&amp;#039;&amp;lt;br&amp;gt;&lt;br /&gt;
If you don&amp;#039;t know what &amp;#039;&amp;#039;&amp;#039;stateful parsing&amp;#039;&amp;#039;&amp;#039; is, [https://teaching.healthtech.dtu.dk/22101/index.php/Stateful_Parsing look here]&lt;br /&gt;
# Make a program that accepts a string as input from the keyboard. Use regular expressions (RE) to determine if the input is a number. The goal is to do this with a SINGLE regex.&amp;lt;br&amp;gt; These should all be considered numbers: &amp;quot;4&amp;quot;   &amp;quot;-7&amp;quot;   &amp;quot;0.656&amp;quot;   &amp;quot;-67.35555&amp;quot;&amp;lt;br&amp;gt; These are not numbers: &amp;quot;5.&amp;quot;  &amp;quot;56F&amp;quot;  &amp;quot;.32&amp;quot;  &amp;quot;-.04&amp;quot;  &amp;quot;1+1&amp;quot;&amp;lt;br&amp;gt; Note: The program is very simple, but this is likely the most difficult regular expression you will have to write in this set of exercises. Consider doing the following exercises before attempting this one, just to get some experience first.&lt;br /&gt;
# Make a program that can read and verify a fasta file. Test with &amp;#039;&amp;#039;dna7.fsa&amp;#039;&amp;#039; and &amp;#039;&amp;#039;dnanoise.fsa&amp;#039;&amp;#039;. Verification here means that the program prints &amp;quot;DNA fasta&amp;quot; or &amp;quot;Protein fasta&amp;quot; if the file is successfully verified as either DNA or protein sequence, and &amp;quot;Not fasta&amp;quot; if verification fails. You can find a description of the fasta format in [[Biological knowledge needed in the course]]. You are expected to know which symbols are used for DNA and protein sequences, or to be able to look them up.&lt;br /&gt;
# Change exercise 2 in the following way: Make the program discard entries that cannot conform to DNA or protein sequence, and write the acceptable entries to the output file &amp;#039;&amp;#039;fastaout.fsa&amp;#039;&amp;#039;, formatted with the normal 60 characters per line and no spaces in between. The program must inform the user how many entries were kept and how many were discarded. Test on &amp;#039;&amp;#039;dnanoise.fsa&amp;#039;&amp;#039;, which contains 3 entries that should be discarded - this is a strong hint.&lt;br /&gt;
# The last exercises all concern the files &amp;#039;&amp;#039;data1-4.gb&amp;#039;&amp;#039;, which are various Genbank entries of genes. First, study the files and notice the structure of the data. In all exercises you will have to parse the files (read them and find the wanted data) using REs, which are well suited to that purpose. This is a build-up process: every exercise adds to the previous ones, so the final program can do a lot. Your program should be able to handle all files (so test them), but just one at a time.&lt;br /&gt;
# Extract the accession number, the definition and the organism (and print it).&lt;br /&gt;
# Extract and print all MEDLINE article numbers which are mentioned in the entries.&lt;br /&gt;
# Extract and print the translated gene (the amino acid sequence). Look for the line starting with /translation=. Generalize: an amino acid sequence can be short, i.e. only one line in the feature table, or long, i.e. more than one line in the feature table. Use stateful parsing.&lt;br /&gt;
# Extract and print the DNA (the whole base sequence at the end of the file). Use stateful parsing.&lt;br /&gt;
# &amp;lt;font color=&amp;quot;#AA00FF&amp;quot;&amp;gt;Extract and print ONLY the coding DNA. That is described in FEATURES - CDS (Coding DNA Sequence). As an example, the line in &amp;#039;&amp;#039;data1.gb&amp;#039;&amp;#039; says &amp;#039;join(2424..2610,3397..3542)&amp;#039;, which means that the coding sequence is bases 2424-2610 followed by bases 3397-3542. The bases in between are an intron and not part of the coding DNA. Remember to generalize; there can be more (or fewer) than two exons, and the &amp;#039;join&amp;#039; line can continue on the next line. Use stateful parsing.&amp;lt;/font&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Exercises for extra practice ==&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>