<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?action=history&amp;feed=atom&amp;title=Sets</id>
	<title>Sets - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?action=history&amp;feed=atom&amp;title=Sets"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Sets&amp;action=history"/>
	<updated>2026-05-03T20:43:54Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Sets&amp;diff=55&amp;oldid=prev</id>
		<title>WikiSysop: /* Required course material for the lesson */</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Sets&amp;diff=55&amp;oldid=prev"/>
		<updated>2024-08-30T18:57:35Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Required course material for the lesson&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 20:57, 30 August 2024&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l5&quot;&gt;Line 5:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;|}&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;|}&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Required course material for the lesson ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Required course material for the lesson ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_12-Sets.ppt Sets]&amp;lt;br&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=e69a7b62-4f83-4b92-b254-af27012c464e Sets]&amp;lt;br&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=e69a7b62-4f83-4b92-b254-af27012c464e Sets]&amp;lt;br&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_12-Sets.ppt Sets]&amp;lt;br&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Subjects covered ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== Subjects covered ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Sets&amp;diff=34&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;__NOTOC__ {| width=500  style=&quot;float:right; margin-left: 10px; margin-top: -56px;&quot; |Previous: Simple Pattern Matching |Next: Dictionaries |} == Required course material for the lesson == Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=e69a7b62-4f83-4b92-b254-af27012c464e Sets]&lt;br&gt; Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_12-Sets.ppt Sets]&lt;br&gt;  == Subjects covered == * Sets, which are unordered data collections with no dupli...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk:443/22101/index.php?title=Sets&amp;diff=34&amp;oldid=prev"/>
		<updated>2024-03-01T16:00:53Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;__NOTOC__ {| width=500  style=&amp;quot;float:right; margin-left: 10px; margin-top: -56px;&amp;quot; |Previous: &lt;a href=&quot;/22101/index.php/Simple_Pattern_Matching&quot; title=&quot;Simple Pattern Matching&quot;&gt;Simple Pattern Matching&lt;/a&gt; |Next: &lt;a href=&quot;/22101/index.php/Dictionaries&quot; title=&quot;Dictionaries&quot;&gt;Dictionaries&lt;/a&gt; |} == Required course material for the lesson == Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=e69a7b62-4f83-4b92-b254-af27012c464e Sets]&amp;lt;br&amp;gt; Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_12-Sets.ppt Sets]&amp;lt;br&amp;gt;  == Subjects covered == * Sets, which are unordered data collections with no dupli...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;__NOTOC__&lt;br /&gt;
{| width=500  style=&amp;quot;float:right; margin-left: 10px; margin-top: -56px;&amp;quot;&lt;br /&gt;
|Previous: [[Simple Pattern Matching]]&lt;br /&gt;
|Next: [[Dictionaries]]&lt;br /&gt;
|}&lt;br /&gt;
== Required course material for the lesson ==&lt;br /&gt;
Video: [https://panopto.dtu.dk/Panopto/Pages/Viewer.aspx?id=e69a7b62-4f83-4b92-b254-af27012c464e Sets]&amp;lt;br&amp;gt;&lt;br /&gt;
Powerpoint: [https://teaching.healthtech.dtu.dk/material/22101/22101_12-Sets.ppt Sets]&amp;lt;br&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Subjects covered ==&lt;br /&gt;
* Sets, which are unordered data collections with no duplicates.&amp;lt;br&amp;gt;&lt;br /&gt;
* Methods relevant to sets:&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;clear&amp;#039;&amp;#039;&amp;#039;, clears a set&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;add&amp;#039;&amp;#039;&amp;#039;, adds an element to the set,&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;update&amp;#039;&amp;#039;&amp;#039;, adds several elements to the set, performance,&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;remove&amp;#039;&amp;#039;&amp;#039;, removes an element from the set, error if not present,&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;discard&amp;#039;&amp;#039;&amp;#039;, removes an element from the set, no error,&lt;br /&gt;
** &amp;#039;&amp;#039;&amp;#039;in&amp;#039;&amp;#039;&amp;#039;, determines whether an element exists in the set,&lt;br /&gt;
** mathematical set operations, intersection (&amp;amp;), union (|)&lt;br /&gt;
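A minimal Python sketch of the methods listed above. The codon strings are illustrative values only and are not taken from the course files.&lt;br /&gt;

```python
# Illustrative values only; these strings are assumptions, not course data.
codons = set()                        # start with an empty set
codons.add("ATG")                     # add one element
codons.add("ATG")                     # adding a duplicate changes nothing
codons.update(["TAA", "TAG"])         # add several elements in one call
codons.discard("CCC")                 # element not present: silently ignored
try:
    codons.remove("CCC")              # element not present: raises KeyError
    removed = True
except KeyError:
    removed = False
has_start = "ATG" in codons           # membership test
stops = {"TAA", "TAG", "TGA"}
common = codons.intersection(stops)   # method form of the intersection operator
either = codons | stops               # union
size_before_clear = len(codons)
codons.clear()                        # empty the set
```

The intersection is written with the method form here; the operator form listed above gives the same result for two sets.&lt;br /&gt;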
&lt;br /&gt;
== Exercises to be handed in ==&lt;br /&gt;
# Read the sequences in the file &amp;#039;&amp;#039;dna7.fsa&amp;#039;&amp;#039;. Find out which and how many of the 64 codons are &amp;#039;&amp;#039;&amp;#039;not&amp;#039;&amp;#039;&amp;#039; used anywhere in the sequences. Print the unused codons.&lt;br /&gt;
# &amp;lt;font color=&amp;quot;#AA00FF&amp;quot;&amp;gt;You have made a program (let&amp;#039;s call it the X-program), which as input takes a file of accession numbers, &amp;#039;&amp;#039;start10.dat&amp;#039;&amp;#039;, and produces some output, which is in &amp;#039;&amp;#039;res10.dat&amp;#039;&amp;#039;. Now you count the lines in your input file and your output file and you discover that the line counts do not match. Horror - your program does not produce output for some input. Now the assignment is to discover which accession numbers did not produce output. This can be done in various ways, but now you have to use a set. Print the results.&amp;lt;/font&amp;gt;&lt;br /&gt;
# &amp;lt;font color=&amp;quot;#AA00FF&amp;quot;&amp;gt;In the file &amp;#039;&amp;#039;ex5.acc&amp;#039;&amp;#039; are a lot of accession numbers, where some are duplicates. We have earlier cleaned this file of duplicates. Let&amp;#039;s do that again using a set. Make a program that reads the file once, finds the unique accession numbers and writes them to the file &amp;#039;&amp;#039;uniq5.acc&amp;#039;&amp;#039;.&amp;lt;/font&amp;gt;&lt;br /&gt;
# In the &amp;#039;&amp;#039;data1.gb&amp;#039;&amp;#039; file there are 6 references (to articles). Make a program that extracts all authors from the references, eliminates those that are duplicates and prints the list of authors. You will notice that some authors seem to be the same person using different initials. You should only consider a person a duplicate if the name matches exactly. This should also work for the other GenBank entries: &amp;#039;&amp;#039;data2.gb&amp;#039;&amp;#039;, &amp;#039;&amp;#039;data3.gb&amp;#039;&amp;#039; &amp;amp; &amp;#039;&amp;#039;data4.gb&amp;#039;&amp;#039;.&amp;lt;br&amp;gt;Beware: there are traps in this exercise, so check your output carefully.&lt;br /&gt;
&lt;br /&gt;
== Exercises for extra practice ==&lt;br /&gt;
A k-mer is a piece of sequence k units long. It can be either a protein or a DNA sequence. As an example, take this DNA sequence: GACTAC. It contains the following 4-mers: GACT, ACTA, CTAC.&lt;br /&gt;
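The definition above can be sketched in Python as follows. The function name kmers is a hypothetical helper introduced here for illustration, and the sequence is the GACTAC example from the text.&lt;br /&gt;

```python
def kmers(sequence, k):
    """Return all overlapping k-mers of sequence as a list, in order.

    Hypothetical helper for illustration; not part of the course material.
    """
    # A sequence of length n contains n - k + 1 overlapping windows of length k.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


found = kmers("GACTAC", 4)   # the example sequence from the text
unique = set(found)          # a set keeps each distinct k-mer once
```

For this example, found holds GACT, ACTA and CTAC in that order.&lt;br /&gt;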
* From a GenBank file like &amp;#039;&amp;#039;data1-4.gb&amp;#039;&amp;#039; extract the DNA sequence (been there - done that). Now insert all 5-mers in the DNA sequence into a set. Compute the following numbers: how many did you insert in the set, how many are in the set, how many different 5-mers could there possibly be. Display. It may seem like the first two numbers are the same, but this is not guaranteed to be true. Why?&lt;br /&gt;
* Calculate the overlap of 5-mers between any two of the &amp;#039;&amp;#039;data1-4.gb&amp;#039;&amp;#039; files. Just ask for two filenames and calculate for these.&lt;br /&gt;
* Make a program that asks for a number (integer) then reads the &amp;#039;&amp;#039;dna7.fsa&amp;#039;&amp;#039; file (which contains insulin-like genes) and saves the entry selected by the number in &amp;#039;&amp;#039;selected.fsa&amp;#039;&amp;#039; and the rest of the entries into the file &amp;#039;&amp;#039;rest.fsa&amp;#039;&amp;#039;.&lt;br /&gt;
* Now that you can select a fasta entry, then read the &amp;#039;&amp;#039;selected.fsa&amp;#039;&amp;#039; and create 5-mers from the sequence. Now read the entries from &amp;#039;&amp;#039;rest.fsa&amp;#039;&amp;#039; and for every entry create the 5-mers from the sequence. Report which sequence in &amp;#039;&amp;#039;rest.fsa&amp;#039;&amp;#039; had the greatest overlap (and how much overlap) with the selected sequence. This must be the sequence that looks most like the selected one.&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>