<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22126/index.php?action=history&amp;feed=atom&amp;title=QuantitativeMetagenomics</id>
	<title>QuantitativeMetagenomics - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22126/index.php?action=history&amp;feed=atom&amp;title=QuantitativeMetagenomics"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;action=history"/>
	<updated>2026-05-01T12:02:47Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;diff=193&amp;oldid=prev</id>
		<title>Gabre at 13:11, 14 January 2025</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;diff=193&amp;oldid=prev"/>
		<updated>2025-01-14T13:11:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 15:11, 14 January 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;H3&amp;gt;Overview&amp;lt;/H3&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;H3&amp;gt;Overview&amp;lt;/H3&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;If you need to use metagenomics for your final project, we have a more thorough workflow that you will need to use [[https://teaching.healthtech.dtu.dk/22136/index.php/22136:Course_plan_autumn_2020 here]].&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Since metagenomics data is often very large, it requires a lot of computational resources and time, we have cheated a little bit and prepared some data for you in advance!&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Since metagenomics data is often very large, it requires a lot of computational resources and time, we have cheated a little bit and prepared some data for you in advance!&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In this exercise we have done the assembly and counting across a cohort of hundreds of human fecal&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In this exercise&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;, &lt;/ins&gt;we have done the assembly and counting across a cohort of hundreds of human fecal&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;samples in advance and in addition provide the gene-wise taxonomy and the BMI of the&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;samples in advance and in addition provide the gene-wise taxonomy and the BMI of the&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;human donors.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;human donors.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;From this data we shall estimate the species richness, diversity and look at the effect of&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;From this data&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;, &lt;/ins&gt;we shall estimate the species richness,&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;and  &lt;/ins&gt;diversity and look at the effect of&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;downsizing. Furthermore we shall see if we can identify any differences between the&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;downsizing. Furthermore&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;, &lt;/ins&gt;we shall see if we can identify any differences between the&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;microbiome of lean and obese.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;microbiome of lean and obese.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Gabre</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;diff=190&amp;oldid=prev</id>
		<title>Gabre at 11:57, 13 January 2025</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;diff=190&amp;oldid=prev"/>
		<updated>2025-01-13T11:57:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 13:57, 13 January 2025&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l20&quot;&gt;Line 20:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 20:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now, let’s load the &amp;quot;vegan&amp;quot; package and thereafter load the read count data from a series of stool samples.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Now, let’s load the &amp;quot;vegan&amp;quot; package and thereafter load the read count data from a series of stool samples.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;library(&amp;quot;vegan&amp;quot;)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;library(&amp;quot;vegan&amp;quot;)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;load(url(&quot;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;http&lt;/del&gt;://teaching.healthtech.dtu.dk/material/22126/Counts_NGS.RData&quot;))&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;load(url(&quot;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;https&lt;/ins&gt;://teaching.healthtech.dtu.dk/material/22126/Counts_NGS.RData&quot;))&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;head(Counts)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;head(Counts)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;str(Counts)&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;str(Counts)&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l35&quot;&gt;Line 35:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 35:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;H3&amp;gt;Species&amp;lt;/H3&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;H3&amp;gt;Species&amp;lt;/H3&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Lets get the genes associated to species. Here is the gene-wise species taxonomy&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Lets get the genes associated to species. Here is the gene-wise species taxonomy&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;load(url(&quot;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;http&lt;/del&gt;://teaching.healthtech.dtu.dk/material/22126//taxonomy_species.RData&quot;))&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;load(url(&quot;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;https&lt;/ins&gt;://teaching.healthtech.dtu.dk/material/22126//taxonomy_species.RData&quot;))&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;head(taxonomy_species)&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;head(taxonomy_species)&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We then combine (by summing) the read counts pr. gene to read counts per species.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We then combine (by summing) the read counts pr. gene to read counts per species.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l150&quot;&gt;Line 150:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 150:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;gt; load(url(&quot;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;http&lt;/del&gt;://teaching.healthtech.dtu.dk/material/22126/BMI.RData&quot;))&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;gt; load(url(&quot;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;https&lt;/ins&gt;://teaching.healthtech.dtu.dk/material/22126/BMI.RData&quot;))&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;gt; boxplot(BMI$BMI.kg.m2 ~ BMI$Class, col=c(&amp;quot;red&amp;quot;, &amp;quot;gray&amp;quot;,&amp;quot;blue&amp;quot;), ylab=&amp;quot;BMI&amp;quot;)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;gt; boxplot(BMI$BMI.kg.m2 ~ BMI$Class, col=c(&amp;quot;red&amp;quot;, &amp;quot;gray&amp;quot;,&amp;quot;blue&amp;quot;), ylab=&amp;quot;BMI&amp;quot;)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;/pre&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Gabre</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;diff=47&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot; &lt;H3&gt;Overview&lt;/H3&gt; If you need to use metagenomics for your final project, we have a more thorough workflow that you will need to use https://teaching.healthtech.dtu.dk/22136/index.php/22136:Course_plan_autumn_2020 here.   Since metagenomics data is often very large, it requires a lot of computational resources and time, we have cheated a little bit and prepared some data for you in advance!  In this exercise we have done the assembly and counting across a cohort of...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=QuantitativeMetagenomics&amp;diff=47&amp;oldid=prev"/>
		<updated>2024-03-19T15:45:53Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; &amp;lt;H3&amp;gt;Overview&amp;lt;/H3&amp;gt; If you need to use metagenomics for your final project, we have a more thorough workflow that you will need to use &lt;a href=&quot;/22126/index.php?title=Https://teaching.healthtech.dtu.dk/22136/index.php/22136:Course_plan_autumn_2020_here&amp;amp;action=edit&amp;amp;redlink=1&quot; class=&quot;new&quot; title=&quot;Https://teaching.healthtech.dtu.dk/22136/index.php/22136:Course plan autumn 2020 here (page does not exist)&quot;&gt;https://teaching.healthtech.dtu.dk/22136/index.php/22136:Course_plan_autumn_2020 here&lt;/a&gt;.   Since metagenomics data is often very large, it requires a lot of computational resources and time, we have cheated a little bit and prepared some data for you in advance!  In this exercise we have done the assembly and counting across a cohort of...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
&amp;lt;H3&amp;gt;Overview&amp;lt;/H3&amp;gt;&lt;br /&gt;
If you need to use metagenomics for your final project, we have a more thorough workflow that you will need to use [[https://teaching.healthtech.dtu.dk/22136/index.php/22136:Course_plan_autumn_2020 here]].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Since metagenomics data is often very large and requires a lot of computational resources and time, we have cheated a little bit and prepared some data for you in advance!&lt;br /&gt;
&lt;br /&gt;
In this exercise, we have done the assembly and counting across a cohort of hundreds of human fecal&lt;br /&gt;
samples in advance and, in addition, provide the gene-wise taxonomy and the BMI of the&lt;br /&gt;
human donors.&lt;br /&gt;
From this data, we shall estimate the species richness and diversity and look at the effect of&lt;br /&gt;
downsizing. Furthermore, we shall see if we can identify any differences between the&lt;br /&gt;
microbiomes of lean and obese donors.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;H3&amp;gt;Becoming a pirate&amp;lt;/H3&amp;gt;&lt;br /&gt;
This exercise uses R, either locally (install RStudio on your own machine) or on the server by typing:&lt;br /&gt;
&amp;lt;pre&amp;gt;R&amp;lt;/pre&amp;gt;&lt;br /&gt;
First, if you are running RStudio locally, you will need to install a package called &amp;quot;vegan&amp;quot;:&lt;br /&gt;
&amp;lt;pre&amp;gt;install.packages(&amp;quot;vegan&amp;quot;)&amp;lt;/pre&amp;gt;&lt;br /&gt;
Now, let’s load the &amp;quot;vegan&amp;quot; package and thereafter load the read count data from a series of stool samples.&lt;br /&gt;
&amp;lt;pre&amp;gt;library(&amp;quot;vegan&amp;quot;)&lt;br /&gt;
load(url(&amp;quot;https://teaching.healthtech.dtu.dk/material/22126/Counts_NGS.RData&amp;quot;))&lt;br /&gt;
head(Counts)&lt;br /&gt;
str(Counts)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q1. How many samples do we have and how many genes?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
The different samples may have been sequenced to different depths. Try counting the reads per sample:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
sampleDepth&amp;lt;-(colSums(Counts))&lt;br /&gt;
hist(sampleDepth, breaks=100, ylab=&amp;quot;Number of samples&amp;quot;, xlab=&amp;quot;Number of reads&amp;quot;, main=&amp;quot;Sample depth&amp;quot;)&lt;br /&gt;
range(sampleDepth)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q2. What is the sample depth range?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;H3&amp;gt;Species&amp;lt;/H3&amp;gt;&lt;br /&gt;
Let’s get the genes associated with species. Here is the gene-wise species taxonomy:&lt;br /&gt;
&amp;lt;pre&amp;gt;load(url(&amp;quot;https://teaching.healthtech.dtu.dk/material/22126//taxonomy_species.RData&amp;quot;))&lt;br /&gt;
head(taxonomy_species)&amp;lt;/pre&amp;gt;&lt;br /&gt;
We then combine (by summing) the read counts per gene into read counts per species.&lt;br /&gt;
&amp;lt;pre&amp;gt;taxCounts&amp;lt;-apply(Counts, 2, tapply, INDEX=taxonomy_species, sum)&amp;lt;/pre&amp;gt;&lt;br /&gt;
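As a hedged illustration of what this summing step does, here is a minimal sketch on a made-up toy matrix. The names toyCounts, toyTaxonomy and toyTaxCounts and all values are hypothetical and not part of the course data. Each column (sample) is split by species label, and the gene counts within each species are summed.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# Toy data (hypothetical): 4 genes (rows) x 2 samples (columns)&lt;br /&gt;
toyCounts &amp;lt;- matrix(c(1, 2, 3, 4,&lt;br /&gt;
                      0, 5, 0, 6),&lt;br /&gt;
                    nrow = 4,&lt;br /&gt;
                    dimnames = list(paste0(&amp;quot;gene&amp;quot;, 1:4), c(&amp;quot;sampleA&amp;quot;, &amp;quot;sampleB&amp;quot;)))&lt;br /&gt;
# One species label per gene, in the same order as the rows of toyCounts&lt;br /&gt;
toyTaxonomy &amp;lt;- c(&amp;quot;speciesX&amp;quot;, &amp;quot;speciesX&amp;quot;, &amp;quot;speciesY&amp;quot;, &amp;quot;speciesY&amp;quot;)&lt;br /&gt;
# For every column, sum the gene counts that share a species label&lt;br /&gt;
toyTaxCounts &amp;lt;- apply(toyCounts, 2, tapply, INDEX = toyTaxonomy, sum)&lt;br /&gt;
toyTaxCounts&lt;br /&gt;
# sampleA: speciesX = 3, speciesY = 7; sampleB: speciesX = 5, speciesY = 6&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The result has one row per species and one column per sample, which is the same layout that taxCounts should have.&lt;br /&gt;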
Try looking at the taxCounts matrix:&lt;br /&gt;
&amp;lt;pre&amp;gt;str(taxCounts)&lt;br /&gt;
head(taxCounts)&amp;lt;/pre&amp;gt;&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q3. How many species are there in total?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;H3&amp;gt;Richness and Diversity&amp;lt;/H3&amp;gt;&lt;br /&gt;
What are the species richness and diversity (Shannon) for the different samples?&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q4. What does a high Shannon diversity index mean?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
OK, let&amp;#039;s see it for our samples:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
species_richness&amp;lt;-(colSums(taxCounts&amp;gt;0))&lt;br /&gt;
names(species_richness)&amp;lt;-NULL&lt;br /&gt;
require(vegan)&lt;br /&gt;
speciesDiversity&amp;lt;-diversity(t(taxCounts), index = &amp;quot;shannon&amp;quot;)&lt;br /&gt;
names(speciesDiversity)&amp;lt;-NULL &lt;br /&gt;
par(mfrow=c(1,1))&lt;br /&gt;
barplot(sort(species_richness), las=3, main=&amp;quot;Species richness&amp;quot;, xlab=&amp;quot;Samples&amp;quot;, ylab=&amp;quot;Richness&amp;quot;)&lt;br /&gt;
barplot(sort(speciesDiversity), xlab=&amp;quot;Samples&amp;quot;, las=3, main=&amp;quot;Diversity (Shannon)&amp;quot;)&lt;br /&gt;
plot(species_richness,speciesDiversity,xlab=&amp;quot;Richness&amp;quot;, ylab=&amp;quot;Shannon diversity index&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:raw_richness.png]][[File:raw_shannon.png]][[File:raw_richnessVSshannonZoom.png]]&lt;br /&gt;
&lt;br /&gt;
The first two plots show the richness and the diversity of each sample (person); the third plot shows each sample as a dot, with richness on the x-axis and diversity on the y-axis.&lt;br /&gt;
&amp;lt;H3&amp;gt;Downsizing or rarefying&amp;lt;/H3&amp;gt;&lt;br /&gt;
But this was on the raw count data with different sampling depth (number of counts) per sample. We should downsize so that we get fair comparisons.&lt;br /&gt;
&lt;br /&gt;
First, suggest the number of reads to keep per sample when downsizing (the target). If we choose a low target, we lose abundance resolution and detection sensitivity. If we choose a higher target, we lose samples, because samples with fewer reads than the target are discarded.&lt;br /&gt;
&amp;lt;pre&amp;gt;plot(sampleDepth, pch=20, log=&amp;quot;y&amp;quot;, xlab=&amp;quot;Samples&amp;quot;, ylab=&amp;quot;Number of reads&amp;quot;)&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:raw_sampledepth.png]]&lt;br /&gt;
&lt;br /&gt;
There is no right answer (but there are less good suggestions). Insert the number you want to downsize to below and plot it again - the samples above the horizontal line we will keep and the samples below the line we will throw out.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
downsizeTarget &amp;lt;- INSERT NUMBER&lt;br /&gt;
plot(sampleDepth, pch=20, log=&amp;quot;y&amp;quot;, xlab=&amp;quot;Samples&amp;quot;, ylab=&amp;quot;Number of reads&amp;quot;); abline(h=downsizeTarget)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:downsized_sampledepth.png]]&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q5. Which threshold did you choose and why? How many samples did you lose?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
OK, let&amp;#039;s downsize:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
# scale every sample (column) so that its counts sum to approximately downsizeTarget&lt;br /&gt;
dz_Counts&amp;lt;-round(t(t(Counts)*downsizeTarget/sampleDepth))&lt;br /&gt;
weak_samples&amp;lt;-sampleDepth&amp;lt;downsizeTarget&lt;br /&gt;
dz_Counts[,weak_samples]&amp;lt;-NA # samples that did not make the cut&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This is a quick and dirty downsizing (ideally one would resample the reads to a given depth, but that would take days).&lt;br /&gt;
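&lt;br /&gt;
For a single sample, the resampling approach could look roughly like the sketch below: draw a fixed number of reads without replacement and count them per gene. The count vector and the names toyGeneCounts, toyTarget and rarefyOne are hypothetical; vegan also provides rrarefy() for this purpose.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
set.seed(1)   # make the random draw reproducible&lt;br /&gt;
toyGeneCounts &amp;lt;- c(geneA=50, geneB=30, geneC=15, geneD=5)   # invented counts, 100 reads in total&lt;br /&gt;
toyTarget &amp;lt;- 40   # hypothetical downsizing target&lt;br /&gt;
rarefyOne &amp;lt;- function(x, target) {&lt;br /&gt;
  reads &amp;lt;- rep(seq_along(x), times=x)        # one entry per read, labelled by gene index&lt;br /&gt;
  picked &amp;lt;- sample(reads, size=target)       # sample reads without replacement&lt;br /&gt;
  out &amp;lt;- tabulate(picked, nbins=length(x))   # count the sampled reads per gene&lt;br /&gt;
  names(out) &amp;lt;- names(x)&lt;br /&gt;
  out&lt;br /&gt;
}&lt;br /&gt;
rz &amp;lt;- rarefyOne(toyGeneCounts, toyTarget)&lt;br /&gt;
rz&lt;br /&gt;
sum(rz)   # equals toyTarget&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
Unlike the scaling above, this keeps integer counts without rounding, but it has to be repeated for every sample, which is slow for a large gene catalogue.&lt;br /&gt;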
Count the species again, now on the downsized data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dz_taxCounts&amp;lt;-apply(dz_Counts, 2, tapply, INDEX=taxonomy_species, sum); gc() &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
And the richness and diversity again, now on downsized data&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dz_species_richness&amp;lt;-(colSums(dz_taxCounts&amp;gt;0))&lt;br /&gt;
names(dz_species_richness)&amp;lt;-NULL&lt;br /&gt;
require(vegan)&lt;br /&gt;
dz_speciesDiversity&amp;lt;-diversity(t(dz_taxCounts), index = &amp;quot;shannon&amp;quot;)&lt;br /&gt;
dz_speciesDiversity[weak_samples]&amp;lt;-NA&lt;br /&gt;
names(dz_speciesDiversity)&amp;lt;-NULL&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now plot the richness and diversity with downsized data&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
par(mfrow=c(1,1), pch=1)&lt;br /&gt;
barplot(sort(dz_species_richness), las=3, main=&amp;quot;Species richness (Downsized)&amp;quot;, xlab=&amp;quot;Samples&amp;quot;, ylab=&amp;quot;Richness&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:downsized_richness.png]]&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
barplot(sort(dz_speciesDiversity), las=3,main=&amp;quot;Shannon&amp;#039;s diversity index (downsized)&amp;quot;, xlab=&amp;quot;Samples&amp;quot;, ylab=&amp;quot;Shannon diversity&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:downsized_shannon.png]]&lt;br /&gt;
&lt;br /&gt;
And compare to the raw data&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
plot(dz_species_richness,species_richness, xlab=&amp;quot;downsized richness&amp;quot;, ylab=&amp;quot;raw richness&amp;quot;, main=&amp;quot;Richness&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:Comparing_richness.png]]&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
plot(dz_speciesDiversity,speciesDiversity,xlab=&amp;quot;downsized species diversity&amp;quot;, ylab=&amp;quot;raw species diversity&amp;quot;,main=&amp;quot;Diversity (Shannon)&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:Comparing_shannon.png]]&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q6. What is the effect of downsizing on richness?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q7. What is the effect of downsizing on diversity (Shannon)?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Let&amp;#039;s plot the abundance of each species in a sample with high diversity and in a sample with low diversity. You should be able to see a clear difference between the two samples!&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
par(mfrow=c(1,2))&lt;br /&gt;
barplot(taxCounts[,4], main=&amp;quot;Person 4, SD &amp;gt; 3&amp;quot;, xaxt=&amp;quot;n&amp;quot;, xlab=&amp;quot;Species&amp;quot;, ylab=&amp;quot;Normalized abundance&amp;quot;)&lt;br /&gt;
barplot(taxCounts[,240], main=&amp;quot;Person 240, SD &amp;lt; 0.5&amp;quot;, xaxt=&amp;quot;n&amp;quot;, xlab=&amp;quot;Species&amp;quot;, ylab=&amp;quot;Normalized abundance&amp;quot;)&lt;br /&gt;
par(mfrow=c(1,1))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:comparing_species_abundance.png]]&lt;br /&gt;
&lt;br /&gt;
&amp;lt;H3&amp;gt;Comparisons&amp;lt;/H3&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now let&amp;#039;s see if there is a difference between the microbiomes of lean and obese humans. But first, load some more sample information: BMI and Class.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
load(url(&amp;quot;http://teaching.healthtech.dtu.dk/material/22126/BMI.RData&amp;quot;))&lt;br /&gt;
boxplot(BMI$BMI.kg.m2 ~ BMI$Class, col=c(&amp;quot;red&amp;quot;, &amp;quot;gray&amp;quot;,&amp;quot;blue&amp;quot;), ylab=&amp;quot;BMI&amp;quot;)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
[[File:bmi_class.png]]&lt;br /&gt;
&lt;br /&gt;
The classes are: le = Lean; ow = Overweight; ob = Obese&lt;br /&gt;
&lt;br /&gt;
First, let us see whether the abundance of E. coli differs between obese and lean individuals using a Wilcoxon rank sum test (look for the p-value in the output). Let&amp;#039;s also get the mean abundance of E. coli in the three groups:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
wilcox.test(x=dz_taxCounts[&amp;quot;Escherichia coli&amp;quot;,BMI$Classification==&amp;quot;ob&amp;quot;], y=dz_taxCounts[&amp;quot;Escherichia coli&amp;quot;,BMI$Classification==&amp;quot;le&amp;quot;] )&lt;br /&gt;
tapply(dz_taxCounts[&amp;quot;Escherichia coli&amp;quot;,], BMI$Classification, mean, na.rm=TRUE)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q8. Is there any significant difference in abundance of E. coli between the different BMI groups?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Let&amp;#039;s test all species correcting for multiple testing using Benjamini-Hochberg (False Discovery Rate) (we are testing 120 species) and plot them:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
pval&amp;lt;-apply(dz_taxCounts, 1, function(V){wilcox.test(x=V[BMI$Classification==&amp;quot;ob&amp;quot;],y=V[BMI$Classification==&amp;quot;le&amp;quot;])$p.value})&lt;br /&gt;
Abundance_ratio&amp;lt;-log2(apply(dz_taxCounts, 1,function(V){mean(x=V[BMI$Classification==&amp;quot;ob&amp;quot;], na.rm=TRUE)/mean(V[BMI$Classification==&amp;quot;le&amp;quot;], na.rm=TRUE)}))&lt;br /&gt;
pval.adjust = p.adjust(pval, method=&amp;quot;BH&amp;quot;)&lt;br /&gt;
plot(sort(pval.adjust), log=&amp;quot;y&amp;quot;, pch=16, xlab=&amp;quot;Species&amp;quot;, ylab=&amp;quot;Adjusted p-values&amp;quot;)&lt;br /&gt;
abline(h=0.05, col=&amp;quot;grey&amp;quot;, lty=2)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
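&lt;br /&gt;
To make the Benjamini-Hochberg correction concrete, the sketch below recomputes it by hand for five invented p-values and compares the result with p.adjust(). Each p-value is multiplied by n/rank, and a running minimum taken from the largest p-value downwards keeps the adjusted values monotone. The names toyP, toyN, toyOrder and manualBH are hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
toyP &amp;lt;- c(0.001, 0.008, 0.039, 0.041, 0.60)   # invented p-values&lt;br /&gt;
toyN &amp;lt;- length(toyP)&lt;br /&gt;
toyOrder &amp;lt;- order(toyP, decreasing=TRUE)      # largest p-value first&lt;br /&gt;
ranks &amp;lt;- toyN:1                               # rank of each p-value in that order&lt;br /&gt;
adj &amp;lt;- pmin(1, cummin(toyP[toyOrder] * toyN / ranks))   # scale, then enforce monotonicity&lt;br /&gt;
manualBH &amp;lt;- numeric(toyN)&lt;br /&gt;
manualBH[toyOrder] &amp;lt;- adj                     # back to the original order&lt;br /&gt;
manualBH&lt;br /&gt;
all.equal(manualBH, p.adjust(toyP, method=&amp;quot;BH&amp;quot;))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;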
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q9. How many species are significant at a false discovery rate &amp;lt; 0.05?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Let us look at the abundance of the 10 most significant species.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
o&amp;lt;-order(pval)&lt;br /&gt;
BMIstat&amp;lt;-data.frame(pval,pval.adjust, Abundance_ratio)[o,]&lt;br /&gt;
BMIstat[1:10,]&lt;br /&gt;
par(mar=c(5,18,5,5))&lt;br /&gt;
barplot(BMIstat[1:10,3], names.arg=rownames(BMIstat)[1:10], las=1,xlab=&amp;quot;log2 fold difference (obese vs. lean)&amp;quot;, horiz=TRUE)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[File:log_fold_diff_sign.png]]&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q10. Can you see any differences in the abundances - which species have large differences, what are their p-values?&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q11. What type of bacteria is the most significant one? [try google]&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;H3&amp;gt;Beta-diversity and PCA&amp;lt;/H3&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Plot the Bray-Curtis distance between samples as a heatmap.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
library(RColorBrewer)&lt;br /&gt;
library(gplots)&lt;br /&gt;
vdist = as.matrix(vegdist(t(taxCounts)))&lt;br /&gt;
rownames(vdist) = colnames(vdist)&lt;br /&gt;
hmcol = colorRampPalette(brewer.pal(9, &amp;quot;GnBu&amp;quot;))(100)&lt;br /&gt;
heatmap.2(vdist, trace=&amp;#039;none&amp;#039;, col=rev(hmcol))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
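&lt;br /&gt;
vegdist() uses the Bray-Curtis dissimilarity by default. As a minimal sketch of what that number means, here it is computed by hand for two invented samples: the sum of the absolute count differences divided by the total count of both samples. The names toyX, toyY and brayCurtis are hypothetical.&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
toyX &amp;lt;- c(10, 0, 5)   # invented species counts, sample X&lt;br /&gt;
toyY &amp;lt;- c(5, 5, 5)    # invented species counts, sample Y&lt;br /&gt;
brayCurtis &amp;lt;- function(x, y) sum(abs(x - y)) / sum(x + y)&lt;br /&gt;
brayCurtis(toyX, toyY)   # 10 / 30&lt;br /&gt;
brayCurtis(toyX, toyX)   # 0 for identical samples&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The value ranges from 0 (identical composition) to 1 (no shared species).&lt;br /&gt;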
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q12. Can you see some clusters of samples?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Finally for the PCA:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
my.rda &amp;lt;- rda(t(taxCounts))&lt;br /&gt;
biplot(my.rda, display = c(&amp;quot;sites&amp;quot;, &amp;quot;species&amp;quot;), type = c(&amp;quot;text&amp;quot;, &amp;quot;points&amp;quot;))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q13. Can you see which species seem to be driving the differences between the samples?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;H3&amp;gt;Statistically modelling the variance using DESeq2&amp;lt;/H3&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Now, we will see the power of statistically modelling the variance instead of downsizing.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
if (!requireNamespace(&amp;quot;BiocManager&amp;quot;, quietly = TRUE))&lt;br /&gt;
    install.packages(&amp;quot;BiocManager&amp;quot;)&lt;br /&gt;
BiocManager::install(&amp;quot;DESeq2&amp;quot;)&lt;br /&gt;
library(DESeq2)&lt;br /&gt;
cts &amp;lt;- taxCounts&lt;br /&gt;
# one-column sample table built from the first column of BMI&lt;br /&gt;
coldata = matrix(NA, nrow=nrow(BMI), ncol=1)&lt;br /&gt;
coldata[,1] = as.vector(BMI[,1])&lt;br /&gt;
rownames(coldata) = rownames(BMI)&lt;br /&gt;
colnames(coldata) = &amp;quot;BMI&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Take a look at coldata&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
coldata&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Make sure that the individuals in coldata (the sample information) match the columns of the count data, in the same order. The following should return TRUE:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
all(rownames(coldata) == colnames(cts))&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Load data into DESeq format, perform statistical analysis and get results&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
dds &amp;lt;- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ BMI)&lt;br /&gt;
dds &amp;lt;- DESeq(dds)&lt;br /&gt;
res &amp;lt;- results(dds)&lt;br /&gt;
res&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Order the results by p-value and show the most significant species:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
resOrdered &amp;lt;- res[order(res$pvalue),]&lt;br /&gt;
head(resOrdered)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q14. Which are the most significant species (google)? Is there an overlap between these and the species found using downsizing + Wilcoxon test (what you did above)?&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Please find answers [[QuantitativeMetagenomicsSolution|here]]&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>