<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22126/index.php?action=history&amp;feed=atom&amp;title=Data_Preprocess_exercise_answers</id>
	<title>Data Preprocess exercise answers - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22126/index.php?action=history&amp;feed=atom&amp;title=Data_Preprocess_exercise_answers"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=Data_Preprocess_exercise_answers&amp;action=history"/>
	<updated>2026-04-07T07:39:26Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22126/index.php?title=Data_Preprocess_exercise_answers&amp;diff=285&amp;oldid=prev</id>
		<title>Mick at 12:16, 5 January 2026</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=Data_Preprocess_exercise_answers&amp;diff=285&amp;oldid=prev"/>
		<updated>2026-01-05T12:16:19Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 14:16, 5 January 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l135&quot;&gt;Line 135:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 135:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;It is normal that many forward reads now find themselves alone, because the reverse reads were much worse. Therefore, the unpaired output contains more forward reads than reverse reads.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;It is normal that many forward reads now find themselves alone, because the reverse reads were much worse. Therefore, the unpaired output contains more forward reads than reverse reads.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&#039;&#039;&#039;Q13&#039;&#039;&#039;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;pre&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;leeHom  --auto--ancientdna -fqo ERR4778296 -fq1 /home/projects/22126_NGS/exercises/preprocess/ex4/ERR4778296_1.fastq.gz -fq2 /home/projects/22126_NGS/exercises/preprocess/ex4/ERR4778296_2.fastq.gz &lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Total 250000; Merged (trimming) 237536; Merged (overlap) 9848; Kept PE/SR 2601; Trimmed SR 0; Adapter dimers/chimeras 15; Failed Key 0; UMI problems 0&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/pre&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;2601 were left as is, 1.04% (2601/250000) so very little.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Mick</name></author>
	</entry>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22126/index.php?title=Data_Preprocess_exercise_answers&amp;diff=22&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;&#039;&#039;&#039;Q1&#039;&#039;&#039;  &lt;pre&gt;  zcat /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz|head -n 2  |tail -1 |wc  -c 151 &lt;/pre&gt;  However, the answers is 150 as &quot;wc&quot; counts the end of line character  &#039;&#039;&#039;Q2&#039;&#039;&#039;  Running: &lt;pre&gt; fastqc -o .  /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz fastqc -o .  /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957868_1.fastq.gz &lt;/pre&gt;  SRR957824 is the worse run, the quality scores towards the end of the r...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=Data_Preprocess_exercise_answers&amp;diff=22&amp;oldid=prev"/>
		<updated>2024-03-19T15:26:58Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;&amp;#039;&amp;#039;&amp;#039;Q1&amp;#039;&amp;#039;&amp;#039;  &amp;lt;pre&amp;gt;  zcat /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz|head -n 2  |tail -1 |wc  -c 151 &amp;lt;/pre&amp;gt;  However, the answers is 150 as &amp;quot;wc&amp;quot; counts the end of line character  &amp;#039;&amp;#039;&amp;#039;Q2&amp;#039;&amp;#039;&amp;#039;  Running: &amp;lt;pre&amp;gt; fastqc -o .  /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz fastqc -o .  /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957868_1.fastq.gz &amp;lt;/pre&amp;gt;  SRR957824 is the worse run, the quality scores towards the end of the r...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Q1&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
zcat /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz | head -n 2 | tail -n 1 | wc -c&lt;br /&gt;
151&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
However, the answer is 150, because &amp;quot;wc -c&amp;quot; also counts the end-of-line character.&lt;br /&gt;
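&lt;br /&gt;
The off-by-one can be reproduced on a toy record. The following is a minimal sketch, not part of the exercise data: the helper name seq_length is hypothetical, and it simply strips the newline with tr before counting.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: length of the sequence line of the first FASTQ record
# read from standard input, not counting the end-of-line character.
seq_length() {
    head -n 2 | tail -n 1 | tr -d '\n' | wc -c | tr -d ' '
}

# Toy record (made-up data): plain wc -c would report 5 for this 4-base line.
printf '@read1\nACGT\n+\nIIII\n' | seq_length
```
&lt;br /&gt;
On the real file, the same helper could be fed from zcat instead of printf.&lt;br /&gt;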
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q2&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Running:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fastqc -o .  /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz&lt;br /&gt;
fastqc -o .  /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957868_1.fastq.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
SRR957824 is the worse run: its quality scores towards the end of the reads are low.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q3&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
SRR957868 is the acceptable run, but it has a warning in the &amp;quot;Overrepresented sequences&amp;quot; category and an error in &amp;quot;Adapter Content&amp;quot; due to the presence of untrimmed adapters. One solution is to trim the remaining adapter sequences.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q4&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATG -o  SRR957868_1_o.fastq.gz SRR957868_1.fastq.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Sequence: AGATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATG; Type: regular 3&amp;#039;; Length: 50; Trimmed: 59789 times&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So the adapter was trimmed 59789 times.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q5&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
&lt;br /&gt;
The number of times the adapter was trimmed would have been much lower, because the program would not have recognized the real adapter sequences and would only have removed random sequences that happened to resemble the erroneous one.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q6&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fastqc -o .  SRR957868_1_o.fastq.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The error in the &amp;quot;Adapter Content&amp;quot; section is gone. There is still a warning in the &amp;quot;Overrepresented sequences&amp;quot; category, but the program no longer reports the adapter sequence.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q7&amp;#039;&amp;#039;&amp;#039;  &lt;br /&gt;
&lt;br /&gt;
The command:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
fastp -Q -L   --adapter_sequence AGATCGGAAGAGCACACGTCTGAACTCCAGT --adapter_sequence_r2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGT    --out1 SRR794302_1_trimmed.fastq.gz --out2 SRR794302_2_trimmed.fastq.gz   --in1  /home/projects/22126_NGS/exercises/preprocess/ex2/SRR794302_1.fastq.gz --in2  /home/projects/22126_NGS/exercises/preprocess/ex2/SRR794302_2.fastq.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
This should give you the following statistics:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
Filtering result:&lt;br /&gt;
reads passed filter: 499988&lt;br /&gt;
reads failed due to low quality: 0&lt;br /&gt;
reads failed due to too many N: 0&lt;br /&gt;
reads with adapter trimmed: 11042&lt;br /&gt;
bases trimmed due to adapters: 138056&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Among the first reads of the trimmed forward file, the first one to have been trimmed is the shorter one:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
@SRR794302.7 HWI-ST434:134117522:C1N85ACXX:6:1101:1666:2094/1&lt;br /&gt;
GCCTTTCTCCTATCCATCCCCTACTGTTTCAGTGCACCACTGAAACTATTCCACCTCCATAGAGACTCCGCAGGGTGATGTCCTATCAGCCAGCTATGAG&lt;br /&gt;
+&lt;br /&gt;
@@?BDBD?,=C&amp;lt;CFF?EHC@FHEIIEDDFAEHH&amp;lt;F@CBD@AGC@FFHGCBHBFFD8(8B&amp;lt;CHIBGG@F&amp;gt;@AE:/&amp;#039;.;A3&amp;gt;(6.;@@(;;3;?B@3:5&amp;gt;AA&lt;br /&gt;
@SRR794302.8 HWI-ST434:134117522:C1N85ACXX:6:1101:1721:2099/1&lt;br /&gt;
TCACTTTGTCGGCCAGGCTGGAGTGCAGTTGTGCAATCTCAGTTTGTTGCAACATCTGCCTCCCAGGGTCAAGCAATTCTCATGCTTCA&lt;br /&gt;
+&lt;br /&gt;
;??DB;DDDH??DG::AC?FGAG?CGF@::B?BDFB*?@&amp;lt;D?FEG?D39DHH9FH&amp;gt;F4@==.6-7?EF=7;777&amp;gt;3;A&amp;gt;A&amp;gt;;;;CCEA&amp;gt;&lt;br /&gt;
@SRR794302.9 HWI-ST434:134117522:C1N85ACXX:6:1101:1687:2127/1&lt;br /&gt;
TAGAGGGACTAATCTAAAACTACCTTTTTTCAATTTAAGAACTTTGTTTTATTTACCAATTTAAGGGTGATAAGCTGTGAAGAAGTAATTTAGAACAACC&lt;br /&gt;
+&lt;br /&gt;
@@CDFFFFHHFFHIJJGIIHHIHJIJJIJJJFGIJJFEHGGGIJJIGIJJIIJJDHHEGIJJJIIIIDDACCCAEABCECBCC@C;&amp;gt;&amp;gt;CDCC@CA;?CBB&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
So &amp;quot;@SRR794302.8 HWI-ST434:134117522:C1N85ACXX:6:1101:1721:2099&amp;quot; was the first read to be trimmed. The original read was:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
@SRR794302.8 HWI-ST434:134117522:C1N85ACXX:6:1101:1721:2099/1&lt;br /&gt;
TCACTTTGTCGGCCAGGCTGGAGTGCAGTTGTGCAATCTCAGTTTGTTGCAACATCTGCCTCCCAGGGTCAAGCAATTCTCATGCTTCAAGATCGGAAGA&lt;br /&gt;
+&lt;br /&gt;
;??DB;DDDH??DG::AC?FGAG?CGF@::B?BDFB*?@&amp;lt;D?FEG?D39DHH9FH&amp;gt;F4@==.6-7?EF=7;777&amp;gt;3;A&amp;gt;A&amp;gt;;;;CCEA&amp;gt;A&amp;gt;&amp;gt;AC@?&amp;lt;?5&amp;lt;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
The &amp;quot;AGATCGGAAGA&amp;quot; at the end matches the start of the adapter sequence that we provided, which was &amp;quot;AGATCGGAAGAGCACACGTCTGAACTCCAGT&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
The same read pair is indeed the first one to be trimmed in the reverse reads:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
@SRR794302.8 HWI-ST434:134117522:C1N85ACXX:6:1101:1721:2099/2&lt;br /&gt;
TGAAGCATGAGAATTGCTTGACCCTGGGAGGCAGATGTTGCAACAAACTGAGATTGCACAACTGCACTCCAGCCTGGCCGACAAAGTGA&lt;br /&gt;
+&lt;br /&gt;
@@@DDDBDF8D&amp;gt;F@FA&amp;gt;F?FC??EHEEHG1?:1?;FDGEGC&amp;lt;BFAHEHGAGHEHHIICH&amp;lt;&amp;gt;;=DHI@;AAA&amp;gt;&amp;gt;?CE?BCB&amp;gt;&amp;gt;/;9?C&amp;gt;3&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
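This makes sense because the original DNA fragment was probably 89 bp long: both reads ran through the whole fragment into the adapter, and the trimmed reverse read is the reverse complement of the trimmed forward read.&lt;br /&gt;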
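&lt;br /&gt;
One way to check that the two trimmed reads cover the same fragment is to reverse-complement the forward read and compare it with the reverse read. This is a minimal sketch: the helper name revcomp is hypothetical, it assumes the rev utility is installed, and it only handles uppercase A, C, G and T. The two sequences are copied from the trimmed reads shown above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: reverse-complement each line read from standard input.
# Only uppercase A, C, G and T are complemented; other characters pass through.
revcomp() {
    rev | tr ACGT TGCA
}

# Trimmed forward and reverse sequences of read SRR794302.8.
fwd=TCACTTTGTCGGCCAGGCTGGAGTGCAGTTGTGCAATCTCAGTTTGTTGCAACATCTGCCTCCCAGGGTCAAGCAATTCTCATGCTTCA
rev_read=TGAAGCATGAGAATTGCTTGACCCTGGGAGGCAGATGTTGCAACAAACTGAGATTGCACAACTGCACTCCAGCCTGGCCGACAAAGTGA

# Report whether the reverse read equals the reverse complement of the forward read.
if [ "$(printf '%s\n' "$fwd" | revcomp)" = "$rev_read" ]; then
    echo "reverse read is the reverse complement of the forward read"
else
    echo "reads differ"
fi
```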
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q8&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
According to the message when the program finishes:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
reads with adapter trimmed: 11042&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
11042 sequences out of 499988 were trimmed. &lt;br /&gt;
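&lt;br /&gt;
To express such counts as a fraction, a small helper can be used. This is a minimal sketch: the function name percent is hypothetical, and the two numbers are the ones reported by fastp above.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: print part as a percentage of total, with two decimals.
percent() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * part / total }'
}

# Reads with adapter trimmed, relative to the reads that passed the filter.
percent 11042 499988
```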
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q9&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
The longer the insert size, the fewer adapters will be detected, because a read only runs into the adapter when the insert is shorter than the read length. Short inserts therefore lead to more adapters being detected and trimmed.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q10&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
The reverse reads have particularly low quality scores towards the end of the reads.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q11&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Trimmomatic should have removed a substantial number of sequences with low base quality; the improvement should be more noticeable on the reverse reads.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q12&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
zcat SRR8002634_1U.fastq.gz|wc -l &lt;br /&gt;
384672&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
So 384672/4 = 96168 reads, since each FASTQ record spans four lines.&lt;br /&gt;
&lt;br /&gt;
A quicker way:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
echo $(zcat SRR8002634_1U.fastq.gz|wc -l)/4|bc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
96168&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
echo $(zcat SRR8002634_2U.fastq.gz|wc -l)/4|bc&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
4770&lt;br /&gt;
&lt;br /&gt;
It is normal that many forward reads now find themselves alone, because the reverse reads were much worse. Therefore, the unpaired output contains more forward reads than reverse reads.&lt;br /&gt;
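&lt;br /&gt;
The line count divided by four can also be wrapped in a reusable function. This is a minimal sketch, assuming four-line FASTQ records: the name count_reads is hypothetical, and the toy input below is made-up data rather than the exercise files.&lt;br /&gt;
&lt;br /&gt;
```shell
# Hypothetical helper: number of FASTQ records in gzip-compressed input.
# With file arguments it reads those files; without arguments it reads
# standard input. Assumes every record spans exactly four lines.
count_reads() {
    gzip -dc "$@" | awk 'END { print NR / 4 }'
}

# Toy input with two records, compressed on the fly.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c | count_reads
```
&lt;br /&gt;
On the exercise output, it would be called with a file name, for example count_reads SRR8002634_1U.fastq.gz.&lt;br /&gt;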
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q13&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
leeHom --auto --ancientdna -fqo ERR4778296 -fq1 /home/projects/22126_NGS/exercises/preprocess/ex4/ERR4778296_1.fastq.gz -fq2 /home/projects/22126_NGS/exercises/preprocess/ex4/ERR4778296_2.fastq.gz&lt;br /&gt;
Total 250000; Merged (trimming) 237536; Merged (overlap) 9848; Kept PE/SR 2601; Trimmed SR 0; Adapter dimers/chimeras 15; Failed Key 0; UMI problems 0&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Only 2601 sequences were left as is, i.e. 1.04% (2601/250000), so very few.&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>