<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://teaching.healthtech.dtu.dk/22126/index.php?action=history&amp;feed=atom&amp;title=Cancerseq_exercise_answers</id>
	<title>Cancerseq exercise answers - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://teaching.healthtech.dtu.dk/22126/index.php?action=history&amp;feed=atom&amp;title=Cancerseq_exercise_answers"/>
	<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=Cancerseq_exercise_answers&amp;action=history"/>
	<updated>2026-04-14T04:30:47Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.41.0</generator>
	<entry>
		<id>https://teaching.healthtech.dtu.dk/22126/index.php?title=Cancerseq_exercise_answers&amp;diff=69&amp;oldid=prev</id>
		<title>WikiSysop: Created page with &quot;&#039;&#039;&#039;Q1&#039;&#039;&#039;  We run: &lt;pre&gt; gatk Mutect2  -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa  -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-N-WEX_recaled.bam     -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-T-WEX_recaled.bam        -normal TCRBOA2-N-WEX         -L chr10:3100000-5100000   --germline-resource /home/databases/databases/GRCh38/somatic-hg38_af-only-gnomad.hg38.vcf.gz   -O TCRBOA2.vcf.gz &lt;/pre&gt;  Then either: &lt;pre&gt;...&quot;</title>
		<link rel="alternate" type="text/html" href="https://teaching.healthtech.dtu.dk/22126/index.php?title=Cancerseq_exercise_answers&amp;diff=69&amp;oldid=prev"/>
		<updated>2024-03-19T16:00:13Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;&amp;#039;&amp;#039;&amp;#039;Q1&amp;#039;&amp;#039;&amp;#039;  We run: &amp;lt;pre&amp;gt; gatk Mutect2  -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa  -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-N-WEX_recaled.bam     -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-T-WEX_recaled.bam        -normal TCRBOA2-N-WEX         -L chr10:3100000-5100000   --germline-resource /home/databases/databases/GRCh38/somatic-hg38_af-only-gnomad.hg38.vcf.gz   -O TCRBOA2.vcf.gz &amp;lt;/pre&amp;gt;  Then either: &amp;lt;pre&amp;gt;...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;#039;&amp;#039;&amp;#039;Q1&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
We run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gatk Mutect2  -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa  -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-N-WEX_recaled.bam     -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-T-WEX_recaled.bam        -normal TCRBOA2-N-WEX         -L chr10:3100000-5100000   --germline-resource /home/databases/databases/GRCh38/somatic-hg38_af-only-gnomad.hg38.vcf.gz   -O TCRBOA2.vcf.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then any of the following:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bcftools view -H TCRBOA2.vcf.gz|wc -l&lt;br /&gt;
bcftools stats TCRBOA2.vcf.gz&lt;br /&gt;
zgrep -v &amp;quot;^#&amp;quot; TCRBOA2.vcf.gz|wc -l &lt;br /&gt;
zcat TCRBOA2.vcf.gz |grep -v &amp;quot;^#&amp;quot;|wc -l &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Each of these reports 9 variants.&lt;br /&gt;
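&lt;br /&gt;
When counting records with grep, the anchor in the pattern matters: only header lines start with '#', but a record could in principle contain '#' somewhere in its INFO column. A minimal sketch on a made-up excerpt (the toy records and the NOTE key are hypothetical, not taken from the Mutect2 output):&lt;br /&gt;

```shell
# Hypothetical VCF excerpt: one header line and two records.
# The second record carries a '#' inside its INFO column.
toy_vcf() {
  printf '##fileformat=VCFv4.2\n'
  printf 'chr10\t100\t.\tA\tT\t.\tPASS\tDP=10\n'
  printf 'chr10\t200\t.\tG\tC\t.\tPASS\tNOTE=a#b\n'
}

# Anchored pattern: drops only lines that start with '#'.
count_anchored() { grep -vc '^#'; }

# Unanchored pattern: drops every line containing '#' anywhere.
count_unanchored() { grep -vc '#'; }

toy_vcf | count_anchored     # prints 2
toy_vcf | count_unanchored   # prints 1
```

For this exercise both patterns are expected to agree, but the anchored form is the safer habit.&lt;br /&gt;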
&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q2&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
First we run:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gatk HaplotypeCaller -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-T-WEX_recaled.bam -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa -L chr10:3100000-5100000 -O TCRBOA2-T.vcf.gz --dbsnp /home/databases/databases/GRCh38/Homo_sapiens_assembly38.dbsnp138.vcf.gz&lt;br /&gt;
gatk HaplotypeCaller -I /home/projects/22126_NGS/exercises/cancer_seq/TCRBOA2-N-WEX_recaled.bam -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa -L chr10:3100000-5100000 -O TCRBOA2-N.vcf.gz --dbsnp /home/databases/databases/GRCh38/Homo_sapiens_assembly38.dbsnp138.vcf.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Then counting the number of variants in the tumor:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bcftools view -H TCRBOA2-T.vcf.gz | wc -l&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
424&lt;br /&gt;
&lt;br /&gt;
and normal sample:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bcftools view -H TCRBOA2-N.vcf.gz | wc -l &lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
413&lt;br /&gt;
&lt;br /&gt;
The tumor has more variants, which is expected because it carries somatic variants in addition to the germline ones.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q3&amp;#039;&amp;#039;&amp;#039; &lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
gatk FilterMutectCalls -V TCRBOA2.vcf.gz -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa -O TCRBOA2_filtered.vcf.gz&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
We can inspect the result visually:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
bcftools view -H TCRBOA2_filtered.vcf.gz |less -S&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Or, to tally how often each filter was applied:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
 bcftools view -H TCRBOA2_filtered.vcf.gz |cut -f 7   |tr &amp;quot;;&amp;quot; &amp;quot;\n&amp;quot; |sort |uniq -c  |sort&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
You get:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
      1 slippage&lt;br /&gt;
      2 haplotype&lt;br /&gt;
      2 PASS&lt;br /&gt;
      2 weak_evidence&lt;br /&gt;
      3 clustered_events&lt;br /&gt;
      3 map_qual&lt;br /&gt;
      4 strand_bias&lt;br /&gt;
      6 normal_artifact&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
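&lt;br /&gt;
Note that the tally sums to 23 even though Q1 reported 9 records, presumably because a record can carry several semicolon-separated filters and each one is counted separately. A small sketch of the same pipeline on made-up records (the toy data and the function names are illustrative assumptions):&lt;br /&gt;

```shell
# Hypothetical records: column 7 (FILTER) holds one or more
# semicolon-separated filter names.
toy_records() {
  printf 'chr10\t100\t.\tA\tT\t.\tPASS\tDP=10\n'
  printf 'chr10\t200\t.\tG\tC\t.\tmap_qual;strand_bias\tDP=12\n'
  printf 'chr10\t300\t.\tC\tG\t.\tstrand_bias\tDP=8\n'
}

# Same steps as above: take column 7, put each filter on its own
# line, then count identical names and sort by count.
tally_filters() { cut -f 7 | tr ';' '\n' | sort | uniq -c | sort -n; }

toy_records | tally_filters
```

Here three records yield four filter entries, with strand_bias counted twice.&lt;br /&gt;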
&lt;br /&gt;
See some notes on the meaning of these filters [https://www.biorxiv.org/content/biorxiv/early/2019/12/02/861054/DC1/embed/media-1.pdf?download=true here].&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q4&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
Running&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
java -jar /usr/local/bin/SnpSift.jar  annotate /home/databases/databases/GRCh38/Homo_sapiens_assembly38.dbsnp138.vcf.gz TCRBOA2_filtered.vcf.gz  | bgzip -c &amp;gt; TCRBOA2_filtered_anno.vcf.gz&lt;br /&gt;
bcftools view -H -f PASS TCRBOA2_filtered_anno.vcf.gz | less -S&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
should give you:&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
chr10	3165513	rs9423502	G	C	.	PASS	AS_FilterStatus=SITE;AS_SB_TABLE=39,119|0,3;DP=168;ECNT=1;GERMQ=93;MBQ=38,39;MFRL=225,184;MMQ=60,60;MPOS=4;NALOD=1.86;NLOD=21.37;POPAF=1.32;TLOD=6.08;CAF=[0.9454,0.05464];COMMON=1;G5;GNO;HD;KGPROD;KGPhase1;NSM;OTHERKG;PH3;REF;RS=9423502;RSPOS=3207705;S3D;SAO=0;SLO;SSR=0;VC=SNV;VLD;VP=0x050300000a01150517000101;WGT=1;dbSNPBuildID=119	GT:AD:AF:DP:F1R2:F2R1:SB	0/0:79,0:0.014:79:41,0:38,0:19,60,0,0	0/1:79,3:0.054:82:41,2:38,1:20,59,0,3&lt;br /&gt;
chr10	4972935	.	A	T	.	PASS	AS_FilterStatus=SITE;AS_SB_TABLE=8,84|0,4;DP=102;ECNT=1;GERMQ=93;MBQ=36,34;MFRL=302,495;MMQ=50,40;MPOS=20;NALOD=1.58;NLOD=11.08;POPAF=1.62;TLOD=8.82	GT:AD:AF:DP:F1R2:F2R1:SB	0/0:37,0:0.026:37:16,0:20,0:1,36,0,0	0/1:55,4:0.083:59:26,2:28,2:7,48,0,4&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The ID &amp;quot;rs9423502&amp;quot; is a dbSNP ID, so the SNP at chr10:3165513 has been reported before, whereas the one at chr10:4972935 has not.&lt;br /&gt;
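&lt;br /&gt;
The same check can be scripted by looking at column 3 (ID), which holds an rs number when SnpSift found a dbSNP match and '.' otherwise. A minimal sketch using the first five columns of the two PASS calls above (the function names and labels are illustrative assumptions):&lt;br /&gt;

```shell
# First five columns of the two PASS calls shown above.
pass_calls() {
  printf 'chr10\t3165513\trs9423502\tG\tC\n'
  printf 'chr10\t4972935\t.\tA\tT\n'
}

# Label each call by whether its ID column looks like an rs number.
label_dbsnp() {
  awk -F'\t' '{
    status = ($3 ~ /^rs[0-9]+$/) ? "in_dbsnp" : "not_in_dbsnp"
    print $1 ":" $2, status
  }'
}

pass_calls | label_dbsnp
# prints:
# chr10:3165513 in_dbsnp
# chr10:4972935 not_in_dbsnp
```

On the real file, the input would come from the bcftools view command in this answer rather than from the toy function.&lt;br /&gt;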
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q5&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
The SNP can be found here: https://www.ncbi.nlm.nih.gov/snp/?term=rs9423502&lt;br /&gt;
&lt;br /&gt;
Generally, the prevalence of the SNP is relatively low (2-5%), which suggests a potential role in driving cancer.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q6&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
The variant on chromosome 18 is missense, potentially deleterious and has the COSMIC ID COSV99493765. In the COSMIC database, it falls in the DYM gene and is mostly found mutated in liver and prostate.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q7&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Often enough, around 6% in certain cases.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Q8&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
Kidney, but the confidence is low because the prediction score is a virtual tie with liver.&lt;/div&gt;</summary>
		<author><name>WikiSysop</name></author>
	</entry>
</feed>