User contributions for WikiSysop
19 March 2024
- 16:34, 19 March 2024 +8 Denovo exercise No edit summary current
- 16:31, 19 March 2024 +20,412 N Denovo exercise Created page with "<H2>Overview</H2> First: <OL> <LI>Navigate to your home directory: <LI>Create a directory called "denovo" <LI>Navigate to the directory you just created. </OL> <p>In this exercise we will try to perform de novo assembly of Illumina paired-end reads. The data is from a <i>Vibrio cholerae</i> strain isolated in Nepal. You will try to: <OL> <LI>Run FastQC, adaptor and quality trimming reads (Optional - repeat of analysis you have already done in data pre-processing)..."
- 16:31, 19 March 2024 +7,265 N SNP calling exercise answers Created page with "'''Q1''' First, running: <pre> tabix -f -p vcf NA24694.gvcf.gz </pre> then <pre> gatk --java-options "-Xmx10g" HaplotypeCaller -R /home/databases/references/human/GRCh38_full_analysis_set_plus_decoy_hla.fa -I /home/projects/22126_NGS/exercises/snp_calling/NA24694.bam -L chr20 -O NA24694.gvcf.gz --dbsnp /home/databases/databases/GRCh38/Homo_sapiens_assembly38.dbsnp138.vcf.gz -ERC GVCF </pre> <pre> gatk GenotypeGVCFs -R /home/databases/references/human/GRCh3..." current
- 16:30, 19 March 2024 +16,677 N SNP calling exercise Created page with "<H2>Overview</H2> First: <OL> <LI>Navigate to your home directory: <LI>Create a directory called "variant_call" <LI>Navigate to the directory you just created. </OL> We will: <OL> <LI>Genotype some whole-genome sequencing data. <LI>Get acquainted with VCF files <LI>Soft filtering <LI>Hard filtering <LI> Annotation of variants </OL> ---- <H2>Genotyping</H2> We will genotype a chromosome from a BAM file that has been processed using the steps we detailed before. It i..." current
- 16:29, 19 March 2024 +1,834 N Postprocess exercise answers Created page with "'''Q1''' Running: <pre> java -jar /home/ctools/picard_2.23.8/picard.jar MarkDuplicates -I /home/projects/22126_NGS/exercises/dupremoval/ERR016028_chr20_sort.bam -M ERR016028_chr20_sort_markdup.metrics.txt -O ERR016028_chr20_sort_markdup.bam </pre> The log should state: <pre> Marking 9798 records as duplicates. </pre> Please note that this is very low but that is because we have very little data so that it runs faster. '''Q2''' They do not have the same sequence:..." current
- 16:29, 19 March 2024 +3,530 N Postprocess exercise Created page with "<H2>Overview</H2> First: <OL> <LI>Navigate to your home directory: <LI>Create a directory called "postalign" <LI>Navigate to the directory you just created. </OL> In this exercise, we will pre-process bam-files so they are ready for SNP calling. This is necessary to reduce the high number of potential false SNPs that will get called. You will try to: <OL> <LI>Mark read duplicates from the BAM-files <LI>Merge BAM files </OL> <H2>Duplicate removal</H2> <p>We are..." current
- 16:28, 19 March 2024 +4,533 N Alignment exercise answers Created page with "'''Q1:''' 3 possible ways: * The file with the smaller file size contains the trimmed reads. * Peek in the file and determine which file contains reads of uneven lengths * Use fastqc, to determine which file contains overrepresented adapter sequences '''Q2:''' 4 lines if you have added the RG '''Q3:''' 166782 '''Q4:''' 0, this means that the probability of being mismapped is one. This means that this read cannot be confidently assigned to this position. '''Q5:''' The..." current
- 16:27, 19 March 2024 +13,848 N Alignment exercise Created page with " <H2>Overview</H2> First: <OL> <LI>Navigate to your home directory: <LI>Create a directory called "align" <LI>Navigate to the directory you just created. </OL> We will try to align different types of NGS data. # <i>Pseudomonas alcaligenes</i> single-end Illumina reads # Human single-end paired-end Illumina reads <H2><i>P. aeruginosa</i> single-end Illumina reads</H2> <H3>Alignment using bwa mem</H3> <p> We will align some of the single-end reads that we trimmed f..." current
- 16:26, 19 March 2024 +5,351 N Data Preprocess exercise answers Created page with "'''Q1''' <pre> zcat /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz|head -n 2 |tail -1 |wc -c 151 </pre> However, the answers is 150 as "wc" counts the end of line character '''Q2''' Running: <pre> fastqc -o . /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957824_1.fastq.gz fastqc -o . /home/projects/22126_NGS/exercises/preprocess/ex1/SRR957868_1.fastq.gz </pre> SRR957824 is the worse run, the quality scores towards the end of the r..." current
- 16:26, 19 March 2024 +12,804 N Data Preprocess exercise Created page with " <H3>Overview</H3> First: <OL> <LI>Navigate to your home directory: <LI>Create a directory called "preprocess" <LI>Navigate to the directory you just created. </OL> We will try to pre-process several types of NGS data. # <i>Escherichia coli</i> single-end Illumina reads # <i>Pseudomonas aeruginosa</i> paired-end Illumina reads <HR> <h2><i>Escherichia coli</i> single-end Illumina reads</h2> <h3>Introduction</h3> <p> An outbreak of <i>E. coli</i> has occurred. Peop..." current
- 16:25, 19 March 2024 +317 N Data basics exercise answers Created page with "Answers: # S or L # X, I or J # X # Yes, quality scores picked from [<=>?@ABCDEFGHI], either very good quality (+33) or very poor (+64). # D = 68, 68-33 = 35, -> p[error] = 10^[-3.5] = 0.00031622776 = 1/3162 This really goes to show that having metadata from the sequencing run is essential for proper analysis." current
- 16:25, 19 March 2024 +3,384 N Data basics exercise Created page with " <p>This is a small exercise where we will try to identify the quality encoding of some reads.</p> <HR> <H3>Read quality encoding table</H3> We have seen that the fastq format encodes quality scores which represent the probability of an error. '''Beware''' because there are many different types of encoding for quality scores. The table below summarizes it. This table is adapted from Wikipedia article on [https://en.wikipedia.org/wiki/FASTQ_format FASTQ format]: <pre>..." current
- 16:24, 19 March 2024 +4,770 N Zip codes answers Created page with " Please note that in UNIX, there is more than one way to do things. '''Q1:''' <pre> zcat /home/projects/22126_NGS/exercises/unix/ZIP_CODES.csv.gz |awk 'BEGIN{FS=","}{if($5=="\"NY\""){print $0}}'|wc -l </pre> <ol> <li><code>zcat</code>: Decompresses the ZIP_CODES.csv.gz file and outputs its content.</li> <li><code>awk 'BEGIN{FS=","}</code>: Sets the field separator to a comma for the CSV file.</li> <li><code>if($5=="\"NY\""){print $0}</code>: Checks if the 5th field (st..." current
- 16:23, 19 March 2024 +2,402 N Zip codes Created page with " <H2>Extra fun with US zip codes</H2> <p>If you are 100% done with everything, you can have fun with the following exercise involving [https://en.wikipedia.org/wiki/ZIP_Code US zip codes]. This is mostly for people with previous Unix experience. </p> You will find the following file: <pre> /home/projects/22126_NGS/exercises/unix/ZIP_CODES.csv.gz </pre> No need to copy it or unzip it. You can view it with '''zcat''' or '''zless'''. csv stands for comma-separated val..." current
- 16:23, 19 March 2024 +1,452 N First look exercise answers Created page with " <H2> Solutions </H2> Illumina data: 1. <pre> cd </pre> 2. <pre> mkdir first_look/ </pre> 3. <pre> cp /data/shared/exercises/first_look/reads.fastq.gz . </pre> 4. <pre> zless -S reads.fastq.gz </pre> 5. <pre> zcat /data/shared/exercises/first_look/reads.fastq.gz |wc -l </pre> 1000 lines so 1000/4 250 sequences. 1. <pre> tar xvfz /data/shared/exercises/first_look/pairedReads.tar.gz </pre> 2. <pre> head ERR243038_1.fastq ERR243038_2.fastq </pre>..." current
- 16:22, 19 March 2024 +9,367 N First look exercise Created page with " <H2>Overview</H2> <p>In this exercise you will try to look at empirical NGS data. Additionally, you will try to use the '''screen''' command when using the shell. </p> <OL> <LI>Use standard UNIX commands to work with NGS data <LI>Use '''screen''' in shell </OL> <HR> <H2>First look at data</H2> <OL> <LI>Navigate to your home directory: <pre> cd </pre> '''cd''' without arguments will bring you back to your home directory. In our case, your home is: <pre> /home/people..." current
- 16:21, 19 March 2024 +2 Program 2024 →Course Program - January 2024 current
- 16:18, 19 March 2024 +5,409 N Unix answers Created page with " 1. Use a text editor to (nedit/gedit/komodo/textwrangler) to create a file mycommands.txt where you write all commands and observations you do in the following exercises. Use copy/paste to copy the commands. Note: There are more standard text editors than nedit, etc. Examples are emacs, xemacs, vi, vim, and pico. Make sure that we can easily see which exercise you attempt to solve. 2. First list the files in the directory. <pre> ls </pre> 3. Copy ex1.acc to myfile.ac..." current
- 16:17, 19 March 2024 +90 Logging on to pupil system No edit summary current
- 16:14, 19 March 2024 +4,741 N Logging on to pupil system Created page with " <HR> <H2>Overview</H2> In this exercise, we will prepare our computers to log on to our servers called "pupilX", where X is 1/2/3. These are small but reliable machines. Please read the instructions carefully. Please be aware: there is '''no''' backup. All of your data will be deleted when the class concludes. <H2>Are you physically at DTU?</H2> First, make sure you are connected to the internet. If you are on campus, you do not need to do anything extra, however,..."
- 15:59, 19 March 2024 +12,776 N Program 2024 Created page with " '''REMEMBER TO BRING A LAPTOP FOR EXERCISES''' Lectures will be in person in building [https://goo.gl/maps/k4wYkMjTJ2HLHuyN8 303A] in auditorium 44. Offline discussions will take place on Discord (https://discord.gg/7PKuKhKYQJ). Please register with your '''full name'''. Will use Discord for online classes and collaboration with your project partners. <!-- Lectures and exercises will take place on Discord (https://discord.gg/FBb2edFW). Please register with your ful..."
- 15:57, 19 March 2024 −70 22126 - Next Generation Sequencing Analysis No edit summary
- 15:56, 19 March 2024 0 22126 - Next Generation Sequencing Analysis No edit summary
- 15:55, 19 March 2024 0 22126 - Next Generation Sequencing Analysis No edit summary
- 15:54, 19 March 2024 +3,282 N 22126 - Next Generation Sequencing Analysis Created page with "<small>Introduction to Next-Generation Sequencing Analysis, 5 ECTS</small> <hr> <br> [http://kurser.dtu.dk/course/36626 DTU's Studies Handbook about #36626] [http://kurser.dtu.dk/course/36826 DTU's Studies Handbook about #36826]<be> Program 2024 Program 2023 Program 2022 Program 2021 Program 2020 Program 2019 The next course will be held in January 2024, the course runs every day for a three weeks period. The course consists of lectures, e..."
- 15:52, 19 March 2024 +43 N MediaWiki:Mainpage Created page with "22126 - Next Generation Sequencing Analysis" current
- 15:51, 19 March 2024 0 N MediaWiki:Disclaimers Created blank page current
- 15:51, 19 March 2024 0 N MediaWiki:Aboutsite Created blank page current
- 15:50, 19 March 2024 0 N MediaWiki:Privacy Created blank page current