User contributions for WikiSysop

A user with 76 edits. Account created on 4 August 2025.

7 November 2025

6 November 2025

29 October 2025

17 October 2025

26 September 2025

  • 13:42, 26 September 2025 diff hist −9 m 22118/22168 Advanced Unix & Python for Bioinformaticians → Resources
  • 13:41, 26 September 2025 diff hist 0 m 22118/22168 Advanced Unix & Python for Bioinformaticians → Resources
  • 13:34, 26 September 2025 diff hist 0 N File:DNA.jpg: No edit summary current
  • 13:33, 26 September 2025 diff hist +4,874 N Biological knowledge needed in the course: Created page with "__NOTOC__ == Genetic information == The genetic information is stored in the DNA double helix strand, check [https://en.wikipedia.org/wiki/DNA Wikipedia on DNA]. A strand consists of a sequence of the 4 nucleotides (bases): Adenine (A), Thymine (T), Cytosine (C) and Guanine (G). A gene is a sequence of the 4 different nucleotides, where subsequent triplets of nucleotides (codons) are translated into a sequence of amino acids, which then forms the proteins of our body. A..." current
  • 13:32, 26 September 2025 diff hist +2,596 N Shortest path in graph: Created page with "__NOTOC__ === Description === The program is given as input a file containing connected nodes in a graph and a weight assigned to the edge between the nodes. The program shall answer the questions: Is there a path between two given nodes in the graph? If so, what is the shortest path?<br> This is useful in a number of situations: Protein interaction, which proteins interact together, thereby discovering e.g. new pathways. Social networks, who knows who, proving the "si..." current
  • 13:31, 26 September 2025 diff hist +1,846 N Positive proteins: Created page with "__NOTOC__ === Description === Find the top 1000 most positively charged protein sequences in uniprot and put them in a fasta file. Repeat the search but this time find the most positively charged protein sequences per molecular weight of the sequence and put that into another fasta file. Among the 20 common amino acids, five have a side chain which can be charged. At pH=7, two are negatively charged: aspartic acid (D) and glutamic acid (E) (acidic side chains), and three a..." current
  • 13:30, 26 September 2025 diff hist +1,614 N Spider toxins: Created page with "__NOTOC__ === Description === Find all spider toxins in uniprot and output them in a fasta file. Who knows when it will be useful to produce venom? === Input/output === Download the entire [https://teaching.healthtech.dtu.dk/material/22118/uniprot_sprot.dat.gz swissprot database]. This will be your input file to your program.<br> Unpack it yourself with gunzip uniprot_sprot.dat.gz or whatever method you prefer. Careful, it will take up 3 GB.<br> Notice there are many s..." current
  • 13:29, 26 September 2025 diff hist +1,493 N Fun with biology - find english words: Created page with "__NOTOC__ === Description === Parse the entire uniprot database and extract the ID and the sequences. Find English words that are hidden (actually occur randomly) in the sequences. The words must be between 3 and 10 letters long, both inclusive. Display or save in a file the ID together with the words found in the sequence, but only if the total number of letters is 5 or more for that entry. === Input/output === Download the entire [https://teaching.healthtech.dtu.dk/ma..." current
  • 13:28, 26 September 2025 diff hist +2,398 N Find short virus genes with disulfid bridges: Created page with "__NOTOC__ === Description === Find all short (150 aa or fewer) virus genes in uniprot that contain intrachain disulfide bridges. Interchain disulfide bonds can produce stable, covalently linked protein dimers, multimers or complexes, whereas intrachain disulfide bonds can contribute to protein folding and stability. === Input/output === Download the entire [https://teaching.healthtech.dtu.dk/material/22118/uniprot_sprot.dat.gz swissprot database]. This will be your input..." current
  • 13:27, 26 September 2025 diff hist +1,658 N Find the mature part of human genes with a signal peptide: Created page with "__NOTOC__ === Description === Find all human genes in uniprot with a signal peptide. Extract the entire sequence and create a fasta file with only the mature proteins. === Input/output === Download the entire [https://teaching.healthtech.dtu.dk/material/22118/uniprot_sprot.dat.gz swissprot database]. This will be your input file to your program.<br> Unpack it yourself with gunzip uniprot_sprot.dat.gz or whatever method you prefer. Careful, it will take up 3 GB.<br> Not..." current
  • 13:26, 26 September 2025 diff hist +1,543 N Heuristic methods for fair sharing: Created page with "__NOTOC__ === Description === Distributing jobs/items to a number of consumers in a fair way has a number of applications. In this project you must implement 5 methods mentioned in the [https://teaching.healthtech.dtu.dk/material/22112/HPCLife-LoadBalancing.ppt powerpoint]: Random Assignment, Round Robin, Max-Min Round Robin, Reverse Round Robin and Least Load. === Input/output === As can be seen from the powerpoint, there is some randomness in the input, i.e. the numb..." current
  • 13:25, 26 September 2025 diff hist +1,844 N Human genes with activities in more than one region of the cell: Created page with "__NOTOC__ === Description === Find human genes which are targeted to more than one region in the cell. Save the genes in fasta format. === Input/output === Download the entire [https://teaching.healthtech.dtu.dk/material/22118/uniprot_sprot.dat.gz swissprot database]. This will be your input file to your program.<br> Unpack it yourself with gunzip uniprot_sprot.dat.gz or whatever method you prefer. Careful, it will take up 3 GB.<br> Notice there are many swissprot entr..." current
  • 13:24, 26 September 2025 diff hist +5,590 N Score sequence data with a PSSM: Created page with "__NOTOC__ === Description === Position specific scoring matrices (PSSM) are statistically motivated sequence motif models that provide higher sensitivity and specificity than regular expressions. The project consists of reading a TRANSFAC matrix table, converting it to a log-likelihood matrix, which is used to find matching motifs in a DNA fasta file. Learn more about PSSM: https://en.wikipedia.org/wiki/Position_weight_matrix === Input and output === The program is giv..." current
  • 13:22, 26 September 2025 diff hist +662 N Mini projects: Created page with "This is a list of small projects. Writing the code for each project should be possible within a day or two (full time). No solutions exist for these projects, nor will they be made. * Project 1: Score sequence data with a PSSM * Project 2: Human genes with activities in more than one region of the cell * Project 3: Heuristic methods for fair sharing * Project 4: Find the mature part of human genes with a signal peptide * Project 5: Find short virus gen..." current
  • 12:48, 26 September 2025 diff hist +4,332 N Artificial Neural Network: Created page with "__NOTOC__ ===Description=== Implement a simple artificial neural network algorithm with backpropagation in Python. ANNs are of great interest in bioinformatics. The institute has created many online prediction servers, which utilise ANNs.<br> The data is a part of a project at DTU HealthTech, which is about prediction of whether certain variations of a SNP will lead to a disease or not. A lot of work has already gone into preparing a data set for network training. The r..." current
  • 12:47, 26 September 2025 diff hist +1,916 N Pairwise alignment: Created page with "__NOTOC__ ===Description=== Aligning sequences is of great importance in bioinformatics. Many discoveries are based on finding sequences that align to each other. Evolution theory and phylogeny are based on sequence alignments. This project is about implementing a well-known algorithm for aligning two sequences, i.e. finding where they match in an optimal fashion. You must choose to implement either: # Smith-Waterman alignment where the goal is to find the best local al..." current
  • 12:45, 26 September 2025 diff hist +6,909 N QT clustering: Created page with "__NOTOC__ ===Description=== The program reads a number of data points (multi-dimensional vectors) from a file and partitions those into clusters. Clustering is important in discovering patterns or modes in multi-dimensional data sets. It is also a method of organizing data examples into similar groups (clusters). In this particular case, QT clustering partitions the data set such that each example (data point) is assigned to exactly one cluster. QT clustering is superior..." current
  • 12:43, 26 September 2025 diff hist +5,509 N Read trimmer for Next-Generation-Sequencing data: Created page with "__NOTOC__ ===Description=== The advent of Next Generation Sequencing (NGS) technologies has transformed how biological research is being performed and today almost all biological fields use the technology for cutting edge discoveries. Today, a human genome can be sequenced in a very short time for approximately $1000, giving unprecedented possibilities for investigating human traits, evolution and diseases. Similarly, whole bacterial communities and their interplay with the..." current
  • 12:41, 26 September 2025 diff hist +3,133 N K-nearest neighbor (k-NN) continuous variable estimation: Created page with "__NOTOC__ ===Description=== This script reads a matrix-styled data file, containing missing values, and infers these values by finding the k-nearest neighbors. An application of this can be seen in microarray experiments, in which the observed signal is not always significantly different from the background signal. Imputing these values is a cheaper solution than redoing the whole experiment. This method has been shown to perform better than e.g. rowmeans, and fa..." current
  • 12:40, 26 September 2025 diff hist 0 m Resistance to antibiotics → Input and output current
  • 12:39, 26 September 2025 diff hist 0 N File:Absent.jpg: No edit summary current
  • 12:39, 26 September 2025 diff hist 0 N File:Present.jpg: No edit summary current