February 3rd 2020

Who am I

Agenda for the Day

Course Introduction

First time ever!

  • You’re the first batch of students in this course!
  • Therefore, I will develop and adjust the course as we go along
  • Please approach me with feedback

Course Resources

Online

Text Book(s)

  • Main: R for Data Science by Garrett Grolemund and Hadley Wickham
  • Also: Statistical Inference via Data Science: A ModernDive into R and the Tidyverse by Chester Ismay and Albert Y. Kim
  • Also: Mastering Shiny by Hadley Wickham
  • Also: Introduction to Data Science: Data Analysis and Prediction Algorithms with R by Rafael A. Irizarry

(See schedule for details)

Course format

  • Active Learning: Strong emphasis on students working, rather than me talking
  • The first ~1h will be a recap of exercises and an introduction to the subject of the day
  • The next ~3h will be hands-on exercises
  • Communication: Slack!

Student Composition

Course Expectation

  • General course objectives from DTU Course Base:
    • The overall aim of the course is to provide students with a toolkit of concrete skills in modern bio data science in Tidyverse R via the RStudio IDE. There will be a strong application-oriented focus on moving from a messy to a clean data set, followed by data transformation, gaining insight through exploratory data analysis (EDA), and communication through data visualization using ggplot, all in the context of reproducible data analysis. Furthermore, the focus will be on the design and organization of a modern bio data science project in Tidyverse R. In this course we will work with biology-related data sets.
  • Pre-course questionnaire

  • Expected workload: 1 ECTS ~28h, so 5 ECTS ~140h, distributed over 13 weeks and an exam, i.e. ~10h/week

Reproducibility and Replicability in modern Bio Data Science

“How to organize a project - The most important talk you never heard!”

Getting started with RStudio and R Markdown

RStudio Cloud

Exercises: R - The very basics

The very basics