Introduction

R is a powerful language for data science, used in many research disciplines, but it has a steep learning curve. The tidyverse group of packages provides a dialect that greatly simplifies:

  • data importing
  • cleaning
  • processing
  • visualization

It also supports reproducible workflows using pipelines (%>%).
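
To give a flavour of what such a pipeline looks like, here is a minimal sketch. It assumes the dplyr package is installed and uses the built-in mtcars data set as a stand-in for imported data; the column name kpl and the object name summary_by_cyl are chosen for illustration only and are not part of the course material.

```r
library(dplyr)

# mtcars ships with R; here it stands in for data you would normally import
summary_by_cyl <- mtcars %>%
  filter(!is.na(mpg)) %>%            # cleaning: drop rows with a missing mpg
  mutate(kpl = mpg * 0.425144) %>%   # processing: miles per US gallon to km per litre
  group_by(cyl) %>%                  # one group per number of cylinders
  summarise(mean_kpl = mean(kpl), n = n())

print(summary_by_cyl)
```

Each step reads top to bottom, and the output of one line becomes the input of the next, which is what makes this kind of workflow easy to rerun and review.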

Adopt the philosophy of Hadley Wickham, Chief Scientist at RStudio: take each step of data science and replace many intricacies of R with clear, consistent and easy-to-learn syntax. RStudio will be the software to use since it eases package management, scripting, plotting and data handling.

The four-day course provides a complete introduction to data science in R with the tidyverse. The course will not go deep into statistics but will focus on getting data ready, some exploratory analysis, visualization and handling models.

Preparing data takes up to 90% of the time spent in analysis; speeding this up is the mission of this course.

Tidyverse

The tidyverse is an official CRAN package with a published manifesto. Hadley Wickham proposed the following workflow, described in his must-read book R for Data Science:

H. Wickham - R for data science, licence CC

In terms of R packages, the workflow is nicely depicted in this picture by David Robinson.

Requirements

Prior knowledge

Participants should have basic experience in a programming environment such as Matlab, Octave or another programming language, or should complete a simple free online course such as the one offered by DataCamp.

Material

Each student must bring their own laptop with recent versions of R and RStudio installed. Please follow the install tutorial to set them up prior to the course.

Schedule

Dates and time

From: 19 to 22 February 2018.

Each day, the workshop will be a mixture of lectures and practicals from:

  • 9:30 - 12:30
  • 13:30 - 18:00

Program

Date        Time   Session                 Teacher  Notes
2018-02-19  9:30   Introduction            AG
            11:00  Tidy data               RK
            12:00  String manipulation     RK       stringr
            13:30  RMarkdown               EK       knitr
            14:30  Import                  EK       readr
2018-02-20  9:30   Data transformation     RK       dplyr, tidyr
            13:30  Visualization           AG       ggplot
2018-02-21  9:30   Functional programming  EK       purrr
            13:30  Tidy models             AG       broom
            15:30  Advanced Programming    AG       tidyeval
2018-02-22  9:30   Practical               All

  • Coffee will be served in the morning and afternoon in the course room
  • Lunch breaks from 12:30 to 13:30

Location

The course will be held at:

Maison du Savoir

University of Luxembourg

Belval Campus

2, avenue de l’Université

L-4365 Esch-sur-Alzette

Luxembourg


The room is 4.510, on the 4th floor of Maison du Savoir.

Elixir

This event is supported by ELIXIR-Luxembourg