Introduction to R: Management, Exploration, and Communication of data, 2-6 July 2018

Posted on Tue, Jan 09 2018 12:19:00




Introduction to R: Management, Exploration, and Communication of data, 2-6 July 2018 (Stellenbosch)

This intensive five-day course, registered as a University of Stellenbosch Short Course, will be presented at Stellenbosch under the auspices of the South African DST-NRF Centre for Epidemiological Modelling and Analysis (SACEMA). The course will take place at the Stellenbosch Institute for Advanced Study (StIAS), from 9 am to 5 pm daily. Guest presenters and tutors will support the main presenter, Roxanne Beauclair, of SACEMA and the International Centre for Reproductive Health, Ghent University, Ghent, Belgium.

The fee structure and registration form will be available online in March 2018.

Enquiries may be directed to the SACEMA Assistant Director for Training, Gavin Hitchcock, or the SACEMA Research Manager, Lynnemore Scheepers, copied to Roxanne Beauclair.


Course Overview

R is an open-source statistical software platform that is growing in popularity thanks to its rapidly expanding number of libraries containing cutting-edge statistical functions, and to the user-friendly, built-in communication tools in RStudio. Course participants will be introduced to the basics of programming in R and will also learn how to import and clean data, visualise it, and report results. Specific topics that will be covered include: importing data; reshaping/tidying data; merging/joining datasets; handling dates and transforming numeric and categorical variables; summarising data with plots and tables; and producing reproducible and shareable reports.

This course does NOT cover any form of hypothesis confirmation (e.g. using statistical tests, predictive or causal modelling), or methods used for “big data” -- all data will be small, in-memory datasets.

Lectures will be interwoven with practical exercises. Participants will be encouraged to follow along with exercises and programming on their own laptops.

Participants will learn the basics of:

●      R data types

●      R programming and style

And will acquire the skills to:

●      Import raw data into RStudio

●      Tidy and transform data into a format suitable for analysis

●      Explore data in R using ggplot2 and dplyr (libraries for plotting and summarising data, respectively)

●      Create reproducible, dynamic reports using RMarkdown


Target Audience

The course is suitable for graduate students or professionals who are familiar with statistical analysis and have managed or interacted with datasets using other software platforms. It is assumed that course participants are familiar with programming in another language.


Roxanne Beauclair is a specialist in applying biostatistical methods to epidemiological data, and expects shortly to graduate with a PhD in this field from Ghent University, while launching her own statistical consultancy company, Data Yarn, based in Pretoria. She received training in Epidemiology (MPH) from the University of Cape Town in South Africa. She has previously worked in an analytical capacity on several epidemiological studies of sexual behaviour and HIV in Southern Africa. For her PhD research at Ghent University she studied how age-mixing patterns influence HIV transmission in the South African and Malawian contexts. Over the past few years she has become an R enthusiast and enjoys learning new ways to improve upon statistical programmes by creating clean, reproducible, and legible code.