Introduction

This workshop is designed for beginners with little to no programming experience who want to learn how to work with REDCap data using R. Whether you are new to R or have some experience with spreadsheets and clinical data, this workshop will guide you through importing, cleaning, analysing, and visualising data from REDCap.

The workshop will cover essential programming concepts and gradually introduce more advanced topics, with a focus on using the REDCapTidieR, tidyverse, and ggplot2 packages to manage, explore and visualise your datasets. The aim of this workshop is to equip participants with the skills to import, clean, analyse, and visualise REDCap data in R.

Learning Objectives

Participants will gain the following skills:

  • Proficiency in using R and RStudio for data analysis.
  • Basic R programming skills.
  • Connect to REDCap using API tokens and securely retrieve data.
  • Import and clean REDCap data in R using REDCapTidieR and tidyverse.
  • Create visualisations using ggplot2 to explore trends in clinical data.

Prerequisites

Before starting this course you will need to ensure that your computer is set up with the required software. If you have any difficulty installing any of this software then please contact one of the trainers for help.

Installing R and RStudio

R and RStudio are separate downloads and installations.

R is the underlying statistical computing environment. The base R system and a very large collection of packages that give you access to a huge range of statistical and analytical functionality are available from CRAN, the Comprehensive R Archive Network.

However, using R alone is no fun. RStudio is a graphical integrated development environment (IDE) that makes using R much easier and more interactive. You need to install R before you install RStudio.

Windows
  • If you already have R and RStudio installed:

    • Open RStudio, and click on “Help” > “Check for updates”. If a new version is available, quit RStudio, and download the latest version for RStudio.
    • To check which version of R you are using, start RStudio and the first thing that appears in the console indicates the version of R you are running. Alternatively, you can type sessionInfo(), which will also display which version of R you are running. Go on the CRAN website and check whether a more recent version is available. If so, please download and install it. You can check here for more information on how to remove old versions from your system if you wish to do so.
  • If you don’t have R and RStudio installed, but don’t have Admin rights (i.e., using Peter Mac Windows computer):

    • Create a SNOW ticket here with the request item Software install (Windows PC).
  • If you don’t have R and RStudio installed, but have Admin rights:

    • Download R from the CRAN website.
    • Run the .exe file that was just downloaded
    • Go to the RStudio download page
    • Under Installers select RStudio x.yy.zzz - Windows 10/8/7 where x, y, and z represent version numbers)
    • Double click the file to install it
    • Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
macOS
  • If you already have R and RStudio installed:

    • Open RStudio, and click on “Help” > “Check for updates”. If a new version is available, quit RStudio, and download the latest version for RStudio.
    • To check the version of R you are using, start RStudio and the first thing that appears on the terminal indicates the version of R you are running. Alternatively, you can type sessionInfo(), which will also display which version of R you are running. Go on the CRAN website and check whether a more recent version is available. If so, please download and install it.
  • If you don’t have R and RStudio installed:

    • Download R from the CRAN website.
    • Select the .pkg file for the latest R version
    • Double click on the downloaded file to install R
    • It is also a good idea to install XQuartz (needed by some packages)
    • Go to the RStudio download page
    • Under Installers select RStudio x.yy.zzz - Mac OS X 10.6+ (64-bit) (where x, y, and z represent version numbers)
    • Double click the file to install RStudio
    • Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
Linux
  • Follow the instructions for your distribution from CRAN, they provide information to get the most recent version of R for common distributions. For most distributions, you could use your package manager (e.g., for Debian/Ubuntu run sudo apt-get install r-base, and for Fedora sudo yum install R), but we don’t recommend this approach as the versions provided by this are usually out of date. In any case, make sure you have at least R 4.3.2.
  • Go to the RStudio download page
  • Under Installers select the version that matches your distribution, and install it with your preferred method (e.g., with Debian/Ubuntu sudo dpkg -i rstudio-x.yy.zzz-amd64.deb at the terminal).
  • Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.

Installing R Packages

On this course we will be making use of several packages designed for importing and processing data from REDCap, data manipulation and visualisation. After installing R and RStudio, follow the instructions below to install the required packages.

  • After starting RStudio, at the console type: (look for the ‘Console’ tab and type at the > prompt)
install.packages(c("tinytex", "tidyverse", "REDCapR", "REDCapTidieR", "httr", "jsonlite", "skimr", "labelled"))
  • You can also do this by going to Tools -> Install Packages and typing the names of the packages separated by a comma.

Installing Quarto

To install Quarto follow this link: https://quarto.org/docs/get-started/.

Additional Resources

Into the tidyverse

The core tidyverse includes the packages that you’re likely to use in everyday data analyses. Therefore, this workshop offers an introduction to these core packages. As of tidyverse 1.3.0, the following packages are included in the core tidyverse:

Figure 1: Hex logos for the eight core tidyverse packages and their primary purposes. Image source:https://education.rstudio.com/blog/2020/07/teaching-the-tidyverse-in-2020-part-1-getting-started/
  • ggplot2: Grammar of Graphics. Enables the creation of graphics in a declarative manner.
  • dplyr: Grammar for data manipulation. Presents a set of verbs to address common challenges in data manipulation.
  • tidyr: Provides a collection of functions for achieving tidy data.
  • readr: Facilitates the rapid and user-friendly reading of rectangular data (e.g., csv, tsv, and fwf).
  • purrr: Functional programming toolkit. Offers a set of tools for efficient work with functions and vectors.
  • tibble: Tibbles, a modern re-imagining of the data frame, offering a more user-friendly and efficient approach to handling tabular data.
  • stringr: Provides a set of functions designed to simplify and enhance string manipulations.
  • forcats: Provides a suite of useful tools for handling and manipulating categorical variables (factors).

Credits and Acknowledgement

These content were adapted from the following course materials: