R from First to Second Gear

Moving (just) beyond the absolute basics of R


A one-day training course for NHS data analysts who already have a very basic familiarity with R. We assume that you've used the key functions in dplyr and created some basic charts in ggplot2, and that you now want to develop your R skills that little bit more.

R from First to Second Gear, like its companion course R from a Standing Start, avoids the 'curse of knowledge'. It has been designed by (and is taught by) someone who is new enough to R to still remember the difficulties, frustrations and annoyances that non-experts experience.

The course covers its material through four case studies, one for each of the four 90-minute sessions. In each session we use the dplyr package to explore and analyse the data, and the ggplot2 package to create visualizations of the same data. The exercises start simple and build up gently with lots of repetition (we only introduce the complexities one at a time!), gradually expanding your repertoire of functions and arguments. All the while you will be learning how to do useful things in R using real, fully anonymised healthcare data.


Session 1 / Create hospital stays using a customized definition
The first session of the course shows how to take a definition of a hospital stay (as provided by a clinical director) and then apply that definition to a raw dataset so that stays are now counted using the new method. We make extensive use of the group_by() and summarize() functions and also the left_join() function, before plotting length of stay histograms for each consultant using the facet_wrap() function within ggplot2.
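As a rough illustration of the kind of workflow this session describes, here is a minimal sketch. The data, the column names and the stay definition itself (an episode starts a new stay when it begins more than one day after the patient's previous episode ended) are all assumptions made up for this example; they are not the course's dataset or the clinical director's actual definition.

```r
library(dplyr)
library(ggplot2)

# Hypothetical episode-level data: one row per consultant episode.
episodes <- tibble(
  patient_id    = c(1, 1, 1, 2, 2, 3),
  episode_start = as.Date(c("2024-01-01", "2024-01-03", "2024-02-10",
                            "2024-01-05", "2024-01-06", "2024-01-07")),
  episode_end   = as.Date(c("2024-01-03", "2024-01-08", "2024-02-12",
                            "2024-01-06", "2024-01-09", "2024-01-08")),
  consultant    = c("A", "B", "A", "B", "B", "A")
)

# Assumed definition: a new stay begins when an episode starts more than
# one day after the same patient's previous episode ended.
stays <- episodes %>%
  arrange(patient_id, episode_start) %>%
  group_by(patient_id) %>%
  mutate(
    gap_days = as.numeric(episode_start - lag(episode_end)),
    new_stay = is.na(gap_days) | gap_days > 1,
    stay_no  = cumsum(new_stay)
  ) %>%
  group_by(patient_id, stay_no) %>%
  summarize(
    admitted   = min(episode_start),
    discharged = max(episode_end),
    .groups    = "drop"
  ) %>%
  mutate(length_of_stay = as.numeric(discharged - admitted))

# Attach the consultant of the first episode of each stay.
first_consultant <- episodes %>%
  select(patient_id, admitted = episode_start, consultant)

stays <- stays %>%
  left_join(first_consultant, by = c("patient_id", "admitted"))

# One length-of-stay histogram per consultant.
los_plot <- ggplot(stays, aes(x = length_of_stay)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~ consultant)
```

Printing `los_plot` draws the faceted histograms. With real data the join key would need checking for duplicates, since two episodes starting on the same day for the same patient would duplicate a stay.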

Session 2 / A visualization of 24 hours in an Emergency Department
In the second session everything we do is geared towards creating a graphic that shows a day's arrivals, departures and transfers in an Emergency Department. Starting with raw data (we show the SQL that generated it), we wrangle the data (using, amongst other functions, the rbind() and cumsum() functions) so that it is in a shape to then be visualized using ggplot2's geom_point() function.
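One plausible way to combine rbind() and cumsum() for this kind of graphic is sketched below. The attendance data and column names are invented for illustration, and transfers are left out to keep the example short; the course's own approach may well differ.

```r
library(dplyr)
library(ggplot2)

# Hypothetical attendances: one row per patient visit.
attendances <- data.frame(
  attendance_id = 1:4,
  arrival = as.POSIXct(c("2024-03-01 08:00:00", "2024-03-01 08:30:00",
                         "2024-03-01 09:15:00", "2024-03-01 10:00:00"),
                       tz = "UTC"),
  departure = as.POSIXct(c("2024-03-01 09:00:00", "2024-03-01 11:00:00",
                           "2024-03-01 09:45:00", "2024-03-01 12:30:00"),
                         tz = "UTC")
)

# Turn each attendance into two events: +1 on arrival, -1 on departure.
arrivals   <- data.frame(time = attendances$arrival,
                         event = "arrival", change = 1)
departures <- data.frame(time = attendances$departure,
                         event = "departure", change = -1)

# Stack the events, sort by time, and keep a running total of people present.
events <- rbind(arrivals, departures) %>%
  arrange(time) %>%
  mutate(in_department = cumsum(change))

# One point per event, coloured by event type.
events_plot <- ggplot(events,
                      aes(x = time, y = in_department, colour = event)) +
  geom_point()
```

Stacking arrivals and departures into a single long table is what lets cumsum() produce a running count in one pass.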

Session 3 / A visualization of Emergency Department crowding
In the third session we use an almost-identical dataset to the one we used in Session 2, but this time we focus on how full the Emergency Department was by getting R to take 'retrospective snapshots' of the number of people in the department at each minute of the 24-hour day. We use the pivot_longer() and complete() functions to create a summary table that allows us to use geom_line() within ggplot2 to draw a graph of minute-by-minute ED crowding.
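The sketch below shows one way pivot_longer() and complete() could produce a minute-by-minute table. It assumes made-up data in which arrival and departure times are stored as whole minutes since midnight, and it treats a patient as no longer present in the minute they depart; both choices are assumptions for this example rather than the course's method.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical attendances, with times as minutes since midnight.
attendances <- tibble(
  attendance_id    = 1:3,
  arrival_minute   = c(10L, 20L, 30L),
  departure_minute = c(25L, 90L, 45L)
)

crowding <- attendances %>%
  # One row per event rather than one row per attendance.
  pivot_longer(
    cols      = c(arrival_minute, departure_minute),
    names_to  = "event",
    values_to = "minute"
  ) %>%
  mutate(change = if_else(event == "arrival_minute", 1, -1)) %>%
  # Net change in the number of people present in each minute with an event.
  group_by(minute) %>%
  summarize(change = sum(change), .groups = "drop") %>%
  # Add a row for every minute of the day, with zero change where nothing happened.
  complete(minute = 0:1439, fill = list(change = 0)) %>%
  arrange(minute) %>%
  mutate(in_department = cumsum(change))

crowding_plot <- ggplot(crowding, aes(x = minute, y = in_department)) +
  geom_line()
```

Without complete(), minutes with no arrivals or departures would be missing from the table, which matters if the table is later summarised or joined to other minute-level data.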

Session 4 / How does end-of-life hospital care vary from GP practice to GP practice?
The final session looks at data on the number of hospital admissions generated during the final three months of a person's life, and we analyse the data to see if we can describe and visualize how the number of admissions varies from GP practice to GP practice.


The course has been designed to be delivered conventionally or virtually (via Microsoft Teams). Each participant will need their own laptop and we will work through a series of exercises together. All of the installation instructions will be circulated well in advance of the course (and help will be given to people who experience installation issues) so that we minimise the risk of technical glitches during the course itself.


R from First to Second Gear can be booked as either an on-site face-to-face course or as a virtual course (via Microsoft Teams) for £1,250+VAT, and up to 12 participants can be accommodated in each workshop session. Email info@kurtosis.co.uk to start making arrangements.

A small amount of experience of R is needed for this training course.