Overlapping histograms

A way of visualizing difference


I'm increasingly drawn to the idea of using overlapping histograms as a way of showing how something is - or is not - different from another thing. One of the classic uses of overlapping histograms (in the stats textbooks, at least!) is a distribution of adult female heights on the left and a distribution of adult male heights on the right.

Here, I've tried to show how a change in practice in an A&E department led to a higher proportion of patients being treated in less than four hours. The virtue of the histogram approach is that you can see the difference between the two periods whilst still seeing the amount of variation within each.


Two overlapping histograms.
The pink (slightly transparent) histogram on the left shows the four-hour compliance (for non-admitted patients only) for the first 28 days of April. On average, the compliance was 52.8%. The blue histogram to the right shows what happened in the first 28 days of May. Average compliance had increased to 69.6%.

The trouble is, I cannot get a decent overlapping histogram effect in Microsoft Excel. The chart above is a weird, convoluted mash-up of Excel and PowerPoint. It took quite a few attempts before I got something that looked even remotely OK, and even now it still looks a bit rubbish. I think one of the issues is the expectation that the overlapping area should be the colour you'd get if you actually mixed the two colours.
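To make the colour point concrete, here is a minimal Python sketch of what happens when one semi-transparent fill is drawn over another. It is an illustration only, not how Excel or PowerPoint work internally: the `blend_over` helper, the RGB values for pink and blue, and the 50% opacity are all assumptions of mine rather than the settings used in the chart.

```python
def blend_over(top, bottom, alpha):
    """Draw RGB colour `top` at opacity `alpha` over an opaque `bottom` colour."""
    return tuple(round(alpha * t + (1 - alpha) * b) for t, b in zip(top, bottom))


WHITE = (255, 255, 255)
PINK = (255, 105, 180)   # assumed fill colour, not taken from the chart
BLUE = (65, 105, 225)    # assumed fill colour, not taken from the chart
ALPHA = 0.5              # assumed opacity for both histograms

# Each histogram on its own, drawn over a white chart background.
pink_alone = blend_over(PINK, WHITE, ALPHA)
blue_alone = blend_over(BLUE, WHITE, ALPHA)

# The overlapping area depends on which histogram is drawn last.
pink_on_blue = blend_over(PINK, blue_alone, ALPHA)
blue_on_pink = blend_over(BLUE, pink_alone, ALPHA)

# An even 50/50 mix of pure red and pure blue, for comparison.
even_mix = blend_over((255, 0, 0), (0, 0, 255), 0.5)

print(pink_on_blue, blue_on_pink, even_mix)
```

With these assumed values the two drawing orders give different overlap colours, which may be part of why layered transparent fills do not look like an even mix of the two.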

So I had another go but this time using R and {ggplot2}:


Two overlapping histograms.
It's the same data as before but this time the 'core', non-overlapping data is a more solid colour for both time periods, and the overlapping area is the shade of purple you'd expect to see when you mix red and blue.
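One way to think about that three-colour effect is to split each bin into a shared part and a part unique to each month. The Python sketch below does that with plain lists. It is my own illustration, not the R code behind the chart, and I'm not claiming {ggplot2} works this way internally. The daily percentages, the bin edges, and the helper names `bin_counts` and `split_overlap` are all made up for the example.

```python
def bin_counts(values, lo, hi, width):
    """Count values into equal-width bins covering lo to hi (last bin includes hi)."""
    n_bins = int((hi - lo) / width)
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            i = min(int((v - lo) // width), n_bins - 1)
            counts[i] += 1
    return counts


def split_overlap(a, b):
    """Split two lists of bin counts into (only in a, shared, only in b)."""
    shared = [min(x, y) for x, y in zip(a, b)]
    only_a = [x - s for x, s in zip(a, shared)]
    only_b = [y - s for y, s in zip(b, shared)]
    return only_a, shared, only_b


# Made-up daily compliance percentages, NOT the real A&E data.
april = [45, 48, 52, 55, 58, 61, 50, 47]
may = [62, 66, 71, 74, 58, 69, 77, 65]

# Bins of width 10 from 40% to 80% (an arbitrary choice for this sketch).
april_counts = bin_counts(april, 40, 80, 10)
may_counts = bin_counts(may, 40, 80, 10)

only_april, shared, only_may = split_overlap(april_counts, may_counts)
print(only_april, shared, only_may)
```

Stacking the three lists in each bin, with the shared part in purple and the other two in solid red and blue, would give a chart along the lines of the one described above.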

[25 October 2024]