home | about kurtosis | courses | course calendar | booking information | ideas | technique | work in progress | clients | contact us  

How to draw a boxplot in Microsoft Excel

This is the best way to do it



Here is a boxplot that compares the waiting times for routine new patient attendances at ten outpatient clinics in a medium-sized district general hospital:


Waiting times for routine new patient attendances at ten ENT outpatient clinics, 2009

The box in this boxplot represents the inter-quartile range of values, in this case, waiting times for routine attendances. The vertical line within the grey box represents the median waiting time. The left hand whisker (sometimes boxplots are referred to as "box-and-whisker plots") extends leftwards to the 10th percentile value; the right-hand whisker extends rightwards to the 90th percentile waiting time.

Taking Clinic 1 as an example, the shortest waiting time was zero weeks, the lower quartile was 9 weeks, the median wait was 12 weeks, the upper quartile was 14 weeks and the 90th percentile was 15 weeks.

Here's how to draw a boxplot using Microsoft Excel. Use the data in this Excel workbook if you want to follow the steps methodically:

Box Plot Data.xls

Step One

You need to analyse your raw data so that you can calculate the five measures of position that you need for your boxplot. In this example, we want our five measures of position to be (1) 10th percentile; (2) lower quartile; (3) median; (4) upper quartile; (5) 90th percentile. The middle three of these are non-negotiable: the box in any boxplot should always represent the inter-quartile range, and the vertical line within the box should always be the median. However, you can exercise discretion in relation to the two extreme values. Sometimes you might choose the minimum and maximum values; sometimes the 5th and 95th percentiles, and so on.

In our Excel workbook, rows two to six in the Data worksheet contain the five numbers we need.

Step Two

You are going to graph the data using Excel's horizontal stacked bar chart. But you have to make some changes to the dat before you can do this. The grey cells in rows 10-14 of the Data worksheet show what you need to do.

Ignore the Y error (-) and Y error (+) values for the moment. We'll deal with them later in Step Three.

Instead we'll start with the grey lower quartile figure. You simply have to enter a value in the grey lower quartile cell (for Clinic 10 this will be cell B11) that refers to the lower quartile value in the original table (cell B3). This value of 10.25 will represent the right-hand limit of the first stack in the stacked bar chart.

The median value in cell B12 will be the right-hand limit of the second stack in the bar chart. Therefore you don't want to simply enter the median value into cell B12; instead you have to enter the difference between the median and the lower quartile, because you want the median to be positioned in the right place on the chart.

The upper quartile will be the right-hand limit of the third stack in the horizontal bar chart. So—following the same method as in the previous paragraph—you have to enter in cell B13 the difference between the upper quartile and the median. This will give you a value of 3.

Step Three

The data you now have in the grey rows 11-13 is the data that will allow you to plot the boxes of the boxplots. They give you the boundaries of the inter-quartile range, together with the median value within the box. You now have to attend to the whiskers.

First of all, the left-hand whisker. In this example, we are using the tenth percentile as our limit for our left-hand whisker. This is the value in cell B2 of the original table. In order to create a horizontal line on the stacked bar chart we are going to use the Y error bar functionality of Excel. So the value you want to enter into the grey cell B10 is the difference between the lower quartile and the 10th percentile (=B3-B2). You want a line extending leftwards from the lower quartile that is 5.15 weeks long.

Secondly, the right-hand whisker. Similar methodology. The grey cell B14 is the difference between the 90th percentile and the upper quartile. This will give you a right-hand whisker that extends three weeks to the right of the upper quartile.

Step Four

You now need to draw the chart. Highlight the cells B1:K1 (for the labels) and B11:K13 (for the data points). Select a horizontal stacked bar chart from the wizard (top-middle in the Chart sub-type: choices). If your Excel default colours are the same as ours, you'll have three bar colours. The ones that matter are the maroon and cream ones. Together these represent the inter-quartile range. The vertical line that separates maroon from cream is the median value.

Step Five

Now draw the whiskers. The left-hand whisker first. Position your mouse pointer over the left-hand (blue) bar and right-click. Choose Format Data Series... and then choose the Y Error Bars tab. In the Error amount box, choose Custom: and enter the values B10:K10 into the minus box. For the right-hand whisker you need to position your mouse pointer over the cream bar on the chart. Right-click, then choose Format Data Series... and then choose the Y Error Bars tab. In the Error amount box, choose Custom: and enter the values B14:K14 into the plus box.

Step Six

The final bit of necessary formatting is that you need to make the blue bars invisible. Right-click over one of the blue bars, choose Format Data series and then - in the Patterns tab - make the colour none and the border none.



  © Kurtosis 2010 | 99 Giles Street, Edinburgh EH6 6BZ | Tel 0131 555 5300 | email info@kurtosis.co.uk