home  about kurtosis  courses  course calendar  booking information  ideas  technique  work in progress  clients  contact us  
How to draw a boxplot in Microsoft Excel This is the best way to do it 

Introduction Here is a boxplot that compares the waiting times for routine new patient attendances at ten outpatient clinics in a mediumsized district general hospital: ANYTOWN GENERAL HOSPITAL Waiting times for routine new patient attendances at ten ENT outpatient clinics, 2009
The box in this boxplot represents the interquartile range of values, in this case, waiting times for routine attendances. The vertical line within the grey box represents the median waiting time. The left hand whisker (sometimes boxplots are referred to as "boxandwhisker plots") extends leftwards to the 10th percentile value; the righthand whisker extends rightwards to the 90th percentile waiting time. Taking Clinic 1 as an example, the shortest waiting time was zero weeks, the lower quartile was 9 weeks, the median wait was 12 weeks, the upper quartile was 14 weeks and the 90th percentile was 15 weeks. Here's how to draw a boxplot using Microsoft Excel. Use the data in this Excel workbook if you want to follow the steps methodically: Step One You need to analyse your raw data so that you can calculate the five measures of position that you need for your boxplot. In this example, we want our five measures of position to be (1) 10th percentile; (2) lower quartile; (3) median; (4) upper quartile; (5) 90th percentile. The middle three of these are nonnegotiable: the box in any boxplot should always represent the interquartile range, and the vertical line within the box should always be the median. However, you can exercise discretion in relation to the two extreme values. Sometimes you might choose the minimum and maximum values; sometimes the 5th and 95th percentiles, and so on. In our Excel workbook, rows two to six in the Data worksheet contain the five numbers we need. Step Two You are going to graph the data using Excel's horizontal stacked bar chart. But you have to make some changes to the dat before you can do this. The grey cells in rows 1014 of the Data worksheet show what you need to do. Ignore the Y error () and Y error (+) values for the moment. We'll deal with them later in Step Three. Instead we'll start with the grey lower quartile figure. You simply have to enter a value in the grey lower quartile cell (for Clinic 10 this will be cell B11) that refers to the lower quartile value in the original table (cell B3). This value of 10.25 will represent the righthand limit of the first stack in the stacked bar chart. The median value in cell B12 will be the righthand limit of the second stack in the bar chart. Therefore you don't want to simply enter the median value into cell B12; instead you have to enter the difference between the median and the lower quartile, because you want the median to be positioned in the right place on the chart. The upper quartile will be the righthand limit of the third stack in the horizontal bar chart. So—following the same method as in the previous paragraph—you have to enter in cell B13 the difference between the upper quartile and the median. This will give you a value of 3. Step Three The data you now have in the grey rows 1113 is the data that will allow you to plot the boxes of the boxplots. They give you the boundaries of the interquartile range, together with the median value within the box. You now have to attend to the whiskers. First of all, the lefthand whisker. In this example, we are using the tenth percentile as our limit for our lefthand whisker. This is the value in cell B2 of the original table. In order to create a horizontal line on the stacked bar chart we are going to use the Y error bar functionality of Excel. So the value you want to enter into the grey cell B10 is the difference between the lower quartile and the 10th percentile (=B3B2). You want a line extending leftwards from the lower quartile that is 5.15 weeks long. Secondly, the righthand whisker. Similar methodology. The grey cell B14 is the difference between the 90th percentile and the upper quartile. This will give you a righthand whisker that extends three weeks to the right of the upper quartile. Step Four You now need to draw the chart. Highlight the cells B1:K1 (for the labels) and B11:K13 (for the data points). Select a horizontal stacked bar chart from the wizard (topmiddle in the Chart subtype: choices). If your Excel default colours are the same as ours, you'll have three bar colours. The ones that matter are the maroon and cream ones. Together these represent the interquartile range. The vertical line that separates maroon from cream is the median value. Step Five Now draw the whiskers. The lefthand whisker first. Position your mouse pointer over the lefthand (blue) bar and rightclick. Choose Format Data Series... and then choose the Y Error Bars tab. In the Error amount box, choose Custom: and enter the values B10:K10 into the minus box. For the righthand whisker you need to position your mouse pointer over the cream bar on the chart. Rightclick, then choose Format Data Series... and then choose the Y Error Bars tab. In the Error amount box, choose Custom: and enter the values B14:K14 into the plus box. Step Six The final bit of necessary formatting is that you need to make the blue bars invisible. Rightclick over one of the blue bars, choose Format Data series and then  in the Patterns tab  make the colour none and the border none. 


© Kurtosis 2010  99 Giles Street, Edinburgh EH6 6BZ  Tel 0131 555 5300  email info@kurtosis.co.uk 