LESSON 12: Error bars
FOCUS QUESTION: How can I depict uncertainty and variability in data?
This lesson discusses various ways of putting error bars on graphs.
Contents
- DATA FOR THIS LESSON
- SETUP FOR LESSON 12
- EXAMPLE 1: Load the Fisher iris data (comes with MATLAB)
- EXAMPLE 2: Compute the mean and standard deviation of the sepal lengths for 3 species
- EXAMPLE 3: Plot mean sepal length using standard deviation (SD) error bars
- EXAMPLE 4: Plot the SD error bars on a bar chart
- EXAMPLE 5: Compute the standard error of the mean (SEM) for sepal lengths
- EXAMPLE 6: Plot mean sepal length using (SEM) error bars
- EXAMPLE 7: Plot SD and SEM error bars on the same graph
- EXAMPLE 8: Compute the median and inter quartile range (IQR) for sepal lengths
- EXAMPLE 9: Plot median sepal length using the inter quartile range (IQR) for error bars
- EXAMPLE 10: Calculate the means and standard deviations of all characteristics
- EXAMPLE 11: Draw a grouped bar chart of mean iris characteristics
- EXAMPLE 12: Plot the means of all characteristics using SD error bars
- EXAMPLE 13: Plot the means of all characteristics using connected SD error bars
- SUMMARY OF SYNTAX
DATA FOR THIS LESSON
| File | Description |
| fisheriris | Note: This dataset comes with the MATLAB distribution, so you don't have to download it separately. |
SETUP FOR LESSON 12
- Set the Current Directory to Z:\working\MATLAB\Lesson12. (You will need to make a new directory for Lesson12.)
- Create a new script called Lesson12Script.m. (Use File->New->Blank M-File from the main MATLAB menubar.) You will enter each of the examples in a new cell in this script.
EXAMPLE 1: Load the Fisher iris data (comes with MATLAB)
Create a new cell in which you type and execute:
load fisheriris;
You should see the following 2 variables in your Workspace Browser:
- meas - an array in which each column corresponds to a particular type of measurement and each row holds the 4 measurements for a particular specimen. The different species are combined into a single array.
- species - a cell column vector containing the species designation for the specimen given in the corresponding row of meas. Possible values are 'setosa', 'versicolor', and 'virginica'.
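The later examples rely on the rows of meas being stored in three consecutive blocks of 50 specimens. A minimal sketch for checking that layout is shown below; the variable names firstOfEachBlock and blockSizes are illustrative and not part of the lesson.

```matlab
load fisheriris                              % Provides meas (150 x 4) and species (150 x 1 cell)

% Species label of the first specimen in each block of 50 rows
firstOfEachBlock = species([1, 51, 101]);
disp(firstOfEachBlock)

% Number of specimens carrying each species label
blockSizes = [sum(strcmp(species, 'setosa')), ...
              sum(strcmp(species, 'versicolor')), ...
              sum(strcmp(species, 'virginica'))];
disp(blockSizes)
```

The blocks appear in the order setosa, versicolor, virginica, with 50 specimens each, so any labels used on plots should follow that order.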
EXAMPLE 2: Compute the mean and standard deviation of the sepal lengths for 3 species
Create a new cell in which you type and execute:
sepalLens = reshape(meas(:, 1), 50, 3);   % Make column 1 into a 50 x 3 array
sLenMeans = mean(sepalLens);              % Calculate the means for the 3 species
sLenSDs = std(sepalLens);                 % Calculate the standard deviations for the 3 species
You should see the following variables in your Workspace Browser:
- sepalLens - a 50 x 3 array with the sepal lengths for the three species
- sLenMeans - mean sepal lengths for the three species
- sLenSDs - standard deviations of the sepal lengths of the three species.
Note: By default, std(sepalLens) normalizes by N - 1, so it estimates the standard deviation of the population from the sample. It is not the standard deviation of the sample values themselves, which normalizes by N and is computed by std(sepalLens, 1).
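The difference between the two normalizations of std can be seen on a small example. The sketch below uses made-up numbers (not iris data), and the variable names are illustrative only.

```matlab
x = [2; 4; 4; 4; 5; 5; 7; 9];        % Small made-up sample: n = 8, mean = 5
n = numel(x);

sdDefault = std(x);                   % Default: normalizes by n - 1
sdByN = std(x, 1);                    % Second argument 1: normalizes by n

% The default result written out by hand
sdManual = sqrt(sum((x - mean(x)).^2) / (n - 1));

fprintf('n - 1: %.4f   n: %.4f   manual: %.4f\n', sdDefault, sdByN, sdManual);
```

For 50 specimens per species the two versions differ only slightly, but the gap grows as the sample gets smaller.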
- Define a variable called petalWidths that contains the petal widths separated into columns by species.
- Define a variable called meanPetalWidths that contains the mean petal widths of each of the three species.
- Calculate the overall mean petal width.
EXAMPLE 3: Plot mean sepal length using standard deviation (SD) error bars
Create a new cell in which you type and execute:
fisherTitle = 'Comparison of three species in the Fisher Iris data';
irisSpecies = {'Setosa', 'Versicolor', 'Virginica'};      % Tick labels, in the row order of meas
figure
errorbar(sLenMeans, sLenSDs, 'ks');                       % Error bars use black squares
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies)         % Set ticks and tick labels
xlabel('Species of Iris')
ylabel('Sepal length in cm')
title(fisherTitle)
legend('Mean (SD error bars)', 'Location', 'Southeast')   % Put in lower right
You should see the following 2 variables in your Workspace Browser:
- fisherTitle - string with the title for the figure and the figure window
- irisSpecies - a cell vector containing the names of the three species
You should see a Figure Window with a labeled error bar plot:
Create a new cell right here (beginning of a cell starts with %%). Write MATLAB code to display the mean petal widths for the three species, showing standard deviation errorbars.
EXAMPLE 4: Plot the SD error bars on a bar chart
Create a new cell in which you type and execute:
figure
hold on
bar(sLenMeans, 'FaceColor', [0.5, 0.5, 1])                % Lighter so error bars show up
errorbar(sLenMeans, sLenSDs, 'ks');                       % Error bars use black squares
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies)         % Set ticks and tick labels
xlabel('Species of Iris');
ylabel('Sepal length in cm')
title(fisherTitle)
legend('Mean (SD error bars)', 'Location', 'Northwest')   % Put in upper left
box on                                                    % Force box around axes
hold off
You should see a Figure Window with a labeled bar chart overlaid with error bars:
EXAMPLE 5: Compute the standard error of the mean (SEM) for sepal lengths
Create a new cell in which you type and execute:
numSamples = size(sepalLens, 1);          % Number of rows = specimens per species
sLenSEMs = sLenSDs./sqrt(numSamples);     % Compute the standard error of the mean (SEM)
You should see the following 2 variables in your Workspace Browser:
- numSamples - the number of samples in each group
- sLenSEMs - the standard error of the mean sepal length for the three species
The SEM estimates the standard deviation of the sample mean, that is, how far the mean of a sample of this size typically falls from the true population mean.
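One way to see what the SEM measures is to simulate many samples and look at how much their means vary. The sketch below is only an illustration under assumed conditions: it draws from a standard normal population (sigma = 1) rather than from the iris data, and the number of trials is an arbitrary choice.

```matlab
rng(0);                               % Fix the seed so the run is repeatable
n = 50;                               % Specimens per sample, as in the iris groups
numTrials = 10000;                    % Number of simulated samples (arbitrary)

samples = randn(n, numTrials);        % Each column is one sample from N(0, 1)
sampleMeans = mean(samples);          % One mean per simulated sample

spreadOfMeans = std(sampleMeans);     % Observed SD of the sample means
predictedSEM = 1 / sqrt(n);           % sigma / sqrt(n) with sigma = 1

fprintf('Observed: %.4f   Predicted: %.4f\n', spreadOfMeans, predictedSEM);
```

The observed spread of the means should land close to sigma/sqrt(n), which is the quantity that sLenSDs./sqrt(numSamples) estimates from a single sample.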
EXAMPLE 6: Plot mean sepal length using (SEM) error bars
Create a new cell in which you type and execute:
figure
errorbar(sLenMeans, sLenSEMs, 'rd')                       % Red diamonds
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies)         % Set ticks and tick labels
xlabel('Species of Iris');
ylabel('Sepal length in cm')
title(fisherTitle)
legend('Mean (SEM error bars)', 'Location', 'Southeast')
You should see a Figure Window with a labeled error bar plot:
EXAMPLE 7: Plot SD and SEM error bars on the same graph
Create a new cell in which you type and execute:
xPositions = [1, 2, 3];                                   % Use these as base x-axis error bar positions
figure
hold on
errorbar(xPositions-0.1, sLenMeans, sLenSDs, 'g^')        % Green up triangles
errorbar(xPositions+0.1, sLenMeans, sLenSEMs, 'bv')       % Blue down triangles
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies)         % Set ticks and tick labels
xlabel('Species of Iris');
ylabel('Sepal length in cm')
title(fisherTitle)
legend({'Mean (SD error bars)', 'Mean (SEM error bars)'}, 'Location', 'Southeast')
hold off
You should see a Figure Window with two sets of error bars:
EXAMPLE 8: Compute the median and inter quartile range (IQR) for sepal lengths
Create a new cell in which you type and execute:
sLenMedians = median(sepalLens);          % Species median sepal lengths
sLenIQR = prctile(sepalLens, [25, 75]);   % 25th and 75th percentiles
You should see the following 2 variables in your Workspace Browser:
- sLenMedians - the median sepal lengths of the three species
- sLenIQR - a 2 x 3 array with the 25th and 75th percentiles (the two ends of the IQR) of the sepal lengths of the three species
The rows of sLenIQR correspond to the percentiles, and the columns correspond to the species.
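The row and column layout can be checked directly, and the width of each interquartile range follows by subtracting the first row from the second. This is a minimal sketch; the names quartiles and iqrWidths are illustrative and not used elsewhere in the lesson.

```matlab
load fisheriris
sepalLens = reshape(meas(:, 1), 50, 3);       % One column per species

quartiles = prctile(sepalLens, [25, 75]);     % 2 x 3: row 1 = 25th, row 2 = 75th percentile
iqrWidths = diff(quartiles);                  % 1 x 3: 75th minus 25th percentile per species

disp(size(quartiles))                         % Rows are percentiles, columns are species
disp(iqrWidths)
```

Because diff works down the rows of a 2D array, it returns one width per species.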
EXAMPLE 9: Plot median sepal length using the inter quartile range (IQR) for error bars
Create a new cell in which you type and execute:
lowerDist = sLenMedians - sLenIQR(1, :);                  % Size of bottom bar
upperDist = sLenIQR(2, :) - sLenMedians;                  % Size of top bar
figure
errorbar(xPositions, sLenMedians, lowerDist, upperDist, 'm*')  % Magenta asterisks
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies)
xlabel('Species of Iris');
ylabel('Sepal length in cm')
title(fisherTitle)
legend('Median (IQR error bars)', 'Location', 'Northwest')     % Upper left
You should see the following 2 variables in your Workspace Browser:
- lowerDist - distances from the median sepal lengths down to the lower edges (25th percentiles) of the IQR error bars
- upperDist - distances from the median sepal lengths up to the upper edges (75th percentiles) of the IQR error bars
You should see a Figure Window with median/IQR error bars:
EXAMPLE 10: Calculate the means and standard deviations of all characteristics
Create a new cell in which you type and execute:
setosa = meas(1:50, :);               % First 50 rows are setosa
versicolor = meas(51:100, :);         % Second 50 rows are versicolor
virginica = meas(101:150, :);         % Third 50 rows are virginica
irisMeans = [mean(setosa); mean(versicolor); mean(virginica)];
irisSTDs = [std(setosa); std(versicolor); std(virginica)];
You should see the following 2 variables in your Workspace Browser:
- irisMeans - a 3 x 4 array with means of the different measurements in different species
- irisSTDs - a 3 x 4 array with standard deviations of the different measurements in different species
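Hard-coded row ranges work only while the specimens stay in their original order. As an alternative sketch (not the lesson's own method), the rows for each species can be selected by name using the species variable; meansByName, sdsByName, and speciesNames are illustrative names.

```matlab
load fisheriris
speciesNames = {'setosa', 'versicolor', 'virginica'};    % Values found in species
numSpecies = numel(speciesNames);

meansByName = zeros(numSpecies, size(meas, 2));          % 3 x 4 array of means
sdsByName = zeros(numSpecies, size(meas, 2));            % 3 x 4 array of SDs
for k = 1:numSpecies
    rows = strcmp(species, speciesNames{k});             % Logical mask for species k
    meansByName(k, :) = mean(meas(rows, :));
    sdsByName(k, :) = std(meas(rows, :));
end
```

Row k of each result corresponds to speciesNames{k}, so the labels and the statistics cannot drift apart if the data are reordered.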
EXAMPLE 11: Draw a grouped bar chart of mean iris characteristics
Create a new cell in which you type and execute:
irisMeas = {'Sepal length', 'Sepal width', 'Petal length', ...
'Petal width'}; % Characteristics for legend
figure
bar(irisMeans, 'grouped') % Rows are groups columns are group members
set(gca, 'XTickLabel', irisSpecies);
legend(irisMeas, 'Location', 'Northwest')
xlabel('Species of Iris');
ylabel('Mean size in cm')
title(fisherTitle)
You should see the following variable in your Workspace Browser:
- irisMeas - a cell array of legend labels for the bar chart
You should see a Figure Window with a labeled group bar chart (groups correspond to species):
EXAMPLE 12: Plot the means of all characteristics using SD error bars
Create a new cell in which you type and execute:
figure
errorbar(irisMeans, irisSTDs, 's')    % Use default colors and square markers
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies);
legend(irisMeas, 'Location', 'Northwest')
xlabel('Species of Iris');
ylabel('Mean size in cm')
title(fisherTitle)
You should see a Figure Window with multiple labeled error bar plots:
EXAMPLE 13: Plot the means of all characteristics using connected SD error bars
Create a new cell in which you type and execute:
figure
hold on
errorbar(irisMeans(:, 1), irisSTDs(:, 1), ':sk')
errorbar(irisMeans(:, 2), irisSTDs(:, 2), ':og')
errorbar(irisMeans(:, 3), irisSTDs(:, 3), ':vb')
errorbar(irisMeans(:, 4), irisSTDs(:, 4), ':^r')
set(gca, 'XTick', 1:3, 'XTickLabel', irisSpecies);
legend(irisMeas, 'Location', 'Northwest')
xlabel('Species of Iris');
ylabel('Mean size in cm')
title(fisherTitle)
hold off
You should see a Figure Window with multiple labeled error bar plots (with bars representing the same characteristic being connected):
SUMMARY OF SYNTAX
| MATLAB syntax | Description |
| errorbar(Y, E) | creates a plot of the values of Y similar to plot(Y). The corresponding values in E show error bars at +/- that amount above and below the corresponding values in Y. |
| errorbar(X, Y, E) | creates a plot similar to errorbar(Y, E) except that this function uses the values of X for the x positions rather than using the integers 1, 2, ... . |
| errorbar(X, Y, L, U) | creates a plot similar to errorbar(X, Y, E) except that this function uses the values of L and U to determine the span of the error bars. The L array gives the distances below the corresponding values in Y, and the U array gives the distances above the corresponding values of Y. |
| Y = prctile(X, p) | returns a vector of the percentiles of the vector X. The vector p specifies the percentiles. When X is a 2D array, the i-th row of Y contains the percentiles p(i). |
| set(gca, PropertyName, PropertyValue) | sets a property of the current axis. The PropertyName argument is a string representing a property. The PropertyValue argument gives the value of the property. |
_This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 31-Dec-2010. Please contact krobbins@cs.utsa.edu with comments or suggestions. The image is a reproduction of a drawing by James Sowerby that appeared in The Botanical Magazine vol. 1. no. 1 (1792). The drawing is available on Wikipedia at http://en.wikipedia.org/wiki/File:Iris_persica_%28Sowerby%29.jpg._