LESSON 15: Hypothesis testing

FOCUS QUESTION: How can I tell whether the experimental group is different from the control group?

In this lesson you will:
  • Formulate and test a hypothesis regarding a population mean.
  • Apply the one sample t-test to assess the true mean.
  • Apply the two sample t-test to assess whether two samples are likely to come from populations with the same mean.
  • Use p-values and confidence intervals.
Photograph of Sir Ronald Fisher, founder of modern statistics


DATA FOR THIS LESSON

File Description
fisheriris
  • This data set contains the famous Fisher iris data set. It consists of measurements of 150 flower samples, 50 from each of three species: Iris setosa, Iris versicolor, and Iris virginica. The measurements are in cm.
  • Four features were measured for each sample:
    • The length of the flower sepal
    • The width of the flower sepal
    • The length of the flower petal
    • The width of the flower petal
  • All 150 samples from the Fisher iris data are stored in a single table called meas:
    • The four columns correspond to the four types of measurements: sepal length, sepal width, petal length and petal width, respectively.
    • The first 50 rows contain data for Iris setosa.
    • The second 50 rows contain data for Iris versicolor.
    • The third 50 rows contain data for Iris virginica.
  • The species information is kept in a separate vector called species.
  • The data is sometimes referred to as Anderson's Iris data in honor of Edgar Anderson, the biologist who collected the data. See http://en.wikipedia.org/wiki/Iris_flower_data_set for additional information.

Note: This dataset comes with the MATLAB distribution so you don't have to download it separately.

DaphneIslandBeaks.txt
SantaCruzIslandBeaks.txt
  • The data set consists of measurements of beak sizes in mm of one species of Darwin's ground finch (Geospiza fortis) taken at Daphne Island and at Santa Cruz Island in the Galápagos by Peter and Rosemary Grant.
  • The populations of the two islands differ, although the islands are less than 10 km apart.
  • The data was extracted from a data set distributed with the case study Natural Selection and Darwin's Finches by Martin Wikelski available on the web at http://wps.prenhall.com/esm_freeman_evol_3/0,8018,8412374-,00.html.
  • The original data is summarized in the article "The classical case of character release: Darwin's finches (Geospiza) on Isla Daphne Major, Galápagos" by P. T. Boag and P. R. Grant, which appeared in Biological Journal of the Linnean Society 22:243-287 (1984).

See http://en.wikipedia.org/wiki/Peter_and_Rosemary_Grant for additional information on the work of Peter and Rosemary Grant.

diaries.mat
  • The data set contains sleep diary data for a cohort, stored in MATLAB variables.
  • The arrays have a column for each person.
  • The vectors have an element for each person.
  • The values in column n correspond to the same person as the value in position n of each vector.
  • The file contains the following variables:
    • bedTimes - array of bed times in decimal-date format.
    • dayCaffeine - array of daytime caffeine indicators.
    • gender - vector of male/female gender designators.
    • nightCaffeine - array of evening caffeine indicators.
    • section - vector of section indicators. The possible section numbers are 0, 1, 2, and 3. Section 0 contains only a single instructor. The remaining values correspond to course section numbers.
    • toSleepMinutes - an array of number of minutes to fall asleep.
    • useAlarm - array of alarm use indicators.
    • wakeTimes - array of wakeup times in decimal-date format.
  • The data was originally gathered by students taking CS 1173 in the fall 2009 semester and anonymized and randomized to be unidentifiable.
  • The first column of each array represents the instructor's values, the rest of the columns represent individual students.
  • Diaries were recorded for 21 days (from September 23, 2009 to October 13, 2009).

SETUP FOR LESSON 15

SUGGESTED READING: Wikipedia has a discussion of hypothesis testing found at <http://en.wikipedia.org/wiki/Statistical_hypothesis_testing>.

SUGGESTED READING: Wikipedia also has a discussion of the concept of the null hypothesis which is somewhat readable. The discussion can be found at <http://en.wikipedia.org/wiki/Null_hypothesis>.

SUGGESTED READING: Wikipedia discusses the meaning of the p-value and the frequent misunderstandings in interpreting it. The discussion can be found at <http://en.wikipedia.org/wiki/Pvalue>.


EXAMPLE 1: Load the Fisher iris data and extract sepal lengths of Iris setosa

Create a new cell in which you type and execute:

   load fisheriris;
   sLenSetosa = meas(strcmp(species, 'setosa'), 1);  % Sepal lengths of Iris setosa

You should see the variables meas, species, and sLenSetosa in your Workspace Browser.


EXAMPLE 2: Test Iris setosa mean sepal length using a one sample t-test

Create a new cell in which you type and execute:

   fprintf('Estimated population mean is %g\n', mean(sLenSetosa));
   testMean = 5.936;    % Could the true mean sepal length be this value?
   fprintf('True population mean of Iris setosa sepal length ');
   if ttest(sLenSetosa, testMean) == 1
       fprintf('is likely to be different from %g\n', testMean);
   else
       fprintf('could be %g\n', testMean);
   end;

You should see the variable testMean in your Workspace Browser.

You should also see the following output in the Command Window:

Estimated population mean is 5.006
True population mean of Iris setosa sepal length is likely to be different from 5.936


EXAMPLE 3: Apply one sample t-test (standard statistical terminology)

Create a new cell in which you type and execute:

   testMean = 5.936;    % Could the true mean sepal length be this value?
   fprintf(['\nNull hypothesis: The true mean Iris setosa sepal length ' ...
            'is %g\n'], testMean);
   fprintf(['Alt hypothesis: The true mean Iris setosa sepal length ' ...
           'is different from %g\n'], testMean);
   if ttest(sLenSetosa, testMean) == 1
       fprintf('\tReject null hypothesis in favor of alternative ');
   else
       fprintf('\tCannot reject the null hypothesis ');
   end;
   fprintf('at the 0.05 significance level\n');

You should see the following variable in your Workspace Browser:

You should also see the following output in the Command Window:

Null hypothesis: The true mean Iris setosa sepal length is 5.936
Alt hypothesis: The true mean Iris setosa sepal length is different from 5.936
	Reject null hypothesis in favor of alternative at the 0.05 significance level


EXAMPLE 4: Test for inequality using the one sample t-test

Create a new cell in which you type and execute:

   testMean = 5.936;    % Could the true mean sepal length be this value?
   fprintf('\nTrue population mean of Iris setosa sepal length ');
   if ttest(sLenSetosa, testMean, 0.05, 'left') == 1
       fprintf('is likely to be less than ')
   else
       fprintf('could be greater than or equal to ');
   end;
   fprintf('%g\n', testMean);

You should see the following variable in your Workspace Browser:

You should also see the following output in the Command Window:

True population mean of Iris setosa sepal length is likely to be less than 5.936
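
The positional call in Example 4 can also be written with name-value arguments, which state the tail and significance level explicitly. The sketch below assumes a MATLAB release whose ttest accepts the 'Alpha' and 'Tail' options; the names hLeft, pLeft, hRight, and pRight are introduced here only for illustration. It also runs the opposite one-sided test so that the two tails can be compared.

```matlab
load fisheriris;                                   % data set shipped with MATLAB
sLenSetosa = meas(strcmp(species, 'setosa'), 1);   % sepal lengths of Iris setosa
testMean = 5.936;                                  % hypothesized population mean

% Left-tailed test: is the true mean less than testMean?
[hLeft, pLeft] = ttest(sLenSetosa, testMean, 'Alpha', 0.05, 'Tail', 'left');

% Right-tailed test: is the true mean greater than testMean?
[hRight, pRight] = ttest(sLenSetosa, testMean, 'Alpha', 0.05, 'Tail', 'right');

fprintf('left-tailed:  h = %g, p = %g\n', hLeft, pLeft);
fprintf('right-tailed: h = %g, p = %g\n', hRight, pRight);
```

Because the sample mean (5.006) lies well below 5.936, only the left-tailed test rejects its null hypothesis. The two one-sided p-values add up to 1, since they measure opposite tails of the same t distribution.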


EXAMPLE 5: Look at the p-value and the confidence interval

Create a new cell in which you type and execute:

   testMean = 5.936;    % Could the true mean sepal length be this value?
   fprintf( ...
      '\nIs the true population mean of Iris setosa different from %g?\n', ...
       testMean);
   [h, p, ci] = ttest(sLenSetosa, testMean);
   fprintf('\t hypothesis = %g\n', h);  % 1 = reject null hypothesis at 0.05 level, 0 = cannot reject
   fprintf('\t pvalue = %g\n', p); % Smaller value indicates stronger evidence against the null
   fprintf('\t 95%% confidence interval for population mean: [%g, %g]\n', ci);

You should see the variables h, p, and ci in your Workspace Browser.

You should also see the following output in the Command Window:

Is the true population mean of Iris setosa different from 5.936?
	 hypothesis = 1
	 pvalue = 6.6085e-24
	 95% confidence interval for population mean: [4.90582, 5.10618]
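
To see where the p-value and confidence interval in Example 5 come from, the sketch below recomputes them from the usual one-sample t formulas and compares them with the output of ttest. The variable names (x, m, se, t, tCrit, pManual, ciManual) are chosen here for illustration; tcdf and tinv are the t-distribution functions from the same toolbox that provides ttest.

```matlab
load fisheriris;                              % data set shipped with MATLAB
x = meas(strcmp(species, 'setosa'), 1);       % sepal lengths of Iris setosa
m = 5.936;                                    % hypothesized population mean
alpha = 0.05;                                 % significance level

n  = numel(x);                                % sample size
se = std(x) / sqrt(n);                        % standard error of the sample mean
t  = (mean(x) - m) / se;                      % t statistic, n - 1 degrees of freedom

pManual  = 2 * tcdf(-abs(t), n - 1);          % two-sided p-value
tCrit    = tinv(1 - alpha/2, n - 1);          % two-sided critical value
ciManual = mean(x) + [-1; 1] * tCrit * se;    % confidence interval for the mean

[h, p, ci] = ttest(x, m);                     % built-in test for comparison
fprintf('manual: p = %g, ci = [%g, %g]\n', pManual, ciManual);
fprintf('ttest:  p = %g, ci = [%g, %g]\n', p, ci);
```

The two lines of output should agree. Note that the confidence interval depends only on the sample, not on the hypothesized mean m; the test rejects exactly when m falls outside the interval.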


EXAMPLE 6: Load the Daphne Island and Santa Cruz Island beak size data

Create a new cell in which you type and execute:

    Daphne = load('DaphneIslandBeaks.txt');
    SantaCruz = load('SantaCruzIslandBeaks.txt');

You should see 2 new variables, Daphne and SantaCruz, in your Workspace Browser.


EXAMPLE 7: Are Daphne finch beak sizes different from those of Santa Cruz finches?

Create a new cell in which you type and execute:

   fprintf(['\nNull hypothesis: The true mean beak sizes of Daphne and ' ...
            'Santa Cruz finches are equal\n']);
   fprintf(['Alt hypothesis: The true mean beak size of Daphne finches ' ...
            'is different from the true mean beak size of Santa Cruz finches\n']);
   if ttest2(Daphne, SantaCruz) == 1
       fprintf('\tReject null hypothesis in favor of alternative ');
   else
       fprintf('\tCannot reject the null hypothesis ');
   end;
   fprintf('at the 0.05 significance level\n');

You should see the following output in the Command Window:

Null hypothesis: The true mean beak sizes of Daphne and Santa Cruz finches are equal
Alt hypothesis: The true mean beak size of Daphne finches is different from the true mean beak size of Santa Cruz finches
	Reject null hypothesis in favor of alternative at the 0.05 significance level


EXAMPLE 8: Look at the p-value and confidence interval for the two-sample t-test

Create a new cell in which you type and execute:

   fprintf( ...
      '\nIs the true population mean of Daphne finches different from SantaCruz?\n');
   [h, p, ci] = ttest2(Daphne, SantaCruz);
   fprintf('\t hypothesis = %g\n', h);  % 1 = reject null hypothesis at 0.05 level, 0 = cannot reject
   fprintf('\t pvalue = %g\n', p); % Smaller value indicates stronger evidence against the null
   fprintf('\t 95%% confidence interval for difference of population means: ');
   fprintf('[%g, %g]\n', ci);

You should see the following variables in your Workspace Browser:

You should also see the following output in the Command Window:

Is the true population mean of Daphne finches different from SantaCruz?
	 hypothesis = 1
	 pvalue = 2.77109e-11
	 95% confidence interval for difference of population means: [-1.4464, -0.795025]
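
The two-sample test can be reproduced by hand in the same spirit. So that it runs without the beak files, the sketch below substitutes two species from fisheriris for the two islands; that substitution and all variable names are assumptions made for illustration. By default ttest2 pools the two sample variances, which presumes the populations have equal variance; the 'Vartype', 'unequal' option drops that assumption.

```matlab
load fisheriris;                                   % data set shipped with MATLAB
x = meas(strcmp(species, 'setosa'), 1);            % sample 1: setosa sepal lengths
y = meas(strcmp(species, 'versicolor'), 1);        % sample 2: versicolor sepal lengths

nx = numel(x);
ny = numel(y);
df = nx + ny - 2;                                  % degrees of freedom (pooled test)

% Pooled standard deviation and standard error of the difference of means
sPooled = sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / df);
se = sPooled * sqrt(1/nx + 1/ny);

t = (mean(x) - mean(y)) / se;                      % two-sample t statistic
pManual  = 2 * tcdf(-abs(t), df);                  % two-sided p-value
ciManual = (mean(x) - mean(y)) + [-1; 1] * tinv(0.975, df) * se;

[h, p, ci] = ttest2(x, y);                         % built-in pooled-variance test
[hW, pW] = ttest2(x, y, 'Vartype', 'unequal');     % variant without equal variances

fprintf('manual: p = %g, ci = [%g, %g]\n', pManual, ciManual);
fprintf('ttest2: p = %g, ci = [%g, %g]\n', p, ci);
fprintf('unequal variances: h = %g, p = %g\n', hW, pW);
```

The confidence interval is for the mean of the first sample minus the mean of the second, which is why the interval in Example 8 is negative: the Daphne beaks are smaller on average than the Santa Cruz beaks.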


EXAMPLE 9: Load the consolidated sleep diary data

Create a new cell in which you type and execute:

    load diaries.mat;  % Load the sleep diaries

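You should see 8 new variables in the Workspace Browser: bedTimes, dayCaffeine, gender, nightCaffeine, section, toSleepMinutes, useAlarm, and wakeTimes.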


EXAMPLE 10: Calculate wake-up hours, separated by gender

Create a new cell in which you type and execute:

   wakeupHours = (wakeTimes - floor(wakeTimes))*24; % Get fractional part of wakeTimes
   men = strcmp('male', gender);           % 1's where gender is 'male'
   mensWHours = wakeupHours(:, men);      % Pick columns corresponding to men
   women = ~men;                          % 1's where gender is 'female'
   womensWHours = wakeupHours(:, women);   % Pick columns corresponding to women
   numSubjects = size(wakeupHours, 2);     % Also need number of subjects

You should see the variables wakeupHours, men, mensWHours, women, womensWHours, and numSubjects in your Workspace Browser.


EXAMPLE 11: Do men get up earlier than women on average?

Create a new cell in which you type and execute:

    fprintf('\nThe average wake-up time for men ');
    [h, p, ci] = ttest2(mensWHours(:), womensWHours(:));
    if h == 1
        fprintf('is likely to be different from that of women\n');
    else
        fprintf('could be the same as that of women\n');
    end;
    fprintf('\t hypothesis = %g\n', h);  % 1 = reject null hypothesis at 0.05 level, 0 = cannot reject
    fprintf('\t pvalue = %g\n', p);
    fprintf('\t 95%% confidence interval for difference = [%g, %g]\n', ci);

You should see the following variables in your Workspace Browser:

You should also see the following output in the Command Window:

The average wake-up time for men is likely to be different from that of women
	 hypothesis = 1
	 pvalue = 0.0114848
	 95% confidence interval for difference = [-0.421104, -0.0533088]


EXAMPLE 12: Compare average wake-up time of instructor to that of a random student

Create a new cell in which you type and execute:

    randStudent = randi(numSubjects - 1, 1, 1) + 1; % Pick a random student
    fprintf('\nThe average wake-up time of the instructor ');
    if ttest2(wakeupHours(:, 1), wakeupHours(:, randStudent)) == 1
        fprintf('is likely to be different than ');
    else
        fprintf('could be the same as ');
    end;
    fprintf('the average wake-up time for subject %d\n', randStudent);

You should see the variable randStudent in your Workspace Browser.

You should also see output similar to the following in the Command Window (the student is picked at random, so your subject number and result may differ):

The average wake-up time of the instructor is likely to be different than the average wake-up time for subject 86


EXAMPLE 13: Output IDs of the subjects whose average wake-up time is similar to the instructor's

Create a new cell in which you type and execute:

    averWakeup = mean(wakeupHours);
    fprintf(['\nThe following students have average wake-up times ' ...
            'indistinguishable from instructor''s (%g):\n'], averWakeup(1));
    for k = 2:numSubjects
       if ttest2(wakeupHours(:, 1), wakeupHours(:, k)) == 0
          fprintf('\t Subject %g''s average wake-up time = %g\n', k, averWakeup(k));
       end;
    end;

You should see the variables averWakeup and k in your Workspace Browser.

You should also see the following output in the Command Window:

The following students have average wake-up times indistinguishable from instructor's (6.38685):
	 Subject 6's average wake-up time = 6.44752
	 Subject 10's average wake-up time = 6.26992
	 Subject 13's average wake-up time = 6.95582
	 Subject 32's average wake-up time = 5.87253
	 Subject 37's average wake-up time = 6.963
	 Subject 47's average wake-up time = 7.18662
	 Subject 50's average wake-up time = 7.03739
	 Subject 56's average wake-up time = 6.71099
	 Subject 60's average wake-up time = 7.12994
	 Subject 71's average wake-up time = 7.87514
	 Subject 75's average wake-up time = 6.92933
	 Subject 79's average wake-up time = 7.06183
	 Subject 81's average wake-up time = 7.9929
	 Subject 82's average wake-up time = 7.30281
	 Subject 91's average wake-up time = 5.50243
	 Subject 100's average wake-up time = 6.38685
	 Subject 102's average wake-up time = 6.14901
	 Subject 111's average wake-up time = 7.35553
	 Subject 114's average wake-up time = 6.30092
	 Subject 129's average wake-up time = 6.63787
	 Subject 133's average wake-up time = 7.05643
	 Subject 136's average wake-up time = 7.09694
	 Subject 138's average wake-up time = 6.90769
	 Subject 140's average wake-up time = 5.68481


SUMMARY OF SYNTAX

MATLAB syntax Description
Y = abs(X) the array Y is the same size as the array X and contains the absolute values of the corresponding elements of the array X.
h = ttest(X, m)

performs a one-sample Student's t-test to determine whether the true mean of the population represented by the sample in the vector X differs from m. The significance level for the test is 0.05.

If h is 1, then it is likely that the mean of the population represented by the sample X is different from m.

If h is 0, then you don't have enough evidence to conclude that the mean is different from m.

The ttest assumes that X is a random sample drawn from a normally distributed population.

If X is an array, ttest works along the first non-singleton dimension. Note: Do NOT take the mean of X before applying ttest.

[h, p, ci] = ttest(X, m)

performs a one-sample Student's t-test to determine whether the true mean of the population represented by the sample in the vector X differs from m.

The variable p is the p-value: the probability of observing a test statistic at least as extreme as the one computed from X if the population mean were actually equal to m.

The variable ci holds the 95% confidence interval for the true mean.

[h, p, ci] = ttest(X, m, alpha)

performs a one-sample Student's t-test at significance level alpha to determine whether the true mean of the population represented by the sample in the vector X is different from m.

The variable p is the p-value: the probability of observing a test statistic at least as extreme as the one computed from X if the population mean were actually equal to m.

The variable ci holds the 100*(1 - alpha)% confidence interval for the true mean.

[h, p, ci] = ttest(X, m, alpha, 'left')

performs a one-sided one-sample Student's t-test at significance level alpha to determine whether the true mean of the population represented by the sample in the vector X is less than m.

If h is 1, then it is likely that the mean of the population represented by the sample X is less than m.

The variable p is the p-value: the probability of observing a test statistic at least as far below zero as the one computed from X if the population mean were actually equal to m.

The variable ci holds the 100*(1 - alpha)% confidence interval for the true mean.

[h, p, ci] = ttest(X, m, alpha, 'right')

performs a one-sided one-sample Student's t-test at significance level alpha to determine whether the true mean of the population represented by the sample in the vector X is greater than m.

If h is 1, then it is likely that the mean of the population represented by the sample X is greater than m.

The variable p is the p-value: the probability of observing a test statistic at least as far above zero as the one computed from X if the population mean were actually equal to m.

The variable ci holds the 100*(1 - alpha)% confidence interval for the true mean.

h = ttest2(X, Y)

performs a two-sample Student's t-test to determine whether the true means of the populations represented by the samples X and Y are different.

If h is 1, then it is likely that the means of the respective populations represented by samples X and Y are different.

If h is 0, then you don't have enough evidence to conclude that the means are different. The significance level for the test is 0.05.

The ttest2 function assumes that X and Y are independent random samples drawn from normally distributed populations. By default it also assumes that the two populations have equal variances.

If X is an array, ttest2 works along the first non-singleton dimension. In this case Y must be the same size as X except along the first non-singleton dimension. Note: Do NOT take the mean of X or of Y before applying ttest2.

[h, p, ci] = ttest2(X, Y)

performs a two-sample Student's t-test to determine whether the true means of the populations represented by the samples X and Y are different.

The variable p is the p-value: the probability of observing a test statistic at least as extreme as the one computed from X and Y if the population means were actually equal.

The variable ci is the 95% confidence interval for the difference of the two population means.

[h, p, ci] = ttest2(X, Y, alpha)

performs a two-sample Student's t-test at significance level alpha to determine whether the true means of the populations represented by the samples X and Y are different.

The variable p is the p-value: the probability of observing a test statistic at least as extreme as the one computed from X and Y if the population means were actually equal.

The variable ci holds the 100*(1 - alpha)% confidence interval for the difference of the true population means.

[h, p, ci] = ttest2(X, Y, alpha, 'left')

performs a one-sided two-sample Student's t-test at significance level alpha to determine whether the true mean of the population represented by the sample X is less than the true mean of the population represented by the sample Y.

If h is 1, then it is likely that the mean of the population represented by the sample X is less than the population mean represented by the sample Y.

The variable p is the p-value: the probability of observing a test statistic at least as far below zero as the one computed from X and Y if the two population means were actually equal (the boundary case of the null hypothesis that the mean for X is greater than or equal to the mean for Y).

The variable ci holds the 100*(1 - alpha)% confidence interval for the difference of the two population means.

[h, p, ci] = ttest2(X, Y, alpha, 'right')

performs a one-sided two-sample Student's t-test at significance level alpha to determine whether the true mean of the population represented by the sample X is greater than the true mean of the population represented by the sample Y.

If h is 1, then it is likely that the mean of the population represented by the sample X is greater than the population mean represented by the sample Y.

The variable p is the p-value: the probability of observing a test statistic at least as far above zero as the one computed from X and Y if the two population means were actually equal (the boundary case of the null hypothesis that the mean for X is less than or equal to the mean for Y).

The variable ci holds the 100*(1 - alpha)% confidence interval for the difference of the two population means.
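
The notes above about array inputs and about not averaging first can be illustrated with a short sketch. It uses the four setosa measurement columns from fisheriris and an arbitrary hypothesized mean of 5 for every column; the variable names hCols and hWrong are chosen here for illustration only.

```matlab
load fisheriris;                                  % data set shipped with MATLAB
setosa = meas(strcmp(species, 'setosa'), :);      % 50-by-4: one column per measurement

% Correct use: pass the raw samples. ttest works down each column and
% returns one decision per column (a 1-by-4 row vector).
hCols = ttest(setosa, 5);

% Incorrect use: averaging first collapses each column to a single number.
% mean(setosa) is 1-by-4, so ttest treats it as ONE sample of four values
% and the within-column variability the test relies on is lost.
hWrong = ttest(mean(setosa), 5);

fprintf('Decisions per column: %s\n', mat2str(hCols));
fprintf('Size of result after averaging first: %s\n', mat2str(size(hWrong)));
```

The first call answers four separate questions, one per measurement. The second answers a different and meaningless question about the four column means taken together.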


_This lesson was written by Kay A. Robbins of the University of Texas at San Antonio and last modified on 31-Dec-2010. Please contact krobbins@cs.utsa.edu with comments or suggestions. The photo is of Sir Ronald Fisher, founder of modern statistics and namesake of the Fisher Iris dataset. (See http://en.wikipedia.org/wiki/File:R._A._Fischer.jpg.)_