Single-classification anova: Unplanned comparisons of means
In a Model I anova, it is often desirable to perform additional comparisons of subsets of the means. If you didn't decide on some planned comparisons before doing the anova, you will be doing unplanned comparisons. Because these are unplanned, you can't just do the comparison as an anova and use the resulting P-value. Instead you have to use a test that takes into account the large number of possible comparisons you could have done. For example, if you did an anova with five groups (A, B, C, D, and E), then noticed that A had the highest mean and D had the lowest, you couldn't do an anova on just A and D. There are 10 possible pairs you could have compared (A with B, A with C, etc.), and the probability under the null hypothesis that at least one of those 10 pairs is "significant" at the P<0.05 level is much greater than 0.05. It gets much worse if you consider all of the possible ways of dividing the groups into two sets (A vs. B, A vs. B+C, A vs. B+C+D, A+B vs. C+D, etc.) or more than two sets (A vs. B vs. C, A vs. B vs. C+D, etc.).
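To get a feel for how quickly the problem grows, here is a rough Python sketch. It counts the pairs among five groups and computes the chance of at least one false positive if the 10 tests were independent. That independence is an assumption made only to keep the arithmetic simple (pairwise comparisons that share a group are not independent), so treat the result as a ballpark figure, not an exact error rate.

```python
from math import comb

def n_pairs(k):
    """Number of distinct pairs among k groups."""
    return comb(k, 2)

def familywise_error_if_independent(n_tests, alpha=0.05):
    """Chance of at least one false positive among n_tests tests,
    ASSUMING the tests are independent (a simplification)."""
    return 1 - (1 - alpha) ** n_tests

pairs = n_pairs(5)                             # groups A, B, C, D, E
rate = familywise_error_if_independent(pairs)
print(pairs, round(rate, 3))                   # 10 0.401
```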
Gabriel's comparison intervals
There is a bewildering array of tests that have been proposed for unplanned comparisons; some of the more popular include the Student-Newman-Keuls (SNK) test, Duncan's multiple range test, the Tukey-Kramer method, the REGWQ method, and Fisher's Least Significant Difference (LSD). For this class, we will only learn two techniques, Gabriel's comparison intervals and the Tukey-Kramer method, both of which apply only to unplanned comparisons of pairs of group means.
We will not consider tests that apply to unplanned comparisons of more than two means, or unplanned comparisons of subsets of groups. There are techniques available for this (the Scheffé test is probably the most common), but with a moderate number of groups, the number of possible comparisons becomes so large that the P-values required for significance become ridiculously small.
Gabriel comparison interval
To compute the Gabriel comparison interval, the standard error of the mean for a group is multiplied by the studentized maximum modulus times the square root of one-half. The standard error of the mean is estimated by dividing the MSwithin from the entire anova by the number of observations in the group, then taking the square root of that quantity. The studentized maximum modulus is a statistic that depends on the number of groups, the total sample size in the anova, and the desired probability level (alpha).
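The calculation just described can be sketched in a few lines of Python. The function names and all of the numbers below are hypothetical, chosen only to make the arithmetic easy to follow. In particular, the studentized maximum modulus is passed in as a plain number that you would look up in a statistical table; the value 3.0 used here is a placeholder, not a real critical value.

```python
from math import sqrt

def standard_error(ms_within, n):
    """Standard error of a group mean: sqrt(MSwithin / n)."""
    return sqrt(ms_within / n)

def gabriel_interval(ms_within, n, smm):
    """Gabriel comparison interval for one group:
    SE of the mean * studentized maximum modulus * sqrt(1/2)."""
    return standard_error(ms_within, n) * smm * sqrt(0.5)

# Hypothetical values, for illustration only
ms_within = 4.0   # MSwithin from the whole anova
n = 16            # observations in this group
smm = 3.0         # placeholder; look up the real value in a table

interval = gabriel_interval(ms_within, n, smm)
print(round(interval, 3))   # 1.061
```

Because the standard error uses each group's own sample size, groups with fewer observations get wider comparison intervals.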
Once the Gabriel comparison interval is calculated, the lower comparison limit is found by subtracting the interval from the mean, and the upper comparison limit is found by adding the interval to the mean. This is done for each group in an anova. Any pair of groups whose comparison intervals do not overlap is significantly different at the P<alpha level. For example, on the graph shown below, there is a significant difference in mean scutum width between insects from host 1 and insects from host 2. Host 1 and host 4 do not have significantly different mean scutum widths, because their Gabriel comparison intervals overlap.
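As a minimal sketch of the overlap rule, the Python below builds the comparison limits for a few groups and checks each pair. The means and intervals are made-up numbers, not the scutum-width data, and the choice to count intervals that exactly touch as overlapping is an assumption of this sketch.

```python
def comparison_limits(mean, interval):
    """Lower and upper comparison limits for one group."""
    return (mean - interval, mean + interval)

def intervals_overlap(a, b):
    """True if two (lower, upper) ranges overlap.
    Ranges that exactly touch are treated as overlapping (assumption)."""
    return a[0] <= b[1] and b[0] <= a[1]

def significantly_different(limits_a, limits_b):
    """Pairs whose comparison intervals do not overlap differ at P<alpha."""
    return not intervals_overlap(limits_a, limits_b)

# Hypothetical group means and Gabriel comparison intervals
g1 = comparison_limits(10.0, 1.0)   # (9.0, 11.0)
g2 = comparison_limits(12.5, 1.2)   # roughly (11.3, 13.7)
g3 = comparison_limits(11.5, 1.0)   # (10.5, 12.5)

print(significantly_different(g1, g2))   # True: no overlap
print(significantly_different(g1, g3))   # False: intervals overlap
```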
I like Gabriel comparison intervals; the results are about the same as with other techniques for unplanned comparisons of pairs of means, but you can present them in a more easily understood form. However, Gabriel comparison intervals are not that commonly used. If you are using them, it is very important to emphasize that the vertical bars represent comparison intervals and not the more common (but less useful) standard errors of the mean or 95% confidence intervals. You must also explain that means whose comparison intervals do not overlap are significantly different from each other.
Tukey-Kramer method
In the Tukey-Kramer method, the minimum significant difference (MSD) is calculated for each pair of means. If the observed difference between a pair of means is greater than the MSD, the pair of means is significantly different.
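The page does not spell out the MSD formula, so the sketch below uses the standard Tukey-Kramer form: the critical value of the studentized range times the square root of (MSwithin/2)(1/n_i + 1/n_j). The function names and numbers are hypothetical. The critical value q_crit is passed in as a number from a table; the 4.0 used here is a placeholder. One option for computing it might be scipy.stats.studentized_range in recent SciPy versions, but that is not used here.

```python
from math import sqrt

def tukey_kramer_msd(q_crit, ms_within, n_i, n_j):
    """Minimum significant difference for one pair of groups.
    q_crit: studentized-range critical value for the number of groups
            and the within-group degrees of freedom (from a table)."""
    return q_crit * sqrt(ms_within / 2 * (1 / n_i + 1 / n_j))

def is_significant(mean_i, mean_j, msd):
    """A pair differs significantly if |difference| exceeds the MSD."""
    return abs(mean_i - mean_j) > msd

# Hypothetical values, for illustration only
q_crit = 4.0      # placeholder critical value
ms_within = 4.0   # MSwithin from the whole anova
msd = tukey_kramer_msd(q_crit, ms_within, n_i=8, n_j=8)

print(round(msd, 3))                     # 2.828
print(is_significant(10.0, 13.5, msd))   # True: 3.5 > 2.828
print(is_significant(10.0, 11.5, msd))   # False: 1.5 < 2.828
```

Note that the MSD has to be recomputed for each pair when sample sizes differ, which is why two pairs with the same difference in means can give different results.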
The Tukey-Kramer method is much more popular than Gabriel comparison intervals. It is not as easy to display the results of the Tukey-Kramer method, however. One technique is to find all the sets of groups whose means do not differ significantly from each other, then indicate each set with a different symbol, like this:
Sport            Mean Arch Height
Volleyball       10.2   a
Basketball       10.4   a
Cross-country    11.7   a, b
Soccer           12.0   b
Lacrosse         12.4   b
Rugby            12.4   b
Softball         14.9   c
Crew             15.3   c
Swimming         16.6   d
Then you explain that "Means with the same letter are not significantly different from each other (Tukey-Kramer test, P<0.05)."
How to do the tests
The class single-classification anova spreadsheet, described on the anova significance page, calculates Gabriel comparison intervals and does the Tukey-Kramer test at the alpha=0.05 level, if you have 20 or fewer groups. I am not aware of any web pages that will calculate them.
Reference
Sokal and Rohlf 1995, pp. 240-260 (unplanned comparisons in general), 247-249 (Gabriel comparison intervals).
This page was last revised August 12, 2006. Its URL is statanovaunplanned.html