Brian S. Yandell (1997)
Chapman & Hall, London
B. Working with Groups of Data
- 4. Comparison of Groups
- 4.1 Graphical Summaries
- 4.2 Estimates of Means and Variance
- 4.3 Assumptions and Pivot Statistics
- 4.4 Interval Estimates of Means
- 4.5 Testing Hypotheses about Means
- 4.6 Formal Inference on the Variance
- 5. Comparing Several Means
- 5.1 Linear Contrasts of Means
- 5.2 Overall Test of Difference
- 5.3 Partitioning Sums of Squares
- 5.4 Expected Mean Squares
- 5.5 Power and Sample Size
- 6. Multiple Comparisons of Means
- 6.1 Experiment- and Comparison-Wise Error Rates
- 6.2 Comparisons Based on F-Tests
- 6.3 Comparisons Based on Range of Means
- 6.4 Comparison of Comparisons
determine nature & extent of differences among groups
stem-and-leaf plot
histogram
box-plot
balanced design -- equal sample size per group
population mean, variance & SD by group
math notation
least squares
normal equations
degrees of freedom
common variance -- equal across groups
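
A minimal numerical sketch of these estimates (hypothetical data, three groups of four): solving the normal equations by least squares just gives the group sample means, and the common variance is estimated by pooling the within-group sums of squares over their N - k degrees of freedom.

    import numpy as np

    # hypothetical data: three groups, balanced design with n = 4 per group
    groups = [np.array([23., 25., 22., 26.]),
              np.array([30., 28., 31., 29.]),
              np.array([24., 27., 25., 26.])]

    means = [g.mean() for g in groups]              # least-squares estimates of the group means
    n_i = [len(g) for g in groups]
    N, k = sum(n_i), len(groups)

    # pooled estimate of the common variance: within-group SS over its N - k degrees of freedom
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    s2 = ss_within / (N - k)
    print(means, s2, np.sqrt(s2))                   # group means, variance, SD
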
basic assumptions
(Linear) model correct (could be nonlinear)
Independent experimental units
Equal variance across units & groups
Normal distribution of errors (symmetry most important)
distribution notation (~)
pivot statistic
confidence interval / hypothesis testing
(usually) known & tabled distribution
relies on the basic assumptions
does not depend on unknown parameters
critical value & alpha
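
One concrete pivot, sketched for a single hypothetical sample under the basic assumptions above: the studentized mean (ybar - mu) / (s / sqrt(n)) has a known, tabled t distribution with n - 1 degrees of freedom and involves no unknown parameters, so the same quantity drives both the confidence interval and the test.

    import numpy as np
    from scipy import stats

    y = np.array([23., 25., 22., 26., 28.])       # hypothetical sample
    n = len(y)
    mu0 = 24.0                                     # mean specified by the null hypothesis

    se = y.std(ddof=1) / np.sqrt(n)                # standard error of the mean
    t_pivot = (y.mean() - mu0) / se                # pivot: t with n - 1 df when H0 and the assumptions hold
    t_crit = stats.t.ppf(0.975, df=n - 1)          # critical value for alpha = 0.05, two-sided
    print(t_pivot, t_crit, abs(t_pivot) > t_crit)
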
standard error (SE) vs. SD -- reporting
significant digits & precision
confidence interval
+/- 2 SE as rough guide
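
A sketch of the interval estimate for one hypothetical sample, contrasting the exact t-based interval (built from the SE, not the SD) with the rough +/- 2 SE guide; the two agree closely once the degrees of freedom are moderate.

    import numpy as np
    from scipy import stats

    y = np.array([23., 25., 22., 26., 28., 27., 24., 25.])   # hypothetical sample
    n = len(y)
    se = y.std(ddof=1) / np.sqrt(n)                # report the SE of the mean, not the SD, with the estimate

    t_crit = stats.t.ppf(0.975, df=n - 1)
    exact = (y.mean() - t_crit * se, y.mean() + t_crit * se)  # exact 95% confidence interval
    rough = (y.mean() - 2 * se, y.mean() + 2 * se)            # +/- 2 SE rough guide
    print(exact, rough)
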
test of hypothesis
how large is large?
alpha -- risk of error
experience -- substantial deviation
1- vs. 2- vs. 3-sided
power to detect differences
tradeoff
close to expected (null)
chance to find deviation significant
more later with 2-sample testing
classical significance level
Fisher's idea vs. dogma
strength of evidence
p-value as random variable
uniform (0,1) under null hypothesis
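
A quick simulation sketch of the last two points, assuming normal data and an ordinary one-sample t-test: when the null hypothesis is true, the p-value is itself a random variable with a Uniform(0,1) distribution, so about alpha of all repetitions come out "significant" by chance alone.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    pvals = []
    for _ in range(2000):
        y = rng.normal(loc=50.0, scale=4.0, size=10)       # null hypothesis mu = 50 is true here
        pvals.append(stats.ttest_1samp(y, 50.0).pvalue)
    pvals = np.array(pvals)
    print(pvals.mean(), (pvals < 0.05).mean())             # roughly 0.5 and roughly 0.05
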
null & alternative hypotheses include basic assumptions
chi-square variate
importance of assumptions (normality)
confidence interval
precision & significant digits
chi-square vs. normal approximation
pivot statistic for simple hypothesis
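
A sketch of the variance pivot for a single hypothetical sample: under normality, (n - 1) s^2 / sigma^2 is a chi-square variate with n - 1 degrees of freedom, which gives a confidence interval for sigma^2 and a test of a simple hypothesis about it; this is where the normality assumption matters most.

    import numpy as np
    from scipy import stats

    y = np.array([23., 25., 22., 26., 28., 27., 24., 25.])   # hypothetical sample
    n, s2 = len(y), y.var(ddof=1)

    lo, hi = stats.chi2.ppf([0.025, 0.975], df=n - 1)
    ci = ((n - 1) * s2 / hi, (n - 1) * s2 / lo)     # 95% CI for sigma^2 (note the quantile inversion)
    x0 = (n - 1) * s2 / 2.0 ** 2                    # pivot under the simple hypothesis sigma = 2
    print(ci, x0, stats.chi2.sf(x0, df=n - 1))      # upper-tail p-value against sigma > 2
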
hypothesis tests & confidence intervals for differences
using plots for tests of difference among means
shrinking mean CIs -- approximate tests based on overlap
notched box-plots
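
A small plotting sketch of the notched box-plot idea, with two hypothetical groups (matplotlib): the notches are approximate intervals around the group medians, so notches that do not overlap suggest, roughly, a real difference between the groups.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    a = rng.normal(50, 5, size=30)                 # hypothetical group A
    b = rng.normal(56, 5, size=30)                 # hypothetical group B

    plt.boxplot([a, b], notch=True)                # non-overlapping notches hint at a real difference
    plt.xticks([1, 2], ["A", "B"])
    plt.ylabel("response")
    plt.show()
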
compound hypotheses
all at once / as set of simple hypotheses
hint at orthogonality issue
linear combination of means
contrasts sum to zero
pivot statistic
confidence intervals & hypothesis tests
orthogonal contrasts
balanced experiment -- easy check
unbalanced experiment -- depends on sample sizes
orthogonal polynomials
recursive calculation
sequential fitting
Type I approach -- Gram-Schmidt
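
A numerical sketch with three hypothetical balanced groups: a contrast is a linear combination of group means whose coefficients sum to zero; its pivot is the estimate divided by its standard error (t with N - k degrees of freedom), and in a balanced design two contrasts are orthogonal exactly when their coefficient vectors have zero dot product.

    import numpy as np
    from scipy import stats

    groups = [np.array([23., 25., 22., 26.]),
              np.array([30., 28., 31., 29.]),
              np.array([24., 27., 25., 26.])]
    means = np.array([g.mean() for g in groups])
    n = np.array([len(g) for g in groups])
    N, k = n.sum(), len(groups)
    s2 = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)   # pooled variance

    c1 = np.array([1., -1., 0.])                # group 1 vs group 2 (coefficients sum to zero)
    c2 = np.array([1., 1., -2.])                # groups 1 & 2 vs group 3

    est = c1 @ means
    se = np.sqrt(s2 * (c1 ** 2 / n).sum())
    t = est / se                                # pivot: t with N - k df under H0: contrast = 0
    print(est, se, t, 2 * stats.t.sf(abs(t), df=N - k))

    # orthogonality: a balanced design only needs c1 . c2 = 0;
    # an unbalanced design needs sum(c1_i * c2_i / n_i) = 0 instead
    print(c1 @ c2, (c1 * c2 / n).sum())
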
quadratic form & chi-square
mean square for null hypothesis
F-statistics & F distribution
what is "large enough"?
basic idea
centered response = group mean + error
sums of squares
total = model + error
ANOVA table
F-statistic
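
A sketch of the partition for the same kind of hypothetical data: after centering at the grand mean, the total sum of squares splits exactly into a model (between-group) piece plus an error (within-group) piece, and the F-statistic is the ratio of their mean squares; scipy.stats.f_oneway should reproduce the same F and p-value.

    import numpy as np
    from scipy import stats

    groups = [np.array([23., 25., 22., 26.]),
              np.array([30., 28., 31., 29.]),
              np.array([24., 27., 25., 26.])]
    y_all = np.concatenate(groups)
    N, k = len(y_all), len(groups)
    grand = y_all.mean()

    ss_total = ((y_all - grand) ** 2).sum()
    ss_model = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)   # between groups
    ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)        # within groups
    ms_model, ms_error = ss_model / (k - 1), ss_error / (N - k)

    F = ms_model / ms_error
    p = stats.f.sf(F, k - 1, N - k)
    print(ss_total, ss_model + ss_error)           # total = model + error
    print(F, p)
    print(stats.f_oneway(*groups))                 # same F-statistic and p-value
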
expected value & rough critical value (4)
general F-test: full & reduced model
moments of chi-square statistic
non-central chi-square
non-centrality parameter
replace sample means with population means
shift to right & increased spread
relation to moments of F-statistic
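
A sketch of the expected mean squares with hypothetical population means standing in for the sample means: E(MS error) = sigma^2, while E(MS model) = sigma^2 + sum n_i (mu_i - mu_bar)^2 / (k - 1); the noncentrality parameter lambda = sum n_i (mu_i - mu_bar)^2 / sigma^2 is what shifts the chi-square (and hence the F) to the right and increases its spread.

    import numpy as np

    mu = np.array([24.0, 29.5, 25.5])       # hypothetical population group means
    n = np.array([4, 4, 4])                 # sample sizes per group
    sigma2 = 4.0                            # assumed common error variance
    k, N = len(mu), n.sum()

    mu_bar = (n * mu).sum() / N
    lam = (n * (mu - mu_bar) ** 2).sum() / sigma2                  # noncentrality parameter
    ems_error = sigma2                                             # E(MS error)
    ems_model = sigma2 + (n * (mu - mu_bar) ** 2).sum() / (k - 1)  # E(MS model)
    print(lam, ems_error, ems_model)
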
power for composite hypothesis seldom computed
power tables for F-test are complicated
noncentrality in "worst case scenario"
reduce to power study for t-test
power of 50% at critical value
balance power & size
4 SEs = 2 * critical value
modest imbalance -- slight adjustment of power
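
A sketch of a direct power calculation for the overall F-test under an assumed alternative (hypothetical means, variance, and balanced sample size), using the noncentral F distribution in scipy rather than printed power tables.

    import numpy as np
    from scipy import stats

    k, n, sigma2 = 3, 4, 4.0                        # groups, per-group size, assumed error variance
    mu = np.array([24.0, 26.0, 28.0])               # hypothetical population means under the alternative
    df1, df2 = k - 1, k * (n - 1)
    lam = n * ((mu - mu.mean()) ** 2).sum() / sigma2      # noncentrality parameter

    f_crit = stats.f.ppf(0.95, df1, df2)                  # size alpha = 0.05
    power = stats.ncf.sf(f_crit, df1, df2, lam)           # P(reject | alternative), noncentral F
    print(lam, f_crit, power)
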
examine differences among means in detail
compromise
find significant differences
comparison-wise error rate
avoid reporting spurious differences as real
experiment-wise error rate
generic methods vs. methods focused on specific interests
all-contrast comparisons (Scheffé)
all-pairs comparisons
multiple comparisons with best (Hsu)
multiple comparisons with control (Dunnett)
abuses & misconceptions (Hsu 1996)
before (pre-planned) & after (post-hoc) comparisons
conservative & liberal approaches
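
A small sketch of why the two error rates differ, assuming m independent comparisons each run at level alpha: the chance of at least one spurious "significant" result (the experiment-wise rate) grows like 1 - (1 - alpha)^m, which is the quantity the conservative adjustments below try to control.

    alpha = 0.05                               # comparison-wise error rate
    for m in (1, 3, 10, 45):                   # 45 = all pairs among 10 means
        print(m, 1 - (1 - alpha) ** m)         # experiment-wise rate: 0.05, 0.14, 0.40, 0.90
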
general procedure
conduct overall F-test
if evidence of differences
conduct pre-planned comparisons
investigate post-hoc comparisons
if no evidence for differences
only examine pre-planned comparisons
interpret results conservatively
cautions
overall test depends on choice of groups
null hypothesis is rarely true in practice
harmonic mean -- adjustment for minor imbalance
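
A one-line sketch of that harmonic-mean adjustment for hypothetical, mildly unbalanced sample sizes: range-based procedures tabled for a common group size are applied with the harmonic mean of the actual sizes in its place.

    import numpy as np

    n = np.array([8, 10, 9, 12])                   # hypothetical, mildly unbalanced sample sizes
    n_harmonic = len(n) / (1.0 / n).sum()          # harmonic mean, used in place of a common n
    print(n_harmonic, n.mean())                    # slightly smaller than the arithmetic mean
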
multiple-stage or step-down tests
compare progressively fewer means
SNK, REGWF, REGWQ
can address arbitrary contrasts
directly related to overall test of means
Fisher's LSD
ordinary t-tests for each comparison
can be too liberal
Bonferroni (BSD)
crude adjustment for multiple comparisons
can be too conservative if many comparisons done
Scheffé
all contrasts -- equivalent to overall F-test
tends to be conservative
liked by statisticians, ignored by many practitioners
REGWF
multiple-stage improvement on Scheffé
Waller-Duncan
studentized range statistic
known properties only for balanced experiment
more powerful for simple comparison of pairs of means
less powerful for general comparisons of means
Tukey's honest significant difference (HSD)
designed for pairwise comparisons
Student-Newman-Keuls (SNK)
multiple-stage test based on Tukey's HSD
can be too liberal
REGWQ
multiple-stage improvement on SNK
Duncan's multiple range
more liberal than SNK
higher level than alpha for comparing all k means
several practitioners recommend against this method
but still used by selected disciplines
liberal < conservative (each line below runs from more liberal on the left to more conservative on the right)
LSD < REGWF < Scheffé
LSD < Bonferroni
LSD < SNK < REGWQ < Tukey
Waller < Duncan < SNK
Waller < LSD < SNK
Duncan can be more liberal than LSD for all k means
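
A sketch comparing three of these cutoffs for pairwise differences among k hypothetical balanced groups with an assumed pooled MS error: Fisher's unadjusted LSD, Tukey's HSD from the studentized range (scipy.stats.studentized_range, available in recent SciPy), and the Bonferroni adjustment over all pairs; in this configuration they come out in liberal-to-conservative order.

    import numpy as np
    from scipy import stats

    k, n, ms_error = 4, 6, 4.0                     # groups, per-group size, assumed pooled MS error
    df_error = k * (n - 1)
    n_pairs = k * (k - 1) // 2
    se_diff = np.sqrt(2 * ms_error / n)            # SE of a difference of two group means

    lsd = stats.t.ppf(0.975, df_error) * se_diff                                  # Fisher's LSD (unadjusted)
    hsd = stats.studentized_range.ppf(0.95, k, df_error) * np.sqrt(ms_error / n)  # Tukey's HSD
    bonf = stats.t.ppf(1 - 0.05 / (2 * n_pairs), df_error) * se_diff              # Bonferroni (BSD)
    print(lsd, hsd, bonf)                          # increasing here: liberal to conservative
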
Last modified: Tue Feb 17 08:47:37 1998 by Brian Yandell
(yandell@stat.wisc.edu)