Brian S. Yandell (1997)
Chapman & Hall, London
E. Questioning Assumptions
- 13. Residual Plots
- 13.1 Departures from Assumptions
- 13.2 Incorrect Model
- 13.3 Correlated Responses
- 13.4 Unequal Variance
- 13.5 Non-normal Data
- 14. Comparisons with Unequal Variance
- 14.1 Comparing Means when Variances are Unequal
- 14.2 Weighted Analysis of Variance
- 14.3 Satterthwaite Approximation
- 14.4 Generalized Inference
- 14.5 Testing for Unequal Variances
- 15. Getting Free from Assumptions
- 15.1 Transforming Data
- 15.2 Comparisons using Ranks
- 15.3 Randomization
- 15.4 Monte Carlo Methods
usual assumptions
model is correct
responses are uncorrelated (usually independent)
variances are equal
responses (errors) have normal distribution
cautions & suggestions
use plot symbols
don't focus on range as measure of spread
jitter marginal means at factor levels to see all points
references
Belsley, Kuh and Welsch (1980) Regression Diagnostics
Carroll and Ruppert (1988) Transformation and Weighting in Regression
Miller (1986) Beyond Anova, Basics of Applied Statistics
Scheffe (1959, ch. 10) Analysis of Variance
Seber (1977, sec. 6.3) Linear Regression Analysis
plot residuals using symbols for group
against predicteds
against other responses
against marginals of other factors
outlier detection
many zeroes
keep or drop or dichotomize
banded pattern if response is discrete
jitter by 4% (Cleveland 1975)
multiple copies of residual plots
plot symbols by various factors
boxplots of residuals by groups
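The 4% jitter rule above can be sketched as follows (a minimal illustration; the `jitter` helper and the discrete group codes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

def jitter(x, frac=0.04):
    """Add uniform noise spanning frac (here 4%, per the note above)
    of the data range, so banded or discrete points do not overplot."""
    spread = frac * (np.max(x) - np.min(x))
    return x + rng.uniform(-spread / 2, spread / 2, size=len(x))

# hypothetical discrete group codes with many ties
groups = np.repeat([1.0, 2.0, 3.0], 20)
jittered = jitter(groups)
```

Plot `jittered` against the residuals; each tied point then lands at a slightly different horizontal position.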
bias = m_i - m = group mean - grand mean
E(MSE) = (m_i-m)^2 + sigma^2 = bias^2 + variance
residual plot
look for pattern in groups
plot against covariates
overfit has no bias, but is inefficient
high variance of estimates of effects
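The decomposition E(MSE) = bias^2 + variance for an underfit model can be checked by simulation (all numbers hypothetical: a group with mean 3 fitted by a grand mean of 1, error SD 2):

```python
import numpy as np

rng = np.random.default_rng(0)
m_i, m, sigma = 3.0, 1.0, 2.0   # group mean, fitted grand mean, error SD
n = 200_000

y = rng.normal(m_i, sigma, n)            # responses from group i
mse = np.mean((y - m) ** 2)              # squared error under the underfit model
expected = (m_i - m) ** 2 + sigma ** 2   # bias^2 + variance = 4 + 4 = 8
print(mse, expected)
```

The two printed values agree closely, illustrating that ignoring a real group effect inflates MSE by the squared bias.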
interaction & margin plots
residual plot
against time or run order
serial correlation
against space -- contour plot
scatter plot of residuals
from one time against another
model covariance structure
random and mixed models
repeated measures
multivariate models
residual plot
check for changing spread
tests in next chapter
T pivot statistics for 2 groups using pooled variance
V(T) = 1 if n1=n2 or var_1=var_2
otherwise it can be larger or smaller
p-value changes accordingly
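The size distortion of the pooled-variance t-test when neither n1 = n2 nor var_1 = var_2 holds can be seen directly by simulation (sample sizes and SDs are hypothetical; the small group is given the large variance, which makes the pooled test liberal):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
reps, n1, n2 = 4000, 5, 25          # unequal sizes, R = n1/n2 = 0.2
x = rng.normal(0, 3, (reps, n1))    # small group, large SD; null is true
y = rng.normal(0, 1, (reps, n2))    # large group, small SD

t, p = stats.ttest_ind(x, y, axis=1, equal_var=True)  # pooled variance
size = np.mean(p < 0.05)
print(size)   # well above the nominal 0.05
```

Reversing which group has the larger variance makes the same test conservative instead.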
Scheffe (1959, ch. 10) Analysis of Variance
(see Box (1953) Biometrika)
Seber (1977, sec. 6.3) Linear Regression Analysis
(see Box Watson (1962) Biometrika)
recall Central Limit Theorem
is mean a good summary of center of distribution?
skewness = E[(resp-mean)^3/sigma^3] (= 0 for normal)
mean may be far from most of the data
CIs have correct coverage, but should be skewed as well
kurtosis = E[(resp-mean)^4/sigma^4] - 3 (= 0 for normal)
V(variance) = sigma^4[2/(n-1) + kurtosis / n]
variance of variance estimate affected
inference on variance upset
inference on mean only slightly off
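The formula V(variance) = sigma^4[2/(n-1) + kurtosis/n] can be verified by simulation; a sketch using the Laplace distribution (variance 2, excess kurtosis 3 -- the distribution choice is mine, for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 20, 100_000
sigma2 = 2.0      # Laplace(0, 1) has variance 2
kurt = 3.0        # and excess kurtosis 3

s2 = rng.laplace(0, 1, (reps, n)).var(axis=1, ddof=1)
observed = s2.var()
formula = sigma2 ** 2 * (2 / (n - 1) + kurt / n)   # sigma^4[2/(n-1) + kurt/n]
print(observed, formula)
```

With normal data (kurtosis 0) the bracket reduces to 2/(n-1); heavy tails inflate V(s^2), which is why inference on variances is upset while inference on means is only slightly off.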
QQ plot against normal scores (normal plot)
histogram of residuals (by group)
allowing for unequal variance
pool, weight, transform
Behrens-Fisher problem -- inference
approximate -- Satterthwaite
exact -- Weerahandi
references
Scheffe (1959, ch. 10) Analysis of Variance
(see Box (1953) Biometrika)
Weerahandi (1995) Exact Statistical Methods for Data Analysis
(see Tsui and Weerahandi (1989) JASA)
pooled variance
size of test as R = n1/n2 varies
weighted anova
known weights
weights function of covariate
problems if weights depend on response
F-test arising when weights = 1 / variances
Welch's t-test
T pivot statistic for two groups with unequal variance
V(T) = 1 if n1=n2 or var_1=var_2
otherwise it can be larger or smaller
p-value changes accordingly (differs from pooled variance)
weighted sum of chi-square variates
S = s_1^2/n1 + s_2^2/n2
find X such that S approx X, where X ~ a chi^2_r
E(S) = a r, V(S) = 2 a^2 r
r = 2 [E(S)]^2/V(S)
a = V(S) / [2E(S)] = E(S) / r
r depends in complicated way on sizes and variances
estimate using sample variances
round down to nearest integer
T \approx t_r distribution
can do same idea with linear contrasts, etc.
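Plugging the sample variances into r = 2[E(S)]^2/V(S) gives the usual Welch-Satterthwaite degrees of freedom; a minimal sketch (the sample variances and sizes below are hypothetical):

```python
def satterthwaite_df(s1sq, n1, s2sq, n2):
    """Satterthwaite df for S = s1^2/n1 + s2^2/n2 matched to a*chi^2_r
    by moments: r = 2 E(S)^2 / V(S), with V(s_i^2) = 2 sigma_i^4/(n_i - 1)."""
    v1, v2 = s1sq / n1, s2sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# small group with large variance dominates: df falls near n1 - 1 = 4
df = satterthwaite_df(9.0, 5, 1.0, 25)
print(df)
```

When variances and sample sizes are equal, the formula returns the pooled df n1 + n2 - 2; rounding down to the nearest integer (as the note says) is slightly conservative.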
Weerahandi (1995)
generalize pivot trick
start with pivot where variances known
introduce new random variates
same distribution as variances
independent
combine using beta distribution (skip details)
combine to get pivot that does not depend on variances
problem: method depends heavily on normality
through distribution of variances
references
Milliken Johnson (1984, ch. 2) Analysis of Messy Data
Miller (1986) Beyond Anova
recall how kurtosis affects inference about variances
Hartley's F-max
ratio of largest to smallest group variances
special tables
requires equal sample sizes and normality
liberal or conservative approach to unequal sizes
Bartlett's
requires large samples and normality
compares log(pooled variance) to mean [log(group variances)]
uses chi-square approximation
Box considers anova on log variances
Levene's
analysis of variance on abs( response - predicted )
robust to violations of normality
nearly as good as Hartley's and Bartlett's in simulations
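Levene's test is just an anova on absolute deviations, so it can be sketched in a few lines (group data are hypothetical, with the third group given triple the spread):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
g1 = rng.normal(0, 1, 30)
g2 = rng.normal(0, 1, 30)
g3 = rng.normal(0, 3, 30)   # inflated spread

# Levene's test: one-way anova on abs(response - group center)
devs = [np.abs(g - g.mean()) for g in (g1, g2, g3)]
F, p = stats.f_oneway(*devs)
print(F, p)   # small p flags unequal variances
```

Using group medians in place of means (the Brown-Forsythe variant) makes the test even more robust to non-normality.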
nonparametric or distribution-free -- almost
independence
group differences comprise shift of center
similar spread (distribution shape) across groups
references
Snedecor and Cochran (198x, ch. 14)
Scheffe (1959, ch. 10)
Carroll and Ruppert (1988)
plot group mean against group SD
should be no relationship
purpose of transformation
stabilize variance
achieve normality (symmetry, kurtosis)
simplify model (remove interaction)
prior theoretical considerations (mechanistic model)
reasons to not transform
anova methods are fairly robust
need to justify transformation to peers
effect of empirical choice of transform on p-value
back-transform
return to original units for tables & plots
skewed confidence intervals reflect variation
or transform scale on plot
uneven spacing (1,2,5,10,20,50,100)
LSD bars may still apply
Box-Cox transformation
(y^L-1)/L or log(y) if L=0
maximize augmented likelihood (minimize MSE(L))
automatic pick of transform guided by common sense
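A sketch of the automatic pick: `scipy.stats.boxcox` maximizes the profile (augmented) likelihood over L. The data here are hypothetical lognormal responses, so the estimate should land near L = 0, i.e. the log transform:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# right-skewed positive data: y = exp(normal), so log is the "true" transform
y = np.exp(rng.normal(0, 1, 500))

z, lam = stats.boxcox(y)   # maximizes the profile log-likelihood over lambda
print(lam)                 # close to 0
```

Common sense then rounds the estimate to an interpretable value (0, 1/2, -1, ...) rather than reporting, say, L = 0.07.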
improve normality
logit for proportions -- spreads out tails (kurtosis)
log for right-skewed such as positive measurements
Carroll-Ruppert modification of Box-Cox for skewness
use absolute values, possibly centered
variance stabilizing transformations
E(y) = m, V(y) = g(m)
Taylor series approximation of f(y)
f(y) = f(m) + (y - m) f'(m)
V(f(y)) = [g(m)f'(m)]^2
f'(y) = c/g(y)
f(y) = integral of c/g(y)
most common transforms
SD proportional to mean f(y) = log(y)
constant CV
variance proportional to mean f(y) = sqrt(y)
Poisson counts
proportions (Binomial) f(y) = arcsin(sqrt(y))
variance largest around 0.5
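The square-root rule for counts can be checked by simulation (Poisson means below are hypothetical): raw variances grow with the mean, while variances of sqrt(y) hover near the delta-method value 1/4.

```python
import numpy as np

rng = np.random.default_rng(6)
means = [4.0, 16.0, 64.0]
raw_vars, sqrt_vars = [], []
for m in means:
    y = rng.poisson(m, 100_000)
    raw_vars.append(y.var())               # grows with the mean (approx m)
    sqrt_vars.append(np.sqrt(y).var())     # roughly constant (approx 1/4)
print(raw_vars)
print(sqrt_vars)
```

This is the g(m) = m case of the Taylor-series recipe above, since f(y) = integral of c/sqrt(y) gives f(y) = sqrt(y).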
transform to simplify model
Hoaglin-Mosteller-Tukey plot
recall MSE-adjusted effects
pure interaction against main effect
slope of regression line suggests transform
L = 1 - slope
replace observations by their ranks
Wilcoxon-type tests
Friedman test
blocking on one factor
rank within each block
other scores rarely used
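The Friedman test (rank within each block, then compare treatment rank sums) can be sketched with `scipy.stats.friedmanchisquare`; the blocked data below are hypothetical, with a clear shift added to the third treatment:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
blocks, treatments = 12, 3
base = rng.normal(0, 1, (blocks, 1))               # block effects
y = base + rng.normal(0, 0.5, (blocks, treatments))
y[:, 2] += 2.0                                     # shift for treatment 3

# ranks are taken within each block, so block effects drop out
chi2, p = stats.friedmanchisquare(y[:, 0], y[:, 1], y[:, 2])
print(chi2, p)   # small p: treatments differ
```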
normal QQ plot
Fisher's empirical justification of t-test and anova test
empirical distribution approximated by theoretical
theoretical easier to use in practice
permutations must preserve any restrictions on randomization
evidence against null hypothesis in extreme tails
what if there are too many permutations?
take simple random sample
pay attention to restrictions on randomization
choose sample size to achieve desired precision in p-value
Markov chain methods for very complicated models
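A Monte Carlo randomization test along the lines above, for two groups from a completely randomized design (the data are hypothetical; restricted designs would permute only within the restriction):

```python
import numpy as np

rng = np.random.default_rng(8)
x = np.array([3.1, 4.5, 2.8, 5.0, 3.9])   # hypothetical group A
y = np.array([1.2, 2.0, 1.7, 2.5, 1.1])   # hypothetical group B
obs = x.mean() - y.mean()

pooled = np.concatenate([x, y])
reps = 10_000                     # sample size sets p-value precision:
count = 0                         # SE(p-hat) approx sqrt(p(1-p)/reps)
for _ in range(reps):
    perm = rng.permutation(pooled)           # re-randomize group labels
    count += abs(perm[:5].mean() - perm[5:].mean()) >= abs(obs)
p = (count + 1) / (reps + 1)      # add-one keeps the estimate positive
print(p)
```

Here evidence against the null lies in the extreme tails of the permutation distribution; with complete separation of the two groups the p-value is small.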
Last modified: Tue Mar 3 08:44:19 1998 by Brian Yandell
(yandell@stat.wisc.edu)