Practical Data Analysis
for Designed Experiments

Brian S. Yandell (1997) Chapman & Hall, London

E. Questioning Assumptions

13. Residual Plots
13.1 Departures from Assumptions
13.2 Incorrect Model
13.3 Correlated Responses
13.4 Unequal Variance
13.5 Non-normal Data
14. Comparisons with Unequal Variance
14.1 Comparing Means when Variances are Unequal
14.2 Weighted Analysis of Variance
14.3 Satterthwaite Approximation
14.4 Generalized Inference
14.5 Testing for Unequal Variances
15. Getting Free from Assumptions
15.1 Transforming Data
15.2 Comparisons using Ranks
15.3 Randomization
15.4 Monte Carlo Methods

13. Residual Plots

usual assumptions
- model is correct
- responses are uncorrelated (usually independent)
- variances are equal
- responses (errors) have normal distribution
cautions & suggestions
- use plot symbols
- don't focus on range as a measure of spread
- jitter marginal means at factor levels to see all points
references
- Belsley, Kuh and Welsch (1980) Regression Diagnostics
- Carroll and Ruppert (1988) Transformation and Weighting in Regression
- Miller (1986) Beyond ANOVA: Basics of Applied Statistics
- Scheffe (1959, ch. 10) Analysis of Variance
- Seber (1977, sec. 6.3) Linear Regression Analysis

13.1 Departures from Assumptions

plot residuals, using symbols for group
- against predicted values
- against other responses
- against marginals of other factors
outlier detection
many zeroes: keep, drop, or dichotomize
banded pattern if response is discrete: jitter by 4% (Cleveland 1975)
multiple copies of residual plots
- plot symbols by various factors
- boxplots of residuals by groups
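
As an illustration of the plots listed above, here is a minimal Python sketch; the data are simulated (no dataset accompanies these notes), and the 4% jitter follows the suggestion above. It draws jittered residuals against fitted group means, coded by group, plus boxplots of residuals by group.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)

    # simulated one-way layout: 4 groups, 10 replicates each
    groups = np.repeat(np.arange(4), 10)
    y = np.array([0.0, 2.0, 2.0, 5.0])[groups] + rng.normal(scale=1.0, size=40)

    # fitted values = group means; residuals = deviations from group means
    fit = np.array([y[groups == g].mean() for g in range(4)])[groups]
    resid = y - fit

    # jitter fitted values by about 4% of their range so coincident points show
    jitter = 0.04 * (fit.max() - fit.min()) * rng.uniform(-1, 1, size=fit.size)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
    ax1.scatter(fit + jitter, resid, c=groups, marker="o")
    ax1.axhline(0, linestyle="--")
    ax1.set_xlabel("fitted (jittered)")
    ax1.set_ylabel("residual")

    # boxplots of residuals by group
    ax2.boxplot([resid[groups == g] for g in range(4)])
    ax2.set_xlabel("group")
    ax2.set_ylabel("residual")
    plt.tight_layout()
    plt.show()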

13.2 Incorrect Model

underfit model introduces bias
- bias = m_i - m (group mean minus grand mean)
- E(MSE) = (m_i - m)^2 + sigma^2 = bias^2 + variance
residual plot
- look for pattern in groups
- plot against covariates
overfit has no bias, but is inefficient: high variance of estimates of effects
interaction & margin plots
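
A small simulation sketch (the group means, sample sizes, and sigma are made-up illustration values) of the bias-variance point above: omitting a real group effect inflates the residual mean square by roughly the average squared bias, while the correct one-way fit recovers sigma^2.

    import numpy as np

    rng = np.random.default_rng(2)
    sigma = 1.0
    means = np.array([0.0, 1.0, 2.0, 3.0])      # true group means
    n, reps = 25, 2000                          # 25 obs/group, 2000 simulations

    mse_wrong, mse_right = [], []
    for _ in range(reps):
        y = np.repeat(means, n) + rng.normal(scale=sigma, size=n * means.size)
        g = np.repeat(np.arange(means.size), n)
        # "wrong" model: intercept only, residuals about the grand mean
        mse_wrong.append(((y - y.mean()) ** 2).sum() / (y.size - 1))
        # "right" model: one-way means, residuals about group means
        fit = np.array([y[g == j].mean() for j in range(means.size)])[g]
        mse_right.append(((y - fit) ** 2).sum() / (y.size - means.size))

    bias2 = ((means - means.mean()) ** 2).mean()
    print("average squared bias        :", round(bias2, 3))
    print("E(MSE), model without groups:", round(np.mean(mse_wrong), 3))  # roughly sigma^2 + bias^2
    print("E(MSE), one-way model       :", round(np.mean(mse_right), 3))  # roughly sigma^2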

13.3 Correlated Responses

residual plot
- against time or run order (serial correlation)
- against space -- contour plot
- scatter plot of residuals from one time against another
model the covariance structure
- random and mixed models
- repeated measures
- multivariate models
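
A short sketch of the "residuals from one time against another" plot, using simulated residuals with serial correlation (the AR(1) model and rho value are my own illustration, not from the notes).

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(3)

    # simulated residuals in run order with serial correlation (AR(1), rho = 0.6)
    n, rho = 60, 0.6
    resid = np.empty(n)
    resid[0] = rng.normal()
    for t in range(1, n):
        resid[t] = rho * resid[t - 1] + rng.normal(scale=np.sqrt(1 - rho ** 2))

    # lag-1 scatter plot: residual at time t against residual at time t-1
    plt.scatter(resid[:-1], resid[1:])
    plt.xlabel("residual at time t-1")
    plt.ylabel("residual at time t")

    # sample lag-1 autocorrelation as a quick numerical check
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    plt.title(f"lag-1 autocorrelation = {r1:.2f}")
    plt.show()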

13.4 Unequal Variance

residual plot: check for changing spread
tests in next chapter
T pivot statistic for 2 groups using pooled variance
- V(T) = 1 if n1 = n2 or var_1 = var_2
- otherwise V(T) can be larger or smaller, and the p-value changes accordingly
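
A quick simulation sketch (sample sizes and variance ratios are hypothetical; the helper name is my own) of the pooled-variance T statistic under a true null: with n1 = n2 or equal variances its variance stays near 1, but with unequal n and unequal variances it drifts above or below 1.

    import numpy as np

    rng = np.random.default_rng(4)

    def pooled_t_variance(n1, n2, s1, s2, reps=20000):
        """Empirical variance of the pooled two-sample t statistic when both means are 0."""
        x = rng.normal(scale=s1, size=(reps, n1))
        y = rng.normal(scale=s2, size=(reps, n2))
        sp2 = ((n1 - 1) * x.var(axis=1, ddof=1) + (n2 - 1) * y.var(axis=1, ddof=1)) / (n1 + n2 - 2)
        t = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
        return t.var()

    print("equal n, unequal var       :", round(pooled_t_variance(40, 40, 1.0, 3.0), 2))  # near 1
    print("unequal n, equal var       :", round(pooled_t_variance(20, 80, 2.0, 2.0), 2))  # near 1
    print("small group has big var    :", round(pooled_t_variance(20, 80, 3.0, 1.0), 2))  # > 1
    print("large group has big var    :", round(pooled_t_variance(80, 20, 3.0, 1.0), 2))  # < 1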

13.5 Non-normal Data

references
- Scheffe (1959, ch. 10) Analysis of Variance (see Box (1953) Biometrika)
- Seber (1977, sec. 6.3) Linear Regression Analysis (see Box and Watson (1962) Biometrika)
recall the Central Limit Theorem
is the mean a good summary of the center of the distribution?
skewness = E[(resp - mean)^3 / sigma^3] (= 0 for normal)
- mean may be far from most of the data
- CIs have correct coverage, but should be skewed as well
kurtosis = E[(resp - mean)^4 / sigma^4] - 3 (= 0 for normal)
- V(variance) = sigma^4 [2/(n-1) + kurtosis/n]
- variance of the variance estimate affected: inference on variance upset, inference on mean only slightly off
QQ plot against normal scores (normal plot)
histogram of residuals (by group)
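
A short Python sketch of the checks listed above, using simulated right-skewed "residuals" (the lognormal choice is purely illustrative): sample skewness, excess kurtosis, normal QQ plot, and a histogram.

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(5)

    # right-skewed residuals: centered lognormal draws
    resid = rng.lognormal(mean=0.0, sigma=0.8, size=100)
    resid -= resid.mean()

    print("skewness        :", round(stats.skew(resid), 2))       # > 0 for right skew
    print("excess kurtosis :", round(stats.kurtosis(resid), 2))   # 0 for normal data

    # QQ plot against normal scores, plus histogram of residuals
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
    stats.probplot(resid, dist="norm", plot=ax1)
    ax2.hist(resid, bins=15)
    ax2.set_xlabel("residual")
    plt.tight_layout()
    plt.show()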

14. Comparisons with Unequal Variance

allowing for unequal variance: pool, weight, transform
Behrens-Fisher problem -- inference
- approximate -- Satterthwaite
- exact -- Weerahandi
references
- Scheffe (1959, ch. 10) Analysis of Variance (see Box (1953) Biometrika)
- Weerahandi (1995) Exact Statistical Methods for Data Analysis (see Tsui and Weerahandi (1989) JASA)

14.1 Comparing Means when Variances are Unequal

pooled variance
size of test as R = n1/n2 varies (see the simulation sketch below)
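
A simulation sketch of the size of the nominal 5% pooled t-test under a true null as R = n1/n2 varies with the variance ratio held fixed; the helper name and the particular sample sizes and sigmas are my own illustration.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)

    def size_of_pooled_test(n1, n2, s1, s2, reps=20000, alpha=0.05):
        """Empirical rejection rate of the nominal 5% pooled t-test when H0 is true."""
        x = rng.normal(scale=s1, size=(reps, n1))
        y = rng.normal(scale=s2, size=(reps, n2))
        t, p = stats.ttest_ind(x, y, axis=1, equal_var=True)
        return (p < alpha).mean()

    # sigma1/sigma2 fixed at 2; vary R = n1/n2
    # size exceeds 0.05 when the small group has the big variance, falls below when the large group does
    for n1, n2 in [(10, 10), (5, 20), (20, 5)]:
        print(f"R = n1/n2 = {n1}/{n2}: size ~ {size_of_pooled_test(n1, n2, 2.0, 1.0):.3f}")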

14.2 Weighted Analysis of Variance

weighted anova
- known weights
- weights a function of a covariate
- problems if weights depend on the response
- F-test arising when weights = 1 / variances
Welch's t-test
- T pivot statistic for two groups with unequal variance
- V(T) = 1 if n1 = n2 or var_1 = var_2
- otherwise it can be larger or smaller; p-value changes accordingly (differs from pooled variance)
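
A brief sketch (made-up data) contrasting the pooled and Welch (unequal-variance) two-sample t-tests; scipy's equal_var flag switches between them.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)

    # small group with large variance vs large group with small variance
    x = rng.normal(loc=0.0, scale=3.0, size=6)
    y = rng.normal(loc=1.0, scale=1.0, size=30)

    t_pool, p_pool = stats.ttest_ind(x, y, equal_var=True)     # pooled variance
    t_welch, p_welch = stats.ttest_ind(x, y, equal_var=False)  # Welch / Satterthwaite

    print(f"pooled : t = {t_pool:.2f}, p = {p_pool:.3f}")
    print(f"Welch  : t = {t_welch:.2f}, p = {p_welch:.3f}")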

14.3 Satterthwaite Approximation

weighted sum of chi-square variates
- S = var_1/n1 + var_2/n2
- find X such that S is approximately distributed as X, with X ~ a chi^2_r
- E(S) = a r, V(S) = 2 a^2 r
- r = 2 [E(S)]^2 / V(S), a = V(S) / [2 E(S)] = E(S) / r
- r depends in a complicated way on the sample sizes and variances
- estimate using sample variances; round down to nearest integer
- T approximately follows a t_r distribution
- can do the same with linear contrasts, etc.
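
A sketch of the Satterthwaite degrees-of-freedom computation for two groups, plugging sample variances into the moment-matching formulas above (the helper name and data are my own); the result can be compared with scipy's Welch test, which uses the unrounded r.

    import numpy as np
    from scipy import stats

    def satterthwaite_df(s2_1, n1, s2_2, n2):
        """Approximate df r for S = s2_1/n1 + s2_2/n2 by matching E(S) and V(S) to a chi^2_r."""
        v1, v2 = s2_1 / n1, s2_2 / n2
        return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    rng = np.random.default_rng(8)
    x = rng.normal(scale=3.0, size=8)
    y = rng.normal(scale=1.0, size=25)

    r = satterthwaite_df(x.var(ddof=1), x.size, y.var(ddof=1), y.size)
    t = (x.mean() - y.mean()) / np.sqrt(x.var(ddof=1) / x.size + y.var(ddof=1) / y.size)
    p = 2 * stats.t.sf(abs(t), df=np.floor(r))   # notes suggest rounding r down

    print(f"Satterthwaite df r = {r:.2f}, t = {t:.2f}, p = {p:.3f}")
    print(stats.ttest_ind(x, y, equal_var=False))  # scipy's Welch test for comparison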

14.4 Generalized Inference

Weerahandi (1995)
generalize the pivot trick
- start with a pivot where the variances are known
- introduce new random variates, independent, with the same distribution as the variances
- combine using a beta distribution (skip details)
- combine to get a pivot that does not depend on the variances
problem: method depends heavily on normality through the distribution of the variances

14.5 Testing for Unequal Variances

references
- Milliken and Johnson (1984, ch. 2) Analysis of Messy Data
- Miller (1986) Beyond ANOVA
recall how kurtosis affects inference about variances
Hartley's F-max
- ratio of largest to smallest group variances
- special tables; requires equal sample sizes and normality
- liberal or conservative approach to unequal sizes
Bartlett's test
- requires large samples and normality
- compares log(pooled variance) to mean[log(group variances)]
- uses chi-square approximation
- Box considers anova on log variances
Levene's test
- analysis of variance on abs(response - predicted)
- robust to violations of normality
- nearly as good as Hartley's and Bartlett's in simulations
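
A sketch (simulated groups with genuinely unequal spread) of the three checks above: Hartley's F-max computed directly, and Bartlett's and Levene's tests via scipy.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(9)

    # three groups of 10 with increasing spread
    groups = [rng.normal(scale=s, size=10) for s in (1.0, 1.5, 3.0)]

    # Hartley's F-max: largest group variance over smallest (special tables needed for a formal test)
    variances = [g.var(ddof=1) for g in groups]
    print("Hartley F-max :", round(max(variances) / min(variances), 2))

    # Bartlett's test: chi-square approximation, sensitive to non-normality
    stat, p = stats.bartlett(*groups)
    print(f"Bartlett      : stat = {stat:.2f}, p = {p:.3f}")

    # Levene's test: anova on absolute deviations, robust to non-normality
    stat, p = stats.levene(*groups, center="mean")
    print(f"Levene        : stat = {stat:.2f}, p = {p:.3f}")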

15. Getting Free from Assumptions

nonparametric or distribution-free -- almost
- independence
- group differences comprise a shift of center
- similar spread (distribution shape) across groups

15.1 Transforming Data

references
- Snedecor and Cochran (198x, ch. 14) Statistical Methods
- Scheffe (1959, ch. 10) Analysis of Variance
- Carroll and Ruppert (1988) Transformation and Weighting in Regression
plot group mean against group SD: there should be no relationship
purpose of transformation
- stabilize variance
- achieve normality (symmetry, kurtosis)
- simplify model (remove interaction)
- prior theoretical considerations (mechanistic model)
reasons not to transform
- anova methods are fairly robust
- need to justify transformation to peers
- effect of empirical choice of transform on p-value
back-transform
- return to original units for tables & plots
- skewed confidence intervals reflect variation
- or transform the scale on the plot: uneven spacing (1, 2, 5, 10, 20, 50, 100); LSD bars may still apply
Box-Cox transformation
- (y^L - 1)/L, or log(y) if L = 0
- maximize augmented likelihood (minimize MSE(L))
- automatic pick of transform, guided by common sense
improve normality
- logit for proportions -- spreads out tails (kurtosis)
- log for right-skewed data such as positive measurements
- Carroll-Ruppert modification of Box-Cox for skewness: use absolute values, possibly centered
variance stabilizing transformations
- E(y) = m, SD(y) = g(m)
- Taylor series approximation: f(y) = f(m) + (y - m) f'(m)
- so V(f(y)) = [g(m) f'(m)]^2
- choose f'(y) = c/g(y), so f(y) = integral of c/g(y)
most common transforms
- SD proportional to mean (constant CV): f(y) = log(y)
- variance proportional to mean (Poisson counts): f(y) = sqrt(y)
- proportions (Binomial), variance largest around 0.5: f(y) = arcsin(sqrt(y))
transform to simplify model
- Hoaglin-Mosteller-Tukey plot
- recall MSE-adjusted effects
- plot pure interaction against main effect
- slope of regression line suggests transform L = 1 - slope
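
A sketch of two ideas above, using simulated positive responses whose SD grows with the mean (my own setup, not from the notes): the group mean vs group SD plot, and a Box-Cox power picked by minimizing MSE(L) on the rescaled transform; the minimum lands near L = 0 here, pointing to a log transform.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(10)

    # lognormal responses: SD roughly proportional to mean, so log should stabilize variance
    group_means = [1.0, 2.0, 4.0, 8.0]
    data = [m * rng.lognormal(sigma=0.4, size=12) for m in group_means]

    # group mean vs group SD -- a clear relationship suggests transforming
    plt.scatter([d.mean() for d in data], [d.std(ddof=1) for d in data])
    plt.xlabel("group mean")
    plt.ylabel("group SD")
    plt.show()

    # Box-Cox for the one-way model: rescale by the geometric mean (Jacobian) so that
    # residual mean squares are comparable across powers, then pick L minimizing MSE(L)
    y = np.concatenate(data)
    g = np.repeat(np.arange(len(data)), [len(d) for d in data])
    gm = np.exp(np.mean(np.log(y)))                     # geometric mean of y

    def mse_of_power(L):
        z = gm * np.log(y) if abs(L) < 1e-8 else (y ** L - 1) / (L * gm ** (L - 1))
        fit = np.array([z[g == j].mean() for j in range(len(data))])[g]
        return ((z - fit) ** 2).sum() / (y.size - len(data))

    grid = np.arange(-1.0, 1.51, 0.25)
    best = grid[np.argmin([mse_of_power(L) for L in grid])]
    print("Box-Cox power minimizing MSE(L):", round(best, 2))   # near 0 here, i.e. log

    # after the log transform, group SDs should be roughly equal
    print("group SDs after log:", [round(np.log(d).std(ddof=1), 2) for d in data])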

15.2 Comparisons using Ranks

replace observations by their ranks
- Wilcoxon-type tests
- Friedman test: blocking on one factor, rank within each block
- other scores rarely used, e.g. normal scores (QQ plot)
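
A short sketch (simulated data) of rank-based comparisons via scipy: a Wilcoxon rank-sum (Mann-Whitney) test for two groups, Kruskal-Wallis for several groups, and Friedman for a randomized block layout.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(11)

    # two groups: Wilcoxon rank-sum (Mann-Whitney) test
    a = rng.normal(0.0, 1.0, size=12)
    b = rng.normal(1.0, 1.0, size=12)
    print("rank-sum      :", stats.mannwhitneyu(a, b, alternative="two-sided"))

    # several groups: Kruskal-Wallis test on the ranks
    c = rng.normal(0.5, 1.0, size=12)
    print("Kruskal-Wallis:", stats.kruskal(a, b, c))

    # randomized blocks: Friedman test ranks treatments within each block
    blocks = rng.normal(size=(10, 1))                   # 10 block effects
    treat = np.array([0.0, 0.5, 1.0])                   # 3 treatment effects
    table = blocks + treat + rng.normal(scale=0.5, size=(10, 3))
    print("Friedman      :", stats.friedmanchisquare(table[:, 0], table[:, 1], table[:, 2]))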

15.3 Randomization

Fisher's empirical justification of the t-test and anova
- empirical distribution approximated by the theoretical one
- theoretical easier to use in practice
permutations must preserve any restrictions on randomization
evidence against the null hypothesis lies in the extreme tails
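
A sketch of a two-group randomization (permutation) test with complete enumeration, feasible here only because the groups are tiny (the numbers are hypothetical); the p-value is the fraction of relabelings whose mean difference is at least as extreme as the observed one.

    import numpy as np
    from itertools import combinations

    # tiny two-group example so all relabelings can be enumerated
    x = np.array([12.1, 14.3, 13.0, 15.2])
    y = np.array([10.4, 11.8, 12.5, 10.9])
    pooled = np.concatenate([x, y])
    observed = x.mean() - y.mean()

    # every way of choosing which 4 of the 8 responses get the "x" label
    diffs = []
    for idx in combinations(range(pooled.size), x.size):
        mask = np.zeros(pooled.size, dtype=bool)
        mask[list(idx)] = True
        diffs.append(pooled[mask].mean() - pooled[~mask].mean())
    diffs = np.array(diffs)

    # two-sided p-value: proportion of relabelings at least as extreme as observed
    p = np.mean(np.abs(diffs) >= abs(observed))
    print(f"observed difference = {observed:.2f}, permutation p = {p:.3f} ({diffs.size} relabelings)")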

15.4 Monte Carlo Methods

what if there are too many permutations?
- take a simple random sample of permutations
- pay attention to restrictions on randomization
- choose the sample size to achieve desired precision in the p-value
Markov chain methods for very complicated models
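
A sketch of the Monte Carlo version (simulated data): sample random relabelings instead of enumerating them, and report a binomial standard error for the estimated p-value so the number of samples can be matched to the precision wanted.

    import numpy as np

    rng = np.random.default_rng(13)

    # a two-group comparison with groups too large to enumerate all relabelings
    x = rng.normal(0.5, 1.0, size=30)
    y = rng.normal(0.0, 1.0, size=30)
    pooled = np.concatenate([x, y])
    observed = x.mean() - y.mean()

    reps = 10000
    diffs = np.empty(reps)
    for i in range(reps):
        perm = rng.permutation(pooled)                 # random relabeling of all responses
        diffs[i] = perm[:x.size].mean() - perm[x.size:].mean()

    p_hat = np.mean(np.abs(diffs) >= abs(observed))
    se = np.sqrt(p_hat * (1 - p_hat) / reps)           # binomial standard error of the p-value
    print(f"Monte Carlo p-value = {p_hat:.4f} +/- {se:.4f}")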

Last modified: Tue Mar 3 08:44:19 1998 by Brian Yandell (yandell@stat.wisc.edu)