Practical Data Analysis for Designed Experiments

Brian S. Yandell (1997) Chapman & Hall, London

C. Sorting out Effects with Data

7. Factorial Designs
7.1 Cell Means Models
7.2 Effects Models
7.3 Estimable Functions
7.4 Linear Constraints
7.5 General Form of Estimable Functions
8. Balanced Experiments
8.1 Additive Models
8.2 Full Models with Two Factors
8.3 Interaction Plots
8.4 Higher Order Models
9. Model Selection
9.1 Pooling Interactions
9.2 Selecting the "Best" Model
9.3 Model Selection Criteria
9.4 One Observation per Cell
9.5 Tukey's Test for Interaction

7. Factorial Designs

7.1 Cell Means Models

response = group mean + random error
one-factor and two-factor means models
estimable means
  at least one observation per group
  unique unbiased estimator
  linear combination of responses
linear combination of estimables is estimable
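A minimal Python sketch of the cell means idea for two factors. The data and level names are made up for illustration. Each non-empty cell mean is estimated by the average of its responses, and a linear combination of cell means (here an interaction contrast) is estimated by the same combination of cell averages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: (level of factor A, level of factor B, response).
data = [
    ("a1", "b1", 10.0), ("a1", "b1", 12.0),
    ("a1", "b2", 15.0), ("a1", "b2", 17.0),
    ("a2", "b1", 20.0), ("a2", "b1", 22.0),
    ("a2", "b2", 21.0), ("a2", "b2", 23.0),
]


def cell_means(rows):
    """Return {(a, b): average response}; only non-empty cells appear."""
    cells = defaultdict(list)
    for a, b, y in rows:
        cells[(a, b)].append(y)
    return {cell: mean(ys) for cell, ys in cells.items()}


means = cell_means(data)

# A linear combination of estimable cell means is itself estimable,
# for example this interaction contrast.
interaction = (means[("a1", "b1")] - means[("a1", "b2")]
               - means[("a2", "b1")] + means[("a2", "b2")])
```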

7.2 Effects Models

one-factor effects model
  response = reference + group effect + random error
  group effect = group mean - reference
  reference is arbitrary
    overall (population grand) mean
    intercept (SAS)
    not estimable
two-factor effects model
  population cell & marginal means -- no data yet
  cell means are estimable provided cell is not empty
  may combine multiple factors into one
additive effects model

7.3 Estimable Functions

functions of parameters which do not depend on particular solution to normal equations
normal equations for effects model
  one factor & two factors
  matrix form -- overspecified model
linear contrasts
  main effects contrasts
  pure interaction contrasts
  simplification in additive model

7.4 Linear Constraints

sum-to-zero linear constraints
  reference = population grand mean
  group effect = deviation from grand mean
  matrix form
set-to-zero linear constraints
  reference = last group mean
  group effect = deviation from last group mean
  matrix form
(particular) solutions of normal equations
estimable functions in terms of constraints
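A small Python sketch, on made-up balanced one-factor data, of how the two constraint systems give different particular solutions but the same values for estimable functions. The function names are illustrative only. The sketch assumes equal group sizes, so the average of the group means equals the grand mean.

```python
from statistics import mean

# Hypothetical balanced one-factor data (two observations per group).
groups = {"g1": [4.0, 6.0], "g2": [7.0, 9.0], "g3": [10.0, 12.0]}
group_means = {g: mean(ys) for g, ys in groups.items()}


def sum_to_zero(gm):
    """Reference = grand mean; effects are deviations and sum to zero."""
    ref = mean(gm.values())
    return ref, {g: m - ref for g, m in gm.items()}


def set_to_zero(gm):
    """Reference = last group mean; the last effect is set to zero."""
    last = list(gm)[-1]
    ref = gm[last]
    return ref, {g: m - ref for g, m in gm.items()}


ref_s, eff_s = sum_to_zero(group_means)
ref_z, eff_z = set_to_zero(group_means)

# Estimable functions agree under both constraint systems:
# reference + effect recovers the group mean, and effect differences match.
diff_s = eff_s["g1"] - eff_s["g2"]
diff_z = eff_z["g1"] - eff_z["g2"]
```

The references (8.0 and 11.0) and the individual effects differ between the two solutions, which is why neither is estimable on its own.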

7.5 General Form of Estimable Functions

L-notation as in SAS (Littell et al. 1991)
overspecified model
  relations among columns <-> among L's
  substituting for redundant L's
set-to-zero constraints
sum-to-zero constraints
one- & two-factor effects models
show GFEF has unique solution of normal equations (one factor)

8. Balanced Experiments

8.1 Additive Models

response = reference + factor A + factor B + error
without replication and with balanced replication
model equation & null hypotheses
partition of sum of squares
expected mean squares & F-statistics
relation of marginal means to model & estimators
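A sketch in Python of the sum of squares partition for the additive model without replication. The table of responses is invented, and the function name is only illustrative. Rows are levels of factor A and columns are levels of factor B. With one observation per cell, the error sum of squares is what remains after removing both main effects.

```python
def additive_anova(y):
    """Partition SS for an a x b table with one observation per cell."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_means = [sum(row) / b for row in y]
    col_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

    ss_a = b * sum((m - grand) ** 2 for m in row_means)
    ss_b = a * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_error = ss_total - ss_a - ss_b

    df_a, df_b = a - 1, b - 1
    df_error = df_a * df_b
    ms_error = ss_error / df_error
    return {
        "A": (ss_a, df_a, (ss_a / df_a) / ms_error),
        "B": (ss_b, df_b, (ss_b / df_b) / ms_error),
        "error": (ss_error, df_error, None),
        "total": (ss_total, a * b - 1, None),
    }


# Hypothetical 2 x 3 table of responses.
y = [
    [10.0, 12.0, 14.0],
    [13.0, 15.0, 20.0],
]
table = additive_anova(y)  # each entry is (SS, df, F or None)
```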

8.2 Full Models with Two Factors

cell means model & effects model
estimates of cell means & marginal means
standard errors
main effects & interaction hypotheses
partition of total sum of squares
expected sum of squares
F-statistics & non-centrality parameters
two-factor anova table
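A hedged Python sketch of a two-factor anova table for a balanced layout with replication. The data are invented and the function name is hypothetical. It assumes every cell has the same number of observations, so marginal means are plain averages of cell means.

```python
from statistics import mean


def two_factor_anova(cells):
    """cells maps (a, b) -> list of responses, equal length in every cell."""
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell (balanced)

    cm = {k: mean(v) for k, v in cells.items()}
    a_mean = {a: mean(cm[(a, b)] for b in levels_b) for a in levels_a}
    b_mean = {b: mean(cm[(a, b)] for a in levels_a) for b in levels_b}
    grand = mean(a_mean.values())

    ss = {
        "A": n * len(levels_b) * sum((m - grand) ** 2 for m in a_mean.values()),
        "B": n * len(levels_a) * sum((m - grand) ** 2 for m in b_mean.values()),
        "A:B": n * sum((cm[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                       for a in levels_a for b in levels_b),
        "error": sum((y - cm[k]) ** 2 for k, ys in cells.items() for y in ys),
    }
    df = {
        "A": len(levels_a) - 1,
        "B": len(levels_b) - 1,
        "A:B": (len(levels_a) - 1) * (len(levels_b) - 1),
        "error": len(levels_a) * len(levels_b) * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    f_stat = {t: (ss[t] / df[t]) / ms_error for t in ("A", "B", "A:B")}
    return ss, df, f_stat


# Hypothetical balanced 2 x 2 data with two replicates per cell.
cells = {
    ("a1", "b1"): [10.0, 12.0], ("a1", "b2"): [15.0, 17.0],
    ("a2", "b1"): [20.0, 22.0], ("a2", "b2"): [21.0, 23.0],
}
ss, df, f_stat = two_factor_anova(cells)
```

The four sums of squares add up to the total sum of squares about the grand mean, which is the partition the anova table reports.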

8.3 Interaction Plots

interaction plot
  plot levels of factor A against cell means
  connect levels of factor B by lines
  label levels of both factors accordingly
  try switching A & B for better clarity
  order levels by marginal mean?
  add SE or LSD bar to help interpretation
parallel lines or curves
  constant separation across levels of factor A
  parallel if no interaction
  unequal separation vs. crossing lines
margin plots
  use marginal means along horizontal axis
  include identity line for reference
  straight lines = Tukey interaction (see 9.4)
  parallel straight lines = no interaction
three-factor interaction
  separate plots by levels of factor C
  switch roles of A, B, C for clarity
  or combine two factors on one plot
    more lines or more horizontal levels
plots to examine size of effects
  half-normal plot
    factors all at two levels
    significant effects deviate from identity line
  effect plot
    effect = deviation used in MS calculation
    effects rescaled for mean square by df
    plot one point for each level
      main effect -- label by level
      interactions
      residuals
    spread (SD) relative to residual indicates size of effect
    can identify cells that contribute
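A rough Python sketch of the numbers behind an interaction plot, using invented cell means. It only builds the points for each line (one line per level of factor B, levels of factor A on the horizontal axis) and checks the separation between lines. Drawing is left to whatever plotting tool is at hand. The helper names are hypothetical.

```python
# Hypothetical cell means for a 2 x 2 layout.
cell_means = {
    ("a1", "b1"): 11.0, ("a1", "b2"): 16.0,
    ("a2", "b1"): 21.0, ("a2", "b2"): 22.0,
}


def plot_lines(means):
    """One line per level of B: a list of (level of A, cell mean) points."""
    lines = {}
    for (a, b), m in sorted(means.items()):
        lines.setdefault(b, []).append((a, m))
    return lines


def separation(means, b_low, b_high):
    """Vertical gap between two B lines at each level of A."""
    levels_a = sorted({a for a, _ in means})
    return {a: means[(a, b_high)] - means[(a, b_low)] for a in levels_a}


lines = plot_lines(cell_means)
gaps = separation(cell_means, "b1", "b2")

# Constant separation means parallel lines (no interaction in the means).
parallel = len(set(gaps.values())) == 1
# Gaps of opposite sign mean the lines cross.
crossing = min(gaps.values()) < 0 < max(gaps.values())
```

Here the gap shrinks from 5 to 1 without changing sign, so the lines are not parallel but do not cross. Whether such a difference matters still needs the SE or LSD bar mentioned above.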

8.4 Higher Order Models

cell means & effects models
estimates & partition of sums of squares
three-factor anova table
3-factor interaction / interpretation
two or more 2-factor interactions
interaction plots
  separate plots by level of third factor
  possibly averaged over third factor
  again, switch roles to find best view

9. Model Selection

parsimonious model
  balance bias & over-fit
  bias -- miss key features
  over-fit -- high variability in parameter estimates
hierarchy of factorial models
  usually keep main effects if interaction significant
testing nested models
  formal F tests & other statistics
comparing non-nested models

9.1 Pooling Interactions

decision paths for two-factor models
pragmatic consideration of full & additive model
report results honestly

9.2 Selecting the "Best" Model

decision paths for three-factor additive model
  18 hierarchical models from which to choose
suggested method of analysis for full model
  if 3-factor interaction is significant
    separately analyze 2-factor models by level of third factor
  if no 3-factor interaction
    easy if only one 2-factor interaction
    analyze several ways if more than one
    separate analyses as above
how to move among models?
  forward selection
    add terms one at a time
    begin with nothing or a few terms
    danger of biased model -- too simple
  backward elimination
    drop one at a time from full model
    danger of bloated model
  rule of 2 for pooling interactions
    sweep down from main effects
    only examine lower terms if large
    simplifies hierarchy for interpretation
what if different approaches differ?
  look further
  look ahead more than one step
  be skeptical -- take broad view
automated tools useful but can be limited
  designed for regression, not factors
  consider important contrasts
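The count of 18 hierarchical models can be checked by brute-force enumeration. The Python sketch below assumes the count refers to models with at least one term beyond the intercept, where a model is hierarchical if every interaction it contains is accompanied by all of its lower-order terms. That reading is an assumption, not something the outline states.

```python
from itertools import combinations

FACTORS = ("A", "B", "C")
# All seven possible terms: A, B, C, AB, AC, BC, ABC.
TERMS = [frozenset(c) for k in range(1, 4) for c in combinations(FACTORS, k)]


def is_hierarchical(model):
    """True if every proper sub-term of each term is also in the model."""
    for term in model:
        for k in range(1, len(term)):
            for sub in combinations(sorted(term), k):
                if frozenset(sub) not in model:
                    return False
    return True


def hierarchical_models():
    """All non-empty hierarchical sets of terms for three factors."""
    models = []
    for k in range(1, len(TERMS) + 1):
        for chosen in combinations(TERMS, k):
            model = frozenset(chosen)
            if is_hierarchical(model):
                models.append(model)
    return models


models = hierarchical_models()
n_models = len(models)
```

Under this reading the enumeration gives 18: three models with one main effect, six with two main effects (with or without their interaction), and nine with all three main effects.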

9.3 Model Selection Criteria

plots
  half-normal plots when 2 levels per factor
  effect plots
  selected interaction plots
  based on full model fit?
  test statistic vs. model df (=p)
    especially Mallows' C(p)
F-test
  careful of multiple testing issues
explained variation
  R^2 (adjusted for p)
  heuristic guide
  unadjusted always increases as model grows
  but how fast does it increase?
mean squared error
  does it change dramatically among models?
Mallows' C(p)
  C(p) > p indicates "large" model bias
  C(p) = p if model bias eliminated
  pick smallest such p to avoid overfit
  sensitive to estimate of variance
    tricky if no or few df error
    initial artful choice of reduced model
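A short Python sketch of two of these criteria, using the common definitions C(p) = SSE_p / s^2 + 2p - n and adjusted R^2 = 1 - [SSE_p / (n - p)] / [SST / (n - 1)], where s^2 is the error mean square from the largest model considered. The error sums of squares below are invented, and the book may define the criteria slightly differently.

```python
def mallows_cp(sse_p, p, n, s2):
    """Mallows' C(p) for a model with p parameters and error SS sse_p."""
    return sse_p / s2 + 2 * p - n


def adjusted_r2(sse_p, p, n, sst):
    """R^2 adjusted for the number of parameters p."""
    return 1.0 - (sse_p / (n - p)) / (sst / (n - 1))


# Hypothetical two-factor example: n = 8 responses, total SS = 162.
n, sst = 8, 162.0
candidates = {              # model: (p, error SS)
    "mean only": (1, 162.0),
    "A": (2, 34.0),
    "A + B": (3, 16.0),
    "A + B + A:B": (4, 8.0),
}

# Variance estimate taken from the largest model.
p_full, sse_full = candidates["A + B + A:B"]
s2 = sse_full / (n - p_full)

criteria = {
    name: (p, mallows_cp(sse, p, n, s2), adjusted_r2(sse, p, n, sst))
    for name, (p, sse) in candidates.items()
}
```

The largest model always has C(p) = p by construction, since its own error mean square supplies s^2. This is one reason the criterion is sensitive to the variance estimate.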

9.4 One Observation per Cell

effects model with no replication
how to simplify interaction -- fewer df
Tukey interaction model
interaction plots / margin plots

9.5 Tukey's Test for Interaction

formal test (under null additive model)
Mandel interaction model
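A Python sketch of the usual one degree of freedom computation for Tukey's test, on an invented table with one observation per cell. The residual sum of squares from the additive fit is split into a non-additivity part, based on the product of estimated row and column effects, and a remainder. The function name is illustrative, and the code assumes a non-zero remainder and non-zero row and column effects.

```python
def tukey_one_df(y):
    """Tukey's one-df test for an a x b table, one observation per cell."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_eff = [sum(row) / b - grand for row in y]
    col_eff = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]

    # Residual SS from the additive model (the null model).
    ss_resid = sum((y[i][j] - grand - row_eff[i] - col_eff[j]) ** 2
                   for i in range(a) for j in range(b))

    # One-df sum of squares for non-additivity.
    num = sum(row_eff[i] * col_eff[j] * y[i][j]
              for i in range(a) for j in range(b)) ** 2
    den = sum(r * r for r in row_eff) * sum(c * c for c in col_eff)
    ss_nonadd = num / den

    ss_rem = ss_resid - ss_nonadd
    df_rem = (a - 1) * (b - 1) - 1
    f_stat = ss_nonadd / (ss_rem / df_rem)
    return ss_nonadd, ss_rem, df_rem, f_stat


# Hypothetical 2 x 3 table of responses.
y = [
    [10.0, 12.0, 14.0],
    [13.0, 15.0, 20.0],
]
ss_nonadd, ss_rem, df_rem, f_stat = tukey_one_df(y)
```

The statistic is referred to an F distribution with 1 and df_rem degrees of freedom. The standard library has no F distribution, so the p-value is left to a table or a statistics package.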

Last modified: Tue Feb 17 08:47:45 1998 by Brian Yandell (yandell@stat.wisc.edu)