Practical Data Analysis
for Designed Experiments

Brian S. Yandell

Chapman & Hall, London (Fall 1996)

F. Regressing with Factors

16. Ordered Groups: 16.1 Groups in a Line; 16.2 Testing for Linearity; 16.3 Path Analysis Diagrams; 16.4 Regression Calibration; 16.5 Classical Error in Variables
17. Parallel Lines: 17.1 Parallel Lines Model; 17.2 Adjusted Estimates; 17.3 Plots with Symbols; 17.4 Sequential Tests with Multiple Responses; 17.5 Sequential Tests with Driving Covariate; 17.6 Adjusted (Type III) Tests of Hypotheses; 17.7 Different Slopes for Different Groups
18. Multiple Responses: 18.1 Overall Tests for Group Differences; 18.2 Matrix Analog of F Test; 18.3 How do Groups Differ?; 18.4 Causal Models

16. Ordered Groups
16.1 Groups in a Line
	linear / polynomial / nonlinear

16.2 Testing for Linearity
	general F-test
	full = ANOVA, reduced = regression
	test slope = 0 if group means not on a line
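
The general F-test above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the text: the function name and argument layout are hypothetical, groups are assumed to sit at known numeric levels, and at least three groups are needed so the numerator has a - 2 > 0 degrees of freedom.

```python
def linearity_f_test(x_levels, groups):
    """General F-test: full model = one-way ANOVA, reduced model = simple regression.

    x_levels[i] is the numeric level of group i (hypothetical input layout);
    groups[i] is the list of responses in that group.
    Returns (F, numerator df, denominator df).
    """
    a = len(groups)                      # number of groups (need a >= 3)
    n = sum(len(g) for g in groups)      # total sample size
    xs = [x for x, g in zip(x_levels, groups) for _ in g]
    ys = [y for g in groups for y in g]
    xbar = sum(xs) / n
    ybar = sum(ys) / n

    # Total sums of squares and cross products
    txx = sum((x - xbar) ** 2 for x in xs)
    tyy = sum((y - ybar) ** 2 for y in ys)
    txy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

    # Reduced model (means on a line): residual SS with n - 2 df
    sse_reduced = tyy - txy ** 2 / txx
    # Full model (separate group means): residual SS with n - a df
    sse_full = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

    df_num = a - 2
    df_den = n - a
    f = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f, df_num, df_den
```

A large F suggests the group means do not fall on a line; the statistic would be compared with an F distribution on (a - 2, n - a) degrees of freedom.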

16.3 Path Analysis Diagrams
	Wright (1934), Bollen (1993)
	Bray and Maxwell (1985)
	Bhattacharyya and Johnson (1988)
	arrows point toward affected variate
	double arrows indicate association, single arrows for causation

	values of partial correlations can be assigned to arrows
		correlation after adjusting for other terms in model
		rho_XY.A = W_XY / \sqrt{ W_XX W_YY }
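
As a rough sketch of the formula above, the partial correlation of X and Y after adjusting for a factor A can be computed from within-group sums of squares and cross products. The function names and the list-of-lists data layout are assumptions made for illustration only.

```python
from math import sqrt


def within_sums(x_groups, y_groups):
    """Within-group sums of squares and cross products (W_XX, W_YY, W_XY)."""
    wxx = wyy = wxy = 0.0
    for xg, yg in zip(x_groups, y_groups):
        xm = sum(xg) / len(xg)   # group mean of X
        ym = sum(yg) / len(yg)   # group mean of Y
        for x, y in zip(xg, yg):
            wxx += (x - xm) ** 2
            wyy += (y - ym) ** 2
            wxy += (x - xm) * (y - ym)
    return wxx, wyy, wxy


def partial_correlation(x_groups, y_groups):
    """rho_XY.A = W_XY / sqrt(W_XX * W_YY), adjusting for the grouping factor A."""
    wxx, wyy, wxy = within_sums(x_groups, y_groups)
    return wxy / sqrt(wxx * wyy)
```

Because group means are removed before the cross products are formed, differences between groups do not contribute to this correlation.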

	factors with more than 2 levels
		can be subdivided by 1-df contrasts (see B&M)
		or use square root of partial R^2

	analysis of variance (ANOVA)
		A -> Y <- E

	analysis of covariance (ANCOVA)
		A -> Y <- X
		     ^
		     E

16.4 Regression Calibration
	resp = group mean + slope * error-in-variable + random error
	slopes could differ by group
	inflation of variance / no bias in means

16.5 Classical Error in Variables
	resp = intercept + slope*true + pure error
	observed = int + attenuation*true + measurement error

	resp = int + attenuation*slope*true + error
	inflation of variance / attenuation of slope
17. Parallel Lines
covariate or covariable
analysis of covariance
removing bias in observational studies

17.1 Parallel Lines Model
	additive model
		response = factor + covariate + error
		y_ij = \mu_i + \beta * (x_ij - \bar{x}_..) + e_ij
	notation & graphics (interaction plot)
	expected response, mean response for group, sample grand mean
	simple regression
	single factor analysis of variance
	two factor additive model
		linear relationship for second factor

17.2 Adjusted Estimates
	partition sum of squares
	Searle (1977) notation: T = B + W
		yy, xx, xy
	factor and covariate
		Tx\mu = Bx\mu
	simple regression slope estimator
		\beta = Txy/Txx

	minimizing residual sum of squares (least squares)
	normal equations
	LS estimates
		\mu_i = \bar{y}_i. - \beta * (\bar{x}_i. - \bar{x}_..)
		\beta = Wxy/Wxx
	adjusted slope
		regression of within-group residuals
	adjusted factor (group) means
		corrected for deviation of covariate group mean from overall
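
The least squares estimates above can be sketched as follows. This is an illustrative sketch under the parallel lines model, with a hypothetical function name and a list-of-lists data layout (one list per group); it is not taken from the text.

```python
def ancova_estimates(x_groups, y_groups):
    """Least squares estimates for y_ij = mu_i + beta * (x_ij - xbar..) + e_ij.

    Returns (beta, adjusted_means), where
      beta            = Wxy / Wxx (common within-group slope)
      adjusted_means  = ybar_i. - beta * (xbar_i. - xbar..)
    """
    n = sum(len(g) for g in x_groups)
    x_grand = sum(sum(g) for g in x_groups) / n   # overall covariate mean

    wxx = wxy = 0.0
    x_means, y_means = [], []
    for xg, yg in zip(x_groups, y_groups):
        xm = sum(xg) / len(xg)
        ym = sum(yg) / len(yg)
        x_means.append(xm)
        y_means.append(ym)
        for x, y in zip(xg, yg):
            wxx += (x - xm) ** 2
            wxy += (x - xm) * (y - ym)

    beta = wxy / wxx
    adjusted = [ym - beta * (xm - x_grand) for xm, ym in zip(x_means, y_means)]
    return beta, adjusted
```

With x = [[1, 2, 3], [4, 5, 6]] and y = [[2, 4, 6], [1, 3, 5]], the raw group means of y are 4 and 3, but the adjusted means are 7 and 0 because the two groups sit at different covariate values.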

	balance
		\bar{x}_i. = \bar{x}_..

	adjusted variance
		V(\beta) = \sigma^2/Wxx
		V(\mu_i) = \sigma^2[1/n_i + (\bar{x}_i.-\bar{x}_..)^2/Wxx]
	adjusted factor means are correlated

	pivot statistics for \beta, \mu_i
	unbiased estimator of variance
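
For reference, the pivot statistics and the unbiased variance estimator can be written out as below. This is a sketch under the usual assumption of independent normal errors with common variance; the symbols a (number of groups) and n_. (total sample size) are notation assumed here, and hats mark estimates.

```latex
\hat{\sigma}^2 = \frac{W_{yy} - W_{xy}^2 / W_{xx}}{n_{.} - a - 1}
\qquad
\frac{\hat{\beta} - \beta}{\sqrt{\hat{\sigma}^2 / W_{xx}}} \sim t_{n_{.} - a - 1}
\qquad
\frac{\hat{\mu}_i - \mu_i}
     {\sqrt{\hat{\sigma}^2 \left[ 1/n_i + (\bar{x}_{i.} - \bar{x}_{..})^2 / W_{xx} \right]}}
\sim t_{n_{.} - a - 1}
```

The error degrees of freedom lose one for each of the a group means and one more for the common slope.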

17.3 Plots with Symbols
	interaction plots
		response against covariate by group symbol

	diagnostic plots
		residuals against predicted by group symbol
17.4 Sequential Tests with Multiple Responses

17.5 Sequential Tests with Driving Covariate

17.6 Adjusted (Type III) Tests of Hypotheses

17.7 Different Slopes for Different Groups
18. Multiple Responses
p multiple responses in one-factor CRD experiment
responses may be correlated (collinearity)
MANOVA = multivariate ANOVA
	more powerful if responses correlated, less powerful if not
	may or may not be able to interpret in meaningful way

references
	Morrison (1990, ch 5-6) Multivariate Statistical Methods
	Bray and Maxwell (1985) Sage Series: Multivariate Analysis of Variance
	Krzanowski (1988, ch 11-13)
	Bhattacharyya and Johnson (1988) Applied Multivariate Analysis

18.1 Overall Tests for Group Differences
	review univariate testing, partition of SS, general F-test
	composite null hypothesis
	p separate univariate analyses

	linear combination of responses: v = \sum ( b_k y_k )
		direction in p-dimensional space of multiple responses
		first canonical variate
			find weights b_1k to maximize
			\lambda_1 = SS_H ( v_1 ) / SS_E ( v_1 )
			null hypothesis: \sum b_1k \mu_ik are all equal
		second and subsequent canonical variates
		at most s = \min ( p, a-1 ) with non-increasing \lambda_l's
		v_1, ..., v_s are independent
		can test hypotheses for each separately

18.2 Matrix Analog of F Test
	matrix partition of sums of squares and cross products
		characteristic equation: ( H - \lambda E ) b = 0
		eigenvalues \lambda_l, eigenvectors b_l = { b_lk }
		how to summarize information in eigenvalues?
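
For two responses the characteristic equation reduces to a quadratic in \lambda, which gives a small self-contained sketch. Here H is taken as the hypothesis (between-group) matrix and E as the error (within-group) matrix of sums of squares and cross products; the function name and the nested-list layout are hypothetical. For more than two responses a numerical eigenvalue routine would be needed instead.

```python
from math import sqrt


def manova_eigenvalues_2x2(H, E):
    """Roots of det(H - lambda * E) = 0 for symmetric 2x2 H and E, largest first.

    H and E are nested lists [[m11, m12], [m12, m22]]; E is assumed
    positive definite so that the quadratic has a nonzero leading term.
    """
    (h11, h12), (_, h22) = H
    (e11, e12), (_, e22) = E

    # det(H - lambda E) = a * lambda^2 + b * lambda + c
    a = e11 * e22 - e12 ** 2                         # det(E)
    b = -(h11 * e22 + h22 * e11 - 2.0 * h12 * e12)
    c = h11 * h22 - h12 ** 2                         # det(H)

    # Guard against tiny negative discriminants from rounding
    disc = sqrt(max(b * b - 4.0 * a * c, 0.0))
    return [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
```

When H has rank one (group differences along a single direction), the second root is zero, which matches the bound s = \min ( p, a-1 ) on the number of canonical variates.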

	multivariate tests
		express in terms of eigenvalues or ratio of SS
		Roy's greatest root:	\lambda_1
			best if only one dimension to group differences
		Wilks' lambda:		1 / \prod ( 1 + \lambda_l )
			based on likelihood principles
		Hotelling-Lawley trace:	\sum \lambda_l
			proportional to average of univariate F-tests
		Pillai-Bartlett trace:	\sum \lambda_l / ( 1 + \lambda_l )
			proportional to average of squared canonical correlations
			r_l^2 = \lambda_l / ( 1 + \lambda_l ) = SS_H / SS_T

		power comparison if group differences
			in one direction
				Roy > Hotelling-Lawley > Wilks > Pillai
			more diffuse
				Roy < Hotelling-Lawley < Wilks < Pillai
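
The four multivariate test statistics are simple functions of the eigenvalues, as this short sketch shows. The function name and dictionary keys are hypothetical, and the code returns only the statistics themselves; converting them to p-values requires approximations that are not covered here.

```python
from math import prod


def multivariate_tests(eigenvalues):
    """Summarize eigenvalues lambda_l into the four classical test statistics."""
    lam = list(eigenvalues)
    return {
        # Roy's greatest root: largest eigenvalue
        "roy": max(lam),
        # Wilks' lambda: 1 / prod(1 + lambda_l); small values indicate group differences
        "wilks": 1.0 / prod(1.0 + l for l in lam),
        # Hotelling-Lawley trace: sum of eigenvalues
        "hotelling_lawley": sum(lam),
        # Pillai-Bartlett trace: sum of r_l^2 = lambda_l / (1 + lambda_l)
        "pillai": sum(l / (1.0 + l) for l in lam),
    }
```

For eigenvalues 3 and 1, this gives Roy = 3, Wilks = 0.125, Hotelling-Lawley = 4, and Pillai = 1.25.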

	manova path diagram
		  -> Y1 <- E1
		 /         ^
		A          |
		 \         v
		  -> Y2 <- E2

	Roy's greatest root path diagram
		       -> Y1 <- E1
		      /         ^
		A -> V          |
		      \         v
		       -> Y2 <- E2

	recommendation in practice
		do all tests and compare
		investigate discrepancies carefully

18.3 How do Groups Differ?
multiple comparisons
	experiment-wise error rate across p responses
	Bonferroni
	p univariate comparisons
	comparisons on linear combinations

discriminant analysis
	canonical DA
		interpret each canonical variate
		correlations with p responses = factor loading
		caution on interpreting weights = coefficients
			collinearity of responses affects weights
		SAS proc candisc

	stepwise DA
		forward selection of responses
			with backward elimination
			start with response that discriminates best
			stop when no significant improvement

		analogy to analysis of covariance
			y_ij1 = \mu_i1 + e_ij1
			y_ij2 = \mu_i2 + \beta y_ij1 + e_ij2
			. . .
			find best F for group at each step

		order determined EMPIRICALLY
		usually follow with canonical DA

	path diagram
		  -> V1 -> Y1 <- E1
		 /               ^
		A                |
		 \               v
		  -> V2 -> Y2 <- E2

18.4 Causal Models
	prior knowledge of cause and effect
	predetermined order of responses

	step-down analysis
		analysis of covariance
			y_ij1 = \mu_i1 + e_ij1
			y_ij2 = \mu_i2 + \beta y_ij1 + e_ij2
			. . .

		order determined BEFORE experiment
		use backward elimination from fullest model
			to test for no group differences
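
One step of the step-down analysis can be sketched as an analysis of covariance F-test: does the group factor still matter for the second response once the first response is in the model? The function name and data layout below are hypothetical, and only the two-response step y_ij2 = \mu_i2 + \beta y_ij1 + e_ij2 is shown.

```python
def step_down_f(y1_groups, y2_groups):
    """F-test for group differences in y2 after adjusting for y1.

    Full model:    y2 = group mean + beta * y1 + error   (n - a - 1 error df)
    Reduced model: y2 = intercept  + beta * y1 + error   (n - 2 error df)
    Returns (F, numerator df, denominator df).
    """
    a = len(y1_groups)
    n = sum(len(g) for g in y1_groups)
    x_all = [x for g in y1_groups for x in g]
    y_all = [y for g in y2_groups for y in g]
    xbar, ybar = sum(x_all) / n, sum(y_all) / n

    # Total sums of squares and cross products (ignoring groups)
    txx = sum((x - xbar) ** 2 for x in x_all)
    tyy = sum((y - ybar) ** 2 for y in y_all)
    txy = sum((x - xbar) * (y - ybar) for x, y in zip(x_all, y_all))

    # Within-group sums of squares and cross products
    wxx = wyy = wxy = 0.0
    for xg, yg in zip(y1_groups, y2_groups):
        xm, ym = sum(xg) / len(xg), sum(yg) / len(yg)
        for x, y in zip(xg, yg):
            wxx += (x - xm) ** 2
            wyy += (y - ym) ** 2
            wxy += (x - xm) * (y - ym)

    sse_reduced = tyy - txy ** 2 / txx   # no group term
    sse_full = wyy - wxy ** 2 / wxx      # parallel lines model
    df_num, df_den = a - 1, n - a - 1
    f = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f, df_num, df_den
```

With more responses, each later step would add all earlier responses as covariates, which calls for a general regression routine rather than this closed form.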

	path diagram
		  -> Y1 <- E1
		 /   |
		A    |
		 \   v
		  -> Y2 <- E2

Last modified: Mon Jun 17 11:35:46 1996 by Brian Yandell

yandell@stat.wisc.edu