Practical Data Analysis for Designed Experiments

Brian S. Yandell (1997) Chapman & Hall, London

A. Placing Data in Context

1. Practical Data Analysis
1.1 Effect of Factors
1.2 Nature of Data
1.3 Summary Tables
1.4 Plots for Statistics
1.5 Computing
1.6 Interpretation
2. Collaboration in Science
2.1 Asking Questions
2.2 Learning from Plots
2.3 Mechanics of Consulting Session
2.4 Philosophy & Ethics
2.5 Intelligence, Culture & Learning
2.6 Writing
3. Experimental Design
3.1 Types of Studies
3.2 Designed Experiments
3.3 Design Structure
3.4 Treatment Structure
3.5 Designs in This Book

1. Practical Data Analysis

- practical data analysis (pda)
  - data in context of scientific experiment
  - Chatfield: initial data analysis -- tables & graphs
  - Tukey: exploratory data analysis
  - confirmatory data analysis
- human judgement
  - interpretation in terms of original problem
  - key questions

1.1 Effect of Factors

- factors have levels; factor combinations as cells
- analysis of variance (ANOVA)
  - main effects & interaction
- word model / math symbols / computer language
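
As a hedged illustration of the "word model / math symbols / computer language" idea: in words, response = overall mean + effect of A + effect of B + interaction + error; in symbols, y_ijk = mu + alpha_i + beta_j + (alpha beta)_ij + e_ijk. The sketch below is a minimal Python version for a balanced 2x2 layout. The data values, factor names, and the choice of Python (rather than SAS or S-Plus) are assumptions made only for this example.

```python
from statistics import mean

# Hypothetical responses for a 2x2 factorial: factor A (a1, a2) by factor B (b1, b2),
# two replicates per cell. The numbers are invented for illustration.
data = {
    ("a1", "b1"): [10.0, 12.0],
    ("a1", "b2"): [14.0, 16.0],
    ("a2", "b1"): [20.0, 22.0],
    ("a2", "b2"): [30.0, 32.0],
}

a_levels = sorted({a for a, _ in data})
b_levels = sorted({b for _, b in data})

# Cell means: one mean per factor combination.
cell_mean = {cell: mean(ys) for cell, ys in data.items()}

# Balanced design, so the grand mean is the mean of the cell means.
grand = mean(cell_mean.values())

# Main effects: marginal mean for a level minus the grand mean.
a_effect = {a: mean(cell_mean[(a, b)] for b in b_levels) - grand for a in a_levels}
b_effect = {b: mean(cell_mean[(a, b)] for a in a_levels) - grand for b in b_levels}

# Interaction: what is left in each cell mean after removing grand mean and main effects.
interaction = {
    (a, b): cell_mean[(a, b)] - grand - a_effect[a] - b_effect[b]
    for a in a_levels
    for b in b_levels
}

print("grand mean:", grand)
print("A effects:", a_effect)
print("B effects:", b_effect)
print("interaction:", interaction)
```

With these invented numbers the interaction terms are nonzero, which is the situation where main effects alone do not describe the cells. The decomposition as written relies on balance; unbalanced designs need more care.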

1.2 Nature of Data

- quality & structure
  - garbage in, garbage out (gigo)
- mechanics of manipulation
  - store, transfer, handle
  - very large data sets
- analysis & display
  - description & inference
- data mining
  - dangers of fishing
  - new views on very large problems

1.3 Summary Tables

- table of means
  - order by mean values, not alphabetical
  - significant digits
  - avoid repetition
- cross-tables for two or more factors
- plots for moderate to large number of levels
- anova table
  - needed? put in appendix?
  - key results in text (p-values)
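
A minimal sketch of a table of means ordered by mean value rather than alphabetically, with rounding to limit the digits shown. The group names and numbers are invented, and `means_table` is a hypothetical helper written for this example, not a function from any package.

```python
from statistics import mean, stdev

# Hypothetical responses by treatment group (invented numbers).
groups = {
    "control": [4.1, 3.9, 4.4],
    "drug A": [6.2, 5.8, 6.0],
    "drug B": [5.1, 4.7, 5.2],
}


def means_table(groups, digits=2):
    """Return rows (label, n, mean, sd), ordered by mean and rounded to `digits` decimals."""
    ordered = sorted(groups.items(), key=lambda item: mean(item[1]))
    return [
        (label, len(ys), round(mean(ys), digits), round(stdev(ys), digits))
        for label, ys in ordered
    ]


table = means_table(groups)
print(f"{'group':<10}{'n':>3}{'mean':>8}{'sd':>8}")
for label, n, m, s in table:
    print(f"{label:<10}{n:>3}{m:>8.2f}{s:>8.2f}")
```

Here the rows come out as control, drug B, drug A, which differs from alphabetical order and makes the ranking of groups visible at a glance.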

1.4 Plots for Statistics

- annotation
  - use plot symbols, circles & arrows
  - identify unusual points
  - label axes & subject matter
  - show central tendency & variation
- compromise
  - crammed with important details
  - easy to absorb & grasp
- plots of relationships
  - guide analysis
  - crystallize questions
  - highlight design issues
- sketch vs. publication quality
- single group or side-by-side groups
  - histogram or dot diagram
  - stem-and-leaf diagram
  - survival curve or cumulative distribution
  - boxplot
  - eschew bar graphs & pie charts
- multiple factors
  - interaction plots
  - scatter plots
    - response vs. covariate or group mean
    - residual plot: vs. predicted or covariate
  - use plot symbols for factor levels!
  - invent symbols for factor combinations (cells)
  - care with unbalanced designs
- nested designs
  - care separating & identifying sources of variation
  - blocking & subsampling
  - split plot design -- key features of nesting
  - repeated measures -- correlation over levels (time)
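
One of the single-group displays listed above, the stem-and-leaf diagram, is simple enough to sketch as text. This is a minimal Python version, assuming non-empty, non-negative data; `stem_and_leaf` and its `leaf_unit` parameter are hypothetical names chosen for this example.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return the lines of a stem-and-leaf diagram for non-negative numbers.

    Each value is truncated to a whole number of leaf units; the last digit
    of that number is the leaf and the remaining digits form the stem.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        scaled = int(v // leaf_unit)
        stem, leaf = divmod(scaled, 10)
        leaves[stem].append(leaf)
    lines = []
    # Walk every stem in range so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>3} | {digits}")
    return lines


# Invented sample values for illustration.
sample = [12, 15, 21, 23, 23, 38, 41, 47, 49]
for line in stem_and_leaf(sample):
    print(line)
```

Unlike a histogram, the display keeps the individual values readable, which helps when identifying unusual points.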

1.5 Computing

- primary tools suggested in this course
  - SAS
    - industry and government standard
    - handles complicated designs well
    - large staff of statisticians
    - local expertise
    - tends to be used in "batch" mode
  - S-Plus
    - becoming industry standard
    - excellent interactive functions & graphics
    - easily extensible with functions
    - intelligent data structures
    - on your own for more complicated designs
  - others
    - whatever works (Minitab, SPSS, Systat, ...)
- know in detail what it does
  - strengths & weaknesses
  - accuracy & accessibility
  - fancy graphics do not imply correct calculations
- complement computing tools
  - exploratory vs. presentation graphics
  - complicated analyses
  - ease of transfer to written report
- dynamic graphics
  - interactive adjustment of plot features
- Internet
  - StatLib -- http://lib.stat.cmu.edu/
  - NetLib -- ftp://netlib.att.com/netlib/master/readme.html
  - http://www.stat.wisc.edu/
  - interactive Internet resources

1.6 Interpretation

- inference: sampled vs. target population
- comparing distributions
  - means & variances may differ
- assumptions: how important are they?
- models vs. reality
  - curve fitting to match data in hand
  - mechanistic model to match process under study
  - Box: "all models are wrong, but some models are useful"

2. Collaboration in Science

- communication takes practice
  - applied statistician -- building career in collaborative consulting
  - lab or field scientist -- organizing thoughts before & during research
- environment for healthy collaboration
  - embark on knowledge discovery process
  - convey concepts in simple, accessible language
  - neutral, comfortable climate for listening
- consulting as a series of interviews
  - initial grasp of experiment & key questions
  - later elaboration of specific aspects of design & analysis

2.1 Asking Questions

- general -> specific -> general
- start with background of experiment
- avoid blunt questions & jargon
- ask neutral questions
- rephrase material to check comprehension
- anything else?

2.2 Learning from Plots

- initial plots
  - physical layout of experiment
  - raw sketches -- scatter plots & tables
  - augment plots with symbols & comments
  - order factor levels by mean values
- model fit & check
  - start with simple models using well-behaved subsets
  - subdivide when suggested by analysis (interactions)
  - overlay model on data
  - include precision; identify sources of variation
  - use plots to check assumptions & identify outliers
- interpretation & presentation
  - keep audience in mind
  - stick to a few self-contained figures
  - annotate to highlight results & key features

2.3 Mechanics of Consulting Session

- many activities at once
  - organization of time & responsibilities
  - science of research problem
  - interpersonal dynamics
- beginning
  - build mutual respect
  - importance of opening climate
  - set clear agenda & time frame
  - establish levels of expertise
- middle
  - goals, scientific issues
  - statistical approach
  - start simple with plots
  - build complexity at comfortable pace
  - keep technical level appropriate to problem
  - always have goals in mind
- ending
  - review progress
  - outline future tasks
  - reevaluate time frame & goals as necessary

2.4 Philosophy & Ethics

- articles
  - philosophy of consulting
  - training of statisticians for consulting
- history of statistics & science
  - science does not always move forward
- statistician as disinterested party
- statistician's role in ethical misconduct
  - error/oversight vs. misuse/fraud
- ethical guidelines & avenues for help

2.5 Intelligence, Culture & Learning

- learning process & concept of intelligence
- Herrmann: complementary thinking processes
  - cerebral/limbic - left/right
- Gardner: seven intelligences
  - linguistic, musical, logical/mathematical, spatial, bodily/kinesthetic, intra-personal, inter-personal
- Markova: perceptual channels
  - visual, auditory, kinesthetic
  - front/middle/back channels
- statistical consultant as anthropologist

2.6 Writing

- science writing
  - protocols of materials & methods
  - articulate key questions & goals
  - lay out experimental design
  - plan strategy for analysis
  - visualize data as sketched plots
- notes before, during & after consulting sessions
  - keep in mind how to communicate with peers
- sample report outline
  - title page (informative title / name / date)
  - abstract / summary (half-page / condensed / specific results)
  - introduction (overview / big picture, literature)
  - experimental design / materials & methods / data description
  - results (plots / tables / plain reporting)
  - conclusions (interpretation / cautions / future work)
  - references (full citations of work referred to in report)
  - appendix (brief! needed?)
- writing guides
  - Strunk & White: elements of style
  - Gower: classic writing ideas
  - Goldberg: creative writing
  - Higham: handbook of writing for math sciences

3. Experimental Design

data analysis drives experimental design drives data analysis

3.1 Types of Studies

- pure observational study (natural history)
- sample survey
- designed experiment
  - protocol established ahead
  - scientist controls key aspects
- biostatistics
  - prospective study
  - retrospective study
  - clinical trial

3.2 Designed Experiments

- factor & levels, groups
  - what is the experimental unit (EU)?
  - factor combination as cell
  - factor combination as group
- designed experiment
  - key questions drive experiment
  - treatment structure: factor levels under study
  - design structure: restrictions on randomization
  - assumptions, goals for inference

3.3 Design Structure

- must be understood for proper analysis
- replication
  - increase precision (central limit theorem)
  - smooth over odd situations (outliers)
  - pseudo-replication, repeated measures
- randomization
  - sample EUs drawn from one population of interest
    - randomly assign factor levels to EU (drug)
  - samples drawn from several populations
    - random sample of EUs from population (gender)
  - same analysis, different inference / interpretation
  - randomize over extraneous factors, trends, etc.
- examples
  - one factor
    - subsampling or pseudoreplication
    - completely randomized design (CRD)
    - randomized complete block design (RCBD)
  - two factor
    - strip plot, CRD, split plot
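
The difference between a CRD and an RCBD lies in the restriction on randomization, which a short sketch can make concrete. This is a minimal Python illustration under simple assumptions: equal replication in the CRD and exactly one unit per treatment in each block of the RCBD. The function names `crd` and `rcbd`, the unit labels, and the treatment labels are all hypothetical.

```python
import random


def crd(units, treatments, seed=None):
    """Completely randomized design: shuffle all units, then deal out
    the treatments so each is used the same number of times."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    reps = len(units) // len(treatments)
    labels = [t for t in treatments for _ in range(reps)]
    return dict(zip(shuffled, labels))


def rcbd(blocks, treatments, seed=None):
    """Randomized complete block design: every block receives each treatment
    once, with the order randomized separately within each block."""
    rng = random.Random(seed)
    plan = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs exactly one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)
        plan[block] = dict(zip(units, order))
    return plan


# Invented experimental units: 12 plots, either unblocked or grouped into 4 blocks of 3.
plots = [f"plot{i}" for i in range(1, 13)]
print(crd(plots, ["A", "B", "C"], seed=1))

blocks = {f"block{b}": plots[3 * b : 3 * b + 3] for b in range(4)}
print(rcbd(blocks, ["A", "B", "C"], seed=1))
```

In the CRD any plot can receive any treatment, while in the RCBD the randomization is restricted so that treatments are compared within blocks. The analysis has to reflect whichever restriction was used.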

3.4 Treatment Structure

- one-factor (one-way layout)
- two-factors (two-way layout)
- factorial arrangements
- fractional factorial arrangement (stat 424)
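
A factorial arrangement crosses every level of each factor with every level of the others, and a fractional factorial keeps only a structured subset of those cells. The Python sketch below enumerates both. The factor names and levels are invented, `full_factorial` is a hypothetical helper, and the half fraction uses the standard defining relation I = ABC for three two-level factors in -1/+1 coding.

```python
from itertools import product


def full_factorial(factors):
    """Return every combination of factor levels (one dict per cell)."""
    names = list(factors)
    return [
        dict(zip(names, combo))
        for combo in product(*(factors[name] for name in names))
    ]


# Two-way layout: 3 varieties x 2 fertilizer levels = 6 cells (invented factors).
cells = full_factorial({"variety": ["v1", "v2", "v3"], "fertilizer": ["low", "high"]})
for cell in cells:
    print(cell)

# 2^3 factorial in coded units (-1 = low, +1 = high) for factors A, B, C.
runs = list(product((-1, 1), repeat=3))

# Half fraction 2^(3-1) with defining relation I = ABC:
# keep the runs where A * B * C == +1, equivalently C = A * B.
half_fraction = [(a, b, c) for a, b, c in runs if a * b * c == 1]
print(len(runs), "runs in full design;", len(half_fraction), "in half fraction")
```

The price of the smaller design is aliasing: in this half fraction each main effect is confounded with a two-factor interaction (for example, C with AB).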

3.5 Designs in This Book

- B: groups, one factor
- 1,2,3 factors
- C: balanced designs
- D: unbalanced / missing cell
- E: assumptions
  - residual & diagnostics / unequal variances
  - transformations / distribution-free methods
- F: covariates
- G: random / fixed / mixed effects
- H: nested designs
  - blocking / subsampling
  - split plot, strip plot
- I: correlated measurements (over time, space)
  - repeated measures
  - cross-over designs

Last modified: Tue Feb 17 08:47:29 1998 by Brian Yandell (yandell@stat.wisc.edu)