Draft Syllabus for BMI/Stat 677
Introduction to Statistical Methods for Molecular Biology
This syllabus is aimed at biology graduate students with strong
quantitative skills.
The course will follow a fairly fixed syllabus (below), delivered through
lectures. Reading and smaller assignments will be given for each
segment. Larger term assignments will be collaborative projects in a
subject area of interest to the student team, leading to a paper and
presentation.
Frequency
Ideally, BMI/Stat 677 and Stat/BMI 877
will be taught in alternate years, probably during the Spring
semester. We anticipate teaching 677 for the first time in Spring
2009.
Instructors
The course will be taught either by a team of instructors or by one of
a rotating list of instructors, including but not limited to the
following:
Cecile Ane,
Karl Broman,
Jason Fine,
Sunduz Keles,
Christina Kendziorski,
Bret Larget,
Michael Newton,
Brian Yandell.
In the case of a team, one person would be the lead
instructor, officially responsible for the course. In Spring 2008,
Christina Kendziorski will lead in teaching
Stat/BMI 877.
The course gives a concise introduction to statistical problems that
have recently arisen in gene mapping, high throughput -omic data analysis,
phylogenetics and sequence analysis. Basic concepts of key methods
will be developed with considerable attention to the analysis of
published data. Biology students should gain a deeper
understanding of state-of-the-art statistical methods to encourage
best practices. The objective is to experience fruitful
cross-disciplinary work. Course projects will be designed to apply
methods to data and, where possible, to extend methods in novel ways to ask
new questions.
Below is an outline of topics that might be covered. They are
organized into four biological areas plus statistical issues
common across topics. Subtopics in [brackets] are advanced and
may be optional.
- statistical issues common across topics (2 weeks)
  - maximum likelihood, Bayesian and other methods
  - multiple testing and false discovery rates
  - resampling methods
  - [high-dimensional testing and estimation]
- gene mapping for experimental crosses (2-3 weeks)
  - quantitative genetics review
  - qualitative traits
  - quantitative trait loci (QTL)
  - model selection for genetic architecture
  - [fine-mapping strategies]
  - inbred crosses, [outbred crosses, natural populations]
- high throughput -omic data analysis (2-3 weeks)
  - normalization/pre-processing
  - hierarchical clustering
  - comparing two conditions
  - analysis for more complicated designs
  - analysis for emerging biotechnological -omic experiments
    - ChIP-chip, expression tiling, CGH, CSI
  - [microarray data as complex phenotypic traits (eQTL)]
- statistical phylogenetics (2-3 weeks)
  - likelihood models of molecular evolution / maximum likelihood estimation
  - Bayesian phylogenetic inference / MCMC strategies
  - trait evolution and comparative methods
- biological sequence analysis (2-3 weeks)
  - review of transcription regulation
  - regulatory motif finding
    - model-based approaches (constrained and unconstrained mixture models, HMMs)
    - regression-based approaches
  - integrating high throughput experimental data
    - cross-species conservation
    - ChIP-chip, nucleosome positioning, CSI
- final reports (1-2 weeks)
- gene mapping for experimental crosses
- high throughput -omic data analysis
- statistical phylogenetics
  - Gascuel O, ed. (2005) Mathematics of Evolution and Phylogeny. Oxford University Press. [aimed at stat grad students in 800 level course]
  - Nielsen R, ed. (2005) Statistical Methods in Molecular Evolution. Springer-Verlag. [covers broader range of topics; lacks some basics]
  - Felsenstein J (2003) Inferring Phylogenies. Sinauer Assoc. [better for biology students in 600 level course]
- biological sequence analysis
Brian Yandell
Last modified: Fri May 18 10:35:17 CDT 2007