Draft Syllabus for Stat/BMI 877
Statistical Methods for Molecular Biology
This syllabus is aimed at statistics graduate students with interests in
molecular biology.
The course will follow a fairly fixed syllabus (below), taught through
lectures. Reading and smaller assignments will be given for each
segment. Larger term assignments will be collaborative projects in
a subject area of interest to the student team, leading to a paper and
presentation.
Frequency
Ideally BMI/Stat 677 and Stat/BMI 877
will be taught in alternate years, probably during Spring
semester. We anticipate teaching 877 in Spring 2008.
Instructors
The course will be taught either by a team of instructors or by one of
a rotating list of instructors, including but not limited to the
following:
Cecile Ane,
Karl Broman,
Jason Fine,
Sunduz Keles,
Christina Kendziorski,
Bret Larget,
Michael Newton,
Brian Yandell.
In the case of a team, one person would be lead
instructor, officially responsible for the course. In Spring 2008,
Christina Kendziorski will take the lead in teaching
Stat/BMI 877.
The course gives a concise review of relevant background biology and
of statistical problems that have arisen recently in gene
mapping, high throughput -omic data analysis, phylogenetics and
sequence analysis. Basic ideas of key methods will be developed
with considerable attention to analysis of published
data. Statistics students should gain sufficient background to
start exploring their own research questions in the area. The
objective is to experience fruitful cross-disciplinary work.
Projects may be theoretical or methodological, designed
to explore how to extend current methods to novel questions.
Below is an outline of topics that might be covered. They are
organized into four biological areas plus statistical issues
common across topics. Subtopics in [brackets] are advanced and
may be optional.
- biological context/motivation (1 week)
  - central dogma: DNA/RNA/proteins/traits
  - recent massive high-throughput technology
- statistical issues common across topics (1 week)
  - maximum likelihood, Bayesian and other methods
  - multiple testing and false discovery rates
  - resampling methods
  - [high-dimensional testing and estimation]
- gene mapping for experimental crosses (2-3 weeks)
  - quantitative genetics review
  - qualitative traits
  - quantitative trait loci (QTL)
  - model selection for genetic architecture
  - [fine-mapping strategies]
  - inbred crosses, [outbred crosses, natural populations]
- high throughput -omic data analysis (2-3 weeks)
  - normalization/pre-processing
  - hierarchical clustering
  - comparing two conditions
  - analysis for more complicated designs
  - analysis for emerging biotechnological -omic experiments
    - ChIP-chip, expression tiling, CGH, CSI
  - [microarray data as complex phenotypic traits (eQTL)]
- statistical phylogenetics (2-3 weeks)
  - likelihood models of molecular evolution / maximum likelihood estimation
  - Bayesian phylogenetic inference / MCMC strategies
  - trait evolution and comparative methods
- biological sequence analysis (2-3 weeks)
  - review of transcription regulation
  - regulatory motif finding
    - model-based approaches (constrained and unconstrained mixture models, HMMs)
    - regression-based approaches
  - integrating high throughput experimental data
    - cross-species conservation
    - ChIP-chip, nucleosome positioning, CSI
- final reports (1-2 weeks)
- gene mapping for experimental crosses
- high throughput -omic data analysis
- statistical phylogenetics
  - Gascuel O, ed. (2005) Mathematics of Evolution and Phylogeny.
    Oxford University Press. [aimed at stat grad students in 800 level course]
  - Nielsen R, ed. (2005) Statistical Methods in Molecular Evolution.
    Springer-Verlag. [covers broader range of topics; lacks some basics]
  - Felsenstein J (2003) Inferring Phylogenies.
    Sinauer Assoc. [better for biology students in 600 level course]
- biological sequence analysis
Brian Yandell
Last modified: Fri May 18 10:15:39 CDT 2007