<html> <head> <title>Draft Syllabus for BMI/Stat 677</title> </head> <body> <h1>Draft Syllabus for BMI/Stat 677</h1> <h2>Introduction to Statistical Methods for Molecular Biology</h2> This syllabus is aimed at biology graduate students with strong quantitative skills. <ul> <li> <a href="#approach">Teaching Approach</a> <li> <a href="#goals">Course Goals</a> <li> <a href="#outline">Course Outline</a> <li> <a href="#reference">Potential Course References</a> </ul> <hr> <h4><a name="approach">Teaching Approach</a></h4> The course will have a fairly fixed syllabus (below) with lectures. Readings and smaller assignments will be given for each segment. Larger term assignments will be collaborative projects in a subject area of interest to each student team, leading to a paper and a presentation. <h4>Frequency</h4> Ideally, BMI/Stat 677 and <a href="syl877.html">Stat/BMI 877</a> will be taught in alternate years, probably during the Spring semester. We anticipate teaching 677 for the first time in Spring 2009. <h4>Instructors</h4> The course will be taught either by a team of instructors or by one of a rotating list of instructors, including but not limited to the following: <a href="/~ane">Cecile Ane</a>, <a href="http://www.biostat.jhsph.edu/~kbroman/">Karl Broman</a>, <a href="http://www.biostat.wisc.edu/People/faculty/fine.htm">Jason Fine</a>, <a href="/~keles">Sunduz Keles</a>, <a href="http://www.biostat.wisc.edu/~kendzior/">Christina Kendziorski</a>, <a href="/~larget">Bret Larget</a>, <a href="/~newton">Michael Newton</a>, <a href="/~yandell">Brian Yandell</a>. In the case of a team, one person would be the lead instructor, officially responsible for the course. In Spring 2008, Christina Kendziorski will take the lead in teaching <a href="syl877.html">Stat/BMI 877</a>. <hr> <h4><a name="goals">Course Goals</a></h4> Give a concise introduction to the statistical problems that have recently arisen in gene mapping, high throughput -omic data analysis, phylogenetics and sequence analysis. 
Basic concepts of key methods will be developed with considerable attention to analysis of published data. Biology students should gain a deeper understanding of state-of-the-art statistical methods to encourage best practices. The objective is to experience fruitful cross-disciplinary work. Course projects will be designed to apply methods to data and possibly extend methods in novel ways to ask new questions. <hr> <h4><a name="outline">Course Outline</a></h4> Below is an outline of topics that might be covered. They are organized into four biological areas plus statistical issues common across topics. Subtopics in [brackets] are advanced and may be optional. <ul> <li>statistical issues common across topics (2 weeks) <ul> <li> maximum likelihood, Bayesian and other methods <li> multiple testing and false discovery rates <li> resampling methods <li> [high-dimensional testing and estimation] </ul> <li>gene mapping for experimental crosses (2-3 weeks) <ul> <li> quantitative genetics review <li> qualitative traits <li> quantitative trait loci (QTL) <li> model selection for genetic architecture <li> [fine-mapping strategies] <li> inbred crosses, [outbred crosses, natural populations] </ul> <li>high throughput -omic data analysis (2-3 weeks) <ul> <li> normalization/pre-processing <li> hierarchical clustering <li> comparing two conditions <li> analysis for more complicated designs <li>analysis for emerging biotechnological -omic experiments <ul> <li> ChIP-chip, expression tiling, CGH, CSI <li> [microarray data as complex phenotypic traits (eQTL)] </ul> </ul> <li>statistical phylogenetics (2-3 weeks) <ul> <li> likelihood models of molecular evolution / maximum likelihood estimation <li> Bayesian phylogenetic inference / MCMC strategies <li> trait evolution and comparative methods </ul> <li>biological sequence analysis (2-3 weeks) <ul> <li> review of transcription regulation <li> regulatory motif finding <ul> <li> model-based approaches (constrained and unconstrained 
mixture models, HMMs) <li> regression-based approaches </ul> <li> integrating high throughput experimental data <ul> <li> cross-species conservation <li> ChIP-chip, nucleosome positioning, CSI </ul> </ul> <li> final reports (1-2 weeks) </ul> <hr> <h4><a name="reference">Potential Course References</a></h4> <ul> <li> gene mapping for experimental crosses <ul> <li> Broman KW, Sen S (in prep) R/qtl book [should be good on methods and graphics] <li> Yandell BS, Yi N, Churchill G (in prep) <cite>Bayesian Model Selection for Multiple QTL</cite> <li> Lynch M and Walsh B (1997) <A HREF="http://nitro.biosci.arizona.edu/zbook/book.html"> <CITE>Fundamentals of Quantitative Genetics</CITE></A>. <A HREF="http://www.sinauer.com/Titles/frlynch.htm">Sinauer Associates</A>. ISBN 0-87893-481-2. [good overview, but short on specifics] <li> <a href="qtl.html">Selected References on Gene Mapping for Experimental Crosses</a> </ul> <li> high throughput -omic data analysis <ul> <li> Do KA, Muller P, Vannucci M, eds. (2006) <cite><a href="http://books.google.com/books?vid=ISBN052186092X">Bayesian Inference for Gene Expression and Proteomics</a></cite>. Cambridge University Press, New York, NY. <li> Causton H, Quackenbush J, Brazma A (2003) <cite><a href="http://books.google.com/books?vid=ISBN1405106824">Microarray Gene Expression Data Analysis: A Beginner's Guide</a></cite>. Blackwell Science Ltd., Malden, MA. </ul> <li> statistical phylogenetics <ul> <li> <a href="http://www.lirmm.fr/mab/sommaire_english.php3">Gascuel O</a>, ed. (2005) <cite>Mathematics of Evolution and Phylogeny</cite>. Oxford University Press. [aimed at stat grad students in 800 level course] <li> <a href="http://www.binf.ku.dk/~rasmus/webpage/ras.html">Nielsen R</a>, ed. (2005) <cite><a href="http://books.google.com/books?vid=ISBN0387223339">Statistical Methods in Molecular Evolution</a></cite>. Springer-Verlag. 
[covers a broader range of topics; lacks some basics] <li> <a href="http://evolution.genetics.washington.edu/phylip/felsenstein.html">Felsenstein J</a> (2003) <cite>Inferring Phylogenies</cite>. Sinauer Associates. [better for biology students in 600 level course] </ul> <li> biological sequence analysis <ul> <li> <a href="http://www.bio.upenn.edu/faculty/ewens/">Ewens WJ</a>, <a href="http://www.greg.grant.org/">Grant GR</a> (2005) <cite><a href="http://books.google.com/books?vid=ISBN0387952292">Statistical Methods in Bioinformatics</a></cite>, 2nd ed. Springer-Verlag. [nice book presenting material from a statistical viewpoint; many other bioinformatics books are rather algorithmic] <li> Durbin R, Eddy SR, Krogh A, Mitchison G (1998) <cite><a href="http://selab.wustl.edu/publications/cupbook.html">Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids</a></cite>. Cambridge University Press. </ul> </ul> <hr> <address><a href="http://www.stat.wisc.edu/~yandell">Brian Yandell</a></address>   Last modified: Fri May 18 10:35:17 CDT 2007  </body> </html>