CS 769: Advanced Natural Language Processing
Spring 2009
Your to-do list

Schedule:
Lecture: 9:30-10:45 MWF, 1207 Computer Science
The class will not meet on days marked = or - (subject to change):

Jan 2009      Feb           Mar           Apr           May
  M   W   F     M   W   F     M   W   F     M   W   F     M   W   F
     21  23     2   =   6     2   4   6         1   -             -
 26   =   =     9  11  13     9  11  13     6   8   -     4   -   -
               16   =  20     =   =   =    13  15   =
               23  25  27    23  25  27    20  22   -
                             30             -   -

Class mailing list: compsci769-1-s09@lists.wisc.edu (archive)
Course URL: http://pages.cs.wisc.edu/~jerryzhu/cs769.html

Instructor: Xiaojin (Jerry) Zhu
Office: 6391 CS
E-mail: jerryzhu@cs.wisc.edu
Phone: 608-890-0129
Office Hours: 11am-12pm Mondays, 9:30-10:30am Tuesdays

Teaching Assistant: Chitra Muthukrishnan
Office: 7376 CS
E-mail: chitra@cs.wisc.edu
Phone: 608-262-6625
Office Hours: 6:30-7pm Thursdays

Prerequisites: CS 540 or equivalent, or instructor consent

Homeworks
Keep track of your grades through the Learn@UW system. The grade consists of about 5 homeworks (50%) and a project (50%).

Class Projects
An excellent project addresses an issue with social impact and/or is creative. A good project applies methods discussed in class to an NLP (or your own research) problem. The expected amount of work is equivalent to 3 to 5 homeworks. Most projects should be individual; group projects are allowed with instructor approval.
What we want:
1. Email me and the TA a short (1-2 paragraph) proposal (earlier is better, but no later than 3/31).
2. A poster session (5/4, 9:30am in CS1207).
3. A project report as a PDF file, in 2-page extended-abstract style (5/6). Please use the Word or LaTeX template for AAAI, available at http://www.aaai.org/Publications/Author/author.php. The length limit is 2 pages, plus 1 additional page for references.
Projects from previous years: 2008, 2007, 2006
Useful link: ACL wiki, in particular the "Resources" tab.

Textbook and Readings
There is no required textbook for this course. We will use papers and books from the above reading list instead.

Lecture Outline:
1. Get your hands dirty (with text)
   Lecture notes: Basic Mathematical Background
   Lecture notes: Text Pre-processing and Zipf's Law
2. The machine learning approach to natural language processing
   Lecture notes: Bag-of-word and cosine similarity
   Lecture notes: Introduction to Statistical Machine Learning, handed out in class
   Lecture notes: Paired t-test
   Lecture notes: Language as a Stochastic Process
   Lecture notes: Language Models
3. Simple text classification
   Lecture notes: Naive Bayes classifiers
   Lecture notes: Logistic Regression
   Lecture notes: Support Vector Machines
4. The EM algorithm
   Lecture notes: The Expectation-Maximization (EM) Algorithm
   Additional notes handed out in class
5. Latent topic modeling
   Lecture notes: Latent Semantic Indexing, Principal Component Analysis, Latent Dirichlet Allocation
6. Random walk and graph spectrum
   Lecture notes: Google PageRank, HITS
   Lecture notes: Spectral clustering
7. Structured output spaces
   Lecture notes: Hidden Markov Models
   Lecture notes: Factor Graph and the Sum-Product Algorithm
   Lecture notes: Conditional Random Fields
8. Information theory
   Lecture notes: Basic Information Theory
9. Applications

Photo Gallery (protected)