This webpage is for archival purposes only. Prof. Snyder will teach a different version of CS769 in Spring 2011.

CS 769: Advanced Natural Language Processing
Spring 2010

Schedule:
   Lecture: 9:30-10:45 MWF, 103 Psychology
   Office Hours: 3:00-4:00pm Tuesdays or by appointment, 6391 Computer Science

Calendar: = (definitely no class), - (tentatively no class, subject to change)
   Jan 2010 Feb Mar Apr May
   1 3 5 1 3 5 = 8 10 12 8 10 - 5 7 - - 5 - 20 22 15 17 19 15 17 - 12 = - 25 27 = 22 24 26 22 24 - 19 21 - = = - - -

Class mailing list: compsci769-1-s10@lists.wisc.edu (archive)
Course URL: http://pages.cs.wisc.edu/~jerryzhu/cs769.html

Lecture Notes: (available before class)

1. Get your hands dirty (with text)
   Basic Mathematical Background
   Text Pre-processing and Zipf's Law
2. The statistical machine learning approach to natural language processing
   Language as a Stochastic Process
   Introduction to Statistical Machine Learning (read chapter 1)
   Optional reading: Chapter 2 in The Elements of Statistical Learning
   Paired t-test
   Language Models. Reading: [Chen & Goodman 98]
3. Simple text classification
   Naive Bayes classifiers. Reading: [Bishop 8.1, 8.2]
   Logistic Regression. Reading: [Ng & Jordan 02]
   Support Vector Machines. Reading: [Joachims 98]
4. The EM algorithm
   The Expectation-Maximization (EM) Algorithm. Reading: [Nigam et al. 00]
5. Latent topic modeling
   Latent Semantic Indexing, Principal Component Analysis, Latent Dirichlet Allocation. Readings: [Blei, Ng, Jordan 03; Griffiths & Steyvers 04]
6. Random walk and graph spectrum
   Google PageRank, HITS. Reading: [Doyle & Snell 84, up to section 1.3]
   k-means clustering and Spectral clustering. Reading: [von Luxburg 07]
7. Structured output spaces
   Hidden Markov Models. Reading: [Ghahramani 01]
   Factor Graph and the Sum-Product Algorithm (loopy belief propagation). Reading: [Bishop 06, chapter 8]
   Conditional Random Fields. Reading: [Sutton & McCallum 06]
8. Information theory
   Basic Information Theory. Reading: [Brown et al. An estimate ... 92]

Instructor: Xiaojin (Jerry) Zhu
   Office: 6391 CS
   E-mail: jerryzhu@cs.wisc.edu
   Phone: 608-890-0129
   Office Hours: 3:00-4:00pm Tuesdays, or by appointment

Teaching Assistant: Tuo Wang
   Office: 5364 CS
   E-mail: tuowang@cs.wisc.edu
   Phone: 608-262-5105
   Office Hours: 2:00-3:00pm Thursdays

Prerequisites: CS 540 or equivalent, or instructor consent

Homework
   Homework 6, due 4/21
   Homework 5 (solution)
   Homework 4 (solution)
   Homework 3 (solution)
   Homework 2 (solution)
   Homework 1 (solution)
   Homework tips and rules

Keep track of your grades through the learn@uw system. The grade consists of 5 to 6 homework assignments (50%), an exam (30%), and a project (20%).

Class Projects

An excellent project addresses an issue with social impact, and/or is creative. A good project applies methods discussed in class to an NLP (or your own research) problem. The amount of work is expected to be similar to 3 homework assignments. Most projects should be individual; group projects are allowed with instructor approval.

What we want:
1. Email me and the TA a short (1-2 paragraphs) proposal (early is better, but no later than 3/24).
2. A poster session (5/5 9:30am, room TBA).
3. Email me and the TA the project report as a PDF file, in 2-page extended abstract style (5/7). Please use the Word or LaTeX template for AAAI, available at http://www.aaai.org/Publications/Author/author.php. The length limit is 2 pages, plus 1 additional page for references. There is no need to hand in a hard copy.

Projects from previous years: 2009, 2008, 2007, 2006
Useful link: ACL wiki, in particular the "Resources" tab.

Textbook

There is no required textbook for this course. We will use papers and books from this list instead.