CS/ECE 861 Theoretical Foundations of Machine Learning

**Fall 2021 Course:**

- In-person lectures in ECE 1209, most MWF 9:30-10:45am; office hours Tuesdays 3-4pm (see calendar below)
- Assignments and lecture recordings are on Canvas
- Students: join Piazza for discussions
- Mask policy: Everyone must wear a mask. A student who needs a mask accommodation should contact the McBurney Disability Center.
McBurney will send notifications of approved accommodations directly to instructors. If a student fails to comply with
the mask requirement, the instructor may cancel that course session and report the incident to the Office of Student Conduct and Community Standards.
- Late homework policy: We drop the two lowest homework scores of the semester. This is meant to cover emergencies, sickness, and the like.
Otherwise, we do not accept late homework.

**Description**
Advanced mathematical theory and methods of machine learning. Statistical learning theory,
Vapnik-Chervonenkis theory, model selection, high-dimensional models, nonparametric methods,
probabilistic analysis, optimization, learning paradigms.
**Prereq**
CS/ECE 761 or ECE 830
(While not required, my suggested course sequence for incoming machine learning graduate students
is CS 532 Matrix Methods in Machine Learning - CS 761 - CS 861. In addition, Math 521 Analysis I,
ECE 730 Modern Probability Theory and Stochastic Processes, CS/ECE 524 Introduction to Optimization,
and similar math courses are helpful)
**Instructor**
Professor Jerry Zhu, jerryzhu@cs.wisc.edu
**Exam**
Midterm exam: Friday Oct. 8 (in class)
Final exam: Friday Nov. 12 (in class)
Exam grading questions must be raised with the instructor within one week after the exam is returned.
**Project**
An open machine learning project, done in groups of two. Requires an analysis component.
Project proposal due: Nov. 19 (Friday). One-page PDF.
Project report due: Dec. 15 (Wednesday). Eight-page PDF in NeurIPS format.
**Topics**
- Supervised learning
  - Probably Approximately Correct (PAC) [SS] Ch 2, 3, 4
  - Rademacher complexity, growth function, VC dimension [SS] Ch 6, 26
  - Convexity, stability and generalization [SS] Ch 9, 13
  - Occam's razor, PAC-Bayesian [SS] Ch 7, 31
- Online learning
  - Mistake bound, halving algorithm, online perceptron algorithm [SS] Ch 21
  - Expert advice, Hedge [Slivkins] Ch 5
- Multi-armed bandits
  - Adversarial bandits: EXP3 [LS] Ch 1, 11
  - Stochastic bandits: ETC, UCB, successive elimination [LS] Ch 5, 6, 7
  - Contextual bandits, LinUCB [LS] Ch 18, 19
  - Minimax lower bound [LS] Ch 13, 14, 15
- Reinforcement learning
  - UCB-VI [AJKS] Ch 1, 6, 7
**References**
[AJKS] Alekh Agarwal, Nan Jiang, Sham M. Kakade, and Wen Sun. Reinforcement Learning: Theory and Algorithms.
[SS] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms.
[LS] Tor Lattimore and Csaba Szepesvári. Bandit Algorithms.
[Slivkins] Aleksandrs Slivkins. Introduction to Multi-Armed Bandits.
**Grading:** Homework (40%), exam (40%), project (20%).
**Class learning outcome**
Students will be able to:
- derive sample complexity bounds using concentration of measure inequalities
- analyze bias-variance tradeoffs and model selection criteria
- derive rates of convergence for nonparametric machine learning algorithms
- gain familiarity with various machine learning paradigms, including supervised, unsupervised, active, multitask, and online learning.