CS/ECE 861 Theoretical Foundations of Machine Learning

Fall 2022 Course:

Physical lectures in CS 1325, MWF 9:30-10:45am (see calendar below for exact dates)
Zoom office hours Thursdays 4-5pm (link in calendar)
Assignments, lecture notes in Canvas

Description
Advanced mathematical theory and methods of machine learning. Statistical learning theory,
Vapnik-Chervonenkis theory, model selection, high-dimensional models, nonparametric methods,
probabilistic analysis, optimization, learning paradigms.

Prereq
CS/ECE 761 or ECE 830
(While not required, my suggested course sequence for incoming machine learning graduate students
is CS 532 Matrix Methods in Machine Learning - CS 761 - CS 861. In addition, Math 521 Analysis I,
ECE 730 Modern Probability Theory and Stochastic Processes, CS/ECE 524 Introduction to Optimization,
and similar math courses are helpful)

Instructor
Professor Jerry Zhu, jerryzhu@cs.wisc.edu

Assignments
will be released in Canvas
Late policy: 25% off for 1 day late, 40% off for 2 days late, 50% off for 3 days late; not accepted after that.

Discussions
will be in Piazza

Scribe
Each student is expected to scribe three lectures. Each lecture needs two scribes.
1. Sign up for three lectures (spreadsheet link in Canvas). Pick your dates based on your availability,
not the topic; topics may change.
2. A LaTeX template for scribing is provided in Canvas.
3. A first draft of your scribe notes is due 72 hours after the lecture and should be emailed to the
instructor. Submissions must include a compiled PDF, the LaTeX source, and necessary figures. The
instructor may request further changes to the draft.
4. The notes will be posted on Canvas.
5. If you decide to drop the course before your scribe date, inform the instructor as soon as possible.
If you are on the waitlist, please do not sign up for scribing until you have been allowed to enroll.

Topics
(tentative)
Supervised Learning
Probably Approximately Correct (PAC) [SS] Ch 2, 3, 4
Rademacher complexity, Growth function, VC dimension [SS] Ch 6, 26
Convexity, stability and generalization [SS] Ch 9, 13
Occam's razor, PAC-Bayesian [SS] Ch 7, 31
Online learning
Mistake bound, halving algorithm, Online perceptron algorithm [SS] Ch 21
Expert advice, Hedge [Slivkins] Chapter 5
Multi-armed bandits
Adversarial bandits: EXP3 [LS] Chapters 1, 11
Stochastic bandits: ETC, UCB, successive elimination [LS] Chapters 5, 6, 7
Contextual bandits, LinUCB [LS] Chapters 18, 19
Minimax lower bound [LS] Chapters 13, 14, 15
Reinforcement learning
UCB-VI [AJKS] Ch 1, 6, 7

References

[AJKS] Alekh Agarwal, Nan Jiang, Sham M. Kakade, Wen Sun. Reinforcement Learning: Theory and Algorithms

[SS] Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms

[LS] Tor Lattimore and Csaba Szepesvári. Bandit Algorithms

[Slivkins] Aleksandrs Slivkins. Introduction to Multi-Armed Bandits

Grading: homework (50%), project (40%), scribe (10%)

Class learning outcomes
Students will be able to:
- derive sample complexity bounds using concentration of measure inequalities
- analyze bias-variance tradeoffs and model selection criteria
- derive rates of convergence for nonparametric machine learning algorithms
- gain familiarity with various machine learning paradigms, including supervised, unsupervised, active, multitask, and online learning.