CS/ECE/STAT-861: Theoretical Foundations of Machine Learning

University of Wisconsin-Madison, Fall 2025

Lecture Notes

  • Course overview,   Slides.

  • Chapter 0: A toolkit for CS861,   Slides.

  • Chapter 1: Lower bounds,   Slides.

  • Chapter 2: Nonparametric methods,   Slides.

  • Chapter 3: Statistical learning theory,   Slides.

  • Chapter 4: Stochastic bandits,   Slides.

  • Chapter 5: Online learning and adversarial bandits,   Slides.

  • Chapter 6: Online convex optimization,   Slides.

Recommended Reading

Course Schedule

Date Topics Recommended reading Announcements
Wed. 09/03 Course overview and logistics,
Begin Ch0, Sub-Gaussian concentration, Covering and packing
Homework 0 released.
Fri. 09/05 Distances between distributions,
Begin Ch1, Lower bounds for point estimation
Lectures 7, 8, 9, 10 from
Lester Mackey's class
Mon. 09/08 Average risk optimality vs minimax optimality
Lower bounds for hypothesis testing, Le Cam's method
Lectures 7, 8, 9, 10 from
Lester Mackey's class,
JD Chapter 7
Wed. 09/10 Le Cam's method (cont'd),
Review of information theory
JD Chapter 7,
Cover & Thomas Chapter 2
Fri. 09/12 Fano's method,
Reduction from estimation to testing
JD Chapter 7 HW0 due on 09/13.
Mon. 09/15 Reduction to testing (cont'd), Examples for Le Cam's method
Constructing alternatives for Fano's method
JD Chapter 7 HW1 released partially
Wed. 09/17 Gilbert-Varshamov bound, Fano's method examples
Begin Ch2, Nonparametric regression
JD Chapter 7,
AT Chapter 2.5
HW1 updated on 09/18
Fri. 09/19 Nonparametric regression (cont'd) AT Chapters 1.2, 1.5, 2.5
Mon. 09/22 Nonparametric regression (cont'd),
Nonparametric density estimation
AT Chapters 1.2, 1.5, 2.5
Wed. 09/24 Ch3 begin, Statistical learning theory,
ERM and uniform convergence
MRT Chapters 2, 3,
SB Chapter 6
Fri. 09/26 Rademacher complexity and its properties MRT Chapters 2, 3,
SB Chapters 6, 26
HW1 due on 09/27.
HW2 released partially
Mon. 09/29 Contraction lemma, Symmetrization,
Uniform convergence via Rademacher complexity
MRT Chapters 2, 3,
SB Chapters 4, 26
Wed. 10/01 VC dimension and Sauer's lemma MRT Chapter 3,
SB Chapter 6
HW2 updated.
Fri. 10/03 Proof of Sauer's lemma,
VC dimension-based lower bounds for binary classification
MRT Chapter 3,
SB Chapter 6
Mon. 10/06 Dudley Entropy Integral Ch 4 of Tengyu Ma's Notes
Wed. 10/08 Two-layer Neural Networks, Approximation vs estimation error Ch 5.3 of Tengyu Ma's Notes
Fri. 10/10 Ch4 begin, Introduction to Stochastic Bandits,
The UCB algorithm
LS Chapters 1, 2, 4, 7 HW2 due on 10/11.
Mon. 10/13 Lower bounds for K-armed bandits LS Chapter 7 HW3 released on 10/14
Wed. 10/15 Linear bandits LS Chapters 19, 20
Fri. 10/17 Martingale concentration,
Ch5 begin, Introduction to online learning
LS Chapters 19, 20,
FO Chapter 7
Project preliminary drafts due on 10/18.
Mon. 10/20 The experts problem and the Hedge algorithm
Adversarial bandits and EXP3
FO Chapter 7,
LS Chapter 11
HW3 updated on 10/19
Wed. 10/22 Lower bounds for adversarial bandits,
Contextual bandits and EXP4
FO Chapter 7,
LS Chapter 11
HW4 released on 10/23
Fri. 10/24 Ch6 begin, Review of convex analysis,
Introduction to online convex optimization
FO Chapter 7 HW3 due on 10/25.
Mon. 10/27 Follow the (regularized) leader, Failure cases for FTL FO Chapter 7
Wed. 10/29 FTRL with strongly convex regularizers,
FTRL examples, Online gradient descent
FO Chapter 7
Fri. 10/31 Follow the perturbed leader,
FTPL for the experts problem
Kalai & Vempala, 2005
Mon. 11/03 FTPL for online shortest paths,
Exam review and logistics
Kalai & Vempala, 2005 HW4 due on 11/08.
End of class
Sat. 11/15 Project questions (with solutions) due on 11/15.
Mon. 11/17 – Fri. 11/21 Take-home exam
Sat. 12/06 Solutions to assigned project questions due on 12/06.