CS/ECE/STAT-861: Theoretical Foundations of Machine Learning

University of Wisconsin-Madison, Fall 2025

Overview

This class will cover fundamental and advanced theoretical topics in Machine Learning. We will focus on several paradigms of learning (such as supervised/unsupervised learning, online learning, and sequential decision-making) and examine questions such as: Under what conditions can we learn and generalize from a limited amount of data? How hard is a given learning problem? How good is a learning algorithm, and is it optimal for the given problem? When making decisions under uncertainty, how do we trade off between learning about the environment and achieving our goal? We will use tools from several areas related to machine learning, such as statistics, algorithms, information theory, and game theory.

This course will be primarily targeted towards PhD students who intend to do research in theoretical machine learning and statistics.

Quick links: Canvas, Piazza.

Course staff

Instructor: Kirthevasan Kandasamy.
Office hours: Wednesdays 1:30 PM – 2:50 PM at MH5506.
E-mail: kandasamy@cs{dot}wisc{dot}edu.

Grader: Julia Nakhleh.
E-mail: jnakhleh@wisc{dot}edu.

Lectures

Monday, Wednesday, and Friday. 11:00 AM – 12:15 PM. ENGR HALL 3534.
There will be a total of 27–30 lectures.
Lecture notes (slides) will be made available prior to the class.

Topics

This is a tentative list of topics that we intend to cover in this class. The course staff reserves the right to modify the syllabus as they see fit.

  • Background topics

    • Probability, concentration of measure

    • Covering and packing

    • Information theory, distances between distributions

    • Convex analysis

  • Statistical lower bounds

    • Lower bounds for point estimation

    • Lower bounds for hypothesis testing (Le Cam and Fano methods)

    • Going from estimation to testing

    • Gilbert-Varshamov lemma

  • Nonparametric methods

    • Nonparametric regression, Nadaraya-Watson estimator

    • Kernel density estimation

    • Lower bounds for regression and density estimation

  • Statistical learning theory

    • Basic framework: loss, hypothesis classes, excess risk

    • Empirical risk minimization

    • Uniform convergence

    • Rademacher complexity

    • VC dimension and Sauer's Lemma

    • Dudley entropy integral and chaining

    • Lower bounds: binary classification, linear regression

  • Stochastic bandits

    • Optimism in the face of uncertainty and the Upper Confidence Bound (UCB) algorithm

    • Lower bounds for stochastic K-armed bandits

    • Linear bandits, martingale concentration

    • Best arm identification (if time permits)

  • Online learning and adversarial bandits

    • Learning from experts and the Hedge algorithm

    • Adversarial bandits and EXP3

    • Lower bounds for adversarial bandits and learning from experts

    • Contextual bandits and EXP4

    • Learning in games (if time permits)

    • Regret minimization in non-stationary environments (if time permits)

  • Online convex optimization

    • Follow the leader, Follow the regularized leader

    • FTRL with convex regularizers, Online gradient descent

    • Follow the perturbed leader, online shortest paths

    • Lower bounds for OCO
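
To give a flavor of the algorithmic material in the stochastic bandits unit, here is a minimal, illustrative sketch of the Upper Confidence Bound (UCB1) strategy on a simulated Bernoulli bandit. This is not course material — the function name and simulation setup are ours — just a taste of the "optimism in the face of uncertainty" principle covered in lecture:

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit with the given arm means.

    Returns (total reward collected, number of pulls per arm).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k    # number of pulls of each arm
    sums = [0.0] * k    # cumulative reward from each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize
        else:
            # Optimism: play the arm maximizing empirical mean
            # plus a confidence radius that shrinks with pulls.
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts
```

For example, on a two-armed instance such as `ucb1([0.9, 0.1], 2000)`, the algorithm should concentrate the vast majority of its pulls on the better arm; the course will make this precise via regret upper and lower bounds.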

Prerequisites

CS761 or equivalent. I may waive this requirement, but it is the student's responsibility to have an adequate background in probability, statistics, calculus, and algorithms. I will not be doing a review of all necessary topics at the beginning of the class.

I will release a set of diagnostic questions as Homework 0 at the beginning of class. While you are not expected to know the solutions right away, you should be able to solve most of the questions with reasonable effort, consulting references as necessary.

Recommended textbooks

We will not be following a textbook in this class. However, the following texts are excellent references.

Logistics

Canvas: We will use Canvas for homeworks and exams.

Piazza: Please sign up for the class on Piazza via this link. See the Canvas announcement for the access code.

  • Piazza will be used for most announcements, but please check Canvas for announcements as well.

  • If you have any questions about the class, it is best to message me via Piazza rather than emailing me directly. Please post your question publicly if you feel that other students may be able to answer it, or that other students may benefit from the answer.

  • You may use Piazza for peer discussions about lectures or clarifications about homework questions. While I will be checking Piazza regularly, as a general rule I will not answer homework questions on Piazza. It is best to use my office hours to discuss homework questions.

Grading

Your grade will be determined by proofreading lecture notes, homeworks, a take-home exam, and a course project. See the grading page for more details.