CS/ECE/STAT-861: Theoretical Foundations of Machine Learning

University of Wisconsin-Madison, Fall 2023

Overview

This class will cover fundamental and advanced theoretical topics in Machine Learning. We will focus on several paradigms of learning (such as supervised/unsupervised learning, online learning, and sequential decision-making) and examine questions such as: Under what conditions can we learn and generalize from a limited amount of data? How hard is a given learning problem? How good is a learning algorithm, and is it optimal for the given problem? When making decisions under uncertainty, how do we trade off between learning about the environment and achieving our goal? We will use tools from several areas related to machine learning, such as statistics, algorithms, information theory, and game theory.

This course will be primarily targeted towards PhD students who intend to do research in theoretical machine learning and statistics.

Instructor

Kirthevasan Kandasamy.
Office hours: Wednesday, 1:30 – 3:00 PM, at CS5375.
E-mail: kandasamy@cs{dot}wisc{dot}edu.

Lectures

Monday, Wednesday, and Friday. 11:00 AM – 12:15 PM. ENGR HALL 3349.
There will be a total of 27–30 lectures. Lectures will primarily be on the whiteboard, and lecture notes scribed by students will be made available within 4-5 days of the lecture.

Topics

Supervised learning
- Loss, risk, and empirical risk minimization
- PAC Learning
- Rademacher complexity and VC dimension

Lower bounds
- Review of information theory, distances between distributions
- Average risk optimality vs minimax optimality
- Lower bounds for point estimation
- Going from estimation to testing: Fano and Le Cam methods
- Lower bounds for classification in a VC class

Nonparametric methods
- Lower bounds for nonparametric regression and density estimation
- Upper bounds using kernel methods

Stochastic bandits
- Optimism in the face of uncertainty and the Upper Confidence Bound (UCB) algorithm
- Lower bounds
- Martingales, structured bandit settings

Online learning and adversarial bandits
- Learning from experts and the Hedge algorithm
- Adversarial bandits and EXP3
- Lower bounds for online learning and adversarial bandits
- Contextual bandits and EXP4
- Online convex optimization, Follow the regularized leader

Advanced topics (we may not have time to cover all of the following topics)
- Regret minimization in non-stationary environments
- Reinforcement learning
- Learning and game theory

Prerequisites

CS761 or equivalent. I may waive this requirement, but it is the student's responsibility to have an adequate background in probability, statistics, calculus, and algorithms. I will not be doing a review of these topics at the beginning of the class.

I will release a set of diagnostic questions as Homework 0 at the beginning of class. While you are not expected to know the solutions right away, you should be able to solve most of the questions with reasonable effort, looking up references if necessary.

References

Foundations of Machine Learning by Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar.
Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David.
Lecture notes on Information Theory by John Duchi.
Introduction to Nonparametric Estimation by Alexandre Tsybakov.
A Modern Introduction to Online Learning by Francesco Orabona.
Bandit Algorithms by Tor Lattimore and Csaba Szepesvári.

Logistics

Canvas: We will use Canvas for homeworks and exams.

Piazza: Please sign up for the class on Piazza via this link. See the Canvas announcement for the access code.

Piazza will be used for most announcements, but please check Canvas for announcements as well.
If you have any questions about the class, it is best to message me via Piazza instead of emailing me directly. Please post your question publicly if you feel that other students may be able to answer it, or if you think that other students may benefit from the answer.
You may use Piazza for peer discussions about lectures or clarifications about homework questions. While I will be checking Piazza regularly, as a general rule, I will not be answering questions about homework on Piazza. It is best to use my office hours to discuss homework questions.

Grading

Your grade will be determined by scribing, homeworks, a take-home exam, and a course project. See the grading page for more details.