Learning Math for Machine Learning (LMML) Reading Group

Brian Fantana: They've done studies, you know. 60% of the time it works, every time.
Ron Burgundy: That doesn't make sense.
-Anchorman (2004)

UPDATE

This material is now covered more systematically and professionally in a new course taught by Professor Jerry Zhu.

Overview

Machine learning research uses tools and results from a variety of mathematical fields, including (but not limited to) convex optimization, probability, statistics, functional analysis, and computational learning theory. Familiarity with these ideas is crucial for participating fully in many exciting research directions. The purpose of this reading group is to gain a better understanding of some mathematical foundations relevant to machine learning research. Because of this focus, much of the material covered will not be about machine learning per se, but rather about general theoretical concepts that have important applications in machine learning.

Meeting format

This group will follow the time-honored format of weekly volunteer presentations. Due to the rigorous nature of this material, there will probably be little benefit to simply printing out the paper, attending the meeting, and letting the presentation wash over you in a soothing wave of lemmas and Greek symbols. Please make a serious effort to understand the readings for the weeks you choose to attend. Feel free, though, to attend some meetings and skip others as your schedule and interests dictate, as the content will not generally be cumulative.

Meeting schedule

Mondays at 3:00 PM, room ~~2310~~ 1263 Computer Sciences
(if unavailable, backup location is CS 1289)

Mailing list

The LMML mailing list is currently inactive.

List of topics (tentative)

2/8, 2/15 Math foundations (David Andrzejewski) - inf/sup, open/closed, continuity, sequences, convergence, ...
Introduction to Mathematical Analysis by John Hutchinson (Australian National University).
2/22, 3/8 Linear algebra (Bess Berg) - eigenvalues/vectors, factorization, PSD, Sherman-Morrison-Woodbury, ...
Review notes by Sam Roweis.
Linear Operators: some basics by Bess Berg.
3/1 Optimization (Sangkyun Lee) - LP, QP, SDP, Lagrange multipliers, ...
The Interplay of Optimization and Machine Learning Research by Kristin P. Bennett and Emilio Parrado-Hernandez.
Lagrange Multipliers without Permanent Scarring by Dan Klein.
3/15 Probability (David Andrzejewski) - sample space, random variables, convergence, ...
Review of probability theory by Terence Tao (blog).
3/22, 4/5 Statistics (Jie Liu) - hypothesis testing, estimators, bias-variance, Fisher information, statistical decision theory, minimax, ...
Notes based on STAT 609/610 (Prof Chunming Zhang) scribed by Jie Liu.
Further references:
Probability and Random Processes (3rd Edition), by Geoffrey R. Grimmett, David R. Stirzaker
Mathematical Statistics (2nd Edition), by Peter J. Bickel, Kjell A. Doksum
4/12 Bounds/inequalities (Sangkyun Lee) - Cauchy-Schwarz, Markov, Chernoff, Chebyshev, Hoeffding, Efron-Stein, ...
Concentration Inequalities by Stephane Boucheron, Gabor Lugosi, and Olivier Bousquet.
Slides by Olivier Bousquet.
Concentration of measure by Terence Tao (blog).
4/19 Statistical learning theory (Alok Deshpande) - risk minimization, PAC-style bounds, VC theory, ...
Lecture notes by Rob Nowak (Wisconsin).
Introduction to Statistical Learning Theory by Olivier Bousquet, Stephane Boucheron, and Gabor Lugosi.
Lecture notes by Sham Kakade and Ambuj Tewari (TTI-Chicago).
4/26 Convexity (Prabu Ravindran) - convex sets/functions, Jensen's inequality, duality, ...
Convex Optimization (Ch 2, 3, 5) by Stephen Boyd and Lieven Vandenberghe.
Convex Optimization & Euclidean Distance Geometry by Jon Dattorro.
5/3 Exponential family (Andreas Vlachos) - definition, properties, estimation, GLMs, ...
Notes by Andreas Vlachos.
Tutorial slides by Tony Jebara.
Lecture 1, lecture 2 by Michael Jordan, scribed by Sivakumar Rathinam, Shariq Rizvi, and Xia Jiang (UC-Berkeley).
Function spaces - properties, Banach, Hilbert, Lp, Sobolev, Lipschitz, ...
Manifolds - Riemannian manifolds, exponential map, Laplace-Beltrami operator, ...
Applied linear algebra - linear regression (orthogonalization), Netflix prize (SVD), spectral clustering, PageRank (random walks), ...
The Elements of Statistical Learning (Ch 3.1, 3.2) by Trevor Hastie, Robert Tibshirani, Jerome Friedman.
A tutorial on spectral clustering by Ulrike von Luxburg.
The Use of the Linear Algebra by Web Search Engines by Amy N. Langville and Carl D. Meyer.

This group is maintained by David Andrzejewski (CS login: andrzeje).