Learning Math for Machine Learning (LMML) Reading Group
Brian Fantana: They've done studies, you know. 60% of the time, it works every time.
Ron Burgundy: That doesn't make sense.
-Anchorman (2004)
UPDATE
This material is now covered more systematically and professionally in
a new course taught by Professor Jerry Zhu.
Overview
Machine learning research uses tools and results from a variety of
mathematical fields, including (but not limited to): convex
optimization, probability, statistics, functional analysis, and
computational learning theory. Familiarity with these ideas is
crucial in order to fully participate in many exciting research
directions. The purpose of this reading group is to gain a better
understanding of some mathematical foundations relevant to machine
learning research. Because of this focus, much of the material
covered will not be about machine learning per se, but rather about
general theoretical concepts which have important applications in
machine learning.
Meeting format
This group will follow the time-honored format of weekly volunteer
presentations. Due to the rigorous nature of this material, there will
probably be little benefit to simply printing out the paper, attending
the meeting, and letting the presentation wash over you in a soothing
wave of lemmas and Greek symbols. Please make a serious effort to
understand the readings for the weeks you choose to attend. Feel free,
though, to attend some meetings and skip others as your schedule and
interests dictate, as the content will not generally be cumulative.
Meeting schedule
Mondays at 3:00 PM, room 2310 1263 Computer Sciences
(if unavailable, backup location is CS 1289)
Mailing list
The LMML mailing list is currently inactive.
List of topics (tentative)
- 2/8, 2/15 Math foundations (Dave Andrzejewski) - inf/sup, open/closed, continuity, sequences, convergence, ...
Introduction to Mathematical Analysis by John Hutchinson (Australian National University).
- 2/22, 3/8 Linear algebra (Bess Berg) - eigenvalues/vectors, factorization, PSD, Sherman-Morrison-Woodbury, ...
Review notes by Sam Roweis.
Linear Operators: some basics by Bess Berg.
- 3/1 Optimization (Sangkyun Lee) - LP, QP, SDP, Lagrange multipliers, ...
The Interplay of Optimization and Machine Learning Research by Kristin P. Bennett and Emilio Parrado-Hernandez.
Lagrange Multipliers without Permanent Scarring by Dan Klein.
- 3/15 Probability (David Andrzejewski) - sample space, random variables, convergence, ...
Review of probability theory by Terence Tao (blog).
- 3/22, 4/5 Statistics (Jie Liu) - Hypothesis testing, estimators, bias-variance, Fisher information, statistical decision theory, minimax ...
Notes based on STAT 609/610 (Prof Chunming Zhang) scribed by Jie Liu.
Further references:
Probability and Random Processes (3rd Edition), by Geoffrey R. Grimmett, David R. Stirzaker
Mathematical Statistics (2nd Edition), by Peter J. Bickel, Kjell A. Doksum
- 4/12 Bounds/inequalities (Sangkyun Lee) - Cauchy-Schwarz, Markov, Chernoff, Chebyshev, Hoeffding, Efron-Stein ...
Concentration Inequalities by Stephane Boucheron, Gabor Lugosi, and Olivier Bousquet.
Slides by Olivier Bousquet.
Concentration of measure by Terence Tao (blog).
- 4/19 Statistical learning theory (Alok Deshpande) - Risk minimization, PAC-style bounds, VC theory, ...
Lecture notes by Rob Nowak (Wisconsin).
Introduction to Statistical Learning Theory by Olivier Bousquet, Stephane Boucheron, and Gabor Lugosi.
Lecture notes by Sham Kakade and Ambuj Tewari (TTI-Chicago).
- 4/26 Convexity (Prabu Ravindran) - convex sets/functions, Jensen's inequality, duality, ...
Convex Optimization (Ch 2, 3, 5) by Stephen Boyd and Lieven Vandenberghe.
Convex Optimization & Euclidean Distance Geometry by Jon Dattorro.
- 5/3 Exponential family (Andreas Vlachos) - definition, properties, estimation, GLMs, ...
Notes by Andreas Vlachos.
Tutorial slides by Tony Jebara.
Lecture 1, lecture 2 by Michael Jordan, scribed by Sivakumar Rathinam, Shariq Rizvi, and Xia Jiang (UC-Berkeley).
- Function spaces - Properties, Banach, Hilbert, Lp, Sobolev, Lipschitz, ...
- Manifolds - Riemannian manifolds, exponential map, Laplace-Beltrami operator, ...
- Applied linear algebra - linear regression (orthogonalization), Netflix prize (SVD), spectral clustering, PageRank (random walks), ...
The Elements of Statistical Learning (Ch 3.1, 3.2) by Trevor Hastie, Robert Tibshirani, Jerome Friedman.
A tutorial on spectral clustering by Ulrike von Luxburg.
The Use of the Linear Algebra by Web Search Engines by Amy N. Langville and Carl D. Meyer.
This group is maintained by David Andrzejewski (CS login: andrzeje).