CS 839-7 (Spring 2022)
Final Project Instructions


The final project can be completed individually or in groups of two. The project can be any of the following:

  • Literature review: a critical summary of one or several papers related to the topics studied in the course.
  • Original research: it can be theoretical, computational, or a mix of the two.

We particularly welcome projects that may be extended for submission to a peer-reviewed journal or conference (e.g., MOR/AoS/T-IT/COLT/ICML/NeurIPS/ICLR). Project topics must be approved by the instructor.

You are expected to submit a final project report summarizing your findings. The report can have up to 4 pages of main text (we encourage conciseness), plus an appendix of unlimited length. The report is due May 10.

Below are a few suggested (theoretical) papers for literature review. (We will keep updating this list.) You are also more than welcome to propose a paper of your own interest.

Statistical Learning

  1. "Instability, Computational Efficiency and Statistical Accuracy," Nhat Ho, Koulik Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael I. Jordan, Bin Yu, 2020.
  2. "A Statistical Taylor Theorem and Extrapolation of Truncated Densities," Constantinos Daskalakis, Vasilis Kontonis, Christos Tzamos, Manolis Zampetakis, 2021.
  3. "A Universal Law of Robustness via Isoperimetry," Sebastien Bubeck, Mark Sellke, 2021.
  4. "Inference and uncertainty quantification for noisy matrix completion," Yuxin Chen, Jianqing Fan, Cong Ma, 2019
  5. "Self-regularizing Property of Nonparametric Maximum Likelihood Estimator in Mixture Models," Yury Polyanskiy, Yihong Wu, 2020. 
  6. "Learning with Semi-Definite Programming: new statistical bounds based on fixed point analysis and excess risk curvature," Stéphane Chrétien, Mihai Cucuringu, Guillaume Lecué, Lucie Neirac, 2020
  7. "Reconciling modern machine learning practice and the bias-variance trade-off," Mikhail Belkina, Daniel Hsu, Siyuan Maa, and Soumik Mandala, 2019
  8. "Two models of double descent for weak features," Mikhail Belkin, Daniel Hsu, and Ji Xu, 2019
  9. "How Many Variables Should Be Entered in a Regression Equation?" L. Breiman, and D. Freedman
  10. "SLOPE is adaptive to unknown sparsity and asymptotically minimax," W. Su and E. Candes, The Annals of Statistics, 2016
  11. "Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima," P. Loh and M. Wainwright, 2013
  12. "The landscape of empirical risk for non-convex losses," S. Mei, Y. Bai, and A. Montanari, 2016
  13. "Phase transitions in semidefinite relaxations," A. Javanmard, A. Montanari, and F. Ricci-Tersenghi, Proceedings of the National Academy of Sciences, 2016
  14. "Spectral methods meet EM: A provably optimal algorithm for crowdsourcing," Y. Zhang, X. Chen, D. Zhou, and M. Jordan, Advances in Neural Information Processing Systems, 2014
  15. "Spectral algorithms for tensor completion," A. Montanari, N. Sun, 2016
  16. "Tensor SVD: Statistical and Computational Limits," A. Zhang, D. Xia, 2020

Neural Networks

  1. "The Global Landscape of Neural Networks: An Overview," Ruoyu Sun, Dawei Li, Shiyu Liang, Tian Ding, R Srikant, 2020.
  2. "On the Landscape of One-hidden-layer Sparse Networks and Beyond," Dachao Lin, Ruoyu Sun, Zhihua Zhang, 2020.
  3. "What Can ResNet Learn Efficiently, Going Beyond Kernels?" Z. Allen-Zhu, Y. Li, 2019
  4. "Can SGD Learn Recurrent Neural Networks with Provable Generalization?" Z. Allen-Zhu, Y. Li, 2019
  5. "A Mean Field View of the Landscape of Two-Layers Neural Networks," S. Mei, A. Montanari, P. Nguyen, 2018
  6. "Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data," Y. Li, Y. Liang, 2018
  7. "On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization," S. Arora, N. Cohen, E. Hazan, 2018
  8. "On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition," M. Mondelli, A. Montanari, 2018
  9. "Learning One-hidden-layer Neural Networks with Landscape Design," R. Ge, J. Lee, T. Ma, 2017
  10. "Approximability of Discriminators Implies Diversity in GANs," Y. Bai, T. Ma, A. Risteski, 2018
  11. "Plug-and-Play Methods Provably Converge with Properly Trained Denoisers," E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, W. Yin, 2019

Non-convexity in Learning and Statistics

  1. "Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction," Dominik Stoger, Mahdi Soltanolkotabi, 2021.
  2. "Implicit Regularization in Deep Matrix Factorization," Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo, 2019.
  3. "On the Optimization Landscape of Tensor Decompositions," R. Ge and T. Ma, 2016
  4. "No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis," R. Ge, C. Jin, Y. Zheng, 2017
  5. "Characterizing Implicit Bias in Terms of Optimization Geometry," S. Gunasekar, J. Lee, D. Soudry, N. Srebro, 2018
  6. "Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion and Blind Deconvolution," C. Ma, K. Wang, Y. Chi, and Y. Chen, 2017
  7. "Gradient Descent Learns Linear Dynamical Systems," M. Hardt, T. Ma, B. Recht, 2016
  8. "Model-free Nonconvex Matrix Completion: Local Minima Analysis and Applications in Memory-efficient Kernel PCA," J. Chen, X. Li, 2017
  9. "Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent," A. Dalalyan, Conference on Learning Theory, 2017
Reinforcement Learning

  1. "Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation," Semih Cayci, Siddhartha Satpathi, Niao He, and R. Srikant, 2021.
  2. "Representation Learning for Online and Offline RL in Low-rank MDPs," Masatoshi Uehara, Xuezhou Zhang, Wen Sun, 2021.
  3. "Value function estimation in Markov reward processes: Instance-dependent L∞-bounds for policy evaluation," Ashwin Pananjady, Martin J. Wainwright, 2019
  4. "Variance-reduced Q-learning is minimax optimal," M. Wainwright, 2019
  5. "Provably Efficient Reinforcement Learning with Linear Function Approximation," Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan, 2019
  6. "Provably Efficient Exploration in Policy Optimization," Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang, 2019

Optimization
The following papers are not directly related to this course, but you may still choose from them provided that you first discuss your choice with the instructor.

  1. "From low probability to high confidence in stochastic convex optimization," Damek Davis, Dmitriy Drusvyatskiy, Lin Xiao, Junyu Zhang, 2022. 
  2. "How to Escape Saddle Points Efficiently," C. Jin, R. Ge, P. Netrapalli, S. Kakade, M. Jordan, 2017
  3. "Natasha 2: Faster Non-Convex Optimization Than SGD," Z. Allen-Zhu, 2017
  4. "An Alternative View: When Does SGD Escape Local Minima?" R. Kleinberg, Y. Li, Y. Yuan, 2018
  5. "Sharp analysis for nonconvex SGD escaping from saddle points," C. Fang, Z. Lin, and T. Zhang, 2019
  6. "Gradient Descent Can Take Exponential Time to Escape Saddle Points," S. Du, C. Jin, J. Lee, M. Jordan, B. Poczos, A. Singh, 2017
  7. "Stochastic Cubic Regularization for Fast Nonconvex Optimization," N. Tripuraneni, M. Stern, C. Jin, J. Regier, M. Jordan, 2017
  8. "On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions," S. Lu, M. Hong, Z. Wang, 2018
  9. "Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solutions for Nonconvex Distributed Optimization," M. Hong, J. Lee, M. Razaviyayn, 2018
  10. "Convergence Analysis of Alternating Direction Method of Multipliers for a Family of Nonconvex Problems," M. Hong, Z.Q. Luo, and M. Razaviyayn, 2016
  11. "On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems," T. Lin, C. Jin, and M. Jordan, 2019
  12. "Stochastic methods for composite and weakly convex optimization problems," J. Duchi, F. Ruan, SIAM Journal on Optimization, 2018
  13. "Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets," D. Garber, E. Hazan, 2014
  14. "Mirror descent in non-convex stochastic programming," Z. Zhou, P. Mertikopoulos, N. Bambos, S. Boyd, P. Glynn, 2017