ORIE 7790 Final Project Instructiosn (Spring 2020)

ORIE 7790 (Spring 2020)
Final Project Instructions

The final project can be completed individually or in groups of two. The project can be any of the following:

• Literature review: Critical summary of one or several papers related to the topics studied.
• Original research: It can be either theoretic or experimental (ideally a mix of the two).

We particularly welcome projects that may be extended for submission to a peer-reviewed journal or conference (e.g., MOR/AoS/T-IT/COLT/ICML/NeurIPS/ICLR). Project topics must be approved by the instructor.

You are expected to submit a final project report summarizing your findings. The report can have up to 5 pages of main text (we encourage conciseness), with unlimited appendix. Due May 23rd.

A few suggested (theoretical) papers for literature review. (Updated 4/13/2020. You are more than welcome to propose a paper of your own interest.)

Statistical Learning

"Learning with Semi-Definite Programming: new statistical bounds based on fixed point analysis and excess risk curvature," Stéphane Chrétien, Mihai Cucuringu, Guillaume Lecué, Lucie Neirac, 2020
"Reconciling modern machine learning practice and the bias-variance trade-off," Mikhail Belkina, Daniel Hsu, Siyuan Maa, and Soumik Mandala, 2019
"Two models of double descent for weak features," Mikhail Belkin, Daniel Hsu, and Ji Xu, 2019
"How Many Variables Should Be Entered in a Regression Equation?" L. Breiman, and D. Freedman
‘‘SLOPE is adaptive to unknown sparsity and asymptotically minimax,’’ W. Su and E. Candes, The Annals of Statistics, 2016
‘‘Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima,’’ P. Loh and M. Wainwright, 2013
‘‘The landscape of empirical risk for non-convex losses,’’ S. Mei, Y. Bai, and A. Montanari, 2016.
‘‘Phase transitions in semidefinite relaxations,’’ A. Javanmard, A. Montanari, and F. Ricci-Tersenghi, Proceedings of the National Academy of Sciences, 2016
"Singularity, Misspecification, and the Convergence Rate of EM," Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Michael I. Jordan, Martin J. Wainwright, Bin Yu, 2018
"Randomly initialized EM algorithm for two-component gaussian mixture achievesnear optimality in O(√n) iterations," Y. Wu and H. H. Zhou, 2019
‘‘Spectral methods meet EM: A provably optimal algorithm for crowdsourcing,’’ Y. Zhang, X. Chen, D. Zhou, and M. Jordan, Advances in Neural Information Processing Systems, 2014
‘‘Spectral algorithms for tensor completion,’’ A. Montanari, N. Sun, 2016
‘‘Tensor SVD: Statistical and Computational Limits,’’ A. Zhang, Dong Xia, 2020

Neural Networks

‘‘What Can ResNet Learn Efficiently, Going Beyond Kernels?’’ Z. Allen-Zhu, Y. Li, 2019
‘‘Can SGD Learn Recurrent Neural Networks with Provable Generalization?’’ Z. Allen-Zhu, Y. Li, 2019
‘‘A Mean Field View of the Landscape of Two-Layers Neural Networks,’’ S. Mei, A. Montanari, P. Nguyen, 2018
‘‘Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data,’’ Y. Li, Y. Liang, 2018
‘‘On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization,’’ S. Arora, N. Cohen, E. Hazan, 2018
‘‘On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition,’’ M. Mondelli, A. Montanari, 2018
‘‘Learning One-hidden-layer Neural Networks with Landscape Design,’’ R. Ge, J. Lee, T. Ma, 2017
‘‘Approximability of Discriminators Implies Diversity in GANs,’’ Y. Bai, T. Ma, A. Risteski, 2018
‘‘Plug-and-Play Methods Provably Converge with Properly Trained Denoisers,’’ E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, W. Yin, 2019

Non-convexity in Learning and Statistics

‘‘Low-rank matrix recovery with composite optimization: good conditioning and rapid convergence,’’ V. Charisopoulos, Y. Chen, D. Davis, M. Diaz, L. Ding, D. Drusvyatskiy, 2019
‘‘On the Optimization Landscape of Tensor Decompositions,’’ R. Ge and T. Ma, 2016
‘‘No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis,’’ R. Ge, C. Jin, Y. Zheng, 2017
‘‘Characterizing Implicit Bias in Terms of Optimization Geometry,’’ S. Gunasekar, J. Lee, D. Soudry, N. Srebro, 2018
‘‘Implicit Regularization in Nonconvex Statistical Estimation: Gradient Descent Converges Linearly for Phase Retrieval, Matrix Completion and Blind Deconvolution,’’ C. Ma, K. Wang, Y. Chi, and Y. Chen, 2017
‘‘Gradient Descent Learns Linear Dynamical Systems,’’ M. Hardt, T. Ma, B. Recht, 2016
‘Model-free Nonconvex Matrix Completion: Local Minima Analysis and Applications in Memory-efficient Kernel PCA,’’ J. Chen, X. Li, 2017
‘‘Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent,’’ A. Dalalyan, Conference on Learning Theory, 2017

Reinforcement Learning

"Value function estimation in Markov reward processes: Instance-dependent L∞-bounds for policy evaluation," Ashwin Pananjady, Martin J. Wainwright, 2019
‘‘Variance-reduced Q-learning is minimax optimal,’’ M. Wainwright, 2019
"Provably Efficient Reinforcement Learning with Linear Function Approximation," Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael I. Jordan
"Provably Efficient Exploration in Policy Optimization," Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang, 2019
"Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium," Qiaomin Xie, Yudong Chen, Zhaoran Wang, and Zhuoran Yang, 2020

Optimization
The following papers are not directly related to this course, but you may still choose from them provided that you discuss with the instructor.

‘‘How to Escape Saddle Points Efficiently,’’ C. Jin, R. Ge, P. Netrapalli, S. Kakade, M. Jordan, 2017
‘‘Natasha 2: Faster Non-Convex Optimization Than SGD,’’ Z. Allen-Zhu, 2017
‘‘An Alternative View: When Does SGD Escape Local Minima? ’’ R. Kleinberg, Y. Li, Y. Yuan, 2018
‘‘Sharp analysis for nonconvex SGD escaping from saddle points,’’ C. Fang, Z. Lin, and T. Zhang, 2019
‘‘Gradient Descent Can Take Exponential Time to Escape Saddle Points, ’’ S. Du, C. Jin, J. Lee, M. Jordan, B. Poczos, A. Singh, 2017
‘‘Stochastic Cubic Regularization for Fast Nonconvex Optimization, ’’ N. Tripuraneni, M. Stern, C. Jin, J. Regier, M. Jordan, 2017
‘‘On the Sublinear Convergence of Randomly Perturbed Alternating Gradient Descent to Second Order Stationary Solutions, ’’ S. Lu, M. Hong, Z. Wang, 2018
‘‘Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solutions for Nonconvex Distributed Optimization, ’’ M. Hong, J. Lee, M Razaviyayn, 2018
‘‘Convergence Analysis of Alternating Direction Method of Multipliers for a Family of Nonconvex Problems, ’’ M. Hong, Z.Q. Luo, and M. Razaviyayn, 2016
‘‘On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems,’’ T. Lin, J. Chi, and M. Jordan, 2019
‘‘Stochastic methods for composite and weakly convex optimization problems,’’ J. Duchi, R. Feng, SIAM Journal on Optimization, 2018
‘‘Faster Rates for the Frank-Wolfe Method over Strongly-Convex Sets, ’’ D. Garber, E. Hazan, 2014
‘‘Mirror descent in non-convex stochastic programming, ’’ Z. Zhou, P. Mertikopoulos, N. Bambos, S. Boyd, P. Glynn, 2017

ORIE 7790 (Spring 2020)Final Project Instructions

ORIE 7790 (Spring 2020)
Final Project Instructions