CS/ISyE/Math/Stat 726 Nonlinear Optimization I (Spring 2024)
1. Basic Info
2. Course Overview
• Lecture Notes
3. Texts and References
4. Course Load and Grading
5. Academic Policies
1. Basic Info
Lectures:
Tuesday and Thursday 2:30-3:45pm, Engineering Hall 3345
Instructor:
Yudong Chen (yudong.chen at wisc dot edu, Office: CS Building 5373)
Office hours: see Canvas/Piazza
Teaching Assistant:
Matthew Zurek (matthew.zurek@wisc.edu)
Office hours: see Canvas/Piazza
Prerequisites:
This class focuses on theory. Mathematical maturity is assumed: you should be comfortable reading and writing proofs. Basic knowledge of linear algebra, real analysis, and probability is expected.
Some of the homework problems involve coding in Python, so basic knowledge of Python is expected.
Homework must be typeset in LaTeX (or another text and equation editor), so you are expected to know how to do so.
Websites and communication:
• Piazza: For discussion and course announcements. Sign up for this course on Piazza using this link.
• Canvas: We use Canvas for posting course materials.
2. Course Overview
This class covers the algorithmic and theoretical foundations of nonlinear continuous optimization. The focus is on first- and second-order iterative optimization algorithms and the rigorous analysis of these algorithms. The coding assignments illustrate the performance of different optimization methods on some characteristic examples.
This class does not focus on modeling or applications. For those topics, students may consider CS 524 and various machine learning classes.
A tentative list of topics to be covered:
• Introduction: continuous optimization background; convex sets and functions; convergence rates
• Taylor’s theorem; growth and smoothness properties; optimality conditions
• Gradient descent for convex and nonconvex optimization
• Line-search methods
• Nesterov acceleration for convex optimization
• Conjugate gradients (CG)
• Projected gradient descent
• Conditional gradients (Frank-Wolfe methods)
• Basic coordinate descent
• Stochastic gradient descent
• Nonsmooth optimization and subgradient methods
• Online convex optimization and mirror descent
• Newton’s method, trust-region Newton
• Quasi-Newton methods (DFP, BFGS, SR-1, general Broyden class)
• Limited-memory quasi-Newton (L-BFGS)
• Inexact Newton methods and Newton-CG
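To give a flavor of the methods listed above, here is a minimal sketch, in Python (the language used for the coding assignments), of gradient descent with a constant step size on a least-squares objective. The specific problem data, step size, and iteration count are made up for illustration and are not part of the course materials.

```python
# Minimal illustration (not course material): fixed-step gradient descent on
# f(x) = 0.5 * ||A x - b||^2. All problem data below are made up.
import numpy as np

def gradient_descent(grad, x0, step_size, num_iters):
    """Run gradient descent with a constant step size and return the final iterate."""
    x = x0.copy()
    for _ in range(num_iters):
        x = x - step_size * grad(x)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

# Gradient of the least-squares objective: A^T (A x - b).
grad = lambda x: A.T @ (A @ x - b)

# Step size 1/L, where L = lambda_max(A^T A) is the smoothness (Lipschitz) constant.
L = np.linalg.eigvalsh(A.T @ A).max()
x_final = gradient_descent(grad, np.zeros(5), 1.0 / L, num_iters=500)

# Compare against the least-squares solution computed directly.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
print("distance to least-squares solution:", np.linalg.norm(x_final - x_star))
```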
Lecture Notes
• Lectures 1-2: Optimization Background
• Lecture 3: Solution Concepts; Taylor’s Theorems
• Lecture 4: Smooth Functions and Optimality Conditions
• Lecture 5: Minima of Convex Functions; Algorithmic Setup
• Lecture 6: Gradient Descent and Its Analysis
• Lectures 7-8: Other Basic Descent Methods
• Lectures 9-10: Accelerated Gradient Descent
• Lecture 11: Acceleration via Restarting; Lower Bounds
• Lecture 12: Conjugate Gradient Methods
• Lecture 13: Conjugate Gradient Methods: Implementation and Extensions
• Lecture 14: Constrained Optimization over Closed Convex Sets
• Lecture 15: Projected Gradient Descent
• Lecture 16: Frank-Wolfe (aka Conditional Gradient) Method
• Lecture 17: Nonsmooth Optimization
• Lecture 18: Stochastic Optimization
• Lecture 19: Basic Newton’s Method
• Lecture 20: Line Search Procedures; Newton’s Method with Hessian Modification
• Lecture 21: Quasi-Newton Methods
• Lecture 22: Quasi-Newton: The BFGS and SR1 Methods
• Lecture 23: Limited-Memory BFGS (L-BFGS)
• Lecture 24: Trust-Region Methods
• Lecture 25: Online Convex Optimization and Mirror Descent
• Lecture 26: Saddle Point Representation of Nonsmooth Optimization and Mirror Prox
• Lecture 27: Stochastic Variance Reduced Gradient Method
3. Texts and References
Lecture notes will be shared on Canvas.
We will use the following textbooks (access through UW libraries) for some of the topics:
• S. J. Wright and B. Recht, Optimization for Data Analysis, Cambridge University Press, 2022.
• J. Nocedal and S. J. Wright, Numerical Optimization, Second Edition, Springer, 2006. (It is important that you get the second edition.)
Additional books and resources that you may find useful:
• Y. Nesterov, Lectures on Convex Optimization, Springer, 2018.
• A. Beck, First-Order Methods in Optimization, Vol. 25, SIAM, 2017.
• S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
• R. T. Rockafellar and R. J-B Wets, Variational Analysis, Springer, 1998.
• D. P. Bertsekas, with A. Nedic and A. Ozdaglar, Convex Analysis and Optimization, Athena Scientific, 2003.
• Dmitriy Drusvyatskiy's course notes on Convex Analysis and Optimization, 2019.
• S. Bubeck, Convex Optimization: Algorithms and Complexity, 2015.
4. Course Load and Grading
Your final grade will be based on the following formula (tentative and subject to change):
max(0.5H + 0.2M + 0.3F, 0.5H + 0.1M + 0.4F),
where H = homework, M = midterm exam, and F = final exam. Details below:
• Homework. There will be 5-6 homework assignments.
• Homework submissions must be typeset using LaTeX or other text/equation editors.
• You may discuss problems with other students, but you must declare it on your homework submission. Any discussion may be verbal only: you are required to work out and write the solutions on your own. You must also cite any resources that helped you obtain your solution.
• Midterm exam. March 21, in class.
• Final exam. May 10, 12:25PM - 2:25PM, Engineering Hall 3345.
Homework assignments, solutions and grades will be posted on Canvas.
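As a concrete (unofficial) illustration of how the tentative grading formula above combines the three scores, here is a small Python snippet; the scores used are made up and assumed to be on a 0-100 scale.

```python
# Unofficial illustration of the tentative grading formula; scores are made up.
def course_grade(H, M, F):
    """max(0.5H + 0.2M + 0.3F, 0.5H + 0.1M + 0.4F), all scores on a 0-100 scale."""
    return max(0.5 * H + 0.2 * M + 0.3 * F,
               0.5 * H + 0.1 * M + 0.4 * F)

# A student who does better on the final than on the midterm benefits from
# the second weighting: 0.5*90 + 0.1*70 + 0.4*85 = 86.0.
print(course_grade(H=90, M=70, F=85))  # 86.0
```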
Homework extension policy:
Blanket approval for up to 6 days. This means that for all homework assignments throughout the semester, you can be late by up to a total of 6 days without requesting an extension from the instructor. Late days are counted in full-day increments: whether you are 1 minute late or 23 hours 59 minutes late, it counts as a full day.
It is up to you to decide whether to use these late days, and how to allocate them across the HWs. For example, one may use 2 late days for HW2 and 4 late days for HW3. Or, one may use all 6 late days for HW4.
The policy does NOT mean that you can be 6 days late for every HW assignment. The 6 days are for all HW assignments combined.
5. Academic Policies
You may discuss ideas, approaches, and techniques broadly with your peers or the instructors. However, all examinations, programming assignments, and written homework must be written up individually. For example, code for programming assignments must not be developed in groups, nor should code be shared. Submitting someone else's work as your own constitutes academic misconduct. Make sure you work through all problems yourself, and that your final write-up is your own. You may discuss problems with other students, but you need to declare it in your homework submission.
You may use books or legitimate online resources to help solve homework problems, but you must always credit all such sources in your write-up and you must never copy material verbatim.
Academic integrity issues will be dealt with in accordance with University procedures (see the UW-Madison Academic Misconduct Page).
If you have any questions about this policy, please do not hesitate to contact the instructor.
Accommodations for Students with Disabilities
The University of Wisconsin-Madison supports the right of all enrolled students to a full and equal educational opportunity. The Americans with Disabilities Act (ADA), Wisconsin State Statute (36.12), and UW-Madison policy (Faculty Document 1071) require that students with disabilities be reasonably accommodated in instruction and campus life. Reasonable accommodations for students with disabilities is a shared faculty and student responsibility. Students are expected to inform the instructors of their need for instructional accommodations by the end of the third week of the semester, or as soon as possible after a disability has been incurred or recognized. The instructors will work either directly with the student or in coordination with the McBurney Center to identify and provide reasonable instructional accommodations. Disability information, including instructional accommodations as part of a student’s educational record, is confidential and protected under FERPA. (See: McBurney Disability Resource Center)
Respect for Diversity: It is the intent of the
instructors that students from all diverse backgrounds and
perspectives be well served by this course, that students’
learning needs be addressed both in and out of class, and
that the diversity that students bring to this class be
viewed as a resource, strength and benefit. It is our
intent to present materials and activities that are
respectful of diversity: gender, sexuality, disability,
age, socioeconomic status, ethnicity, race, and culture.
Your suggestions are encouraged and appreciated. Please
let us know ways to improve the effectiveness of the
course for you personally or for other students or student
groups. In addition, if any of our class meetings conflict
with your religious events, please let us know so that we
can make arrangements for you.
Please commit to helping create a climate where we treat
everyone with dignity and respect. Listening to different
viewpoints and approaches enriches our experience, and it
is up to us to be sure others feel safe to contribute.
Creating an environment where we are all comfortable
learning is everyone’s job: offer support and seek help
from others if you need it, not only in class but also
outside class while working with classmates.
Students of the class are expected to comply with the University’s current COVID rules and policies (see in particular the FAQ). Any student who requires an exemption to current policies must contact the McBurney Office, as instructors do not have the authority to grant such exceptions.