CS 726: Nonlinear Optimization I - Fall 2019


Lecture: MWF, 9:50-11:00, Computer Science 1240
Course URL: http://www.cs.wisc.edu/~swright/cs726-f19.html
Canvas Course Page: https://canvas.wisc.edu/courses/155372/

Classes meet MWF each week. Most lectures will run 60 minutes, but a few will use the full 70-minute slot. (I will be absent on several class days, and the longer lectures will make up for these absences.)

Instructor: Steve Wright

Office: 4379 CS
Email: swright at cs.wisc
Office Hours: Monday 4-5, Thursday 4-5

General Course Information


  • Prerequisites: Linear algebra, some analysis. See the guidebook for specifics.
  • The course will involve some programming to test algorithms. One useful option is to use Matlab with the free add-on cvx. A second option (particularly appealing if you took 524 recently) is to use Julia with the JuMP optimization toolbox. A third option is to use Python, but I can provide less support for this.
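Whatever language you choose, a standard first step when testing an optimization algorithm is to verify a hand-coded gradient against a finite-difference estimate. A minimal sketch in Python (a hypothetical illustration, not course-supplied code; the test function is made up):

```python
# Compare an analytic gradient with a central finite-difference estimate
# on a small quadratic test function (hypothetical example).

def f(x):
    # f(x) = x1^2 + 3*x1*x2 + 2*x2^2
    return x[0]**2 + 3*x[0]*x[1] + 2*x[1]**2

def grad_f(x):
    # Hand-coded gradient of f.
    return [2*x[0] + 3*x[1], 3*x[0] + 4*x[1]]

def fd_grad(f, x, h=1e-6):
    # Central differences: (f(x + h*e_i) - f(x - h*e_i)) / (2h).
    g = []
    for i in range(len(x)):
        xp = list(x); xp[i] += h
        xm = list(x); xm[i] -= h
        g.append((f(xp) - f(xm)) / (2*h))
    return g

x = [1.0, -2.0]
err = max(abs(a - b) for a, b in zip(grad_f(x), fd_grad(f, x)))
print(err)  # should be tiny, near rounding level
```

If the discrepancy is much larger than the finite-difference truncation error, the analytic gradient is almost certainly wrong; this check catches most derivative bugs before they contaminate convergence experiments.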


  • Text: J. Nocedal and S. J. Wright, Numerical Optimization, Second Edition, Springer, 2006. (It's essential to get the second edition!) Here is the current list of typos.


Other references:
  • D. P. Bertsekas, with A. Nedic and A. Ozdaglar, Convex Analysis and Optimization, Athena Scientific, Belmont, MA, 2003.
  • Y. Nesterov, Introductory Lectures on Convex Optimization, Kluwer, 2004.
  • S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press. Available here.
  • D. P. Bertsekas, Nonlinear Programming, Second Edition, Athena Scientific, Belmont, MA, 1999.
  • R. Fletcher, Practical Methods of Optimization, 2nd Edition, Wiley, Chichester & New York, 1987.
  • R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer, 1998. (This is a more advanced book and an invaluable reference.)
  • A. Ruszczynski, Nonlinear Optimization, Princeton University Press, 2006.
  • S. J. Wright, Primal-Dual Interior-Point Methods, SIAM, 1997.

Course Outline

This outline will likely be adapted as the semester proceeds, but most of the following topics will be covered.

  • Introduction
    • Optimization paradigms and applications
    • Mathematical background: convex sets and functions, linear algebra, topology, convergence rates
  • Smooth unconstrained optimization: Background
    • Taylor's theorem
    • Optimality conditions
  • First-Order Methods
    • Steepest descent. Convergence for convex and nonconvex cases.
    • Accelerated gradient. Convergence for convex case.
    • Line search methods based on descent directions
    • Conjugate gradient methods
    • Conditional gradient for optimization over closed convex sets
  • Higher-order methods
    • Newton's method
    • Line-search Newton
    • Trust-region Newton and cubic regularization
    • Conjugate gradient-Newton
    • Quasi-Newton methods
    • Limited-memory quasi-Newton
  • Stochastic optimization
    • Basic methods and their convergence properties
    • Reduced-variance approaches
  • Differentiation
    • Adjoint calculations
    • Automatic differentiation
  • Least-squares and nonlinear equations
    • Linear least squares: direct and iterative methods
    • Nonlinear least squares: Gauss-Newton, Levenberg-Marquardt
    • Newton's method for nonlinear equations
    • Merit functions for nonlinear equations, and line searches
  • Optimization with linear constraints
    • Normal cones to convex sets
    • Farkas Lemma and first-order optimality conditions (KKT)
    • Gradient projection algorithms
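To give a concrete flavor of the first-order methods listed above, here is a minimal steepest-descent iteration with a backtracking (Armijo) line search on a small convex quadratic. This is an illustrative sketch under an invented test problem, not code supplied with the course:

```python
# Steepest descent with Armijo backtracking on f(x) = 0.5*(x1^2 + 10*x2^2),
# a convex quadratic chosen for illustration (hypothetical test problem).

def f(x):
    return 0.5 * (x[0]**2 + 10.0 * x[1]**2)

def grad(x):
    return [x[0], 10.0 * x[1]]

def steepest_descent(x, tol=1e-8, max_iter=10000):
    for k in range(max_iter):
        g = grad(x)
        gnorm2 = g[0]**2 + g[1]**2
        if gnorm2 < tol**2:
            return x, k
        # Backtracking: halve alpha until the Armijo sufficient-decrease
        # condition f(x - alpha*g) <= f(x) - c1*alpha*||g||^2 holds.
        alpha, c1 = 1.0, 1e-4
        while f([x[0] - alpha*g[0], x[1] - alpha*g[1]]) > f(x) - c1*alpha*gnorm2:
            alpha *= 0.5
        x = [x[0] - alpha*g[0], x[1] - alpha*g[1]]
    return x, max_iter

x_star, iters = steepest_descent([1.0, 1.0])
print(x_star, iters)  # converges to the minimizer (0, 0)
```

The slow linear convergence you will observe on ill-conditioned quadratics like this one (condition number 10 here) is exactly the behavior that motivates the accelerated, conjugate-gradient, and Newton-type methods later in the outline.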

Homework

Homework assignments will be posted on Canvas.

Previous Exams

Note that the curriculum has changed significantly since 2010, so some questions on the earlier exams are not relevant to the current version of the course.

Handouts and Examples

These will be posted as modules on the Canvas site.