CS839 Theoretical Foundations of Deep Learning
Spring 2023
Department of Computer Sciences
University of Wisconsin–Madison
Deep learning has been the main driving force behind many modern intelligent systems and has achieved great success in applications such as image processing, speech recognition, and game playing. However, the fundamental questions about why deep learning is so successful remain largely open. The goal of this course is to study and build the theoretical foundations of deep learning. Topics covered include, but are not limited to, the approximation power of neural networks, optimization for deep learning, and generalization analysis of deep learning. The instructor will give lectures on selected topics. Students will present and discuss papers on the reading list and complete a course project.
The course will consist mostly of reading and discussing important recent papers on the theoretical analysis of deep learning, along with some homework assignments and a course project.
(CS760 or CS761 or CS861) AND (strong math background in machine learning, statistics and optimization)
The course is intended as advanced study and will not review background material. In particular, CS760 background is helpful but not sufficient; additional mathematical background beyond CS760 is needed. You are expected to be familiar with the analysis tools in the following textbooks (or others at a similar level):
Time: Tuesday and Thursday 11:00am - 12:15pm
Location: Engineering Hall 2309
Office hours: Th 2-3pm, CS Building Room 5387
The following weighted sum is used to compute the final average score:
There will be roughly 5 homework assignments. Homework must be written in LaTeX. Unless indicated otherwise, you may discuss the homework with other students but must complete it by yourself. If you discuss with others, please indicate that in your submission; if you consult external materials such as Internet posts, please cite them.
Students are required to do a project in this class, since the goal of the course is to provide the opportunity to explore the frontier of recent theoretical studies of deep learning. A project guideline will be provided to specify the details. Roughly speaking, projects should be proposed by the proposal deadline (expected to be around the midterm; the exact date will be specified in class). A PDF report (written in LaTeX) should be submitted by the project deadline. The report should be in the style of a conference paper, providing an introduction and motivation, a discussion of related work, a description of your work that is detailed enough for the work to be replicated, and a conclusion.
Project topics include, but are not limited to:
The ideal outcome is a report publishable in a major machine learning or theory conference or journal. Work that a student has already published cannot be used as the course project.