CS762 Advanced Deep Learning

CS762, Fall 2023
Department of Computer Sciences
University of Wisconsin–Madison


Tentative Schedule (Subject to Change)

Week Date Topic Reading materials Assignments
1 Thursday, September 7 Course overview and introduction
Sign up for paper presentations and scribes (link to Google sheet)
2 Tuesday, September 12 Evolution of Neural Architecture (lecture) D2L Book Chapter 7 & 10
2 Thursday, September 14 Evolution of Neural Architecture II LLM architectures
Attention Is All You Need (deep dive)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
3 Tuesday, September 19 Evolution of Neural Architecture III LLM architectures
Attention Is All You Need
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (deep dive)
3 Thursday, September 21 Evolution of Neural Architecture IV Vision Transformers
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
End-to-End Object Detection with Transformers
How Do Vision Transformers Work? (deep dive)
4 Tuesday, September 26 AI Safety and Alignment (lecture)
Team registration (link to Google sheet)
4 Thursday, September 28 AI Safety and Alignment II Distributional shift in the wild
Energy-based Out-of-distribution Detection
How to Exploit Hyperspherical Embeddings for Out-of-Distribution Detection?
Training OOD Detectors in their Natural Habitats
Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection (deep dive)
5 Tuesday, October 3 No class (use the time to discuss the project proposal with your team)
Project proposal due October 7 at midnight (download the proposal LaTeX template here)
5 Thursday, October 5 AI Safety and Alignment III Alignment problem (method)
Fine-Tuning Language Models from Human Preferences
Training Language Models to Follow Instructions with Human Feedback (deep dive)
Constitutional AI: Harmlessness from AI Feedback
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Preference Ranking Optimization for Human Alignment
6 Tuesday, October 10 Meetings with Instructor to Review Project Proposals (optional)
6 Thursday, October 12 AI Safety and Alignment IV (no class, self-reading session) Alignment problem (challenges and limitations)
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Scaling Laws for Reward Model Overoptimization
Fundamental Limitations of Alignment in LLMs
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (self read)
7 Tuesday, October 17 AI Safety and Alignment V Jailbreaks
Universal and Transferable Adversarial Attacks on Aligned Language Models (deep dive)
Jailbroken: How Does LLM Safety Training Fail?
7 Thursday, October 19 Interpretable Deep Learning I (lecture)
8 Tuesday, October 24 Interpretable Deep Learning II Learning Deep Features for Discriminative Localization
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
Locating and Editing Factual Associations in GPT (deep dive)
Mass Editing Memory in a Transformer
8 Thursday, October 26 Foundation models I (Lecture)
9 Tuesday, October 31 Foundation Models II Large-scale pre-training
Exploring the Limits of Weakly Supervised Pretraining
Learning Transferable Visual Models From Natural Language Supervision
Exploring the Limits of Large Scale Pre-training
LLaMA: Open and Efficient Foundation Language Models
Llama 2: Open Foundation and Fine-Tuned Chat Models (deep dive)
9 Thursday, November 2 Foundation Models III Emergent behaviors
Scaling Laws for Neural Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
An Explanation of In-Context Learning as Implicit Bayesian Inference (deep dive)
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
10 Tuesday, November 7 Foundation Models IV Parameter-efficient adaptation
LoRA: Low-Rank Adaptation of Large Language Models (deep dive)
Prefix-Tuning: Optimizing Continuous Prompts for Generation
The Power of Scale for Parameter-Efficient Prompt Tuning
Learning to Prompt for Vision-Language Models
10 Thursday, November 9 Continual / Lifelong Learning I (Lecture)
11 Tuesday, November 14 Continual / Lifelong Learning II Learning without Forgetting
iCaRL: Incremental Classifier and Representation Learning
Overcoming catastrophic forgetting in neural networks
Dark Experience for General Continual Learning: a Strong, Simple Baseline
Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning (deep dive)
11 Thursday, November 16 Deep Generative Model I (Lecture) Goodfellow-Bengio-Courville Chapter 20
12 Tuesday, November 21 Deep Generative Model II Foundations
Denoising Diffusion Probabilistic Models (deep dive)
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
Generative Modeling by Estimating Gradients of the Data Distribution
Improved Techniques for Training Score-Based Generative Models
12 Thursday, November 23 No class (Thanksgiving)
13 Tuesday, November 28 Deep Generative Model III Applications
High-Resolution Image Synthesis with Latent Diffusion Models (deep dive)
Hierarchical Text-Conditional Image Generation with CLIP Latents
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
14 Tuesday, December 5 Final project presentation (Part I)
14 Thursday, December 7 Final project presentation (Part II)
Monday, December 18 Final project written report due (by end of the day)