CS 639 Spring 2026: Foundation Models

Logistics

Time: TR 11:00 AM - 12:15 PM
Location: Chemistry S413
Instructor: Frederic Sala
Instructor Office Hours: Thursday 2:30-4:00 pm
Instructor Office Location: 5514 Morgridge Hall
TAs: Dyah Adila, Sonia Cromp, Samuel Guo, Akshat Singhal, Yichen Wang
TA Office Hours: Calendar
TA Office Location: Check schedule and location in the calendar link above
Piazza: link
Homework Submission: Canvas

Note: for email, please put [CS639] in the subject line. Thanks!

Course Description

Large pretrained machine learning models, also known as foundation models, have taken the world by storm. Models like the GPT family, Claude, and Gemini have astonishing abilities to answer questions, speak with users, and generate sophisticated content. This course covers all aspects of these fascinating models. We will start with a broad review/introduction to modern neural networks and artificial intelligence. We will then learn how foundation models are built, including model architectures, pre-training, post-training, and adaptation. Next, a significant focus is how to use and deploy foundation models, including prompting strategies, providing in-context examples, fine-tuning, integrating into existing data science pipelines, and more. We discuss recent advances in applying foundation and large language models, including reasoning, agents, and systems built on top of LLMs. Finally, we cover the potential societal impacts of these models.

This course will cover the following topics:

Short introduction to machine learning & neural networks: supervised learning, self-supervised learning, deep neural networks, training, evaluation
Building blocks: transformers and attention, emerging non-transformer architectures
Model families & modalities: encoders, decoders, multimodal components
Pretraining & data: training objectives, data, scaling laws
Post-training, alignment, & reasoning: RLHF and variants, chain-of-thought, RL with verifiable rewards for reasoning
Using foundation models: prompting and in-context learning, parameter-efficient tuning, quantization, RAG
Agents: tool use, planning and memory, training and evaluation in interactive environments
Evaluation, safety, and deployment: benchmarks & LLM-as-a-judge, robustness/privacy/red-teaming, societal impacts

A sampling of the papers we will read, understand, and present in this course can be found on the schedule page.

Prerequisites

We assume students have familiarity with basic machine learning. The prerequisites are:

(CS 320 or 400) and (Math 320, 340, 341, 345 or 375), or
CS 540, or
Graduate/Professional Standing

Grading

The grading for the course will be based on the following (tentative, subject to change):

Homework Assignments (approximately 6 anticipated): 50%
Midterm Exam: 20%
Final Project: 30%

Project Policies

The projects will be done in groups of about 5-10 students and will include proposals and check-ins with the instructor. More details will be presented during class.

The goal of the project is to identify a suitable problem in understanding, applying, or extending foundation models and to propose and validate an idea tackling this problem. Note: compute limitations are likely to be a factor; discuss plans with the instructor well in advance :)

General Homework Policies and Academic Misconduct

All homework assignments must be done individually. Cheating and plagiarism will be dealt with in accordance with University procedures (see the Academic Misconduct Guide for Students).