CS 639, Spring 2026
Department of Computer Sciences
University of Wisconsin–Madison
Large pretrained machine learning models, also known as foundation models, have taken the world by storm. Models like the GPT family, Claude, and Gemini have astonishing abilities to answer questions, converse with users, and generate sophisticated content. This course covers all aspects of these fascinating models. We will start with a broad review/introduction to modern neural networks and artificial intelligence. We will then learn how foundation models are built, including model architectures, pre-training, post-training, and adaptation. Next, a significant focus is on how to use and deploy foundation models, including prompting strategies, providing in-context examples, fine-tuning, integrating them into existing data science pipelines, and more. We will discuss recent advances in applying foundation and large language models, including reasoning, agents, and systems built on top of LLMs. Finally, we will cover the potential societal impacts of these models.
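As a small taste of the "in-context examples" topic mentioned above, here is a minimal sketch of few-shot prompting: rather than updating model weights, labeled demonstrations are prepended to a query so the model can continue the pattern. The sentiment task, examples, and formatting below are purely illustrative, not part of any course assignment:

```python
# Minimal sketch of few-shot (in-context) prompting. The task (sentiment
# classification) and the demonstrations are made up for illustration.

def build_few_shot_prompt(examples, query):
    """Format (input, label) demonstrations followed by the new query."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The final entry leaves the label blank for the model to complete.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]
prompt = build_few_shot_prompt(demos, "A delightful surprise.")
print(prompt)
```

The resulting string would then be sent to a language model; the model's completion after the final "Sentiment:" serves as the prediction.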
This course will cover the following topics:
A sampling of the papers we will read, understand, and present in this course can be found on the schedule page.
We assume students have familiarity with basic machine learning. The prerequisites are:
The grading for the course will be based on (tentative, subject to change):
The projects will be done in groups of about 5-10 students and will include proposals and check-ins with the instructor. More details will be presented during class.
The goal of the project is to identify a suitable problem in understanding, applying, or extending foundation models and to propose and validate an idea tackling this problem. Note: compute limitations are likely to be a factor; discuss plans with the instructor well in advance :)
All homework assignments must be done individually. Cheating and plagiarism will be dealt with in accordance with University procedures (see the Academic Misconduct Guide for Students).