CS 764 Topics in Database Management Systems

Lectures: Tue/Thu 2:30pm - 3:45pm
Room: Morgridge Hall 2532
Instructor: Xiangyao Yu
Office Hours: Thu 1:00pm-2:00pm (MH 7584)
Teaching Assistant: Devesh Sarda

Course description

This course covers a number of advanced topics in the development of database management systems (DBMS) and the modern applications of databases. The topics discussed include query processing and optimization, advanced access methods, advanced concurrency control and recovery, parallel and distributed data systems, cloud computing for data platforms, and data processing with emerging hardware. The course material will be drawn from a number of papers in the database literature. We will cover one paper per lecture. All students are expected to read the paper before coming to the lecture.

Prerequisites: CS 564 or equivalent. If you have concerns about meeting the prerequisites, please contact the instructor.

Reference Textbook: There is no formal textbook for this course. The reading list is a collection of papers. The following two books will be used as references in this course. Note you don't need to buy the books.

Lecture Format: Each lecture focuses on a classic or modern research paper. Students will read the paper and submit a review to https://wisc-cs764-f25.hotcrp.com before the lecture starts. Here is a sample review for the paper on join processing.

Course projects: A big component of this course is a research project. For the project, you pick a topic in the area of data management systems, and explore it in depth. Here are lists of suggested project topics created in 2020, 2021, 2022, and 2025 (This Year!); but you are encouraged to select a project outside of the list. The course project is a group project, and each group must be of size 2-4. Please start looking for project partners right away. The course project will include a project proposal, a short presentation at the end of the semester, and a final project report. Here are three sample projects from previous years (sample1, sample2, sample3). The presentations are organized as a workshop. The project has the following deadlines:

Computation resources:

Inclusion Statement: In our class we strive to create an environment where everyone willing to do their part can learn and thrive. You should always feel free to ask a question: asking and pondering questions is how we learn. Being confused is unfailingly an opportunity to advance our knowledge. Please commit to helping create a climate where we treat everyone with dignity and respect. Listening to different viewpoints and approaches enriches our experience, and it is up to us to be sure others feel safe to contribute. Creating an environment where we are all comfortable learning is everyone's job: offer support and seek help from others if you need it, not only in class but also outside class while working with classmates.

Grading
Late submission policy: Reviews must be submitted before the lecture starts in order to be graded. You can skip up to 2 reviews without losing points; beyond that, 1% of the total grade (up to 15%) is deducted for each missing review. Please discuss with the instructor if you cannot submit the project proposal or report before the deadline.


Schedule (WIP)

Lec# Date Topic Reading Slides
1 Thu 9/4 Introduction None L1
Analytical Processing
2 Tue 9/9 Join Leonard Shapiro, Join Processing in Database Systems with Large Main Memories. ACM Transactions on Database Systems, 1986
[optional] Peter Boncz, et al., Database Architecture Optimized for the new Bottleneck: Memory Access. VLDB, 1999
[optional] Spyros Blanas, et al. Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs. SIGMOD, 2011
L2
3 Thu 9/11 Predicate Transfer Yifei Yang, et al., Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries. CIDR 2024
[optional] Junyi Zhao, et al., Debunking the Myth of Join Ordering: Toward Robust SQL Analytics. SIGMOD 2025
[optional] Yifei Yang, Xiangyao Yu, Accelerate Distributed Joins with Predicate Transfer. SIGMOD 2025
L3
4 Tue 9/16 Query Optimization Viktor Leis, et al., How Good Are Query Optimizers, Really?. VLDB, 2015
[optional] Viktor Leis, et al., Still Asking: How Good Are Query Optimizers, Really?. VLDB 2025
[optional] Patricia G. Selinger, et al., Access Path Selection in a Relational Database Management System. SIGMOD, 1979
[optional] Surajit Chaudhuri, An Overview of Query Optimization in Relational Systems. PODS, 1998
L4
5 Thu 9/18 Column Store Mike Stonebraker, et al. C-store: a column-oriented DBMS, VLDB 2005
[optional] Daniel Abadi, et al., Column-stores vs. row-stores: how different are they really?, SIGMOD 2008
L5
6 Tue 9/23 Buffer Management Laurens Kuiper, et al., Robust External Hash Aggregation in the Solid State Age. ICDE 2024
[optional] Hong-Tai Chou, David DeWitt, An Evaluation of Buffer Management Strategies for Relational Database Systems. Algorithmica, 1986
[optional] Jim Gray, Gianfranco R. Putzolu, The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. SIGMOD, 1987
L6
7 Thu 9/25 Parallel Database David DeWitt, Jim Gray, Parallel Database Systems: The Future of High Performance Database Processing. Communications of the ACM, 1992
[optional] Robert Epstein, et al., Distributed Query Processing in a Relational Data Base System. SIGMOD, 1978
L7
8 Tue 9/30 NVIDIA Guest Lecture Title: GPU Databases - The New Modality of Database Analytics
Abstract: This lecture will explore why GPU databases are currently at a tipping point by examining recent hardware and software trends. We will cover Sirius, an open-source, GPU-native SQL engine developed collaboratively at UW-Madison and NVIDIA. Finally, we will review related work in the field and discuss emerging trends in GPU-accelerated database systems.
Bio: Bobbi Yogatama is a Systems Software Engineer at NVIDIA working on GPU databases. He received his PhD from the University of Wisconsin-Madison, where he worked with Prof. Xiangyao Yu as part of the Wisconsin Database Group. He is the recipient of the Anthony C Klug NCR Fellowship in Database Systems and is a two-time NVIDIA Graduate Fellowship Finalist.

Bobbi Yogatama, et al., Rethinking Analytical Processing in the GPU Era. arXiv 2025
[optional] Anil Shanbhag, et al., A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics. SIGMOD, 2020
[optional] Bobbi Yogatama, et al. Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs. VLDB, 2024
[optional] Anil Shanbhag, et al. Tile-based Lightweight Integer Compression in GPU. SIGMOD, 2022
[optional] Bobbi Yogatama, et al. Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS. VLDB 2022
9 Thu 10/2 Snowflake Benoit Dageville, et al., The Snowflake Elastic Data Warehouse. SIGMOD, 2016
[optional] Midhul Vuppalapati, et al., Building An Elastic Query Engine on Disaggregated Storage. NSDI, 2020
[optional] Jan Vincent Szlang, et al., Workload Insights From The Snowflake Data Cloud: What Do Production Analytic Queries Really Look Like?. VLDB 2025
L9
10 Tue 10/7 Snowflake Guest Lecture Title: Dynamic Table Overview
Abstract: In this talk, we will provide an overview of Snowflake Dynamic Table, covering its key features and some implementation details. Finally, we will introduce a newly released feature, Immutability Constraint.
Bio: Primary speakers are Nikhil Shah (Director, Software Engineering) & Ling Geng (Senior Software Engineer), with support from Nancy Huynh (University Recruiter)
11 Thu 10/9 Pushdown DBMS Yifei Yang, et al., FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. VLDB, 2021
[optional] Yifei Yang, et al., FlexpushdownDB: rethinking computation pushdown for cloud OLAP DBMSs. VLDB Journal, 2024
[optional] Xiangyao Yu, et al., PushdownDB: Accelerating a DBMS using S3 Computation. ICDE, 2020
L11
12 Tue 10/14 HTAP Elena Milkai, Xiangyao Yu, Jignesh Patel, Hermes: Off-the-Shelf Real-Time Transactional Analytics. VLDB, 2025
[optional] Elena Milkai, et al., How good is my HTAP system?. SIGMOD, 2022
L12
Transaction Processing
13 Thu 10/16 Transaction Buffer Management Xinjing Zhou, et al., Two is Better Than One: The Case for 2-Tree for Skewed Data Sets. CIDR 2023
[optional] Viktor Leis, et al., LeanStore: In-Memory Data Management Beyond Main Memory. ICDE 2018
[optional] Justin DeBrabant, et al., Anti-Caching: A New Approach to Database Management System Architecture. VLDB, 2013
[optional] Ahmed Eldawy, et al., Trekking Through Siberia: Managing Cold Data in a Memory-Optimized Database. VLDB 2014
L13
14 Tue 10/21 Blink Tree Philip Lehman, S. Bing Yao, Efficient Locking for Concurrent Operations on B-Trees. ACM Transactions on Database Systems, 1981
[optional] Viktor Leis, et al. Optimistic Lock Coupling: A Scalable and Efficient General-Purpose Synchronization Method. IEEE Data Eng. Bull. 2019
L14
15 Thu 10/23 Adaptive Radix Tree Viktor Leis, et al., The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases. ICDE, 2013
[optional] Yandong Mao, et al., Cache Craftiness for Fast Multicore Key-Value Storage. EuroSys, 2012
L15
16 Tue 10/28 Granularity of Locks Jim Gray, et al., Granularity of Locks and Degrees of Consistency in a Shared Data Base. Modelling in Data Base Management Systems, 1976 L16
17 Thu 10/30 Optimistic CC H. T. Kung, John T. Robinson, On Optimistic Methods for Concurrency Control. ACM Transactions on Database Systems, 1981
[optional] Stephen Tu, et al., Speedy transactions in multicore in-memory databases. SOSP, 2013
[optional] Xiangyao Yu, et al., TicToc: Time Traveling Optimistic Concurrency Control. SIGMOD, 2016
L17
18 Tue 11/4 Exam Exam F20, Exam F21 (Solution), Exam F22 (Solution)
19 Thu 11/6 No class
20 Tue 11/11 Isolation Hal Berenson, et al., A Critique of ANSI SQL Isolation Levels. SIGMOD Record, 1995
[optional] James C. Corbett, et al., Spanner: Google's globally distributed database. OSDI, 2012
L20
21 Thu 11/13 ARIES C. Mohan, et al. ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM Transactions on Database Systems, 1992
[optional] Philip Bernstein, et al., Concurrency Control and Recovery in Database Systems, Chapter 6. Addison-Wesley, 1987
L21
22 Tue 11/18 Two-Phase Commit C. Mohan, et al., Transaction Management in the R* Distributed Database Management System. ACM Transactions on Database Systems, 1986
[optional] Zhihan Guo, et al., Cornus: Atomic Commit for a Cloud DBMS with Storage Disaggregation. arXiv 2102.10185, 2022
[optional] Jim Gray, Leslie Lamport, Consensus on Transaction Commit. ACM Transactions on Database Systems 31(1): 133-160, 2006
L22
23 Thu 11/20 Deterministic DBMS Yi Lu, et al., Aria: A Fast and Practical Deterministic OLTP Database. VLDB, 2020
[optional] Alexander Thomson, et al., Calvin: Fast Distributed Transactions for Partitioned Database Systems. SIGMOD, 2012
L23
24 Tue 11/25 Amazon Aurora Alexandre Verbitski, et al., Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. SIGMOD, 2017
[optional] Panagiotis Antonopoulos, et al., Socrates: The New SQL Server in the Cloud. SIGMOD, 2019
25 Thu 11/27 No class
26 Tue 12/2 Workshop
27 Thu 12/4 Workshop
28 Tue 12/9 No class