CS 764 Topics in Database Management Systems

Lectures: Mon/Wed 1:00pm - 2:15pm
Room: Psychology 103
Instructor: Xiangyao Yu
Office Hours: Mon 2:30pm - 3:30pm

Course description

This course covers a number of advanced topics in the development of database management systems (DBMS) and the modern applications of databases. The topics discussed include query processing and optimization, advanced access methods, advanced concurrency control and recovery, parallel and distributed data systems, implications of cloud computing for data platforms, and data processing with emerging hardware. The course material will be drawn from a number of papers in the database literature. We will cover one paper per lecture. All students are expected to read the paper before coming to the lecture.

Prerequisites: CS 564 or equivalent. If you have concerns about meeting the prerequisites, please contact the instructor.

Reference Textbook: There is no formal textbook for this course. The reading list is a collection of papers. The following two books will be used as references in this course. Note that you do not need to buy the books.

Lecture Format: Each lecture focuses on a classic or modern research paper. Students will read the paper and submit a review to https://wisc-cs764-f21.hotcrp.com before the lecture starts. Here is a sample review for the paper on join processing.

Course projects: A big component of this course is a research project. For the project, you pick a topic in the area of data management systems and explore it in depth. Here is a list of suggested project topics, but you are encouraged to select a project outside of the list. The course project is a group project, and each group must have 2-4 members. Please start looking for project partners right away. The course project will include a project proposal, a short presentation at the end of the semester, and a final project report. The presentations are organized as a workshop. Please see the program information for DAWN 2019 to get an idea of what the workshop looks like. The project has the following deadlines:

Computation resources:

Inclusion Statement: In our class we strive to create an environment where everyone willing to do their part can learn and thrive. You should always feel free to ask a question: asking and pondering questions is how we learn. Being confused is unfailingly an opportunity to advance our knowledge. Please commit to helping create a climate where we treat everyone with dignity and respect. Listening to different viewpoints and approaches enriches our experience, and it is up to us to be sure others feel safe to contribute. Creating an environment where we are all comfortable learning is everyone's job: offer support and seek help from others if you need it, not only in class but also outside class while working with classmates.

Late submission policy: Reviews must be submitted before the lecture starts in order to be graded. You can skip up to 2 reviews without losing points; beyond that, 1% of the total grade (up to 15%) is deducted for each missing review. Please talk to the instructor if you cannot submit the project proposal or report before the deadline.


Lec# Date Topic Reading Slides
1 Wed 9/8 Introduction None L1
Query Processing and Buffer Management
2 Mon 9/13 Join Leonard Shapiro, Join Processing in Database Systems with Large Main Memories. ACM Transactions on Database Systems, 1986
[optional] Laura Haas, et al., Seeking the Truth About ad hoc Join Costs. JVLDB, 1997
[optional] Jaeyoung Do, Jignesh Patel, Join processing for flash SSDs: remembering past lessons. DaMoN, 2009
3 Wed 9/15 Radix Join Peter Boncz, et al., Database Architecture Optimized for the new Bottleneck: Memory Access. VLDB, 1999
[optional] Spyros Blanas, et al., Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs. SIGMOD, 2011
4 Mon 9/20 Buffer Management Hong-Tai Chou, David DeWitt, An Evaluation of Buffer Management Strategies for Relational Database Systems. Algorithmica, 1986
[optional] Jim Gray, Gianfranco R. Putzolu, The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. SIGMOD, 1987
5 Wed 9/22 Buffer with NVM Xinjing Zhou, et al., Spitfire: A Three-Tier Buffer Manager for Volatile and Non-Volatile Memory. SIGMOD, 2021
[optional] Alexander van Renen, et al., Managing Non-Volatile Memory in Database Systems. SIGMOD, 2018
6 Mon 9/27 Query Optimization Patricia G. Selinger, et al., Access Path Selection in a Relational Database Management System. SIGMOD, 1979
[optional] Surajit Chaudhuri, An Overview of Query Optimization in Relational Systems. PODS, 1998
7 Wed 9/29 Distribution Robert Epstein, et al., Distributed Query Processing in a Relational Data Base System. SIGMOD, 1978
[optional] David DeWitt, Jim Gray, Parallel Database Systems: The Future of High Performance Database Processing. Communications of the ACM, 1992
Advanced Transaction Management
8 Mon 10/4 Granularity of Locks Jim Gray, et al., Granularity of Locks and Degrees of Consistency in a Shared Data Base. Modelling in Data Base Management Systems, 1976
9 Wed 10/6 Isolation Hal Berenson, et al., A Critique of ANSI SQL Isolation Levels. SIGMOD Record, 1995
10 Mon 10/11 Optimistic CC H. T. Kung, John T. Robinson, On Optimistic Methods for Concurrency Control. ACM Transactions on Database Systems, 1981
[optional] Per-Ake Larson, et al., High-Performance Concurrency Control Mechanisms for Main-Memory Databases. VLDB, 2011
11 Wed 10/13 Modern OCC Stephen Tu, et al., Speedy transactions in multicore in-memory databases. SOSP, 2013
[optional] Xiangyao Yu, et al., TicToc: Time Traveling Optimistic Concurrency Control. SIGMOD, 2016
12 Mon 10/18 Guest Lecture TBD
13 Wed 10/20 Blink Tree Philip Lehman, S. Bing Yao, Efficient Locking for Concurrent Operations on B-Trees. ACM Transactions on Database Systems, 1981
14 Mon 10/25 Guest Lecture TBD
15 Wed 10/27 Adaptive Radix Tree Viktor Leis, et al., The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases. ICDE, 2013
Yandong Mao, et al., Cache Craftiness for Fast Multicore Key-Value Storage. EuroSys, 2012
16 Mon 11/1 Durability Philip Bernstein, et al., Concurrency Control and Recovery in Database Systems, Chapter 6. Addison-Wesley, 1987
17 Wed 11/3 ARIES C. Mohan, et al., ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging. ACM Transactions on Database Systems, 1992
18 Mon 11/8 Two-Phase Commit C. Mohan, et al., Transaction Management in the R* Distributed Database Management System. ACM Transactions on Database Systems, 1986
[optional] Philip Bernstein, et al., Concurrency Control and Recovery in Database Systems, Chapter 7. Addison-Wesley, 1987
19 Wed 11/10 Exam review sample1 (F11), sample2 (F20), sample3
20 Mon 11/15 Exam Take-home exam.
21 Wed 11/17 Replication Jim Gray, et al., The Dangers of Replication and a Solution. SIGMOD, 1996
22 Mon 11/22 Deterministic DBMS Yi Lu, et al., Aria: A Fast and Practical Deterministic OLTP Database. VLDB, 2020
[optional] Alexander Thomson, et al., Calvin: Fast Distributed Transactions for Partitioned Database Systems. SIGMOD, 2012
Cloud-Native DBMS
23 Wed 11/24 Project Meetings Each group meets with the instructor to discuss the final project.
24 Mon 11/29 Cloud OLTP Donald Kossmann, et al., An Evaluation of Alternative Architectures for Transaction Processing in the Cloud. SIGMOD, 2010
[optional] Matthias Brantner, et al., Building a Database on S3. SIGMOD, 2008
25 Wed 12/1 Amazon Aurora Alexandre Verbitski, et al., Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. SIGMOD, 2017
[optional] Panagiotis Antonopoulos, et al., Socrates: The New SQL Server in the Cloud. SIGMOD, 2019
26 Mon 12/6 Snowflake Benoit Dageville, et al., The Snowflake Elastic Data Warehouse. SIGMOD, 2016
27 Wed 12/8 Pushdown DBMS Yifei Yang, et al., FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. VLDB, 2021
[optional] Xiangyao Yu, et al., PushdownDB: Accelerating a DBMS using S3 Computation. ICDE, 2020
28 Mon 12/13 DAWN Workshop
29 Wed 12/15 DAWN Workshop