Lectures: Mon/Wed 2:30pm - 3:45pm
Room: ENGR HALL 2255
Instructor: Xiangyao Yu
Office Hour: Mon 4:00pm - 5:00pm (CS 4361)
Modern applications are moving to the cloud for global accessibility, elasticity, high availability, and low cost. Databases are one of the foundational technologies for cloud applications. Compared to traditional on-premises databases, cloud-native databases have unique architectures (e.g., storage disaggregation), embrace heterogeneous hardware technologies (e.g., GPU, CXL, SmartNIC), and face new application scenarios (e.g., serverless, autoscaling). This seminar course covers recent developments in cloud-native databases in both industrial deployment and academic research. Each lecture features presentations from the instructor and students, and group discussions. The course has a final group project.
Prerequisites: CS 564 or equivalent. If you have concerns about meeting the prerequisites, please contact the instructor. There is no formal textbook for this course.
Lecture Format: Each lecture focuses on multiple papers under the same topic. Students will read at least one paper from the pool and submit a review to https://wisc-cs839-f23.hotcrp.com before the lecture starts (if you are presenting in a lecture, you do not need to submit a review for that lecture). The lecture includes a mixture of presentations from both the lecturer and the students and concludes with a group discussion. Please sign up for paper presentation slots following this link.
Course projects: A big component of this course is a research project. For the project, you pick a topic in the area of data management systems, and explore it in depth. Here are lists of project ideas created for CS764 in previous years 2020, 2021, and 2022; many of these ideas are related to cloud databases. Here is a new list of ideas created for this course. You are also encouraged to select a project outside of the lists. The course project is a group project, and each group must be of size 2-4. Please start looking for project partners right away. The course project will include a project proposal, a short presentation at the end of the semester, and a final project report. Here are three sample projects from previous CS764 (sample1, sample2, sample3); the expectation for CS839 projects will be similar to CS764 projects. The presentations will be organized as a workshop. The project has the following deadlines:
Computation resources:
Inclusion Statement: In our class we strive to create an environment where everyone willing to do their part can learn and thrive. You should always feel free to ask a question: asking and pondering questions is how we learn. Being confused is unfailingly an opportunity to advance our knowledge. Please commit to helping create a climate where we treat everyone with dignity and respect. Listening to different viewpoints and approaches enriches our experience, and it is up to us to be sure others feel safe to contribute. Creating an environment where we are all comfortable learning is everyone's job: offer support and seek help from others if you need it, not only in class but also outside class while working with classmates.
Lec# | Date | Topic | Reading | Slides |
---|---|---|---|---|
1 | Wed 9/6 | Introduction | None | L1 |
Storage Disaggregation | ||||
2 | Mon 9/11 | Aurora | Alexandre Verbitski, et al., Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases. SIGMOD, 2017 | L2 |
3 | Wed 9/13 | Snowflake | Benoit Dageville, et al., The Snowflake Elastic Data Warehouse. SIGMOD, 2016 | L3 |
4 | Mon 9/18 | Analytical Processing-1 | Yifei Yang, et al., FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS. VLDB, 2021; Xiangyao Yu, et al., PushdownDB: Accelerating a DBMS using S3 Computation. ICDE, 2020; Cai, Mengchu, et al. Integrated Querying of SQL database data and S3 data in Amazon Redshift. IEEE Data Eng. Bull. 2018 | L4 (L4-1, L4-2) |
5 | Wed 9/20 | Analytical Processing-2 | Vuppalapati, Midhul, et al. Building an elastic query engine on disaggregated storage. NSDI, 2020; Melnik, Sergey, et al. Dremel: interactive analysis of web-scale datasets. VLDB, 2010; Melnik, Sergey, et al. Dremel: A decade of interactive SQL analysis at web scale. VLDB, 2020; Armbrust, Michael, et al. Lakehouse: a new generation of open platforms that unify data warehousing and advanced analytics. CIDR, 2021 | L5 (L5-1, L5-2, L5-3) |
6 | Mon 9/25 | Guest Lecture | Title: S3: an overview of the internal architecture. Abstract: S3 is a highly scalable and highly durable object store. In this talk we will present the high-level architecture of S3 and dive into how the storage system achieves these goals while keeping costs low. We will talk about some of the design philosophies behind S3 and give a primer on Reed-Solomon erasure codes. Bio: Jaso has been a software developer for many years. He earned his BS in Computer Science at UW-Madison and his master's at Johns Hopkins. He is currently in his second year as a PhD student at UW-Madison. Before returning to academia, he spent 17 years at Amazon.com working on various systems, including S3, DynamoDB, Timestream, and AWS IoT, to name a few. | |
7 | Wed 9/27 | Transaction Processing-1 | Panagiotis Antonopoulos, et al., Socrates: The New SQL Server in the Cloud. SIGMOD, 2019; Corbett, James C., et al. Spanner: Google's globally distributed database. OSDI, 2012; Lomet, David, et al. Unbundling transaction services in the cloud. CIDR, 2009 | L7 (L7-1, L7-2) |
8 | Mon 10/2 | Transaction Processing-2 | Zhou, Jingyu, et al. FoundationDB: A distributed unbundled transactional key value store. SIGMOD, 2021; Guo, Zhihan, et al. Cornus: atomic commit for a cloud DBMS with storage disaggregation. VLDB, 2022; Peng, Daniel, and Frank Dabek. Large-scale incremental processing using distributed transactions and notifications. OSDI, 2010 | L8 (L8-1, L8-2) |
9 | Wed 10/4 | Transaction Processing-3 | Taft, Rebecca, et al. CockroachDB: The resilient geo-distributed SQL database. SIGMOD, 2020; Yang, Zhenkun, et al. OceanBase: a 707 million tpmC distributed relational database system. VLDB, 2022; Cao, Wei, et al. PolarDB-X: An Elastic Distributed Relational Database for Cloud-Native Applications. ICDE, 2022 | L9 (L9-1, L9-2, L9-3) |
Serverless | ||||
10 | Mon 10/9 | Project Meetings | Meeting with the instructor to discuss the course project. | |
11 | Wed 10/11 | Database Affiliates Workshop | Attend the Wisconsin Database Affiliates Workshop on 10/12 (optional) and 10/13 (required). | |
12 | Mon 10/16 | Serverless-1 | Gaffney, Kevin P., et al. SQLite: past, present, and future. VLDB, 2022; Raasveldt, Mark, and Hannes Mühleisen. DuckDB: an embeddable analytical database. SIGMOD, 2019 | L12 (L12-1, L12-2) |
13 | Wed 10/18 | Serverless-2 | Perron, Matthew, et al. Starling: A scalable query engine on cloud functions. SIGMOD, 2020; Müller, Ingo, Renato Marroquín, and Gustavo Alonso. Lambada: Interactive data analytics on cold data using serverless cloud infrastructure. SIGMOD, 2020 | L13 (L13-1, L13-2) |
14 | Mon 10/23 | Serverless-3 | Sreekanti, Vikram, et al. Cloudburst: Stateful functions-as-a-service. VLDB, 2020; Hellerstein, Joseph M., et al. Serverless computing: One step forward, two steps back. CIDR, 2019; Arun Ulagaratchagan. Introducing Microsoft Fabric: Data analytics for the era of AI. Blog post, 2023 | L14 (L14-1, L14-2) |
15 | Wed 10/25 | Serverless-4 | Johann Schleier-Smith. Understanding and Exploring Serverless Cloud Computing (Sections 2.1-2.6). Technical Report No. UCB/EECS-2022-273, 2022; Cao, Wei, et al. PolarDB Serverless: A cloud native database for disaggregated data centers. SIGMOD, 2021; Jonas, Eric, et al. Cloud programming simplified: A Berkeley view on serverless computing. Technical Report No. UCB/EECS-2019-3, 2019 | L15 (L15-1, L15-2, L15-3) |
16 | Mon 10/30 | DBOS | Skiadopoulos, Athinagoras, et al. DBOS: a DBMS-oriented Operating System. VLDB, 2022; Kraft, Peter, et al. Apiary: A DBMS-Backed Transactional Function-as-a-Service Framework. arXiv preprint arXiv:2208.13068, 2022; Cafarella, Michael, et al. DBOS: A proposal for a data-centric operating system. arXiv preprint arXiv:2007.11112, 2020; Li, Qian, et al. R3: Record-Replay-Retroaction for Database-Backed Applications. VLDB, 2023 | L16 (L16-1, L16-2, L16-3) |
17 | Wed 11/1 | Auto-scaling | Zhu, Yiwen, et al. Towards Building Autonomous Data Services on Azure. SIGMOD, 2023; Wu, Chenggang, Vikram Sreekanti, and Joseph M. Hellerstein. Autoscaling tiered cloud storage in Anna. VLDB, 2019; Poppe, Olga, et al. Moneyball: proactive auto-scaling in Microsoft Azure SQL database serverless. VLDB, 2022; Das, Sudipto, et al. Albatross: Lightweight elasticity in shared storage databases for the cloud using live data migration. VLDB, 2011 | L17 (L17-1, L17-2, L17-3) |
18 | Mon 11/6 | Guest Lecture | Title: Build an open source, high performance, cloud native time series database. Abstract: With the growth of IoT and industrial Internet, time series databases have become more and more popular. Based on the characteristics of time series data, the TDengine team proposed a unique data model of "one table for one data collection point". Benchmark results show that this model dramatically boosts database performance in terms of data ingestion rate, query latency and data compression ratio. In addition, through another innovative concept called the "Super Table", TDengine makes aggregating millions of tables very efficient. Through its native distributed design, storage and computing separation, and RAFT-based data replication, TDengine provides very good scalability, elasticity, and resilience. It can support over one billion connected devices and 100 nodes without any performance deterioration. And with its good observability and cloud deployment tools, TDengine is a true cloud native time series database. TDengine was open sourced in 2019, and the cloud native edition was open sourced in 2022. At present, it has gained over 21,000 stars on GitHub and over 400,000 installations from over 50 countries. It has been widely used in smart manufacturing, clean energy, oil/gas, mining, connected vehicles and more industries. Bio: Jeff Tao is the founder and CEO of TDengine. He has a background as a technologist and serial entrepreneur, having previously conducted research and development on mobile Internet at Motorola and 3Com and established two successful tech startups. Foreseeing the explosive growth of time-series data generated by machines and sensors now taking place, he founded TDengine in May 2017 to develop an open source, high performance, cloud native time series database purpose-built for modern Industry 4.0 and Industrial IoT businesses. | |
19 | Wed 11/8 | Multi-cloud | Chasins, Sarah, et al. The sky above the clouds. arXiv preprint arXiv:2205.07147, 2022; Durner, Dominik, Viktor Leis, and Thomas Neumann. Exploiting Cloud Object Storage for High-Performance Analytics. VLDB, 2023; Jain, Paras, et al. Skyplane: Optimizing Transfer Cost and Throughput Using Cloud-Aware Overlays. NSDI, 2023; Yang, Zongheng, et al. SkyPilot: An Intercloud Broker for Sky Computing. NSDI, 2023; What are public, private, and hybrid clouds? Microsoft, 2023; Flexible, resilient, secure IT for your hybrid cloud. IBM, 2023; Public Cloud vs Private Cloud vs Hybrid Cloud. MongoDB | L19 (L19-1, L19-2, L19-3) |
20 | Mon 11/13 | Project Meetings | Meeting with the instructor to discuss the course project. | |
21 | Wed 11/15 | Auto-tuning | Van Aken, Dana, et al. Automatic database management system tuning through large-scale machine learning. SIGMOD, 2017; Pavlo, Andrew, et al. Self-Driving Database Management Systems. CIDR, 2017; Kanellis, Konstantinos, et al. LlamaTune: Sample-Efficient DBMS Configuration Tuning. VLDB, 2022 | L21 (L21-1, L21-2, L21-3) |
22 | Mon 11/20 | HTAP | Prout, Adam, et al. Cloud-Native Transactions and Analytics in SingleStore. SIGMOD, 2022; Yang, Jiacheng, et al. F1 Lightning: HTAP as a Service. VLDB, 2020; Chen, Jianjun, et al. ByteHTAP: ByteDance's HTAP system with high data freshness and strong data consistency. VLDB, 2022; Huang, Dongxu, et al. TiDB: a Raft-based HTAP database. VLDB, 2020; HTAP: Hybrid Transactional and Analytical Processing. Snowflake, 2023 | L22 (L22-1, L22-2, L22-3, L22-4) |
New Hardware | ||||
23 | Wed 11/22 | GPU database | Anil Shanbhag, et al., A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics. SIGMOD, 2020; Anil Shanbhag, et al. Tile-based Lightweight Integer Compression in GPU. SIGMOD, 2022; Bobbi Yogatama, et al. Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS. VLDB, 2022; Cao, Jiashen, et al. Revisiting Query Performance in GPU Database Systems. arXiv, 2023 | L23 (L23-1, L23-2, L23-3, L23-4) |
24 | Mon 11/27 | Memory Disaggregation | Li, Huaicheng, et al. Pond: CXL-based memory pooling systems for cloud platforms. ASPLOS, 2023; Zhang, Qizhen, et al. Redy: remote dynamic memory cache. VLDB, 2021; Wang, Ruihong, et al. The case for distributed shared-memory databases with RDMA-enabled memory disaggregation. VLDB, 2023; Zhang, Qizhen, et al. CompuCache: Remote computable caching using spot VMs. CIDR, 2022; Lim, Kevin, et al. Disaggregated memory for expansion and sharing in blade servers. ISCA, 2009 | L24 (L24-1, L24-2, L24-3) |
25 | Wed 11/29 | RDMA | Binnig, Carsten, et al. The end of slow networks: It's time for a redesign. VLDB, 2016; Zamanian, Erfan, et al. The end of a myth: Distributed transactions can scale. VLDB, 2017; Barthels, Claude, et al. Rack-scale in-memory join processing using RDMA. SIGMOD, 2015; Rödiger, Wolf, et al. High-speed query processing over high-speed networks. VLDB, 2015 | L25 (L25-1, L25-2, L25-3) |
26 | Mon 12/4 | SmartNIC | Lin, Jiaxin, et al. Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs. SIGMETRICS, 2023; Liu, Ming, et al. Offloading distributed applications onto SmartNICs using iPipe. SIGCOMM, 2019; Schuh, Henry N., et al. Xenic: SmartNIC-accelerated distributed transactions. SOSP, 2021 | L26 (L26-1, L26-2, L26-3) |
27 | Wed 12/6 | No Lecture | ||
28 | Mon 12/11 | Project Presentation | ||
29 | Wed 12/13 | Project Presentation | ||