Xiangyao Yu's Homepage

My research focuses on three areas: (I) Cloud-native databases, (II) New hardware for databases, and (III) Core DB techniques. Below are some sample projects.

Research Area I: Cloud-Native Databases

Databases are moving to the cloud, driven by desirable properties such as elasticity, high availability, and cost competitiveness. Modern cloud-native databases adopt a unique storage-disaggregation architecture, where computation and storage are decoupled. This architecture brings new challenges (e.g., network bandwidth bottlenecks) and opportunities in DBMS design.

Cloud-native data warehouse:

Cloud-DW [VLDB'19]: Evaluation of several popular cloud-native data warehouse systems that have different architectures.
PushdownDB [code][ICDE'20]: A cloud-native OLAP system that leverages AWS S3 Select to push down selection, projection, and aggregation to speed up query processing.
FlexPushdownDB [code][VLDB'21]: A cloud-native OLAP DBMS that combines caching and pushdown at a fine granularity in a storage-disaggregation architecture.
FlexPushdownDB journal [code][VLDBJ'24]: Extending FPDB to support advanced pushdown operators (e.g., Bloom filter, selection bitmap, and shuffle) and adaptive pushdown, which pushes tasks back to compute servers when storage layer computation is limited.

Cloud-native transaction processing:

Litmus [code][SIGMOD'22]: A DBMS that provides verifiable proofs of atomicity and serializability for transactions, through the codesign of database and cryptographic tools.
Cornus [code][VLDB'22]: An optimized two-phase commit protocol in a cloud-native database. Cornus reduces 2PC latency and eliminates blocking by leveraging the unique architectural features of storage disaggregation.
Epoxy [code][VLDB'23]: Epoxy provides ACID transactions across heterogeneous data stores (e.g., MongoDB, ElasticSearch, GCS, MySQL) to simplify cloud application development.
R^3 [code][VLDB'23]: R^3 is a Record-Replay-Retroaction tool that simplifies debugging database-backed applications. R^3 can replay an application in the same order as the original execution; it also enables retroaction, allowing the replay to run modified code instead of the original code.

Research Area II: New Hardware for Databases

GPU database: GPUs are a promising solution for data analytics, driven by the rapid growth of GPU computation power, GPU memory capacity and bandwidth, and PCIe bandwidth. We investigate techniques that can fully unleash the power of GPUs in online analytical processing (OLAP) databases.

Crystal [code][SIGMOD'20]: A library that can run full SQL queries on the GPU and saturate GPU memory bandwidth.
GPU-compression [code][SIGMOD'22]: A highly optimized GPU compression scheme that achieves high compression ratio and fast decompression speed.
Mordred [code][VLDB'22]: A heterogeneous CPU-GPU query execution engine that optimizes data placement (i.e., semantic-aware caching) and query execution (i.e., segment-level query execution).
GPU-UDAF [DaMoN@SIGMOD'23]: This work optimizes user-defined aggregate functions (UDAFs) in cuDF through a block-wide execution model and just-in-time (JIT) compilation, achieving a 3600x speedup. The work has been fully integrated and released in NVIDIA RAPIDS cuDF version 23.02.

Advanced network technologies: The network is a bottleneck in distributed databases. Emerging network technologies, including RDMA, SmartNICs, and programmable switches, support different levels of computation within the network and are promising for accelerating distributed databases.

Active-memory [VLDB'19]: Active-memory replication is a new high-availability scheme that leverages RDMA to directly update replicas' memory and eliminate the computation overhead of log replay.
SmartShuffle [SIGMETRICS'23]: Accelerates data analytics by offloading various computation tasks to the SmartNIC. SmartShuffle outperforms Spark RDMA by up to 40% on TPC-H.

Research Area III: Core DB Techniques

Scalable transaction processing on multicore CPUs: Computer architectures are moving towards manycore machines with dozens or even hundreds of cores on a single chip. We develop new techniques for modern database management systems (DBMSs) to make transaction processing scalable for this level of massive parallelism.

DBx1000 [code][VLDB'14]: Scalability analysis of seven classic concurrency control protocols on a simulated 1000-core CPU.
TicToc [code][SIGMOD'16]: A scalable timestamp-based concurrency control protocol that resolves the timestamp allocation bottleneck through data-driven timestamp management.
Taurus [code][VLDB'20]: A lightweight parallel logging scheme that avoids the central logging bottleneck by writing to multiple log streams.
Bamboo [code][SIGMOD'21]: An optimized two-phase locking (2PL) protocol that mitigates hotspot overhead by releasing locks early during transaction execution.
Plor [code][SIGMOD'22]: A technique called pessimistic locking and optimistic reading (Plor) to reduce tail latency for high-contention transactional workloads, while maintaining high throughput.
Blink-Hash [code][VLDB'23]: A new index design that enhances a tree-based index with hash leaf nodes to mitigate the contention of monotonic insertions, a pattern common in time-series workloads.
DeToX [code][OSDI'23]: A caching algorithm that leverages transactional dependencies to make eviction and prefetching decisions more intelligently. DeToX increases transactional hit rate by 1.3x compared to a single-object caching policy.
Polaris [code][SIGMOD'23]: Polaris enables priority among transactions for the state-of-the-art OCC protocol Silo and achieves 17x lower tail latency for high-contention workloads.
Two-Tree [code][CIDR'23]: Two-Tree splits a single index structure (e.g., B-tree) into a top in-memory tree for hot records and a bottom tree for cold pages, and achieves 1.7x higher throughput than a conventional One-Tree design.
Three-Tree [code][SIGMOD'24]: Exploration of OLTP buffer management strategies with two-tier main memory; no existing design can win in all measured dimensions.

Scalable distributed transaction processing: Online transaction processing (OLTP) DBMSs are increasingly deployed on distributed machines. Compared to centralized systems, distributed DBMSs face new challenges, including extra network latency and the requirements of high availability and distributed commitment.

Sundial [code][VLDB'18]: A distributed concurrency control protocol that is algorithmically similar to TicToc; Sundial integrates cache coherence and concurrency control into a unified protocol.
STAR [code][VLDB'19]: A distributed DBMS where data replicas use asymmetric architectures (e.g., non-partitioned and partition-based). A transaction is executed in the replica that delivers better performance.
Aria [code][VLDB'20]: A deterministic distributed DBMS that no longer requires knowing transactions' read/write sets before execution. Aria also achieves higher throughput than previous deterministic DBMSs.
Coco [code][VLDB'21]: A distributed OLTP DBMS that mitigates the synchronization overhead of distributed commitment and data replication by committing transactions in epochs.
Lotus [code][VLDB'22]: Optimizes multi-partition transactions in a distributed and partitioned database.

Hybrid transactional/analytical processing (HTAP): HTAP systems have gained popularity as they combine OLAP and OLTP processing to reduce administrative and synchronization costs between dedicated systems. This brings new challenges in data freshness and performance isolation between transactional and analytical processing.

HATtrick [code][SIGMOD'22]: A benchmark for HTAP systems that uses two new performance metrics: throughput frontier and freshness score. Three representative systems are evaluated.

Predicate Transfer: Predicate transfer is a method that optimizes multi-join queries by pre-filtering tables to reduce the join input size. Predicate transfer is inspired by the seminal theoretical results of Yannakakis but leverages Bloom filters to become more practical.

Predicate Transfer [code][CIDR'24]: The original predicate transfer paper.
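
The pre-filtering idea can be illustrated with a minimal Python sketch. It is not the implementation from the paper: it shows only a single transfer step between two tables, and the `BloomFilter` class, the `transfer_predicate` helper, and the `customers`/`orders` tables are all hypothetical names made up for this example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def transfer_predicate(source_rows, source_key, target_rows, target_key):
    """Keep only the target rows whose join key may appear in source_rows."""
    bloom = BloomFilter()
    for row in source_rows:
        bloom.add(row[source_key])
    return [row for row in target_rows if bloom.might_contain(row[target_key])]


# Hypothetical tables: a local predicate has already reduced customers to one row.
customers = [{"cust_id": 7, "region": "EU"}]
orders = [{"order_id": i, "cust_id": i % 1000} for i in range(5000)]

# Pre-filter the large table before the join, so the join sees far fewer rows.
filtered_orders = transfer_predicate(customers, "cust_id", orders, "cust_id")
joined = [
    (c, o)
    for c in customers
    for o in filtered_orders
    if c["cust_id"] == o["cust_id"]
]
```

Because a Bloom filter can return false positives, the pre-filter may keep a few extra rows; the join itself still checks the keys exactly, so the result is unchanged.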

Peer-Reviewed Publications

Wenjie Hu, Guanzhou Hu, Mahesh Balakrishnan, Xiangyao Yu
Marlin: Efficient Coordination for Autoscaling Cloud DBMS
Proceedings of SIGMOD, June 2026. To Appear.
Elena Milkai, Xiangyao Yu, Jignesh Patel
Hermes: Off-the-Shelf Real-Time Transactional Analytics
Proceedings of the VLDB Endowment, 2025. To Appear.
Junyi Zhao, Kai Su, Yifei Yang, Xiangyao Yu, Paraschos Koutris, Huanchen Zhang
Debunking the Myth of Join Ordering: Toward Robust SQL Analytics
Proceedings of SIGMOD, June 2025
[code][Extended version in arXiv]
Yifei Yang, Xiangyao Yu
Accelerate Distributed Joins with Predicate Transfer
Proceedings of SIGMOD, June 2025
[code]
Xinjing Zhou, Xiangpeng Hao, Xiangyao Yu, Michael Stonebraker
Tiered-Indexing: Optimizing Access Methods for Skew
The VLDB Journal (VLDBJ), May 2025
[code]
Xinjing Zhou, Viktor Leis, Jinming Hu, Xiangyao Yu, Michael Stonebraker
Practical DB-OS Co-Design with Privileged Kernel Bypass
Proceedings of SIGMOD, June 2025
Xinjing Zhou, Viktor Leis, Xiangyao Yu, Michael Stonebraker
OLTP Through the Looking Glass 16 Years Later: Communication is the New Bottleneck
Proceedings of The Conference on Innovative Data Systems Research (CIDR), January 2025
Bobbi Yogatama, Weiwei Gong, Xiangyao Yu
Scaling your Hybrid CPU-GPU DBMS to Multiple GPUs
Proceedings of the VLDB Endowment, September 2024
[code]
Yifei Yang, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker
FlexpushdownDB: rethinking computation pushdown for cloud OLAP DBMSs
The VLDB Journal (VLDBJ), July 2024
[code]
Xiangpeng Hao, Xinjing Zhou, Xiangyao Yu, Michael Stonebraker
Towards Buffer Management with Tiered Main Memory
Proceedings of SIGMOD, June 2024
[code]
Yifei Yang, Hangdong Zhao, Xiangyao Yu, Paraschos Koutris
Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries
Proceedings of The Conference on Innovative Data Systems Research (CIDR), January 2024
[code]
Qian Li, Peter Kraft, Michael Cafarella, Cagatay Demiralp, Goetz Graefe, Christos Kozyrakis, Michael Stonebraker, Lalith Suresh, Xiangyao Yu, Matei Zaharia
R^3: Record-Replay-Retroaction for Database-Backed Applications
Proceedings of the VLDB Endowment, July 2023
[code]
Peter Kraft, Qian Li, Xinjing Zhou, Peter Bailis, Michael Stonebraker, Matei Zaharia, Xiangyao Yu
Epoxy: ACID Transactions Across Diverse Data Stores
Proceedings of the VLDB Endowment, July 2023
[code]
Audrey Cheng, David Chu, Terrance Li, Jason Chan, Natacha Crooks, Joseph M. Hellerstein, Ion Stoica, Xiangyao Yu
Take Out the TraChe: Maximizing (Tra)nsactional Ca(che) Hit Rate
USENIX Symposium on Operating Systems Design and Implementation (OSDI), July 2023.
[code]
Bobbi Yogatama, Brandon Miller, Yunsong Wang, Graham Markall, Jake Hemstad, Gregory Kimball, Xiangyao Yu
Accelerating User-Defined Aggregate Function (UDAF) with Block-wide Execution and JIT Compilation on GPUs
Data Management on New Hardware (DaMoN@SIGMOD), June 2023
Jiaxin Lin, Tao Ji, Xiangpeng Hao, Hokeun Cha, Yanfang Le, Xiangyao Yu, Aditya Akella
Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs
Proceedings of the ACM on Measurement and Analysis of Computing Systems (SIGMETRICS), June 2023
Chenhao Ye, Wuh-Chwen Hwang, Keren Chen, Xiangyao Yu
Polaris: Enabling Transaction Priority in Optimistic Concurrency Control
Proceedings of SIGMOD, June 2023
[code]
Hokeun Cha, Xiangpeng Hao, Tianzheng Wang, Huanchen Zhang, Aditya Akella, Xiangyao Yu
Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases
Proceedings of the VLDB Endowment, February 2023
[code]
Xinjing Zhou, Xiangyao Yu, Goetz Graefe, Michael Stonebraker
Two is Better Than One: The Case for 2-Tree for Skewed Data Sets
Proceedings of The Conference on Innovative Data Systems Research (CIDR), January 2023
[code]
Zhihan Guo, Xinyu Zeng, Kan Wu, Wuh-Chwen Hwang, Ziwei Ren, Xiangyao Yu, Mahesh Balakrishnan, Philip Bernstein
Cornus: Atomic Commit for a Cloud DBMS with Storage Disaggregation
Proceedings of the VLDB Endowment, October 2022
[Extended version in arXiv][code]
Xinjing Zhou, Xiangyao Yu, Goetz Graefe, Michael Stonebraker
Lotus: Scalable Multi-Partition Transactions on Single-Threaded Partitioned Databases
Proceedings of the VLDB Endowment, July 2022
[code]
Bobbi Yogatama, Weiwei Gong, Xiangyao Yu
Orchestrating Data Placement and Query Execution in Heterogeneous CPU-GPU DBMS
Proceedings of the VLDB Endowment, July 2022
[code]
Elena Milkai, Yannis Chronis, Kevin Gaffney, Zhihan Guo, Jignesh Patel, Xiangyao Yu
How good is my HTAP system?
Proceedings of SIGMOD, June 2022
[code][slides]
Anil Shanbhag*, Bobbi Yogatama*, Xiangyao Yu, Samuel Madden
Tile-based Lightweight Integer Compression in GPU
Proceedings of SIGMOD, June 2022
*Equal contribution
[code]
Yu Xia, Xiangyao Yu, Matthew Butrovich, Andrew Pavlo, Srinivas Devadas
Litmus: Towards a Practical Database Management System with Verifiable ACID Properties and Transaction Correctness
Proceedings of SIGMOD, June 2022
[code][website][video]
Youmin Chen, Xiangyao Yu, Paraschos Koutris, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Jiwu Shu
Plor: General Transactions with Predictable, Low Tail Latency
Proceedings of SIGMOD, June 2022
[code]
Jin Zhang, Xiangyao Yu, Zhengwei Qi, Haibing Guan
Falcon: A Timestamp-based Protocol to Maximize the Cache Efficiency in the Distributed Shared Memory
International Parallel & Distributed Processing Symposium (IPDPS), May 2022
Sujay Yadalam, Nisarg Shah, Xiangyao Yu, Michael Swift
ASAP: A Speculative Approach to Persistence
Proceedings of the International Symposium on High Performance Computer Architecture (HPCA), April 2022
[video]
Yifei Yang, Matt Youill, Matthew Woicik, Yizhou Liu, Xiangyao Yu, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker
FlexPushdownDB: Hybrid Pushdown and Caching in a Cloud DBMS
Proceedings of the VLDB Endowment, July 2021
[code]
Zhihan Guo, Kan Wu, Cong Yan, Xiangyao Yu
Releasing Locks As Early As You Can: Reducing Contention of Hotspots by Violating Two-Phase Locking
Proceedings of SIGMOD, June 2021
[Extended version in arXiv] [code]
Yi Lu, Xiangyao Yu, Lei Cao, Samuel Madden
Epoch-based Commit and Replication in Distributed OLTP Databases
Proceedings of the VLDB Endowment, January 2021
[code]
Yu Xia, Xiangyao Yu, Andrew Pavlo, Srinivas Devadas
Taurus: Lightweight Parallel Logging for In-Memory Database Management Systems
Proceedings of the VLDB Endowment, October 2020
[Extended version in arXiv] [code]
Yi Lu, Xiangyao Yu, Lei Cao, Samuel Madden
Aria: A Fast and Practical Deterministic OLTP Database
Proceedings of the VLDB Endowment, July 2020
[code]
Anil Shanbhag, Samuel Madden, Xiangyao Yu
A Study of the Fundamental Performance Characteristics of GPUs and CPUs for Database Analytics
Proceedings of SIGMOD, June 2020
[Extended version in arXiv] [code]
Xiangyao Yu, Matt Youill, Matthew Woicik, Abdurrahman Ghanem, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker
PushdownDB: Accelerating a DBMS using S3 Computation
Proceedings of 36th International Conference on Data Engineering (ICDE), April 2020
[Extended version in arXiv] [code]

Junjay Tan, Thanaa Ghanem, Matthew Perron, Xiangyao Yu, Michael Stonebraker, David DeWitt, Ashraf Aboulnaga, Marco Serafini, Tim Kraska
Choosing A Cloud DBMS: Architectures and Tradeoffs
Proceedings of the VLDB Endowment, August 2019
Erfan Zamanian, Xiangyao Yu, Michael Stonebraker, Tim Kraska
Rethinking Database High Availability with RDMA Networks
Proceedings of the VLDB Endowment, July 2019
Yi Lu, Xiangyao Yu, Samuel Madden
STAR: Scaling Transactions through Asymmetric Replication
Proceedings of the VLDB Endowment, July 2019
[code]
Yu Xia, Xiangyao Yu, William Moses, Julian Shun, Srinivas Devadas
LiTM: A Lightweight Deterministic Software Transactional Memory System
International Workshop on Programming Models and Applications for Multicores and Manycores (PMAM@PPoPP), February 2019
Xiangyao Yu, Vijay Gadepally, Stan Zdonik, Tim Kraska, and Michael Stonebraker
FastDAWG: Improving Data Migration in the BigDAWG Polystore System
VLDB workshop on Polystores and other Systems for Heterogeneous Data (POLY@VLDB), August 2018
Xiangyao Yu, Yu Xia, Andrew Pavlo, Daniel Sanchez, Larry Rudolph, Srinivas Devadas
Sundial: Harmonizing Concurrency Control and Caching in a Distributed OLTP Database Management System
Proceedings of the VLDB Endowment, June 2018
[code]
Xiangyao Yu, Chris Hughes, Nadathur Satish, Onur Mutlu, Srinivas Devadas
Banshee: Bandwidth-Efficient DRAM Caching via Software/Hardware Cooperation
Proceedings of the 50th International Symposium on Microarchitecture (MICRO), October 2017
[code]
Xiangyao Yu, Hongzhe Liu, Ethan Zou, Srinivas Devadas
Tardis 2.0: Optimized Time Traveling Coherence for Relaxed Consistency Models
Proceedings of the 25th International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2016
Xiangyao Yu, Andrew Pavlo, Daniel Sanchez, Srinivas Devadas
TicToc: Time Traveling Optimistic Concurrency Control
Proceedings of SIGMOD, June 2016
[code]
Xiangyao Yu, Christopher Hughes, Nadathur Satish, Srinivas Devadas
IMP: Indirect Memory Prefetcher
Proceedings of the 48th International Symposium on Microarchitecture (MICRO), December 2015
Xiangyao Yu, Srinivas Devadas
Tardis: Time Traveling Coherence Algorithm for Distributed Shared Memory
Proceedings of the 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2015
Presented in the Best Paper Session
Xiangyao Yu, Syed Kamran Haider, Ling Ren, Christopher Fletcher, Albert Kwon, Marten van Dijk, Srinivas Devadas
PrORAM: Dynamic Prefetcher for Oblivious RAM
International Symposium on Computer Architecture (ISCA), June 2015
Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, Michael Stonebraker
Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores
Proceedings of the VLDB Endowment, November 2014
[code]
Rachata Ausavarungnirun, Chris Fallin, Xiangyao Yu, Kevin Chang, Greg Nazario, Reetuparna Das, Gabriel Loh, and Onur Mutlu
Design and Evaluation of Hierarchical Rings with Deflection Routing
Proceedings of the 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), October 2014
Christopher Fletcher, Ling Ren, Xiangyao Yu, Marten van Dijk, Omer Khan, and Srinivas Devadas
Suppressing the Oblivious RAM Timing Channel While Making Information Leakage and Program Efficiency Trade-offs
Proceedings of the International Symposium on High Performance Computer Architecture (HPCA), February 2014
Xiangyao Yu, Christopher Fletcher, Ling Ren, Marten van Dijk, and Srinivas Devadas
Generalized External Interaction with Tamper-Resistant Hardware with Bounded Information Leakage
Proceedings of the Cloud Computing Security Workshop (CCSW), November 2013
Emil Stefanov, Marten van Dijk, Elaine Shi, Christopher Fletcher, Ling Ren, Xiangyao Yu, and Srinivas Devadas
Path ORAM: An Extremely Simple Oblivious RAM Protocol
Proceedings of the 20th Computer and Communication Security Conference (CCS), November 2013
Best Student Paper Award
Ling Ren, Christopher W. Fletcher, Xiangyao Yu, Marten van Dijk, and Srinivas Devadas
Integrity Verification for Path Oblivious-RAM
Proceedings of the 17th IEEE High Performance Extreme Computing Conference (HPEC), September 2013
Ling Ren, Xiangyao Yu, Christopher W. Fletcher, Marten van Dijk, and Srinivas Devadas
Design Space Exploration and Optimization of Path Oblivious RAM in Secure Processors
40th International Symposium on Computer Architecture (ISCA), June 2013
Yuan Lin Yeoh, Bo Wang, Xiangyao Yu, Tony Tae Hyoung Kim
A 0.4V 7T SRAM with Write Through Virtual Ground and Ultra-fine Grain Power Gating Switches
IEEE International Symposium on Circuits and Systems (ISCAS), May 2013
Chris Fallin, Greg Nazario, Xiangyao Yu, Kevin Chang, Rachata Ausavarungnirun, Onur Mutlu
MinBD: Minimally-Buffered Deflection Routing for Energy-Efficient Interconnect
Proceedings of the 6th ACM/IEEE International Symposium on Networks on Chip (NOCS), May 2012
One of the five papers nominated for the Best Paper Award by the Program Committee

Patents

Xiangyao Yu, Christopher J. Hughes, Nadathur Rajagopalan Satish
Hardware Prefetcher for Indirect Access Patterns
US Patent Granted, US9582422B2, June 2016
Thomas Moscibroda, Zhengping Qian, Mark Eugene Russinovich, Xiangyao Yu, Jiaxing Zhang, Feng Zhao
Service Allocation in a Distributed Computing Platform
US Patent Granted, US9419859B2, August 2016

Thesis

Xiangyao Yu
Logical Leases: Scalable Hardware and Software Systems through Time Traveling
Ph.D. Dissertation, September 2017
George M. Sprowls Award for Best Ph.D. Thesis in Computer Science
Xiangyao Yu
An Evaluation of Concurrency Control with One Thousand Cores
Master's Thesis, February 2015