F1: A Distributed SQL Database That Scales

Jeff Shute, Radek Vingralek, Bart Samwel, Ben Handy, Chad Whipkey, Eric Rollins, Mircea Oancea, Kyle Littlefield, David Menestrina, Stephan Ellner, John Cieslewicz, Ian Rae, Traian Stancescu, Himani Apte
Google, UW-Madison
Summary by: Zuyu Zhang

One-line Summary

F1 (Filial 1 hybrid) is a distributed relational database system that provides the scalability of NoSQL systems, along with five novel features: “distributed query engine, consistent secondary indexes, asynchronous schema changes, optimistic transactions, and automatic history logging, tracking and publishing”.

Overview/Main Points

  • Schema Changes
  • Change History
  • Deployment and Results (latency & throughput)
  • Relevance

    Future work

    1. How can global indexes that use 2PC be made more scalable without compromising consistency?
    2. How can the slowdown of partitioned consumers be mitigated, given the streaming nature of F1 queries and the horizontal dependencies caused by frequent hash repartitioning? (One option: use disk-backed buffering to break the dependency and let clients proceed independently.)
    3. How can a query that involves parsing (decoding) PB fields and selecting on them be processed efficiently? (Push operations into Spanner? Use other indexes in Spanner?)
    4. How can checkpointing of some intermediate results be added for distributed queries that run longer than one hour, without hurting latency in the normal, failure-free case?
    5. How can the CPU efficiency of the F1 server be improved? It uses an order of magnitude more CPU than its MySQL counterpart, particularly in query processing, where data compressed on disk passes through several layers: decompressing, processing, recompressing, and sending over the network.
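
The disk-backed buffering idea in item 2 can be illustrated with a minimal sketch. It is not F1's design: the class name `SpillBuffer`, the `mem_limit` parameter, and the use of a local temporary file with `pickle` are all assumptions made for illustration. The point it shows is that a producer can keep writing rows without waiting for a slow consumer, because overflow rows go to disk instead of blocking the stream.

```python
import pickle
import tempfile
from collections import deque


class SpillBuffer:
    """FIFO row buffer: keeps up to `mem_limit` rows in memory, spills the rest to disk.

    Hypothetical illustration only; names and layout are not taken from F1.
    """

    def __init__(self, mem_limit=2):
        self.mem_limit = mem_limit
        self.mem = deque()                     # oldest rows, served first
        self.spill = tempfile.TemporaryFile()  # overflow rows, in arrival order
        self.read_pos = 0                      # offset of the next unread spilled row
        self.spilled = 0                       # rows on disk not yet read back

    def put(self, row):
        """Producer side: never blocks, regardless of consumer speed."""
        if self.spilled == 0 and len(self.mem) < self.mem_limit:
            self.mem.append(row)
        else:
            # Once anything is on disk, later rows must follow it to keep FIFO order.
            self.spill.seek(0, 2)  # move to end of file before appending
            pickle.dump(row, self.spill)
            self.spilled += 1

    def get(self):
        """Consumer side: returns the next row, or None if the buffer is empty."""
        if self.mem:
            return self.mem.popleft()
        if self.spilled:
            self.spill.seek(self.read_pos)
            row = pickle.load(self.spill)
            self.read_pos = self.spill.tell()
            self.spilled -= 1
            return row
        return None


if __name__ == "__main__":
    buf = SpillBuffer(mem_limit=2)
    for i in range(5):      # fast producer: 5 rows, only 2 fit in memory
        buf.put(i)
    while (row := buf.get()) is not None:  # slow consumer drains later, in order
        print(row)
```

The trade-off in this sketch is extra disk I/O and serialization cost in exchange for decoupling the producer from the consumer; a real system would also need to bound disk usage and reclaim space already read.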