Project-1: Implementation and Measurement

Must be done in groups of 2-3. Deadline to submit artifacts: 2/17. Meetings to follow soon.

The goal of this project is to gain some experience building a simple distributed system with a server and a few clients. Specifically, you will build a non-replicated key-value service that provides a set of simple operations to clients. You will learn to use an RPC library, think about how to design a server that can recover correctly and produce correct answers, and perform measurements using simple benchmarks.


Basics

The server should provide a simple key-value store service to clients. The server need not be replicated, i.e., a single server holds the data and the service will remain unavailable if the server crashes. Both keys and values are simple strings; keys are usually 128 bytes and values do not have any size limits (they can be as large as a few megabytes). We suggest you use C++ or Go for your implementation.

API

The API to your service is straightforward; it must support only the following three operations.

  • set(string key, string value) - sets the value of the given key
  • get(string key) - returns the value of a given key
  • getPrefix(string prefixKey) - returns a list of values whose keys start with prefixKey
  • Note: the results returned by the server (especially for getPrefix) could be large; you must handle such cases.

Communication

You will use the gRPC library for communication between the clients and the server. Your server must export the above-mentioned methods and a stat method to retrieve some stats from the server (see below). You must use only RPCs for communication in this project. However, you have the choice to use a different RPC library than gRPC (e.g., Apache Thrift); please talk to us if you would like to do so.
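As a starting point, the service could be described in a gRPC IDL along these lines. This is only a sketch: the message and service names are our own, not required by the assignment. Note the use of server-side streaming for getPrefix, so that a large result set does not have to fit in a single response message:

```proto
syntax = "proto3";

package kvstore;

service KeyValueStore {
  rpc Set(SetRequest) returns (SetReply);
  rpc Get(GetRequest) returns (GetReply);
  // Server-side streaming keeps large prefix results from having to
  // fit in one response message.
  rpc GetPrefix(GetPrefixRequest) returns (stream GetPrefixReply);
  rpc Stat(StatRequest) returns (StatReply);
}

message SetRequest { string key = 1; string value = 2; }
message SetReply { bool ok = 1; }
message GetRequest { string key = 1; }
message GetReply { string value = 1; bool found = 2; }
message GetPrefixRequest { string prefix_key = 1; }
message GetPrefixReply { string value = 1; }
message StatRequest {}
message StatReply {
  int64 total_sets = 1;
  int64 total_gets = 2;
  int64 total_getprefixes = 3;
}
```

You are free to shape the messages differently; the point is to decide early how large values and large prefix results travel over the wire.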

Failure Recovery and Durability

Your key-value store must be able to recover from failures and remain consistent and durable. Specifically, clients should never observe data loss and should always see the latest acknowledged writes, even when failures arise. Thus, the server must be able to tolerate process crashes, OS crashes, and power failures. You can assume that data, once persisted on the device, won't be arbitrarily corrupted. However, you will have to handle cases where the machine fails in the middle of writing a large value. You might want to look into this talk and paper for more details on how to do such things correctly.

Server Storage

As noted above, the key-value store must be persistent; the system must not lose any data. You can organize the storage on the server in whatever way you think is best for achieving this goal. In real systems such as MongoDB, developers use something like RocksDB or WiredTiger as the storage layer and configure it for different durability guarantees. However, for simplicity, in this project, you will develop your own storage layer. You can also design and optimize the storage layer in any way you like; for example, you could keep an in-memory hash table to speed up reads.

Handling Clients

Your server must be capable of supporting many concurrent clients. One issue to think about here is data races: how do you prevent many concurrent requests (some of them writes) from accessing the same items simultaneously? An obvious way to solve this problem is to use locks. However, be careful to avoid unnecessary locking (e.g., locking the entire hash table for every request); otherwise, your performance might suffer.

Measurements

You will measure your system's performance using the following three experiments.

First, you will set the value size to 512B, 4KB, 512KB, 1MB, and 4MB, and measure the end-to-end latency of the following two workloads using a single client:

  • a read-only workload
  • a 50% reads and 50% writes workload

For all workloads, the key distribution must be uniform, i.e., all keys are equally likely to be read and updated.

Second, you will initialize the store with many key-value pairs (for example, 10M pairs); set the value size to 4KB for this experiment. Then, you will take the server down and restart it. You will measure the time it takes for the server to restart and begin serving read requests. You will also compare the latencies of such "cold" get requests against "normal" reads (where the server keeps running and has potentially cached a number of items).

For the third experiment, you will set the number of clients to 1, 2, 4, 8, 16, 32, ... and measure the latency and throughput of the system for:

  • a read-only workload
  • a 50% reads and 50% writes workload

You can stop increasing the clients once your throughput does not increase any further. You must plot the average latency against the throughput to show the results of this experiment (see figure 5 in this paper).

In addition to performance, you must also validate correctness, in particular, that your store provides strong durability in the presence of failures. A simple way to do this would be to randomly crash the server while a workload is running and check whether the server recovers all written data. You are free to develop other, more sophisticated ways of doing this testing. You must also test that concurrent writers and readers don't lead to race conditions.

Which Machines to Use

We have created a project on CloudLab for our class for measurements. However, you don't need any machines other than your personal laptops/desktops for development and testing. You can also do the initial measurements on a single machine, i.e., the server and the clients can simply be different processes on the same machine. You can conduct your final measurements on CloudLab.

Collaboration Policy

It is okay for a group to discuss high-level ideas and problems with other groups. However, sharing/copying code is not okay.

Submissions and Meetings
  • You must turn in your source code. Your folder should contain all necessary files and a Makefile that compiles everything. In addition, you must include a "group.txt" file that lists the members of your group. Your server binary must be named "kvserver" and your client "kvclient". You should also include a simple test script that initializes the server and then starts a few clients (e.g., 10) that perform some writes and reads (e.g., 10K). The script must then print the following stats on the console: server start time, #total_sets done, #total_gets done, #total_getprefixes done. Finally, the script must stop the server. The deadline for submitting your code is 2/17 at 9:59 am Central Time. We will create a directory on AFS for this purpose and let you know the details soon.
  • We will have an in-person meeting in which you will show/describe three things. First, you will describe/argue why your store is strongly durable (what is your write protocol, what you do upon recovery, etc.). Second, you will show some graphs from your measurements and explain the trends. High performance is not a goal for this project; instead, you should be able to understand and explain why your system has a particular performance behavior. Finally, you should describe one way the RPC package you used is similar to, and one way it differs from, the system described in the Xerox RPC paper. Each group will get a 20-minute slot, so prepare in advance what you want to show. We will have these meetings the week after code submission.