We will read about 3-4 papers per week. You will have to write a short review for each paper (or answer a question) and submit it to the review-submission site before class.



For paper reviews, you must follow these guidelines:

1. Summarize the paper briefly in your own words in 2-3 sentences.
2. Provide a brief description of the problem the authors were trying to solve.
3. Explain one or two key/interesting ideas of the paper.
4. What are the positives of the paper?
5. What are some of the flaws of this paper?
6. What is one thing you found confusing or did not understand?

For questions, just answer the question.



1/24:Review - Submit your review of the RPC paper.

1/29:Question - In his classic paper on time, clocks, and ordering, Lamport defines two events A and B to be concurrent if A does not happen before B and B does not happen before A. Is concurrency transitive? That is, if A and B are concurrent events, and B and C are concurrent events, are A and C necessarily concurrent? Why or why not? Please submit your answers on the review site before 10 am.

1/31:Question - (i) Can you think of a real application scenario where taking a global snapshot can be useful? Provide an example not mentioned in the paper. (ii) Consider the following scheme for taking snapshots: all the processes maintain logical clocks according to the scheme described in Lamport's paper. Given a logical time t, each process includes in its local snapshot all of its events whose logical time is <= t. For this question, ignore capturing the state of the network channels. The global snapshot is constructed by combining all local snapshots. Does this global state correspond to a consistent cut? Why or why not?
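
To make the scheme in part (ii) concrete, here is a minimal Python sketch of how the snapshot would be assembled. The names `Process`, `Event`, `local_snapshot`, and `global_snapshot` are illustrative assumptions, not taken from either paper, and the clock-update rule is the standard Lamport rule (increment on each event; on receive, take the maximum of the local clock and the message timestamp, then increment). The sketch only builds the cut; it does not check whether the cut is consistent.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    process: str
    kind: str   # "local", "send", or "recv"
    time: int   # Lamport timestamp assigned when the event occurs


@dataclass
class Process:
    """Hypothetical process that timestamps its events with a Lamport clock."""
    name: str
    clock: int = 0
    log: list = field(default_factory=list)

    def local(self):
        self.clock += 1
        self.log.append(Event(self.name, "local", self.clock))

    def send(self):
        self.clock += 1
        self.log.append(Event(self.name, "send", self.clock))
        return self.clock  # timestamp piggybacked on the message

    def recv(self, msg_time):
        # Advance past both the local clock and the sender's timestamp.
        self.clock = max(self.clock, msg_time) + 1
        self.log.append(Event(self.name, "recv", self.clock))

    def local_snapshot(self, t):
        # The scheme from the question: keep every event with time <= t.
        return [e for e in self.log if e.time <= t]


def global_snapshot(processes, t):
    # Combine the local snapshots; channel state is ignored, as in the question.
    return {p.name: p.local_snapshot(t) for p in processes}


p, q = Process("P"), Process("Q")
p.local()        # P's clock becomes 1
m = p.send()     # P's clock becomes 2; the message carries timestamp 2
q.recv(m)        # Q's clock becomes max(0, 2) + 1 = 3
q.local()        # Q's clock becomes 4
cut = global_snapshot([p, q], t=2)
```

Trying different values of `t` and different message patterns in a sketch like this is one way to explore which events end up inside the cut.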

2/05:No review or question. Just skim the papers and focus on project-1.

2/07:Question - The paper states that some systems use tens of seconds as timeouts. Is this the case in widely used systems such as ZooKeeper and MongoDB? What are the implications of using large and small timeout values?

2/12:Question - The paper states that in primary-backup (PB) systems, queries incur high latencies caused by prior writes that are still being processed. Must reads always wait for prior writes to complete in PB? If no, explain why not; if yes, explain why. How does chain replication enable low-latency reads?

2/19:Well done on P1! No question or review.

2/21:Question - Does single-decree Paxos (as described in the paper) guarantee liveness? That is, will a value be chosen in a timely fashion if enough nodes are alive? If yes, explain how. If not, explain why not.

2/24:Question - Raft uses a leader-based approach (there is only one leader/proposer at a time), whereas Paxos allows multiple proposers to propose values concurrently. What are the advantages of a single-leader approach? What are the disadvantages? (Hint: think about performance.)

3/2:No question or review.

3/4:Question - The Chubby paper describes the different ways in which the lock service was used within Google (electing a primary, metadata store, etc.). Can you think of a use case for Chubby that is not mentioned in the paper? If you are unable to think of one, which of the use cases discussed in the paper did you find the most interesting? Why?

3/11:Question - Compare 2PC to Paxos. What is different? What is similar? Will you ever need both together in a system?

4/01:Question - Does Raft by itself provide the exactly-once semantics offered by linearizability? If yes, describe how. If no, describe at a high level what must be done to achieve exactly-once semantics.

4/08:Question - Describe one workload where AFS performs better than NFS. Describe one where NFS performs better than AFS.

4/10:Question - GFS uses a chunk size of 64 MB. Describe the trade-offs in choosing a chunk size.

4/14:Question - Both Bayou and Dynamo aim to provide eventual consistency. One problem with eventual consistency is that there might be conflicting writes to different replicas. To ensure eventual consistency, the system must perform conflict resolution. What are the similarities between Bayou and Dynamo in the way they resolve conflicts? How do they differ?