Modern distributed storage systems employ complex protocols to update replicated data. In this paper, we study whether such update protocols work correctly in the presence of correlated crashes. We find that the correctness of such protocols hinges on how local filesystem state is updated by each replica in the system. We build PACE, a framework that systematically generates and explores persistent states that can occur in a distributed execution. PACE uses a set of generic rules to effectively prune the state space, reducing checking time from days to hours in some cases. We apply PACE to eight widely used distributed storage systems to find correlated crash vulnerabilities, i.e., problems in the update protocol that lead to user-level guarantee violations. PACE finds a total of 26 vulnerabilities across eight systems, many of which lead to severe consequences such as data loss, corrupted data, or unavailable clusters.
Beyond Storage APIs: Provable Semantics for Storage Stacks
Applications are deployed upon deep, diverse storage stacks that are constructed on-demand. Although many storage stacks share a common API to allow portability, application behavior differs in subtle ways depending upon unspecified properties of the underlying storage stack. Currently, there is no way to test whether an application will behave correctly on a given storage stack: corruption or data loss could occur at any point in the application's lifetime. We argue that we require an expressive language for specifying the complex storage guarantees required by different applications. The same language can be used to write a high-level specification capturing the design of different storage-stack layers. Given the required guarantees and the storage-stack specifications, we can prove that stacks constructed dynamically (by composing different storage-stack layers) provide the guarantees required by the application.
All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications
We present the first comprehensive study of application-level crash-consistency protocols built atop modern file systems. We find that applications use complex update protocols to persist state, and that the correctness of these protocols is highly dependent on subtle behaviors of the underlying file system, which we term persistence properties. We develop a tool named BOB that empirically tests persistence properties, and use it to demonstrate that these properties vary widely among six popular Linux file systems. We build a framework named ALICE that analyzes application update protocols and finds crash vulnerabilities, i.e., update protocol code that requires specific persistence properties to hold for correctness. Using ALICE, we analyze eleven widely-used systems (including databases, key-value stores, version control systems, distributed systems, and virtualization software) and find a total of 60 vulnerabilities, many of which lead to severe consequences. We also show that ALICE can be used to evaluate the effect of new file-system designs on application-level consistency.
Sherlock: Under the hood of distributed data stores - Contact me via email for a copy of the report.
Classifying Ephemeral vs Evergreen Content on the Web - 2014 - CS760 course project.
Design, implementation, and evaluation of classification algorithms to classify web content as ephemeral or evergreen. Our techniques ranked 21st out of 700 submissions on Kaggle.
Cirana - A highly consistent, non-replicated key-value store - 2014 - CS739 course mini-project.
Design and implementation of Cirana, a simple non-replicated key-value store. Cirana is designed to remain strictly consistent even in the event of server crashes. It uses a novel persistent hash table implementation that overcomes high cold-get latencies, and improves key distribution with a variation of consistent hashing. Together, these keep get latencies within a given bound and provide transparent load balancing without compromising get/put throughput.
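Cirana's exact variation of consistent hashing is not described here, but the base technique it builds on can be sketched as follows. This is a minimal, illustrative Python implementation with virtual nodes; the hash function and all names are assumptions for the sketch, not Cirana's actual code.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable 64-bit hash derived from MD5 (an illustrative choice).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """Textbook consistent hashing with virtual nodes.

    Each physical node owns several points on the ring, so keys spread
    evenly and only a small fraction of keys move when a node joins
    or leaves.
    """

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point, node) pairs
        for n in nodes:
            self.add(n)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key):
        if not self._ring:
            raise KeyError("empty ring")
        points = [p for p, _ in self._ring]
        # First ring point clockwise of the key's hash (wrapping around).
        i = bisect.bisect_right(points, _hash(key)) % len(self._ring)
        return self._ring[i][1]
```

The useful property for load balancing is stability: removing a node only reassigns the keys that node owned, leaving every other key's placement untouched.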
Uncovering Twilio: Insights into Cloud Communication Services - 2014 - CS740 course project.
Abstract: Cloud communication services (CCS), with their simplicity and lower investment cost, are becoming increasingly popular.
In contrast to this growing popularity, very little is known about the internals of CCS with respect to architecture and protocols. To gain insights into CCS, we study Twilio, a popular cloud communication service, using gray-box techniques. We provide insights into the Twilio ecosystem, its components, the interactions among components, and the protocols. We also measure some guarantees provided by Twilio and show how the measurements fare against what is promised. Our analysis unveils a number of interesting aspects of the Twilio ecosystem and has strong implications for developers who build applications atop Twilio APIs.
In this work, we studied problems in using the block interface at all layers of the storage and computer system. Specifically, we analyzed 'small write' traffic in network-attached block devices. We implemented a 'cache & diff' mechanism on top of the NBD driver and NBD server to reduce write traffic to remote storage. Experimentally, we showed a reduction of 70% in write traffic for certain workloads. Please mail email@example.com if you wish to take a look at the report.
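The idea behind a cache & diff layer can be illustrated with a short sketch: keep a local copy of each block's last-written contents, and on every write ship only the byte ranges that actually changed. This is a simplified Python model of the technique, assuming a given block size and a hypothetical `send` callback; the real implementation sits inside the NBD driver/server and its wire format differs.

```python
BLOCK = 4096  # assumed block size for the sketch

class DiffingWriter:
    """Cache-and-diff sketch: cache each block locally and send only
    the contiguous runs of bytes that differ from the cached copy."""

    def __init__(self, send):
        self.send = send   # callable(offset, data): ships bytes to remote storage
        self.cache = {}    # block number -> last-written contents

    def write_block(self, blkno, data):
        assert len(data) == BLOCK
        old = self.cache.get(blkno)
        if old is None:
            # Cold miss: nothing cached yet, ship the whole block.
            self.send(blkno * BLOCK, data)
        else:
            # Ship only contiguous runs of changed bytes.
            i = 0
            while i < BLOCK:
                if data[i] != old[i]:
                    j = i
                    while j < BLOCK and data[j] != old[j]:
                        j += 1
                    self.send(blkno * BLOCK + i, data[i:j])
                    i = j
                else:
                    i += 1
        self.cache[blkno] = data
```

For small writes that touch a few bytes of a block, this sends a few bytes instead of a full block, which is where the observed write-traffic reduction comes from.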
Secure Distributed Association Rule Mining - 2010. BTP - Guided By: Dr. Rajalakshmi.
Problem - Mining association rules across 'N' distributed sites, with no site willing to reveal sensitive information to the other sites. We used Elliptic Curve Cryptography to improve the security of mining horizontally partitioned data.
Implemented the Frequent Pattern Growth (FP-Growth) mining algorithm to determine association rules, with the underlying socket-level communication built on Java socket libraries and reporting interfaces built with Java Swing.
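The core of FP-Growth is compressing the transaction database into an FP-tree before mining. A minimal sketch of that first step, in Python rather than the project's Java, with all names chosen for illustration:

```python
from collections import Counter

class FPNode:
    """One node of an FP-tree: an item, its count along this path,
    and links to its parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}

def build_fp_tree(transactions, min_support):
    # First pass: count item frequencies and keep only frequent items.
    freq = Counter(i for t in transactions for i in set(t))
    frequent = {i for i, c in freq.items() if c >= min_support}

    # Second pass: insert each transaction's frequent items in a
    # canonical (frequency-descending) order so shared prefixes merge.
    root = FPNode(None, None)
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-freq[i], i))
        node = root
        for i in items:
            node = node.children.setdefault(i, FPNode(i, node))
            node.count += 1
    return root
```

The full algorithm then mines frequent itemsets by recursively building conditional FP-trees from this structure; the sketch stops at tree construction.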