Content-based Network Protocols and Architectures    




Overview

The Internet's protocols and hardware were designed mainly to support host-oriented communication. In recent years, there has been an increasing push toward content-oriented communication, wherein content is a first-class entity that the network directly acts on. Content-oriented operations require us to rethink several existing mechanisms, including address lookup, routing, transport, and the Web. We are actively involved in building architectural support for content. For instance, in the BufferHash project, we have been developing schemes for performing fast lookups over billions of content addresses which may be updated in a streaming fashion. Applications like WAN optimization and data deduplication require large hashtables for their operation. One option is to maintain these huge datastructures on disk. But high access times might hurt performance. Another option is to store them in memory, but it might prove to be very costly. We propose an alternate hashtable design using a combination of small DRAM and flash memory. We show that Bufferhash achieves a significantly higher hashtable operations per second per dollar compared to traditional naive approaches.

The core idea behind Bufferhash is to group insert operations in a small hashtable called an 'incarnation' in memory, and push the incarnation's hashtable to flash in one go. Incarnations are appended to the flash memory as and when they are flushed from memory. Batching writes significantly improves the average insert operation cost, since a single write requires erasing a complete flash block, which is slow. The figure shows the organization of incarnations in the memory and the flash.
Bufferhash structure

Now, < key,value > pairs are stored in incarnation hashtable bins in the order in which they were inserted. A naive lookup would examine each incarnation by age for the key. This would be very slow. To speed this process, we maintain a bloom filter of keys corresponding to each incarnation as shown in the figure. These bloom filters are stored in memory and are consulted during lookup of a key. If a bloom filter indicates the presence of the key in an incarnation, we get the corresponding value from the hashtable stored in flash.

Papers

Talks

People

Graduate students: Ashok Anand, Aaron Gember, Shan-Hsiang Shen.
Alumni: , Chitra Muthukrishnan (May 10) , Steven Kappes (Masters, May 09).
Collaborators: Suman Nath.