Research
Following are the major research projects I have been working on.
Redundancy in Packet Contents on Internet Routers
A Study of Malware Evolution
Sequoia: Internet Path Metrics for Network Aware Distributed Applications [research.microsoft.com]
Application Buffer Cache Management
Sources of Internet Spam
Popular Internet content is sent repeatedly to multiple users over the Internet. This work eliminates the transmission of repeated content in packets, which can reduce bandwidth requirements by 15-50% over Internet links. We implemented high-speed packet-matching algorithms for the Click Software Router; the implementation can support more than 1 Gigabit/sec. Redundant content is tokenized at the upstream router and regenerated at the downstream router. Deployment scenario: across the Internet.
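The tokenize-upstream, regenerate-downstream idea can be illustrated with a minimal sketch. It assumes fixed-size chunks and truncated SHA-1 fingerprints, which stand in for whatever matching algorithm the actual implementation uses; the names `Encoder`, `Decoder`, and `CHUNK` are hypothetical and are not part of Click.

```python
import hashlib

CHUNK = 8  # bytes per chunk (assumption; tiny, for demonstration only)


def fingerprint(chunk: bytes) -> bytes:
    # Truncated SHA-1 as a short token; collisions are ignored in this sketch.
    return hashlib.sha1(chunk).digest()[:8]


class Encoder:
    """Upstream side: replaces chunks it has already sent with short tokens."""

    def __init__(self):
        self.seen = set()

    def encode(self, payload: bytes):
        tokens = []
        for i in range(0, len(payload), CHUNK):
            chunk = payload[i:i + CHUNK]
            fp = fingerprint(chunk)
            if fp in self.seen:
                tokens.append(("ref", fp))      # redundant content: send token
            else:
                self.seen.add(fp)
                tokens.append(("raw", chunk))   # new content: send bytes
        return tokens


class Decoder:
    """Downstream side: remembers raw chunks and expands tokens back to bytes."""

    def __init__(self):
        self.store = {}

    def decode(self, tokens) -> bytes:
        parts = []
        for kind, value in tokens:
            if kind == "raw":
                self.store[fingerprint(value)] = value
                parts.append(value)
            else:
                parts.append(self.store[value])
        return b"".join(parts)
```

Sending the same payload twice through one `Encoder` yields only `ref` tokens the second time, and the paired `Decoder` still reconstructs the original bytes. Both sides must see the same packet sequence for their caches to stay in sync, which this sketch simply assumes.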
We develop a novel graph-pruning algorithm to establish the most likely inheritance relationships between different malware. The analysis was done over a large corpus of malware metadata provided by McAfee.
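The description above does not spell out the pruning algorithm, so the following is only a generic illustration of pruning a candidate graph down to one most likely parent per sample. The inputs (`first_seen` dates, pairwise `similarity` scores) and the `threshold` parameter are all assumptions made for this sketch, not details of the actual work.

```python
def prune_to_likely_parents(first_seen, similarity, threshold=0.5):
    """Keep, for each sample, only its best-scoring edge from an older sample.

    first_seen: dict mapping sample name -> first-seen time (hypothetical)
    similarity: dict mapping (sample_a, sample_b) -> score in [0, 1] (hypothetical)
    Returns a dict mapping child -> most likely parent.
    """
    best = {}  # child -> (parent, score)
    for (a, b), score in similarity.items():
        if score < threshold:
            continue  # prune weak edges
        if first_seen[a] == first_seen[b]:
            continue  # direction of inheritance is ambiguous; drop the edge
        # Orient the edge from the older sample to the newer one.
        parent, child = (a, b) if first_seen[a] < first_seen[b] else (b, a)
        if child not in best or score > best[child][1]:
            best[child] = (parent, score)
    return {child: parent for child, (parent, _) in best.items()}
```

With three samples first seen in successive years, where the newest resembles the middle one more closely than the oldest, the pruned graph becomes a simple chain from oldest to newest.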
Sequoia aims to make distributed applications network-aware; that is, to enable applications to take advantage of characteristics of the underlying network such as proximity, bandwidth capacity, and topology. It intends to achieve this through the key concept of prediction trees: a virtual topology of the network in which virtual nodes representing routers connect real end hosts, and carefully computed edge weights model path properties such as latency, loss rate, and bandwidth.
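As a rough sketch of how a prediction tree might be queried, the code below stores a tree as parent pointers with edge weights and predicts the latency between two hosts as the sum of weights along the tree path joining them. The class and method names are hypothetical, and summing weights is an assumption that suits an additive metric such as latency; how Sequoia builds the tree or handles bandwidth and loss rate is not shown.

```python
class PredictionTree:
    """Virtual tree: leaves are end hosts, inner nodes are virtual routers."""

    def __init__(self):
        self.parent = {}  # node -> (parent node, edge weight)

    def add_edge(self, child, parent, weight):
        self.parent[child] = (parent, weight)

    def _distances_to_ancestors(self, node):
        # Map every ancestor of `node` (and node itself) to its distance from node.
        dist = {node: 0.0}
        total = 0.0
        while node in self.parent:
            node, weight = self.parent[node]
            total += weight
            dist[node] = total
        return dist

    def predict(self, a, b):
        # Walk up from b until reaching an ancestor shared with a, then add
        # a's distance to that ancestor. Raises KeyError if none is shared.
        up_a = self._distances_to_ancestors(a)
        total = 0.0
        node = b
        while node not in up_a:
            node, weight = self.parent[node]
            total += weight
        return total + up_a[node]


tree = PredictionTree()
tree.add_edge("v1", "root", 10.0)    # virtual nodes
tree.add_edge("v2", "root", 20.0)
tree.add_edge("hostA", "v1", 2.0)    # real end hosts
tree.add_edge("hostB", "v1", 3.0)
tree.add_edge("hostC", "v2", 5.0)
```

Here `hostA` and `hostB` share the virtual node `v1`, so their predicted latency is small, while the path from `hostA` to `hostC` must climb to the root and is correspondingly longer. An application could use such predictions to pick a nearby peer without measuring every pair directly.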
A very large-scale network traffic measurement system based on the Round Robin Database experienced scalability issues due to the default readahead behavior and buffer-cache management of the OS. Unnecessary disk reads were a performance bottleneck. We developed tools to expose the readahead and buffer-cache behaviors of the OS that are hidden from the user. Solutions included application advice to the kernel and application-level caching. The system scaled to the university's entire network (half a million data points). This work was awarded best paper at the USENIX Large Installation System Administration Conference (LISA'07).
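One common form of application advice to the kernel is `posix_fadvise`. The sketch below assumes a Unix system and a hypothetical helper `read_record` that reads a small record from a large file while hinting that access is random, so the kernel need not read ahead. It illustrates the general technique only and is not the tooling from the paper.

```python
import os


def read_record(path, offset, length):
    """Read `length` bytes at `offset`, advising the kernel of random access."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is not available on every platform, hence the guard.
        if hasattr(os, "posix_fadvise"):
            # Length 0 means "through the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)
```

The hint is advisory: the kernel may ignore it, and the data returned is the same either way. Any benefit comes from avoiding disk reads for neighboring blocks that the application never uses.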
