Redundancy Elimination


A large amount of popular content is transferred repeatedly across network links in the Internet. In recent years, protocol-independent redundancy elimination, which can remove duplicate strings from within arbitrary network flows, has emerged as a powerful technique to improve the efficiency of network links in the face of repeated data. Many vendors (e.g., Riverbed, Cisco WAAS, Juniper, Bluecoat, etc.) offer such redundancy elimination middleboxes to improve the effective bandwidth of enterprise, data center and ISP links alike. We have conducted a large scale trace-driven study of protocol independent redundancy elimination mechanisms, driven by several terabytes of packet payload traces collected at the access link of University of Wisconsin and of 11 enterprise networks of different sizes. We have found that these redundancy elimination mechanisms can give up to 60% bandwidth savings.

Motivated by these benefits, we explore the benefits of deploying packet-level redundant content elimination as a universal primitive on all Internet routers. Our initial assumption is that all future network elements will have the ability to strip redundant content from network packets on the fly, by comparing packet contents against those recently-forwarded packets which are stored in a cache. Network elements immediately downstream can reconstruct full packets from their own cache. Applying this technology at every link will remove duplicate content and provide immediate performance benefits by reducing the overall load on network. We evaluate the ideal benefits of RE and show that it can lower link loads by 10-50%. Our Click software prototype can run at OC48 speeds. However, realizing network-wide RE in practice is challenging. RE involves several resource intensive operations, and an ideal approach must explicitly account for the resource constraints imposed on network elements. We have designed SmartRE, a practical and efficient architecture for network-wide RE. We show that SmartRE can enable more effective utilization of the available resources at network devices, and thus can magnify the overall benefits of network-wide RE.

In addition to the network, end-hosts can also play an active role in RE. Unlike in-network RE, such an approach will benefit both end-to-end encrypted traffic as well as traffic on last-hop wireless links to mobile devices. In addition to bandwidth savings, this approach will provide energy savings on mobile devices. Motivated by these benefits, we have also investigated deploying protocol independent redundancy elimination as an end system redundancy elimination service} (EndRE) . The design of EndRE has interesting challenges. EndRE needs to be fast, adaptive and parsimonious in memory usage in order to opportunistically leverage resources on end hosts.

Finally, we argue that far more significant network-wide benefits can be derived by redesigning network routing protocols to leverage the universal deployment. We have developed ``redundancy-aware'' intra- and inter-domain routing algorithms and show that they enable better traffic engineering, reduce link usage costs, and enhance ISPs' responsiveness to traffic variations.




Graduate students: Ashok Anand and Chitra Muthukrishnan. Alumni: Archit Gupta
Collaborators: Srinivasan Seshan, Vyas Sekar, Ramachandran Ramjee, George Varghese and Scott Shenker.