
Introduction

As the tremendous growth of the World-Wide Web continues to strain the Internet, caching has been recognized as one of the most important techniques to reduce bandwidth consumption [29]. In particular, caching within Web proxies has been shown to be very effective [14,33]. To gain the full benefits of caching, proxy caches behind a common bottleneck link should cooperate and serve each other's misses, thus further reducing the traffic through the bottleneck. We call the process ``Web cache sharing.''

Web cache sharing was first proposed in the context of the Harvest project [26,12]. The Harvest group designed the Internet Cache Protocol (ICP) [18] that supports discovery and retrieval of documents from neighboring caches. Today, many institutions and many countries have established hierarchies of proxy caches that cooperate via ICP to reduce traffic to the Internet [25,30,41,5,14].

Nevertheless, the wide deployment of Web cache sharing is currently hindered by the overhead of the ICP protocol. ICP discovers cache hits in other proxies by having a proxy multicast a query message to all other proxies whenever a cache miss occurs. Thus, as the number of proxies increases, both the communication and the CPU processing overhead increase quadratically.
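The quadratic growth can be made concrete with a little arithmetic (a back-of-the-envelope sketch; the message counts are illustrative, not measurements from the paper):

```python
def icp_messages_per_miss(n_proxies):
    # Each cache miss triggers a query to every other proxy, and each
    # queried proxy sends back a reply: 2 * (N - 1) messages per miss.
    return 2 * (n_proxies - 1)

def total_icp_messages(n_proxies, misses_per_proxy):
    # Every proxy multicasts on its own misses, so system-wide traffic
    # grows as N * (N - 1) -- quadratic in the number of proxies.
    return n_proxies * misses_per_proxy * icp_messages_per_miss(n_proxies)

# Doubling the group from 4 to 8 proxies more than quadruples the traffic.
print(total_icp_messages(4, 1000))   # 24000
print(total_icp_messages(8, 1000))   # 112000
```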

A number of alternative protocols have been proposed to address the problem, for example, a cache array routing protocol that partitions the URL space among proxies [46]. However, such solutions are often not appropriate for wide-area cache sharing, which is characterized by limited network bandwidth among proxies and non-uniform network distances between proxies and their users (for example, each proxy might be much closer to one user group than to others).
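URL-space partitioning can be illustrated with a simplified sketch (not CARP's actual routing function, which uses a highest-random-weight hash for smoother re-partitioning; the proxy names here are hypothetical):

```python
import hashlib

PROXIES = ["proxy-a", "proxy-b", "proxy-c"]  # hypothetical proxy names

def owner(url, proxies=PROXIES):
    # Deterministically map each URL to exactly one proxy, so a client
    # knows where a document would be cached without any query traffic.
    # A simple modulo over a hash suffices to show the idea.
    digest = hashlib.md5(url.encode()).digest()
    return proxies[int.from_bytes(digest[:4], "big") % len(proxies)]

# Every participant computes the same owner for a given URL.
print(owner("http://example.com/index.html"))
```

The drawback noted above follows directly: a user's request may hash to a distant proxy, which is problematic when network distances between proxies and users are non-uniform.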

In this paper, we address the issue of scalable protocols for wide-area Web cache sharing. We first examine the benefits of Web cache sharing by analyzing a collection of Web access traces. We show that sharing cache contents among proxies significantly reduces traffic to Web servers, and that simple cache sharing, which imposes no coordination among cache replacements of proxies, suffices to obtain most of the benefits of fully coordinated caching. We also quantify the overhead of the ICP protocol by running a set of proxy benchmarks. The results show that even when the number of cooperating proxies is as low as four, ICP increases the inter-proxy traffic by a factor of 70 to 90, the number of network packets received by each proxy by 13% or more, and the CPU overhead by over 15%. In the absence of inter-proxy cache hits (also called remote cache hits), the overhead can increase the average user latency by up to 11%.

We then propose a new cache sharing protocol called Summary Cache. Under this protocol, each proxy keeps a compact summary of the cache directory of every other proxy. When a cache miss occurs, a proxy first probes all the summaries to see if the request might be a cache hit in another proxy, and sends query messages only to those proxies whose summaries show promising results. The summaries do not need to be accurate at all times. If the summary indicates a hit but the request is not actually a cache hit (a false hit), the penalty is a wasted query message. If the summary indicates a miss but the document is actually cached at another proxy (a false miss), the penalty is a higher miss ratio.
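The lookup path just described might be sketched as follows (a toy model: the summaries here are plain Python sets standing in for the compact summaries, and `query` is a hypothetical inter-proxy query function):

```python
def handle_miss(url, summaries, query):
    """On a local miss, probe the neighbors' summaries and query only
    those proxies whose summary suggests a possible hit."""
    candidates = [p for p, s in summaries.items() if url in s]
    for proxy in candidates:
        if query(proxy, url):   # a false hit costs one wasted query message
            return proxy        # fetch the document from this neighbor
    return None                 # fall through to the origin Web server

# p1's summary is stale: it still lists /b, which p1 has since evicted.
summaries = {"p1": {"/a", "/b"}, "p2": {"/b"}}
actual = {"p1": {"/a"}, "p2": {"/b"}}

def query(proxy, url):
    return url in actual[proxy]

print(handle_miss("/b", summaries, query))  # p1 is a false hit; p2 answers: p2
```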

We examine two key questions in the design of the protocol: the frequency of summary updates and the representation of the summaries. Using trace-driven simulations, we show that the update of summaries can be delayed until a fixed percentage (for example, 1%) of cached documents are new, and that the hit ratio degrades proportionally (for the 1% threshold, the degradation ranges from 0.02% to 1.7%, depending on the trace).
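The deferred-update policy can be sketched as a counter checked against a threshold (an illustrative model, not Squid's implementation; the class and method names are ours):

```python
class SummaryUpdater:
    """Defer broadcasting a new summary until the fraction of cached
    documents changed since the last broadcast exceeds a threshold
    (the experiments described above use 1%)."""
    def __init__(self, threshold=0.01):
        self.threshold = threshold
        self.changes = 0

    def record_change(self, cache_size):
        # Called whenever a document is added to or evicted from the cache.
        self.changes += 1
        if self.changes >= self.threshold * cache_size:
            self.changes = 0
            return True   # time to push a fresh summary to the peers
        return False

u = SummaryUpdater()
updates = sum(u.record_change(cache_size=1000) for _ in range(100))
print(updates)  # 10: with a 1% threshold and 1000 docs, every 10th change triggers
```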

To reduce the memory requirements, we store each summary as a ``Bloom filter'' [6]. This is a computationally efficient, hash-based probabilistic scheme that can represent a set of keys (in our case, a cache directory) with minimal memory, while answering membership queries with zero probability of false negatives and a low probability of false positives. Trace-driven simulations show that with typical proxy configurations, for N cached documents represented within just N bytes, the false-positive rate is 1% to 2%. The memory can be reduced further at the cost of an increased false-positive rate. (We describe Bloom filters in more detail later.)
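For concreteness, here is a minimal Bloom filter along the lines just described (an illustrative sketch, not the paper's implementation; deriving the k bit positions from a single MD5 digest is our simplification):

```python
import hashlib

class BloomFilter:
    """A Bloom filter: no false negatives, tunable false-positive rate.
    With 8 bits per key and k = 4 hash functions (roughly the
    N-bytes-for-N-documents setting above), the false-positive rate
    is on the order of a few percent."""
    def __init__(self, m_bits, k=4):
        self.m = m_bits
        self.k = k
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions from one 16-byte MD5 digest of the key.
        digest = hashlib.md5(key.encode()).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[4 * i: 4 * i + 4], "big") % self.m

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

bf = BloomFilter(m_bits=8 * 1000)        # 1000 bytes for ~1000 URLs
bf.add("http://example.com/page")
print("http://example.com/page" in bf)   # True: never a false negative
print("http://example.com/other" in bf)  # almost certainly False
```

A membership query touches only k bits, which is why probing every peer's summary on each miss is cheap.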

Based on these results, we design the Summary-Cache Enhanced ICP protocol and implement a prototype within the Squid proxy. Using trace-driven simulations as well as experiments with benchmarks and trace-replays, we show that the new protocol reduces the number of inter-proxy messages by a factor of 25 to over 60, reduces the network bandwidth consumption (in terms of bytes transferred) by over 50%, and eliminates 30% to 95% of the protocol CPU overhead. Compared with no cache sharing, our experiments show that the protocol incurs little network traffic and increases CPU time only by 5% to 12% depending on the remote cache hit ratio. Yet, the protocol achieves a cache hit ratio similar to the ICP protocol most of the time.

The results indicate that the Summary-Cache Enhanced ICP protocol can scale to a large number of proxies. Thus, it has the potential to significantly increase the deployment of Web cache sharing and reduce Web traffic on the Internet. Toward this end, we are making our implementation publicly available [15] and are in the process of transferring it to the ICP user community.


Pei Cao
7/5/1998