
Revisiting Cache Replacement Algorithms

 
  
Figure 4: Hit-ratio and byte hit-ratio for four algorithms for a one-week portion of the DEC traces and a two-day portion of the University of California at Berkeley traces. [Figure: hit.jgr.ps]

The model that we have discussed so far is known as the independent reference model [GCD73] in early operating system paging studies [Den80]. It is well known in the operating system caching community that if (i) the requests are independent and have fixed probabilities and (ii) the pages have the same size, then the optimal replacement algorithm is to keep the pages with the highest probabilities in the cache [GCD73]. In practice, an online algorithm identifies those documents by keeping track of the number of references to each document and keeping the documents with the highest reference counts in the cache. In other words, the best online algorithm under the independent reference model is the Least-Frequently-Used (LFU) algorithm.
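To make the optimality claim concrete, here is a minimal sketch (under the stated assumptions: unit-size pages and fixed, known request probabilities): for a cache holding k pages, the expected hit-ratio of the optimal policy is simply the sum of the k largest request probabilities. The Zipf-like toy probabilities below are illustrative and are not taken from any trace.

def irm_optimal_hit_ratio(probs, cache_size):
    """Expected hit-ratio of the optimal policy under the independent
    reference model: keep the cache_size most probable (unit-size) pages."""
    return sum(sorted(probs, reverse=True)[:cache_size])

# Toy example: probabilities proportional to 1/i over 1000 pages.
N = 1000
weights = [1.0 / i for i in range(1, N + 1)]
total = sum(weights)
probs = [w / total for w in weights]
print(irm_optimal_hit_ratio(probs, cache_size=100))  # roughly 0.69 in this toy setting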

However, there are two versions of LFU that are often confused in the literature: In-Cache-LFU and Perfect-LFU. The distinction is that Perfect-LFU remembers a page's reference counter even after the page is evicted from the cache, while In-Cache-LFU discards the counter together with the evicted page. Clearly, Perfect-LFU incurs more overhead and is less practical than In-Cache-LFU. However, Perfect-LFU is a better realization of the optimal cache replacement algorithm under the model.
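The sketch below (hypothetical class and method names, equal-size documents, and the simplifying choice of always admitting the requested document) shows that the only difference between the two variants is whether the reference counter survives eviction.

from collections import defaultdict

class LFUCache:
    """Minimal LFU sketch for equal-size documents.
    keep_counts_on_eviction=True  behaves like Perfect-LFU (counters persist),
    keep_counts_on_eviction=False behaves like In-Cache-LFU (the counter is
    discarded together with the evicted document)."""

    def __init__(self, capacity, keep_counts_on_eviction=True):
        self.capacity = capacity
        self.perfect = keep_counts_on_eviction
        self.counts = defaultdict(int)   # document -> reference count
        self.cache = set()               # documents currently in the cache

    def access(self, doc):
        self.counts[doc] += 1
        hit = doc in self.cache
        if not hit:
            if len(self.cache) >= self.capacity:
                # Evict the cached document with the lowest reference count.
                victim = min(self.cache, key=lambda d: self.counts[d])
                self.cache.remove(victim)
                if not self.perfect:
                    del self.counts[victim]   # In-Cache-LFU forgets the count
            self.cache.add(doc)
        return hit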

The question we wish to answer is: which of the four replacement algorithms--Perfect-LFU, In-Cache-LFU, LRU and GD-Size--performs the best in terms of hit-ratio? Note that LRU is the most widely-used Web cache replacement algorithm, while GD-Size is a new algorithm that takes both document size and locality into account and was shown to outperform existing algorithms in terms of hit-ratio [CI97]. We answer the above question using trace-driven simulations.
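For reference, a compact sketch of GD-Size with a uniform retrieval cost of 1, the setting used to maximize hit-ratio, follows (names are illustrative; [CI97] gives the full algorithm). Every cached document d carries a value H(d) = L + 1/size(d); on eviction the document with the smallest H is removed and the inflation value L is raised to that smallest H.

class GDSizeCache:
    """Sketch of GD-Size (GreedyDual-Size) with cost 1 for every document."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.L = 0.0       # inflation value
        self.H = {}        # document -> current value L + 1/size
        self.sizes = {}    # document -> size in bytes

    def access(self, doc, size):
        hit = doc in self.H
        if hit:
            self.H[doc] = self.L + 1.0 / size      # refresh value on a hit
        elif size <= self.capacity:
            # Evict smallest-H documents until the new document fits.
            while self.used + size > self.capacity:
                victim = min(self.H, key=self.H.get)
                self.L = self.H[victim]            # inflate L to the evicted value
                self.used -= self.sizes.pop(victim)
                del self.H[victim]
            self.H[doc] = self.L + 1.0 / size
            self.sizes[doc] = size
            self.used += size
        return hit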

Figure 4 shows the hit-ratios and byte hit-ratios for the four algorithms--In-Cache-LFU, Perfect-LFU, LRU, and GD-Size--under a one-week portion of the DEC traces [DEC96, 8/29/96 to 9/4/96] and a two-day portion of the University of California at Berkeley (UCB) traces [UCB96, 9/9/96 to 9/10/96]. Although we found that other traces produced similar results, we present only the results from these two traces.
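The two metrics plotted in Figure 4 are computed directly from the trace. A sketch of the bookkeeping follows, assuming a replacement-policy object with a hypothetical access(doc, size) method that returns True on a hit, as in the GD-Size sketch above.

def simulate(trace, cache):
    """Trace-driven simulation over (doc, size) pairs.
    hit-ratio      = fraction of requests served from the cache
    byte hit-ratio = fraction of requested bytes served from the cache"""
    hits = requests = hit_bytes = total_bytes = 0
    for doc, size in trace:
        requests += 1
        total_bytes += size
        if cache.access(doc, size):
            hits += 1
            hit_bytes += size
    return hits / requests, hit_bytes / total_bytes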

Figure 4 shows that, as predicted by the independent reference model, Perfect-LFU performs best in terms of byte hit-ratio in most cases. GD-Size still performs best in terms of hit-ratio for small cache sizes because GD-Size takes document size into account (that is, small documents are favored over large ones) whereas Perfect-LFU does not. Note that when cache sizes are large, Perfect-LFU performs comparably to GD-Size in hit-ratio and much better in byte hit-ratio. The figure also shows that In-Cache-LFU performs the worst and is consequently a poor choice for a Web cache replacement algorithm.

The main drawback of Perfect-LFU is that it requires additional space to maintain the counts for documents that are evicted from the cache. In addition, it fails to take document size into account. We are currently designing an algorithm that has a bounded space requirement and considers both reference counts and document size.

