Prefetching can be applied in three ways in the Web context: between Web servers and clients, between Web servers and proxies, and between proxies and clients.
Early studies mostly focused on prefetching between Web servers and clients, since relatively few proxies were deployed at the time. Padmanabhan and Mogul [17] analyze the latency reduction and network traffic of prefetching using Web server traces and trace-driven simulation. The prediction algorithm they use is also based on the PPM data compressor, but with order 1. The study shows that prefetching from Web servers to individual clients can reduce client latency by as much as 45%, at the expense of doubling the network traffic. Bestavros et al. [2] present a model for speculative dissemination of World Wide Web documents. The work shows that reference patterns observed at a Web server can be used as an effective source of information to drive prefetching, reaching conclusions similar to those in [17]. Finally, Cunha and Bestavros [5] use a collection of Web client traces to study how effectively a user's future Web accesses can be predicted from his or her past Web accesses. They show that a number of models work well and can be used in prefetching.
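To make the order-1 scheme concrete, the following sketch (our own illustration, not code from [17]; the class name and confidence threshold are invented for exposition) builds a first-order Markov model over an access stream and predicts likely next documents:

```python
from collections import defaultdict, Counter

class Order1Predictor:
    """First-order Markov predictor: counts which document tends
    to follow each document in the observed access sequence."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # counts[prev][next] = frequency
        self.prev = None

    def observe(self, doc):
        """Record one access and update the transition counts."""
        if self.prev is not None:
            self.counts[self.prev][doc] += 1
        self.prev = doc

    def predict(self, doc, threshold=0.25):
        """Return candidate next documents whose estimated
        probability exceeds the confidence threshold."""
        followers = self.counts[doc]
        total = sum(followers.values())
        if total == 0:
            return []
        return [d for d, n in followers.items() if n / total >= threshold]

# Example: after seeing a.html -> b.html twice, b.html is predicted to follow a.html.
p = Order1Predictor()
for doc in ["a.html", "b.html", "a.html", "b.html", "a.html"]:
    p.observe(doc)
print(p.predict("a.html"))  # ['b.html']
```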
We found that these early studies do not answer our questions about the performance of prefetching between clients and proxies. First, the studies investigate prefetching between user clients and Web servers; caching proxies were neither considered nor modelled. There is reason to believe that a proxy's capability to predict users' Web accesses differs from a server's. Since a proxy sees only the Web requests from its own users, its ability to predict is limited by the smaller sample size. On the other hand, a proxy can predict documents across Web servers, increasing its chances of reducing client latency. Second, most of the studies use Web server traces, which do not accurately capture user idle time. From a particular Web server's point of view, a user's accesses to other Web servers appear as idle time. Thus, it is difficult to estimate from Web server traces the true user idle time that can be exploited to push documents to users over the modem lines. Lastly, Cunha and Bestavros [5] use traces of client requests that are not filtered by the browser cache; in practice, a proxy only sees requests after they have been filtered by the browser cache. The presence of a browser cache often diminishes the effect of prefetching: a predicted document may already be in the cache, so there is no need to prefetch it. Thus, it is not clear whether the conclusions from this particular study are applicable to our investigation.
As proxies became more widespread, a number of studies investigated prefetching between Web servers and proxies [14,16,10,11]. Kroeger et al. [14] investigate the performance limits of prefetching between Web servers and proxies, and show that combining perfect caching and perfect prefetching at the proxies can at best reduce client latency by 60% for relatively high-bandwidth clients (i.e., those that access the Web through a LAN instead of modems). Markatos and Chronaki [16] propose that Web servers regularly push their most popular documents to Web proxies, which then push those documents to the clients. They evaluate the performance of the strategy using several Web server traces and find that the technique can anticipate more than 40% of a client's requests. The technique requires cooperation from the Web servers. Furthermore, since the server traces do not capture each user's idle time, the study did not evaluate the client latency reduction resulting from the technique. Wcol [10] is proxy software that prefetches documents from Web servers: it parses HTML files and prefetches the linked documents and embedded images. The proxy, however, does not pre-push the documents to the clients. Finally, Gwertzman and Seltzer [11,12] discuss a technique called Geographical Push-Caching, in which a Web server selectively sends its documents to the caches that are closest to its clients. The focus of the study is on deriving reasonably accurate network topology information and using that information to select caches.
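For illustration, a Wcol-style link-extraction pass might look like the following sketch (our own reconstruction; [10] does not specify the implementation), which collects anchor and image URLs from a fetched page as prefetch candidates:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect anchor hrefs and embedded image sources from an HTML
    page -- the candidates a Wcol-style proxy might prefetch."""

    def __init__(self):
        super().__init__()
        self.links, self.images = [], []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "img" and "src" in attrs:
            self.images.append(attrs["src"])

# Example usage on a fetched page body:
page = '<a href="next.html">next</a><img src="logo.gif">'
ex = LinkExtractor()
ex.feed(page)
print(ex.links, ex.images)  # ['next.html'] ['logo.gif']
```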
In terms of prefetching between proxies and clients, Loon and Bharghavan [15] propose the same idea as we describe here. However, their study did not address the potential and limits of the approach, or which prediction algorithms are most successful. Rather, it presents a design and an implementation of a proxy system that performs the pre-pushing. In this paper, we use recent, large-scale modem client traces to explore the design space and evaluate the tradeoffs in pre-pushing.
Outside the Web context, prefetching has been studied extensively in file systems and memory systems. Several studies investigate application-controlled prefetching in file systems [3,4,13,19], where an application supplies predictions of what it might access next. Other studies investigate speculative approaches to file prefetching [9,20,18]. The prediction algorithms used there are also derived from data compressors, though the choice of compressor and parameters varies.
Most of the algorithms, including the one studied here, are inspired by an important early study by Krishnan and Vitter demonstrating the relationship between data compression and prediction [21,6]. Their study shows that if the source behind a reference string can be modelled as an order-M Markov source, then the compressor-driven prediction algorithm converges to the optimal predictor given a long enough reference string. Our algorithms extend those described in [21] and [6] by considering prediction depths greater than 1; that is, they predict not only which documents might be used next, but also which documents will be used after the immediate next ones. Our experience shows that for best performance, the prediction depth needs to be larger than 1.
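To illustrate what a prediction depth greater than 1 means, the sketch below (a simplification for exposition, not the exact algorithm evaluated in this paper; the function name and threshold are ours) chains a first-order model forward, weighting a candidate at depth d by the product of transition probabilities along its path:

```python
from collections import defaultdict, Counter

def deeper_predictions(counts, start, depth=2, threshold=0.1):
    """Chain a first-order model forward: a document reachable in up
    to `depth` steps is a candidate if the product of transition
    probabilities along some path to it exceeds the threshold."""
    frontier = {start: 1.0}          # document -> best path probability
    candidates = Counter()
    for _ in range(depth):
        next_frontier = defaultdict(float)
        for doc, p in frontier.items():
            followers = counts.get(doc, {})
            total = sum(followers.values())
            for nxt, n in followers.items():
                q = p * n / total    # path probability through doc
                if q >= threshold:
                    next_frontier[nxt] = max(next_frontier[nxt], q)
                    candidates[nxt] = max(candidates[nxt], q)
        frontier = next_frontier
    return sorted(candidates, key=candidates.get, reverse=True)

# Example: having observed a -> b -> c, depth-2 prediction from 'a'
# suggests prefetching 'c' as well as 'b'.
counts = {"a": {"b": 2}, "b": {"c": 2}}
print(deeper_predictions(counts, "a"))  # ['b', 'c']
```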