Ever since the World-Wide Web becomes World-Wide Wait, reducing client latency has been among the primary concerns of the Internet research and development community. For a majority of the Internet population, who access the World-Wide Web through modem connections, the low modem bandwidth is a primary contribution to the client latency. For example, studies using commercial ISP traces shows that even if a Web proxy caching system is employed and has perfect performance, the client latency can only be reduced by 3% to 4% [7]. Other studies have shown that when image distillation techniques is used to reduce the document sizes, the client latency can be improved by a factor of 4 to 5, further demonstrating that the slow modem links are the bottleneck.
In this paper, we study one technique to reduce latency for modem users: prefetching between caching proxies and clients. The proxy, being exposed to the Web accesses of multiple users, can often predict what documents a user will access next. The modem link to the user often has idle periods as the user is reading the current Web document. Thus, a proxy can utilize the idle periods to ``push'' documents to the user's machine. If the user accesses any of the pushed documents, the latency perceived by the user is reduced. Since the proxy pushes the documents to the user side instead of the client fetching the documents from the proxy, we called this technique proxy-side pre-push. In the rest of the paper, we use the terms ``pre-push'' and ``prefetch'' interchangably.
We focus on the potential, limits and prediction mechanisms of the technique in reducing client latency. Using traces of modem users' Web accesses, we first study the upper bound on latency reduction from pre-push, assuming a perfect predictor at the proxy side. We analyze how this upper bound is determined by the inherent idle time distribution between requests from the same user, modem bandwidth, and client-size buffers.
We then study the prediction mechanisms for the pre-push approach. We investigate a family of prediction algorithms that are based on the Prediction by Partial Matching (PPM) data compression algorithm. We analyze the accuracy and bandwidth wastes of the algorithms as the algorithms' parameters vary. We also evaluate the impact of practical constraints such as pruning the history structure and interacting with the client-side buffers.
Finally, we evaluate the latency reduction under the prediction algorithms using trace-driven simulation. Our simulator takes into account of proxy latency and user request sequence. Our results show that with perfect predictors proxy-based Web pre-pushing can reduce user observed latency by over 20% and with real predictors can reduce user observed latency by nearly 10%.