The predictor uses the history structure to identify the set of pages a user is likely to access in the near future. It looks at the user's m most recent accesses. For each sequence of the last l requests, where l = m, ..., 1, it finds the corresponding tree and node in the history structure. The predictor then follows all paths from that node for k levels down the tree to calculate the relative probabilities of those pages being accessed in the near future. This produces, for each l, a list of candidate pages for prefetching. Each list is first sorted by probability; pages whose probability falls below a certain threshold t are then deleted. Finally, the lists are merged, with the lists corresponding to longer sequences placed first.
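The prediction step above can be sketched roughly as follows. This is a minimal illustration under our own assumptions, not the actual implementation: the names (Node, record, lookup, candidates, predict) and the window-based construction of the history trees are ours, chosen so the sketch is self-contained.

```python
class Node:
    """A node in one of the history trees; children map URL -> Node."""
    def __init__(self):
        self.count = 0          # number of recorded windows through this node
        self.children = {}      # URL -> Node

def record(trees, accesses, m, k):
    """Populate trees[l-1] (l = 1..m) from an access log by inserting,
    for every position, the l-URL context plus up to k following URLs."""
    for l in range(1, m + 1):
        root = trees[l - 1]
        for i in range(len(accesses) - l):
            node = root
            for url in accesses[i:i + l + k]:
                node = node.children.setdefault(url, Node())
                node.count += 1

def lookup(root, sequence):
    """Walk a tree along `sequence`; return None if the context is unseen."""
    node = root
    for url in sequence:
        node = node.children.get(url)
        if node is None:
            return None
    return node

def candidates(node, k):
    """Relative probability of each page reachable within k levels below
    the context node (descendant count / context count)."""
    probs = {}
    def walk(n, depth):
        for url, child in n.children.items():
            p = child.count / node.count
            probs[url] = max(probs.get(url, 0.0), p)  # keep best depth
            if depth + 1 < k:
                walk(child, depth + 1)
    walk(node, 0)
    return probs

def predict(trees, recent, m, k, t):
    """Merge the per-l candidate lists: longer contexts first, each list
    sorted by probability, pages below threshold t dropped."""
    merged, seen = [], set()
    for l in range(m, 0, -1):               # l = m, ..., 1
        node = lookup(trees[l - 1], recent[-l:])
        if node is None:
            continue
        ranked = sorted(candidates(node, k).items(), key=lambda kv: -kv[1])
        for url, p in ranked:
            if p >= t and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged
```

For example, after recording the log A B C A B C A B D with m = 2 and k = 1, the context (A, B) has been followed by C twice and D once, so predict ranks C before D, and a threshold t = 0.5 filters D out.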
The final merged list is then sent to the proxy. The proxy pushes the documents on the list, in the order specified, as long as the modem link is idle. When a new request from the user arrives, the ongoing push (if any) is stopped, and a new round of prediction and pushing begins.
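The proxy's push-until-interrupted behavior can be sketched as follows. This is a hypothetical helper under our assumptions: the callable-based interface for detecting a new user request and for sending a page is ours, standing in for whatever event mechanism the proxy actually uses.

```python
def push_while_idle(prefetch_list, new_request_arrived, send):
    """Push prefetch candidates in order while the link is idle.

    new_request_arrived: callable polled before each push; True means a
        new user request has arrived and the push must stop.
    send: callable that transmits one page over the (idle) modem link.
    Returns the pages actually pushed before the round was cut short.
    """
    pushed = []
    for page in prefetch_list:
        if new_request_arrived():
            break               # abort the ongoing push; a new round of
                                # prediction and pushing starts elsewhere
        send(page)
        pushed.append(page)
    return pushed
```

The key design point is that prefetched pushes never compete with demand traffic: the check before every transfer guarantees that at most one prefetched document is in flight when a real request arrives.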
Clearly, there are three parameters in the prediction mechanism: the number of past accesses m used as context, the number of levels k the predictor looks down the trees, and the probability threshold t.
The algorithm is similar to, yet different from, previously proposed prefetching algorithms. Padmanabhan and Mogul [17] studied the same algorithm, but with m always equal to 1. Krishnan and Vitter [6] also studied a PPM-based algorithm as one of their prefetching algorithms, but with k always equal to 1. The reason for considering algorithms with m > 1 is that more context can improve the accuracy of the prediction. The reason for considering algorithms with k > 1 is that a URL may not always be requested as the immediate next request after another URL, but rather within the next few requests. Thus, in some sense, our algorithm is the more comprehensive version among the existing algorithms.