There are a couple undesired effects if the history structure is allowed to continuously grow. First, the structure will grow very large which makes implementing the prefetch engine more costly. Second, users access patterns change over time and the history should dynamically adapt. Both of these issues are solved by what we refer to as aging.
All the nodes in the tree have a count, and we add to them a time of the last update. Periodically, in our case every hour, we initiate a pruning of the tree. When the tree is pruned all leaf nodes in the history structure are checked. For every eight hours (the threshold time used is arbitrary) of not being updated we half the count value by shifting the value right by one. We propagate these changes up by having the count of internal nodes be equal to the count of their immediate children. When a nodes count reaches zero we delete the node.
Choosing the threshold is complicated. If we make the threshold time used for pruning to short we will remove information from the tree that is still relevant and important for making predictions. If we make the threshold to long we have the problem that old information stays around to long. Old information can be a problem because we may issue a prefix for a page based on old information, or we may fail to issue a good prefetch for a page because with the old information the new dominate page to prefetch does not meet the prefetch threshold.
In case aging cannot remove enough history nodes to keep the structure under certain size, a ``pruning'' mechanism is triggered. The nodes that are least-recently traversed are replaced until the history structure is small enough.