Active Cache: Caching Dynamic Documents

Why Active Cache

Web content is becoming more dynamic. Web servers routinely set cookies, change advertising banners on each user request, answer queries, tailor news for each individual user, and perform many other tasks to make their content interesting and to learn about their viewers. Recent studies have shown that over 30% of user requests carry cookies. In addition, an increasing number of popular HTML documents are tailored for each client and dynamically generated (for example, the page).

As the Web becomes more dynamic, current proxy caches lose their ability to save bandwidth. Existing caches can only handle static documents. They cache HTTP responses as datagrams and forward them upon request. There is no support for access control from the server[*], content variation on each request, or per-client state (i.e., cookies). A Web server essentially loses control of a document once it is cached, and has to resort to ``cache busting'' to keep that control. With the trend toward more dynamic and more client-tailored content, the percentage of content that a proxy can cache dwindles.

Caching has been a key technology for resolving the inherent conflict between the tremendous growth of the Web and the slowly improving infrastructure of the Internet. If one does not want the World-Wide Web to remain the ``World-Wide Wait,'' one must find ways to cache dynamic content. The key to caching dynamic content is to treat documents not as datagrams, but as objects with specific processing applied on each request.

What is Active Cache

The WisWeb group presents Active Cache, a novel proxy caching paradigm that allows a server to migrate part of its per-request processing to the proxy.

A Web server can attach a Java application, which we call a ``cache applet,'' to a document. The cache applet is invoked whenever there is a cache hit on the document. In other words, if a proxy wants to cache the document, it should also fetch the corresponding cache applet. When a user request hits the cached copy and the proxy would like to service the request, the proxy must invoke the cache applet with the user request and other information as arguments. The cache applet then decides what the proxy will send back to the user.

Furthermore, the applet can deposit information in a log object, which is sent back to the server periodically and when the applet or the document is purged from the cache.
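The cache-hit flow described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the names `CacheApplet`, `AccessLog`, `BannerRotationApplet`, the `onCacheHit` signature, and the `<!--BANNER-->` placeholder are all hypothetical and are not taken from the Active Cache protocol or prototype. The sketch shows an applet that rotates advertising banners on each hit and deposits one log entry per request.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the real Active Cache applet API may look different.
interface CacheApplet {
    // Invoked by the proxy on a cache hit; returns the content to send to the user.
    String onCacheHit(String userRequest, String cachedDocument, AccessLog log);
}

// Hypothetical log object that the proxy would ship back to the server
// periodically and when the applet or document is purged.
final class AccessLog {
    private final List<String> entries = new ArrayList<>();

    void record(String entry) {
        entries.add(entry);
    }

    // Returns all pending entries and clears the log, as a proxy might do
    // just before sending the log to the origin server.
    List<String> drain() {
        List<String> copy = new ArrayList<>(entries);
        entries.clear();
        return copy;
    }
}

// Example applet: substitutes a different banner into the cached page on each hit.
final class BannerRotationApplet implements CacheApplet {
    private final String[] banners;
    private int next = 0;

    BannerRotationApplet(String... banners) {
        if (banners.length == 0) {
            throw new IllegalArgumentException("at least one banner is required");
        }
        this.banners = banners.clone();
    }

    @Override
    public String onCacheHit(String userRequest, String cachedDocument, AccessLog log) {
        String banner = banners[next];
        next = (next + 1) % banners.length;      // round-robin rotation
        log.record(userRequest + " -> " + banner); // access tracking for the server
        return cachedDocument.replace("<!--BANNER-->", banner);
    }
}

final class BannerDemo {
    public static void main(String[] args) {
        AccessLog log = new AccessLog();
        CacheApplet applet = new BannerRotationApplet("ad-A", "ad-B");
        String page = "<p><!--BANNER--></p>";
        System.out.println(applet.onCacheHit("GET /news user=1", page, log));
        System.out.println(applet.onCacheHit("GET /news user=2", page, log));
        System.out.println(log.drain());
    }
}
```

In this sketch the server still sees which banner each request received, because the log entries travel back with the drained log, while the page body itself is served from the proxy.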

Cache applets allow servers to obtain the benefit of proxy caching without losing the ability to track user accesses and tailor the content presentation dynamically. They can perform a variety of functions, for example, logging user accesses, rotating advertising banners, checking access permissions, and constructing client-specific Web pages. They also enable proxies to be more than just caches of static information: they become caches of objects, i.e., data with a method that is invoked when the data is supplied from the cache. In essence, they turn Web documents from datagrams into objects.

The proxy, when caching a document with a cache applet, is always free to send the user request directly to the server instead of invoking the applet. The proxy promises not to send back a cached copy of a document without invoking the corresponding cache applet. On the other hand, if a document is cached but the corresponding applet consumes too many resources, the proxy can simply send the request to the Web server. Furthermore, just as the proxy is not obligated to cache any document, it is also not obligated to cache any applet. The proxy agrees not to service a cache hit if the corresponding applet is not in the cache.
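The proxy-side rules in the paragraph above can be summarized in a small decision function. The sketch below is an assumption-laden illustration, not the prototype's logic: the class `ProxyPolicySketch`, the `Action` enum, and the notion of a numeric applet cost and budget are all hypothetical, and the model covers only documents that carry a cache applet.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the proxy's obligations for applet-bearing documents.
final class ProxyPolicySketch {
    enum Action { INVOKE_APPLET, FORWARD_TO_SERVER }

    private final Set<String> cachedDocuments = new HashSet<>();
    private final Set<String> cachedApplets = new HashSet<>();
    private final long costBudget; // assumed unit-less resource budget per request

    ProxyPolicySketch(long costBudget) {
        this.costBudget = costBudget;
    }

    void cacheDocument(String url) {
        cachedDocuments.add(url);
    }

    void cacheApplet(String url) {
        cachedApplets.add(url);
    }

    Action decide(String url, long estimatedAppletCost) {
        boolean documentHit = cachedDocuments.contains(url);
        boolean appletHit = cachedApplets.contains(url);
        // A cached copy is never served without its applet, so a missing
        // document or a missing applet both mean the server handles the request.
        if (!documentHit || !appletHit) {
            return Action.FORWARD_TO_SERVER;
        }
        // The proxy may decline to run an applet that is too expensive.
        if (estimatedAppletCost > costBudget) {
            return Action.FORWARD_TO_SERVER;
        }
        return Action.INVOKE_APPLET;
    }
}
```

Note that there is deliberately no third action that serves the cached copy directly: in this model the only way a cached document reaches the user is through its applet.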

The Active Cache Protocol

Example Applications

Where Can I Get a Prototype

The WisWeb research group has implemented a prototype of the Active Cache proxy in the CERN httpd proxy daemon. Download a copy of ActiveCache1.0 here.

Example Cache Applets

Pei Cao