next up previous
Next: Server Processes Up: Wisconsin Proxy Benchmark Previous: Master Process

Client Processes

The client process runs on the client machine and issues HTTP requests one after another with no think time in between. This means that clients send requests as fast as the proxy can handle them. The client process takes the following parameters as command line arguments: the URL address of the proxy server (e.g. cowb05.cs.wisc.edu:3128/), the number of HTTP requests to issue, a seed for the random number generator, and the name of the configuration file specifying the Web servers to which the client should send requests. Currently, the clients send HTTP requests in the format of ``GET http://server_name:port_number/dummy[filenum].html HTTP/1.0'', for example, ``GET http://cowb06.cs.wisc.edu:8005/dummy356.html HTTP/1.0''. The server_name, port_number and filenum vary from request to request. Clearly, our client code does not yet include other types of HTTP requests or HTTP 1.1 persistent connections. We plan to address this soon, once we learn more about the typical mix of HTTP requests from clients and the characteristics of most persistent connections.
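As a rough illustration, the request line above can be assembled as follows. This is only a hedged sketch in Python (the actual client is a separate program, and the helper name here is hypothetical); note that the full absolute URL appears in the request line because the request is addressed to a proxy.

```python
def request_line(server_name: str, port_number: int, filenum: int) -> str:
    # Absolute-URL form of an HTTP/1.0 request line, as sent to a proxy.
    # The dummy[filenum].html naming follows the format described above.
    return (f"GET http://{server_name}:{port_number}/dummy{filenum}.html "
            f"HTTP/1.0\r\n\r\n")

print(request_line("cowb06.cs.wisc.edu", 8005, 356))
```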

The client process varies the server_name, port_number and filenum of each request so that the request stream has a particular inherent hit ratio and follows the temporal locality pattern observed in most proxy traces. The client process sends requests in two stages. During the first stage, the client sends N requests, where N is the command line argument specifying the number of requests to be sent. For each request, the client picks a random server, picks a random port at the server, and sends an HTTP request with the filenum increasing from 1 to N. Thus, during the first stage there is no cache hit in the request stream, since the file number increases from 1 to N. These requests serve to populate the cache, and also stress the cache replacement mechanisms in the proxy. The requests are all recorded in an array that is used in the second stage.
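The first stage can be sketched as follows. This is a minimal Python rendition under assumed data shapes (the server list is taken here as (name, ports) pairs, which is an assumption about the configuration file, not the actual WPB code):

```python
import random

def stage_one(servers, n, seed=0):
    """First-stage request stream: filenum runs 1..N, so every request is
    a compulsory miss; server and port are chosen at random per request.
    Returns the history array that the second stage draws from.

    servers: list of (server_name, [port, ...]) pairs -- an assumed shape.
    """
    rng = random.Random(seed)
    history = []
    for filenum in range(1, n + 1):
        server_name, ports = rng.choice(servers)  # random server
        port_number = rng.choice(ports)           # random port on that server
        history.append((server_name, port_number, filenum))
    return history
```

Because filenum is strictly increasing, no URL repeats in this stage, which is what guarantees a zero hit ratio while the cache fills.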

During the second stage, the client also sends N requests, but for each request it picks a random number and takes different actions depending on its value. If the number is higher than a certain constant, a new request is issued. If the number is lower than the constant, the client re-issues a request that it has issued before. Thus, the constant is the inherent hit ratio of the request stream. If the client needs to re-issue an old request, it chooses the request it issued t requests ago with probability proportional to $\frac{1}{t}$. More specifically, the client program maintains the sum of $\frac{1}{t}$ for t from 1 to the number of requests issued so far (call it S). Every time it has to issue an old request, it picks a random number from 0 to 1 (call it r), calculates r*S, and chooses t such that ${\displaystyle \sum_{i=1}^{t-1} \frac{1}{i}} \;\; < \;\; r*S \;\; \le \;\; {\displaystyle \sum_{i=1}^{t} \frac{1}{i}}$. In essence, t is chosen with probability $\frac{1}{S*t}$.
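The sampling scheme above can be sketched in Python as follows. This is a hedged illustration, not the WPB client itself: the partial-sum walk implements the inequality in the text, and the new-request branch uses a hypothetical placeholder server rather than the real server/port selection.

```python
import random

def pick_reuse_distance(m: int, rng: random.Random) -> int:
    """Choose a reuse distance t in [1, m] with probability proportional
    to 1/t: draw r in [0, 1), scale by S = sum_{i=1..m} 1/i, and walk the
    partial sums until they first exceed r*S."""
    s = sum(1.0 / i for i in range(1, m + 1))
    target = rng.random() * s
    partial = 0.0
    for t in range(1, m + 1):
        partial += 1.0 / t
        if target < partial:
            return t
    return m  # guard against floating-point round-off

def stage_two_request(history, hit_ratio, next_filenum, rng):
    """One stage-two decision: with probability hit_ratio, re-issue the
    request made t requests ago (t drawn with the 1/t rule); otherwise
    issue a brand-new request (placeholder server name is hypothetical)."""
    if rng.random() < hit_ratio and history:
        t = pick_reuse_distance(len(history), rng)
        req = history[-t]  # the request issued t requests ago
    else:
        req = ("server.example", 8000, next_filenum)
        next_filenum += 1
    history.append(req)
    return req, next_filenum
```

Small reuse distances are strongly favored (probability $\frac{1}{S*t}$ decays with t), which is what produces the temporal locality described next.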

The above temporal locality pattern is chosen based on a number of studies of the locality in Web access streams seen by proxies. (We have inspected the locality curves of the request streams generated by our code and found them to be similar to those obtained from traces.) Note that we only capture temporal locality, and do not model spatial locality at all. We plan to include spatial locality models when we have more information.

Finally, the inherent hit ratio in the second stage of requests can be specified in the configuration file. The default value is 50%.


Pei Cao
4/13/1998