Characterization of Web Proxy Traffic and Wisconsin Proxy Benchmark 2.0
(Position Paper for W3C Web Characterization Workshop)

Pei Cao
Department of Computer Sciences
University of Wisconsin, Madison
1210 West Dayton Street
Madison, WI 53706 USA
cao@cs.wisc.edu

Introduction

The WisWeb research group at the University of Wisconsin-Madison is focusing on two aspects of Web characterization: studying Web proxy traffic and building a Web proxy benchmark. This paper reports our progress and current plans.

Analysis of Web Proxy Traffic

Using six traces from proxies at academic institutions, corporations, and ISPs, we have studied a range of characteristics of the requests seen by the proxies. The traces are a 26-day proxy log from DEC, a 19-day trace from UC Berkeley, a three-month trace from the CS Dept. at Universita di Pisa, Italy, a 7-day trace from Questnet (which operates parent proxies serving child proxies in Australia), a one-day log from NLANR's proxies, and a 10-day log from FUNET, a regional ISP for academic and research communities in Finland. Our main findings are:

These results are reported in more detail in our paper "Web Caching and Zipf-like Distributions: Evidence and Implications", available at http://www.cs.wisc.edu/~cao/papers/zipf-implications.html. Due to space limitations, we do not elaborate further here.

Building a Proxy Benchmark

We developed a simple proxy benchmark, the Wisconsin Proxy Benchmark (WPB) 1.0, in fall 1997 and used it to compare a variety of commercial and freeware proxy software [1]. Others have also used the benchmark to measure proxy performance and to project the performance benefits of proxy caching. The benchmark emulates server delays and models temporal locality in the request stream. However, its use also exposed several weaknesses, including overhead at the client end, failure to model persistent connections and HTTP 1.1, and failure to capture spatial locality and URL path length.

We are now developing Wisconsin Proxy Benchmark (WPB) 2.0. It uses the core engine of httperf [3], a very lightweight Web server benchmarking and measurement tool. The benchmark already supports persistent connections and HTTP 1.1, and it replays traces as accurately as is possible at user level. We are adding temporal locality, spatial locality, and a variety of other features described below.
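
As an illustration of what modeling temporal locality in a request stream can mean, the following Python sketch generates document ids in which a configurable fraction of requests re-reference a recently requested document. The LRU-stack approach, the function name generate_requests, and the parameters p_repeat, stack_size and mean_depth are assumptions made for this example; the text above does not specify the model that WPB uses.

```python
import random
from collections import deque


def generate_requests(n, p_repeat=0.5, stack_size=100, mean_depth=10.0, seed=0):
    """Return a list of n document ids with simple temporal locality.

    With probability p_repeat a request re-references a document taken
    from an LRU stack of recently requested documents, favoring small
    stack depths; otherwise it references a brand-new document.
    All parameters are hypothetical knobs for this sketch.
    """
    rng = random.Random(seed)
    recent = deque(maxlen=stack_size)  # index 0 is the most recently used
    next_id = 0
    stream = []
    for _ in range(n):
        if recent and rng.random() < p_repeat:
            # Draw a stack depth from an exponential distribution and
            # clamp it to the current stack height.
            depth = min(int(rng.expovariate(1.0 / mean_depth)), len(recent) - 1)
            doc = recent[depth]
            recent.remove(doc)
        else:
            doc = next_id
            next_id += 1
        recent.appendleft(doc)  # move (or add) the document to the top
        stream.append(doc)
    return stream


if __name__ == "__main__":
    requests = generate_requests(1000)
    print(len(requests), "requests to", len(set(requests)), "distinct documents")
```

Setting p_repeat to 0 yields a stream with no repeated documents, which gives a convenient baseline when checking how much of a proxy's hit ratio comes from locality in the workload.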

Requirements of a Proxy Benchmark

Through our experience of using WPB 1.0 to compare proxy products, we find that a proxy benchmark should at least reflect the following characteristics of real-life proxy traffic:

Finally, the benchmark should measure not only client latency, outgoing traffic, and errors, but also the fairness of the proxy. We have seen that process-based proxies can introduce significant unfairness in client latency, whereas event-driven proxies such as Squid treat requests much more fairly.
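
One possible way to quantify fairness in client latency is sketched below in Python. It computes Jain's fairness index over the mean latency observed by each client: a value of 1.0 means every client saw the same mean latency, and a value of 1/n means a single client out of n accounted for all of it. The choice of Jain's index, the function names, and the (client id, latency) sample format are assumptions for this example; the text above does not say which fairness metric the benchmark reports.

```python
from collections import defaultdict


def per_client_mean_latency(samples):
    """Map each client id to its mean latency.

    samples is an iterable of (client_id, latency_seconds) pairs;
    this record layout is hypothetical.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for client_id, latency in samples:
        totals[client_id] += latency
        counts[client_id] += 1
    return {client: totals[client] / counts[client] for client in totals}


def jain_fairness(values):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        return 1.0  # all values are zero, so they are trivially equal
    return (total * total) / (len(values) * squares)


if __name__ == "__main__":
    samples = [("c1", 0.10), ("c1", 0.12), ("c2", 0.11), ("c3", 0.90)]
    means = per_client_mean_latency(samples)
    print(means, jain_fairness(means.values()))
```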

Wisconsin Proxy Benchmark (WPB) 2.0

We are constructing WPB 2.0, which consists of client-side code and server-side code. All request generation and distribution fitting is done in the client-side code: the client generates a request, sets its URL, and then generates the response status code, type, size, and latency. The server side is a simple pseudo-server that returns the specified number of random bytes with the specified status code and document type, and emulates packet delays based on the specified latency.
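
The division of labor between the two sides can be sketched in Python as follows. The client encodes the desired response in the request, and the pseudo-server simply obeys it. Carrying the specification in the URL query string, the parameter names (status, type, size, latency_ms), and the function names are assumptions made for this example; the text above does not describe how WPB 2.0 actually transmits the specification, and a single sleep is only a coarse stand-in for per-packet delay emulation.

```python
import os
import time
from urllib.parse import parse_qs, urlencode, urlsplit


def build_request_url(path, status, content_type, size, latency_ms):
    """Client side: attach the desired response properties to the URL."""
    query = urlencode({
        "status": status,
        "type": content_type,
        "size": size,
        "latency_ms": latency_ms,
    })
    return f"{path}?{query}"


def serve(url, sleep=time.sleep):
    """Pseudo-server side: produce the response the client asked for.

    Returns (status, content_type, body). The sleep callable is injectable
    so the delay can be observed or skipped in tests.
    """
    params = parse_qs(urlsplit(url).query)
    status = int(params["status"][0])
    content_type = params["type"][0]
    size = int(params["size"][0])
    latency_ms = int(params["latency_ms"][0])
    sleep(latency_ms / 1000.0)   # emulate the server delay
    body = os.urandom(size)      # random bytes of the requested size
    return status, content_type, body


if __name__ == "__main__":
    url = build_request_url("/doc/1.html", 200, "text/html", 512, 30)
    status, content_type, body = serve(url)
    print(url, status, content_type, len(body))
```

Keeping every modeling decision on the client side means the server needs no knowledge of the workload, so the same pseudo-server can answer both synthetic and trace-driven requests.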

Our client and server code is built by modifying the httperf tool. httperf is extremely lightweight: it uses no threads or processes, relying instead on an event-driven architecture. It handles various scalability bottlenecks at the client side, including the limited number of ephemeral ports (see [3]). It supports persistent connections and HTTP 1.1 range requests. The original httperf implements only the client part; we have changed httperf extensively to provide a server counterpart.
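
The ephemeral port bottleneck can be illustrated with a little arithmetic. A client that opens a new TCP connection for each request consumes one ephemeral port per connection, and after the client closes the connection the port typically remains in the TIME_WAIT state for some time before it can be reused. The Python sketch below estimates the resulting ceiling on the sustained connection rate from a single client address. The port range and TIME_WAIT duration used here are example values chosen for illustration, not measurements from WPB or httperf; actual values depend on the operating system and its configuration.

```python
def max_connection_rate(first_port, last_port, time_wait_seconds):
    """Upper bound on new connections per second from one client address.

    Assumes every connection uses a distinct ephemeral port and that each
    port is unavailable for time_wait_seconds after the connection closes.
    """
    if last_port < first_port or time_wait_seconds <= 0:
        raise ValueError("invalid port range or TIME_WAIT duration")
    ports = last_port - first_port + 1
    return ports / time_wait_seconds


if __name__ == "__main__":
    # Example values only: ports 1024..4999 and a 60-second TIME_WAIT.
    rate = max_connection_rate(1024, 4999, 60.0)
    print(f"at most about {rate:.0f} new connections per second")
```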

Our benchmark can replay proxy logs faithfully. The client-side code reads the trace and, for each entry, generates a request carrying the specifications of size, latency, etc. The server-side code then responds accordingly.

The trace replay tool offers a valuable service to any institution wanting to evaluate the benefit of caching proxies. The institution can replay a portion of its log and immediately obtain numbers such as user latency reduction and Internet traffic reduction.
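
To give a feel for the kind of numbers a replay can produce, the Python sketch below computes the hit ratio and byte hit ratio of a log under an idealized cache of unlimited size, without contacting any proxy. The byte hit ratio is a rough upper bound on the Internet traffic reduction. The (url, size) record layout and the function name are assumptions for this example; real proxy logs carry more fields, and a real proxy has a finite cache and must also handle consistency and uncacheable responses, so measured results from an actual replay will be lower.

```python
def replay_infinite_cache(trace):
    """Return (hit_ratio, byte_hit_ratio) for an unbounded cache.

    trace is an iterable of (url, size_in_bytes) pairs; this layout is a
    hypothetical simplification of a proxy log entry.
    """
    cached = set()
    requests = hits = 0
    total_bytes = hit_bytes = 0
    for url, size in trace:
        requests += 1
        total_bytes += size
        if url in cached:
            hits += 1
            hit_bytes += size
        else:
            cached.add(url)  # first reference: fetch and store
    hit_ratio = hits / requests if requests else 0.0
    byte_hit_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_ratio, byte_hit_ratio


if __name__ == "__main__":
    trace = [("/a", 100), ("/b", 300), ("/a", 100), ("/a", 100)]
    print(replay_infinite_cache(trace))
```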

We are now working on the modeling part of the client-side code, hoping to incorporate all of the items listed above.

Summary

We have described our current results from analyzing six Web proxy traces and our plans for building the next version of the Wisconsin Proxy Benchmark. A few data items needed to build a realistic benchmark are still missing, including the average URL path component length, the average number of requests serviced by a persistent connection, and the percentage of persistent connections. New traces that can provide such information would be highly appreciated.

References

1
Jussara Almeida and Pei Cao.
Measuring proxy performance with the Wisconsin Proxy Benchmark.
Technical Report 1373, Computer Science Department, University of Wisconsin-Madison; presented at the 3rd Web Caching Workshop, February 1998.
URL http://www.cs.wisc.edu/~cao/papers/cao-wpb/index.html.

2
Paul Barford and Mark Crovella.
Generating representative web workloads for network and server performance evaluation.
In Proceedings of the SIGMETRICS/Performance'98, June 1998.
Can be found at http://www.cs.bu.edu/faculty/crovella/papers.html.

3
David Mosberger and Tai Jin.
httperf--a tool for measuring web server performance.
In Proceedings of the 1998 SIGMETRICS Workshop on Internet Server Performance, June 1998.
URL http://www.cs.wisc.edu/~cao/WISP98/html-versions/davidm/httperf/index.html.

4
Luigi Rizzo and Lorenzo Vicisano.
Replacement policies for a proxy cache.
Technical Report RN/98/13, University College London, Department of Computer Science, Gower Street, London WC1E 6BT, UK, 1998.
http://www.iet.unipi.it/~luigi/caching.ps.gz.

5
Lixia Zhang.
Personal communication, 1998.

Pei Cao
10/12/1998