As the World Wide Web continues to grow, caching proxies have become a critical component for handling Web traffic and for reducing both network traffic and client latency. However, despite their importance, there has been little understanding of how different proxy servers perform relative to one another, and how they behave under different workloads. The design and implementation of a proxy benchmark is the first step toward making it easy to test and understand the performance characteristics of a proxy server. A benchmark allows customers not only to test the performance of a proxy running on different software and hardware platforms, but also to compare different proxy implementations and choose the one that best matches their requirements. The Wisconsin Proxy Benchmark (WPB) has been developed to provide a tool for analyzing and predicting the performance of different proxy products in real-life situations.
The main feature of WPB is that it tries to replicate the workload characteristics found in real-life Web proxy traces. WPB consists of Web client and Web server processes. First, it generates server responses whose sizes follow the heavy-tailed Pareto distribution described in [4]; in other words, it includes very large files with a non-negligible probability. This is important because the heavy-tailed distribution of file sizes impacts proxy behavior, as the proxy must handle (and store in its local cache) files with a wide range of sizes. Second, the benchmark generates a request stream with the same temporal locality as that found in real proxy traces. Studies have shown that the probability that a document is requested t requests after the last request to it is proportional to 1/t [8,3]. The benchmark replicates this probability distribution and measures the hit ratio of the proxy cache. Third, the benchmark emulates Web server latency by having the server process delay sending responses back to the proxy. This is necessary because the benchmark is often run in a local area network, where there is no natural way to incur long latencies when fetching documents from the servers. Web server latencies, however, affect the resource requirements at the proxy system, particularly the number of open network descriptors, and must therefore be modelled. Thus, the benchmark supports configurable server latencies when testing proxy systems.
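To make the workload model concrete, the sketch below shows one way these three components could be generated. It is only an illustration under assumed parameter values (the Pareto tail index, minimum size, re-reference probability, and delay bounds are not taken from WPB), and the function names are hypothetical rather than part of the benchmark's source code.

    # Illustrative sketch (not the WPB sources) of the three workload components.
    import random
    import time

    ALPHA = 1.1         # Pareto tail index; heavy-tailed as in [4] (value assumed)
    MIN_SIZE = 1024     # smallest response body, in bytes (assumed)
    STACK_DEPTH = 1000  # how far back the temporal-locality model looks (assumed)

    def pareto_response_size():
        """Draw a response size from a Pareto distribution: P[X > x] = (xm/x)^alpha."""
        u = random.random()
        return int(MIN_SIZE / (1.0 - u) ** (1.0 / ALPHA))

    class TemporalLocalityStream:
        """Request stream in which the probability of re-requesting the document
        last seen t requests ago is proportional to 1/t, as observed in [8,3]."""

        def __init__(self, p_repeat=0.5):
            self.history = []          # most recently requested document first
            self.next_new_id = 0
            self.p_repeat = p_repeat   # fraction of requests that revisit a document (assumed)

        def next_request(self):
            if self.history and random.random() < self.p_repeat:
                # Pick a past document with probability proportional to 1/t.
                depth = min(len(self.history), STACK_DEPTH)
                weights = [1.0 / t for t in range(1, depth + 1)]
                doc = random.choices(self.history[:depth], weights=weights)[0]
            else:
                # Otherwise request a document that has never been seen before.
                doc = "/doc%d.html" % self.next_new_id
                self.next_new_id += 1
            self.history.insert(0, doc)
            return doc

    def serve_with_latency(body, delay_low=0.1, delay_high=3.0):
        """Emulate wide-area server latency by delaying the response (bounds assumed)."""
        time.sleep(random.uniform(delay_low, delay_high))
        return body

In the benchmark itself these roles are split between separate client and server processes; the sketch only illustrates the shape of the distributions involved.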
The main performance data collected by the benchmark are latency, proxy hit ratio, byte hit ratio, and the number of client errors. There is no single performance number, since different environments weight the four performance metrics differently. Proxy throughput is estimated by dividing the request rate by the request latency.
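For clarity, the two cache-related metrics can be computed from a per-request log as in the sketch below; the log format (a hit flag plus the response size in bytes) is an assumption made for illustration, not the benchmark's actual output format.

    def cache_metrics(log):
        """log: iterable of (was_hit, size_bytes) pairs, one per completed request."""
        requests = hits = total_bytes = hit_bytes = 0
        for was_hit, size_bytes in log:
            requests += 1
            total_bytes += size_bytes
            if was_hit:
                hits += 1
                hit_bytes += size_bytes
        hit_ratio = hits / requests if requests else 0.0            # fraction of requests served from cache
        byte_hit_ratio = hit_bytes / total_bytes if total_bytes else 0.0  # fraction of bytes served from cache
        return hit_ratio, byte_hit_ratio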
Using the benchmark, we compare the performance of four popular proxy servers - Apache, CERN, Squid, and a CERN-derived commercial proxy - running on the same hardware and software platform. We find that caching incurs significant overhead in terms of client latency in all proxy systems. We also find that the different implementation styles of the proxy software result in quite different performance characteristics, including hit ratio, client latency, and client connection errors. In addition, the proxies stress the CPU and disks differently.
We then use the benchmark to analyze the impact of adding one extra disk on the overall performance of the proxies. We can experiment only with Squid and the CERN-derived proxy, as neither Apache nor CERN allows the cache storage to be spread over multiple disks. The results show that the disk is the main bottleneck during the operation of busy proxies. Although this bottleneck is reduced when an extra disk is added to the system, the overall proxy performance does not improve as much as we expected; in fact, Squid's performance remains about the same.
Finally, using a client machine that can emulate multiple modem connections [2], we analyze the behavior of the four proxies when they must handle requests sent by clients connected through very low-bandwidth links. The results show that, in this case, transmission delays are the main component of latency and the low-bandwidth effect clearly dominates the overall performance. As a consequence, client latency increases by more than a factor of two, and caching does not reduce client latency significantly.
This paper is organized as follows. Section 2 presents a detailed description of the design and implementation of WPB. Section 3 presents a performance comparison of four popular proxy servers. Section 5 and Section 6 give some insight into the effect of multiple disks and of low-bandwidth connections on proxy performance. Section 7 discusses related work. Finally, Section 8 presents our conclusions and future work.