Next: References
Up: Measuring Proxy Performance with
Previous: Related Work
In this paper, we have described the design of the Wisconsin Proxy Benchmark
and the results of using the benchmark to compare four proxy implementations.
We also used the benchmark to study the effects of extra disk arms and slow
modem client connections.
Our main findings are the following:
- Disk is the main bottleneck during the operation of busy proxies:
the disk is busy up to 90% of the time while the CPU is idle more than
70% of the time. Adding an extra disk reduces the disk bottleneck.
However, for Squid, this reduction did not translate into an improvement
in the overall performance of the proxy. For proxy N, an improvement of
10% was achieved.
- When a proxy must handle requests sent through very low bandwidth
connections, the time spent in the network dominates. Both the disk and
the CPU remain idle more than 70% of the time. As a consequence, proxy
throughput decreases and client latency increases by more than a factor
of two.
- The performance of CERN and Squid is comparable, despite vast
differences in their implementations.
Squid mainly suffers from not being able to use the extra processor in
the multi-processor system.
CERN, on the other hand, uses a process-based structure and utilizes
both processors. In addition, CERN takes advantage of
the file buffer cache, which seems to perform reasonably well.
- Process-based proxy implementations must take care to avoid client
connection errors. Both CERN and proxy N (a CERN-derived proxy) cannot
handle all requests, and many errors occur. These errors reflect the fact
that many requests were dropped due to overflow in the pending connection
queue. For proxy N, this can be explained by the fact that the number of
proxy processes handling requests is fixed. For CERN, the overhead of
forking a new process for each request may be an explanation: the master
process (responsible for spawning new processes) cannot keep up with the
requests. In contrast, no errors were observed for Apache;
its dynamic adjustment of the number of processes appears to be
successful. Squid does not suffer from this problem because of its
event-driven architecture.
- In terms of latency, Apache has the worst performance, probably
due to its two-phase store, which introduces extra overhead. Proxy N has
slightly better performance overall. However, this may be a
consequence of its large number of errors: since fewer requests are
actually handled, delays due to contention are reduced.
- In terms of hit ratios, Squid and Apache maintain roughly
constant hit ratios across load levels. For both CERN and proxy N, the hit
ratio decreases significantly as the number of clients increases.
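The pending-connection-queue overflow behind the connection errors noted above can be illustrated with a small simulation. The model below is a sketch under simplifying assumptions, not a measurement from the paper: one request arrives per time tick, a fixed pool of worker processes (as in proxy N) each takes `service_time` ticks per request, and any request that finds the bounded queue full is dropped, as with a full TCP listen backlog. All parameter values are hypothetical.

```python
from collections import deque

def simulate_proxy(n_requests, n_workers, queue_limit, service_time):
    """Count requests dropped by a proxy with a fixed worker pool and a
    bounded pending-connection queue (illustrative model only)."""
    free_at = [0] * n_workers  # tick at which each worker is next free
    pending = deque()
    dropped = 0
    for t in range(n_requests):
        pending.append(t)                  # one new connection per tick
        for i in range(n_workers):         # free workers drain the queue
            if pending and free_at[i] <= t:
                pending.popleft()
                free_at[i] = t + service_time
        while len(pending) > queue_limit:  # queue overflow: the excess
            pending.pop()                  # connections are refused
            dropped += 1
    return dropped

# An undersized fixed pool drops requests; enough capacity drops none.
overloaded = simulate_proxy(100, n_workers=2, queue_limit=5, service_time=10)
healthy = simulate_proxy(100, n_workers=20, queue_limit=5, service_time=1)
```

In this model, drops disappear only when worker capacity keeps pace with arrivals, which is consistent with the observation that Apache's dynamically sized process pool and Squid's event-driven loop avoided the errors seen with fixed pools.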
Clearly, much more work remains to be done.
First of all, the Wisconsin Proxy Benchmark should model spatial locality
and HTTP 1.1 persistent connections; we are currently working on this
issue.
Second, the performance of Squid is baffling and we are instrumenting the
code to gain a better understanding.
Third, we need to perform more experiments to better understand the impact
of slow modem client connections and ways to improve proxy performance in
those contexts.
Lastly, we plan to investigate the effect of application-level main-memory
caching for hot documents and its proper implementation (e.g., avoiding
double buffering with the operating system's file buffer cache).
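One possible shape for such an application-level main-memory cache is sketched below. The class name, byte-capacity interface, and the LRU eviction policy are all assumptions for illustration; the paper does not prescribe a design.

```python
from collections import OrderedDict

class HotDocumentCache:
    """Minimal sketch of an in-memory cache for hot documents,
    with LRU eviction bounded by a byte budget (hypothetical design)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.docs = OrderedDict()  # url -> body bytes, in LRU order

    def get(self, url):
        body = self.docs.get(url)
        if body is not None:
            self.docs.move_to_end(url)  # mark as most recently used
        return body

    def put(self, url, body):
        if url in self.docs:
            self.used -= len(self.docs.pop(url))
        self.docs[url] = body
        self.used += len(body)
        while self.used > self.capacity:
            _, evicted = self.docs.popitem(last=False)  # evict LRU entry
            self.used -= len(evicted)
```

Serving hot documents from a cache like this avoids a disk read entirely, but unless the proxy also bypasses or limits the kernel's file buffer cache for those objects, the same bytes may be held twice, which is the double-buffering concern mentioned above.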
Pei Cao
4/13/1998