Computer Sciences Dept.

Cristian Estan

New Directions in Traffic Measurement and Accounting: Focusing on the Elephants, Ignoring the Mice
Cristian Estan, George Varghese
ACM Transactions on Computer Systems, August 2003

Accurate network traffic measurement is required for accounting, bandwidth provisioning, and detecting DoS attacks. These applications see the traffic as a collection of flows they need to measure. As link speeds and the number of flows increase, keeping a counter for each flow becomes either too expensive (using SRAM) or too slow (using DRAM). The current state-of-the-art methods (Cisco's sampled NetFlow), which count periodically sampled packets, are slow, inaccurate and resource-intensive. Previous work showed that at different granularities a small number of "heavy hitters" accounts for a large share of traffic. Our paper introduces a paradigm shift by concentrating the measurement process on large flows only: those above some threshold such as 0.1% of the link capacity.

We propose two novel and scalable algorithms for identifying the large flows: sample and hold, and multistage filters. Both take a constant number of memory references per packet and use a small amount of memory. If M is the available memory, we show analytically that the errors of our new algorithms are proportional to 1/M; by contrast, the error of an algorithm based on classical sampling is proportional to 1/sqrt(M), and so provides much less accuracy for the same amount of memory. We also describe optimizations such as early removal and conservative update that further improve the accuracy of our algorithms, as measured on real traffic traces, by an order of magnitude. Our schemes allow a new form of accounting called threshold accounting, in which only flows above a threshold are charged by usage while the rest are charged a fixed fee. Threshold accounting generalizes usage-based and duration-based pricing.
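
The two algorithms named above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation from the paper: the names sample_and_hold, MultistageFilter, byte_prob, stages, buckets and threshold are hypothetical, salted CRC32 stands in for the independent hash functions a real filter would use, and a new flow-memory entry is assumed to start counting from the packet that triggers it. Sample and hold samples a packet with a probability that depends on its size and then counts every later packet of that flow. The multistage filter hashes each flow to one counter per stage and promotes the flow to flow memory once all of its counters reach the threshold; the conservative update shown here raises each counter only as far as the smallest counter plus the packet size.

```python
import random
import zlib


def sample_and_hold(packets, byte_prob, rng=None):
    """Track flows from an iterable of (flow_id, size_in_bytes) packets.

    byte_prob is the (assumed) per-byte sampling probability, so a packet
    of `size` bytes is sampled with probability 1 - (1 - byte_prob)**size.
    """
    rng = rng or random.Random(0)
    flow_memory = {}
    for flow_id, size in packets:
        if flow_id in flow_memory:
            # Hold: once a flow has an entry, every later packet is counted.
            flow_memory[flow_id] += size
        elif rng.random() < 1.0 - (1.0 - byte_prob) ** size:
            # Sample: create an entry starting with this packet.
            flow_memory[flow_id] = size
    return flow_memory


class MultistageFilter:
    """Parallel multistage filter with conservative update (sketch)."""

    def __init__(self, stages, buckets, threshold):
        self.buckets = buckets
        self.threshold = threshold
        self.counters = [[0] * buckets for _ in range(stages)]
        self.flow_memory = {}

    def _slots(self, flow_id):
        # One bucket index per stage. Salting CRC32 with the stage number
        # is only a stand-in for truly independent hash functions.
        return [
            zlib.crc32(f"{stage}:{flow_id}".encode()) % self.buckets
            for stage in range(len(self.counters))
        ]

    def add(self, flow_id, size):
        if flow_id in self.flow_memory:
            # Flows already identified as large are counted exactly.
            self.flow_memory[flow_id] += size
            return
        slots = self._slots(flow_id)
        # Conservative update: no counter is raised above
        # (smallest counter for this flow) + (packet size).
        target = min(self.counters[s][i] for s, i in enumerate(slots)) + size
        for s, i in enumerate(slots):
            self.counters[s][i] = max(self.counters[s][i], target)
        # After the update every counter is >= target, so the flow passes
        # the filter exactly when target reaches the threshold.
        if target >= self.threshold:
            self.flow_memory[flow_id] = size
```

In this sketch the flow memory undercounts a large flow by the bytes it sent before its entry was created, while flows that never reach the threshold consume only the shared stage counters and no per-flow entry.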

Paper in PDF and Postscript. The conference version of this paper is shorter and it also has a presentation. The technical report version of this paper covers a few more issues, but is less polished.

 