MFC

Remote Profiling of Resource Constraints in Web Servers Using Mini-Flash Crowds [pdf]


Pratap Ramamurthy, Vyas Sekar, Aditya Akella, Balachander Krishnamurthy and Anees Shaikh.
USENIX Annual Technical Conference 2008, Boston, MA.

Abstract: Unexpected surges in Web request traffic can exercise server-side resources (e.g., access bandwidth, processing, storage, etc.) in undesirable ways. Administrators today do not have the requisite tools to understand the impact of such "flash crowds" on their servers. Most Web servers either rely on over-provisioning and admission control, or use potentially expensive solutions like CDNs, to ensure high availability in the face of flash crowds. A more fine-grained understanding of the performance of individual server resources under emulated but realistic and controlled flash crowd-like conditions can help administrators make more efficient resource management decisions. In this paper, we present mini-flash crowds (MFC), a lightweight profiling service that reveals resource bottlenecks in a Web server infrastructure. MFC uses a set of controlled probes in which an increasing number of distributed clients make synchronized requests that exercise specific resources or portions of a remote Web server. We carried out controlled lab-based tests and experiments in collaboration with operators of production servers. We show that our approach can faithfully track the impact of request loads on different server resources and provide useful insights to server operators on the constraints of different components of their infrastructure. We also present results from a measurement study of the provisioning of several hundred popular Web servers, a few hundred Web servers of startup companies, and about a hundred phishing servers.

Google Tech Talk by Aditya Akella: watch video here

Do you want us to test your Web server?

If you would like to test your server, please email me with the following details:
  • URL of the Web server (example "http://pages.cs.wisc.edu/")
  • IP address of the server
  • Any specific objects that you would like us to use (example "http://pages.cs.wisc.edu/~pratap/index.html")
  • Cooperative/Black-box: Please let me know whether there will be any cooperation from the webmaster. If so, can you make your logs (anonymized if necessary) available? Can you briefly describe your server configuration?
  • Response time threshold (by default, we stop the experiment once we see a 100 ms increase in median response time)
  • Is your website crawler-friendly?
  • Date and time of testing. Some prefer to run the experiment at night so that the background traffic is minimal. Others would like to monitor the stats in real time and so prefer daytime.
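
To make the response time threshold concrete, here is a minimal sketch of how such a stopping rule could work. It is an illustration under stated assumptions, not the MFC implementation: the function names, the crowd-size schedule, and the simulated server are all hypothetical, and no real requests are sent.

```python
from statistics import median
from typing import Callable, List, Optional, Sequence


def find_stopping_crowd_size(
    measure_epoch: Callable[[int], Sequence[float]],
    crowd_sizes: Sequence[int],
    baseline_ms: float,
    threshold_ms: float = 100.0,
) -> Optional[int]:
    """Return the first crowd size at which the median response time
    has risen by at least threshold_ms over baseline_ms, or None if
    the threshold is never reached.

    measure_epoch(size) is a hypothetical callback that returns the
    response times (in ms) observed when `size` clients make
    synchronized requests.
    """
    for size in crowd_sizes:
        samples = measure_epoch(size)
        if median(samples) - baseline_ms >= threshold_ms:
            return size  # stop the experiment here
    return None


def simulated_epoch(size: int) -> List[float]:
    """Stand-in for a real measurement (assumption): responses take
    50 ms up to 20 clients, then 10 ms more per additional client."""
    extra_ms = max(0, size - 20) * 10.0
    return [50.0 + extra_ms] * size


# Ramp the crowd from 5 to 50 clients in steps of 5.
stop_at = find_stopping_crowd_size(
    simulated_epoch, range(5, 55, 5), baseline_ms=50.0
)
print(stop_at)
```

A lower threshold ends the experiment earlier and keeps the added load on the server smaller; a higher one lets the crowd grow further before stopping.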