Finding latent performance bugs in systems implementations
Charles Killian, Karthik Nagaraj, Salman Pervez, Ryan Braud, James W. Anderson, and Ranjit Jhala
Purdue University and UCSD
One-line Summary
The MACE Performance Checker is a model-checking tool that detects latent performance bugs in event-based distributed systems.
Overview/Main Points
- Goal: detect latent performance bugs in corner cases of an atomic event-driven state machine model
- High-level idea: explore different event timings to expose latent bugs
- How to identify a latent bug?
- What is provided by developers?
- Exploring different timings: keep a per-node clock and record each event as ‘node, event, start, duration’
- The algorithm
- Inputs
- event queue: ‘node, event, start’
- timedEvent queue: ‘node, event, start, duration’
- realTimes queue: ‘event, realTime’
- Training
- Event Duration Distributions (EDD) training
- Overall performance training
- Anomaly detection
- Divergence detection
- Frequency correlation
- Performance problems in distributed systems
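
The queue layouts and training steps listed above can be made concrete with a small sketch. This is an illustration under stated assumptions, not the tool's actual algorithm: the names `TimedEvent`, `train_edd`, `simulate`, `completion_time`, and `is_abnormal` are hypothetical, the EDD is approximated by a plain list of observed durations per event type, and "abnormal" is assumed to mean a completion time more than `k` standard deviations above the trained baseline.

```python
import random
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class TimedEvent:
    """One timedEvent queue entry: 'node, event, start, duration'."""
    node: int
    event: str
    start: float
    duration: float


def train_edd(samples):
    """Group observed (event, duration) pairs into per-event duration lists.

    The lists stand in for the Event Duration Distributions (assumption).
    """
    edd = defaultdict(list)
    for event, duration in samples:
        edd[event].append(duration)
    return dict(edd)


def simulate(pending, edd, rng):
    """Turn pending (node, event) pairs into timed events.

    Each node has its own clock: events on one node run back to back,
    while events on different nodes may overlap in time.
    """
    clocks = defaultdict(float)
    timed = []
    for node, event in pending:
        duration = rng.choice(edd[event])  # sample a duration from the EDD
        timed.append(TimedEvent(node, event, clocks[node], duration))
        clocks[node] += duration
    return timed


def completion_time(timed):
    """Latest finish time over all nodes."""
    return max(te.start + te.duration for te in timed)


def is_abnormal(value, baseline, k=3.0):
    """True if value exceeds the baseline mean by more than k std devs.

    The baseline needs at least two values for stdev to be defined.
    """
    return value > mean(baseline) + k * stdev(baseline)


if __name__ == "__main__":
    edd = train_edd([("recv", 1.0), ("recv", 1.2), ("timer", 5.0)])
    run = simulate([(0, "recv"), (1, "recv"), (0, "timer")], edd, random.Random(0))
    baseline = [6.0, 6.2, 6.1, 6.0]  # made-up "overall performance" training runs
    print(completion_time(run), is_abnormal(completion_time(run), baseline))
```

Sampling durations from per-event lists keeps the sketch short; a histogram or a fitted distribution would serve equally well. Divergence detection and frequency correlation are not modeled here.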
Relevance
Flaws