Applications of Static Analysis and Program Structure in Statistical Debugging
Piramanayagam Arumuga Nainar
Abstract
Software testing is costly, time-consuming, and often incomplete.
Statistical debugging is a new domain of research that extends the
testing phase into deployment. Post-deployment monitoring and
statistical analyses are used to find program behaviors, called
predicates, that are correlated with failures. For bug isolation, prior
work relies on dynamic analysis, which examines only the runtime
behavior of programs. This dissertation demonstrates the use of static
analysis, the counterpart of dynamic analysis, to improve statistical
debugging. Static analysis discovers program properties without running
the program.
The contributions are evaluated in the context of the Cooperative Bug
Isolation (CBI) project that studies statistical debugging techniques.
Predicates instrumented by CBI test the program's state at particular
program locations. At a high level, predicates are considered useful
for fault localization if the set of executions in which they are
observed true closely matches the set of failed executions. However,
complex bugs manifest only when multiple factors co-occur. Simple
predicates may not be accurate predictors of complex bugs. We propose
that compound predicates, which are propositional combinations of simple
predicates, could be better failure predictors. Compound predicates are
the most accurate predictors of failure in 93% of our experiments. We
use offline estimation, mathematical upper-bounds on failure
predictivity, and static program dependences to tractably incorporate
compound predicates into statistical debugging. These optimizations
reduce analysis time from 20 minutes to just 1 minute.
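To make the idea of compound predicates concrete, the sketch below enumerates pairwise conjunctions and disjunctions of simple predicates and ranks them by how closely the runs in which they hold match the failed runs. The F1-style score and the function names are illustrative only; CBI's actual metrics (such as Increase and Importance) and the offline-estimation and upper-bound optimizations described above are not modeled here.

```python
from itertools import combinations

def score(true_runs, failed_runs):
    """Toy failure-predictivity score: F1 of 'predicate observed true'
    against 'run failed'. Illustrative stand-in for CBI's metrics."""
    tp = len(true_runs & failed_runs)
    if tp == 0:
        return 0.0
    precision = tp / len(true_runs)
    recall = tp / len(failed_runs)
    return 2 * precision * recall / (precision + recall)

def best_compound(preds, failed_runs):
    """Exhaustively score pairwise compound predicates.
    preds: dict mapping predicate name -> set of runs where it held."""
    best_s, best_name = 0.0, None
    for (na, a), (nb, b) in combinations(preds.items(), 2):
        for op, runs in (("and", a & b), ("or", a | b)):
            s = score(runs, failed_runs)
            if s > best_s:
                best_s, best_name = s, f"{na} {op} {nb}"
    return best_s, best_name

# Neither simple predicate alone matches the failed runs exactly,
# but their conjunction does.
preds = {"p": {1, 2, 3}, "q": {1, 2, 4}}
print(best_compound(preds, failed_runs={1, 2}))
```

The exhaustive enumeration here is quadratic in the number of simple predicates; the pruning techniques summarized above exist precisely because real programs have far too many predicates for this brute-force approach.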
Statistical debugging searches for needles in a haystack: over 99.996%
of predicates are not failure predictive. The CPU, network, and storage
resources used to collect these predicates could be better utilized. We
develop an adaptive bug isolation technique that uses static program
dependences and prior feedback to selectively monitor those predicates
that are likely to be useful. Predicates that are irrelevant to failure
are never instrumented, or are removed once their irrelevance is
ascertained. We characterize this adaptive predicate selection as a
forward analysis on the program-dependence graph. We also develop a
backward analysis that uses a crash location as a starting point. Our
approach identifies the best predicate found by complete instrumentation
after exploring, on average, just 40% of the program locations. More
importantly, very few locations are instrumented at any time, yielding
imperceptibly low overheads of less than 1%.
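The forward analysis on the program-dependence graph can be pictured as a guided graph exploration. The sketch below is a minimal, hypothetical version: sites whose predicates look predictive (per some feedback score) have their forward dependences explored, while unpredictive sites are dropped from instrumentation. The `threshold` parameter and the `observe` callback are assumptions for illustration, not the dissertation's actual interface.

```python
from collections import deque

def adaptive_select(pdg, seeds, observe, threshold=0.5):
    """Forward adaptive exploration of a program-dependence graph.
    pdg: dict mapping a site to its forward-dependent sites
    seeds: initial instrumentation sites
    observe: site -> failure-predictivity score from deployed feedback
    Returns the best (score, site) found and the sites left instrumented."""
    frontier = deque(seeds)
    seen = set(seeds)
    instrumented = set()
    best = (0.0, None)
    while frontier:
        site = frontier.popleft()
        instrumented.add(site)
        s = observe(site)
        if s > best[0]:
            best = (s, site)
        if s >= threshold:
            # Predictive site: follow dependences forward.
            for nxt in pdg.get(site, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        else:
            # Irrelevant site: stop monitoring it.
            instrumented.discard(site)
    return best, instrumented

# Hypothetical PDG and feedback scores; site "c" is pruned, and the
# search reaches the strong predictor "d" via "a" -> "b".
pdg = {"a": ["b", "c"], "b": ["d"], "c": []}
scores = {"a": 0.6, "b": 0.7, "c": 0.1, "d": 0.9}
print(adaptive_select(pdg, ["a"], scores.get))
```

Only sites on the frontier or judged predictive carry instrumentation at any moment, which is the intuition behind the sub-1% overheads reported above.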
While debugging, a developer first uses the list of predicates in CBI's
output to find root causes of failures, and then develops a fix. We aid
the developer in the first task by augmenting CBI's output with
source-level changes that are likely causes of failures. We extend the
well-studied problem of change impact analysis to compute the likelihood
that a source-level change impacts a program location. Our technique,
called conditional coverage profiles, uses symbolic evaluation, the
program's control-flow graph, and runtime profiles to narrow the scope
of impacted program locations by an order of magnitude compared to prior
work. It identifies failure-inducing changes with a precision of 89%
and a recall of 55%.
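At its core, relating a source-level change to the program locations it can impact involves reachability over the control-flow graph, filtered by what actually executed. The sketch below shows only that reachability-and-coverage intersection; the dissertation's conditional coverage profiles additionally use symbolic evaluation and runtime profiles to narrow the set much further, which this toy version does not attempt.

```python
def impacted_locations(cfg, changed, covered):
    """Crude over-approximation of change impact: locations reachable
    in the control-flow graph from a changed statement, restricted to
    locations actually covered at runtime.
    cfg: dict mapping a location to its CFG successors
    changed: the location of the source-level change
    covered: set of locations executed in the profiled runs"""
    stack, reached = [changed], set()
    while stack:
        n = stack.pop()
        if n in reached:
            continue
        reached.add(n)
        stack.extend(cfg.get(n, ()))
    return reached & covered

# Hypothetical CFG fragment: "s3" is reachable but never executed,
# so it is excluded from the impacted set.
cfg = {"c1": ["s2"], "s2": ["s3", "s4"], "s4": ["s5"]}
print(impacted_locations(cfg, "c1", covered={"s2", "s5", "s9"}))
```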
Our contributions, while seemingly orthogonal in their goals, are
unified by their application of static analysis to reason about dynamic
behavior. Static program structure is used to prune less useful
compound predicates, adaptively instrument likely bug predictors, and
associate predicates with failure-inducing changes. Overall, this
dissertation improves the monitoring efficiency and fault localization
of the state-of-the-art in statistical debugging.