This page contains full formulations and more complete examples for the paper ‘‘Recovering Execution Data from Incomplete Observations,’’ published in the 13th International Workshop on Dynamic Analysis (WODA 2015).

The example problem (from the WODA paper) is as follows. First, we are given the following control-flow graph (CFG):

The remainder of our failure report is as follows:

The goal of problem is to compute three sets: (containing those nodes that occur on all paths through our CFG consistent with the failure report), (containing those nodes that occur on no paths matching the failure report), and (containing those nodes that occur on some but not all such paths). The following two sections provide the full formulations to solving this problem that we describe in the WODA paper.

This section provides the full collection of automata used for Section 2.1 of the paper.

The first collection of automata encode constraints from the failure report example from the paper. The intersection of these automata forms , an automaton that accepts only strings corresponding to executions that match the provided failure report.

This first automaton is a direct encoding of our example Control-Flow Graph (CFG). Note that each CFG edge contributes one transition to the automaton, labeled with the target node's symbol.

This second automaton encodes the crashing requirement. Specifically, it forces accepted executions to end with CFG node .

The third automaton encodes the constraints. That is, it
forces accepted executions to *not* contain CFG nodes in . In
this case, it ensures that no matched execution contains CFG node .

The final automaton encodes an constraint. For our example from the paper, there is only one such constraint. However, in the general case, we would require one automaton for each vector contained in . In this case, we are encoding the constraint that all matched executions must pass through CFG node , followed by CFG node , followed by another occurrence of .

To actually compute , , and ,
we first create two automata for each node in our control-flow graph, .
Checking the intersection of each automaton (independently) with for
emptiness indicates whether the respective node is *possible* and/or
*necessary*. Here, we give the automata corresponding to probes for node .

The first, corresponds to the automaton
for the single-element vector, . If
, then is *possible*.

The second automaton, corresponds to the automaton for
. If
, then is *necessary*.

If a node, is necessary, then . If is not possible, then . Otherwise, .

This section provides the full formulations for the logic programs used in Section 2.2 of the paper.

The failure report example described in the paper is here encoded in ASP format:

% CFG representation entry(a). node(a). node(b). node(c). node(d). node(e). node(f). node(g). node(h). node(i). node(j). node(k). edge(a, b). edge(b, c). edge(b, j). edge(c, d). edge(c, i). edge(d, e). edge(d, f). edge(e, h). edge(f, g). edge(f, h). edge(g, h). edge(h, i). edge(i, b). edge(j, k). % Observations crash(j). anyObsYes(b). anyObsYes(c). order(b, c). order(c, b). obsNo(f).

The base encoding of the failure report consists of several rules which specify how paths through the described graph function. To do so, we introduce the concept of (which in the encoding means any node which is visited by a possible path of execution) and (here meaning that two nodes are connected during a possible path of execution). These concepts are encoded as follows:

0 { visited(X) } 1 :- node(X). visited(X) :- entry(X). visited(X) :- crash(X). -visited(X) :- obsNo(X). visited(X) :- anyObsYes(X).

Roughly, any node can *potentially* be ; the
entry and crash nodes (as well as any in )
are, by definiton, . Any node observed to,
specifically, **not** be visited (i.e., those in )
are explicitly not .

connected(X, X) :- visited(X). connected(X, Z) :- visited(X), edge(X, Y), connected(Y, Z).

A node is to itself if (and only if) it is ; as well, inductively, a node is to another node if it is both and has an edge to another node which is to the specified node.

To actually compute , , and , we conjoin our complete set of constraints from a failure report with two probes. This section describes that process, and what SAT and UNSAT mean in each context.

To define in our ASP encoding, we first define
a new concept, , which tests if two
nodes are **without** the path of execution
passing through a specific third node.

connected_excluding(X, Y, E) :- node(E), edge(X, Y), E != Y, E != X, not obsNo(Y). connected_excluding(X, Z, E) :- connected_excluding(X, Y, E), connected_excluding(Y, Z, E). connected_excluding(X, X, E) :- node(E), node(X), X != E.

With this concept defined, defining within ASP becomes straightforward:

exeYes(X) :- crash(X). exeYes(X) :- entry(X). exeYes(X) :- obsYes(X).

The crash node, entry node, and any explicitly observed nodes are contained in .

exeYes(X) :- exeYes(O), crash(C), node(X), not connected_excluding(O, C, X). exeYes(X) :- entry(E), exeYes(O), node(X), not connected_excluding(E, O, X).

All possible paths of execution through the program must start at
the entry node and end at the crash node. Any additional nodes
we know to have executed–whether through explicit observation or
deduction–must then have a path both from the entry and to the crash.
As such, any nodes that such a path **must** pass through are also
known to have been executed.

The known ordering of nodes during a path also figure in to the encoding:

exeYes(X) :- order(Y, Z), node(X), not connected_excluding(Y, Z, X). exeYes(X) :- node(X), order(A, B), order(C, D), A != C, not connected_excluding(B, C, X), not connected_excluding(D, A, X).

Roughly, any node that must be executed when two nodes known to have been executed in a specific order are in , as well as any which must be executed to connect multiple ordered nodes together.

The calculation of membership of is somewhat simpler:

pospair_from(X, O) :- connected(O, X), connected(X, C), crash(C), exeYes(O). pospair_to(X, O) :- entry(E), connected(E, X), connected(X, O), exeYes(O). exeNo(X) :- node(X), exeYes(O), not pospair_to(X, O), not pospair_from(X, O).

If, for any node known to be executed, we cannot find a path
from the entry to that node through a specified node **and** we
cannot find a path from the known-executed node through the
specified node to the crash, that specified node cannot have
been executed.

If we know and , then is a simple choice of “none of the above”:

maybe(X) :- node(X), not exeNo(X), not exeYes(X).

The ASP encoding of the failure report allows easy direct queries of the
membership of the defined sets by adding the membership of a node to a
proposed set as a fact along with the encoding of the observations. For
example, if we wish to directly test , we
can add `exeNo(G)` to the input to the ASP grounder, and then determine
its truth by the SAT solver's output: a result of SAT tells us our
assertion was true with UNSAT proving our assertion false.