Execution Recovery Examples and Formulations

This page contains full formulations and more complete examples for the paper ‘‘Recovering Execution Data from Incomplete Observations,’’ published in the 13th International Workshop on Dynamic Analysis (WODA 2015).

Example Problem

The example problem (from the WODA paper) is as follows. First, we are given the following control-flow graph (CFG):

Example control-flow graph 

The remainder of our failure report is as follows:

 begin{split} mathit{crash} &= J  mathit{obsYes} &= {langle{}B, C, Brangle{}}  mathit{obsNo} &= {F} end{split}

The goal of problem is to compute three sets: mathit{exeYes} (containing those nodes that occur on all paths through our CFG consistent with the failure report), mathit{exeNo} (containing those nodes that occur on no paths matching the failure report), and mathit{maybe} (containing those nodes that occur on some but not all such paths). The following two sections provide the full formulations to solving this problem that we describe in the WODA paper.

Finite-State Automata

This section provides the full collection of automata used for Section 2.1 of the paper.

Automata for A_G^C

The first collection of automata encode constraints from the failure report example from the paper. The intersection of these automata forms A_G^C, an automaton that accepts only strings corresponding to executions that match the provided failure report.

CFG automaton 

This first automaton is a direct encoding of our example Control-Flow Graph (CFG). Note that each CFG edge contributes one transition to the automaton, labeled with the target node's symbol.

Crash automaton 

This second automaton encodes the crashing requirement. Specifically, it forces accepted executions to end with CFG node J.

obsNo automaton 

The third automaton encodes the mathit{obsNo} constraints. That is, it forces accepted executions to not contain CFG nodes in mathit{obsNo}. In this case, it ensures that no matched execution contains CFG node F.

obsYes automaton 

The final automaton encodes an mathit{obsYes} constraint. For our example from the paper, there is only one such constraint. However, in the general case, we would require one mathit{obsYes} automaton for each vector contained in mathit{obsYes}. In this case, we are encoding the constraint that all matched executions must pass through CFG node B, followed by CFG node C, followed by another occurrence of B.

Probes to compute exeYes, exeNo, and maybe

To actually compute mathit{exeYes}, mathit{exeNo}, and mathit{maybe}, we first create two automata for each node in our control-flow graph, G. Checking the intersection of each automaton (independently) with A_G^C for emptiness indicates whether the respective node is possible and/or necessary. Here, we give the automata corresponding to probes for node D.

Probe automaton to check if node D may have executed 

The first, mathit{obsYes}_D corresponds to the mathit{obsYes} automaton for the single-element mathit{obsYes} vector, langle{}Drangle{}. If A_G^C cap mathit{obsYes}_D neq emptyset, then D is possible.

Probe automaton to check if it is possible that node D did not execute 

The second automaton, mathit{obsNo}_D corresponds to the automaton for mathit{obsNo} = {D}. If A_G^C cap mathit{obsNo}_D = emptyset, then D is necessary.

If a node, n is necessary, then n in mathit{exeYes}. If n is not possible, then n in mathit{exeNo}. Otherwise, n in mathit{maybe}.

Answer-Set Programming

This section provides the full formulations for the logic programs used in Section 2.2 of the paper.

Failure Report Constraints

The failure report example described in the paper is here encoded in ASP format:

% CFG representation
node(a). node(b). node(c). node(d). node(e). node(f).
node(g). node(h). node(i). node(j). node(k).

edge(a, b). edge(b, c). edge(b, j). edge(c, d). edge(c, i).
edge(d, e). edge(d, f). edge(e, h). edge(f, g). edge(f, h).
edge(g, h). edge(h, i). edge(i, b). edge(j, k).

% Observations
anyObsYes(b). anyObsYes(c). order(b, c). order(c, b).

Background Constraints

The base encoding of the failure report consists of several rules which specify how paths through the described graph function. To do so, we introduce the concept of mathit{visited} (which in the encoding means any node which is visited by a possible path of execution) and mathit{connected} (here meaning that two nodes are connected during a possible path of execution). These concepts are encoded as follows:

0 { visited(X) } 1 :- node(X).

visited(X) :- entry(X).
visited(X) :- crash(X).
-visited(X) :- obsNo(X).
visited(X) :- anyObsYes(X).

Roughly, any node can potentially be mathit{visited}; the entry and crash nodes (as well as any in mathit{anyObsYes}) are, by definiton, mathit{visited}. Any node observed to, specifically, not be visited (i.e., those in mathit{obsNo}) are explicitly not mathit{visited}.

connected(X, X) :- visited(X).
connected(X, Z) :- visited(X), edge(X, Y), connected(Y, Z).

A node is mathit{connected} to itself if (and only if) it is mathit{visited}; as well, inductively, a node is mathit{connected} to another node if it is both mathit{visited} and has an edge to another node which is mathit{connected} to the specified node.

Probes to compute exeYes, exeNo, and maybe

To actually compute mathit{exeYes}, mathit{exeNo}, and mathit{maybe}, we conjoin our complete set of constraints from a failure report with two probes. This section describes that process, and what SAT and UNSAT mean in each context.


To define mathit{exeYes} in our ASP encoding, we first define a new concept, mathit{connected_excluding}, which tests if two nodes are mathit{connected} without the path of execution passing through a specific third node.

connected_excluding(X, Y, E) :-
	node(E), edge(X, Y),
	E != Y, E != X,
	not obsNo(Y).
connected_excluding(X, Z, E) :-
	connected_excluding(X, Y, E),
	connected_excluding(Y, Z, E).
connected_excluding(X, X, E) :-
	node(E), node(X), X != E.

With this concept defined, defining mathit{exeYes} within ASP becomes straightforward:

exeYes(X) :- crash(X).
exeYes(X) :- entry(X).
exeYes(X) :- obsYes(X).

The crash node, entry node, and any explicitly observed nodes are contained in mathit{exeYes}.

exeYes(X) :-
	exeYes(O), crash(C), node(X),
	not connected_excluding(O, C, X).

exeYes(X) :-
	entry(E), exeYes(O), node(X),
	not connected_excluding(E, O, X).

All possible paths of execution through the program must start at the entry node and end at the crash node. Any additional nodes we know to have executed–whether through explicit observation or deduction–must then have a path both from the entry and to the crash. As such, any nodes that such a path must pass through are also known to have been executed.

The known ordering of nodes during a path also figure in to the encoding:

exeYes(X) :-
	order(Y, Z), node(X),
	not connected_excluding(Y, Z, X).

exeYes(X) :-
	order(A, B),
	order(C, D),
	A != C,
	not connected_excluding(B, C, X),
	not connected_excluding(D, A, X).

Roughly, any node that must be executed when two nodes known to have been executed in a specific order are in mathit{exeYes}, as well as any which must be executed to connect multiple ordered nodes together.


The calculation of membership of mathit{exeNo} is somewhat simpler:

pospair_from(X, O) :-
	connected(O, X),
	connected(X, C),

pospair_to(X, O) :-
	connected(E, X),
	connected(X, O),

exeNo(X) :-
	not pospair_to(X, O),
	not pospair_from(X, O).

If, for any node known to be executed, we cannot find a path from the entry to that node through a specified node and we cannot find a path from the known-executed node through the specified node to the crash, that specified node cannot have been executed.


If we know mathit{exeNo} and mathit{exeYes}, then mathit{maybe} is a simple choice of “none of the above”:

maybe(X) :-
	not exeNo(X),
	not exeYes(X).


The ASP encoding of the failure report allows easy direct queries of the membership of the defined sets by adding the membership of a node to a proposed set as a fact along with the encoding of the observations. For example, if we wish to directly test mathit{G}in mathit{exeNo}, we can add exeNo(G) to the input to the ASP grounder, and then determine its truth by the SAT solver's output: a result of SAT tells us our assertion was true with UNSAT proving our assertion false.