Three important questions are:
The answers are provided by the framework first defined by Kildall.
The next section provides background on lattices;
the section after that presents Kildall's framework.
Background
Partially ordered sets
Definition:
Note: "partial" means it is not necessary that for all x,y in S, either x ⊆ y or y ⊆ x.
Example 1: The set S is the set of English words, and the ordering ⊆ is substring (i.e., w1 ⊆ w2 iff w1 is a substring of w2). Here is a picture of some words and their ordering (having an edge w1 → w2 means w1 > w2).
         candy       then
          / \         / \
         v   v       v   v
 annual  and  can   the  hen
      \   |   /      \   /
       v  v  v        v v
          an           he
          |
          v
          a

Note that the "substring" ordering does have the three properties required of a partial order:
Example 2: S is the set of English words, and the ordering ⊆ is "is shorter than or equal to in length".
    candy
      |
      v
  ____then____
  /   /  \   \
 v   v    v   v
can and  the  hen
  \   \   /  /
   \   \ /  /
    v  v v v
       an
       |
       v
       a

Does this ordering have the three properties?
Example 3: S is the set of integers, and the ordering ⊆ is "less than or equal to". This is a poset (try verifying each of the three properties).
Example 4: S is the set of integers and the ordering ⊆ is "less than". This is not a poset, because the ordering is not reflexive.
Example 5: S is the set of all sets of letters and the ordering is subset. This is a poset.
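For small, finite examples the three properties can be checked mechanically. Below is an illustrative Python sketch (the helper names powerset and is_poset are ours, not from any standard library) that tests reflexivity, antisymmetry, and transitivity by brute force:

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets (the domain of Example 5)."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def is_poset(elements, leq):
    """Brute-force check of the three partial-order properties."""
    reflexive = all(leq(x, x) for x in elements)
    antisymmetric = all(not (leq(x, y) and leq(y, x)) or x == y
                        for x in elements for y in elements)
    transitive = all(not (leq(x, y) and leq(y, z)) or leq(x, z)
                     for x in elements
                     for y in elements
                     for z in elements)
    return reflexive and antisymmetric and transitive

letters = powerset("abc")
subset = lambda x, y: x <= y            # Example 5: ordering is subset
print(is_poset(letters, subset))        # -> True

words = ["a", "an", "can", "and", "the"]
shorter = lambda w1, w2: len(w1) <= len(w2)   # Example 2's ordering
print(is_poset(words, shorter))         # -> False
```

Note that the check reveals the answer for Example 2: antisymmetry fails, because (for instance) "can" and "and" are each shorter-than-or-equal-to the other, yet they are different words.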
Definition:
    z
   / \       z is the least upper bound of x and y
  v   v
  y   x

  z   w
  |\ /|      z is NOT the least upper bound of x and y
  |/ \|      (they have NO least upper bound)
  v   v
  y   x

The idea for the meet operation is similar, with the reverse orderings.
Note: Every finite lattice (i.e., S is finite) is complete.
Note: Every complete lattice has a greatest element, "Top" (written
as a capital T) and a least
element "Bottom" (written as an upside-down capital T).
They are the least-upper and the
greatest-lower bounds of the entire underlying set S.
Monotonic functions
Definition:
Here is an important theorem about lattices and monotonic functions:
Theorem:
We can create new lattices from old ones using cross-product: if L1, L2, ..., Ln are lattices, then so is the cross-product of L1, L2, ..., Ln (which we can write as: L1 x L2 x ... x Ln). The elements of the cross-product are tuples of the form <e1, e2, ..., en>, where each ei is an element of Li.
The ordering is element-wise: <e1, e2, ..., en> ⊆ <e1', e2', ..., en'> iff ei ⊆ ei' for all i (each component compared using the ordering of its own lattice Li).
If L1, L2, ..., Ln are complete lattices, then so is their
cross-product.
The top element is the tuple that contains the top elements
of the individual lattices:
<top of L1, top of L2, ... , top of Ln>, and the
bottom element is the tuple that contains the bottom elements of
the individual lattices:
<bottom of L1, bottom of L2, ... , bottom of Ln>.
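As an illustration, here is a small Python sketch of the cross-product construction. The representation of a lattice as an (elements, leq, top, bottom) tuple is an assumption made just for this example:

```python
from itertools import product

# A lattice is modeled here, for illustration only, as a tuple:
#   (elements, leq, top, bottom)

def cross_product(*lattices):
    """Build the cross-product lattice: tuples ordered element-wise,
    with top/bottom the tuples of the component tops/bottoms."""
    elements = list(product(*(L[0] for L in lattices)))
    def leq(t1, t2):
        return all(L[1](x, y) for L, x, y in zip(lattices, t1, t2))
    top = tuple(L[2] for L in lattices)
    bottom = tuple(L[3] for L in lattices)
    return elements, leq, top, bottom

# Component lattice: subsets of {a, b} ordered by subset inclusion.
sets = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")]
L1 = (sets, lambda x, y: x <= y, frozenset("ab"), frozenset())

elems, leq, top, bottom = cross_product(L1, L1)
print(len(elems))                                   # -> 16
print(leq((frozenset(), frozenset("a")), top))      # -> True
```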
Recall that our informal definition of a dataflow problem included:
Kildall addressed this issue by putting some additional requirements
on D, the meet operator ⊓, and the dataflow functions fn.
In particular, he required that:
Given these properties, Kildall showed that:
In 1977, a paper by Kam and Ullman (Acta Informatica 7, 1977)
extended Kildall's results to show that,
given monotonic dataflow functions:
To show that the iterative algorithm computes the greatest
solution to the set of equations, we can "transform" the
set of equations into a single, monotonic function L → L
(for a complete lattice L) as follows:
Consider the right-hand side of each equation to be a "mini-function".
For example, for the two equations:
Define the function that corresponds to all of the equations to be:
Note that every fixed point of f is a solution to the set of
equations!
We want the greatest solution (i.e., the greatest fixed point).
To guarantee that this solution exists, we need to know that:
To show (1), note that each individual value in the tuple
is an element of a complete lattice. (That is required by Kildall's
framework.)
So since cross product (tupling) preserves completeness,
the tuple itself is an element of a complete lattice.
To show (2), note that the mini-functions that define each
n.after value are monotonic (since those are the dataflow
functions, and we've required that they be monotonic).
It is easy to show that the
mini-functions that define each n.before value are monotonic, too.
For a node n with k predecessors, the equation is:
base case k=1
We must show that given: a ⊆ a', f(a) ⊆ f(a').
For this f, f(a) = a, and f(a') = a', so this f is monotonic.
base case k=2
We must show that given: a1 ⊆ a1' and a2 ⊆ a2',
f(a1, a2) ⊆ f(a1', a2').
Induction Step
Assume that for all k < n
Given that all the mini-functions are monotonic, it is easy to show that
f (the function that works on the tuples that represent the nodes'
before and after sets) is monotonic;
i.e., given two tuples:
We now know:
Therefore:
Summary of lattice theory
If L is a complete lattice and f is monotonic,
then f has a greatest and a least fixed point.
If L has no infinite descending chains, then we can compute the
greatest fixed point of f via iteration: f(T), f(f(T)), etc.,
stopping when a fixed point is reached.
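This iteration is easy to express in code. The sketch below is illustrative: the lattice (subsets of {1,2,3,4} ordered by subset) and the monotonic function f are toy examples of our own choosing.

```python
def greatest_fixed_point(f, top):
    """Compute f's greatest fixed point by iterating f(T), f(f(T)), ...
    This terminates when the lattice has no infinite descending chains."""
    x = top
    while True:
        fx = f(x)
        if fx == x:
            return x
        x = fx

# Toy example: subsets of {1,2,3,4} ordered by subset, with a
# monotonic f that trims to {1,2,3} and forces 1 to be present.
f = lambda s: (s & {1, 2, 3}) | {1}
print(greatest_fixed_point(f, frozenset({1, 2, 3, 4})))  # -> frozenset({1, 2, 3})
```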
Kildall's Lattice Framework for Dataflow Analysis
Recall that our goal is to solve a given instance of the problem
by computing "before" and "after" sets for each node of the control-flow
graph.
A problem is that, with no additional information about the domain D, the
meet operator ⊓, and the dataflow functions fn, we can't say, in
general, whether a particular algorithm for computing the before and
after sets works correctly (e.g., does the algorithm always halt?
does it compute the MOP solution? if not, how does the computed solution
relate to the MOP solution?).
He also required (essentially) that the iterative algorithm initialize
n.after (for all nodes n other than the enter node) to the lattice's
"top" value.
(Kildall's algorithm is slightly different from the iterative algorithm
presented here, but computes the same result.)
It is interesting to note that, while his theorems are correct,
the example dataflow problem that he uses (constant propagation)
does not satisfy his requirements;
in particular, the dataflow functions for constant propagation
are not distributive (though they are monotonic).
This means that the solution computed by the iterative algorithm
for constant propagation will not, in general, be the MOP solution.
Below is an example to illustrate this:
          1: enter
              |
              v
          2: if (...)
            /    \
           v      v
      3: a = 2  4: a = 3
           |      |
           v      v
      5: b = 3  6: b = 2
            \    /
             v  v
        7: x = a + b
              |
              v
          8: print(x)
The MOP solution for the final print statement is the set {(x,5)},
since x is assigned the value 5 on both paths to that statement.
However, the greatest solution to the set of equations for this program
(the result computed using the iterative algorithm) finds that
x is not constant at the print statement.
This is because the equations require that n.before be the
meet of m.after for all predecessors m;
in particular, they require that the
"before" set for node 7 (x = a + b) be empty since the "after"
sets of the two predecessors are {(a,2), (b,3)} and {(a,3), (b,2)},
and the meet operator is intersection.
Given that 7.before is empty, the equations also require that
7.after (and 8.before) be empty, since that value is defined to be
f7( 7.before ), and f7 only determines that
x is constant after node 7 if both a and b are constant before node 7.
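The difference can be reproduced concretely. The sketch below is our own modeling of constant-propagation facts as sets of (variable, value) pairs; the function f7 is written from the description above, and it computes both the MOP value and the equations' value at node 7:

```python
def f7(before):
    """Dataflow function for node 7 (x = a + b): x is constant
    afterward only if both a and b are constant beforehand."""
    consts = dict(before)
    after = {(v, c) for (v, c) in before if v != 'x'}
    if 'a' in consts and 'b' in consts:
        after.add(('x', consts['a'] + consts['b']))
    return frozenset(after)

meet = frozenset.intersection        # constant propagation's meet

after5 = frozenset({('a', 2), ('b', 3)})   # "after" set of node 5
after6 = frozenset({('a', 3), ('b', 2)})   # "after" set of node 6

# MOP: apply f7 along each path separately, then meet the results.
mop = meet(f7(after5), f7(after6))
# Equations / iterative algorithm: meet the predecessors first.
eq = f7(meet(after5, after6))

print(mop)   # -> frozenset({('x', 5)})
print(eq)    # -> frozenset()
```

The MOP computation sees (x,5) on both paths, while the equations lose everything at the meet before node 7, exactly as described above.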
n3.before = n1.after meet n2.after
n3.after = f3( n3.before )

The two mini-functions, g11 and g12, are:

g11(a, b) = a meet b
g12(a) = f3( a )
f( <n1.before, n1.after, n2.before, n2.after, ...> ) =
      <g11(...), g12(...), g21(...), g22(...), ...>
Where the (...)s are replaced with the appropriate arguments to those
mini-functions. In other words, function f takes one argument that is
a tuple of values. It returns a tuple of values, too. The returned
tuple is computed by applying the mini-functions associated with each
of the dataflow equations to the appropriate inputs (which are part
of the tuple of values that is the argument to function f).
n.before = m1.after meet m2.after meet ... meet mk.after
and the corresponding mini-function is:
f(a1, a2, ..., ak) = a1 meet a2 meet ... meet ak
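For a finite lattice, monotonicity of this meet mini-function can also be checked exhaustively. Here is a small Python sketch, using the powerset-of-{a,b} lattice as a stand-in domain of our own choosing:

```python
from itertools import product

# Stand-in finite lattice: subsets of {a, b} ordered by subset;
# the meet of several values is their intersection.
elems = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")]
leq = lambda x, y: x <= y
meet = lambda *args: frozenset.intersection(*args)

def monotonic_k(k):
    """Check: ai <= ai' for all i  implies  meet(a) <= meet(a')."""
    for a in product(elems, repeat=k):
        for a_prime in product(elems, repeat=k):
            if all(leq(x, y) for x, y in zip(a, a_prime)):
                if not leq(meet(*a), meet(*a_prime)):
                    return False
    return True

print(all(monotonic_k(k) for k in (1, 2, 3)))   # -> True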
We can prove that these mini-functions are monotonic by induction on k.
f(a1) = a1
f(a1, a2) = a1 meet a2
(a1 meet a2) ⊆ a1, and
(a1 meet a2) ⊆ a2
(the meet of two values is a lower bound of each of them). Therefore:
(a1 meet a2) ⊆ a1 ⊆ a1' implies (a1 meet a2) ⊆ a1', and
(a1 meet a2) ⊆ a2 ⊆ a2' implies (a1 meet a2) ⊆ a2'
So (a1 meet a2) is a lower bound of both a1' and a2', and hence
(a1 meet a2) ⊆ (a1' meet a2'), since the meet is the greatest lower bound.
a1 ⊆ a1'
and a2 ⊆ a2'
and ... and an-1 ⊆ a'n-1 =>
f(a1, ..., an-1) ⊆ f(a1', ..., a'n-1)
Now we must show the same thing for k = n.
Letting x = a1 meet ... meet an-1 and x' = a1' meet ... meet a'n-1,
the inductive hypothesis gives x ⊆ x', and
we need to show: x meet an ⊆ x' meet an'.
This follows from x ⊆ x' and an ⊆ an' by the same argument used for
the k=2 base case.
t1 = <e1, e2, ..., en>, and
t2 = <e1', e2', ..., en'>,
such that: t1 ⊆ t2, we must show f(t1) ⊆ f(t2).
Recall that, for a cross-product lattice, the ordering is element-wise;
thus, t1 ⊆ t2 means: ek ⊆ ek', for all k.
We know that all of the mini-functions g are monotonic, so for all k,
gk(ek) ⊆ gk(ek').
But since the ordering is element-wise, this is exactly what it means for
f to be monotonic!
This is not quite what the iterative algorithm does,
but it is not hard to see that it is equivalent to one that does
just this: initialize
all n.before and n.after to top, then on each iteration,
compute all of the "mini-functions" (i.e., recompute
n.before and n.after for all nodes) simultaneously,
terminating when there is no change.
The actual iterative algorithm presented here is an optimization
in that it only recomputes n.before and n.after for a node n
when the "after" value of some predecessor has changed.
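The optimized algorithm can be sketched as a worklist loop. Everything below is an illustrative reconstruction, not Kildall's original code; the "definitely assigned variables" problem is a toy instance of our own, chosen so the example terminates quickly on the CFG from the constant-propagation example:

```python
from collections import deque

def iterative(preds, succs, fs, top, meet, enter, enter_after):
    """Worklist form of the iterative algorithm: n.after starts at top
    for every node except the enter node, and a node is reconsidered
    only when some predecessor's 'after' value changes."""
    after = {n: top for n in fs}
    after[enter] = enter_after
    before = {}
    work = deque(n for n in fs if n != enter)
    while work:
        n = work.popleft()
        before[n] = meet([after[m] for m in preds[n]])
        new = fs[n](before[n])
        if new != after[n]:
            after[n] = new
            for s in succs[n]:       # successors must be reconsidered
                if s not in work:
                    work.append(s)
    return before, after

# Toy forward problem on the example CFG: which variables are
# definitely assigned?  Meet is intersection; domain is subsets of {a,b,x}.
top = frozenset("abx")
ident = lambda s: s
fs = {1: ident, 2: ident, 3: lambda s: s | {'a'}, 4: lambda s: s | {'a'},
      5: lambda s: s | {'b'}, 6: lambda s: s | {'b'},
      7: lambda s: s | {'x'}, 8: ident}
preds = {2: [1], 3: [2], 4: [2], 5: [3], 6: [4], 7: [5, 6], 8: [7]}
succs = {1: [2], 2: [3, 4], 3: [5], 4: [6], 5: [7], 6: [7], 7: [8], 8: []}

before, after = iterative(preds, succs, fs, top,
                          lambda xs: frozenset.intersection(*xs),
                          enter=1, enter_after=frozenset())
print(sorted(after[8]))   # -> ['a', 'b', 'x']
```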