In addition to reading these notes, I suggest that you read
Chapter 1 of Neil Jones's book
for additional background material, and Chapter 13 of that book
for more discussion of the applications of partial evaluation.
Motivation and Overview

Partial evaluation (PE) is a technique for program specialization.
The idea is to optimize a program by specializing it with respect to
some of its inputs.
In particular, given a program P, the values s of its static inputs
(those known ahead of time), and the values d of its dynamic inputs,
partial evaluation produces a residual program Ps
such that

[[Ps]](d) = [[P]](s,d)

i.e., running the residual program on the dynamic inputs
produces the same result as running the original program
on all of the inputs.

Note that we already have an operation, namely currying, that produces
a residual program with this property.
For example:

plus = λx.λy.x+y
plus_{x=2} = λy.2+y

And if we apply plus_{x=2} to any value y, we get the
same result as applying plus to the two arguments, 2 and y.
However, the only optimization provided by currying is to
reduce the number of beta reductions.
We want to do more than that.
Intuitively, partial evaluation of a program (or of a function)
is most likely to be useful (to speed up execution) when
a significant part of the program's computation depends only on
its static inputs.
The partial evaluation that we will look at will do (more or less)
the following optimizations: evaluating expressions whose operands
are all static, resolving conditionals whose conditions are static,
and unrolling loops that are controlled by static values.
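The currying example can be illustrated in Python using functools.partial (a sketch in Python rather than the notes' lambda notation):

```python
from functools import partial

def plus(x, y):
    return x + y

# The "residual" function obtained by fixing the static input x = 2:
plus_x2 = partial(plus, 2)

# Running the residual function on the dynamic input gives the same
# result as running the original on all of the inputs:
assert plus_x2(3) == plus(2, 3) == 5
```

As in the notes, this only fixes an argument; it performs no simplification of the body.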
The book
on partial evaluation by Neil Jones et al includes a section on
Partial Evaluation in Practice (section IV, page 261), which
includes some examples.
They say that partial evaluation has been applied successfully to
the following kinds of problems:
An interesting application of partial evaluation (though one that I think has not turned out to be of practical value) is its use for automatic program generation. This was originally defined by Futamura (and thus, the following are known as the Futamura projections):
First projection: partially evaluate an interpreter with respect to a
program P:

           +----+
interp --> |    |
           | PE | --> interp_P
P      --> |    |
           +----+

producing interp_P such that [[interp_P]](d) = [[interp]](P, d);
that is, interp_P is a compiled version of P.

Second projection: partially evaluate PE itself with respect to an
interpreter:

           +----+
PE     --> |    |
           | PE | --> PE_interp
interp --> |    |
           +----+

We get PE_interp, which takes one input, a program P, and produces a
compiled version of P. So PE_interp is a compiler.

Third projection: partially evaluate PE with respect to PE:

           +----+
PE     --> |    |
           | PE | --> PE_PE
PE     --> |    |
           +----+

We get a program whose input is an interpreter and whose output is a
compiler (for the language of the interpreter); i.e., PE_PE is a
compiler generator.
First we will consider how to do partial evaluation of
a simple, imperative language.
Then we will see what needs to change to handle a
simple, functional language.
For now, we will assume that the input to Partial Evaluator PE has
three parts: the program itself, a division of the program's
variables into static and dynamic, and the values of the static
variables.
In these notes,
we may write code in this low-level language, or we may write the
code in a higher-level language (e.g., with loops), with the
understanding that PE really works on the low-level form.
Below is code that searches two "parallel" name and value lists for
a given name that is known to be in the first list.
When the name is found (in position j in the name list),
it returns the associated value (the value in position j
in the value list).
Here is a high-level version of the code:
Here's the basic idea of how PE works to produce a residual program:
The PE algorithm is given below.
It includes calls to (undefined) function reduce.
That function takes an expression and the current values of
the static variables, and returns a new version of
the expression simplified via constant folding.
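A minimal sketch of what reduce might do, assuming a tuple-based expression representation (constants, variable names, and binary operations); this representation is our own, not the notes':

```python
def reduce_exp(exp, vs):
    """Simplify exp given the values vs of the static variables."""
    if isinstance(exp, tuple):                 # ("op", e1, e2)
        op, e1, e2 = exp
        r1, r2 = reduce_exp(e1, vs), reduce_exp(e2, vs)
        if isinstance(r1, int) and isinstance(r2, int):
            # both operands known: fold the constant
            return {"+": r1 + r2, "-": r1 - r2, "*": r1 * r2}[op]
        return (op, r1, r2)                    # residual expression
    return vs.get(exp, exp)  # static var -> its value; else unchanged

# x static with value 2, y dynamic:
assert reduce_exp(("+", "x", ("*", "x", 3)), {"x": 2}) == 8
assert reduce_exp(("+", "x", "y"), {"x": 2}) == ("+", 2, "y")
```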
A table that traces the execution of the PE algorithm on the
example program is given below, after the algorithm itself.
Partial Evaluation of a Simple, Imperative Language
Note that conditional and unconditional gotos can only be at the
ends of basic blocks.
Example
read nameList
read valueList
read name
while (name != car(nameList)) {
valueList = cdr(valueList)
nameList = cdr(nameList)
}
print car(valueList)
And here is the corresponding low-level version (shown as a control-flow
graph, using basic blocks, gotos, and labels):
+-------------------------------+
| Enter |
+-------------------------------+
|
v
+-------------------------------+
| L1: |
| read nameList |
| read valueList |
| read name |
| goto L2 |
+-------------------------------+
|
v
+-------------------------------+
| L2: |
| if (name != car(nameList)) |---------+
| then goto L3 | |
| else goto L4 |<--+ |
+-------------------------------+ | |
| | |
v | |
+-------------------------------+ | |
| L3: | | |
| nameList = cdr(nameList) | | |
| valueList = cdr(valueList) |---+ |
| goto L2 | |
+-------------------------------+ |
|
+--------------------+
|
|
v
+-------------------------------+
| L4: |
| print car(valueList) |
| goto Exit |
+-------------------------------+
|
v
+-------------------------------+
| Exit: |
| return |
+-------------------------------+
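For comparison, the same search can be written as a short Python sketch (the function name find_value is ours):

```python
def find_value(name_list, value_list, name):
    # name is assumed to be present in name_list, as in the original.
    while name != name_list[0]:          # name != car(nameList)
        name_list = name_list[1:]        # nameList = cdr(nameList)
        value_list = value_list[1:]      # valueList = cdr(valueList)
    return value_list[0]                 # car(valueList)

assert find_value(["susan", "john", "ann"], [1, 2, 3], "ann") == 3
```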
Given the division (name, nameList: static; valueList: dynamic),
with static values name = "ann" and
nameList = ["susan", "john", "ann"],
partial evaluation (as defined below) will produce the following
residual program containing just one basic block:
<L1, ("ann", ["susan", "john", "ann"])>:
read valueList
valueList = cdr(valueList)
valueList = cdr(valueList)
print car(valueList)
The PE Algorithm
PE(program, division, vs0) {
pending = { (pp0, vs0) } // pp0 is the label of the first block
marked = { }
while (pending is not empty) {
remove one pair (pp, vs) from pending // process next basic block
add (pp, vs) to marked
emit code label <pp, vs>
bb = lookup(pp, program) // bb is a copy of the block labeled pp
while (bb is not empty) {
remove the next statement S from bb
switch (kind(S)) {
case "read var":
if (var is dynamic) {
emit code: read var
}
case "x = exp":
if (x is static) {
update vs with x's new value
} else {
emit code: x = reduce(exp, vs)
}
case "goto pp'":
// we're at the end of the current basic block
// do not emit a goto
// instead, start processing the code from the target
// this may cause code duplication (discussed later)
bb = lookup(pp', program)
case "print exp":
emit code: print reduce(exp, vs)
case "if exp then goto pp1 else goto pp2":
// this must be the last stmt in the current basic block
if (exp is static) {
// similar to unconditional goto above
// don't emit a goto
// instead, start processing the code from the target
if (reduce(exp, vs)) {
bb = lookup(pp1, program)
} else {
bb = lookup(pp2, program)
}
} else {
// exp uses a dynamic variable
// if we already generated code for pp1 and/or pp2 with
// current values of static vars, then don't put those
// labels in "pending" or we might never terminate!
if ((pp1, vs) not in marked) {
insert (pp1, vs) into pending if not already there
}
if ((pp2, vs) not in marked) {
insert (pp2, vs) into pending if not already there
}
emit code: if reduce(exp, vs) then goto <pp1, vs> else goto <pp2, vs>
}
} // end switch
} // end iterating through current basic block
} // end while pending set is non-empty
}
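The worklist structure of the algorithm can be made concrete with a small Python sketch. The statement and expression representations here are our own (tuples over constants, variable names, and a few binary operators); read statements and list operations are omitted, and every block is assumed to end in a goto, if, or print, as in the notes' language:

```python
def fold(exp, vs):
    # Minimal stand-in for reduce(exp, vs): substitute static
    # variables' values and fold constants.
    if isinstance(exp, tuple):
        op, a, b = exp[0], fold(exp[1], vs), fold(exp[2], vs)
        if isinstance(a, int) and isinstance(b, int):
            return {"+": a + b, "-": a - b,
                    "*": a * b, ">": int(a > b)}[op]
        return (op, a, b)
    return vs.get(exp, exp)

def PE(program, static_vars, vs0, pp0):
    pending = {(pp0, tuple(sorted(vs0.items())))}
    marked, code = set(), []
    while pending:
        pp, vs_t = pending.pop()          # process next basic block
        marked.add((pp, vs_t))
        vs = dict(vs_t)
        code.append(f"<{pp}, {dict(vs_t)}>:")
        bb = list(program[pp])
        while bb:
            S = bb.pop(0)
            if S[0] == "assign":
                _, x, exp = S
                if x in static_vars:
                    vs[x] = fold(exp, vs)            # update vs
                else:
                    code.append(f"  {x} = {fold(exp, vs)}")
            elif S[0] == "print":
                code.append(f"  print {fold(S[1], vs)}")
            elif S[0] == "goto":
                bb = list(program[S[1]])  # continue in the target block
            elif S[0] == "if":
                _, exp, pp1, pp2 = S
                r = fold(exp, vs)
                if isinstance(r, int):    # static condition: follow it
                    bb = list(program[pp1 if r else pp2])
                else:                     # dynamic condition
                    vs_t = tuple(sorted(vs.items()))
                    for t in (pp1, pp2):
                        if (t, vs_t) not in marked:
                            pending.add((t, vs_t))
                    code.append(f"  if {r} then goto <{pp1}, {dict(vs_t)}>"
                                f" else goto <{pp2}, {dict(vs_t)}>")
    return code

# The a^x exercise program, with x static (value 2); a, ans dynamic:
program = {
    "L1": [("assign", "ans", 1), ("goto", "L2")],
    "L2": [("if", (">", "x", 0), "L3", "L4")],
    "L3": [("assign", "ans", ("*", "ans", "a")),
           ("assign", "x", ("-", "x", 1)),
           ("goto", "L2")],
    "L4": [("print", "ans")],
}
residual = PE(program, {"x"}, {"x": 2}, "L1")
# The loop is fully unrolled: two multiplications, then the print.
```

If x were dynamic instead, the condition would be residual and both branch targets would be added to pending with the current static store, mirroring the dynamic-conditional case in the pseudocode.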
Stmt S | Current bb | vs | Emitted Code |
---|---|---|---|
 | | | <L1, ("ann", ["susan", "john", "ann"])>: |
read nameList | L1 | "ann", ["susan", "john", "ann"] | |
read valueList | | | read valueList |
read name | | | |
goto L2 | | | |
if (name != car(nameList)) then goto L3 else goto L4 | L2 | | |
nameList = cdr(nameList) | L3 | "ann", ["john", "ann"] | |
valueList = cdr(valueList) | | | valueList = cdr(valueList) |
goto L2 | | | |
if (name != car(nameList)) then goto L3 else goto L4 | L2 | | |
nameList = cdr(nameList) | L3 | "ann", ["ann"] | |
valueList = cdr(valueList) | | | valueList = cdr(valueList) |
goto L2 | | | |
if (name != car(nameList)) then goto L3 else goto L4 | L2 | | |
print car(valueList) | L4 | | print car(valueList) |
Consider the following program, which computes a^x, for a>0 and x>=0.

read a, x
ans = 1
while (x > 0) {
ans = ans * a
x = x - 1
}
print ans
Part (a): Write the corresponding low-level program.
Part (b): Assume that the division is (x: static, a, ans: dynamic), and that x has the value 2. Trace the execution of the PE algorithm and produce the residual program.
Consider the following code fragment:
Fortunately, it is not difficult to avoid this problem.
If we do standard live-variable analysis, we will know
which variables are live at the start of each block.
When we first defined partial evaluation, we said that we are
given a classification of the program's inputs as either
static or dynamic.
However, when we specified the inputs to PE, we said that
we are given a division of all variables into
static / dynamic (not just its inputs).
The process of computing a division from a specification of
static/dynamic just for inputs is called a binding time analysis.
In the Jones book (section 4.4.6 pages 83-84), a very simple
binding time analysis is defined as follows:
This binding time analysis produces a congruent division.
However, the division is not safe in the sense that
it can cause partial evaluation to fail to terminate, even
for a program that always terminates.
Here's an example:
Given the initial classification x:dynamic, the binding time
analysis given above will classify y as static.
Let's consider what the PE algorithm will do.
It will start by generating the following code:
Clearly, the algorithm will never terminate, because it keeps
generating new instances of the loop for larger and larger values
of y.
Jones was aware of this problem, and he investigated alternatives,
some of which are discussed in his book.
However, we will consider a simpler way to ensure that a division
is safe;
i.e.,
if a program terminates, then the PE algorithm will, too.
Our approach uses a representation of programs called the
Program Dependence Graph, or PDG.
It is more straightforward to define PDGs and how to use them
to do binding time analysis using a program's high-level form,
so that is what we will do below.
The PDG for a procedure has the same nodes as the procedure's
control-flow graph (CFG), except that the PDG has no exit node.
(We're talking here about a CFG that has one node for each statement
and each condition, not one that has one node for each basic block.)
The edges of the PDG represent the procedure's flow and
control dependences.
Flow-Dependence Edges:
Flow-dependence edges are the same as def-use chains: there is a
flow-dependence edge m→n iff all of the following
hold: m assigns to some variable v; n uses v; and there is a path
in the CFG from m to n with no intervening assignment to v.
Draw the CFG for the example program given above (and repeated below
in its high-level form), then draw the PDG with the flow-dependence edges.
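For straight-line code, flow-dependence edges can be computed with a small sketch. The (defined-variable, used-variables) statement representation is our own; handling branches would require a real reaching-definitions analysis:

```python
def flow_deps(stmts):
    # stmts[i] = (variable defined at i, set/list of variables used at i)
    edges = set()
    for n, (_, uses) in enumerate(stmts):
        for v in uses:
            for m in range(n - 1, -1, -1):  # nearest earlier def of v
                if stmts[m][0] == v:
                    edges.add((m, n))       # flow dependence m -> n
                    break
    return edges

#   0: x = read()   1: y = x + 1   2: x = 2   3: z = x + y
stmts = [("x", []), ("y", ["x"]), ("x", []), ("z", ["x", "y"])]
assert flow_deps(stmts) == {(0, 1), (1, 3), (2, 3)}
```

Note that (0, 3) is absent: statement 2 redefines x before statement 3 uses it.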
Control-Dependence Edges:
The source of a control-dependence edge is always a condition
(and the enter node is considered to be a condition that always
evaluates to true).
Having a control-dependence edge m→n means
that condition m controls whether and how often n
executes.
For the simple language that we are considering, a PDG's
control-dependence edges reflect the program's nesting structure:
there is an edge from the condition of a while loop to every
statement in the loop body, and from the condition of an if to
every statement in its then and else branches (and from the enter
node to every top-level statement and condition).
Add the control-dependence edges to the PDG that you drew
for the previous exercise.
Binding Time Analysis:
The problem with the simple binding time analysis given
above is that while it takes flow
dependences into account, it ignores control dependences.
To fix the problem, the PDG can be used to find a safe, congruent
division as follows: find all PDG nodes reachable (via flow- and
control-dependence edges) from the nodes that define the dynamic
inputs; classify as dynamic every variable assigned at a reached
node, and classify all other variables as static.
The original simple binding time analysis defined by Jones
is equivalent to the above technique if when computing the
transitive closure in the PDG we follow only flow-dependence
edges.
Including control-dependence edges, too, makes the division safe.
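As a sketch (with a made-up representation: PDG nodes are integers, edges maps each node to its dependence successors, and def_at maps a node to the variable it defines), the closure over the PDG can be written as:

```python
def dynamic_vars(edges, def_at, dynamic_input_nodes):
    # Follow dependence edges forward from the nodes that define
    # dynamic inputs; every variable defined at a reached node is
    # classified dynamic (all remaining variables are static).
    reached = set(dynamic_input_nodes)
    stack = list(dynamic_input_nodes)
    while stack:
        n = stack.pop()
        for m in edges.get(n, ()):
            if m not in reached:
                reached.add(m)
                stack.append(m)
    return {def_at[n] for n in reached if n in def_at}

# Nodes for: 0: read x   1: y = 0   2: while (x > 0)
#            3: y = y + 1   4: x = x - 1   5: print y
# Flow-dependence edges plus control-dependence edges 2->3, 2->4:
edges = {0: {2, 4}, 4: {2, 4}, 1: {3, 5}, 3: {3, 5}, 2: {3, 4}}
def_at = {0: "x", 1: "y", 3: "y", 4: "x"}
assert dynamic_vars(edges, def_at, {0}) == {"x", "y"}

# With only flow-dependence edges, y would (unsafely) stay static:
flow_only = {0: {2, 4}, 4: {2, 4}, 1: {3, 5}, 3: {3, 5}}
assert dynamic_vars(flow_only, def_at, {0}) == {"x"}
```

The second assertion shows exactly the unsafe classification from the nonterminating example above.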
So far, we have assumed that there is just one
division that is valid at all program points.
This is called a uniform division.
The advantages of using a uniform division are that it is
simpler to compute and to use than a non-uniform division.
However, it has one potentially major disadvantage:
it permits less optimization than a non-uniform division.
For example, consider the following program:
An alternative to using a uniform division is to use
a pointwise division, which provides one division for each
basic block or even for each statement.
An unsafe pointwise division can be computed using standard
dataflow analysis techniques.
The analysis is similar to constant propagation:
the dataflow facts at each program point are the variables that
are dynamic at that point.
The initial dataflow fact is the set of variables specified
as dynamic in the initial classification.
The dataflow function for an assignment x = exp
adds x to the set of dynamic variables if exp
includes a dynamic variable, and otherwise it removes x from
the set of dynamic variables.
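The dataflow function for an assignment can be sketched as follows (a hypothetical helper; exp_vars is the set of variables appearing in exp, and the dataflow fact is the set of variables that are dynamic before the statement):

```python
def transfer_assign(x, exp_vars, dynamic_before):
    # x = exp: x becomes dynamic iff exp mentions a dynamic variable.
    if exp_vars & dynamic_before:
        return dynamic_before | {x}
    return dynamic_before - {x}

# z = x + y with y dynamic makes z dynamic:
assert transfer_assign("z", {"x", "y"}, {"y"}) == {"y", "z"}
# y = x * 2 with x static makes y static from this point on:
assert transfer_assign("y", {"x"}, {"y"}) == set()
```

The second case is what a pointwise division can exploit and a uniform division cannot.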
However, how to include the effects of control dependences (to make the
division safe) is an interesting challenge.
The following program loops through an array;
it adds the even values, subtracts the odd
values, and prints the final result.
Write the corresponding low-level program.
Then, assuming that the initial classification of the inputs is
Now let's consider how to do partial evaluation of a simple
functional language.
We'll assume that our language has the following features:
We'll also assume that all of main's formal parameters are
dynamic.
The only potential static variables in a program will be the formal parameters
of some other function.
Here is the example program we used before (the one that searches
name and value lists for a given name), this time written in our
functional language.
In this example, as in the original example,
the values of the name and the name list are provided.
So while main's valueList parameter and
find's vList parameter are dynamic,
find's name and nameList parameters are static.
As for the procedural case, partial evaluation of a functional
program involves two basic steps:
An expression is dynamic if it is any of the following:
NOTE: The above rules are not quite right: they are like
the simple binding time analysis defined above
for the procedural case.
Both ignore the effects of control dependences.
This issue is explored further in a Test Yourself
exercise below.
For the example program given above, the binding-time analysis would
create the following division:
Dead Static Variables
x = 10
if dynamic-expression then goto L2 else ...
x = 20
if dynamic-expression then goto L2 else ...
L2:
x = 30
...
Assume that x has been classified as static.
Then partial evaluation of the above code will produce
the following residual code:
if dynamic-expression then goto <L2, 10> else ...
if dynamic-expression then goto <L2, 20> else ...
<L2, 10>:
x = 30
...
<L2, 20>:
x = 30
...
Note that PE has created two copies of block L2 that differ
only in their (new) labels.
If block L2 did not start with an assignment to x,
and if it included a use of x, then we would want
two copies, because they would be specialized differently
based on the different incoming values of x.
It is the fact that x is dead at the start
of block L2 that makes the two copies identical,
causing useless blow-up of the residual program.
When processing a statement of the form

if dynamic-expression then goto L2 else ...

instead of inserting (L2, vs) into pending,
we can insert (L2, vsLive),
where vsLive is the subset of vs that includes
values only for the variables that are live at L2.
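This restriction can be sketched with a hypothetical helper (live_at is assumed to come from a standard live-variable analysis, mapping each label to the set of variables live at its start):

```python
def restrict_to_live(vs, label, live_at):
    # Keep static values only for variables live at the jump target.
    return {x: v for x, v in vs.items() if x in live_at[label]}

live_at = {"L2": {"y"}}            # x is dead at L2 in the example
assert restrict_to_live({"x": 10, "y": 3}, "L2", live_at) == {"y": 3}
assert restrict_to_live({"x": 20, "y": 3}, "L2", live_at) == {"y": 3}
```

The two assertions show how the states that previously produced distinct labels <L2, 10> and <L2, 20> now collapse into one.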
How to Compute a Division
Start with the division B given by the initial classification of
the inputs, classifying all other variables as static.
Then, repeat until there is no change:
if the program includes an assignment

x = exp

such that x:static is in division B, and exp includes
some variable v such that v:dynamic is in B,
then replace x:static with x:dynamic in B.
L1:
read x
y = 0
goto L2
L2:
if (x > 0) then goto L3 else goto L4
L3:
y = y + 1
x = x - 1
goto L2
L4:
print y
<L1, ?>:
read x
if (x > 0) then goto <L3, 0> else goto <L4, 0>
and putting the pairs (L3, 0) and (L4, 0) in the pending set.
Processing the pair (L3, 0) causes the current value of y to
be updated from 0 to 1;
the following code is generated:
<L3, 0>:
x = x - 1
if (x > 0) then goto <L3, 1> else goto <L4, 1>
and the pairs (L3, 1) and (L4, 1) are added to the pending set.
read x
y = 0
while (x > 0) {
y = y + 1
x = x - 1
}
print y
Uniform vs Pointwise Divisions
read x
read y
z = x + y
y = x * 2
w = z/y
print w
If the initial classification says that x is static (with
value 10) and y is dynamic, then the division will
be ({x}: static, {y, z, w}: dynamic), and the residual program will
be as follows:
read y
z = 10 + y
y = 20 // why include this assignment?
w = z/y // why isn't this "w = z/20" ?
print w
read array
read len
n = 0
result = 0
while (n < len) {
item = array[n]
if (even(item)) result = result + item
else result = result - item
n++
}
print result
array: dynamic
len: static, value = 3
compute a (uniform) division of the variables,
trace the execution of the PE algorithm, and produce the residual program.
Partial Evaluation of a Simple, Functional Language
Example
(define (main valueList)
(call find "ann" ["susan", "john", "ann"] valueList)
)
(define (find name nameList vList)
(if (= (car nameList) name)
then (car vList)
else (call find name (cdr nameList) (cdr vList))
)
)
The PE Algorithm
Step 1: Binding Time Analysis
Given a functional program, the basic rules for computing a division
are as follows:
Static in main | Dynamic in main | Static in find | Dynamic in find |
---|---|---|---|
"ann" | valueList | name | vList |
["susan" "john" "ann"] | call find... | nameList | car vList |
 | main's return | car nameList | cdr vList |
 | | = ... | call find |
 | | if ... | find's return |
 | | cdr nameList | |
 | Procedural | Functional |
---|---|---|
what BTA classifies | variables | variables (formals), function returns, expressions |
what's in the worklist | labels of the form <oldLabel><list of static-vars' values> | fn names of the form <origName><list of static-parameters' values> |
what's processed (specialized) | each statement of the current basic block | each (sub)expression of the current fn |
what may be copied and specialized "in place" | a basic block that is the target of an unconditional goto, or of a conditional goto w/ a static condition | a fn whose formals and return are all static |
what may be copied and have multiple specialized versions with new names | a basic block that is the target of a conditional goto w/ a dynamic condition | a fn w/ a dynamic return |
For our example functional program, the specialization phase would create the residual program shown below. In this case, there are three instances of new versions of functions whose names include the values of the static formals, and there is no instance of a function being specialized "in place" (because function find has a dynamic return). In this example, there is only one call to each of the three new functions. In general, it is possible to have multiple calls to new functions, just as, in the procedural case, it is possible to have multiple jumps to a new label.
(define (main valueList)
  (call find-"ann"-["susan" "john" "ann"] valueList)
)
(define (find-"ann"-["susan" "john" "ann"] vList)
  (call find-"ann"-["john" "ann"] (cdr vList))
)
(define (find-"ann"-["john" "ann"] vList)
  (call find-"ann"-["ann"] (cdr vList))
)
(define (find-"ann"-["ann"] vList)
  (car vList)
)
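The chain of specialized definitions can also be sketched programmatically (a Python sketch; the string-based representation of specialized function names and bodies is ours):

```python
def specialize_find(name, name_list):
    # One specialized definition per static (name, nameList) pair
    # reached during specialization; name is assumed to be in the list.
    defs = []
    while True:
        label = f"find-{name}-{name_list}"
        if name_list[0] == name:
            defs.append((label, "(car vList)"))   # base case: found it
            return defs
        name_list = name_list[1:]                 # cdr(nameList)
        defs.append((label, f"(call find-{name}-{name_list} (cdr vList))"))

defs = specialize_find("ann", ["susan", "john", "ann"])
assert len(defs) == 3
assert defs[-1] == ("find-ann-['ann']", "(car vList)")
```

As in the residual program above, each specialized find either recurses on a shorter static name list or, once the name is found, returns the head of the dynamic value list.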
Below is the functional version of the procedural program that loops through an array, adding even values and subtracting odd values. (Since our functional language doesn't include arrays, we assume that the array parameter is actually a list.)
(define (main array)
  (call process array 0 0 3)
)
(define (process array n result len)
  (if (n < len)
    then (if (even (car array))
           then (call process (cdr array) (n + 1) (result + (car array)) 3)
           else (call process (cdr array) (n + 1) (result - (car array)) 3)
         )
    else result
  )
)
Compute a division for this program, then use specialization to produce the residual program.
The rules for computing a division of a functional program don't take control dependences into account. Find an example that illustrates the problem: i.e., an example of a program that does terminate when run, but for which partial evaluation using the given, unsafe rules for computing a division does not terminate.
Then give a new rule for computing a division that solves the problem.