Interprocedural Analysis: Computing Jump Functions

Overview
Jump functions
Using summary info
Return jump functions

Overview

The ideas presented here are from the paper "Interprocedural constant propagation", by D. Callahan, K. Cooper, K. Kennedy, and L. Torczon, published in the Proceedings of the Symposium on Compiler Construction in 1986. The basic idea is to compute a jump function J_s,f for each call site s and each formal f of the called procedure. Each jump function summarizes the (dataflow) effects of all paths from the start of the procedure that contains the call to call site s. The paper by Callahan et al. assumes that the problem of interest is constant propagation.

For example, given the following call to p and header of p:

    call site s in proc q                  called proc p
    s: call p(a1, a2, ..., an)           procedure p(f1, f2, ..., fn)

there would be n jump functions: J_s,f1, J_s,f2, ..., J_s,fn. Jump function J_s,fk takes as input the values that q's formals are guaranteed to have at the start of procedure q, and produces as output the (single) value that p's formal fk will have for this call (i.e., the value of actual parameter ak).

Jump functions are used to determine what is true at procedure entry (not how a procedure call affects dataflow information) by combining values from all call sites.

Example

procedure main() {        procedure q(f1, f2, f3) {       procedure p(f4, f5, f6) {
  s1: call q(1,2,3);        int y = f3*2;                   ...
  s2: call q(3,2,1)         f1 = 0;                       }
}                           s3: call p(f1, f2, y);
                            f2 = y;
                          }

For this example, the best jump functions would be:

J_s1,f1() = 1
J_s1,f2() = 2
J_s1,f3() = 3

J_s2,f1() = 3
J_s2,f2() = 2
J_s2,f3() = 1

J_s3,f4(f1,f2,f3) = 0
J_s3,f5(f1,f2,f3) = f2
J_s3,f6(f1,f2,f3) = f3*2
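
As an illustration only, the jump functions at call site s3 can be written as ordinary functions of q's formals. The dictionary layout and the helper name values_at_s3 are assumptions made for this sketch; they do not come from the paper.

```python
# Each jump function maps the caller's formals (a dict) to the value
# that one formal of the callee receives at this call site.
J_s3 = {
    "f4": lambda q: 0,            # f1 is overwritten with 0 before the call
    "f5": lambda q: q["f2"],      # f2 is passed through unchanged
    "f6": lambda q: q["f3"] * 2,  # y = f3*2
}

def values_at_s3(q_formals):
    """Values of p's formals for one concrete set of values of q's formals."""
    return {formal: jump(q_formals) for formal, jump in J_s3.items()}

from_s1 = values_at_s3({"f1": 1, "f2": 2, "f3": 3})  # q called from s1
from_s2 = values_at_s3({"f1": 3, "f2": 2, "f3": 1})  # q called from s2
print(from_s1)
print(from_s2)
```

The two results agree on f4 and f5 but differ on f6 (6 versus 2), which suggests why only f4 and f5 can be constant at the entry of p.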

A single, general algorithm can be defined that uses jump functions to determine what value each formal f must have at procedure entry (for all procedures). Code for this algorithm is given below. Then we will talk about three different ways to compute the jump functions. Note that the possible values for each formal form the following lattice:

                          Top
                        / | | \
                   ... -1 0 1 2 ...
                        \ | | /
                         Bottom

A formal that is determined to have value "Bottom" is not constant: it may have different values on different calls. The "Top" value is used for initialization; a formal will only end up with value "Top" if its procedure is never called. Here's the general algorithm:

// determine, for each formal parameter f, what value f must have at procedure entry

// initialize
for each formal parameter f in the program: Val(f) = top

for each call site s
   for each formal f of the called procedure
      Val(f) = Val(f) meet J_s,f( values of the formals of the calling procedure )

put all formals on a worklist

// iterate until the greatest fixed point is found
while the worklist is not empty {
   remove a formal f from the worklist
   let p be the procedure whose formal f is
   for each call site s in p {
      for each formal x of the called procedure such that
               f is used in jump function J_s,x {
         tmp = Val(x) meet J_s,x( values of p's formals )
         if ( tmp < Val(x) ) {
            Val(x) = tmp;
            add x to the worklist
         }
      }
   }
}

Note that if the program has no recursive procedures, then the "iterate" loop is not needed. Formals can be given values by handling procedures in topological order on the call graph (give values to procedure p's formals only after giving values to the formals of all procedures that call p). For the example given above, the formals' values would be:

f1 = bottom
f2 = 2
f3 = bottom
f4 = 0
f5 = 2
f6 = bottom
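
The general algorithm can be sketched in Python and run on the example program. This is a minimal sketch, not the authors' implementation: the tables FORMALS and CALL_SITES, the helper times2, and the function name propagate are all assumptions made for illustration. Each jump function is stored together with the set of caller formals it uses, and the jump functions are the "best" ones listed earlier.

```python
TOP, BOTTOM = "top", "bottom"

def meet(a, b):
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM

def times2(v):
    # Lift "*2" to the lattice: Top and Bottom pass through unchanged.
    return v if v in (TOP, BOTTOM) else v * 2

# procedure -> its formals
FORMALS = {"main": [], "q": ["f1", "f2", "f3"], "p": ["f4", "f5", "f6"]}

# call site -> (caller, {callee formal: (caller formals used, jump function)})
CALL_SITES = {
    "s1": ("main", {"f1": ((), lambda v: 1),
                    "f2": ((), lambda v: 2),
                    "f3": ((), lambda v: 3)}),
    "s2": ("main", {"f1": ((), lambda v: 3),
                    "f2": ((), lambda v: 2),
                    "f3": ((), lambda v: 1)}),
    "s3": ("q", {"f4": ((), lambda v: 0),
                 "f5": (("f2",), lambda v: v["f2"]),
                 "f6": (("f3",), lambda v: times2(v["f3"]))}),
}

def propagate():
    val = {f: TOP for fs in FORMALS.values() for f in fs}
    owner = {f: proc for proc, fs in FORMALS.items() for f in fs}

    # Initialization pass over every call site.
    for _caller, jumps in CALL_SITES.values():
        for x, (_uses, jump) in jumps.items():
            val[x] = meet(val[x], jump(val))

    # Iterate until nothing changes.
    worklist = list(val)
    while worklist:
        f = worklist.pop()
        for caller, jumps in CALL_SITES.values():
            if caller != owner[f]:
                continue
            for x, (uses, jump) in jumps.items():
                if f not in uses:
                    continue
                tmp = meet(val[x], jump(val))
                if tmp != val[x]:  # meet can only lower a value
                    val[x] = tmp
                    worklist.append(x)
    return val

print(propagate())
```

On this example the initialization pass already reaches the final values, as expected for a program with no recursion whose call sites happen to be visited callers-first.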

Jump Functions

Now we'll consider three ways to compute jump functions:

All or Nothing
Pass Through
Symbolic Execution

In each case, the examples will be based on constant propagation; however it should be clear how to use the same techniques for other dataflow problems.

Note: The Callahan et al. paper on which these notes are based seems to assume that alias analysis (to determine the possible aliases of reference formals and globals) has already been done. Thus, we know, for every definition and use of a variable x, what other variables might be defined or used, and we assume that use/def information is used when appropriate (e.g., when doing constant propagation or reaching defs analysis).

All or Nothing

For each procedure, do normal constant propagation:
- Start with all formals mapped to bottom.
- Treat each "call p(a1, a2, ..., an)" as setting all actuals and all globals to bottom.

If the dataflow fact just before call site s maps actual a to a constant value c, then define J_s,f = c (where f is the formal of the called procedure that corresponds to actual a). Otherwise, define J_s,f = bottom.

For our example program, if we do constant propagation on procedure q, the dataflow fact just before call site s3 is:

f1 -> 0
f2 -> bottom
f3 -> bottom
y -> bottom

So the jump function J_s3,f4(f1, f2, f3) = 0, while the jump functions for f5 and f6 at this call site just return bottom.
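
The all-or-nothing computation for q can be sketched on straight-line code. The tuple encoding of statements and the names eval_expr, all_or_nothing, and Q_BODY are assumptions made for this sketch; it handles only assignments and calls whose actuals are variables, and the example has no globals.

```python
BOTTOM = "bottom"

def eval_expr(expr, env):
    """Evaluate an expression over constants-or-bottom."""
    kind = expr[0]
    if kind == "const":            # ("const", value)
        return expr[1]
    if kind == "var":              # ("var", name)
        return env.get(expr[1], BOTTOM)
    if kind == "mul":              # ("mul", variable name, constant)
        left = env.get(expr[1], BOTTOM)
        return BOTTOM if left == BOTTOM else left * expr[2]
    raise ValueError(kind)

def all_or_nothing(formals, body):
    env = {f: BOTTOM for f in formals}  # all formals start at bottom
    jump = {}
    for stmt in body:
        if stmt[0] == "assign":         # ("assign", target, expr)
            _, target, expr = stmt
            env[target] = eval_expr(expr, env)
        elif stmt[0] == "call":         # ("call", site, actuals, callee formals)
            _, site, actuals, callee_formals = stmt
            for a, f in zip(actuals, callee_formals):
                # The dataflow fact just before the call decides the jump function.
                jump[(site, f)] = env.get(a, BOTTOM)
            for a in actuals:
                env[a] = BOTTOM         # the call may modify every actual
    return jump

Q_BODY = [
    ("assign", "y", ("mul", "f3", 2)),
    ("assign", "f1", ("const", 0)),
    ("call", "s3", ["f1", "f2", "y"], ["f4", "f5", "f6"]),
    ("assign", "f2", ("var", "y")),
]

jumps = all_or_nothing(["f1", "f2", "f3"], Q_BODY)
print(jumps)
```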

Pass Through

The pass-through approach is a way to enhance the results of the "all or nothing" approach, by taking into account cases where a procedure's formal parameter is "passed through" unchanged to another procedure (e.g., procedure p has a formal parameter f, which is not modified by p and is passed as an actual parameter in a call to q).

The pass-through approach involves using the results of reaching-definitions analysis (as well as the constant propagation used for the all-or-nothing approach):

For each procedure, do reaching definitions analysis.

Treat the enter node for a procedure p with formals f1..fn as defining all formals.
Treat each call node, "call p(a1, a2, ..., an)" as possibly defining each actual.

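For each call site s, for each actual a:

If a is a formal of the calling procedure and only one definition of a reaches s, namely the definition at the enter node, then define J_s,f = a (where f is the formal of the called procedure that corresponds to actual a). Otherwise, define J_s,f as in the all-or-nothing approach.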

In our example, at call site s3 the second actual parameter (f2) is only reached by the definition at the start of q; therefore, the corresponding jump function is: J_s3,f5(f1, f2, f3) = f2
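
A minimal sketch of the pass-through approach on straight-line code follows. The statement encoding, the ENTER label, and the ("formal", name) result are assumptions made for this sketch. To keep it short, an assignment carries either a constant or None (meaning "not a constant under all-or-nothing").

```python
BOTTOM = "bottom"
ENTER = "enter"  # label of the definition made at the procedure's enter node

def pass_through(formals, body):
    reaching = {f: {ENTER} for f in formals}  # variable -> reaching definitions
    const = {f: BOTTOM for f in formals}      # all-or-nothing facts
    jump = {}
    for label, stmt in enumerate(body):
        if stmt[0] == "assign":               # ("assign", target, constant or None)
            _, target, value = stmt
            reaching[target] = {label}        # an assignment kills earlier definitions
            const[target] = BOTTOM if value is None else value
        else:                                 # ("call", site, actuals, callee formals)
            _, site, actuals, callee_formals = stmt
            for a, f in zip(actuals, callee_formals):
                if a in formals and reaching[a] == {ENTER}:
                    jump[(site, f)] = ("formal", a)       # passed through unchanged
                else:
                    jump[(site, f)] = const.get(a, BOTTOM)  # fall back to all-or-nothing
            for a in actuals:
                # A call only *possibly* defines an actual, so it adds a
                # definition without killing the ones that already reach.
                reaching[a] = reaching.get(a, set()) | {label}
                const[a] = BOTTOM
    return jump

Q_BODY = [
    ("assign", "y", None),   # y = f3*2, not a constant when f3 is bottom
    ("assign", "f1", 0),     # f1 = 0
    ("call", "s3", ["f1", "f2", "y"], ["f4", "f5", "f6"]),
    ("assign", "f2", None),  # f2 = y
]

jumps = pass_through(["f1", "f2", "f3"], Q_BODY)
print(jumps)
```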

Symbolic Execution

For each call site s:

Make a fresh copy of the enclosing procedure
For each actual a at call site s, eliminate all code that cannot affect the value of a at s (i.e., take a backward slice with respect to a at s).
Replace each "call p(a1, a2, ..., an)" in the reduced code with: a1 = a2 = ... = an = bottom
This version of the code is itself the appropriate jump function.

In our example, the slice of procedure q with respect to actual y at call site s3 is:

y = f3*2

so J_s3,f6(f1,f2,f3) = f3*2
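
The slicing step can be sketched for straight-line code as follows. The (target, uses, compute) encoding and the names backward_slice, make_jump_function, and Q_BEFORE_S3 are assumptions made for this sketch. It evaluates the slice on concrete values only, and it does not model calls inside the slice, since none precede s3 in q.

```python
def backward_slice(statements, actual):
    """Statements that can affect `actual` after `statements`, in program order."""
    needed = {actual}
    kept = []
    for target, uses, compute in reversed(statements):
        if target in needed:
            kept.append((target, uses, compute))
            needed.discard(target)  # this assignment defines target...
            needed.update(uses)     # ...from the variables it uses
    return list(reversed(kept))

def make_jump_function(slice_, actual):
    """The reduced code itself serves as the jump function."""
    def jump(formals):
        env = dict(formals)
        for target, _uses, compute in slice_:
            env[target] = compute(env)
        return env[actual]
    return jump

# The part of q that precedes call site s3.
Q_BEFORE_S3 = [
    ("y", ["f3"], lambda env: env["f3"] * 2),
    ("f1", [], lambda env: 0),
]

slice_y = backward_slice(Q_BEFORE_S3, "y")
J_s3_f6 = make_jump_function(slice_y, "y")
J_s3_f4 = make_jump_function(backward_slice(Q_BEFORE_S3, "f1"), "f1")
J_s3_f5 = make_jump_function(backward_slice(Q_BEFORE_S3, "f2"), "f2")

entry = {"f1": 1, "f2": 2, "f3": 3}
print(J_s3_f4(entry), J_s3_f5(entry), J_s3_f6(entry))
```

The slice for y keeps only the assignment to y, and the slice for f2 is empty, so its jump function simply returns the entry value of f2.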

Improving Jump Functions by Using Summary Information

It is sometimes possible to get better jump functions by taking GMOD information into account. The idea is to consider a call like "call p(a1, a2, ..., an)" a definition of actual ak or global g only if it is in the GMOD set for that call site.
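
The effect of GMOD on the treatment of a call can be sketched as a dataflow function for the call node. The function name apply_call and its parameters are hypothetical; the GMOD set is assumed to be given in terms of the caller's variables.

```python
BOTTOM = "bottom"

def apply_call(env, actuals, globals_, gmod=None):
    """Constant-propagation fact after a call.

    Without GMOD, every actual and every global is set to bottom.
    With GMOD, only the variables the call may modify are set to bottom.
    """
    killed = set(actuals) | set(globals_)
    if gmod is not None:
        killed &= set(gmod)
    return {v: (BOTTOM if v in killed else c) for v, c in env.items()}

# Fact just before s3 in q (the example has no globals).
before_s3 = {"f1": 0, "f2": BOTTOM, "f3": BOTTOM, "y": BOTTOM}

without_gmod = apply_call(before_s3, ["f1", "f2", "y"], [])
# Assuming p modifies none of its formals, GMOD for s3 is empty.
with_gmod = apply_call(before_s3, ["f1", "f2", "y"], [], gmod=set())
print(without_gmod["f1"], with_gmod["f1"])
```

With the empty GMOD set, f1 keeps the constant 0 after the call instead of dropping to bottom.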

Return Jump Functions

A return jump function is similar to a jump function, but instead of summarizing the effects of paths from procedure entry to a call site, it summarizes the effects of paths from procedure entry to procedure exit. Therefore, the return jump function for procedure p can be used to define a better dataflow function for each call to p than can be done just using GMOD.

Like jump functions, return jump functions can be computed using any of the three approaches (all-or-nothing, pass-through, or symbolic execution). For our example program, if we assume that procedure p does not modify any of its formals, then the return jump functions for the formals of q would be as shown in the table below for the three different approaches (always assuming that summary information is used to determine that the call at s3 has no effect on q's formals). Note that the return jump function for formal f of procedure p is called R_p,f.

Return Jump Function     All-or-Nothing     Pass-Through     Symbolic Execution
R_q,f1(f1,f2,f3)         0                  0                0
R_q,f2(f1,f2,f3)         bottom             bottom           f3*2
R_q,f3(f1,f2,f3)         bottom             f3               f3
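
To make the difference in precision concrete, the return jump functions for q can be written out by hand and evaluated on one set of entry values. The RETURN_JUMP table and the helper exit_values are assumptions made for this sketch. The entries are derived from q's body under the same assumption as above, namely that the call at s3 does not affect q's formals.

```python
BOTTOM = "bottom"

# approach -> {formal of q: return jump function of q's entry values}
RETURN_JUMP = {
    "all-or-nothing": {
        "f1": lambda f1, f2, f3: 0,       # f1 = 0 is a constant assignment
        "f2": lambda f1, f2, f3: BOTTOM,  # f2 = y, and y is not constant
        "f3": lambda f1, f2, f3: BOTTOM,  # formals start at bottom
    },
    "pass-through": {
        "f1": lambda f1, f2, f3: 0,
        "f2": lambda f1, f2, f3: BOTTOM,  # f2 is redefined, so no pass-through
        "f3": lambda f1, f2, f3: f3,      # f3 is never modified
    },
    "symbolic execution": {
        "f1": lambda f1, f2, f3: 0,
        "f2": lambda f1, f2, f3: f3 * 2,  # f2 = y = f3*2
        "f3": lambda f1, f2, f3: f3,
    },
}

def exit_values(approach, f1, f2, f3):
    """Values of q's formals at exit, as predicted by one approach."""
    return {formal: r(f1, f2, f3) for formal, r in RETURN_JUMP[approach].items()}

for approach in RETURN_JUMP:
    print(approach, exit_values(approach, 1, 2, 3))
```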
