Partial Evaluation, Part 2

Basic Principle of Partial Evaluation

Example: A summation loop

The program:

          read(N)
          i := 1
          sum := 0
          goto loop
   loop:  if i > N goto end else body
   
   body:  sum := sum + i
          i := i + 1
          goto loop

   end:   print sum
Convert to:
  ( List of reads
    List of blocks )
We get:
  
  (
   (
      (read N)
   )
   (
    ( block nil
      (assign i 1)
      (assign sum 0)
      (goto loop)
    )
    ( block loop
        (cond ...)
    )
   )
  )

The Partial Evaluation Principle: Reparenthesization

  interp()

    lab     \
             >-- local variables
    store   /

  at a halt:  (pp, store)    // this is a computation state
                             // pp = program point
                             // halt only before a basic block

  partial computation state:  (pp,vs)   // vs = values of statically
                                        // determined variables
Note: these states become labels on the basic blocks of the residual program.
Original      Specialized
program:      program:

   pp: ~~~  / (pp,vs1): ~~~    \
       ~~~ /            ~~~    |
       ~~~<   (pp,vs2): ~~~    } We do not want an infinite number of these!
       ~~~ \            ~~~    |
            \ (pp,vs3): ~~~    /
                        ~~~

               \__  __/
                  \/ 
                 poly = the set of specialized program points

Example

      read(z)
      y := 1
  q:  if y < 3 then goto r else s
  r:  y := y + 1
      z := z + 1
      goto q
  s:  print z
Consider a trace of a ``symbolic execution'' of the program. We introduce a name (z') for the unknown input value in computation states of the trace:
      (q,(1,z'))         // z' is whatever z was, where z is a value.
      (r,(1,z'))         // z' was "z-bar" in class
      (q,(2,z'+1))
      (r,(2,z'+1))
      (q,(3,z'+2))
      (r,(3,z'+2))
comp state = (pp, (vals of static vars, vals of dynamic vars))

           =~ ((pp, vals of static vars), vals of dynamic vars)
               \_______________________/
                           |
            these will be the program points of P'.

GOAL:  P ==>_i P'   (the residual program)
We guarantee we have preserved the semantics of the original program. New states can be mapped back to original states very easily.

Back to the example ...

The specialized program:

          read(z)
  (q,1):  goto(r,1)     --->  ((q,1),z')
  (r,1):  z := z + 1    --->  ((r,1),z')
          goto(q,2)     
  (q,2):  goto(r,2)     --->  ((q,2),z'+1)
  (r,2):  z := z + 1    --->  ((r,2),z'+1)
          goto(q,3)
  (q,3):  goto(s,3)
  (s,3):  print(z)
Compression of transitions:
          read(z)
  (r,1):  z := z + 1
  (r,2):  z := z + 1
  (s,3):  print(z)

  Note 1: This cannot be mapped directly back to original program.
  Note 2: This is called "converting data into control" (i.e., y was
          built into the control flow).
We need a program to keep the static portion around, while spitting out fragments of the residual program.

How to Perform Partial Evaluation

We extend the notion of static and dynamic inputs to other entities in the program.

Definition: A division is a classification of each variable as either S (``static'' or ``supplied'') or D (``dynamic'' or ``delayed'').

Definition: A division is uniform if each variable v has the same classification at every point in the program; it is non-uniform otherwise.

It would make no sense to have the following:

       V : D  (V is dynamic)
         .
         .  --->  W := f(V)    // Because V is dynamic, we cannot allow
         .                     // W to be static.
       W : S
So, dependent variables (W is dependent on V) must remain in the same division class.

Definition: A division is congruent if variables classified as S depend only on variables that are S (or, ``any variable that depends on a D variable must also be D''; or, ``D begets D'').

The example program above has a uniform division and is congruent:

       y : S
       z : D

Static Infinite Behavior

A dynamic test that always leads to a particular part of the program:

            ( dynamic )         If the B branch is always taken,
                /\              after a dynamic test, it cannot be 
               /  \             statically checked that block A
              /    \            will never be executed (this would be
             A      B           tantamount to solving the halting 
                                problem).
Back to the example ...
     (2, z'+1)
     (3, z'+2)
         \__/
           \_  none of these affect 
               program conversion
If at any time z' would have affected the static values (the y values), then we must reflect that in the conversion.

A congruent division says we can never cross from D --> S, but it is ok to go S --> D. In other words:

       S  D                S  D        // going down
        \                    /  
         \       NOT        /
       S  D                S  D 
We are doing full execution on the static part. We generate program points of the form (pp,vs).

Definition: The set of all specialized program points (pp,vs) that are reachable (via symbolic execution) from the start of the program is called poly.

An Algorithm for Partial Evaluation

  1. Find a congruent division (Binding Time Analysis - BTA)
  2. Generate poly and residual program
  3. Compress transitions
Note: it is possible to merge steps (2) and (3).

Some auxiliary subroutines:

  eval(exp,vs)   :  evaluate exp on vs (assuming that exp contains
                    only static variables)

  simplify(exp,vs) :  simplification of exp
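
To make these concrete, here is a minimal Python sketch of the two subroutines. (The tuple-based expression representation, and the convention that vs maps only the static variables to values, are assumptions made for this sketch, not part of the notes.)

  # Expressions are tuples: ('const', n), ('var', 'y'), ('+', e1, e2); the
  # environment vs maps static variable names to their values.

  def eval_exp(exp, vs):
      """Fully evaluate exp, assuming it mentions only static variables."""
      tag = exp[0]
      if tag == 'const': return exp[1]
      if tag == 'var':   return vs[exp[1]]          # must be present in vs
      if tag == '+':     return eval_exp(exp[1], vs) + eval_exp(exp[2], vs)
      raise ValueError(exp)

  def simplify(exp, vs):
      """Fold the static parts of exp; leave the dynamic parts residual."""
      tag = exp[0]
      if tag == 'const':
          return exp
      if tag == 'var':
          return ('const', vs[exp[1]]) if exp[1] in vs else exp
      if tag == '+':
          l, r = simplify(exp[1], vs), simplify(exp[2], vs)
          if l[0] == 'const' and r[0] == 'const':   # both halves static
              return ('const', l[1] + r[1])
          return ('+', l, r)
      raise ValueError(exp)

For example, simplify(('+', ('var','y'), ('var','z')), {'y': 2}) yields ('+', ('const', 2), ('var', 'z')).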
Look at (2):

Generating poly and the residual program

      poly := { (pp0, vs0) }     // pp0 = start of program
                                 // vs0 = info we are given about the input

      while poly contains an unmarked (pp,vs) do
        select an unmarked (pp,vs) and mark it
        generate code for (pp,vs)
        poly := poly U successors(pp,vs)
      end
Recapping so far:

A Partial Evaluation Algorithm

This will be a worklist algorithm. Start by performing the following operations:
  poly := { (pp0, vs0) };
  new_program := empty_program
  insert reads for vars - {vars of vs0}
Now, the worklist part. We assume that poly and new_program are global variables:
  while poly contains an unmarked (pp,vs) do
     Select and mark an unmarked (pp,vs) from poly
     Generate(pp, vs)  /* Generate code for basic block that starts at pp,
                        * inserting successors into poly */
  od
Now, to generate code, we use the function Generate():
Generate( pp, vs )
  new_blk := empty_blk
  for command:=pp; exists more stmts in the block; command:=next(command) do
    /* Here there should be a switch statement based on the type of
     * command being worked on.  The actions taken for each type of cmd
     * are shown here in tabular form:
     */
      +-------------------------------------------------------------------------------+
      |cmd type   | condition   | Perform action     | Append to          | Insert    |
      |           |             |                    | new_block          | into poly |
      +-------------------------------------------------------------------------------|
      |x := exp   | x is Static | vs :=              |        -           |     -     |
      |           |             | vs[x->eval(exp,vs)]|                    |           |
      |           |-------------------------------------------------------------------|
      |           | x is Dynamic| simplified_exp :=  | "x:="              |     -     |
      |           |             |  simplify(exp,vs)  | simplified_exp";"  |           |
      +-------------------------------------------------------------------------------|
      |return exp |     -       | simplified_exp :=  | "return("          |     -     |
      |           |             |  simplify(exp,vs)  | simplified_exp");" |           |
      +-------------------------------------------------------------------------------|
      |goto pp'   |     -       |         -          | "goto" (pp',vs)    | (pp',vs)  |
      +-------------------------------------------------------------------------------|
      |if exp     | exp dyn.    | simplified_exp :=  | "if" simplified_exp| (pp',vs)  |
      |  goto pp' |             |   simplify(exp,vs) |   "goto"(pp',vs)   | (pp'',vs) |
      |else pp''  |             |                    | "else" (pp'',vs)   |           |
      |           |-------------------------------------------------------------------|
      |           | exp static  |                    | "goto"(pp', vs)    | (pp',vs)  |
      |           |eval(exp,vs)=|         -          |                    |           |
      |           |     T       |                    |                    |           |
      |           |-------------------------------------------------------------------|
      |           |eval(exp,vs)=|         -          | "goto"(pp'',vs)    | (pp'',vs) |
      |           |     F       |                    |                    |           |
      +-------------------------------------------------------------------------------+
   od /* end for-loop */
   attach new_block to new_program
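
Here is one way the worklist loop and Generate() might look in Python, reusing eval_exp and simplify from the sketch above. (The program representation -- a dict from labels to command lists, with commands ('assign',x,exp), ('goto',l), ('if',exp,l1,l2), and ('return',exp) -- is again an assumption for the sketch.)

  def specialize(program, entry, vs0, static):
      poly = {(entry, tuple(sorted(vs0.items())))}   # vs frozen for hashing
      marked, new_program = set(), {}
      while poly - marked:
          pp, vs_items = (poly - marked).pop()       # select and mark
          marked.add((pp, vs_items))
          block, succs = generate(program, pp, dict(vs_items), static)
          new_program[(pp, vs_items)] = block        # (pp,vs) labels the block
          poly |= succs
      return new_program

  def generate(program, pp, vs, static):
      block, succs = [], set()
      for cmd in program[pp]:
          if cmd[0] == 'assign':
              x, exp = cmd[1], cmd[2]
              if x in static:
                  vs[x] = eval_exp(exp, vs)          # update the static store
              else:
                  block.append(('assign', x, simplify(exp, vs)))
          elif cmd[0] == 'return':
              block.append(('return', simplify(cmd[1], vs)))
          elif cmd[0] == 'goto':
              tgt = (cmd[1], tuple(sorted(vs.items())))
              block.append(('goto', tgt)); succs.add(tgt)
          elif cmd[0] == 'if':
              e = simplify(cmd[1], vs)
              if e[0] == 'const':                    # static test: one arm survives
                  tgt = (cmd[2] if e[1] else cmd[3], tuple(sorted(vs.items())))
                  block.append(('goto', tgt)); succs.add(tgt)
              else:                                  # dynamic test: keep both arms
                  t1 = (cmd[2], tuple(sorted(vs.items())))
                  t2 = (cmd[3], tuple(sorted(vs.items())))
                  block.append(('if', e, t1, t2)); succs |= {t1, t2}
      return block, succs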

Some notes on compressing transitions. We would like to compress transitions on the fly. Consider the case where we lay down the code "goto(pp',vs)". What if (pp',vs) winds up being another goto? The idea for avoiding this is to not emit the unconditional goto's, but to simply change the current command to be the command at the beginning of (pp',vs). The idea is that we want to suck up that basic block (pp',vs) and start working on that block with our current vs. For example, in the table above consider the case where we have an if statement, and exp is static and evaluates to T. The new action for transition compression on-the-fly would be

  command := lookup(program, pp')
In addition, we would not append anything to new_block, nor insert (pp',vs) into poly (because (pp',vs) will not label a block).
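
In Python terms, the changed case might look like this (generate_compressed is a hypothetical variant of generate() from the sketch above; the other cases are unchanged):

  def generate_compressed(program, pp, vs, static):
      block, succs, todo = [], set(), list(program[pp])
      while todo:
          cmd = todo.pop(0)
          if cmd[0] == 'goto':
              # command := lookup(program, pp'): keep generating in this same
              # residual block from the target's commands, with the current vs;
              # nothing is appended to the block, nothing is inserted into poly.
              todo = list(program[cmd[1]])
          else:
              ...  # the remaining cases are exactly as in generate()
      return block, succs

The static-if case would be compressed the same way: instead of emitting goto (pp',vs), splice in the commands of the selected arm.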

This material is in Chapter 4 of [Jones et al. 93].

We may want to partially evaluate an interpreter int, running a program p:

  [| pe |] int p
The interpreter has available all program variables. It manipulates the environment -- so it knows the values of all of the interpreted program's variables.

Consider the classic structure for an environment:

           env  <= env is dynamic  
           /\ 
          /  \
         /\   \
        /  \   \
      name val  \
       S    D   /\
               /  \
              /\   \
S indicates static and D indicates dynamic variables.

This implementation of the environment destroys the ability to partially evaluate the interpreter (with our simple binding-time analysis) because the whole environment is considered dynamic. An alternative implementation of the environment side-steps this problem. If we unzip the environment and keep separate name and value lists, then the name list will be static:

   name_list (S)         val_list (D)
       /\                    /\
      /  \                  /  \
     a   /\                5   /\
        /  \                  /  \
       w   /\                17  /\
          /  \                  /  \
         b   /\                3   /\
            /  \                  /  \
           d   nil               1   nil
Now consider an example that could appear in some interpreter. (Variables are marked (S)tatic and (D)ynamic in C-style comments):
parameters: name, namelist, valuelist
    search: if name /* S */ = hd(namelist /* S */) goto found else continue
            /* This code needs to be massaged to check for 
             * a nil namelist and take appropriate action. */
  continue: valuelist /* D */ = tl(valuelist)
            namelist  /* S */ = tl(namelist)
            goto search
     found: value /* D */ = hd(valuelist)
We will work through p.e. of this code with name = z and namelist = (x,y,z):
Selected from poly      | Generated                     | Inserted into poly
(also labels for new    | (bodies of new blocks)        |
 program pts)           |                               |
------------------------------------------------------------------------------
(search, (z,(x y z)))   | goto (continue, (z,(x y z)))  | (continue, (z,(x y z)))
------------------------------------------------------------------------------
(continue, (z,(x y z))) | valuelist = tl(valuelist)     |
                        | goto (search, (z,(y z)))      | (search, (z,(y z)))
------------------------------------------------------------------------------
(search, (z,(y z)))     | goto (continue, (z,(y z)))    | (continue, (z,(y z)))
------------------------------------------------------------------------------
(continue, (z,(y z)))   | valuelist = tl(valuelist)     |
                        | goto (search, (z,(z)))        | (search, (z,(z)))
------------------------------------------------------------------------------
(search, (z,(z)))       | goto (found, (z,(z)))         | (found, (z,(z)))
------------------------------------------------------------------------------
(found, (z,(z)))        | value = hd(valuelist)         |

\_______________________  ___________________________/
                        \/ 
            Together, these form new_prog
This example is simple because the branch is always static. We also generated jumps to jumps. After transition compression, we would have:
  valuelist = tl(valuelist)
  valuelist = tl(valuelist)
  value = hd(valuelist)
Or, if we had great transition compression:
  value = hd(tl(tl(valuelist)))
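
The way the static search drives the residual code can be mimicked with a few lines of Python (a sketch under assumed names; the residual code is built as a string):

  def specialize_lookup(name, namelist):
      residual = 'valuelist'               # dynamic: left as an expression
      while namelist[0] != name:           # static test: decided right now
          residual = 'tl(%s)' % residual   # residuate one tl per step
          namelist = namelist[1:]
      return 'value = hd(%s);' % residual

  print(specialize_lookup('z', ['x', 'y', 'z']))
  # prints: value = hd(tl(tl(valuelist)));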

Some dangers of transition compression

A: ~~~~~~~~ (vs1)    B: ~~~~~~~~ (vs3)
   ~~~~~~~~   |         ~~~~~~~~   |
   ~~~~~~~~   |         ~~~~~~~~   |
   ~~~~~~~~   |         ~~~~~~~~   |
   ~~~~~~~~   v         ~~~~~~~~   v
   goto foo (vs2)       goto foo (vs2)

foo: ~~~~~~~~
     ~~~~~~~~
     ~~~~~~~~
     ~~~~~~~~
     ~~~~~~~~
From the foo block, we may generate:
(foo, vs2): ~~~~~~~~
            ~~~~~~~~
            ~~~~~~~~
            ~~~~~~~~
            ~~~~~~~~
We could also generate:
(A, vs1): ~~~~~~~~         (B, vs3): ~~~~~~~~
          ~~~~~~~~                   ~~~~~~~~
          ~~~~~~~~                   ~~~~~~~~
          ~~~~~~~~                   ~~~~~~~~
          goto (foo,vs2)             goto (foo,vs2)

If we are doing transition compression on the fly, we would not end (A,vs1) and (B,vs3) with goto (foo, vs2). Instead, we would copy (foo,vs2) to the location of the goto. This strategy has the drawback that the size of the code can blow up. If (foo,vs2) contains more jumps, we could have more code blow-up -- we could potentially have quadratic code blow-up. We might want to control this somehow.

Another problem can arise. Consider the state vs2 at the end of the original block A. For the end of A, we wanted to generate goto (foo, vs2). Therefore, we needed to generate the block (foo, vs2). What if there is another state vs4 at the end of some other block whose jump leads to (foo, vs4)? Then we may need to generate the new block (foo,vs4). However, we do not want to generate two different blocks (foo,vs2) and (foo,vs4) if the only difference between vs2 and vs4 is in the values of dead variables.

A concrete example of this potential problem is

Start: if exp   /* a dynamic test */
       then a:=1; ...... a .......; goto continue  /* This will become 
                                                    * goto (cont, (1)) */
       else a:=2; ...... a .......; goto continue  /* This will become
                                                    * goto (cont, (2)) */

continue: ~~~~~ \
          ~~~~~  > No use of a
          ~~~~~ /
          a:=3
          ~~~~~
          ~~~~~
          ~~~~~
Naively, we might generate:
(continue,(1)): ~~~~~            (continue,(2)): ~~~~~
                ~~~~~                            ~~~~~
                ~~~~~                            ~~~~~
                a:=3                             a:=3
                ~~~~~ <---- stores the --------> ~~~~~
                ~~~~~       same at this         ~~~~~
                ~~~~~       point                ~~~~~
These basic blocks are identical; we do not want two copies of them.

There are two possible solutions:

  1. If we are using a non-uniform division, change a to be a dynamic variable at the appropriate point.
  2. If we are using a uniform division, introduce a wildcard value, so that we would get the program point (continue,(_))

Generating Extensions

We now discuss an alternative approach to creating specialized programs—called generating extensions. First, recall the specification of a partial evaluator, pe:

  [[pe]] [p, s] = p_s,   where [[p_s]] [d] = [[p]] [s, d]

In the second Futamura projection, we applied this equation to a partial evaluator pe and an interpreter int to create a compiler, pe_int:

  [[pe]] [pe, int] = pe_int

where pe_int is a compiler/translator (because when given a program q, it produces int_q, which is a compiled/translated version of q).

Instead of using int in the previous equation, let's try using some other program p:

  [[pe]] [pe, p] = pe_p

We say that pe_p is a generating extension for program p: when pe_p is run on input s, it produces a version of p that is specialized for input s. (In other words, the ``compiler'' pe_int is a generating extension for int.)

Notice that I said ``a'' generating extension for p, not ``the'' generating extension for p. In general, there are many generating extensions for p, of which pe_p is just one. Moreover, we can state the constraint for a program ge-p to qualify as a generating extension for p, as follows:

  for all s, d:  [[ [[ge-p]] [s] ]] [d] = [[p]] [s, d]

In other words, a generating extension ge-p is a program that when given input s produces a version of p specialized for input s.

How do we create generating extensions? In fact, we already know one method—namely, use a self-applicable partial evaluator to create [[pe]] [pe, pe] = pe_pe. The program pe_pe is a tool for creating generating extensions because, for all p, [[pe_pe]] [p] = [[pe]] [pe, p] = pe_p! In other words, pe_pe is a generating-extension generator.

However, there are other methods for constructing generating extensions that do not involve such heavy-weight machinery as self-applicable partial evaluation, which is what we explain next.

Implementation of generating extensions using operator overloading

If we are working in a language that supports operator overloading, like C++, we can harness overloading to create generating extensions via a simple overloading trick. Moreover, the size of the generating extension ge-p that we create via this method is about the same as the size of p. The reason is that each basic block in p will turn into exactly one basic block in ge-p.

Consider the following straight-line-code example Q:

  int a; int b;
  int x; int y; int z;
  int m; int n; int o; int r;
  read a;
  read b;
  x = a*a;
  y = a*b;
  z = b*b;
  m = x + 3*a;
  n = y + 4*a;
  o = z + 5*a;
  r = m*n*o;
  return(r);
We first do binding-time analysis as usual, and obtain the following division: static variables = {a, x, m}; dynamic variables = {b, y, z, n, o, r}. We then create the generating extension for Q as the following piece of code, which declares the dynamic variables to be of type INT and the static variables to be of type int:
  int a; INT b("b");
  int x; INT y("y"); INT z("z");
  int m; INT n("n"); INT o("o"); INT r("r");
  read a;
  emit("read b;")
  x = a * a;
  y = a * b;
  z = b * b;
  m = x + 3 * a;
  n = y + 4 * a;
  o = z + 5 * a;
  r = m * n * o;
  emit("return(r);")
Above, operators applied to INT operands denote the overloaded INT operations, while operators applied only to int operands are ordinary int operations. (In the original formatting, the INT operations were underlined.)

Class INT is a class that redefines each of the operators that can be used in straight-line code: arithmetic operators, shift operators, bit-manipulation operators, and assignment. When each such operator is evaluated, the result is an abstract-syntax tree (of type AST, say), rather than a value of type int. For instance, class INT looks something like the following:

  class INT {   // Reinterpretation of type int that yields integer abstract-syntax trees
    INT(int n);                            // Constructor, which converts the value n from int to INT
    INT& operator=(const INT &x);          // Build an assignment AST and append it to the tail of a global list of ASTs
    INT& operator+();                      // Build a UnaryPlusExp node
    INT& operator+(INT& x);                // Build a PlusExp node
    friend INT& operator+(int& x, INT& y); // Build a PlusExp node
    friend INT& operator+(INT& x, int& y); // Build a PlusExp node
    INT& operator-();                      // Build a UnaryMinusExp node
    INT& operator-(INT& x);                // Build a MinusExp node
    INT& operator*(INT& x);                // Build a TimesExp node
    ...
    INT(std::string s);                    // Constructor that builds a leaf node that represents a dynamic variable; a declaration AST is appended to the tail of a global list
    AST tree;
  };

When the generating extension for Q is executed with the input 9 for a, the result is a list of ASTs. When the list is unparsed (pretty-printed), we obtain the following program:

  int b;
  int y; int z;
  int n; int o; int r;
  read b;
  y = 9 * b;
  z = b * b;
  n = y + 36;
  o = z + 45;
  r = 108 * n * o;
  return(r);
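
For readers who want to experiment without C++, here is a small Python analogue of the same trick (all names here -- Dyn, assign, code -- are illustrative, not from C-Mix): overloaded operators on dynamic values build expression text, while static ints compute as usual.

  code = []                               # the growing residual program

  class Dyn:                              # plays the role of class INT
      def __init__(self, text): self.text = text
      def _lift(x): return x.text if isinstance(x, Dyn) else str(x)
      def __add__(self, o):  return Dyn('(%s + %s)' % (self.text, Dyn._lift(o)))
      def __radd__(self, o): return Dyn('(%s + %s)' % (Dyn._lift(o), self.text))
      def __mul__(self, o):  return Dyn('(%s * %s)' % (self.text, Dyn._lift(o)))
      def __rmul__(self, o): return Dyn('(%s * %s)' % (Dyn._lift(o), self.text))

  def assign(name, rhs):                  # emit "name = rhs;" and return a leaf
      code.append('%s = %s;' % (name, Dyn._lift(rhs)))
      return Dyn(name)

  # The generating extension for Q, with a static and b dynamic:
  a = 9                                   # read a (the static input)
  code.append('read b;'); b = Dyn('b')
  x = a * a                               # static: computed now (81)
  y = assign('y', a * b)                  # dynamic: residuated
  z = assign('z', b * b)
  m = x + 3 * a                           # static: computed now (108)
  n = assign('n', y + 4 * a)
  o = assign('o', z + 5 * a)
  r = assign('r', m * n * o)
  code.append('return(r);')
  print('\n'.join(code))

Running this prints the same residual assignments as above (modulo parentheses and the declarations).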

The example above shows how a basic block is handled. In general, for a basic block BB we will refer to the version of BB transformed (roughly) as above with INT variables as BB_INT.

To create a generating extension for a program with branches (including loops), we need to address two issues:

  1. In general, the generating-extension program needs to be able to execute its basic blocks (and emit the resulting code) multiple times.
  2. We need a way for the generating-extension program to transfer control to a given basic block, re-establishing the static state with which that block is to be executed.
To see how to do these things, let's consider the following program P, which has four basic blocks, one of which ends with a branch:
  l1: if exp goto l2 else goto l3   // Basic block BB1
  l2: BB2
      goto l4
  l3: BB3
      goto l4
  l4: BB4  
The generating extension will contain four basic blocks—one for each of the four basic blocks in program P—together with a fifth ``control block,'' which has the following structure:
  control_block:
    if (Pending contains no unmarked pair)
      goto exit
    else
      goto control_block1

  control_block1:
    Select an unmarked ⟨label,σ⟩ pair from Pending
    Mark ⟨label,σ⟩
    ASTList = emptyList
    Change the state to σ   // (re-)establish the state of the non-INT variables
    goto label              // Use a computed goto or switch
For basic block BB1, the generating extension contains the following code:
  l1: genlabel("l1",state);
      tmp = exp;           // yields a simplified AST of class ExpNode
      state = snapshot();  // capture the state of the non-INT variables after evaluating exp in case exp contains side-effects
      generate("if (");
      tmp.Unparse();
      generate(")");
      genjmp(l2,state);
      generate("else");
      genjmp(l3,state);
      Insert(Pending,l2,state);
      Insert(Pending,l3,state);
      goto control_block;
For basic block BB2, the generating extension contains the following code:
  l2: genlabel("l2",state);
      BB2INT              // int/INT-transformed code for BB2
      Unparse(ASTList);  // Emit the residual code
      genjmp(l4,state);
      Insert(Pending,l4,state);
      goto control_block;
and a similar translation is performed for BB3 and BB4. The control block accesses Pending and repeatedly dispatches to the appropriate int/INT-transformed basic block. In this way, each basic block can be executed multiple times, and each time it produces code that is added to the (growing) residual program.

The operator-overloading method was used circa 1995 in the CMix partial evaluator for C. CMix used a pipeline that involved both C and C++:

  p.c ----BTA + gen.-ext. creation----> gen-p.cpp ----g++----> gen-p + input s ---------> p_s.c ----gcc----> p_s
  [C]                                     [C++]               [a.out]                      [C]             [a.out]

Partial Evaluation of Functional Programs

Note: the descriptions of binding-time analysis and specialization are different from that given in [Jones et al. 93].

With a functional language, we will need to address function calls. We may have to do partial evaluation for a program point a number of times. This is also the case during partial evaluation of a program written in our flowchart language, but in a functional language it can happen because of different calls, or sequences of calls, to the procedure that contains the program point. For some calling contexts, we may know that a given variable is static; for other calling contexts, that same variable may be dynamic.

A monovariant division will use one classification for each program point; a polyvariant division will possibly use multiple divisions for each program point (i.e., it is non-uniform).

Here is a chart contrasting and comparing PE for flow-chart languages and first-order functional languages. (By ``first order'' we mean that only named functions are allowed.)

Feature of PE        | Imperative/Flow Chart     | First-order functional
                     |                           | (pp's will focus on fn. 
                     |                           | entries)
----------------------------------------------------------------------------
Binding              | Assignment to a global    | Values bound to params;
                     |                           | bindings created by fn.
                     |                           | application
----------------------------------------------------------------------------
Static/Dynamic       | Classification on globals | Classifications on params
----------------------------------------------------------------------------
Monovariant div.     | 1 div. per prog. pt.      |   
--------------------------------------------------    see graph for disc.
Polyvariant div.     | Many div. per prog. pt.   |
--------------------------------------------------
          ~~~~~~~~~~~          ~~~~~~~~~~~          ~~~~~~~~~~~
          ~~~~~~~~~~~          ~~~~~~~~~~~          ~~~~~~~~~~~
          call f(D,S)          call f(S,S)          call f(S,D)
                 \                  |                    /
                  \                 |                   /
                   +------------+   |   +--------------+
                                 \  |  /
                                  v v v
                               f( .., .. )
                               ~~~~~~~~~~~
                               ~~~~~~~~~~~
We might have many different binding-time-analysis results for the different calls to f.
Comparison chart continued:

Feature of PE        | Imperative/Flow Chart     | First-order functional
--------------------------------------------------------------------------
Congruence           | D begets D                | D params beget D params
                     |                           |
                     |                           | g( ... W /* D */ ...)
                     |                           | ~~~~~~ | ~~~~~~~~~~~~
                     |                           | ~~~~~~ V ~~~~~~~~~~~~
                     |                           | f( ... W ... )
                     |                           |        |
                     |                           |        |
                     |                           |        V /* D */
                     |                           | f( ... X ... )
                     |                           | /* X is dyn, b/c W is */
---------------------------------------------------------------------------
Specialized prog. pt.| (pp,vs)                   | (f,vs)
                     | /* pp was a label */      |  f is a function or 
                     |                           |  manufactured from arm
                     |                           |  of dynamic condition 
                     |                           |
---------------------------------------------------------------------------
Other concepts that we need include reduction and transition compression.

In Scheme0, we will have a list of function definitions. The first function will be the ``goal function.'' The partial evaluator will start with the goal function.

The syntax of Scheme0 is similar to LISP:

expr -> const          /* nil, t, and numeric constants */
     | (quote a)       /* an sexpr or list constant a */
     | var
     | (if expr expr expr)
     | (call funcname arglist)
     | (⟨op⟩ exp ... exp)
  1. We use small letters for atoms
  2. Function Definitions look like (define (funcname varlist) expr)
  3. conditionals: (if expr expr expr)
  4. function calls: (call funcname expr ..... expr) Notice the explicit call keyword. For built-in primitives, we will have (⟨op⟩ expr ... expr) for an appropriate collection of operators, ⟨op⟩.
Other Scheme0 issues:

A division is a classification of parameters. We may have different classifications for different functions. We will be working with the assumption that we have a monovariant division. This means that there is only one classification per function (but different functions can have different classifications).

We need to worry about maintaining congruence:

.....(call g .... ej /* D */ ....)  .... (call g .... ej' /* S */...) ....
                   \                                   /  ^
                    \___________   ___________________/   |
                                \ /                       |
                                 V                        |
                    (def (g .... xj ... ) ...)            |
                                 ^ This should be         |
                                   classified as D        |
                                                          |
                                              Here we have a Static argument, 
                                              but we have decided that the
                                              formal parameter is Dynamic.
                                              We perform a "lift" on this
                                              argument.  A lift is a coercion
                                              to cause this expr to be treated
                                              as if it is Dynamic.

Note that we may still have many different versions of the function g corresponding to different static values being passed into the parameters. This is fine even with a monovariant division. With a polyvariant division, we may also have different versions of g with different classifications.

Our specialized program points will look like (g.vs) (i.e., a cons cell to bundle a function name with values for the static parameters).

Transition Compression:
flowchart language              | Scheme0
----------------------------------------------------------------------
case goto pp'                   | In Scheme0, we will do call unfolding:
     bb := lookup(pp', program) | (call f .... ej ....) => a copy of
                                | f's expr w/ the ej's substituted for 
--------------------------------/ the formals.
Call unfolding can cause problems:
  (define (f x) ..... x ..... x ..... x .....)
unfolded, f may become
                       => ..... e ..... e ..... e .....
                                ^       ^       ^
                                \       |       /
                                 ------\|/-----/
                                        V
                         e is evaluated multiple times.  We can work around
                         this by employing let expressions (see [Jones et al. 93]).

Transition Compression for Scheme0, continued:

We will unfold only those functions for which all arguments are static.

For the purpose of comparing Scheme0 evaluation with the specialization/simplification method for Scheme0 that we give below, we now give a call-by-value interpreter with static scoping.

eval[exp, env, program] <-
  cases exp
    nil: nil
    t: t
    n: n                 /* n a number */
    (quote c): c
    z: vz, where ((var1.val1) ... (z.vz) ... (varn.valn)) = env
    (if e1 e2 e3): if eval[e1, env, program]
                         then eval[e2, env, program]
                         else eval[e3, env, program]
    (call f (e1 ... en)): eval[ef, env', program],
               where (define (f x1 ... xn) ef) = lookup[f,program]
               and vj = eval[ej, env, program], for j = 1, ..., n
               and env' = list[(x1.v1), ..., (xn.vn)]
    (⟨op⟩ e1 ... en):
               perform_op[⟨op⟩, eval[e1, env, program], ..., eval[en, env, program]]
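
Here is a runnable Python transcription of this interpreter. (The encoding of Scheme0 expressions as Python tuples -- ('quote',c), ('if',e1,e2,e3), ('call',f,[args]), ('op',name,[args]), with variables as strings -- is an assumption of the sketch.)

  import operator
  OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
         '=': operator.eq}

  def eval_s0(exp, env, program):
      if exp in ('nil', 't') or isinstance(exp, (int, float)):
          return exp                                 # a constant
      if isinstance(exp, str):
          return env[exp]                            # a variable
      tag = exp[0]
      if tag == 'quote':
          return exp[1]
      if tag == 'if':
          return (eval_s0(exp[2], env, program) if eval_s0(exp[1], env, program)
                  else eval_s0(exp[3], env, program))
      if tag == 'call':                              # call-by-value, static scoping
          params, body = program[exp[1]]             # lookup[f, program]
          args = [eval_s0(e, env, program) for e in exp[2]]
          return eval_s0(body, dict(zip(params, args)), program)
      if tag == 'op':
          return OPS[exp[1]](*[eval_s0(e, env, program) for e in exp[2]])
      raise ValueError(exp)

  program = {'power': (['x', 'n'],
      ('if', ('op', '=', ['n', 0]),
             1,
             ('op', '*', ['x', ('call', 'power',
                                ['x', ('op', '-', ['n', 1])])])))}
  print(eval_s0(('call', 'power', [2, 3]), {}, program))   # prints 8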

Basic Organization

In contrast to the two-phase strategy that we used for partial evaluation of the flowchart language, the partial evaluator for Scheme0 will have three phases:

  1. Binding-time analysis: Identify a congruent division
  2. Annotation: Label the abstract syntax tree with the results of binding-time analysis
  3. Specialization: Create the specialized program by propagating the static state while interpreting the binding-time annotations on the abstract syntax tree
(Specialization may incorporate transition compression, or transition compression can be carried out during a fourth phase.)

Annotation of the Abstract Syntax Tree

During binding-time analysis (see below), we will annotate the abstract syntax trees with the binding-time information. The specializer will work on the annotated AST.

For this purpose, we introduce a 2-level syntax:

expr -> const          /* nil, t, and numeric constants */
     | (quote a)       /* an sexpr or list constant a */
     | var
     | (ifs expr expr expr)
     | (ifd expr expr expr)
     | (calls funcname (arglist /* S */) (arglist /* D */))
       /* The s stands for Static; the dynamic arglist should be empty
        * for a calls.  calls is an indication to perform transition
        * compression (unfolding).
        */
     | (calld funcname (arglist /* S */) (arglist /* D */))
     | (⟨op⟩s e1 ... en)
     | (⟨op⟩d e1 ... en)
     | (lift exp)  /* A static expression that needs to be residuated */
Thus, if we have a function definition
  (define (f x1 .... xn) e),
the definition would become
  (define (f (xs1 ... xsm /* static params */)
             (xd1 ... xdk /* dynamic params */)
             e-annotated))
The expression e-annotated is generated as described below.

Binding-Time Analysis

The method described here differs from that in [Jones et al. 93]. We will build a data-dependence graph. Consider this example:

(define (f1 x1 x2 x3) .......
        (call f2 (+ x1 x2) 3 (+ x1 x3)) .......
        (call f2 x3 x1 x2) ....... )

(define (f2 a b c) .... (call f2 (op a) (op' b) (op'' c)) .... )
In the data-dependence graph for this function, we create a node for each formal and actual parameter. We will pass information along the edges of the graph.
                                              return
                             x1    x2     x3   info
   +-------------- define f1  o     o      o    []  ----------------------+
   |                         (A)   (B)    (C)   (D)                       |
   |                       (to E) (to E) (to G)                           |
   |                       (to j) (to K) (to I)                           |
   |                                                                      |
   |                            (to D)                             (to D) |
   |             (E)   (F)   (G)  (H)               (I)   (J)   (K)  (L)  |
   +---- call f2  o     o     o   []        call f2  o     o     o   [] --+
                (to M)(to N)(to O)                 (to M)(to N)(to O)


                                              (to T)
                                              (to H)
                                              (to L)
                             (M)   (N)   (O)   (P)
            +----  define f2  o     o     o     []  ------+
            |               (to Q)(to R)(to S)            |
            |                                             |
            |                                 (to P)      |
            |                (Q)   (R)   (S)   (T)        |
            +-----   call f2  o     o     o     [] -------+
                            (to M)(to N)(to O)
As another example, consider the append function:
(define (app xs ys)
   (if (null? xs)
       ys
       (cons (car xs)
             (call app (cdr xs) ys)
       )
   )
)
In the graph, we start by optimistically labeling everything Static that we can. The arguments will be labeled according to the division we have found for our variables. We then adjust the labels by pushing the dynamic labels over the directed arrows. On the following graph, we are considering:
  (call append (1 2 3) ys)
which has the division [S D] for the actual arguments. (In the graph, nodes are labeled either S or D. S>D means that something we assumed was static is changed to dynamic.)
 +-------------------------+
 | +----------------+      |
 | |                V      V
 | |                xs     ys       +-----+
 | |define append  So     Do       [] S>D |
 | |                |\     |\    ^ ^ ^    |
 | |                | \    | +--+ /  |    |
 | |                |  +---+-----+   |    |
 | |                |      |         |    |
 | |                V      V         |    |
 | |  call append  So   S>Do        []<---+
 | |                ^      ^       S>D
 | +----------------+      |
 +-------------------------+
With this binding information, we can make the annotated program. For our function append, the first list is static, and the second list is dynamic.
(define (app (xs) (ys))
    (ifs (null?s xs)
         ys
         (consd (lift (cars xs))
                (calld app
                       ((cdrs xs))
                       (ys)))))

Lifting

Lifting means the labeling of a static expression e as (lift e). We do this because we have a value v for the expression e but e is in a dynamic context; the label "lift" indicates that the specialization phase should leave behind a residual expression (quote v), where v is the value of e.

We perform lifting when e is in a dynamic context. An expression is in a dynamic context in any of the following situations:

 (1) the expression is a dynamic argument list of a calld
     fk:  S S D D          /* We have this division for fk */
     (call fk S1 S2 S3 D)  /* Here we need S3 to be left as an expression */
      => (calld fk (S1 S2) ((lift S3) D))
 (2) the expression is an arglist of an opd
 (3) the expression is a subexpression of an ifd
 (4) the expression is a branch of an ifs in a dynamic context
 (5) the expression is the body of a goal function (to avoid disappearance of the goal function!)
 (6) the expression is the body of a function that has >= 1 dynamic parameter.
Note that the definition is recursive because of case (4).
  1. BTA: graph reachability to determine the D components
  2. Annotation:
          if => ifs    if the branch-expression is S
             => ifd    otherwise
    
          op => ops    if all arguments are static
             => opd    otherwise
    
          (call f arglist) => (calls f (arglist) ())                 if all actuals are S
                           => (calld f (arglist s) (arglist d))    otherwise
    
          e => (lift e)     if e is in a dynamic context
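
A minimal Python sketch of this BTA, treating it (as above) as graph reachability: nodes are (function, parameter) pairs, edges follow the data dependences, and D is pushed forward. (How the edges are built from the program text is left out of the sketch.)

  from collections import deque

  def bta(edges, initially_dynamic):
      """edges: dict mapping each node to the nodes its value flows into."""
      dynamic = set(initially_dynamic)
      work = deque(dynamic)
      while work:                          # push D's over the directed arrows
          node = work.popleft()
          for succ in edges.get(node, ()):
              if succ not in dynamic:      # an S>D demotion
                  dynamic.add(succ)
                  work.append(succ)
      return dynamic                       # everything not reached stays S

  # For (call app (1 2 3) ys) with division [S D]: each parameter of app
  # flows only into itself at the recursive call.
  edges = {('app', 'xs'): [('app', 'xs')], ('app', 'ys'): [('app', 'ys')]}
  print(bta(edges, [('app', 'ys')]))       # only ('app', 'ys') becomes D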
         

Specialization

The specialization function uses two auxiliary functions
  1. successors[evs], which finds the set of residual calls in evs
  2. simplify[exp, names, values], which creates a simplified expression (simplified according to the environment (names, values))
specialize[program, vs0] {
  let ( (define (f1 ...) ...) ...) = program in   // the goal function
  rprog = ()  // empty list of functions
  pending = { (f1.vs0) }
  marked = ∅
  while (pending ≠ ∅) {
    select and remove a pair (f.vs) from pending
    marked = marked ∪ { (f.vs) }
    let (define (f (x1 ... xm) (xm+1 ... xn)) e) = lookup[f,program]
    let (vs1 ... vsm) = vs      /* decompose the list vs */
    let evs = simplify[e, (x1 ... xm, xm+1 ... xn), (vs1 ... vsm, xm+1 ... xn)]
       /* Note that the "value" of xj is xj itself, for m+1 ≤ j ≤ n */
    pending = (pending  ∪ successors[evs]) - marked
    rprog = append[rprog, list[define, list[cons[f,vs], xm+1 ... xn], evs] :: nil]
  }
  return rprog
}
Note that the goal function is the first function processed (and residuated); thereafter, new functions are appended to the end of rprog so that the (residual) goal function stays as the first function on the list.
simplify[exp, names, values] <-
  cases exp
    nil: nil
    t: t
    n: n                 /* n a number */
    (quote c): c
    yj: vj, where (y1 ... yj ... yk) = names
             and (v1 ... vj ... vk) = values
    (ifs e1 e2 e3): if simplify[e1, names, values]
                       then simplify[e2, names, values]
                       else simplify[e3, names, values]

    (ifd e1 e2 e3): list[if, simplify[e1, names, values],
                             simplify[e2, names, values],
                             simplify[e3, names, values]]
    (calls f (e1 ... em) (em+1 ... ea)):
               simplify[ef, list[x1 ... xa], list[e1' ... ea']]
               where (define (f (x1 ... xm) (xm+1 ... xa)) ef) = lookup[f,program]
               and ej' = simplify[ej, names, values], for j = 1, ..., a
    (calld f (e1 ... em) (em+1 ... ea)):
               list[call, (f::(e1' ... em')), em+1' ... ea']
               where ej' = simplify[ej, names, values], for j = 1, ..., a
    (⟨op⟩s e1 ... ea):
               perform_op[⟨op⟩, simplify[e1, names, values], ..., simplify[ea, names, values]]
    (⟨op⟩d e1 ... ea):
               list[⟨op⟩, simplify[e1, names, values], ..., simplify[ea, names, values]]
    (lift e): list[quote, simplify[e, names, values]]
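
The following Python sketch implements specialize[] and simplify[] over annotated expressions encoded as tuples (the encoding, mirroring the 2-level syntax, is an assumption; a program maps each f to (static-params, dynamic-params, body)). It reproduces the power example worked below.

  import operator
  OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul,
         '=': operator.eq}

  def specialize_s0(program, goal, vs0):
      rprog, pending, marked = [], {(goal, tuple(vs0))}, set()
      while pending:
          f, vs = pending.pop()                       # select and remove
          marked.add((f, vs))
          sparams, dparams, body = program[f]
          evs = simplify_s0(body, sparams + dparams, list(vs) + dparams, program)
          pending |= successors(evs) - marked
          rprog.append(((f, vs), dparams, evs))       # one residual definition
      return rprog

  def successors(e):
      """Collect the specialized points (f, vs) in residual calls of e."""
      pts = set()
      if isinstance(e, tuple) and e and e[0] == 'call':
          pts.add(e[1])                               # e = ('call', (f,vs), dargs)
      if isinstance(e, (tuple, list)):
          for sub in e:
              pts |= successors(sub)
      return pts

  def simplify_s0(exp, names, values, program):
      if isinstance(exp, str):
          return values[names.index(exp)]             # static value or residual name
      if not isinstance(exp, tuple):
          return exp                                  # a numeric constant
      tag, rec = exp[0], lambda e: simplify_s0(e, names, values, program)
      if tag == 'lift':  return ('quote', rec(exp[1]))
      if tag == 'ifs':   return rec(exp[2] if rec(exp[1]) else exp[3])
      if tag == 'ifd':   return ('if', rec(exp[1]), rec(exp[2]), rec(exp[3]))
      if tag == 'ops':   return OPS[exp[1]](*[rec(e) for e in exp[2]])
      if tag == 'opd':   return ('op', exp[1], [rec(e) for e in exp[2]])
      if tag == 'calls':                              # unfold the call
          sparams, dparams, body = program[exp[1]]
          return simplify_s0(body, sparams + dparams,
                             [rec(e) for e in exp[2] + exp[3]], program)
      if tag == 'calld':                              # residuate: point is (f . vs')
          return ('call', (exp[1], tuple(rec(e) for e in exp[2])),
                  [rec(e) for e in exp[3]])
      raise ValueError(exp)

  # The annotated power function (n static, x dynamic; cf. the example below):
  program = {'power': (['n'], ['x'],
      ('ifs', ('ops', '=', ['n', 0]),
              ('lift', 1),
              ('opd', '*', ['x', ('calld', 'power',
                                  [('ops', '-', ['n', 1])], ['x'])])))}
  for defn in specialize_s0(program, 'power', [3]):
      print(defn)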

An Example

We now illustrate each of the three phases, using the power function with the second argument being static.

Binding-time analysis

  (define (power x n)
    (if (= n 0)
        1
        (* x (call power x (- n 1)))))
  BTA (init):  def pow   D.(x)  S.(n)  So      (given [D,S])
               ---
               call pow  S.     S.     So
               ----
                           _____________
                          /        ____ |
                          |       /    ||
                          |      |     vv
    (result):  def pow   D.(x)  S.(n)  Do--
               ---        | ^   | ^    ^   \
                          | |   | |    |    |
                          v  \  v  \   |    |
               call pow  D.  |  S. |   Do<-/
               ----       |  |  |  |
                           \_|   \_|

Annotation

                         S          D
                    -----------   -----
  (calld power ( (-s n 1) )      ( x ) )
Work bottom-up:
  1. calld + x:D -> "*" => "*d"
  2. n:S + 0:const -> "=" => "=s"
  3. static condition -> "if" => "ifs"
The result is
prog =
  (define (power (n) (x))
    (ifs (=s n 0)
      (lift 1)
      (*d x (calld power ( (-s n 1) ) ( x )))))
[See [Jones et al. 93], Figures 5.6 & 5.7.]

Specialization of prog for static input 3

Iteration 1:
=> specialize[program, (3)]
   Before the while loop:
       f1 = power
       pending = { (power.(3)) }

   In the while loop:
       f = power
       vs = (3)
       evs = simplify[(ifs (=s n 0) ...), (n x), (3 x)]
                      -------------------  -----  -----
                       body of program     names  "values"
This situation leads to 3 calls on simplify:
   (i) simplify[(-s n 1), (n x), (3 x)]  \
               = 2                        |
                                          | processing of calld
  (ii) simplify[x, (n x), (3 x)]          |
               = x                       /

 (iii) simplify[x, (n x), (3 x)]
               = x
and produces
   evs = (* x (call (power.(2)) x))
   marked = { (power.(3)) }
   pending = (∅ ∪ { (power.(2)) }) - { (power.(3)) }
   rprog = ( (define ((power.(3)) x) (* x (call (power.(2)) x))) )

Iteration 2 works on (power.(2)):

   ...
   evs = (* x (call (power.(1)) x))
   marked = { (power.(3)), (power.(2)) }
   pending = (∅ ∪ { (power.(1)) }) - { (power.(3)), (power.(2)) }
   rprog = ( (define ((power.(3)) x) (* x (call (power.(2)) x)))
             (define ((power.(2)) x) (* x (call (power.(1)) x)))
           )

Iteration 3 works on (power.(1)):

   ...
   evs = (* x (call (power.(0)) x))
   marked = { (power.(3)), (power.(2)), (power.(1)) }
   pending = (∅ ∪ { (power.(0)) }) - { (power.(3)), (power.(2)), (power.(1)) }
   rprog = ( (define ((power.(3)) x) (* x (call (power.(2)) x)))
             (define ((power.(2)) x) (* x (call (power.(1)) x)))
             (define ((power.(1)) x) (* x (call (power.(0)) x)))
           )

Iteration 4 works on (power.(0)):

   ...
   evs = (quote 1)
   marked = { (power.(3)), (power.(2)), (power.(1)), (power.(0)) }
   pending = (∅ ∪ ∅) - { (power.(3)), (power.(2)), (power.(1)), (power.(0)) }
   rprog = ( (define ((power.(3)) x) (* x (call (power.(2)) x)))
             (define ((power.(2)) x) (* x (call (power.(1)) x)))
             (define ((power.(1)) x) (* x (call (power.(0)) x)))
             (define ((power.(0)) x) (quote 1))
           )
Thus, the return value from "specialize[program, (3)]" is
  (   (define ((power.(3)) x) (* x (call (power.(2)) x)))
      (define ((power.(2)) x) (* x (call (power.(1)) x)))
      (define ((power.(1)) x) (* x (call (power.(0)) x)))
      (define ((power.(0)) x) (quote 1))
  )
If we had a more powerful unrolling strategy, the end result would be:
  (    (define ((power.(3)) x) (* x (* x (* x 1))))
  )

Another Look at Binding-Time Analysis

The material below is taken from Section 5.2 of [Jones et al. 93]. However, some problems came to light when the TA implemented a partial evaluator for Scheme0 in Spring 2007. In particular, the book's equations for binding-time analysis do not seem to account for a certain kind of congruence issue that arises when a function that always has static arguments calls a function that can have dynamic arguments. A discussion of this issue, along with a modified set of equations for binding-time analysis, can be found here.

Goal: a mapping from function names to a vector of S's & D's
      ("the binding times for the function's parameters").

Expressed with 3 functions:  B_e,  B_v,  div

Method: "Abstract Interpretation" (a.k.a. flow analysis)
                                                         _
    t   \in BindingTime = {S,D}       lattice:  D     S |_ D
                               *                |        -
    Tau \in BTEnv = BindingTime                 S    "S approximates D"
                                                     "D subsumes S"
    div \in Monodivision = Funcnames -> BTEnv

NB: ⊑ extends to BTEnv and div pointwise:

  (1) (t1 ... tn) ⊑ (t1' ... tn')   iff   ti ⊑ ti'  for 1 <= i <= n
  (2) div1 ⊑ div2   iff   div1(fi) ⊑ div2(fi)  for 1 <= i <= k

    (We want the least dynamic solution -- i.e., the most static possible.)

  ⊔ : join ("least upper bound", "max", "highest"):
        S ⊔ S = S;    * ⊔ D = D;    D ⊔ * = D

  B_v[[e]] : BTEnv -> (Funcname -> BTEnv)
    -- used to define the transformation from the BTEnv on entry
       to a function f to the BTEnv at all call sites on a
       function g within f

        -----------  def  f   .   .   .   o  --------------------
       /                     /    |  /|                          \
      /                 ____/      \/ |                           \
     |                 / __________/\  \_____________________      |
      \               / /            \                       |     /
       \             v v              v                      v    /
        --  call g . . .       call g . . .     call h . . . . ---

  B_v[[e]] Tau g   -- just interested in g
  B_v[[e]] Tau h   -- just interested in h

  B_v[[e]] Tau g = ⊔_k (vector for the kth use of g in e)

  B_v[[c]] Tau g  = (S, ..., S)    <- a vector of S's & D's at the entry of g;
                                      length = # parameters of g
  B_v[[xj]] Tau g = (S, ..., S)

  B_v[[ (if exp1 exp2 exp3) ]] Tau g
      = B_v[[exp1]] Tau g  ⊔  B_v[[exp2]] Tau g  ⊔  B_v[[exp3]] Tau g

  B_v[[ (op e1 ... en) ]] Tau g = ⊔_{j=1..n} B_v[[ej]] Tau g


  B_v[[ (call f e1 ... en) ]] Tau g =

      let t = ⊔_{j=1..n} B_v[[ej]] Tau g in

        t                                            if f ≠ g
        t ⊔ (B_e[[e1]] Tau, ..., B_e[[en]] Tau)      if f = g


  B_e[[e]] : BTEnv -> BindingTime

  B_e[[e]] Tau =   case e of

      c                ->  S

      xj               ->  Tau_j   (the jth component of Tau)

      (if e1 e2 e3)    ->  ⊔_{j=1..3} B_e[[ej]] Tau

      (op e1 ... en)   ->  ⊔_{j=1..n} B_e[[ej]] Tau

(Once again we are looking for the least dynamic congruent division.)

  div g = ⊔_j (vector for the jth call site on g, anywhere in the program)

  This equation expresses congruence:

                    (S)                                 (D)
     ... (call g ... e ...)        ......    (call g ... e  ...)
                      k                                   k
                      \_______________     ______________/
                                      \   /
                                       v v
                         (define (g ... x ...) ...)
                                       (D)
  div fk = ⊔_{i=1..n} B_v[[ei]] (div fi) fk       where ei is the body of fi

  -- one equation for each k = 1 ... n

  Start with div0 = [f1 -> Tau1, f2 -> (S, S, ..., S), ..., fn -> (S, S, ..., S)]
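
As pseudocode for this fixed-point computation, a minimal Python sketch (the step function, which applies the right-hand sides joined with div0, is assumed to be supplied by the equations above):

  def join(a, b):
      return 'D' if 'D' in (a, b) else 'S'     # the lub: S ⊔ D = D

  def lfp(div0, step):
      """Kleene iteration: div_{i+1} = step(div_i), until stable."""
      div = div0
      while True:
          new = step(div)
          if new == div:
              return div                       # least dynamic congruent division
          div = new

For power below, step maps [power -> (D,S)] to (S,S) ⊔ (S,S) ⊔ (D,S) = (D,S), so div0 is already the fixed point.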

  Example
  -------

  With Tau_power = (D,S), i.e., div0 = [power -> (D,S)]:

  (div power) = B_v[[ e_power ]] (div power) power

              =     B_v[[ (= n 0) ]] (div power) power        = (S,S)   (no calls on power)

                ⊔   B_v[[ 1 ]] (div power) power              = (S,S)

                ⊔   B_v[[ (* x (call power x (- n 1))) ]] (div power) power
                        =   B_v[[ x ]] (div power) power      = (S,S)
                          ⊔ B_v[[ (- n 1) ]] (div power) power = (S,S)
                          ⊔ ( B_e[[ x ]] (div power), B_e[[ (- n 1) ]] (div power) )
                        = ( (div power)|1, (div power)|2 )
                        = (div power)

              = (S,S) ⊔ (S,S) ⊔ (div power) = (div power)

     ===> the first approximation is stable (i.e., congruent)

     ===> compare with walking through the graph, pushing D's
          through the graph

References

[Jones et al. 93]
Jones, N.D., Gomard, C.K., and Sestoft, P., Partial Evaluation and Automatic Program Generation, Prentice-Hall International, Englewood Cliffs, NJ, 1993.