CS 701, Program #2

Dataflow Analysis and Optimizations

Due: Wednesday, October 15, 2014 (by midnight)

Not accepted after midnight Wednesday, October 22, 2014

Overview of Program 2

For this project you will write an LLVM Function Pass that works as follows:

For each function, do live-variable analysis and then use the results of the analysis to remove useless assignments.
Produce some useful debugging output, too.

Everyone will write the same pass, whether working alone or in pairs.

For this project you will use the (built-in) mem2reg optimization before running your own pass. Your pass will also keep track of the number of useless assignments that it finds and record that count as a statistic, which is displayed on stderr when opt is run with the -stats flag.

For example, you'll use commands like the following:

    clang -emit-llvm -O0 -c foo.c -o foo.bc           // create bitcode .bc
    opt -mem2reg foo.bc > foo.opt                     // MEM2REG OPTIMIZATION
    mv foo.opt foo.bc
    (opt -load Debug/lib/P2.so -liveVars -stats foo.bc -o foo.opt) >& foo.stats  // new live vars pass
                                                                                 // bitcode written to .opt file
                                                                                 // stats and other debugging output written to .stats file
    mv foo.opt foo.bc
    llc foo.bc                                       // create assembly .s
    gcc foo.s -o foo                                 // create executable "foo"

The mem2reg optimization keeps non-pointed-to scalar local variables in pseudo-registers as much as possible, and also puts the code into SSA form by adding phi functions. This should not cause you any problems, and will in general open up more opportunities for your own "remove useless assignments" optimization.

Getting Started

Start by creating a copy of your proj1 directory (and all of its subdirectories). You can call the copy whatever you want; here we'll assume that it's called proj2. Now do the following steps:

Edit the Makefile: change p1 to p2, and change P1 to P2.
Change to the lib subdirectory and rename subdirectory p1 to p2.
Change to the p2 subdirectory and copy printCode.cpp to liveVars.cpp. Remove the other files that are specific to the optimizations that you implemented for the first project (but keep all of the code needed to implement the printCode pass).
Edit the Makefile in the p2 subdirectory and change P1 to P2. Make other needed changes to that Makefile (e.g., change the definition of OBJS).
Edit liveVars.cpp:
- Change everything specific to printCode to liveVars.
- Change the arguments to the call to RegisterPass to the following: ("liveVars", "Live vars analysis", false, true)
- Change the body of getAnalysisUsage to be empty since your liveVars pass may (eventually) modify the program and doesn't require other passes.

Go back up to the proj2 directory and type make clean then make. Fix any problems, then make sure you can run your (old) code on a file foo.c:

clang -emit-llvm -O0 -c foo.c -o foo.bc
opt -load Debug/lib/P2.so -liveVars foo.bc > foo.opt  // should print the function(s) in foo.c
mv foo.opt foo.bc
llc foo.bc
gcc foo.s -o foo
foo                                                  // run the code in foo.c

Now you're ready to write the code that does live variable analysis and removes useless assignments.

Live Variable Analysis and Removing Useless Assignments

For each function in the input program, your runOnFunction method will first do live-variables analysis, then use the results to remove useless assignments as described below. Please don't put all of the code in the runOnFunction method -- use good modular design. You should also change runOnFunction to return true iff it changes the code by removing one or more useless assignments.

Live-Variables Analysis

The goal of live-variable analysis is to compute liveBefore and liveAfter sets for each instruction in the given function. The liveBefore and liveAfter sets will be sets of pseudo-registers. Recall that in LLVM, the pseudo-register assigned to by an instruction is represented using the address of that instruction. So the liveBefore and liveAfter sets will actually be sets of pointers to instructions.

The iterative dataflow-analysis algorithm discussed in class would use a worklist of instructions. To process an instruction, you would compute its "after" fact as the meet of the "before" facts of all successors. Then you would compute its "before" fact by applying the appropriate dataflow function (depending on the instruction itself). If that "before" fact changed, you would put all of the instruction's predecessors on the worklist.
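Written as equations, the processing step described above looks roughly like this. This is a sketch that assumes the usual formulation of liveness, in which the meet is set union; the names succ, GEN and KILL are shorthand for a node's successors and for the sets its dataflow function uses, and a node n may be an instruction or, later on, a basic block.

```latex
\begin{align*}
\mathit{liveAfter}(n)  &= \bigcup_{s \in \mathit{succ}(n)} \mathit{liveBefore}(s) \\
\mathit{liveBefore}(n) &= \mathit{GEN}(n) \cup \bigl(\mathit{liveAfter}(n) \setminus \mathit{KILL}(n)\bigr)
\end{align*}
```

A node with no successors (for example, one ending in a return) starts with an empty liveAfter set.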

However, LLVM doesn't provide a convenient way to find an instruction's predecessors or successors, which makes it difficult to implement the iterative algorithm as described above. LLVM does provide iterators to go forward or backward over the basic blocks in a function (and you already know how to iterate over the instructions in a basic block). Therefore, you can first do live-variable analysis at the basic-block level, then use the results to do the analysis at the instruction level. Do this as follows:

For each basic block in the function, compute the block's liveBefore and liveAfter sets. Do this using a worklist algorithm where the items in the worklist are basic blocks. Note that to compute a liveBefore set from a liveAfter set for a basic block, you will need to (compute and) use the GEN and KILL sets for that block. The GEN set is the set of upwards-exposed uses: pseudo-registers that are used in the block before being defined. (Those will be the pseudo-registers that are defined in other blocks and used in this block, or are defined in this block and used in a phi function at the start of this block. You will have to write code that computes this set for a given basic block.) For the KILL set, you can use the set of all instructions in the block (which safely includes all of the pseudo-registers assigned to in the block).
Once you have the liveBefore and liveAfter sets for a basic block, you can compute those sets for each instruction in the block. Do this by iterating backward through the instructions in the block. You can do a reverse iteration by using the overloaded -- operator on the basic block iterator. For example, given Function::iterator bb, a pointer to the current basic block:
```
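       // Start one past the last instruction and walk backward.
       // A well-formed basic block always ends in a terminator, so it is never empty.
       BasicBlock::iterator oneInstr = bb->end();
       do {
          oneInstr--;

          // add code here to process the current instruction, oneInstr

       } while (oneInstr != bb->begin());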
```
The liveAfter set for the last instruction is the same as the liveAfter set for the whole block. For the other instructions, the liveAfter set is the liveBefore set of the next instruction. The liveBefore set of each instruction is the result of applying that instruction's dataflow function to its liveAfter set. To implement the instructions' dataflow functions you will again need to compute GEN and KILL sets, this time for each instruction. An instruction's GEN set is the set of its Instruction operands, and its KILL set is the Instruction itself.
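The two steps above can be sketched in plain C++ without any LLVM types. Everything in this sketch is a stand-in chosen for illustration: Instr, Block, RegSet, LiveSets, genKill, analyzeBlocks and analyzeInstrs are hypothetical names, pseudo-registers are modelled as ints rather than instruction addresses, and successors are stored as block indices. It is meant to show the shape of the computation, not to be the required implementation.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <vector>

// Toy stand-ins for LLVM's classes (hypothetical, for illustration only).
struct Instr {
    int def;                // pseudo-register defined by this instruction
    std::vector<int> uses;  // pseudo-registers it reads
};
struct Block {
    std::vector<Instr> instrs;
    std::vector<int> succs; // indices of successor blocks
};
using RegSet = std::set<int>;
struct LiveSets { RegSet before, after; };

// GEN: registers used before any definition seen so far in the block.
// KILL: every register defined in the block.
void genKill(const Block &b, RegSet &gen, RegSet &kill) {
    for (const Instr &i : b.instrs) {
        for (int u : i.uses)
            if (!kill.count(u)) gen.insert(u);
        kill.insert(i.def);
    }
}

// Block-level pass: worklist of block indices, iterated to a fixed point.
std::vector<LiveSets> analyzeBlocks(const std::vector<Block> &cfg) {
    const int n = static_cast<int>(cfg.size());
    std::vector<std::vector<int>> preds(n);
    for (int b = 0; b < n; ++b)
        for (int s : cfg[b].succs) {
            assert(s >= 0 && s < n);
            preds[s].push_back(b);
        }

    std::vector<LiveSets> live(n);
    std::deque<int> worklist;
    for (int b = n - 1; b >= 0; --b) worklist.push_back(b);

    while (!worklist.empty()) {
        int b = worklist.front();
        worklist.pop_front();

        RegSet after;  // union of the successors' liveBefore sets
        for (int s : cfg[b].succs)
            after.insert(live[s].before.begin(), live[s].before.end());

        RegSet gen, kill;
        genKill(cfg[b], gen, kill);
        RegSet before = gen;  // before = GEN + (after - KILL)
        for (int r : after)
            if (!kill.count(r)) before.insert(r);

        live[b].after = after;
        if (before != live[b].before) {  // changed: revisit predecessors
            live[b].before = before;
            for (int p : preds[b]) worklist.push_back(p);
        }
    }
    return live;
}

// Instruction-level pass: backward sweep through one block.
std::vector<LiveSets> analyzeInstrs(const Block &b, const RegSet &blockAfter) {
    std::vector<LiveSets> result(b.instrs.size());
    RegSet live = blockAfter;
    for (int i = static_cast<int>(b.instrs.size()) - 1; i >= 0; --i) {
        result[i].after = live;
        live.erase(b.instrs[i].def);                                    // KILL
        live.insert(b.instrs[i].uses.begin(), b.instrs[i].uses.end());  // GEN
        result[i].before = live;
    }
    return result;
}
```

In the sweep, the defined register is erased before the operands are inserted, which matches applying KILL before GEN. In the real pass the ints would be Instruction pointers and the predecessor and successor lists would come from LLVM's own facilities.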

Data Structures

You will need to use some appropriate data structure to represent the worklist and the results of your dataflow analysis. You'll need one data structure to store each liveBefore and liveAfter set (e.g., an LLVM SmallSet or a C++ set), and another data structure to store a map (e.g., an LLVM DenseMap, or a C++ map) from basic blocks (and then from instructions) to their liveBefore and liveAfter sets.
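One possible layout using only standard C++ containers is sketched below. The empty Instruction and BasicBlock structs are placeholders for the LLVM classes, and RegSet, LiveSets, LivenessResults and isLiveAfter are hypothetical names invented for this sketch; an LLVM SmallSet or DenseMap could be substituted for the std containers.

```cpp
#include <map>
#include <set>

struct Instruction {};  // hypothetical placeholder for llvm::Instruction
struct BasicBlock {};   // hypothetical placeholder for llvm::BasicBlock

// A pseudo-register is identified by the address of its defining instruction.
using RegSet = std::set<const Instruction *>;

struct LiveSets {
    RegSet liveBefore;
    RegSet liveAfter;
};

// One map for the block-level results, one for the instruction-level results.
struct LivenessResults {
    std::map<const BasicBlock *, LiveSets> blocks;
    std::map<const Instruction *, LiveSets> instrs;
};

// True if register `reg` is live immediately after instruction `at`.
bool isLiveAfter(const LivenessResults &r, const Instruction *at,
                 const Instruction *reg) {
    auto it = r.instrs.find(at);
    return it != r.instrs.end() && it->second.liveAfter.count(reg) > 0;
}
```

Keeping the before and after sets together in one struct means a single map lookup retrieves both, which is convenient when printing debugging output.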

The on-line documentation for the LLVM data structures is not very complete, so if you decide to use LLVM data structures you will probably also want to look at the corresponding .h files (see Useful Links below).

Remove Useless Assignments

In LLVM IR, a useless assignment is an instruction that satisfies both of the following:

The instruction represents an assignment of a value to a pseudo-register; i.e., it is
- a binary instruction, or
- a cast instruction, or
- a shift instruction, or
- an instruction with any of the following op-codes: Alloca, Load, GetElementPtr, Select, ExtractElement, ExtractValue, ICmp, FCmp.

(See Instruction.h for functions that identify binary, cast, and shift instructions, and see Instruction.def to find out which op-code numbers are used for the other relevant instructions.)

The defined pseudo-register (the address of the instruction itself) is not in the instruction's liveAfter set.

After you have computed liveAfter sets for all of the instructions in a function, iterate over all of the instructions to find and remove the useless assignments (remember that you can't remove an instruction while iterating, and that you can use the eraseFromParent method to remove an instruction).
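The "collect first, erase afterwards" pattern can be sketched with a std::list standing in for a basic block. The Kind enum, the Instr record and the function names are hypothetical simplifications: only a few instruction classes are modelled, ids stand in for instruction addresses, and in the real pass the erase step would be a call to eraseFromParent.

```cpp
#include <list>
#include <set>
#include <vector>

// Hypothetical, simplified instruction record; real code would use llvm::Instruction.
enum class Kind { Alloca, Load, Binary, Call, Ret };

struct Instr {
    int id;                   // stands in for the instruction's address
    Kind kind;
    std::set<int> liveAfter;  // result of the liveness analysis
};

// Condition 1: the instruction belongs to one of the removable classes.
bool assignsPseudoRegister(Kind k) {
    return k == Kind::Alloca || k == Kind::Load || k == Kind::Binary;
}

// Condition 2: the defined register is not live after the instruction.
bool isUseless(const Instr &i) {
    return assignsPseudoRegister(i.kind) && i.liveAfter.count(i.id) == 0;
}

// Pass 1 collects the useless instructions; pass 2 erases them.
// Returns the number of instructions removed.
int removeUseless(std::list<Instr> &block) {
    std::vector<std::list<Instr>::iterator> doomed;
    for (auto it = block.begin(); it != block.end(); ++it)
        if (isUseless(*it)) doomed.push_back(it);
    for (auto it : doomed) block.erase(it);
    return static_cast<int>(doomed.size());
}
```

Erasing from a std::list only invalidates the iterator being erased, so the saved iterators stay valid during the second pass. The returned count is the kind of value runOnFunction could compare against zero to decide whether to return true.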

Statistics

To implement the number-of-useless-assignment statistics you should do the following:

In liveVars.cpp (but not inside the class itself) include this line:

    STATISTIC(RemovedInstructionCount, "Number of useless assignments found.");

Each time you find a useless assignment, increment RemovedInstructionCount.

The STATISTIC macro defines a global unsigned integer of the given name (RemovedInstructionCount). If opt is run with the -stats flag, the final value of the variable is printed along with the given string and the string used to define DEBUG_TYPE. For the example program given below, you should see the following output:

===-------------------------------------------------------------------------===
                          ... Statistics Collected ...
===-------------------------------------------------------------------------===

3 liveVars - Number of useless assignments found.

Debugging Output

To help you debug your code (and to ease grading) your code should conditionally write some useful output to stderr. To control whether output is produced or not, include the following code in liveVars.cpp (or in liveVars.h if you create that file):

#include "flags.h"
#ifdef PRINTANALRESULTS
    static const bool PRINT_ANAL_RESULTS = true;
#else
    static const bool PRINT_ANAL_RESULTS = false;
#endif
#ifdef PRINTREMOVING
    static const bool PRINT_REMOVING = true;
#else
    static const bool PRINT_REMOVING = false;
#endif

The file flags.h will either be empty, or will contain one or both of the following lines:

#define PRINTANALRESULTS 1
#define PRINTREMOVING 1

If either of the two constants is true, you will need to create an instruction map and a basic-block map (as you did for project 1). Then your code should print debugging output depending on the values of the two constants as follows (see below for example output for each case):

If PRINT_ANAL_RESULTS is true: For each function, after doing live-variable analysis (but before removing useless assignments), print the function name, the block names, and the instruction numbers. But instead of printing the actual instructions, print the live-before and live-after sets with the set elements (the instruction numbers) in sorted order. (You can use a C++ list, which supports a sort operation.) Also print those sets for each block.
If PRINT_REMOVING is true: For each useless assignment, print "removing useless assignment %xx", where xx is the instruction number. The order in which this information is printed doesn't matter (your output will be sorted before it is compared with the expected output).
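Formatting one live set for this output might look like the following sketch. The function name formatLiveSet and the use of const void * keys are assumptions made to keep the example free of LLVM headers; in the real pass the set would hold Instruction pointers and the number map would be the instruction map mentioned above.

```cpp
#include <list>
#include <map>
#include <set>
#include <string>

// Turns a set of pseudo-registers into text such as "{ %6 %8 }".
// `number` maps each register (an instruction address) to its printed number.
std::string formatLiveSet(const std::set<const void *> &live,
                          const std::map<const void *, int> &number) {
    // Pointer order is not instruction-number order, so translate, then sort.
    std::list<int> nums;
    for (const void *p : live) nums.push_back(number.at(p));
    nums.sort();

    std::string out = "{ ";
    for (int n : nums) out += "%" + std::to_string(n) + " ";
    return out + "}";
}
```

An empty set comes out as "{ }", which matches the form used in the example output.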

EXAMPLE

Below is an example C program and the output that should be produced. The output of your printCode pass before and after running liveVars is also shown, which helps you see which instructions get removed. You don't have to match blank lines or whitespace within lines exactly; just make your output readable. Do match exactly the words and symbols used, including upper vs lower case, so that we can compare your output with expected results using diff.

The example source and all of the outputs shown below can be found in /p/course/cs701-fischer/public/proj2.

Source Code

#include <stdio.h>
void foo(int x) {
  int y;
  scanf("%d", &y);
  x = y / 3; // useless
}

int main() {
  int a, b, c;
  scanf("%d", &a);  // force a to be stored in memory not a register
  b = a * 2;        // useless
  c = a + 4;        // useless
  foo(a);
  c = a * 6;
  printf("c: %d\n", c);
}

Output of `printCode` before running the `liveVars` pass.


FUNCTION foo

BASIC BLOCK entry
%1:     alloca   XXX
%2:     call     XXX %1 __isoc99_scanf
%3:     load     %1
%4:     sdiv     %3 XXX
%5:     ret

FUNCTION main

BASIC BLOCK entry
%6:     alloca   XXX
%7:     call     XXX %6 __isoc99_scanf
%8:     load     %6
%9:     mul      %8 XXX
%10:    load     %6
%11:    add      %10 XXX
%12:    load     %6
%13:    call     %12 foo
%14:    load     %6
%15:    mul      %14 XXX
%16:    call     XXX %15 printf
%17:    ret      XXX

Output of `printCode` after running the `liveVars` pass.

Note that instructions %4, %9 and %11 got removed (and so the old instruction %5 is now %4, and so on)


FUNCTION foo

BASIC BLOCK entry
%1:     alloca   XXX
%2:     call     XXX %1 __isoc99_scanf
%3:     load     %1
%4:     ret

FUNCTION main

BASIC BLOCK entry
%5:     alloca   XXX
%6:     call     XXX %5 __isoc99_scanf
%7:     load     %5
%8:     load     %5
%9:     load     %5
%10:    call     %9 foo
%11:    load     %5
%12:    mul      %11 XXX
%13:    call     XXX %12 printf
%14:    ret      XXX

Output when `PRINT_ANAL_RESULTS` is true

FUNCTION foo

BASIC BLOCK entry  L-Before: { }  L-After: { }
%1:   L-Before: { }     L-After: { %1 }
%2:   L-Before: { %1 }  L-After: { %1 }
%3:   L-Before: { %1 }  L-After: { %3 }
%4:   L-Before: { %3 }  L-After: { }
%5:   L-Before: { }     L-After: { }

FUNCTION main

BASIC BLOCK entry  L-Before: { }  L-After: { }
%6:   L-Before: { }          L-After: { %6 }
%7:   L-Before: { %6 }       L-After: { %6 }
%8:   L-Before: { %6 }       L-After: { %6 %8 }
%9:   L-Before: { %6 %8 }    L-After: { %6 }
%10:  L-Before: { %6 }       L-After: { %6 %10 }
%11:  L-Before: { %6 %10 }   L-After: { %6 }
%12:  L-Before: { %6 }       L-After: { %6 %12 }
%13:  L-Before: { %6 %12 }   L-After: { %6 }
%14:  L-Before: { %6 }       L-After: { %14 }
%15:  L-Before: { %14 }      L-After: { %15 }
%16:  L-Before: { %15 }      L-After: { }
%17:  L-Before: { }          L-After: { }

Output when `PRINT_REMOVING` is true

removing useless assignment %4
removing useless assignment %9
removing useless assignment %11

Useful Links

Here are all of the links provided above (gathered together into one place for your convenience):

LLVM iterators for the basic blocks in a function.
LLVM iterators for the instructions in a basic block.
LLVM SmallSet
LLVM DenseMap
links to LLVM .h files
C++ vector
C++ set
C++ map
Instruction.h LLVM functions that identify binary, cast, and shift instructions
Instruction.def op-code numbers and associated names
LLVM eraseFromParent method
C++ sort operation on lists

Submit Your Work

To submit your work, copy all of your .cpp and .h files as well as your Makefile to your handin directory:

    ~cs701-1/HANDIN/YOUR-LOGIN/P2

using your actual login in place of YOUR-LOGIN. If you are working with a partner, only one of you should submit your work.

Late Policy

The project is due on Wednesday, October 15. It may be handed in late, with a penalty of 3% per day, up to Wednesday, October 22. The maximum late penalty is therefore 21% (the maximum possible grade becomes 79). This assignment will not be accepted after Wednesday, October 22.

Fri Aug 22 11:45:08 CDT 2014
