The main focus of CS 701 is on the (backend) optimization phase of a compiler, including several kinds of analysis that must be done to enable optimizations.
To reinforce your understanding of the concepts involved, you'll do four projects that implement various backend components. You'll be using the LLVM Compiler Infrastructure. (LLVM means "Low Level Virtual Machine".) LLVM was initially developed by a group led by Vikram Adve, an alumnus of the University of Wisconsin (and CS 701!).
LLVM is implemented in C++.
It includes the commands clang, opt, and llc, which run a C front-end, an optimizer, and a backend, respectively.
LLVM includes many more commands, most of which are documented at
http://llvm.org/docs/CommandGuide/index.html, but you won't need
those for this class.
LLVM is installed and ready to use in /unsup/llvm-3.3/.
The projects that you will build will produce dynamic libraries, which will be used at runtime when you invoke the opt and llc commands.
This means you don't ever need to copy or compile the entire LLVM
source.
(The complete LLVM tree is about 1.5 GB, and a full compilation takes
about half an hour.)
Note: You will need to frequently reference the LLVM documentation in order to do the four CS 701 projects. The LLVM documentation is voluminous. We will try to give you enough information to save you from unnecessary frustration with the documentation. If you are having trouble anyway, don't hesitate to ask for help (particularly through Piazza).
To use the standard LLVM commands (like opt and llc), you should add the LLVM binary directory to your PATH. If you don't have your own elaborate environment configuration, you can add to PATH like this:
If you use the csh or tcsh shell, add the following to the file ~/.cshrc.local:
set path=($path /unsup/llvm-3.3/bin )
If you use ksh or bash, add the following to the file ~/.profile (for ksh) or .bash.local or maybe .bashrc.local (for bash):
export PATH=$PATH:/unsup/llvm-3.3/bin
So that the changes to your PATH take effect, restart your console session (log out and back in).
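LLVM is composed of many separate pieces. To use LLVM to turn a C source file into an x86 (or x86_64) executable, we need to compile the C source to LLVM bitcode, translate the bitcode to native assembly, and then assemble and link the assembly into an executable.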
Just as C files use the extension .c, LLVM bitcode uses the extension .bc, assembly uses the extension .s, and LLVM human-readable assembly uses the extension .ll.
Suppose we have a C program in the file foo.c. Below are the steps needed to create an executable (with no optimization):
clang -emit-llvm -O0 -c foo.c -o foo.bc   # create bitcode foo.bc
llc foo.bc                                # create assembly foo.s
gcc foo.s -o foo                          # create executable "foo"
To run your program:
./foo
To turn LLVM bitcode into human-readable LLVM assembly (foo.ll):
llvm-dis -f foo.bc
The above LLVM commands (clang, llc, and llvm-dis) are all available in the /unsup directory.
Let's get a little more familiar with LLVM's instructions. Consider the following C program, sum.c:
#include <stdio.h>

int main() {
    int n;
    int sum;
    sum = 0;
    for (n = 0; n < 10; n++)
        sum = sum + n*n;
    printf("sum: %d\n", sum);
}
Running clang and llvm-dis produces the following LLVM assembly code (on my 64-bit machine):
; ModuleID = 'sum.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
@.str = private unnamed_addr constant [9 x i8] c"sum: %d\0A\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %n = alloca i32, align 4
  %sum = alloca i32, align 4
  store i32 0, i32* %retval
  store i32 0, i32* %sum, align 4
  store i32 0, i32* %n, align 4
  br label %for.cond

for.cond: ; preds = %for.inc, %entry
  %0 = load i32* %n, align 4
  %cmp = icmp slt i32 %0, 10
  br i1 %cmp, label %for.body, label %for.end

for.body: ; preds = %for.cond
  %1 = load i32* %sum, align 4
  %2 = load i32* %n, align 4
  %3 = load i32* %n, align 4
  %mul = mul nsw i32 %2, %3
  %add = add nsw i32 %1, %mul
  store i32 %add, i32* %sum, align 4
  br label %for.inc

for.inc: ; preds = %for.body
  %4 = load i32* %n, align 4
  %inc = add nsw i32 %4, 1
  store i32 %inc, i32* %n, align 4
  br label %for.cond

for.end: ; preds = %for.cond
  %5 = load i32* %sum, align 4
  %call = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([9 x i8]* @.str, i32 0, i32 0), i32 %5)
  %6 = load i32* %retval
  ret i32 %6
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind uwtable "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"="true" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false" "use-soft-float"="false" }
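Some remarks about this assembly code:
Anything from a ; to the end of the line is a comment.
The function is divided into five labeled blocks: entry, for.cond, for.body, for.inc, and for.end.
The for.body and for.inc blocks could really be a single basic block (for.body ends with an unconditional branch to for.inc, and nothing else jumps to for.inc); clang most likely keeps for.inc separate so that a continue statement would have a block to branch to. The entry and for.cond blocks, on the other hand, have to stay separate even though entry ends with an unconditional branch to for.cond, because for.cond is also the target of the branch at the end of for.inc.
Names that start with % (like %0, %1, %n, and %for.cond) are either virtual register names (more about this later) or block labels.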
You'll want to become comfortable with LLVM assembly (the human readable form of LLVM bitcode) because the first three projects that you will write for this class will accept LLVM bitcode as input and will emit LLVM bitcode as output.
Here are links to some LLVM documents that you may find useful during the semester (but see Useful Links below for documentation specific to project 1):
Each of the four class projects will be implemented by an individual or a two-person team (your choice).
For each project you will write one or more new LLVM passes. For Project 1, you will write two passes. All passes will be invoked by command-line flags recognized by the LLVM opt command. The two passes (explained in more detail below) for Project 1 are:
A printCode pass, implemented in a file called printCode.cpp. Your printCode pass will print the LLVM assembly code for each function in a useful format (defined below).
An optLoads pass, implemented in a file called optLoads.cpp. Your optLoads pass will find and remove unnecessary load instructions in each basic block of a function (as defined below).
Before starting the project, it might be a good idea to read some of the LLVM documentation. First, it might help to review some of the core LLVM classes. Concentrate on the following:
Note that Instructions, BasicBlocks, and Functions are all Values.
And here are some other useful links, mentioned in the description of the project below:
proj1 Tree
A skeleton for the printCode pass has been prepared for you.
To set up the skeleton and build it, navigate to the
location where you want to put your 701 projects and type
the following:
cp -r /p/course/cs701-fischer/public/proj1 proj1
cd proj1
make
For the first part of this project, you'll work on the file lib/p1/printCode.cpp in the proj1 directory that you just made.
The class printCode is a FunctionPass that is run when you invoke the opt command. Its Makefile is configured to build it as a dynamic library.
Given an LLVM bitcode file foo.bc (created from a C file as described above) we can tell opt to run the printCode pass as follows:
opt -load Debug/lib/P1.so -printCode foo.bc > foo.opt
P1.so is a shared library file created from the C++ source code in the lib/p1 subdirectory of your proj1 directory (for now, that's just printCode.cpp). Debug/lib/P1.so is in the proj1 directory that you created.
The -load flag loads P1.so as a dynamic library. This allows us to build and test the printCode pass without rebuilding the opt binary.
The -printCode flag indicates the pass we'd like opt to run. (That flag is defined in printCode.cpp.) To see all of the built-in passes, type opt -help.
opt is meant to modify (optimize) the bitcode, so when it is run it outputs the optimized version of the bitcode. Therefore, we send the output to a new file (foo.opt). Since printCode only prints (does not do any optimization), foo.opt will be the same as foo.bc. For passes that do modify the bitcode, you'll need to
mv foo.opt foo.bc
before calling llc to transform the optimized bitcode to assembly code if you want to run the program.
For more information about opt, see its documentation online.
To ensure that everything runs properly at this point, you should write a small C program foo.c in the proj1 directory, create the corresponding LLVM bitcode file, and run the printCode pass over that bitcode file. You can do the steps explicitly like this:
clang -emit-llvm -O0 -c foo.c -o foo.bc
opt -load Debug/lib/P1.so -printCode foo.bc > foo.opt
or go through the Makefile:
make foo.printCode
The version of printCode we've given you doesn't do much -- it simply prints the name of each function in the source program. Your job will be to modify printCode to print information about each LLVM instruction.
For your first programming assignment, you will make printCode print a useful version of the bitcode file on which it runs. Getting more familiar with the bitcode and being able to output it will be helpful for the remaining projects.
Before discussing how you should modify printCode, here's some important information about how LLVM represents virtual registers.
When you look at LLVM assembly code (a .ll file), you see virtual register names -- names that start with a percent sign, like %1, %2, %n, %mul, etc., in the example code given above. You can think of virtual registers whose names use numbers or names that are not the names of variables in the source code (e.g., %1, %mul) as temporaries, and those that use identifiers from the source code (e.g., %n, %sum) as registers that hold pointers to the memory allocated for local variables.
However, in the intermediate representation (IR) used by LLVM, there
are no virtual-register objects.
Instead, for each instruction that assigns to a virtual register,
that register is represented by the instruction's address.
An instruction that uses the virtual register has the defining
instruction's address as its operand.
For example, one of the instructions in the example code given above is
%mul = mul nsw i32 %2, %3
The LLVM IR for that instruction has the following fields:
Two operands, one for %2 and one for %3. The operand for %2 is the address of the instruction that "assigned" to virtual register %2, i.e., the address of the instruction
%2 = load i32* %n, align 4
Similarly, the operand for %3 is the address of the instruction that assigned to %3.
There is no operand for the target register, %mul; instead, the subsequent use of %mul (in the instruction %add = add nsw i32 %1, %mul) uses the address of the instruction %mul = mul nsw i32 %2, %3 as an operand.
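To make the idea that a register "is" its defining instruction more concrete, here is a minimal sketch that does not use LLVM at all. ToyInst, ToyFunction, makeInst, sameRegister and buildExample are hypothetical names invented for this illustration; the real LLVM classes (Instruction, User, Value) are much richer, so treat this only as a model of the representation described above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an LLVM instruction (NOT the real llvm::Instruction).
// There is no "destination register" field: the instruction object itself
// plays the role of the virtual register that it defines.
struct ToyInst {
    std::string opcode;              // e.g. "load", "mul", "add"
    std::vector<ToyInst*> operands;  // addresses of the defining instructions
};

// Owns every instruction, so the addresses handed out stay valid.
using ToyFunction = std::vector<std::unique_ptr<ToyInst>>;

// Appends a new instruction and returns its address, which doubles as the
// "virtual register" that later instructions refer to.
ToyInst* makeInst(ToyFunction& f, const std::string& opcode,
                  std::vector<ToyInst*> operands = {}) {
    assert(!opcode.empty());
    f.push_back(std::unique_ptr<ToyInst>(new ToyInst{opcode, std::move(operands)}));
    return f.back().get();
}

// Two operands name the same virtual register exactly when they point to
// the same defining instruction.
bool sameRegister(const ToyInst* a, const ToyInst* b) { return a == b; }

// Builds a toy version of
//   %1 = load ...   %2 = load ...   %3 = load ...
//   %mul = mul %2, %3
//   %add = add %1, %mul
// and returns the final add instruction.
ToyInst* buildExample(ToyFunction& f) {
    ToyInst* r1  = makeInst(f, "load");
    ToyInst* r2  = makeInst(f, "load");
    ToyInst* r3  = makeInst(f, "load");
    ToyInst* mul = makeInst(f, "mul", {r2, r3});
    return makeInst(f, "add", {r1, mul});
}
```

In this model, following the second operand of the add instruction leads straight to the mul instruction object; no register name is stored or looked up anywhere.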
Now we'll talk about how you should modify printCode. The printCode class is a FunctionPass. It includes a runOnFunction method that is called once for each function in the input program. Here's what your version of runOnFunction should do:
Create a map (you can use a DenseMap, or a C++ STL map, or you can define your own Map class) that maps each instruction in the function to a unique integer starting with 1. Do this by iterating over all instructions in the function and mapping each to the next integer value. (See Useful Links for Project 1 above for how to do various kinds of iterations.)
Note:
printCode.cpp
.
Print "FUNCTION" and the name of the function.
Iterate over all basic blocks in the function. For each, print a blank line, then "BASIC BLOCK " and the name of the basic block. After printing the block name, iterate over the instructions in the block. For each, print a percent, then the number of the instruction (using your map), a colon, the name of the opcode, and each operand. (Instruction.h has methods for getting the opcode and the opcode name, and the User class has methods for getting the number of operands and the operands themselves.) When you print an operand that is an instruction, print a percent and the instruction's number (using your map). For an operand that is not an instruction, if it has a name, print that name; otherwise print XXX. You can use the isa operator to see whether an operand is an Instruction, and use the hasName and getName methods of the Value class to see whether an operand has a name and if so to get that name.
All output should go to stderr (i.e., use std::cerr << ...).
For example, for the program shown above, your output should look like this:
FUNCTION main

BASIC BLOCK entry
%1: alloca XXX
%2: alloca XXX
%3: alloca XXX
%4: store XXX %1
%5: store XXX %3
%6: store XXX %2
%7: br for.cond

BASIC BLOCK for.cond
%8: load %2
%9: icmp %8 XXX
%10: br %9 for.end for.body

BASIC BLOCK for.body
%11: load %3
%12: load %2
%13: load %2
%14: mul %12 %13
%15: add %11 %14
%16: store %15 %3
%17: br for.inc

BASIC BLOCK for.inc
%18: load %2
%19: add %18 XXX
%20: store %19 %2
%21: br for.cond

BASIC BLOCK for.end
%22: load %3
%23: call XXX %22 printf
%24: load %1
%25: ret %24
You don't have to match the whitespace within each line exactly, but please try to make your output as similar to this as possible so that we can compare your output with the expected output using diff -w. In particular, if there is whitespace in the example above, please make sure that your output has whitespace, too. So for example, you should not output the opcode immediately after the instruction number, like this: %25:ret %24.
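The numbering and printing steps for runOnFunction can be tried out without LLVM. The following is a minimal, self-contained model and not the real pass: ToyValue, ToyBlock, ToyFunction, numberInstructions and listing are hypothetical names made up for this sketch, the isInstruction flag stands in for the isa check, and an empty name stands in for hasName returning false. It builds a string instead of writing to stderr so that the result is easy to check, and the single spaces between fields are an assumption about the expected layout.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for LLVM's Value, BasicBlock and Function.
struct ToyValue {
    bool isInstruction;                     // models isa<Instruction>(v)
    std::string name;                       // empty models "has no name"
    std::string opcode;                     // only meaningful for instructions
    std::vector<const ToyValue*> operands;  // only meaningful for instructions
};

struct ToyBlock {
    std::string name;
    std::vector<const ToyValue*> insts;
};

struct ToyFunction {
    std::string name;
    std::vector<ToyBlock> blocks;
};

// Step 1: map every instruction to a unique integer, starting with 1,
// in the order the instructions appear in the function.
std::map<const ToyValue*, int> numberInstructions(const ToyFunction& f) {
    std::map<const ToyValue*, int> num;
    int next = 1;
    for (const ToyBlock& b : f.blocks)
        for (const ToyValue* inst : b.insts) {
            assert(inst != nullptr);
            num[inst] = next++;
        }
    return num;
}

// Steps 2 and 3: print the function name, then every block and instruction.
std::string listing(const ToyFunction& f) {
    const std::map<const ToyValue*, int> num = numberInstructions(f);
    std::ostringstream out;
    out << "FUNCTION " << f.name << "\n";
    for (const ToyBlock& b : f.blocks) {
        out << "\nBASIC BLOCK " << b.name << "\n";
        for (const ToyValue* inst : b.insts) {
            out << "%" << num.at(inst) << ": " << inst->opcode;
            for (const ToyValue* op : inst->operands) {
                if (op->isInstruction)
                    out << " %" << num.at(op);  // operand defined by an instruction
                else if (!op->name.empty())
                    out << " " << op->name;     // named non-instruction, e.g. a block
                else
                    out << " XXX";              // unnamed non-instruction, e.g. a constant
            }
            out << "\n";
        }
    }
    return out.str();
}
```

The map is filled in a separate loop before anything is printed because an operand can refer to an instruction in a block that has not been printed yet.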
For the second part of this project, you will implement a pass that finds and removes unnecessary load instructions in each function. An instruction that loads a value from memory into a virtual register %k is unnecessary if the previous instruction in the same basic block stored a value v to the same memory location. You should find all such loads and replace all uses of %k with uses of v. Then you should remove the unnecessary load instruction.
For example, if the original code looks like this:
store i32 12, i32* %x, align 4 ; store the value 12 into the memory location pointed to by %x
%0 = load i32* %x, align 4 ; load the value in the memory location pointed to by %x into %0
%add = add nsw i32 %0, 22 ; set %add to be the value in %0 + 22
store i32 %add, i32* %y, align 4 ; store the value in %add into the memory location pointed to by %y
%1 = load i32* %y, align 4 ; load the value in the memory location pointed to by %y into %1
%add1 = add nsw i32 %1, 33 ; set %add1 to be the value in %1 + 33
store i32 %add1, i32* %z, align 4 ; store the value in %add1 into the memory location pointed to by %z
You would change it to the following:
store i32 12, i32* %x, align 4 ; store the value 12 into the memory location pointed to by %x
; 1st unnecessary load was removed
%add = add nsw i32 12, 22 ; set %add to be the value 12 + 22
store i32 %add, i32* %y, align 4 ; store the value in %add into the memory location pointed to by %y
; 2nd unnecessary load was removed
%add1 = add nsw i32 %add, 33 ; set %add1 to be the value in %add + 33
store i32 %add1, i32* %z, align 4 ; store the value in %add1 into the memory location pointed to by %z
Note that you can get the above code from the following source code:
int main() {
    int x, y, z;
    x = 12;
    y = x + 22;   /* load value of x that was just stored */
    z = y + 33;   /* load value of y that was just stored */
}
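The rule for spotting an unnecessary load (a load that immediately follows a store to the same location) can be sketched on a toy basic block. This is only an illustration under stated assumptions: Op, ToyInst and findUselessLoads are hypothetical names, values are identified by plain integer ids rather than instruction addresses, and the operand order (a store lists the value first and the address second, a load lists only the address) follows the printCode listing shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Store, Load, Other };

// Hypothetical toy instruction. Operands are integer ids of the values used:
//   Store: {value, address}    Load: {address}
struct ToyInst {
    Op op;
    std::vector<int> operands;
};

// Returns the positions, within one basic block, of every load that comes
// immediately after a store to the same address operand.
std::vector<std::size_t> findUselessLoads(const std::vector<ToyInst>& block) {
    std::vector<std::size_t> found;
    for (std::size_t i = 1; i < block.size(); ++i) {
        const ToyInst& prev = block[i - 1];
        const ToyInst& cur = block[i];
        if (prev.op != Op::Store || cur.op != Op::Load)
            continue;
        assert(prev.operands.size() == 2 && cur.operands.size() == 1);
        if (prev.operands[1] == cur.operands[0])  // same memory location
            found.push_back(i);
    }
    return found;
}
```

Because the scan only ever compares an instruction with its predecessor inside one block, a store at the end of one block and a load at the start of the next are never paired, which matches the "same basic block" condition above.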
Implement the optLoads pass as a FunctionPass in a file called optLoads.cpp, run from opt using the -optLoads flag. For example:
clang -emit-llvm -O0 -c foo.c -o foo.bc
opt -load Debug/lib/P1.so -optLoads foo.bc -o foo.optLoads
mv foo.optLoads foo.bc
Here is what you should do:
Make a copy of printCode.cpp called optLoads.cpp in the same directory. Change everything specific to printCode to refer instead to optLoads.
Make sure that everything is OK so far:
Add optLoads.o to the definition of OBJS in the Makefile in the proj1/lib/p1 directory.
Type make in the proj1 directory to create the optLoads pass as well as the printCode pass (both will be in library file P1.so).
To run the optLoads pass use:
opt -load Debug/lib/P1.so -optLoads foo.bc -o foo.optLoads
Now write the new runOnFunction code:
Create the map from instructions to numbers, as you did for printCode.
Iterate over all basic blocks in the function, and all instructions in each basic block. Look for an instruction that stores a value v to the memory location pointed to by virtual register %m, immediately followed by an instruction that loads from the location pointed to by %m into register %k. The second instruction (the load) is unnecessary. Note: The opcode for a load instruction is Instruction::Load, and the opcode for a store instruction is Instruction::Store.
Print (to stderr)
%n is a useless load
where n is the number of the instruction that is the useless load, retrieved from your instruction map (which probably will not be the same as the target virtual register you'll see for that instruction if you look at the output of llvm-dis).
Replace all uses of %k with a use of v. The Value class includes a replaceAllUsesWith method that you can use.
Save the (address of the) unnecessary load instruction so that you can remove it when you're done iterating over all basic blocks in the current function (you will mess up the iteration if you remove it now). The Instruction class has an eraseFromParent method that you can use; it is described in the LLVM documentation for the Instruction class.
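The "replace the uses now, erase later" pattern from the steps above can be sketched on a toy block. This is a model under stated assumptions, not LLVM code: Op, ToyInst, ToyBlock, replaceAllUses and removeUselessLoads are hypothetical names, replaceAllUses only imitates what the replaceAllUsesWith method does, and erasing from a std::list imitates eraseFromParent.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

enum class Op { Store, Load, Other };

// Hypothetical toy instruction. Operands point at the defining instructions:
//   Store: {value, address}    Load: {address}
struct ToyInst {
    Op op;
    std::vector<ToyInst*> operands;
};

// A std::list keeps the addresses of the remaining instructions stable
// when one instruction is erased.
using ToyBlock = std::list<ToyInst>;

// Toy imitation of replaceAllUsesWith: every use of 'from' becomes a use of 'to'.
void replaceAllUses(ToyBlock& block, ToyInst* from, ToyInst* to) {
    for (ToyInst& inst : block)
        for (ToyInst*& op : inst.operands)
            if (op == from) op = to;
}

// Finds each load that directly follows a store to the same address,
// redirects its uses to the stored value, and erases the loads only after
// the scan is finished. Returns the number of loads removed.
std::size_t removeUselessLoads(ToyBlock& block) {
    std::vector<ToyBlock::iterator> dead;  // saved for deferred removal
    for (ToyBlock::iterator it = block.begin(); it != block.end(); ++it) {
        if (it == block.begin()) continue;
        ToyBlock::iterator prev = std::prev(it);
        if (prev->op != Op::Store || it->op != Op::Load) continue;
        assert(prev->operands.size() == 2 && it->operands.size() == 1);
        if (prev->operands[1] != it->operands[0]) continue;
        replaceAllUses(block, &*it, prev->operands[0]);
        dead.push_back(it);  // do not erase while still iterating
    }
    for (ToyBlock::iterator d : dead) block.erase(d);
    return dead.size();
}
```

Erasing inside the first loop would invalidate the iterator that the loop is about to advance, which is the toy counterpart of "messing up the iteration" in the real pass.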