LLVM and C++
This course focuses on compiler analyses, optimizations, and code generation issues
(instruction selection, scheduling and register allocation).
These are components of the "back-end" of a compiler.
Class projects will involve implementing several back-end components using the LLVM Compiler Infrastructure.
(LLVM originally stood for "Low Level Virtual Machine.")
LLVM was initially developed by a group led by Vikram Adve, an alumnus of the University of Wisconsin (and CS 701!).
LLVM is implemented in C++.
It includes the commands clang, opt, and llc, which run a C front-end, an optimizer, and a back-end, respectively.
LLVM includes many more commands, most of which are documented at
http://llvm.org/docs/CommandGuide/index.html, but you won't need
those for this class.
One of the great attractions of LLVM is that it provides a wide variety of compiler components that
can be combined to build a compilation tool. Thus you can change a front-end to accommodate a new
source language, change an optimization component to improve performance, or change a code generator to
accommodate a new target architecture. Such changes are feasible because the LLVM program representation is
virtual: it is tied neither to a particular source language nor to a particular target machine. The
LLVM overview paper explains this in more detail.
You will complete the projects in this class by writing various LLVM passes.
This tutorial on writing an LLVM pass will be useful.
(Ignore what this documentation says about setting up the build environment.)
The compiler front-end (for C in our projects) will produce a translation, in LLVM IR, for
each function in the source file. We will use an iterator that examines each function in turn.
The translation of each function is organized as a collection of basic blocks.
Each basic block is a sequence of LLVM IR instructions, guaranteed always to execute sequentially.
A function's basic blocks are linked together to form a control flow graph that represents the potential
execution paths in the function.
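To make the idea of a control flow graph concrete, here is a minimal sketch in plain C++. It does not use LLVM: the `Block` and `Cfg` types, the convention that block 0 is the entry block, and the helper names are all assumptions made up for this illustration. It shows basic blocks linked by successor edges and a worklist traversal that finds the blocks lying on some potential execution path.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a basic block: a name plus the indices of its successors.
// Illustrative only; these are not LLVM's classes.
struct Block {
    std::string name;
    std::vector<std::size_t> successors;  // outgoing edges of the CFG
};

// A function's CFG. Assumption for this sketch: block 0 is the entry block.
using Cfg = std::vector<Block>;

// Return the indices of all blocks reachable from the entry block,
// i.e. the blocks that lie on some potential execution path.
std::set<std::size_t> reachable_blocks(const Cfg& cfg) {
    std::set<std::size_t> seen;
    if (cfg.empty()) return seen;
    std::vector<std::size_t> worklist{0};
    while (!worklist.empty()) {
        std::size_t b = worklist.back();
        worklist.pop_back();
        if (!seen.insert(b).second) continue;  // already visited
        for (std::size_t s : cfg[b].successors) {
            assert(s < cfg.size());  // every edge must target a real block
            worklist.push_back(s);
        }
    }
    return seen;
}

// CFG for:  if (c) x = 1; else x = 2; return x;
// plus one extra block ("dead") that no edge leads to.
Cfg example_cfg() {
    Cfg cfg;
    cfg.push_back(Block{"entry", {1, 2}});
    cfg.push_back(Block{"then", {3}});
    cfg.push_back(Block{"else", {3}});
    cfg.push_back(Block{"exit", {}});
    cfg.push_back(Block{"dead", {3}});
    return cfg;
}
```

Calling `reachable_blocks(example_cfg())` visits the entry, then, else, and exit blocks but never the dead block, since no edge leads to it.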
We use an iterator to examine the basic blocks in the function.
For a given basic block, we use another iterator to examine the individual instructions in the block.
LLVM provides methods to examine the contents of an instruction (its operator and operands, their types and values).
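The nesting described above (functions, then basic blocks, then instructions) can be sketched with three nested loops. The example below is a toy model in standard C++, not LLVM code: the `Instr`, `BasicBlock`, `Function`, and `Module` types here are simplified stand-ins invented for this sketch, and `opcode_histogram` and `example_module` are hypothetical helper names. In LLVM itself the corresponding classes are `llvm::Module`, `llvm::Function`, `llvm::BasicBlock`, and `llvm::Instruction`, and the traversal has the same shape.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for the IR hierarchy. Illustrative only, not LLVM's classes.
struct Instr {
    std::string opcode;                 // the operator, e.g. "add"
    std::vector<std::string> operands;  // operand names or constants
};
struct BasicBlock {
    std::string label;
    std::vector<Instr> instrs;          // executed strictly in order
};
struct Function {
    std::string name;
    std::vector<BasicBlock> blocks;
};
using Module = std::vector<Function>;

// Visit every instruction with three nested loops and count how often
// each opcode occurs across the whole module.
std::map<std::string, int> opcode_histogram(const Module& m) {
    std::map<std::string, int> counts;
    for (const Function& f : m) {                 // each function in turn
        for (const BasicBlock& bb : f.blocks) {   // each block in the function
            assert(!bb.instrs.empty());  // assumed here: no empty blocks
            for (const Instr& i : bb.instrs) {    // each instruction in the block
                ++counts[i.opcode];
            }
        }
    }
    return counts;
}

// A small hand-built module with two functions, f and g.
Module example_module() {
    BasicBlock f_entry{"entry", {Instr{"add", {"%a", "%b"}}, Instr{"br", {"exit"}}}};
    BasicBlock f_exit{"exit", {Instr{"mul", {"%x", "2"}}, Instr{"ret", {"%y"}}}};
    BasicBlock g_entry{"entry", {Instr{"add", {"%p", "1"}}, Instr{"ret", {"%q"}}}};
    Function f{"f", {f_entry, f_exit}};
    Function g{"g", {g_entry}};
    return {f, g};
}
```

A pass that only needs to look at operands would use the same three loops and read `i.operands` in the innermost one instead of counting opcodes.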
The following links provide more detail on the iterators and classes involved:
Here are some additional useful LLVM links:
Because LLVM's primary implementation language is C++, we will implement our class projects in C++.
An Introduction to C++ for Java Programmers may be helpful if you wish to brush up on C++.
The class projects use fairly advanced makefiles.
See the Make Reference Manual for more information on using this tool.
Fri Jul 18 14:00:55 CDT 2014