LLVM and C++

This course focuses on compiler analyses, optimizations, and code generation issues (instruction selection, scheduling, and register allocation). These are components of the "back-end" of a compiler. Class projects will involve implementing several back-end components using the LLVM Compiler Infrastructure. (LLVM originally meant "Low Level Virtual Machine".) LLVM was initially developed by a group led by Vikram Adve, an alumnus of the University of Wisconsin (and CS 701!).

LLVM is implemented in C++. It includes the commands clang, opt, and llc, which run a C front-end, an optimizer, and a back-end, respectively. LLVM includes many more commands, most of which are documented at http://llvm.org/docs/CommandGuide/index.html, but you won't need those for this class.

One of the great attractions of LLVM is that it provides a wide variety of compiler components that can be combined to build a compilation tool. Thus you can change a front-end to accommodate a new source language, change an optimization component to improve performance, or change a code generator to accommodate a new target architecture. Such changes are feasible because the LLVM program representation is virtual, that is, not tied to any one source language or target machine. The LLVM overview paper explains this in more detail.

You will solve the projects in this class by writing various LLVM passes. This tutorial on writing an LLVM pass will be useful. (Ignore what this documentation says about Setting up the build environment.)

The compiler front-end (for C in our projects) will produce a translation, in LLVM IR, of each function in the source file. We will use an iterator that examines each function in turn. The translation of each function is organized as a collection of basic blocks. Each basic block is a sequence of LLVM IR instructions that is guaranteed always to execute sequentially. A function's basic blocks are linked together to form a control flow graph that represents the potential execution paths in the function. We use an iterator to examine the basic blocks in a function. For a given basic block, we use another iterator to examine the individual instructions in the block. LLVM provides methods to examine the contents of an instruction (its operator and operands, their types and values). The following links provide more detail on the iterators and classes involved:

Here are some additional useful LLVM links:

Because LLVM's primary implementation language is C++, we will implement our class projects in C++. An Introduction to C++ for Java Programmers may be helpful if you wish to brush up on C++. The class projects use fairly advanced makefiles. See the Make Reference Manual for more information on using this tool.
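To make the traversal described earlier concrete (functions, then basic blocks, then instructions), here is a minimal C++ sketch of the nested iteration. It does not use the LLVM headers: the structs in the `toy` namespace, along with `countOpcodes` and `makeExample`, are hypothetical stand-ins invented for this illustration so that the code compiles without LLVM installed. Only the shape of the three nested loops is meant to carry over to a real pass.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for the real LLVM classes (llvm::Module, llvm::Function,
// llvm::BasicBlock, llvm::Instruction). These structs are hypothetical and
// exist only so the sketch is self-contained.
namespace toy {

struct Instruction {
    std::string opcode;                 // e.g. "add", "br", "ret"
    std::vector<std::string> operands;  // operand names, kept as plain text
};

struct BasicBlock {
    std::string label;
    std::vector<Instruction> instructions;  // executed sequentially
    std::vector<std::size_t> successors;    // indices of CFG successor blocks
};

struct Function {
    std::string name;
    std::vector<BasicBlock> blocks;  // blocks[0] is the entry block
};

struct Module {
    std::vector<Function> functions;  // one entry per source function
};

// Walk functions, then basic blocks, then instructions, and count how often
// each opcode appears in each function.
std::map<std::string, std::map<std::string, int>>
countOpcodes(const Module &m) {
    std::map<std::string, std::map<std::string, int>> counts;
    for (const Function &f : m.functions) {
        for (const BasicBlock &bb : f.blocks) {
            // Mirrors the rule that a well-formed block is never empty:
            // it ends in a terminator such as "br" or "ret".
            assert(!bb.instructions.empty());
            for (const Instruction &inst : bb.instructions) {
                ++counts[f.name][inst.opcode];
            }
        }
    }
    return counts;
}

// A small hand-built module: one function "f" with a two-block CFG.
Module makeExample() {
    BasicBlock entryBlock{"entry",
                          {{"add", {"%a", "%b"}}, {"br", {"exit"}}},
                          {1}};
    BasicBlock exitBlock{"exit",
                         {{"add", {"%sum", "1"}}, {"ret", {"%r"}}},
                         {}};
    Module m;
    m.functions.push_back(Function{"f", {entryBlock, exitBlock}});
    return m;
}

}  // namespace toy
```

Calling `toy::countOpcodes(toy::makeExample())` yields a per-function table of opcode counts. In an actual LLVM pass the containers are LLVM's own: a module can be iterated to visit its functions, a function to visit its basic blocks, and a basic block to visit its instructions, so the same three-level loop structure applies, with LLVM's instruction methods taking the place of the plain `opcode` and `operands` fields used here.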

Fri Jul 18 14:00:55 CDT 2014