The SUIF Library - Instruction Level

Go to the first, previous, next, last section, table of contents.

The Instruction Level

Each instruction node in an abstract syntax tree holds a SUIF instruction. See section Instructions and Expression Trees. Most SUIF instructions perform simple operations; the opcodes resemble those for a typical RISC processor. However, more complex instructions are used in places where it is important to retain high-level information.

SUIF supports both expression trees and flat lists of instructions. In an expression tree, the instructions for an expression are all grouped together. This works well for high-level passes. Because expression trees do not totally order the evaluation of the instructions, they do not work so well for back-end optimization and scheduling passes. Thus SUIF also provides the flat list representation where each instruction node contains a single instruction.

Most SUIF instructions use a "quadruple" format with a destination operand and two source operands; however, some instructions require more specialized formats. For example, ldc (load constant) instructions have an immediate value field in place of the source operands. While most SUIF instructions are very simple, it is important to retain sufficient high-level information to support detailed analysis. Thus SUIF includes several instructions with more complex behavior:

The cal (call) instruction implements a machine-independent procedure call with a list of parameters. This hides the details of various linkage conventions.
The mbr (multi-way branch) instruction transfers control to one of the given labels depending on the value of its source operand. This represents computed goto and switch statements.
The array instruction computes the address of an element in an array given a list of index values. These instructions are eventually expanded to additions and multiplications to perform the necessary pointer arithmetic.

These instructions are much easier to analyze than the equivalent series of low-level instructions.

The following example shows the low-SUIF instructions corresponding to a simple fragment of C code:

x = 0;
y = *ptr;
if (y < x + z) {
    *ptr = x;
}

1: ldc (i.32) x = 0           // load integer constant 0
2: lod (i.32) y = ptr         // load from address in ptr
3: add (i.32) nd#3 = x, z     // add x and z
4: sl  (i.32) nd#4 = y, nd#3  // set if y < x + z
5: bfalse     nd#4, L:L1      // branch if false to label L1
6: str        ptr = x         // store x to address in ptr
7: lab        L:L1            // label L1

Most of the operands in this example are variables; however, the results of the add and sl instructions are temporary values and are not stored in variables. Such temporary values occur frequently, and rather than requiring that new variables be created to hold them, SUIF allows them to be used directly. Each temporary must have a single definition and a single use within the same basic block. In the printed code above, the temporary values are indicated by "node" numbers, using the ID numbers of the instructions that produce the values. Internally, however, the operands contain pointers between the instructions. For example, the bfalse source operand contains a pointer to the sl comparison instruction, and the sl destination operand contains a pointer to the branch instruction. Thus the definition and use of a temporary value are directly connected, making it easy to find one from the other.

Flat lists of instructions work well for many back-end compiler passes, but for high-level transformations expression trees are often a better representation. Thus the SUIF system supports expression trees as well as flat lists. See section Expression Trees. The difference between the two representations is actually quite small. The temporary value pointers described above naturally create trees, except that with flat lists the nodes of the trees are listed in bottom-up order. When using the expression tree representation, the instructions are rearranged so that only the roots of the expression trees are included in the AST instruction nodes. The other instructions are reached through the pointers in the operands. For example, by labeling each subtree (e.g. e1 and e2), the expression tree in the example could be printed as:

5: bfalse e1, L:L1
4:   e1: sl (i.32) y, e2
3:     e2: add (i.32) x, z

In this case, only the branch instruction is directly contained in an instruction node. The sl instruction is reached through the branch's source operand, and the add instruction is contained in the second source operand of the sl instruction.

Go to the first, previous, next, last section, table of contents.