### **Optimal Spilling for CISC Machines with Few Registers**

Andrew Appel
Princeton University

Lal George
Bell Labs

June 20, 2001

#### The Problem

## Few Registers ⇒ Spilling!

| Quantity                                    | Value        |
|---------------------------------------------|--------------|
| Total instructions                          | 163,355      |
| Static number of spill instructions, K = 32 | 84           |
| Static number of spill instructions, K = 8  | 22,123 (14%) |

### **CISC Machines**

## Allow memory operands

Can free the register t in the instruction:

$$t \leftarrow t \oplus s$$

#### by generating:

$$\mathsf{mem}_t \leftarrow \mathsf{mem}_t \ \oplus \ \mathbf{s}$$
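The rewrite above can be sketched in a few lines of Python. This is only an illustration: the `BinOp` record, the `use_memory_operand` helper and the `mem_` naming of spill slots are hypothetical, not part of the authors' compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinOp:
    """Two-address instruction: dst <- dst (op) src."""
    dst: str
    op: str
    src: str


def use_memory_operand(ins: BinOp, spilled: set) -> BinOp:
    """If the destination t is spilled, operate directly on its slot mem_t."""
    if ins.dst in spilled:
        return BinOp(f"mem_{ins.dst}", ins.op, ins.src)
    return ins


def show(ins: BinOp) -> str:
    return f"{ins.dst} <- {ins.dst} {ins.op} {ins.src}"


print(show(use_memory_operand(BinOp("t", "+", "s"), spilled={"t"})))
# prints: mem_t <- mem_t + s
```

No register is needed for `t` in the rewritten instruction, which is the point of the slide.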

### Traditional register allocators:

### Register allocation using integer linear programming

Kent Wilken, T. Kong, D. Goodwin ['96, '98]

# Solve the allocation problem in its entirety!

- Can exhibit long solve times for small programs.
- Published literature is difficult and complex.
- "The performance results are all in the details"

#### Our Approach

Phase I: Optimal Spilling

At a program point, should each variable be in a register or memory?

At most K live variables at any point

Phase II: Register Assignment

Which register?

Guaranteed not to insert spills.

#### Phase I

### **Optimal Spilling**

### Register or memory?

- 0-1 integer linear program.
- Solved optimally and quickly.
- **Modelled in AMPL**
  - No special tuning required
- At most K live variables at any program point.

# Phase I — in more detail

# Pseudo registers and input flowgraph

**Set V of pseudo registers =** $\{v_1, v_2, \ldots\}$

**Set P of program points =** $\{p_1, p_2, \ldots\}$





# Linear programming variables

# Just 4 kinds of LP variables

For all live variables v at each program point p:

| Variable                   | Description                        |
|----------------------------|------------------------------------|
| $\mathrm{inMem}_{p}^{v}$   | at p, v continues in memory        |
| $\mathrm{inReg}_{p}^{v}$   | at p, v continues in a register    |
| $\mathrm{store}_{p}^{v}$   | at p, v must be stored to memory   |
| $\mathrm{load}_{p}^{v}$    | at p, v must be loaded from memory |

Liveness constraint: $\mathrm{load}_p^v + \mathrm{store}_p^v + \mathrm{inReg}_p^v + \mathrm{inMem}_p^v = 1$
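To make the formulation concrete, the sketch below brute-forces the 0-1 choice for a tiny straight-line program instead of calling an ILP solver. It is not the authors' AMPL model. The flow rule, the start-in-memory rule, the capacity rule and the unit costs are assumptions chosen for illustration. Every variable is treated as live at every point, and memory operands are not modelled.

```python
from itertools import product

# Picking exactly one state per (point, variable) is the same as
# load + store + inReg + inMem = 1 with 0-1 variables.
STATES = ("inReg", "inMem", "load", "store")
ARRIVES_IN_REG = {"inReg", "store"}   # v reaches point p in a register
DEPARTS_IN_REG = {"inReg", "load"}    # v leaves point p in a register


def feasible(assign, uses, K):
    for states in assign.values():
        # Assumption: every variable starts out in memory.
        if states[0] in ARRIVES_IN_REG:
            return False
        # Flow: where v leaves p is where it arrives at the next point.
        for s, t in zip(states, states[1:]):
            if (s in DEPARTS_IN_REG) != (t in ARRIVES_IN_REG):
                return False
    for p, used in enumerate(uses):
        # Operands of the instruction after p must be in registers.
        if any(assign[v][p] not in DEPARTS_IN_REG for v in used):
            return False
        # Assumed capacity rule: at most K registers held on arrival
        # at p and on departure from p.
        arriving = sum(assign[v][p] in ARRIVES_IN_REG for v in assign)
        departing = sum(assign[v][p] in DEPARTS_IN_REG for v in assign)
        if arriving > K or departing > K:
            return False
    return True


def solve(variables, uses, K):
    """uses[p] = variables read by the instruction that follows point p."""
    n = len(uses)
    best_cost, best = None, None
    for flat in product(STATES, repeat=len(variables) * n):
        assign = {v: flat[i * n:(i + 1) * n] for i, v in enumerate(variables)}
        if not feasible(assign, uses, K):
            continue
        # Unit cost for every inserted load or store (an assumption).
        cost = sum(s in ("load", "store") for s in flat)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, assign
    return best_cost, best


uses = [{"a"}, {"b"}, {"a"}, {"b"}]
print(solve(("a", "b"), uses, K=1)[0])  # prints 7
print(solve(("a", "b"), uses, K=2)[0])  # prints 2
```

With one register the two variables must be swapped at every point, giving four loads and three stores. With two registers each variable is loaded once and stays put.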

# Using Linear Programming Results

| Program                          | LP Results                         | Final Program                                  |
|----------------------------------|------------------------------------|------------------------------------------------|
| $p_j$                            | $\mathrm{load}_{p_j}^{v} = 1$      | $p_j$: $v \leftarrow \mathrm{mem}_v$           |
| $p_k$: $v \leftarrow v \oplus w$ | $\mathrm{inMem}_{p_k}^{w} = 1$     | $p_k$: $v \leftarrow v \oplus \mathrm{mem}_w$  |
| $p_m$                            | $\mathrm{store}_{p_m}^{v} = 1$     | $p_m$: $\mathrm{mem}_v \leftarrow v$           |
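A minimal sketch of how the LP results might drive the rewrite, assuming a program is a list of `(point, instruction)` pairs and the solver output is the set of LP variables equal to 1. The data layout, the `rewrite` function and the choice to emit stores before loads at a point are all hypothetical.

```python
def rewrite(program, ones):
    """program: list of (point, (dst, src)) or (point, None) for a bare point.
    ones: set of (kind, point, var) triples whose LP variable equals 1."""
    out = []
    for point, ins in program:
        # Spill code is emitted at the program point, before the
        # instruction that follows it. Stores first is an assumption.
        for kind in ("store", "load"):
            for k, p, var in sorted(ones):
                if k == kind and p == point:
                    if kind == "store":
                        out.append(f"mem_{var} <- {var}")
                    else:
                        out.append(f"{var} <- mem_{var}")
        if ins is not None:
            dst, src = ins
            # A source that stays in memory becomes a memory operand.
            if ("inMem", point, src) in ones:
                src = f"mem_{src}"
            out.append(f"{dst} <- {dst} (+) {src}")
    return out


program = [("p_j", None), ("p_k", ("v", "w")), ("p_m", None)]
ones = {("load", "p_j", "v"), ("inMem", "p_k", "w"), ("store", "p_m", "v")}
for line in rewrite(program, ones):
    print(line)
# v <- mem_v
# v <- v (+) mem_w
# mem_v <- v
```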

## **Compiler Organization**



#### Constraints

$$\mathrm{inReg}_{p_1}^{v} + \mathrm{load}_{p_1}^{v} = 1$$

$$\mathrm{inReg}_{p_2}^{v} + \mathrm{store}_{p_2}^{v} = 1$$

For the instruction between program points $p_1$ and $p_2$:

$$v \leftarrow \mathrm{mem}[w + x * 4 + 128]$$

$$\mathrm{inReg}_{p_1}^{w} + \mathrm{load}_{p_1}^{w} = 1$$

$$\mathrm{inReg}_{p_1}^{x} + \mathrm{load}_{p_1}^{x} = 1$$

$$\mathrm{inReg}_{p_2}^{v} + \mathrm{store}_{p_2}^{v} = 1$$
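The pattern in these constraints (operands in a register before the instruction, result in a register after it) can be generated mechanically. The helper below is a hypothetical sketch that emits the constraints as plain strings. The bracketed `inReg[p,v]` spelling is an assumption and is not taken from the authors' AMPL model.

```python
def operand_constraints(uses, defs, before="p1", after="p2"):
    """Constraints for one instruction sitting between two program points."""
    # Each used variable is either already in a register or loaded at `before`.
    cons = [f"inReg[{before},{u}] + load[{before},{u}] = 1" for u in uses]
    # Each defined variable stays in a register or is stored at `after`.
    cons += [f"inReg[{after},{d}] + store[{after},{d}] = 1" for d in defs]
    return cons


# v <- mem[w + x*4 + 128]: uses w and x, defines v
for c in operand_constraints(uses=["w", "x"], defs=["v"]):
    print(c)
# inReg[p1,w] + load[p1,w] = 1
# inReg[p1,x] + load[p1,x] = 1
# inReg[p2,v] + store[p2,v] = 1
```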

#### **Splits**



$$\mathrm{load}_{p_1}^{w} + \mathrm{inReg}_{p_1}^{w} = \mathrm{store}_{p_2}^{v} + \mathrm{inReg}_{p_2}^{v}$$

$$\mathrm{inMem}_{p_1}^{w} + \mathrm{store}_{p_1}^{w}$$



### Coloring constraint



# Objective cost function

### Minimize the weighted cost of:

- inserted loads and stores, from $\mathrm{load}_p^v$ and $\mathrm{store}_p^v$,
- the penalty for using memory operands,

where

$$t \leftarrow t \oplus s$$

is rewritten to

$$\mathrm{mem}_t \leftarrow \mathrm{mem}_t \oplus s$$
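One way such an objective could be written is shown below. The symbols are assumptions introduced here, not the authors' notation: $w_p$ is a weight for program point $p$ (for example an estimated execution frequency), $c_{\mathrm{load}}$, $c_{\mathrm{store}}$ and $c_{\mathrm{memop}}$ are per-operation costs, and $U$ is the set of pairs $(p, t)$ where the instruction at $p$ may take $t$ as a memory operand.

$$\min \; \sum_{p \in P} \sum_{v \in V} w_p \left( c_{\mathrm{load}}\,\mathrm{load}_p^v + c_{\mathrm{store}}\,\mathrm{store}_p^v \right) \; + \sum_{(p,t) \in U} w_p \, c_{\mathrm{memop}} \, \mathrm{inMem}_p^t$$

The first sum charges for inserted spill code. The second charges each time an operand is left in memory and used in place.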

# Phase II: In more detail

Which register?

# Should be done without further spilling!

However · · ·

no more than K live variables at any point!

## Register Assignment

# Aggregate register pressure is insufficient!!

Suppose K = 2

Live variables

| Point | Live   |
|-------|--------|
| $p_i$ | {X, Y} |
| $p_j$ | {Y, Z} |
| $p_k$ | {Z, X} |

Interference graph
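The example can be checked mechanically. The sketch below builds the interference graph from the live sets and tests K-colorability by exhaustive search. The function names are hypothetical, and brute force is used only because the graph is tiny.

```python
from itertools import combinations, product

live = {"p_i": {"X", "Y"}, "p_j": {"Y", "Z"}, "p_k": {"Z", "X"}}


def interference_edges(live_sets):
    """Two variables interfere if they are live at the same point."""
    edges = set()
    for variables in live_sets.values():
        for a, b in combinations(sorted(variables), 2):
            edges.add((a, b))
    return edges


def colorable(nodes, edges, k):
    """Try every assignment of k colors (registers) to the nodes."""
    for colors in product(range(k), repeat=len(nodes)):
        color = dict(zip(nodes, colors))
        if all(color[a] != color[b] for a, b in edges):
            return True
    return False


edges = interference_edges(live)
nodes = sorted({v for s in live.values() for v in s})

print(max(len(s) for s in live.values()))  # prints 2: pressure never exceeds K
print(sorted(edges))                       # prints [('X', 'Y'), ('X', 'Z'), ('Y', 'Z')]
print(colorable(nodes, edges, 2))          # prints False: a triangle needs 3 colors
print(colorable(nodes, edges, 3))          # prints True
```

At most two variables are live at any point, yet the three live ranges form a triangle, so two registers do not suffice without splitting a live range.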



## Optimistic coalescing

- Insert parallel copies before every instruction.
- The resulting interference graph is guaranteed to be colorable.
- The paper defines the Optimal Coalescing Problem.
- Our solution is based on Park and Moon's optimistic coalescing.
- Details are in the paper.





#### percent spill-related instructions





### **Execution Speed**

| Benchmark    | Base (s) | Opt (s) | Speedup (%) |
|--------------|----------|---------|-------------|
| simple       | 31.53    | 25.12   | 25.5        |
| life         | 19.03    | 15.24   | 24.9        |
| mandelbrot   | 27.92    | 23.21   | 20.3        |
| knuth-bendix | 8.08     | 7.22    | 11.9        |
| logic        | 5.10     | 4.61    | 10.6        |
| fft          | 8.58     | 7.80    | 10.0        |
| icfp00       | 109.29   | 99.72   | 9.6         |
| count-graphs | 24.07    | 22.15   | 8.7         |
| lexgen       | 9.08     | 8.84    | 2.7         |
| tsp          | 6.92     | 6.77    | 2.2         |
| mlyacc       | 9.14     | 9.11    | 0.0         |
| boyer        | 12.57    | 12.49   | 0.0         |
| barnes-hut   | 2.92     | 2.92    | 0.0         |

### Contributions

- A two-phase approach
- An integer linear program that:
  - Is extremely simple
  - Requires no fine tuning
- Effective!
  - Production-quality compilers.
- Highly dependent on good integer LP solvers.