BAMBI compiler

We chose to have a compiler that translates C programs into BAMBI assembly. To do this we ported the GNU C Compiler suite (GCC) to our architecture. GCC is already able to compile programs for around forty other processor architectures so it has a fairly well defined interface to add support for new architectures. We used the PowerPC and the SPUR ports as examples in the porting process. There are two main files which define each architecture to GCC. The first is a machine description file (.md) which describes to the compiler in basic terms what every instruction does, and sometimes tells the compiler how to combine instruction sequences to perform higher level operations. The second file is a C header which describes the machine environment and attributes (word width, byte ordering, available registers, etc..) to the compiler.

Function Emulation

GCC requires several operations that the BAMBI architecture does not define, most notable multiply and divide (__mulsi3 and __divsi3.) These were provided with some functions borrowed from a flavor of the MIPS port intended for embedded processors.

Files

To port GCC to the bambi architecture the following files were added to the standard binutils-2.9.1 directory:

RTL and the .md file

GCC separates C compilation from assembly output using a language called RTL. The output of the C parser is a description of the program in RTL. The RTL program is then converted to a particular machine's architecture using the RTL descriptions provided by the port, these translations exist in the file ece554.md. Compilation from C to RTL is often affected by which RTL operators are provided by the machine description, or how efficient any given one might be. Most of the optimization the compiler does is done on the RTL form of the program. This allows us to have not only a C compiler, but a fairly advanced _optimizing_ compiler without having to write one from scratch.

RTL looks like this:
  (set 
      (match_operand:SI 0 "register_operand" "=r")
      (not:SI 
             (match_operand:SI 1 "register_operand" "r")))

This particular statement says "set register operand 0 to the logical inverse of register operand 1." RTL is defined in terms of registers: the movement and operations that can be performed between them. Registers here are abstract notions, so can also be memory locations.

GCC has a certain minimal set of instructions which must be contained in a machine description. Some of these instructions, such as the one's compliment instruction shown above, must be written as a compound of other instructions. The full entry in the ece554.md file to describe how to generate ones compliment assembly is this:

;
; bitwise inversion of reg 1
;
(define_insn "one_cmplsi2"
  [(set (match_operand:SI 0 "register_operand" "=r")
	(not:SI 
               (match_operand:SI 1 "register_operand" "r")))]
  ""
  "sub %0,r0,%1\;subi %0,%0,1")

A line by line explanation of what this says:


Tell the compiler we are defining one of its needed functions: "one_cmplsi2"
(define_insn "one_cmplsi2"
The RTL for this instruction is contained in [square brackets]
The first line says "set the 0th operand, which must be a register one word wide (SI)"
  [(set (match_operand:SI 0 "register_operand" "=r")
The value which it is being set to is the result of a 'not' operation which returns a word-wide value (SI)
	(not:SI 
The not is performed on operand #1 (the second operand) which also must be in a register and be one word wide. This is the end of the RTL description.
               (match_operand:SI 1 "register_operand" "r")))]
Then there is a field which allows us to disable this instruction in case we are compiling for a variant of the architecture which does not have this instruction. We leave this blank because there is only one variant of the BAMBI architecture.
  ""
And finally a free form printf-like string to tell the compiler how to write the instruction in assembly. When this is actually output, the fields "%0" and "%1" will be replaced with the correct register names (the names of operands 0 and 1) by the compiler. The "\;" indicates the beginning of a new line of code.
  "sub %0,r0,%1\;subi %0,%0,1")
(Note: This is the fastest way to do a bitwise inversion of an entire register in BAMBI because xori does not sign extend its immediate.)

Compilation Example

The following C function demonstrates simple use of the ones complement operation that has been described:
int ones_complement(int dummy, int a)
{
    return ~a;
}
Functions are defined to pass variables in registers 3-12, and their return value always goes in register 3. The dummy vairable is used here to force the incoming "operand 0" to be in register 4, while the return value must go into register 3. Here is the assembly generated for this function:
        .align 2        # force function start address to be 
                        # aligned on 2^2 address boundary
.globl _ones_complement
_ones_complement:       # label for the function start address
        sw r30,4(r1)    # save the return address in the caller's stack
        subi r1,r1,12   # stack: size: 0b nreg: 3
        sw r31,8(r1)    # saving live
        addi r31,r1,8   # set up frame pointer
        sw r31,0(r1)    # set pseudo back-chain
        sub r3,r0,r4    # 11   one_cmplsi2
        subi r3,r3,1
        lw r31,8(r1)    # restore frame pointer
        addi r1,r1,12   # deallocate stack
        lw r30,4(r1)    # restore link reg
        jalr r30

Known Bugs

Possible Compiler Optimizations

Indispensible Resources

configuration

mkdir build.gcc
cd build.gcc
../gcc-2.95.2/configure --target=ece554-unknown-elf --prefix=/usr/home/tesch/ece554 --with-gnu-as --with-gnu-ld --nfp --enable-languages=c --with-headers=/usr/home/tesch/ece554/src/newlib-1.9.0/newlib/libc/include
Last modified: Sun May 13 01:49:10 CDT 2001