The Assembly Process

A computer understands machine code.

People (and compilers) write assembly language.

  assembly     -----------------       machine
  source  -->  |  assembler    | -->   code
  code         -----------------

An assembler is a program (a very deterministic program). It translates each instruction to its machine code.

In the past, there was a one-to-one correspondence between assembly language instructions and machine language instructions.

This is no longer the case. Assemblers are nowadays more powerful, and can "rework" code, doing further translation on assembly language code in a manner that would once have been done only by a compiler.

The Translation of MAL to TAL

There are lots of MAL instructions that have no direct TAL equivalent. They will be translated (composed, synthesized) into one or more TAL instructions. The MAL instructions that need to be translated are often called pseudoinstructions.

How to determine whether an instruction is a TAL instruction or not: look in the list of TAL instructions. If the instruction is there, then it is a TAL instruction!

The assembler takes the MAL instructions that are not MIPS instructions (the pseudoinstructions) and synthesizes each of them with 1 or more MIPS instructions. Here are a bunch of examples.

Multiplication and Division Instructions

    mul $8, $17, $20
becomes
        mult  $17, $20
        mflo  $8

Why? 32-bit multiplication produces a 64-bit result. To deal with this larger result, the MIPS architecture has 2 registers that hold results for integer multiplication and division. They are called HI and LO. Each is a 32 bit register.

mult places the least significant 32 bits of its result into LO, and the most significant 32 bits into HI. Note that the synthesis of mul above copies only LO, so it gives an incorrect product in the case that more than 32 bits are required to represent the correct product.
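As a rough illustration of how the 64-bit product is split between HI and LO, here is a small Python sketch. The names to_signed32 and mult_hi_lo are made up for this example, and the sketch assumes mult treats both operands as signed 32-bit two's complement values.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mult_hi_lo(rs, rt):
    """Model of mult: return (HI, LO) as unsigned 32-bit values."""
    product = to_signed32(rs) * to_signed32(rt)   # exact in Python
    product &= 0xFFFFFFFFFFFFFFFF                 # keep 64 bits
    return (product >> 32) & MASK32, product & MASK32

hi, lo = mult_hi_lo(6, 7)
print(hex(hi), hex(lo))          # small product: everything is in LO

hi, lo = mult_hi_lo(0x10000, 0x10000)
print(hex(hi), hex(lo))          # product does not fit in 32 bits: LO alone is wrong
```

In the second call the product is 2 to the 32nd power, so a mul that copies only LO would deliver 0.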

Then, more TAL instructions are needed to move data into or out of registers HI and LO:

    operation of mflo,  mtlo,  mfhi,  mthi
                 |||                  |||
                 ||-- register lo     ||- register hi
                 |--- from            |-- to
                 ---- move            --- move

Data is moved into or out of register HI or LO.

One operand is needed to tell where the data is coming from or going to.

Integer division also uses registers HI and LO, since it generates both a quotient and a remainder as a result.

  div $rd, $rs, $rt     # MAL
becomes
      div  $rs, $rt     # TAL
      mflo $rd          # quotient in register LO
and
  rem $rd, $rs, $rt     # MAL
becomes
      div  $rs, $rt     # TAL
      mfhi $rd          # remainder in register HI
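A minimal Python sketch of what ends up in HI and LO after a div. The function name div_hi_lo is hypothetical. The sketch assumes the quotient is truncated toward zero (so the remainder takes the sign of the dividend), and it does not model division by zero.

```python
def div_hi_lo(rs, rt):
    """Model of div on signed integers: return (HI, LO) = (remainder, quotient).

    Assumption: the quotient is truncated toward zero.
    """
    quotient = abs(rs) // abs(rt)
    if (rs < 0) != (rt < 0):
        quotient = -quotient
    remainder = rs - quotient * rt
    return remainder, quotient

hi, lo = div_hi_lo(17, 5)
print("quotient (mflo):", lo)
print("remainder (mfhi):", hi)
```

The MAL div copies LO into $rd, and the MAL rem copies HI into $rd, after the same TAL div.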

Load and Store Instructions

    lw  $8, label
becomes
        la  $8, label
        lw $8, 0($8)
which becomes
        lui $8, 0xMSpart of label      # label represents an address
        ori $8, $8, 0xLSpart of label
        lw $8, 0($8)
or
        lui $8, 0xMSpart of label
        lw $8, 0xLSpart of label($8)
Note that this 2-instruction sequence only works if the most significant bit of the LSpart of label is a 0, because lw sign-extends its 16-bit offset before adding it to the base register.
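To see why the most significant bit of the LSpart matters, here is a short Python sketch of the address that the lui/lw pair actually reaches. The function names are invented for this illustration, and the second address is just a made-up value whose LSpart has its top bit set.

```python
def sign_extend16(value):
    """Sign extend a 16-bit field, as lw does with its offset."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def lui_lw_address(address):
    """Address accessed by:  lui r, MSpart  followed by  lw r, LSpart(r)."""
    upper = (address >> 16) & 0xFFFF
    lower = address & 0xFFFF
    return ((upper << 16) + sign_extend16(lower)) & 0xFFFFFFFF

print(hex(lui_lw_address(0xAABB00CC)))   # LSpart 0x00cc, top bit 0: address is right
print(hex(lui_lw_address(0xAABB80CC)))   # LSpart 0x80cc, top bit 1: off by 0x10000
```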

The la instruction is also a pseudoinstruction (MAL, but not TAL). Its synthesis is accomplished with the 2 instruction sequence of lui followed by ori as given above. The lui instruction places the most significant 16 bits of the desired address into a register, and the ori sets the least significant 16 bits of the register. For example, assume that the label X has been assigned by the assembler to be the address 0xaabb00cc. The MAL instruction

    la  $12, X
becomes
        lui $12, 0xaabb
        ori $12, $12, 0x00cc
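The split of an address into its two 16-bit halves can be sketched in Python. The helper synthesize_la is a made-up name, and it returns the TAL instructions as text rather than as machine code.

```python
def synthesize_la(reg, address):
    """Return the lui/ori pair that builds a 32-bit address in register reg."""
    upper = (address >> 16) & 0xFFFF   # most significant 16 bits
    lower = address & 0xFFFF           # least significant 16 bits
    return [
        f"lui ${reg}, 0x{upper:04x}",
        f"ori ${reg}, ${reg}, 0x{lower:04x}",
    ]

for line in synthesize_la(12, 0xAABB00CC):
    print(line)
```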

A store instruction which implies the use of a la pseudoinstruction in its synthesis needs to place the address in a register. For example, consider the code

    sw  $12, X

This synthesis may not use register $12 as the place to temporarily hold the address of X as the above example for the lw did. Using $12 would overwrite the value that is to be stored to memory. In this case, and other cases like this, the assembler requires the use of an extra register to complete the synthesis. Register $1 on the MIPS processor is set aside (by convention) for exactly this type of situation. The synthesis for this sw example becomes

        lui $1, 0xaabb
        ori $1, $1, 0x00cc
	sw  $12, 0($1)
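The choice of the register that holds the address can be sketched as follows. This is only an illustration in Python, with a hypothetical function name, and it produces the TAL code as text.

```python
def synthesize_label_access(op, reg, address):
    """Expand 'lw $reg, label' or 'sw $reg, label' into TAL (as text)."""
    if op not in ("lw", "sw"):
        raise ValueError("only lw and sw are modeled here")
    # A load may build the address in its own destination register.
    # A store must not overwrite the value being stored, so it uses $1.
    base = reg if op == "lw" else 1
    upper = (address >> 16) & 0xFFFF
    lower = address & 0xFFFF
    return [
        f"lui ${base}, 0x{upper:04x}",
        f"ori ${base}, ${base}, 0x{lower:04x}",
        f"{op} ${reg}, 0(${base})",
    ]

for line in synthesize_label_access("sw", 12, 0xAABB00CC):
    print(line)
```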

Instructions with Immediates

Instructions with immediates are synthesized with instructions that must have an immediate value as the last operand.

    add $sp, $sp, 4
becomes
    addi $sp, $sp, 4

An add instruction requires 3 operands in registers. addi has one operand that must be an immediate.

These instructions are classified as immediate instructions. On the MIPS, they include: addi, addiu, andi, lui, ori, xori.

Instructions with Too Few Operands

 add $12, $18
is expanded back out to be
   add $12, $12, $18
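The two rewrites above (picking the immediate form, and filling in a missing operand) could be sketched like this in Python. The table IMMEDIATE_FORM and the function normalize are hypothetical names for this example, and no range check is done on the immediate.

```python
# Hypothetical table: register-form mnemonic -> immediate-form mnemonic.
IMMEDIATE_FORM = {"add": "addi", "addu": "addiu", "and": "andi",
                  "or": "ori", "xor": "xori"}

def normalize(mnemonic, operands):
    """Rewrite a MAL arithmetic/logical instruction into 3-operand TAL form.

    operands is a list of strings such as ["$sp", "$sp", "4"].
    """
    operands = list(operands)
    if len(operands) == 2:                 # add $12, $18 -> add $12, $12, $18
        operands.insert(1, operands[0])
    if not operands[-1].startswith("$"):   # last operand is an immediate
        mnemonic = IMMEDIATE_FORM[mnemonic]
    return mnemonic, operands

print(normalize("add", ["$sp", "$sp", "4"]))
print(normalize("add", ["$12", "$18"]))
```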

I/O Instructions

putc $18
becomes
   li $2, 11         # MAL
   move $4, $18      # MAL
   syscall
which becomes
          addi $2, $0, 11
	  add  $4, $18, $0
	  syscall

getc $11
becomes
   li $2, 12
   syscall
   move $11, $2
which becomes
          addi $2, $0, 12
	  syscall
	  add  $11, $2, $0

puts $13
becomes
   li $2, 4
   move $4, $13
   syscall
which becomes
          addi $2, $0, 4
	  add  $4, $13, $0
	  syscall

done
becomes
   li  $2, 10
   syscall
which becomes
          addi $2, $0, 10
	  syscall
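All four I/O expansions follow the same pattern, which a table-driven Python sketch can capture. The syscall codes are the ones used in the listings above. The names SYSCALL_CODE and expand_io are made up for this example.

```python
# syscall codes taken from the expansions above
SYSCALL_CODE = {"putc": 11, "getc": 12, "puts": 4, "done": 10}

def expand_io(mnemonic, reg=None):
    """Expand a MAL I/O instruction into TAL text, following the listings above."""
    code = SYSCALL_CODE[mnemonic]
    tal = [f"addi $2, $0, {code}"]            # li $2, code
    if mnemonic in ("putc", "puts"):
        tal.append(f"add  $4, ${reg}, $0")    # move $4, $reg (the argument)
    tal.append("syscall")
    if mnemonic == "getc":
        tal.append(f"add  ${reg}, $2, $0")    # move $reg, $2 (the result)
    return tal

for line in expand_io("getc", 11):
    print(line)
```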

Assembly

The assembler's job is to

  1. assign addresses
  2. generate machine code

A modern assembler will

A simple assembler will make 2 complete passes over the source code to complete this task.
Pass 1: create the complete symbol table, and generate machine code for instructions other than branches, jumps, jal, la, etc. (those instructions that rely on an address for their machine code).
Pass 2: complete the machine code for instructions that did not get finished in pass 1.

A symbol table is a table, listing address assignments (made by the assembler) for all labels.

The assembler starts at the top of the source code program, and scans. It looks for

An important detail: there are separate memory spaces for data and instructions. The assembler allocates each in sequential order as it scans through the source code program.

The starting addresses are fixed -- any program will be assembled to have data and instructions that start at the same, fixed address.

EXAMPLE (given in little endian order)


    .data
a1: .word 3
a2: .byte '\n'
a3: .space 5

       address     contents
     0x00001000    0x00000003
     0x00001004    0x??????0a
     0x00001008    0x????????
     0x0000100c    0x????????  (the 3 MSbytes are not part of the declaration)

Note: Our assembler (in the 354 simulator) will align data to word addresses unless you specify otherwise!
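The address assignment in this example can be imitated with a few lines of Python. This is only a sketch: the function names are hypothetical, the starting address 0x00001000 is the one used in the example, and every declaration is assumed to start at a word address, as the note says.

```python
def align_up(address, boundary=4):
    """Round address up to the next multiple of boundary."""
    return (address + boundary - 1) // boundary * boundary

def allocate(declarations, start=0x00001000):
    """Assign an address to each (label, size_in_bytes) declaration.

    Every item is placed at a word boundary, mirroring the note above.
    """
    addresses = {}
    location = start
    for label, size in declarations:
        location = align_up(location)
        addresses[label] = location
        location += size
    return addresses

# a1: .word 3    a2: .byte '\n'    a3: .space 5
layout = allocate([("a1", 4), ("a2", 1), ("a3", 5)])
for label, address in layout.items():
    print(label, hex(address))
```

The 5 bytes of a3 start at 0x00001008, so only one byte of the word at 0x0000100c belongs to the declaration.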

Machine Code Generation

Simple example of machine code generation for a simple instruction:

     assembly language:      addi  $8, $20, 15

                              ^     ^   ^    ^
			      |     |   |    |

			    opcode rt   rs  immediate

     machine code format
      31                      15             0
      -----------------------------------------
      | opcode |  rs  |  rt  |  immediate     |
      -----------------------------------------

       opcode is 6 bits -- it is defined to be 001000

       rs is 5 bits,    encoding of 20, 10100
       rt is 5 bits,    encoding of  8, 01000
			     
      so, the 32-bit instruction for addi $8, $20, 15  is
       001000 10100 01000 0000000000001111

       re-spaced:
       0010 0010 1000 1000 0000 0000 0000 1111
	 OR
     0x  2    2   8    8    0    0    0    f
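The packing of the fields can be checked with a short Python sketch. The function encode_i_type is a name invented here, and the opcode is the one given above for addi.

```python
def encode_i_type(opcode, rs, rt, immediate):
    """Pack the four fields of the format above into one 32-bit word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

ADDI = 0b001000

# addi $8, $20, 15   (rt is $8, rs is $20)
word = encode_i_type(ADDI, rs=20, rt=8, immediate=15)
print(f"0x{word:08x}")    # 0x2288000f
```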

A Detailed MIPS R2000 Assembly Example

The Source Code:


 .data
a1: .word 3
a2: .word 16:4
a3: .word 5

 .text
__start: la $6, a2              # MAL code fragment
loop:    lw $7, 4($6)
         mult $9, $10
         b loop
         done

The Symbol Table:

    symbol      address
    ---------------------
    a1         0040 0000
    a2         0040 0004
    a3         0040 0014
    __start    0080 0000
    loop       0080 0008
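A first pass that produces this symbol table might look like the following Python sketch. It is a simplification: the source is given as already-parsed lists, the table TAL_COUNT (how many TAL instructions each mnemonic of this example turns into) is written by hand, and all names are hypothetical.

```python
DATA_START = 0x00400000   # starting addresses used in this example
TEXT_START = 0x00800000

# Number of TAL instructions each MAL mnemonic in this example expands to.
TAL_COUNT = {"la": 2, "lw": 1, "mult": 1, "b": 1, "done": 2}

def pass_one(data, text):
    """Build the symbol table.

    data: list of (label, size_in_bytes)
    text: list of (label_or_None, mnemonic)
    """
    symbols = {}
    location = DATA_START
    for label, size in data:
        symbols[label] = location
        location += size
    location = TEXT_START
    for label, mnemonic in text:
        if label is not None:
            symbols[label] = location
        location += 4 * TAL_COUNT[mnemonic]
    return symbols

data = [("a1", 4), ("a2", 4 * 4), ("a3", 4)]      # .word 16:4 is 4 words
text = [("__start", "la"), ("loop", "lw"), (None, "mult"),
        (None, "b"), (None, "done")]
for name, address in pass_one(data, text).items():
    print(f"{name:8s} {address:08x}")
```

Note that loop lands at 0080 0008 because the la before it occupies two TAL instructions.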

Memory Map of the Data Section:

address     contents
	    hex          binary
0040 0000   0000 0003    0000 0000 0000 0000 0000 0000 0000 0011 
0040 0004   0000 0010    0000 0000 0000 0000 0000 0000 0001 0000
0040 0008   0000 0010    0000 0000 0000 0000 0000 0000 0001 0000
0040 000c   0000 0010    0000 0000 0000 0000 0000 0000 0001 0000
0040 0010   0000 0010    0000 0000 0000 0000 0000 0000 0001 0000
0040 0014   0000 0005    0000 0000 0000 0000 0000 0000 0000 0101

Translation to TAL Code:


 .text
__start: lui $6, 0x0040      # la $6, a2
         ori $6, $6, 0x0004
loop:    lw $7, 4($6)
         mult $9, $10
         beq $0, $0, loop    # b loop
         ori $2, $0, 10      # done
         syscall

Memory Map of the Text Section:

address      contents
	     hex          binary
0080 0000    3c06 0040    0011 1100 0000 0110 0000 0000 0100 0000 (lui)
0080 0004    34c6 0004    0011 0100 1100 0110 0000 0000 0000 0100 (ori)
0080 0008    8cc7 0004    1000 1100 1100 0111 0000 0000 0000 0100 (lw)
0080 000c    012a 0018    0000 0001 0010 1010 0000 0000 0001 1000 (mult)
0080 0010    1000 fffd    0001 0000 0000 0000 1111 1111 1111 1101 (beq)
0080 0014    3402 000a    0011 0100 0000 0010 0000 0000 0000 1010 (ori)
0080 0018    0000 000c    0000 0000 0000 0000 0000 0000 0000 1100 (syscall)
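One way to check a row of this memory map is to pull the fields back out of the hexadecimal word. The Python sketch below does that for the common field positions. The function name decode_fields and the key low16 are invented for this example.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into opcode, rs, rt and its low 16 bits."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "low16": word & 0xFFFF,
    }

print(decode_fields(0x8CC70004))   # the lw at 0080 0008: rs is 6, rt is 7, offset is 4
print(decode_fields(0x012A0018))   # the mult at 0080 000c: rs is 9, rt is 10
```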

The Process of Assembly:

The assembler starts at the beginning of the ASCII source code. It scans for tokens, and takes action based on those tokens.

Branch Offset Computation

At execution time (for a taken branch):

     contents of PC + sign extended offset field | 00 --> PC
    

The PC points to the instruction after the beq when the offset is added.

At assembly time: (for the displacement or offset field of the beq in the above example)

    byte offset = target addr - ( 4 + beq addr )

		= 00800008 - ( 00000004 + 00800010 )  (hex)



                    (ordered to give POSITIVE result)
		 0000 0000 1000 0000 0000 0000 0001 0100
	      -  0000 0000 1000 0000 0000 0000 0000 1000
	      ------------------------------------------
		 0000 0000 0000 0000 0000 0000 0000 1100 (byte offset)

		    (compute the additive inverse)
		 1111 1111 1111 1111 1111 1111 1111 0011
	       +                                       1
	       -----------------------------------------
		 1111 1111 1111 1111 1111 1111 1111 0100  (-12)


		 we have 16 bit offset field.
		 throw away least significant 2 bits
		   (they should always be 0, and they are added
		    back at execution time)

	 1111 1111 1111 1111 1111 1111 1111 0100 (byte offset)
	  becomes
	                  11 1111 1111 1111 01   (offset field)
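The same computation, written as a Python sketch. Both function names are made up. The first is the assembly-time calculation of the offset field, and the second is the execution-time calculation that recovers the target.

```python
def branch_offset_field(target_addr, branch_addr):
    """Return the 16-bit offset field for a branch located at branch_addr."""
    byte_offset = target_addr - (branch_addr + 4)
    if byte_offset % 4 != 0:
        raise ValueError("target is not a whole number of words away")
    word_offset = byte_offset >> 2          # drop the 2 low bits, keep the sign
    if not -(1 << 15) <= word_offset < (1 << 15):
        raise ValueError("target is out of range for a 16-bit offset")
    return word_offset & 0xFFFF             # 16-bit two's complement

def branch_target(branch_addr, offset_field):
    """Execution-time view: rebuild the target address from the offset field."""
    if offset_field & 0x8000:               # sign extend
        offset_field -= 1 << 16
    return (branch_addr + 4) + (offset_field << 2)

field = branch_offset_field(0x00800008, 0x00800010)
print(f"{field:04x}")                                 # fffd
print(f"{branch_target(0x00800010, field):08x}")      # 00800008
```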

Jump Target Computation

At execution time:

     most significant 4 bits of PC || target field | 00 --> PC
					(26 bits)

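At assembly time, to get the target field: take the 32-bit target address, drop the 2 least significant bits (they are always 00 for an instruction address, and they are added back at execution time), and drop the 4 most significant bits (they are supplied by the PC at execution time).

What remains is 26 bits, and it goes in the target field.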

An example of machine code generated for a jump instruction:


      .
      .
      .
      j   L2
      .
      .
  L2: # another instruction here

Assume that the j instruction is to be placed at address 0x0100acc0
Assume that the assembler assigns address 0x0100ff04 for label L2

Then, when the assembler is generating machine code for the j instruction,

  1. The assembler checks that the most significant 4 bits of the address of the jump instruction are the same as the most significant 4 bits of the address for the target (L2).
    	    instruction address        0000 0001 0000 0000 (m.s. 16 bits)
    	    L2 address                 0000 0001 0000 0000 (m.s. 16 bits)
    	                               ^^^^
    
    These 4 bits ARE the same, so proceed.
  2. Extract bits 27..2 of the target address for the machine code.
    	    L2  0000 0001 0000 0000 1111 1111 0000 0100
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    
  3. The machine code for the j instruction:
              000010     0001 0000 0000 1111 1111 0000 01
    	  op code       26-bit partial address
    
    	  Given in hexadecimal:
              0000 1000 0100 0000 0011 1111 1100 0001
    	  0x 0    8    4    0    3    f    c    1
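The three steps can be put into a short Python sketch. The name encode_j is hypothetical, and the opcode 000010 is the one shown in step 3.

```python
J_OPCODE = 0b000010

def encode_j(jump_addr, target_addr):
    """Return the 32-bit machine code for 'j target' placed at jump_addr."""
    # Step 1: the most significant 4 bits must match.
    if (jump_addr >> 28) != (target_addr >> 28):
        raise ValueError("most significant 4 bits differ; j cannot reach the target")
    # Step 2: extract bits 27..2 of the target address.
    target_field = (target_addr >> 2) & 0x03FFFFFF
    # Step 3: op code followed by the 26-bit partial address.
    return (J_OPCODE << 26) | target_field

word = encode_j(0x0100ACC0, 0x0100FF04)
print(f"0x{word:08x}")    # 0x08403fc1
```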
    

In the first step, if the address of the jump instruction and the target address differ in their 4 most significant bits, then the assembler must translate to different TAL code.

One possible translation:

     j  L3     # assume j will be placed at address 0x0400 0088
     .
     .
     .
  L3:          # assume L3 is at address 0xab00 0040
becomes
      la   $1, L3
      jr   $1
which in TAL, would be
          lui  $1, 0xab00
	  ori  $1, $1, 0x0040
	  jr   $1
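A sketch of how an assembler might choose between the two translations, in Python. The function name is made up, the TAL code is produced as text, and the plain j is written with a numeric address only for illustration.

```python
def translate_jump(jump_addr, target_addr):
    """Return TAL text for 'j target', using lui/ori/jr when j cannot reach."""
    if (jump_addr >> 28) == (target_addr >> 28):
        return [f"j 0x{target_addr:08x}"]
    upper = (target_addr >> 16) & 0xFFFF
    lower = target_addr & 0xFFFF
    return [
        f"lui  $1, 0x{upper:04x}",
        f"ori  $1, $1, 0x{lower:04x}",
        "jr   $1",
    ]

for line in translate_jump(0x04000088, 0xAB000040):
    print(line)
```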

Copyright © Karen Miller, 2009