Procedures and Functions

Quick terminology:
The terms function, procedure, and subroutine are all used (incorrectly) as synonyms in this set of notes.

Why have functions?

reuse of code simplifies program writing
modular code facilitates modification
allows different programmers to write different parts of the same program

Assembly languages typically provide little or no support for function implementation.

So, we get to build a mechanism for implementing functions out of what we already know.

First, some terms and what we need.

In Pascal:


      begin
	.
	.
	.
       x := larger(a, b);      CALL
	.
	.
	.
      end.
                  HEADER     PARAMETERS
      function larger (one, two: integer): integer;
      begin
	 if ( one > two ) then
	   larger := one             BODY
	 else
	   larger := two
      end;

In C:


      {
	.
	.
	.
       x = larger(a, b);      CALL
	.
	.
	.
      }
               HEADER     PARAMETERS
      int larger (int one, int two)
      {
	 if ( one > two )
	   return one;               BODY
	 else
	   return two;
      }

Steps in the execution of the function:

save return address
function call
execute function
return

What is the return address? The address of the instruction following the call.

What is a function call? A jump or branch to the first instruction in the function.

What is a return? A jump or branch to the return address.

not-quite-right MAL implementation of procedure call:


	   la  $8, rtn1
	   b proc1                 # one procedure call
     rtn1: # next instruction here
	    .
	    .
	    .
	   la  $8, rtn2
	   b proc1                 # another call location
     rtn2: # next instruction here

	    .
	    .
	    .

   proc1:    # 1st instruction of procedure here
	     .
	     .
	     .
	     jr $8

jr (jump register) is a new instruction -- it does an unconditional branch (jump, actually) to the address contained in the register specified.

The MIPS R2000 architecture (MAL) provides a convenient instruction for calls.


	    jal  procname

It does 2 things

it places the address of the instruction following it into register $ra ($31). (The choice of 31 is arbitrary, but fixed.)
it branches (jumps) to the address given by the label (procname).

the example re-written:


             jal proc1   # use of $ra is implied
	     .
	     .
	     .
	     jal proc1
	     .
	     .
	     .


   proc1:    # 1st instruction of procedure here
	     .
	     .
	     .
	     jr $ra   # $ra is the alias for $31

Note: on the MIPS architecture, each register name has an alias. This alias is an alternative name that is designed to remind the code writer about the implied duties (function) of the register. This example code contained the first register alias to learn and use: $31 is $ra. ra stands for return address.

There is one problem with this scheme: what happens if a procedure calls itself (recursion), or if a procedure calls another procedure (nesting) using jal?

Here is sample code for this problematic case:


             jal proc1
	     .
	     .
	     .
	     jal proc1
	     .
	     .
	     .


   proc1:    .
	     .
	     jal proc2
	     .
	     .
	     jr  $ra


   proc2:    .
	     .
	     .
	     jr  $ra

The value in register $ra gets overwritten with each jal instruction. Return addresses are lost. This is an unrecoverable error!
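The loss can be traced by hand with a tiny C model. This is only a sketch: the variable ra stands in for $ra, the helper jal and the addresses 100 and 200 are made up for illustration, and no real jump takes place.

```c
static int ra;                      /* stand-in for register $ra */

/* Model of jal: remember the address of the instruction after the call. */
static void jal(int address_after_call)
{
    ra = address_after_call;
}

/* Returns the address that proc1's final "jr $ra" would jump to. */
int proc1_return_target(void)
{
    jal(100);   /* main:  jal proc1, next instruction at made-up address 100 */
    jal(200);   /* proc1: jal proc2, next instruction at made-up address 200 */
    /* proc2:  jr $ra jumps to 200, which is correct for proc2 */
    /* proc1:  jr $ra now jumps to whatever $ra still holds */
    return ra;
}
```

In this model proc1_return_target() yields 200 rather than 100, so proc1 would jump back to the instruction after its own jal proc2 and never get back to main.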

What is needed to handle this problem is a way to save return addresses as they are generated. For a recursive subroutine, it is not known ahead of time how many times the subroutine will be called. This data is generated dynamically, while the program is running.

These return addresses will need to be used (for returning) in the reverse order that they are saved.

The best way to save dynamically generated data that is needed in the reverse order it is generated is on a stack.

As already defined in class, data can be classified as either

static -- can be defined when program is written (compile time)
dynamic -- is defined when a program is executed (run time)

In this case, it is the amount of memory needed to hold items on the stack that cannot be determined until run time. This data (for return addresses) is dynamically generated.

The System Stack

A stack is so frequently used in implementing procedure call/return, that many (most?) computer systems predefine a stack, called the system stack, or simply, the stack.

The size of the system stack is very large. In theory, it should be infinitely large. In practice, it must have a size limit.

on the MIPS architecture:

 address  |         |
    0     | your    |
          | program |
          | here    |
          |         |
          |         |
          |         |
          |         |
          |         |
          | system  |  / \
  very    | stack   |   |  grows towards smaller addresses
 large    | here    |   |
 addresses

A little terminology:

Some people say that this stack grows down in memory. This means that the stack grows towards smaller memory addresses. Their diagram would show address 0 at the bottom (unlike my diagram).

down and up are vague terms, unless you know what the diagram looks like.

The MIPS system stack is defined to grow towards smaller addresses, and the stack pointer points to an empty location (the next available) at the top of the stack. The stack pointer is register $29, also called $sp, and gets a value before program execution begins. sp stands for stack pointer.

Code fragment for push, in MAL:


    sw   $?, ($sp)       # the ? is replaced by whatever register
    sub  $sp, $sp, 4     # contains the data to be pushed.


    sub  $sp, $sp, 4     # a "better" implementation, since it allocates
    sw   $?, 4($sp)      # the space before using the space.

Code fragment for pop, in MAL:


    add $sp, $sp, 4    # the ? is replaced by a register number
    lw  $?, ($sp)


    lw  $?, 4($sp)       # a "better" implementation, since it copies
    add $sp, $sp, 4      # the data out before deallocating the space

NOTE: if $sp is used for any other purpose, then the value of the stack pointer is lost.
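The "better" push and pop fragments can be modelled in C to check the order of the steps. This is a sketch under assumptions: the array stack_mem, its size, and the names push_word and pop_word are invented here, each array element stands for one 4-byte word (so an index step of 1 corresponds to an offset of 4), and no overflow or underflow checks are made.

```c
#include <stdint.h>

#define STACK_WORDS 64              /* made-up size for the sketch */

static int32_t stack_mem[STACK_WORDS];   /* one element per 4-byte word */
static int sp = STACK_WORDS - 1;         /* index of the next empty word */

/* push: allocate first, then store (sub $sp, $sp, 4 ; sw $?, 4($sp)) */
void push_word(int32_t value)
{
    sp = sp - 1;                    /* stack grows towards smaller addresses */
    stack_mem[sp + 1] = value;
}

/* pop: copy out first, then deallocate (lw $?, 4($sp) ; add $sp, $sp, 4) */
int32_t pop_word(void)
{
    int32_t value = stack_mem[sp + 1];
    sp = sp + 1;
    return value;
}
```

Pushing 10 and then 20 and popping twice gives back 20 and then 10, and sp ends up where it started, pointing at the empty word at the top of the stack.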

An example of using the system stack to save return addresses:



     jal doit              # one call location
     .
     .
     .
     jal doit              # another call location
     .
     .
     .

doit: 
       sub $sp, $sp, 4     # save return address
       sw  $ra, 4($sp)

      .
      .
      .
       jal another         # this would overwrite the return
                           # address, if it had not been saved.
      .
      .
      .

       lw  $ra, 4($sp)      # restore return address
       add $sp, $sp, 4
       jr  $ra

about Stack Frames (Activation Records)

From a compiler's point of view, there are a bunch of things that should go on the stack relating to procedure call/return. They include:

return address (register)
parameters
other various registers

Each procedure has different requirements for numbers of parameters, their size, and how many registers (which ones) will need to be saved on the stack. So, we compose a stack frame or activation record that is specific to a procedure.

Space for a stack frame gets allocated on the stack each time a procedure is called, and taken off the stack each time a return occurs. These stack frames are pushed/popped dynamically (while the program is running).

An initial example showing the steps, but not all the code:


main:
    jal  A
    jal  B
    .
    .
    .
    done


A:  allocate A's AR        # AR stands for Activation Record
    jal C
    jal D
    deallocate A's AR
    jr $ra

B:  allocate B's AR
    jal D
    deallocate B's AR
    jr $ra

C:  allocate C's AR
    jal E
    deallocate C's AR
    jr $ra

D:  jr $ra

E:  jr $ra

Here is the call tree for this little example.

       main
       /   \
      A     B
     / \    |
    C   D   D
    |
    E

The code (skeleton) for one of these procedures:


A:  sub $sp, $sp, 20     # allocate frame for A
    sw  $ra, 16($sp)     # save A's return address

    jal C
    jal D

    lw  $ra, 16($sp)     # restore A's return address
    add $sp, $sp, 20     # remove A's frame from stack
    jr $ra               # return from A

Some notes on this:

the allocation and removal of a frame should be done within the body of the procedure. That way, the compiler does not need to know the size of a procedure's frame (when producing the code where the call is).
Accesses to A's frame are done via offsets from the stack pointer. (This is a base displacement addressing mode.)

     $sp --> |            |     address 0 at top of diagram
 (after A    |------------| \
  allocates  |            |  \
  A's AR )   |------------|   |
             |            |   |
             |------------|   |-- A's frame
             |            |   |   (A allocates, and then deallocates this space)
             |------------|   |
             |   $ra      |   |
             |------------|  /
             |            | /
             |------------|
             |            |
             |------------|

Parameter Passing

Use parameter and argument as synonyms for this discussion.

Just as there is little to no support for implementing procedures in many assembly languages, there is little to no support for passing parameters to those procedures.

Remember that when it comes to the implementation,

everything is done by convention
it is up to the programmer to follow the conventions

Passing parameters means getting data into a place set aside for the parameters. Both the calling program (caller, parent) and the called procedure (callee, child) need to know where the parameters are.

The parent places the parameters into this place, and possibly uses values returned by the child. The child uses the parameters.

A note on parameter passing.

A HLL specifies rules for passing parameters. There are basically 2 types of parameters. Note that a language can offer only one or both types.

call by value -- what C has. In Pascal, these are parameters declared without the var in front of the variable name. Fortran does not offer this type of parameter.
The caller's variable cannot be modified by the procedure. This can be implemented by passing a copy of the value. What call by value really implies is that the procedure can modify the copy passed to it, but the change is not seen outside the scope of the procedure.
call by reference -- what Fortran has. In Pascal, these are var parameters. C does not offer this type of parameter.
The parameter passed to the subroutine can be modified, and the modification is seen outside the scope of the subroutine. It is sort of like having access to a global variable.

There are many ways of implementing these 2 parameter types. If call by value is the only parameter type allowed, how can we implement a reference type parameter? Pass the address of the variable as the parameter. Then access to the variable is made through its address.
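Since C only has call by value, it makes a handy illustration of the address trick. A minimal sketch (the function names are made up for this example):

```c
/* Call by value: n is a copy, so the caller's variable is untouched. */
void decrement_copy(int n)
{
    n = n - 1;
    (void)n;                        /* the updated copy is simply discarded */
}

/* Reference behaviour built from call by value: the value passed is an address. */
void decrement_through_address(int *n)
{
    *n = *n - 1;
}
```

With int x = 5, the call decrement_copy(x) leaves x at 5, while decrement_through_address(&x) changes x to 4.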

The simplest mechanism: pass in a register.

The parent puts the parameter(s) into specific registers, and the child uses them.

Initial example:


	     .
	     .
	     .
	     move  $4, $20      # put parameter in $4
	     jal   decrement
	     move  $20, $4      # recopy parameter to its correct place
	     .
	     .
	     .


             # the call by reference parameter is in $4
decrement:   add  $4, $4, -1
             jr $ra

Notes:

-- This is a trivial example, since the procedure is 1 line long.

-- Why not just use $20 within the procedure?

convention -- parameters are passed in specific registers.
the same procedure could be used to decrement the value in other registers -- just copy the value to register $4 first, and copy it out afterwards.

Historically more significant mechanism: pass parameters on the stack.

Place the parameters to a procedure (function) on the stack. The parameters go between the parent's AR and the child's AR.

       sub $sp, $sp, 8   # allocate space for parameters
       sw  $9, 4($sp)    # place parameter 1 into allocated space
       sw  $18, 8($sp)   # place parameter 2 into allocated space
       jal proc
       add $sp, $sp, 8   # deallocate space for parameters
       .
       .
       .
    proc:
       sub $sp, $sp, 12  # allocate remainder of AR for proc
		         # assume fixed size (but too big) AR
       sw  $ra, 12($sp)  # save return address
       lw  $10, 16($sp)  # retrieve parameter 1 for use
       lw  $11, 20($sp)  # retrieve parameter 2
       .
       .
                         # use parameters in procedure calculations
       .
       .
       lw  $ra, 12($sp)  # restore return address
       add $sp, $sp, 12  # remove AR of proc
       jr  $ra

The parent:

allocates space for parameters
places parameters into stack
calls procedure
deallocates space for parameters (when appropriate)

The child:

allocates AR (or remainder of AR)
deallocates AR of procedure

Parameter Passing: The MIPS Way

MIPS convention -- when passing parameters in registers, the first 4 parameters are passed in registers $4-$7. The aliases for $4-$7 are $a0-$a3. The first parameter to a procedure is always passed in $a0.

Then, any and all procedures use those registers for their parameters.

ALSO MIPS convention -- space for all parameters (passed in $a0-a3) is allocated in the parent's (caller's) AR !!

If there are nested subroutine calls, and registers $a0-a3 are used for parameters, the values would be lost (just like the return address would be lost for jal if not saved).

An example of this problem:

   procA:  # receives 3 parameters in $a0, $a1, and $a2

         # set up procB's parameters
         move $a0, $24  # overwrites procA's parameter in $a0
         move $a1, $9   # overwrites procA's parameter in $a1
         jal  procB     # the nested procedure call

         # procA continues after procB returns
         # procA's parameters are needed, but have been overwritten

There are 2 possible solutions.

(works for non-recursive nested calls) each procedure has associated with it a section of memory. Before a nested call is made, the current parameters are stored in that memory. After the return from the nested call, the current values are restored.
(works for any nested calls, even for recursive calls) current parameters are stored on the stack before a nested call. After the return from the nested call, the current parameters are restored.
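The difference between the two solutions shows up as soon as a procedure is recursive. Here is a rough C model, not MIPS code: the variable a0 stands in for $a0, and save_area, save_stack, and all function names are invented for the sketch. The C local variable rest simply holds the value returned by the nested call. Each function is meant to add up the numbers from its parameter down to 1.

```c
static int a0;                      /* stand-in for parameter register $a0 */

/* Solution 1: one fixed memory word belonging to the procedure. */
static int save_area;

static int sum_fixed(void)
{
    int rest;

    if (a0 == 0)
        return 0;
    save_area = a0;                 /* save current parameter */
    a0 = a0 - 1;                    /* set up parameter for the nested call */
    rest = sum_fixed();             /* the recursion overwrites save_area */
    a0 = save_area;                 /* restores the innermost call's value */
    return a0 + rest;
}

/* Solution 2: save on a stack, one slot per active call. */
static int save_stack[32];          /* made-up depth limit, no checks */
static int top = 0;

static int sum_stacked(void)
{
    int rest;

    if (a0 == 0)
        return 0;
    save_stack[top++] = a0;         /* push current parameter */
    a0 = a0 - 1;                    /* set up parameter for the nested call */
    rest = sum_stacked();
    a0 = save_stack[--top];         /* pop this call's own parameter */
    return a0 + rest;
}

int sum_with_fixed_area(int n) { a0 = n; return sum_fixed(); }
int sum_with_stack(int n)      { a0 = n; return sum_stacked(); }
```

sum_with_stack(3) gives 6 as intended. sum_with_fixed_area(3) gives 3, because every level restores the value that was saved last, by the innermost call.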

The example re-written, to do things the MIPS way. Note that this is only a code fragment. It does not show everything (like saving $ra).

 procA:  # receives 3 parameters in $a0, $a1, and $a2.
         # The caller of procA has allocated space for $a0-$a3
         # at the top of the stack.

         # assume that procA has an activation record of 5 words.
         sub  $sp, $sp, 20   # allocate space for AR

         # save procA's parameters
	 sw   $a0, 24($sp)
	 sw   $a1, 28($sp)
	 sw   $a2, 32($sp)

	 .
	 .
	 .

         # set up procB's parameters
         move $a0, $24 
         move $a1, $9 
         jal  procB     # the nested procedure call

	 .
	 .
	 .
         # procA continues after procB returns
         # procA's parameters are needed, so restore them
	 lw   $a0, 24($sp)
	 lw   $a1, 28($sp)

In this code fragment, procA saves its 3rd parameter (from $a2) on the stack. Is this necessary (given that procB only receives 2 parameters)? Why or why not?

Here is a general layout of how this second option is used on MIPS, following conventions (with 4 or fewer parameters):

	proc1 layout:
	    allocate AR (include space for outgoing parameters) 
	    put return address on stack into AR of procedure

	    procedure calculations

	    to set up and call proc2,
	       place current parameters (from $a0-a3) into previously allocated
	           space
	       set up parameters to proc2 in $a0-a3
	       call proc2 (jal proc2)
	       copy any return values out of $v0-v1, $a0-a3
	       restore current parameters back to $a0-a3

	    more procedure calculations (presumably using procedure's
	      parameters which are now back in $a0-a3)

	    get procedure's return address from AR
	    deallocate AR
	    return (jr $ra)

Summary of the general ideas:

use registers
- + easy, and don't have to store data in memory (faster)
- - limited number of registers
- - doesn't work for recursion, and must be careful when using it where there are nested subroutines
use some registers, and place the rest on the stack
- + since many procedures have few parameters, get the advantages of (1) most of the time.
- - lots of "data shuffling"
put all parameters on the stack (an unsophisticated compiler might do this)
- + simple, clean method (easy to implement)
- - lots of stack operations (meaning slow, since the stack is in memory)
put parameters in memory set aside for them
- + simple, clean method
- - lots of memory operations (slow)
- - doesn't work for recursion

Note: whatever you do, try to be consistent. Don't use all 4 methods in the same program. (It's poor style.)

about Frame Pointers

The stack gets used for more than just pushing/popping stack frames. During the execution of a procedure, there may be a need for temporary storage of variables. The common example of this is in expression evaluation.

 Example:      high level language statement
		 Z = (X * Y) + (A/2) - 100

The intermediate values of X*Y and A/2 must be stored somewhere. On older machines, register space was at a premium. There just were not enough registers to be used for this sort of thing. So, intermediate results (local variables) were stored on the stack.

They do not go in the stack frame of the executing procedure; they are pushed/popped onto the stack as needed.
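As a sketch of what pushing and popping intermediate results looks like, here is the example statement evaluated in C with an explicit stack of temporaries. The array temp_stack and the helper names are made up, the capacity is arbitrary, and integer division is assumed for A/2.

```c
static int temp_stack[8];           /* made-up capacity */
static int temp_count = 0;

static void push_temp(int v) { temp_stack[temp_count++] = v; }
static int  pop_temp(void)   { return temp_stack[--temp_count]; }

/* Z = (X * Y) + (A / 2) - 100, with intermediates kept on the stack. */
int evaluate_z(int x, int y, int a)
{
    int right, left;

    push_temp(x * y);               /* temp1 = X * Y */
    push_temp(a / 2);               /* temp2 = A / 2 */
    right = pop_temp();             /* A / 2 comes off first */
    left  = pop_temp();             /* then X * Y */
    return left + right - 100;
}
```

For X = 6, Y = 7, A = 10 this computes 42 + 5 - 100 = -53. While temp1 and temp2 are on the stack, the top of the stack sits two words away from where it was at the start of the statement.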

So, at one point in a procedure, parameter 2 might be at 16($sp)

  |       |
  ---------
  |       |<- $sp
  ---------
  |       | ---
  ---------   |
  |       |   |
  ---------   |
  |       | ------ procedure's frame
  ---------    
  |param 2|    
  ---------    
  |       |    
  ---------    
  |       |
  ---------

and, at another point within the same procedure, parameter 2 might be at 24($sp)

  ---------
  |       |<- $sp
  ---------
  | temp2 |
  ---------
  | temp1 |
  ---------
  |       | ---
  ---------   |
  |       |   |
  ---------   |
  |       | ------ procedure's frame
  ---------    
  |param 2|    
  ---------    
  |       |    
  ---------    
  |       |
  ---------

All this is motivation for keeping an extra pointer around that does not move with respect to the current stack frame.

Call it a frame pointer. Make it point to the base of the current frame:

  ---------
  |       |<- $sp
  ---------
  | temp2 |
  ---------
  | temp1 |
  ---------
  |       | ---
  ---------   |-- procedure's frame
  |       |   |
  ---------   |
  |       | ---   <-- frame pointer
  ---------    
  |param 2|    
  ---------    
  |       |
  ---------
  |       |
  ---------

Now items within the frame can be accessed with offsets from the frame pointer, and the offsets do not change within the procedure.

parameter 2 will be at 4(frame pointer)

A new register is needed for this frame pointer. Pick one. (The chapter arbitrarily chooses $16, but it could be any register.)

parameter 2 is at 4($16)
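The arithmetic behind the diagrams can be checked with a few lines of C. This is a sketch only: the byte address 1000 for parameter 2 is invented, the frame is taken to be the 3 words drawn above, and the struct and function names are made up.

```c
#define WORD 4                      /* bytes per MIPS word */
#define PARAM2_ADDR 1000            /* made-up byte address of parameter 2 */

struct offsets {
    int from_sp;                    /* offset to use with $sp */
    int from_fp;                    /* offset to use with the frame pointer */
};

/* Offsets of parameter 2 after some number of temporaries has been pushed. */
struct offsets param2_offsets(int temps_pushed)
{
    int fp = PARAM2_ADDR - WORD;        /* base of the 3-word frame */
    int sp = PARAM2_ADDR - 4 * WORD;    /* empty word just beyond the frame */
    struct offsets result;

    sp = sp - temps_pushed * WORD;      /* every push moves $sp, never fp */
    result.from_sp = PARAM2_ADDR - sp;
    result.from_fp = PARAM2_ADDR - fp;
    return result;
}
```

With no temporaries pushed, the offsets are 16 from $sp and 4 from the frame pointer. With 2 temporaries pushed, the offset from $sp becomes 24, while the offset from the frame pointer is still 4.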

NOTES:

-- The frame pointer must be initialized at the start of every procedure, and restored at the end of every procedure.

-- The MIPS architecture does not really allocate a register for a frame pointer. It has something else that it calls a "virtual frame pointer," but it is not really the same as described here. On the MIPS, all data within a stack frame is accessed via the stack pointer, $sp.

The skeleton of a procedure that uses a frame pointer and has parameters:


  # the frame (AR) is 4 words, 2 words are space for 2 parameters
  # passed in $a0 and $a1, 1 is for return address, and 1 is for
  # the frame pointer

  procedure:
    sub  $sp, $sp, 8   # allocate remainder of frame
                       # (assumes that caller allocated space for the
                       # 2 parameters)
    sw   $ra, 8($sp)   # save procedure's return address
    sw   $16, 4($sp)   # save caller's frame pointer
    add  $16, $sp, 8  # set procedure's frame pointer


    # procedure's code in here
    # Note that all accesses to procedure's AR are done with offsets from $16

    lw   $ra, ($16)    # restore return address
    move $8, $16       # save frame pointer temporarily
    lw   $16, -4($16)  # restore caller's frame pointer
    move $sp, $8       # remove procedure's frame (AR)
    jr   $ra

The activation record (frame) for this procedure after everything is in it:

                     ^ smaller addresses up here
  |----------------|
  |                |<--- $sp
  |----------------|
  | frame pointer  |
  |----------------|
  | return address | <--- $16 (frame pointer)
  |----------------|
  | space for P2   |
  |----------------|
  | space for P1   |
  |----------------|
  |                |
  |----------------|
  |                |

New problem:

What happens if you have lots of variables, and your procedure runs out of registers to put them in? This occurs when you are following the conventions for register usage, and you should not overwrite the values in certain registers.

Most common solution: store register values temporarily on the stack in AR.

Two types:

CALLEE SAVED
- a procedure clears out some registers for its own use
- register values are preserved across procedure calls
- MIPS calls these saved registers, and designates $s0-$s8 for this usage.
- $s0-$s8 are aliases for $16-$23, $30
- the called procedure saves register values in its AR, uses the registers for local variables, restores register values before it returns.
CALLER SAVED
- the calling program saves the registers that it does not want a called procedure to overwrite
- register values are NOT preserved across procedure calls
- MIPS calls these temporary registers, and designates $t0-$t9 for this usage.
- $t0-$t9 are aliases for $8-$15, $24-$25
- procedures use these registers for local variables, because the values do not need to be preserved outside the scope of the procedure.
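Who does the saving in each case can be mimicked in C. In this sketch the variables s0 and t0 stand in for $s0 and $t0, the local variables slot_s0 and slot_t0 play the role of words in each procedure's AR, and the values 1, 2, 111, and 222 are arbitrary. All names are made up for the illustration.

```c
static int s0;                      /* stand-in for a saved register, $s0 */
static int t0;                      /* stand-in for a temporary register, $t0 */

/* The callee wants both registers for its own local values. */
static void callee(void)
{
    int slot_s0 = s0;               /* prologue: callee saves $s0 in its AR */

    s0 = 111;                       /* free to use $s0 now */
    t0 = 222;                       /* $t0 needs no saving by the callee */

    s0 = slot_s0;                   /* epilogue: callee restores $s0 */
}

/* The caller has live values in both registers across the call. */
int caller(void)
{
    int slot_t0;

    s0 = 1;
    t0 = 2;
    slot_t0 = t0;                   /* call setup: caller saves live $t0 */
    callee();
    t0 = slot_t0;                   /* return cleanup: caller restores $t0 */
    return s0 * 10 + t0;            /* both values survived the call */
}
```

caller() returns 12 here. If the caller skipped its save and restore of t0, the value left behind by the callee (222) would be used instead.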

What the mechanisms should look like from the compiler's point of view:


THE CODE:

   call setup
   procedure call
   return cleanup
   .
   .
   .
procedure:  prologue

            calculations

            epilogue

CALL SETUP
- place current parameters into stack (space already allocated by caller of this procedure)
- save any TEMPORARY registers that need to be preserved across the procedure call
- place first 4 parameters to procedure into $a0-$a3
- place remainder of parameters to procedure into allocated space within the stack frame
PROLOGUE
- allocate space for stack frame
- save return address in stack frame
- copy needed parameters from stack frame into registers
- save any needed SAVED registers into current stack frame
EPILOGUE
- restore (copy) return address from stack frame into $ra
- restore from stack frame any saved registers (saved in prologue)
- de-allocate stack frame (move $sp so the space for the procedure's frame is gone)
RETURN CLEANUP
- copy needed return values and parameters from $v0-v1, $a0-a3, or stack frame to correct places
- restore any temporary registers from stack frame (saved in call setup)

An excellent and detailed example -- written by Prof. David Wood


# procedure: procA
# function: demonstrate CS354 calling convention
# input parameters: $a0 and $a1
# output (return value): $v0
# saved registers: $s0, $s1
# temporary registers: $t0, $t1
# local variables: 5 integers named R, S, T, U, V
# procA calls procB with 5 parameters (R, S, T, U, V).
#
# Stack frame layout:
#	
#	| in $a1  |  68($sp)
#	| in $a0  |  64($sp)
#	|---------|
#	|    V    |  60($sp)       --|
#	|    U    |  56($sp)         |
#	|    T    |  52($sp)         |
#	|    S    |  48($sp)         |
#	|    R    |  44($sp)         |
#       |   $t1   |  40($sp)         |
#       |   $t0   |  36($sp)         | --  A's activation record
#       |   $ra   |  32($sp)         |
#       |   $s1   |  28($sp)         |
#       |   $s0   |  24($sp)         |
#       | out arg4|  20($sp)         |
#       | out $a3 |  16($sp)         |
#       | out $a2 |  12($sp)         |
#       | out $a1 |   8($sp)         |
#       | out $a0 |   4($sp)       --|
#	|---------|
#       |         | <-- $sp        ---
#       |         |                  |
#       |         |                  |   --  where B's activation record
#       |         |                  |        will be
procA:
	# procedure prologue
	sub $sp, $sp, 60	#allocate activation record, includes
				# space for maximum outgoing args
	sw $ra, 32($sp)		#save return address
	sw $s0, 24($sp)		# save 'saved' registers to stack
	sw $s1, 28($sp)		# save 'saved' registers to stack
	# end prologue

	....	# more code

	# call setup for call to procB
  	# save current (live) parameters into the space specifically
	# allocated for this purpose within caller's stack frame
	sw $a0, 64($sp)		# only needed if values are 'live'
	sw $a1, 68($sp)		# only needed if values are 'live'
  	# save any registers that need to be preserved across the call
	sw $t0, 36($sp)		# only needed if values are 'live'
	sw $t1, 40($sp)		# only needed if values are 'live'
	# put parameters into proper location
	lw $a0, 44($sp)		# load R into $a0
	lw $a1, 48($sp)		# load S into $a1
	lw $a2, 52($sp)		# load T into $a2
	lw $a3, 56($sp)		# load U into $a3
	lw $t0, 60($sp)		# load V into a temp register
	sw $t0, 20($sp)		# outgoing arg4 must go on the stack
	#end call setup

	# procedure call
	jal procB

	# return cleanup for call to procB
	# restore saved registers
	lw $a0, 64($sp)
	lw $a1, 68($sp)
	lw $t0, 36($sp)
	lw $t1, 40($sp)
	# return values are in $v0 and $v1

	....	# more code

	# procedure epilogue
	# restore return address
	lw $ra, 32($sp)
	# restore $s registers saved in prologue
	lw $s0, 24($sp)         
	lw $s1, 28($sp)        
	# put return values in $v0 and $v1
	move $v0, $t0
	# deallocate stack frame
	add $sp, $sp, 60
	# return
	jr $ra

# end of procA

An important detail for 354 students writing MAL code and using the simulator:

The I/O instructions putc, puts, and getc are implemented as functions within the operating system. They are not actual instructions on a MIPS R2000 processor. (In general, most modern architectures, MIPS included, have no explicit input/output instructions.)

Parameters get passed to the operating system, and return values get set by the functions implementing putc, puts, and getc. Therefore, in general, you must assume that $a0-$a3 and $v0-$v1 will be overwritten during the execution of any putc, puts, or getc instruction.

In practice, I believe the simulator is implemented to only change values in $a0 and $v0.

Functions

Functions are just procedures that return a value.

For Java programmers, functions are methods. For C programmers, procedures are functions with a void return type.

A function sets a return value, and this return value is available (set) after the function returns. So, a return value is similar to a parameter, only the flow of information is the reverse.

The location of a function return value follows the same rules as the location of parameters. We could place a function return value

in a register.
on the stack. The caller would allocate space for both outgoing parameters AND return values in its activation record. The function places the return value into this allocated space. The caller then has the return value in this allocated space.
in a set-aside memory location (not on the stack). Just like for parameters, this location for a return value will not work for a recursive function.
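The second option (caller-allocated space for the return value) can be imitated in C. This is a loose sketch: in the scheme described above the slot would sit at an agreed offset in the caller's activation record, whereas here its address is passed explicitly as an extra argument, and the function names are made up.

```c
/* The caller passes the address of a slot it has set aside for the result. */
void larger_into(int one, int two, int *result_slot)
{
    if (one > two)
        *result_slot = one;
    else
        *result_slot = two;
}

/* The caller's local variable plays the role of space in its activation record. */
int call_larger(int a, int b)
{
    int x;                          /* space reserved by the caller */

    larger_into(a, b, &x);          /* the function fills in the slot */
    return x;                       /* the caller finds the result there */
}
```

Because each active call of call_larger has its own x, this arrangement keeps working when calls are nested or recursive, which the set-aside memory location does not.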

The MIPS architecture specifies that a return value is placed into a register. In fact, there are two registers set aside for this purpose:

$2 is also called $v0 and

$3 is also called $v1

an example code fragment:


             jal  function1      # y = function1();
             sw   $v0, y
             .
             .
             .
 function1:  # the function does calculations
             .
             .
             .
	     # and then sets the return value
	     move $v0, $t0       # the return value was in $t0
	     jr   $ra

Some guidelines:

If parameters are passed on the stack, they should go "between" the parent's and the child's activation records.
Use of frame pointer can reduce amount of code. It gives a better level of abstraction.
Depending on conventions and implementations, the amount of space allocated for activation record may be different than the amount of space removed.
- If callee allocates space, and parameters are on stack.
- If caller and callee each allocate some of the space.
MIPS convention: always allocate space in activation record for all parameter registers, even if there are fewer than 4 parameters.