Parameter Passing


Contents

Overview

In a Java program, all parameters are passed by value. However, there are three other parameter-passing modes that have been used in programming languages:

  1. pass by reference
  2. pass by value-result (also called copy-restore)
  3. pass by name
We will consider each of those modes, both from the point of view of the programmer and from the point of view of the compiler writer.

First, here's some useful terminology:

  1. Given a method header, e.g.:
           void f(int a, boolean b, int c)
      
    we will use the terms parameters, formal parameters, or just formals to refer to a, b, and c.

  2. Given a method call, e.g.:
           f(x, x==y, 6);
      
    we will use the terms arguments, actual parameters, or just actuals to refer to x, x==y, and 6.

  3. The term r-value refers to the value of an expression. So for example, assuming that variable x has been initialized to 2, and variable y has been initialized to 3:

    expression r-value
    x2
    y3
    x+y6
    x==yfalse

  4. The term l-value refers to the location or address of an expression. For example, the l-value of a global variable is the location in the static data area where it is stored. The l-value of a local variable is the location on the stack where it is (currently) stored. Expressions like x+y and x==y have no l-value. However, it is not true that only identifiers have l-values; for example, if A is an array, the expression A[x+y] has both an r-value (the value stored in the x+yth element of the array), and an l-value (the address of that element).
L-values and r-values get their names from the Left and Right sides of an assignment statement. For example, the code generated for the statement x = y uses the l-value of x (the left-hand side of the assignment) and the r-value of y (the right-hand side of the assignment). Every expression has an r-value. An expression has an l-value iff it can be used on the left-hand side of an assignment.

Value Parameters

Parameters can only be passed by value in Java and in C. In Pascal, a parameter is passed by value unless the corresponding formal has the keyword var; similarly, in C++, a parameter is passed by value unless the corresponding formal has the symbol & in front of its name. For example, in the Pascal and C++ code below, parameter x is passed by value, but not parameter y:

When a parameter is passed by value, the calling method copies the r-value of the argument into the called method's AR. Since the called method only has access to the copy, changing a formal parameter (in the called method) has no effect on the corresponding argument. Of course, if the argument is a pointer, then changing the "thing pointed to" does have an effect that can be "seen" in the calling procedure. For example, in Java, arrays are really pointers, so if an array is passed as an argument to a method, the called method can change the contents of the array, but not the array variable itself, as illustrated below:


TEST YOURSELF #1

What happens when the following Java program executes?


Reference Parameters

When a parameter is passed by reference, the calling method copies the l-value of the argument into the called method's AR (i.e., it copies a pointer to the argument instead of copying the argument's value). If the argument has no l-value (e.g., it is an expression like x+y), the compiler may consider this an error (that is what happens in Pascal, and is also done by some C++ compilers), or it may give a warning, then generate code to evaluate the expression, to store the result in some temporary location, and to copy the address of that location into the called method's AR (this is is done by some C++ compilers).

In terms of language design, it seems like a good idea to consider this kind of situation an error. Here's an example of code in which an expression with no l-value is used as an argument that is passed by reference (the example was actually a Fortran program, but Java-like syntax is used here):

        void mistake(int x) {  // x is a reference parameter
	   x = x+1;
	}

        void main() {
           int a;
	   mistake(1);
	   a = 1;
           print(a);
        }
When this program was compiled and executed, the output was 2! That was because the Fortran compiler stored 1 as a literal at some address and used that address for all the literal "1"s in the program. In particular, that address was passed when "mistake" was called, and was also used to fetch the value to be assigned into variable a. When "mistake" incremented its parameter, the location that held the value 1 was incremented; therefore, when the assignment to a was executed, the location no longer held a 1, and so a was initialized to 2!

To understand why reference parameters are useful, remember that, although in Java, all non-primitive types are really pointers, that is not true in other languages. For example, consider the following C++ code:

Note that in main, variable P is a Person, not a pointer to a Person; i.e., main's activation record has space for P.name and P.age. Parameter per is passed by value (there is no ampersand), so when birthday is called from main, a copy of variable P is made (i.e., the values of its name and age fields are copied into birthday's AR). It is only the copy of the age field that is updated by birthday, so when the print statement in main is executed, the value that is output is 0.

This motivates some reasons for using reference parameters:

  1. When the job of the called method is to modify the parameter (e.g., to update the fields of a class), the parameter must be passed by reference so that the actual parameter, not just a copy, is updated.
  2. When the called method will not modify the parameter, and the parameter is very large, it would be time-consuming to copy the parameter; it is better to pass the parameter by reference so that a single pointer can be passed.


TEST YOURSELF #2

Consider writing a method to sort the values in an array of integers. An operation that is used by many sorting algorithms is to swap the values in two array elements. This might be accomplished using a swap method:

Assume that A is an array of integers, and that j and k are (different) array indexes. Draw a picture to illustrate what happens when the call: is executed, assuming
  1. that this is Java code (all parameters are passed by value)
  2. that this is some other language in which parameters are passed by reference

It is important to realize that the code generator will generate different code for a use of a parameter in a method, depending on whether it is passed by value or by reference. If it is passed by value, then it is in the called method's AR (accessed using an offset from the FP). However, if it is passed by reference, then it is in some other storage (another method's AR, or in the static data area). The value in the called method's AR is the address of that other location.

To make this more concrete, assume the following code:

Below is the code that would be generated for the statement a = a - 5, assuming (1) that a is passed by value and (2) assuming that a is passed by reference:


TEST YOURSELF #3

The code generator will also generate different code for a method call depending on whether the arguments are to be passed by value or by reference. Consider the following code:

Assume that f's first parameter is passed by reference, and that its second parameter is passed by value. What code would be generated to fill in the parameter fields of f's AR?

Value-Result Parameters

Value-result parameter passing was used in Fortran IV and in Ada. The idea is that, as for pass-by-value, the value (not the address) of the actual parameters are copied into the called method's AR. However, when the called method ends, the final values of the parameters are copied back into the arguments. Value-result is equivalent to call-by-reference except when there is aliasing (note: "equivalent" means the program will produce the same results, not that the same code will be generated).

Two expressions that have the same l-value are called aliases. Aliasing can happen:

Will will look at examples of each of these below.

Creating Aliases via Pointers

Pointer manipulation can create aliases, as illustrated by the following Java code. (Note: this kind of aliasing does not make pass-by-reference different from pass-by-value-result; it is included here only for completeness of the discussion of aliasing.)

Creating Aliases by Passing Globals as Arguments

This way of creating aliases (and the difference between reference parameters and value-result parameters in the presence of this kind of aliasing) are illustrated by the following C++ code:

As stated above, passing parameters by value-result yields the same results as passing parameters by reference except when there is aliasing. The above code will print different values when f's parameter is passed by reference than when it is passed by value-result. To understand why, look at the following pictures, which show the effect of the code on the activation records (only variables and parameters are shown in the ARs, and we assume that variable x is in main's AR):

Creating Aliases by Passing Same Argument Twice

Consider the following C++ code:

Assume that f's parameters are passed by reference. In this case, when main calls f, a and b are aliases. As in the previous example, different output may be produced in this case than would be produced if f's parameters were passed by value-result (in which case, no aliases would be created by the call to f, but there would be a question as to the order in which values were copied back after the call). Here are pictures illustrating the difference: With value-result parameter passing, the value of x after the call is undefined, since it is unknown whether a or b gets copied back into x first. This may be handled in several ways:


TEST YOURSELF #4

Assume that all parameters are passed by value-result.

Question 1: Give a high-level description of what the code generator must do for a method call.

Question 2: Give the specific code that would be generated for the call shown below, assuming that variables x and y are stored at offsets -8 and -12 in the calling method's AR.


Name Parameters

Call-by-name parameter passing was used in Algol. The way to understand it (not the way it is actually implemented) is as follows:

For example, given this code: The following shows this (conceptual) substitution, with the substituted code in red: Call-by-name parameter passing is not really implemented like macro expansion however; it is implemented as follows. Instead of passing values or addresses as arguments, a method (actually the address of a method) is passed for each argument. These methods are called 'thunks'. Each 'thunk' knows how to determine the address of the corresponding argument. So for the above example: Each time a parameter is used, the 'thunk' is called; then the address returned by the 'thunk' is used.

For the example above, call-by-reference would execute A[0] = 0 ten times, while call-by-name initializes the whole array!

Note that call-by-name is generally considered a bad idea, because it is hard to understand what a method is doing -- it may require looking at all calls to figure this out.

Comparisons of These Parameter Passing Mechanisms

Here are some advantages of each of the parameter-passing mechanisms discussed above:

Call-by-Value (when not used to pass pointers)

Call-by-Reference

Call-by-Value-Result

Call-by-Name