Pointers

You have seen simple Java classes such as:


public class Line {
    private int a, b, c;  /* line is  ax + by = c  */

    public void setA(int aValue) {
        a = aValue;
    }
    public void setB(int bValue) {
        b = bValue;
    }
    public void setC(int cValue) {
        c = cValue;
    }

}

public class PlayWithLines {

    public static void main(String args[]) {
        Line diagonal = new Line();
        diagonal.setA(1);
        diagonal.setB(1);
        diagonal.setC(0);
    }
}

Let's look at what is really going on in a simple program such as this. This will help to explain the very important concept of a pointer.

(Simple explanation: a C pointer is the same as a Java reference. It identifies the address of something.)

The Line class specifies information about a Line object, what the object looks like, and what operations (methods) do with or to the object.

A Line object contains 3 integers. A standard diagram of this appears as

To get one of these objects (in Java) we use the new operator. It allocates a Line object, giving us an instance of the Line class. So, the execution of the source code within main


        Line diagonal = new Line();

causes an instance of the Line class to be created. And (very important) it gives a reference to this object, which we (correctly) assign to a variable. The variable diagonal is a reference to the created object. It is not the object. It is a pointer. The standard diagram after allocation and assignment of the reference appears as:

This diagram uses the standard notation that objects have rounded corners, and references have square corners. The diagram is not quite correct, in that creating a Line object sets the values of a, b, and c to 0 (Java's default value for int fields).

A look at the low level of memory furthers understanding of pointers. Here is a diagram of a portion of memory:

For this example, each box is defined to be a place in memory that can hold a single integer. To the left of the boxes are addresses. Each address in memory is a unique integer (0 or positive). Make the further assumption that a Line object is made up of 3 consecutive boxes, and has been allocated to just this portion in memory. The diagram of memory appears similar to:

Given this picture of memory, the Line object referenced (pointed to) by diagonal looks like:

Further Java code operates on the members (fields) of the object. Invoking a method on an object causes the address of the object to be implicitly passed as a parameter (inside the method, Java calls it this).
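
Looking ahead to C, the hidden parameter can be written out by hand. The following is only a sketch under stated assumptions, not a description of how any Java implementation works: the names line_obj and line_set_a are hypothetical, and the structure and -> notation it uses are covered later in these notes. The "method" becomes an ordinary function whose first parameter is the address of the object it operates on.

```c
#include <assert.h>

/* Hypothetical C stand-in for the Java Line class: three ints. */
struct line_obj {
    int a, b, c;   /* line is  ax + by = c  */
};

/* Plays the role of setA(); the object's address is passed explicitly. */
void line_set_a(struct line_obj *self, int a_value)
{
    assert(self != 0);    /* the caller must supply a real object */
    self->a = a_value;    /* store into the object found at that address */
}
```

A caller would write line_set_a(&diagonal, 1) where the Java code writes diagonal.setA(1).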

So, the important concept here is that a pointer is a variable that contains an address. In the C programming language, this fact is explicit, and often used by programmers.

The & operator in C gives the address of a variable.

The * operator in C gives the variable that a pointer points to. It is also known as a dereferencing operator.

Some examples will help.


int a = 3;
int b = 8;
int *ap;   /* declares a variable that is a pointer to an int */
int *bp;   /* declares a variable that is a pointer to an int */

ap = &a;   /* ap contains a value that points to a */
bp = &b;   /* bp contains a value that points to b */

If I take liberties with my diagrams, replacing integer addresses with their symbolic names (like a and ap), and (incorrectly) assuming that a, b, ap, and bp are all placed into memory consecutively, then before the code fragment is executed, but after memory is initialized, we have

After this little code fragment is executed, with the same assumptions as before, we have

Another important concept is that a pointer is a variable. And every variable is located in memory at an address. (Just as in Java, a reference is a variable.)
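
Since a pointer is itself a variable in memory, it is possible to take the address of the pointer. The sketch below illustrates this under a few assumptions: the function name show_pointer and the variable app are made up for this example, and the addresses it prints will differ from machine to machine and from run to run.

```c
#include <assert.h>
#include <stdio.h>

/* Prints the addresses involved and returns the value reached through ap. */
int show_pointer(void)
{
    int a = 3;
    int *ap = &a;      /* ap holds the address of a */
    int **app = &ap;   /* ap is a variable too, so it has its own address */

    /* %p expects a void *, hence the casts. */
    printf("a  lives at %p and holds %d\n", (void *)&a, a);
    printf("ap lives at %p and holds %p\n", (void *)&ap, (void *)ap);

    assert(*app == ap);   /* following app once leads to ap */
    return **app;         /* following it twice leads to a */
}
```

Note the type of app: the address of a pointer to an int is a pointer to a pointer to an int, written int **.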

We can use pointers in ways similar to (and sometimes the same as) integer variables. This leads to both good and bad programming practices.

Allowed Operations on Pointers

assignment to other pointers of the same type
addition or subtraction of an integer to or from a pointer
assignment of the value 0
comparison to the value 0
comparison to another pointer of the same type

Here is a little code fragment that does allowed operations using pointers.


    int a = 3;   /* declaration and initialization */
    int b = 8;   /* declaration and initialization */
    int c = 0;   /* declaration and initialization */
    int *ap;     /* declaration of a pointer to an integer */
    int *bp;     /* declaration of a pointer to an integer */
    int *cp;     /* declaration of a pointer to an integer */

    ap = &a;
    bp = &b;
    cp = &c;

    c = *ap + *bp;
    a = b + *cp;
    (*bp)++;
    cp++;    /* allowed, but probably not reasonable */
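
To make the effect of each statement concrete, the fragment above can be wrapped in a function and traced. This is a minimal sketch; the function name allowed_ops and the out parameter are assumptions added only so the final values can be inspected.

```c
#include <assert.h>

/* Runs the fragment above and stores the final a, b, c in out[0..2]. */
void allowed_ops(int out[3])
{
    int a = 3, b = 8, c = 0;
    int *ap = &a, *bp = &b, *cp = &c;

    c = *ap + *bp;   /* c = 3 + 8 = 11 */
    a = b + *cp;     /* a = 8 + 11 = 19 */
    (*bp)++;         /* b becomes 9; the parentheses make ++ apply to b, not bp */

    assert(cp == &c);
    cp++;            /* cp now points just past c and must not be dereferenced */

    out[0] = a;
    out[1] = b;
    out[2] = c;
}
```

After the call, out holds 19, 9, and 11. The last statement shows why cp++ is "probably not reasonable": c is a single int, so nothing useful lives at the address cp ends up holding.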

NOT Allowed Operations on Pointers

Most other operations, particularly arithmetic that treats a pointer as an ordinary integer. For example:

multiplication or division on a pointer
addition of two pointer values
assignment of an integer value other than 0 to a pointer (without a cast)

Here is a little code fragment that attempts several disallowed operations on pointers. A compiler will report errors, or at least warnings, for the lines marked BAD.


int a = 3;
int b = 8;
int c = 0;
int *ap;
int *app;
int *bp;
int *cp;

    ap = 34;    /* BAD: assigns a nonzero integer to a pointer */
    app = &ap;  /* BAD: &ap is a pointer to a pointer, not a pointer to an int */

Here is a program that may help to illustrate what points where and when. Remember that * follows a pointer (to what it references). The & operator asks for the address of the variable.


#include <stdio.h>

int main(void)
{
   int a = 3;
   int b = 8;
   int c = 12;
   int *ap;
   int *bp;
   int *cp;

   ap = &a;
   bp = &b;
   cp = &c;

   ap = bp;
   *bp = *cp;

   if ( b == c ) {
      printf("b equals c\n");
   } else {
      printf("b does not equal c\n");
   }

   if ( bp == cp ) {
      printf("pointers same\n");
   } else {
      printf("pointers not the same\n");
   }

   if ( (*ap) == c ) {
      printf("equal\n");
   } else {
      printf("not equal\n");
   }
   return(0);
}

This program prints out:


b equals c
pointers not the same
equal

Arrays

Simple declaration


    int ar[100];  /* an array of 100 integers */

An array always causes its elements to be allocated consecutively within memory. They must be consecutive, because the (compiled) assembly language code must calculate the address of a desired array element in order to access that array element. If elements were located randomly throughout memory, there would be no way of knowing the location of the element. The address of the first element is associated with the name (symbol) of the array.


    ar[4] = -3;

This code causes the 5th integer within the array to be assigned the value -3. Element index numbering always starts with 0.
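
The address calculation described above can be checked directly: element i sits i * sizeof(int) bytes past the start of the array. The helper below is a sketch written for these notes (byte_offset is a hypothetical name); it converts both addresses to char pointers so that their difference is measured in bytes.

```c
#include <stddef.h>

/* Returns the distance in bytes from the start of ar to element i. */
size_t byte_offset(const int *ar, size_t i)
{
    const char *base = (const char *)ar;       /* address of element 0 */
    const char *elem = (const char *)&ar[i];   /* address of element i */
    return (size_t)(elem - base);
}
```

For the array ar declared above, byte_offset(ar, 4) equals 4 * sizeof(int), whatever size an int happens to be on the machine.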

An interesting and tricky part of programming in C is that arrays and pointers can be, and often are, interchanged, because we can do arithmetic on pointers. We can set a pointer to contain the address of ar[4].


    int ar[100];  /* an array of 100 integers */
    int *arptr;

    arptr = &ar[4];

And, we could now change the value of this 5th element of the array to 16 with


    *arptr = 16;

And, we could now change the value of the 7th element of the array to 1000 with


    *(arptr+2) = 1000;

We can even do the same thing with


    *(ar+6) = 1000;  /* 7th item is at offset of 6 from the element at index=0 */

Stated a little more formally,
a[i] is the same as *(a+i)
and &a[i] is the same as a+i

However, a pointer is a variable, but an array name is not a variable. So,
arptr = ar is legal,
but ar = arptr and ar++ are not legal.
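
The equivalences above can be confirmed with a few assertions. This is a small sketch that reuses the names ar and arptr from the earlier fragments; the function name seventh_element is an assumption made for the example.

```c
#include <assert.h>

/* Writes through a pointer, then reads back through the array name. */
int seventh_element(void)
{
    int ar[100] = {0};
    int *arptr = &ar[4];         /* points at the 5th element */

    *(arptr + 2) = 1000;         /* two ints past ar[4], so this is ar[6] */

    assert(&ar[6] == ar + 6);    /* &a[i] is the same as a+i */
    assert(ar[6] == *(ar + 6));  /* a[i] is the same as *(a+i) */
    assert(arptr[2] == ar[6]);   /* indexing works on a pointer too */

    arptr = ar;                  /* legal: a pointer can be reassigned */
    /* ar = arptr;  and  ar++;  would not compile */
    return arptr[6];
}
```

The function returns 1000, the value stored through arptr and read back by indexing.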

Structures

We can build our own sets of variables into a single type called a structure. You might think of a structure as an object, but there are no inherent operations on structures, as C is not an object-oriented language.

For example, the Line object given at the beginning of these notes collects 3 integers into a single item. We can do the same thing with a structure:


struct line {
    int a, b, c;  /* line is  ax + by = c  */
};

Note that this defines a type, and the type is called struct line (line by itself is only the tag). This structure has 3 members or fields. Here is a declaration of an instance of this type:


struct line diagonal;

Further code may initialize the values of a, b, and c:


    diagonal.a = 1;
    diagonal.b = 1;
    diagonal.c = 0;

The . (period) is an operator on a structure, to access the correct member of the structure.

Operations that may be done to a structure

copy it
assign to it (as a whole unit)
get its address (with the & operator)
access a member of the structure

Operations that may not be done to a structure

compare two structures (even if the two are the same type)
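
The lists above can be exercised in a short sketch. Because == cannot be applied to two structures, the comparison is done member by member in a helper; line_equal and struct_ops are hypothetical names introduced only for this example.

```c
struct line {
    int a, b, c;   /* line is  ax + by = c  */
};

/* Returns 1 when every member matches, else 0. Both arguments are copies. */
int line_equal(struct line p, struct line q)
{
    return p.a == q.a && p.b == q.b && p.c == q.c;
}

/* Exercises copy, whole-unit assignment, address-of, and member access. */
int struct_ops(void)
{
    struct line diagonal = {1, 1, 0};
    struct line copy = diagonal;     /* copy it */
    struct line other;
    struct line *lp = &diagonal;     /* get its address */

    other = diagonal;                /* assign to it as a whole unit */
    other.c = 5;                     /* access a member; diagonal is unchanged */

    /* diagonal == copy would not compile, so compare the members instead */
    return line_equal(*lp, copy) && !line_equal(*lp, other);
}
```

Passing a structure to a function, as line_equal does, is itself a copy: the function works on its own duplicates of the caller's structures.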

The `->` operator

When using a pointer to a structure, the -> operator is an alternative to the period (dot) operator. It refers to a member of the structure that the pointer points to, as follows.


    struct line {
        int a, b, c;  /* line is  ax + by = c  */
    };

    struct line horizaxis; /* declares an instance of the structure */

    struct line *axisptr;  /* declares a variable that is a pointer */
                           /*   to one of the line structures */

    axisptr = &horizaxis;
    axisptr->a = 0;
    axisptr->b = 1;
    axisptr->c = 0;

The . and -> operators associate left to right and have the highest precedence among C operators, so they bind more tightly than unary operators such as * and &. Use parentheses when needed.
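
One place where the parentheses matter is reaching a member through a pointer without using ->. The sketch below (read_b is a hypothetical name) reads the same member with both notations.

```c
#include <assert.h>

struct line {
    int a, b, c;   /* line is  ax + by = c  */
};

/* Reads member b once with each notation and checks that they agree. */
int read_b(struct line *p)
{
    int with_arrow = p->b;
    int with_dot = (*p).b;   /* parentheses required: *p.b means *(p.b) */

    assert(with_arrow == with_dot);
    return with_arrow;
}
```

Without the parentheses, *p.b would apply the dot to p first, and a compiler rejects that because p is a pointer, not a structure.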