Introduction to C++ For Java Programmers

Basic Terminology and C++ Functions
Compiling and Running a C++ Program
C++ Types
Summary
Answers to Self-Study Questions

Basic Terminology and C++ Functions

In Java, we refer to the fields and methods of a class. In C++, we use the terms data members and member functions. Furthermore, in Java, every method must be inside some class. In contrast, a C++ program can also include free functions - functions that are not inside any class.

Every C++ program must include one free function named main. This is the function that executes when you run the program.

Here's a simple example C++ program:

# include <iostream>

int main() {
    cout << "Hello world!" << endl;
    return 0;
}

Things to note:

Function main should have return type int; you should return zero to indicate normal termination and any non-zero value to indicate that an error occurred (e.g., bad data was read, an attempt to open a non-existent file was made).
Function main is not required to have any parameters. However, if you intend your program to be run with command-line arguments, you should declare main as follows:
The first parameter is the number of command-line arguments, including the name of the executable itself (more about executable files below). The second parameter is an array of C-style strings (a C-style string is a sequence of characters, terminated with a special null character). Note that, like all formal parameters, you can use any names you want in place of "numargs" and "args". For historical reasons, some programmers use "argc" and "argv", but those are not very informative names.
To write to the standard output (usually the computer screen) use:
Use endl (or '\n' or "\n") to write a newline.

The output operator (<<) is overloaded; you can use it to write a value of any of the primitive types. For example:

       int k = 2;
       double d = 4/5;
       char c = 'x';
       
       cout << k << endl;  // write an int
       cout << d << endl;  // write a double
       cout << c << endl;  // write a char

Note that the output operator can be chained (just like the assignment operator). For example, assuming that x, y, and z are declared variables, both of the following are legal C++:
The assignment operator is right associative, so the chained assignment statement is evaluated right-to left (first z is set to 0, then y is set to 0, then x is set to 0). The output operator is left-associative, so the chained output statement is evaluated left-to-right (first x is output, then y, then z).
The line:
is similar to an import statement in Java (but note that the #include does not end with a semicolon). The #include is needed to provide the definitions of cout and endl, which are both used in the example program.

TEST YOURSELF NOW

Write a C++ program that uses a loop to sum the numbers from 1 to 10 and prints the result like this:

    The sum is: xxx

Note: Use variable declarations, and a for or while loop with the same syntax as in Java.

solution

Another example program (multiple functions and forward declarations)

Usually, when you write a C++ program you will write more than just the main function. Here's an example of a program with two functions:

# include <iostream>

void print() {
    cout << "Hello world!" << endl;
}

int main() {
    print();
    return 0;
}

In this example, function main calls function print. Since print is a free function (not a method of some class), it is called just by using its name (not xxx.print()). It is important that the definition of print comes before the definition of main; otherwise, the compiler would be confused when it saw the call to print in main. If you do want to define main first, you must include a forward declaration of the print function (just the function header, followed by a semi-colon), like this:

# include <iostream>

void print();

int main() {
    print();
    return 0;
}

void print() {
    cout << "Hello world!" << endl;
}

Compiling and running a C++ Program

By convention, C++ source code is put in a file with the extension ".C" or ".cc" or ".cpp". The name itself can be whatever you want (there is no need to match a class name as in Java).

To create an executable file named a.out for the C++ program in foo.C, type:

g++ foo.C

To create an executable file named foo for the C++ program in foo.C, type:

g++ foo.C -o foo

(The -o flag tells the compiler what name to use for the resulting executable; you use can use the -o flag to give the executable any name you want, with or without an extension. If you are working on a Unix machine, do not name your executable test -- there is a Unix utility with that name, so when you think you are running your program, you are really calling the utility, and that can be very confusing!)

To run your program, just type the name of the executable. For example, to run the executable named foo, just type foo at the prompt.

If your program is in more than one file, you can create an executable by compiling the whole program at once; e.g., if your program is in the two files, main.C and foo.C:

g++ main.C foo.C

You can also create individual object files that can later be linked together to make an executable. To tell the compiler that you want object code only, not an executable, use the -c flag. For example:

g++ -c main.C

This will create an object file named main.o. Once you have object files for all of your source files, you link them like this:

g++ main.o foo.o

Since no -o was used, this creates an executable named a.out

The advantage of creating individual object files is that you don't need to recompile all files if you change just one.

C++ Types

C++ has a larger set of types than Java, including: primitive types, constants, arrays, enumerations, structures, unions, pointers, and classes. You can also define new type names using the typedef facility.

C++ classes will be discussed in a separate set of notes. The other types are discussed below.

Primitive (built-in) types
The primitive C++ types are essentially the same as the primitive Java types: int, char, bool, float, and double (note that C++ uses bool, not boolean). Some of these types can also be qualified as short, long, signed, or unsigned, but we won't go into that here.

Unfortunately, (to be consistent with C) C++ permits integers to be used as booleans (with zero meaning false and non-zero meaning true). This can lead to hard-to-find errors in your code like:

if (x = 0) ...

The expression "x = 0" sets variable x to zero and evaluates to false, so this code will compile without error. One trick you can use to avoid this is to write all comparisons between a constant and a variable with the constant first; e.g.:

if (0 == x) ...

Since an attempt to assign to a constant causes a syntax error, code like:

if (0 = x) ...

will not compile.

Constants
You can declare a variable of any of the primitive types to be a constant, by using the keyword const. For example:

  const int MAXSIZE = 100;
  const double PI = 3.14159;
  const char GEE = 'g';

Constants must be initialized as part of their declarations, and their values cannot be changed.

Arrays
Unfortunately, C++ arrays lack many of the nice features of Java arrays:

There is no length operation.
There is no runtime check for an out-of-bounds index.
It is not possible to assign from one array to another, and there is no arraycopy operation.
It is not possible to change the size of an array dynamically.

In many cases it is better to use a vector class (either the one provided by the C++ standard template library -- more on this later -- or one you write yourself). A vector class can be defined to include Java-like features.

Here's an example C++ array declaration:

  int A[10];

This declares an array of 10 integers named A. Note that the brackets must follow the variable name (they cannot precede it as they can in Java). Note also that the array size must be part of the declaration (except for array parameters, more on this in a minute), and the size must be an integer expression that evaluates to a non-negative number.

Because in C++ an array declaration includes its size, there is no need to call new as is done in Java. Declaring an array causes storage to be allocated.

Multi-dimensional arrays are defined using one pair of brackets and one size for each dimension as in Java; for example, the following declares M to be a 10-by-20 array of ints:

  int M[10][20];

As mentioned above, array parameters are declared without a size. This allows a function to be called with actual array parameters of different sizes. However, since there is no length operation, it is usually necessary to pass the size of the array as another parameter. For example:

  void f(int A[], int size)

TEST YOURSELF NOW

Write a C++ function named ArrayEq that has 3 parameters: two arrays of integers, and their size (the same size for both arrays). The function should return true if and only if the arrays contain the same values in the same order.

solution

Enumerations
New types with a fixed (usually small) set of possible values can be defined using an enum declaration, which has the following form:

  enum type-name { list-of-values };

For example:

  enum Color { red, blue, yellow };

This defined a new type called Color. A variable of type Color can have one of 3 values: red, blue, or yellow. For example, here's how to declare and assign to a variable of type Color:

  Color col;
  col = blue;

Structures
A C++ structure is:

a way to group logically related values together
very similar to a C++ class

Here's an example declaration:

  struct Student {
      int id;
      bool isGrad;
  };                    // note that decl must end with a ;

The structure name (Student) defines a new type, so you can declare variables of that type, for example:

  Student s1, s2;

To access the individual fields of a structure, use the dot operator:

  s1.id = 123;
  s2.id = s1.id + 1;

It is possible to have (arbitrarily) nested structures. For example:

  struct Address {
      string city;
      int zip;
  };
  
  struct Student {
      int id;
      bool isGrad;
      Address addr;
  };

Access nested structure fields using more dots:

  Student s;
  s.addr.zip = 53706;

If you get confused, think about the type of each sub-expression:

The type of s is Student, so it makes sense to be able to follow s with ".addr" (since addr is a field of struct Student).
The type of s.addr is Address, so it makes sense to be able to follow the expression "s.addr" with ".zip" (since zip is a field of struct Address).
The type of s.addr.zip is int, so it makes sense to be able to assign 123 to s.addr.zip.

TEST YOURSELF NOW

Assume that the declaration of Student given above has been made. Write a C++ function named NumGrads that has 2 parameters: an array of Students, and the size of the array. The function should return the number of students in the array who are grad students.

solution

Unions
A union declaration is similar to a struct declaration, but the fields of a union all share the same space; so really at any one time, you can think of a union as having just one "active" field. You should use a union when you want one variable to be able to have values of different types. For example, assume that only undergrads care about their GPA, and only grads can be RAs. In that case, we might use the following declarations:

  union Info {
      double GPA;
      bool   isRA;
  };
  
  struct Student {
      int id;
      bool isGrad;
      Info inf;
  };

Of course, we could just add two new fields to the Student structure (both a GPA field and an isRA field), but that would be a bit wasteful of space, since only one of those fields would be valid for any one student. Using a union also makes it more clear that only one of those two fields is meaningful for each student.

It is important to realize that it is up to you as the programmer to keep track of which field of a union is currently valid. For example, if you write the following bad code:

  Info inf;
       
  inf.GPA = 3.7;
  if (inf.isRA) ...

you will get neither a compile-time nor a run-time error. However, this code makes no sense, and there is no way to be sure what will happen -- the value of inf.isRA depends on how 3.7 is represented and how a bool is represented.

Pointers
In Java, every array and class is really a pointer (and you need to use the "new" operator to allocate storage for the actual object). This is not true in C++ -- in C++ you must declare pointers explicitly, and it is possible to have a pointer to any type (including pointers to pointers!).

A pointer variable either contains an address or the special value NULL (note that it is upper-case in C++, not lower-case as in Java). Also, the value NULL is defined in stdlib.h, so you must #include <stdlib.h> in order to use NULL.

A valid non-NULL pointer can contain:

the address of a variable
the address of some dynamically allocated memory
the address of a function

The first is not very useful; we'll look at some examples just to see how it works, but you probably won't use that kind of pointer in your code. The third is sometimes useful, but we won't discuss it here.

Here are some examples of how to declare pointers:

  int *p;     // p is a pointer to an int, currently uninitialized
  double *d;  // d is a pointer to a double, currently uninitialized
  int *q, x;  // q is a pointer to an int; x is just an int

Note that if you want to declare several pointers at once, you must repeat the star for each of them (e.g., in the example above, x is just an int not a pointer, since there is no star in front of x).

There are several ways to give a pointer p a value:

Assign p the special value NULL:
Assign p the address of some variable, using the "address-of" operator: &. For example:

Assign p to be the (first) address in a chunk of storage allocated dynamically using the "new" operator:

  p = new int;  // p points to newly allocated (uninitialized) memory
                // note that the argument to the new operator is just the
		// type of the storage to be allocated; there are no
		// parentheses after the type as there would be in Java

Assign p to be the value currently stored in another pointer of the same type:

  int *p, *q;
  q = ...
  p = q;   // p and q both point to the same location

To access the location pointed to by a pointer p, use the star operator. For example:

  x = 3;
  p = &x;      // p points to x
  cout << *p;  // the value in the location pointed to by p is output;
               // since p points to x, it is x's value, 3, that is output
  *p = 5;      // the value in the location pointed to by p is set to 5;
               // since p points to x, now x == 5

If you use == or != to compare two pointers, the values that are compared are the addresses stored in the pointers, not the values that are pointed to. For example:

  int *p, *q;
  p = new int;
  q = new int;
  *p = 3;
  *q = 3;

After this code is executed, p and q point to different locations, so the expression (p == q) evaluates to false. However, the values in the two locations are the same, so the expression (*p == *q) evaluates to true.

In C++ there is no garbage collection (as there is in Java). This means that if you allocate memory from the heap (using new), that memory cannot be reused unless you deallocate it explicitly. Deallocation is done using the delete operator. For example:

  int *p = new int;   // p points to newly allocated memory
  ...
  delete p;           // the memory that was pointed to by p has been
                      // returned to free storage

In general, every time you use the new operator (to allocate storage from the heap), you should think about where to use a corresponding delete operator (to return that storage to the heap). If you fail to return the storage, your program will still work, but it may use more memory than it actually needs.

Beware of the following:

Never dereference an uninitialized pointer:

      int *p; // p is an uninitialized pointer
      *p = 3; // bad!

Never dereference a NULL pointer:

      int *p = NULL;  // p is a NULL pointer
      *p = 3;         // bad!

Never dereference a deleted pointer:

      int *p = new int;
      delete p;
      *p = 5;              // bad!

Never dereference a dangling pointer (a pointer to a location that was pointed to by another pointer that has been deleted):

      int *p, *q;
      p = new int;
      q = p;           // p and q point to the same location
      delete q;        // now p is a dangling pointer
      *p = 3;          // bad!

If p is the only pointer that points to dynamically allocated memory, and you reassign p without first deleting it, that memory will be lost (your code will have a storage leak):

Example 2 (dereferencing a NULL pointer) will probably cause a runtime error. The other examples will probably not cause runtime errors, but probably will cause your program to work incorrectly (e.g., you might get a runtime error later in the program execution, or the wrong values might be computed). Logical errors involving pointers are often difficult to track down. Using a tool like purify (more on this later) can help.

TEST YOURSELF NOW

In the following code, identify each expression that includes a dereference of a pointer that may be uninitialized, may be NULL, may be deleted, or may be a dangling pointer. Also identify the uses of new that are potential storage leaks (i.e., the memory that is allocated may not be returned to free storage).

    int x, *p, *q;
    p = new int;
    cin >> x;
    if (x > 0) q = &x;
    *q = 3;
    q = p;
    p = new int;
    *p = 5;
    delete q;
    q = p;
    *q = 1;
    if (x == 0) delete q;
    (*p)++;

solution

A common use of pointers in C++ is to point to a dynamically allocated structure. If variable p points to a structure with a field named f, there are two ways to access that field:

 (*p).f    // *p is the structure itself; (*p).f is the f field

 p->f      // this is just a shorthand for (*p).f

Here's an example of a typical piece of code used to create a linked list of integers whose values are read from the standard input (note that the list is built from right to left, so the values in the final list are in the opposite order to that in which they are read in):

  struct ListNode {
      int data;
      ListNode *next;  // pointer to the next node in the list
  };
  ListNode *head = NULL;  // head points to the list of nodes, initially empty
  int k;
  while (cin >> k) {

    // create a new list node containing the value read in
      ListNode *tmp = new ListNode; 
      tmp->data = k;

    // attach the new node to the front of the list
      tmp->next = head;
      head = tmp;
  }

TEST YOURSELF NOW

Write a function named PrintList that has one parameter of type "pointer to ListNode", that points to the first node in a linked list of integers like the one created by the code fragment given above. PrintList should print (to cout) each value in the list, one per line.
Write a recursive function PrintReverse that has one parameter of type "pointer to ListNode", that points to the first node in a linked list of integers. PrintReverse should print (to cout) each value in the list in reverse order, one per line.

solution

Another common use of pointers in C++ is to point to a dynamically allocated array of values. The new operator can be used to allocate either a single object of a given type or an array of objects. For example, to allocate an array of integers of size 10, and to set variable p to point to the beginning of the array, use:

  int *p = new int [10];

To return the array to free storage use:

  delete[] p;

(Note that when you free an array it is very important to include the square brackets, but that you must not include them when you are freeing non-array storage.) If a pointer is set to point to the beginning of the allocated array, it can be treated as if it were an array instead of a pointer. For example:

  int *p = new int[10];  // p points to an array of 10 ints
  p[0] = 5;
  p[1] = 10;

You can also access each array element by incrementing the pointer; for example:

  int *p = new int [10];
  *p = 5;         // same as p[0] = 5 above
  p++;            // changes p to point to the next item in the array
  *p = 10;        // same as p[1] = 10 above

However, this code is not as clear as the code given above that treats p as if it were an array. In general, incrementing and decrementing pointers is more likely to lead to logical errors, and thus should be avoided.

Typedef
You can define a new type with the same range of values and the same operations as an existing type using typedef. For example:

  typedef double Dollars;
  Dollars hourlyWage = 10.50;

This code defines a new type named Dollars, that is the same as type double. The reason to use a typedef is to emphasize the intended purpose of a type (this is similar to the reason for choosing good variable names -- to emphasize what they represent).

Summary

A C++ program consists of one or more functions, exactly one of which must be named main. When the program is run, the main function is called.
Use g++ to compile your program (use the -o flag to cause the executable to be given a name other than "a.out"). Run the program by typing the name of the executable.
C++ includes essentially the same set of primitive types as Java. In addition, you can create arrays, enumerations, structures, unions, pointers, and classes. You can create synonyms for existing types using typedef.

Solutions to Self-Study Questions

Test Yourself #1

#include <iostream>

int main() {
int sum = 0;
  for (int k=1; k<11; k++)   {
    sum += k;
  }
  cout << "The sum is: " << sum << endl;
  return 0;
}

Test Yourself #2

bool ArrayEq( int A1[], int A2[], int size ) {
  for (int k=0; k<size; k++) {
    if (A1[k] != A2[k]) return false;
  }
  return true;
}

Test Yourself #3

struct Student {
    int id;
    bool isGrad;
    Address addr;
};

int NumGrads( Student S[], int size ) {
  int num = 0;
  for (int k=0; k<size; k++) {
    if (S[k].isGrad) num++;

  }
  return num;
}

Test Yourself #4

int x, *p, *q;
   p = new int;
   cin >> x;
   if (x > 0) q = &x;
   *q = 3;                 <--- deref of possibly uninitialized ptr q
   q = p;
   p = new int;            <--- potential storage leak (if x != 0 this
                                memory will not be returned to free storage)
   *p = 5;
   delete q;
   *q = 1;                 <--- deref of deleted ptr q
   q = p;
   if (x == 0) delete q;
   (*p)++;                <--- deref of possibly dangling ptr p (if x is zero)

Test Yourself #5

struct ListNode {
    int data;
    ListNode *next;  // pointer to the next node in the list
};

void PrintList( ListNode *L ) {
  ListNode *tmp = L;
  while (tmp != NULL) {
    cout << tmp->data << endl;
    tmp = tmp-> next;
  }
}

void PrintReverse( ListNode *L ) {
  if (L == NULL) return;
  PrintReverse( L->next );
  cout << L->data << endl;
}

Introduction to C++ For Java Programmers

Contents

Basic Terminology and C++ Functions

Compiling and running a C++ Program