Introduction to C++ For Java Programmers



Basic Terminology and C++ Functions

In Java, we refer to the fields and methods of a class. In C++, we use the terms data members and member functions. Furthermore, in Java, every method must be inside some class. In contrast, a C++ program can also include free functions - functions that are not inside any class.

Every C++ program must include one free function named main. This is the function that executes when you run the program.

Here's a simple example C++ program:
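A minimal program of this kind might look like the following sketch. The message text is an arbitrary choice, and the using namespace std; line is an assumption made here so that cout and endl can be written without the std:: prefix.

```cpp
#include <iostream>
using namespace std;   // assumption: allows plain cout and endl instead of std::cout and std::endl

// main is a free function; execution starts here.
int main() {
    cout << "Hello, world!" << endl;   // write a line to the standard output
    return 0;                          // zero indicates normal termination
}
```

When run, this sketch writes Hello, world! followed by a newline.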

Things to note:

  1. Function main should have return type int; you should return zero to indicate normal termination and any non-zero value to indicate that an error occurred (e.g., bad data was read, an attempt to open a non-existent file was made).

  2. Function main is not required to have any parameters. However, if you intend your program to be run with command-line arguments, you should declare main as follows: int main(int numargs, char *args[]). The first parameter is the number of command-line arguments, including the name of the executable itself (more about executable files below). The second parameter is an array of C-style strings (a C-style string is a sequence of characters, terminated with a special null character). Note that, as with all formal parameters, you can use any names you want in place of "numargs" and "args". For historical reasons, some programmers use "argc" and "argv", but those are not very informative names.

  3. To write to the standard output (usually the computer screen), use cout with the output operator (<<); for example, cout << "Hello"; writes the word Hello. Use endl (or '\n' or "\n") to write a newline.

  4. The output operator (<<) is overloaded; you can use it to write a value of any of the primitive types. For example, cout << 10 writes an int, cout << 3.5 writes a double, and cout << 'a' writes a char.

  5. Note that the output operator can be chained (just like the assignment operator). For example, assuming that x, y, and z are declared variables, both of the following are legal C++:
         x = y = z = 0;
         cout << x << y << z;
     The assignment operator is right-associative, so the chained assignment statement is evaluated right to left (first z is set to 0, then y is set to 0, then x is set to 0). The output operator is left-associative, so the chained output statement is evaluated left to right (first x is output, then y, then z).

  6. The line #include <iostream> is similar to an import statement in Java (but note that the #include does not end with a semicolon). The #include is needed to provide the definitions of cout and endl, which are both used in the example program.


TEST YOURSELF NOW

Write a C++ program that uses a loop to sum the numbers from 1 to 10 and prints the result like this:

    The sum is: xxx
Note: Use variable declarations, and a for or while loop with the same syntax as in Java.

solution


Another example program (multiple functions and forward declarations)

Usually, when you write a C++ program you will write more than just the main function. Here's an example of a program with two functions:
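A two-function program along these lines might look like the sketch below. Only the names main and print come from the notes; the message text, the void return type, and the empty parameter list of print are assumptions made for illustration.

```cpp
#include <iostream>
using namespace std;   // assumption: allows plain cout and endl

// print is a free function: it does not belong to any class.
// It is defined before main so that the compiler already knows about it
// when it reaches the call inside main.
void print() {
    cout << "Hello, world!" << endl;
}

int main() {
    print();    // a free function is called by name alone
    return 0;   // zero indicates normal termination
}
```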

In this example, function main calls function print. Since print is a free function (not a method of some class), it is called just by using its name (not xxx.print()). It is important that the definition of print comes before the definition of main; otherwise, the compiler would report an error when it saw the call to print in main, because print would not yet have been declared. If you do want to define main first, you must include a forward declaration of the print function (just the function header, followed by a semicolon) before main. For instance, if print takes no parameters and returns nothing, its forward declaration is: void print();

Compiling and running a C++ Program

By convention, C++ source code is put in a file with the extension ".C" or ".cc" or ".cpp". The name itself can be whatever you want (there is no need to match a class name as in Java).

To create an executable file named a.out for the C++ program in foo.C, type the following (this assumes the GNU compiler, g++; substitute the name of your own compiler if it differs):

    g++ foo.C

To create an executable file named foo for the C++ program in foo.C, type:

    g++ foo.C -o foo

(The -o flag tells the compiler what name to use for the resulting executable; you can use the -o flag to give the executable any name you want, with or without an extension. If you are working on a Unix machine, do not name your executable test -- there is a Unix utility with that name, so when you think you are running your program, you are really calling the utility, and that can be very confusing!)

To run your program, just type the name of the executable. For example, to run the executable named foo, just type foo at the prompt (or ./foo if the current directory is not in your search path).

If your program is in more than one file, you can create an executable by compiling the whole program at once; e.g., if your program is in the two files main.C and foo.C (and assuming the GNU compiler, g++):

    g++ main.C foo.C

You can also create individual object files that can later be linked together to make an executable. To tell the compiler that you want object code only, not an executable, use the -c flag. For example:

    g++ -c main.C

This will create an object file named main.o. Once you have object files for all of your source files, you link them like this:

    g++ main.o foo.o

Since no -o was used, this creates an executable named a.out.

The advantage of creating individual object files is that you don't need to recompile all files if you change just one.

C++ Types

C++ has a larger set of types than Java, including: primitive types, constants, arrays, enumerations, structures, unions, pointers, and classes. You can also define new type names using the typedef facility.

C++ classes will be discussed in a separate set of notes. The other types are discussed below.

Primitive (built-in) types
The primitive C++ types are essentially the same as the primitive Java types: int, char, bool, float, and double (note that C++ uses bool, not boolean). Some of these types can also be qualified as short, long, signed, or unsigned, but we won't go into that here.

Unfortunately (to be consistent with C), C++ permits integers to be used as booleans (with zero meaning false and non-zero meaning true). This can lead to hard-to-find errors in your code like:

if (x = 0) ...
The expression "x = 0" sets variable x to zero and evaluates to false, so this code will compile without error. One trick you can use to avoid this is to write all comparisons between a constant and a variable with the constant first; e.g.:
if (0 == x) ...
Since an attempt to assign to a constant causes a compile-time error, code like:
if (0 = x) ...
will not compile.

Constants
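You can declare a variable of any of the primitive types to be a constant by using the keyword const. For example, const int MAXSIZE = 100; declares an integer constant (the name MAXSIZE and the value 100 are just an illustration).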

Constants must be initialized as part of their declarations, and their values cannot be changed.

Arrays
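Unfortunately, C++ arrays lack many of the nice features of Java arrays. In particular, a C++ array does not record its own length (there is no length field), and array indexes are not checked against the bounds of the array, so an out-of-bounds access is not reported as an error.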

In many cases it is better to use a vector class (either the one provided by the C++ standard template library -- more on this later -- or one you write yourself). A vector class can be defined to include Java-like features.

Here's an example C++ array declaration:

    int A[10];

This declares an array of 10 integers named A. Note that the brackets must follow the variable name (they cannot precede it as they can in Java). Note also that the array size must be part of the declaration (except for array parameters, more on this in a minute), and the size must be a constant integer expression that evaluates to a positive number.

Because in C++ an array declaration includes its size, there is no need to call new as is done in Java. Declaring an array causes storage to be allocated.

Multi-dimensional arrays are defined using one pair of brackets and one size for each dimension; for example, the following declares M to be a 10-by-20 array of ints:

    int M[10][20];

As mentioned above, array parameters are declared without a size. This allows a function to be called with actual array parameters of different sizes. However, since there is no length operation, it is usually necessary to pass the size of the array as another parameter. For example:
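A function of this kind might look like the following sketch. The function name sumArray and the parameter names are hypothetical; the point is only that the array parameter is declared without a size and the size is passed separately.

```cpp
// Return the sum of the first 'size' elements of A.
// A is declared without a size, so arrays of any length can be passed;
// the caller must supply the correct size, because the array itself
// does not know how long it is.
int sumArray(int A[], int size) {
    int sum = 0;
    for (int k = 0; k < size; k++) {
        sum += A[k];
    }
    return sum;
}
```

The same function can then be called with arrays of different sizes, for example sumArray(small, 3) and sumArray(large, 5), where small and large are arrays of 3 and 5 ints.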


TEST YOURSELF NOW

Write a C++ function named ArrayEq that has 3 parameters: two arrays of integers, and their size (the same size for both arrays). The function should return true if and only if the arrays contain the same values in the same order.

solution


Enumerations
New types with a fixed (usually small) set of possible values can be defined using an enum declaration, which has the following form:

    enum type-name { value1, value2, ..., valueN };

For example:

    enum Color { red, blue, yellow };

This defines a new type called Color. A variable of type Color can have one of 3 values: red, blue, or yellow. For example, here's how to declare and assign to a variable of type Color:

    Color col;
    col = blue;

Structures
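A C++ structure is a way to group related data items, possibly of different types, under one name. As used in these notes, it is like a Java class that has only public fields and no methods.

A structure declaration gives the structure a name (such as Student) and lists its fields. The structure name defines a new type, so you can declare variables of that type.

To access the individual fields of a structure, use the dot operator.

It is possible to have (arbitrarily) nested structures; that is, a field of one structure can itself be a structure. Access nested structure fields using more dots. If you get confused, think about the type of each sub-expression.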
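The declarations and field accesses described above can be sketched as follows. Only the name Student comes from the notes; the Address structure, all field names (id, isGrad, addr, city, zip), and the helper function makeStudent are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical nested structure.
struct Address {
    std::string city;
    int zip;
};

// Hypothetical fields; the notes only establish that Student is a
// structure and that some students are grad students.
struct Student {
    int id;
    bool isGrad;
    Address addr;   // a field whose type is another structure
};

// Build a Student, using the dot operator to reach each field.
Student makeStudent(int id, bool isGrad, std::string city, int zip) {
    Student s;             // s has type Student
    s.id = id;             // s.id has type int
    s.isGrad = isGrad;     // s.isGrad has type bool
    s.addr.city = city;    // s.addr has type Address; s.addr.city is a string
    s.addr.zip = zip;      // s.addr.zip has type int
    return s;
}
```

Reading the comments from left to right shows the "type of each sub-expression" idea: each extra dot selects a field of the structure named by the expression before it.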


TEST YOURSELF NOW

Assume that the declaration of Student given above has been made. Write a C++ function named NumGrads that has 2 parameters: an array of Students, and the size of the array. The function should return the number of students in the array who are grad students.

solution


Unions
A union declaration is similar to a struct declaration, but the fields of a union all share the same space; so really at any one time, you can think of a union as having just one "active" field. You should use a union when you want one variable to be able to have values of different types. For example, assume that only undergrads care about their GPA, and only grads can be RAs. In that case, we might use the following declarations:
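Declarations of this kind might look like the sketch below. The field names GPA, isRA, and inf appear in the notes; the union name Info, the other Student fields, and the two helper functions are assumptions made for illustration.

```cpp
// Info is a hypothetical name for the union type.
union Info {
    double GPA;   // meaningful only for undergrads
    bool isRA;    // meaningful only for grads
};

struct Student {
    int id;
    bool isGrad;  // records which field of inf is currently valid
    Info inf;     // GPA and isRA share the same storage
};

// Set up an undergrad: afterwards only inf.GPA is valid.
Student makeUndergrad(int id, double gpa) {
    Student s;
    s.id = id;
    s.isGrad = false;
    s.inf.GPA = gpa;
    return s;
}

// Set up a grad student: afterwards only inf.isRA is valid.
Student makeGrad(int id, bool isRA) {
    Student s;
    s.id = id;
    s.isGrad = true;
    s.inf.isRA = isRA;
    return s;
}
```

In this sketch the isGrad field doubles as the record of which union field is valid, so code that reads inf should check isGrad first.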

Of course, we could just add two new fields to the Student structure (both a GPA field and an isRA field), but that would be a bit wasteful of space, since only one of those fields would be valid for any one student. Using a union also makes it more clear that only one of those two fields is meaningful for each student.

It is important to realize that it is up to you as the programmer to keep track of which field of a union is currently valid. For example, suppose inf is a union with a GPA field and an isRA field. If you write bad code that assigns 3.7 to inf.GPA and then tests inf.isRA, you will get neither a compile-time nor a run-time error. However, this code makes no sense, and there is no way to be sure what will happen -- the value of inf.isRA depends on how 3.7 is represented and how a bool is represented.

Pointers
In Java, every variable of an array or class type really holds a pointer (and you need to use the "new" operator to allocate storage for the actual object). This is not true in C++ -- in C++ you must declare pointers explicitly, and it is possible to have a pointer to any type (including pointers to pointers!).

A pointer variable contains either an address or the special value NULL (note that it is upper-case in C++, not lower-case as in Java). The value NULL is defined in several standard headers, stdlib.h among them, so you must include one of them (e.g., #include <stdlib.h>) in order to use NULL.

A valid non-NULL pointer can contain:

The first is not very useful; we'll look at some examples just to see how it works, but you probably won't use that kind of pointer in your code. The third is sometimes useful, but we won't discuss it here.

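Here are some examples of how to declare pointers:

    int *p;          // p is a pointer to an int
    int **pp;        // pp is a pointer to a pointer to an int
    int *q, *r, x;   // q and r are pointers to ints; x is just an int

Note that if you want to declare several pointers at once, you must repeat the star for each of them (e.g., in the last declaration above, x is just an int, not a pointer, since there is no star in front of x).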

There are several ways to give a pointer p a value:

  1. Assign p the special value NULL: p = NULL;
  2. Assign p the address of some variable, using the "address-of" operator (&). For example, if x is a variable of the type that p points to: p = &x;
  3. Assign p to be the (first) address in a chunk of storage allocated dynamically using the "new" operator: p = new int;
  4. Assign p to be the value currently stored in another pointer q of the same type: p = q;
To access the location pointed to by a pointer p, use the star operator: *p denotes the location itself, so it can be read or assigned like a variable (e.g., *p = 5; stores 5 in the location that p points to).

If you use == or != to compare two pointers, the values that are compared are the addresses stored in the pointers, not the values that are pointed to. For example, suppose that p and q each point to separately allocated storage, and that the same value has been stored in both locations. Then p and q point to different locations, so the expression (p == q) evaluates to false. However, the values in the two locations are the same, so the expression (*p == *q) evaluates to true.
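The difference between comparing pointers and comparing the values they point to can be sketched as follows. The structure Comparison and the function comparePointers are hypothetical names introduced only to package the two results.

```cpp
// Holds the results of the two comparisons made below.
struct Comparison {
    bool sameAddress;   // result of (p == q)
    bool sameValue;     // result of (*p == *q)
};

Comparison comparePointers() {
    int *p = new int;   // p points to one chunk of heap storage
    int *q = new int;   // q points to a different chunk
    *p = 5;             // store 5 in the location p points to
    *q = 5;             // store 5 in the location q points to

    Comparison result;
    result.sameAddress = (p == q);    // compares the addresses: false
    result.sameValue = (*p == *q);    // compares the stored values: true

    delete p;           // return both chunks to the heap
    delete q;
    return result;
}
```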

In C++ there is no garbage collection (as there is in Java). This means that if you allocate memory from the heap (using new), that memory cannot be reused unless you deallocate it explicitly. Deallocation is done using the delete operator. For example, if p points to storage that was allocated using new, then delete p; returns that storage to the heap.

In general, every time you use the new operator (to allocate storage from the heap), you should think about where to use a corresponding delete operator (to return that storage to the heap). If you fail to return the storage, your program will still work, but it may use more memory than it actually needs.

Beware of the following:

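  1. Never dereference an uninitialized pointer:
         int *p;
         *p = 3;        // bad: p has not been given a value
  2. Never dereference a NULL pointer:
         int *p = NULL;
         *p = 3;        // bad: p does not point to any location
  3. Never dereference a deleted pointer:
         int *p = new int;
         delete p;
         *p = 3;        // bad: the storage has been returned to the heap
  4. Never dereference a dangling pointer (a pointer to a location that was pointed to by another pointer that has been deleted):
         int *p = new int;
         int *q = p;
         delete q;
         *p = 3;        // bad: p is now a dangling pointer
  5. If p is the only pointer that points to dynamically allocated memory, and you reassign p without first deleting it, that memory will be lost (your code will have a storage leak):
         int *p = new int;
         p = NULL;      // bad: the storage allocated above can no longer be reached or deleted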
Example 2 (dereferencing a NULL pointer) will probably cause a runtime error. The other examples will probably not cause runtime errors, but probably will cause your program to work incorrectly (e.g., you might get a runtime error later in the program execution, or the wrong values might be computed). Logical errors involving pointers are often difficult to track down. Using a tool like purify (more on this later) can help.


TEST YOURSELF NOW

In the following code, identify each expression that includes a dereference of a pointer that may be uninitialized, may be NULL, may be deleted, or may be a dangling pointer. Also identify the uses of new that are potential storage leaks (i.e., the memory that is allocated may not be returned to free storage).

solution


A common use of pointers in C++ is to point to a dynamically allocated structure. If variable p points to a structure with a field named f, there are two ways to access that field:

  1.  (*p).f    // *p is the structure itself; (*p).f is the f field 
  2.  p->f      // this is just a shorthand for (*p).f 
Here's an example of a typical piece of code used to create a linked list of integers whose values are read from the standard input (note that the list is built from right to left, so the values in the final list are in the opposite order to that in which they are read in):


TEST YOURSELF NOW

  1. Write a function named PrintList that has one parameter of type "pointer to ListNode", that points to the first node in a linked list of integers like the one created by the code fragment given above. PrintList should print (to cout) each value in the list, one per line.

  2. Write a recursive function PrintReverse that has one parameter of type "pointer to ListNode", that points to the first node in a linked list of integers. PrintReverse should print (to cout) each value in the list in reverse order, one per line.

solution


Another common use of pointers in C++ is to point to a dynamically allocated array of values. The new operator can be used to allocate either a single object of a given type or an array of objects. For example, to allocate an array of integers of size 10, and to set variable p to point to the beginning of the array, use:

    int *p = new int[10];

To return the array to free storage use:

    delete [] p;

(Note that when you free an array it is very important to include the square brackets, but that you must not include them when you are freeing non-array storage.)

If a pointer is set to point to the beginning of the allocated array, it can be treated as if it were an array instead of a pointer; for example, p[3] = 7; stores 7 in the fourth element. You can also access each array element by incrementing the pointer, so that it points to each element in turn. However, such code is not as clear as code that treats p as if it were an array. In general, incrementing and decrementing pointers is more likely to lead to logical errors, and thus should be avoided.
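Both styles of access can be sketched in one small function. The function name sumFirst and its parameter are hypothetical; the function fills a dynamically allocated array using array indexing, then sums it by incrementing a second pointer.

```cpp
// Allocate an array of n ints holding 0, 1, ..., n-1, and return their sum.
int sumFirst(int n) {
    int *p = new int[n];           // p points to the beginning of the array

    // Style 1: treat p as if it were an array.
    for (int k = 0; k < n; k++) {
        p[k] = k;
    }

    // Style 2: walk through the array by incrementing a pointer.
    // A copy (q) is incremented so that p still holds the original
    // address when the array is deleted.
    int sum = 0;
    int *q = p;
    for (int k = 0; k < n; k++) {
        sum += *q;                 // value of the element q points to
        q++;                       // move q to the next element
    }

    delete [] p;                   // square brackets: p points to an array
    return sum;
}
```

The need for the extra pointer q is one illustration of why the indexing style is usually easier to get right.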

Typedef
You can define a new type with the same range of values and the same operations as an existing type using typedef. For example:

    typedef double Dollars;

This code defines a new type named Dollars that is the same as type double. The reason to use a typedef is to emphasize the intended purpose of a type (this is similar to the reason for choosing good variable names -- to emphasize what they represent).

Summary

Solutions to Self-Study Questions

Test Yourself #1

Test Yourself #2
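One possible solution is sketched below; it is not necessarily the answer the notes originally gave. The function name ArrayEq comes from the exercise, while the parameter names A, B, and size are arbitrary choices.

```cpp
// Return true if and only if A and B contain the same values in the
// same order. Both arrays are assumed to hold 'size' elements.
bool ArrayEq(int A[], int B[], int size) {
    for (int k = 0; k < size; k++) {
        if (A[k] != B[k]) {
            return false;   // found a position where the arrays differ
        }
    }
    return true;            // no differences found
}
```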

Test Yourself #3

Test Yourself #4

Test Yourself #5