Lists


Contents


The List ADT

Our first ADT is the List: an ordered collection of items. Note that this doesn't mean that the objects are in sorted order, it just means that each object has a position in the List, starting with position zero.

Recall that when we think about an ADT, we think about both the external and internal views. The external view includes the "conceptual picture" and the set of "conceptual operations". The conceptual picture of a List is something like this:

      item 0
      item 1
      item 2
      . . .
      item n
and one reasonable set of operations is:

Operation Description
void add(Object ob) add ob to the end of the List
void add(int pos, Object ob) add ob at position pos in the List, moving the items originally in positions pos through size() one place to the right to make room (error if pos is less than 0 or greater than size())
boolean contains(Object ob) return true iff ob is in the List (i.e., there is an item x in the List such that x.equals(ob))
int size() return the number of items in the List
boolean isEmpty() return true iff the List is empty
Object get(int pos) return the item at position pos in the List (error if pos is less than 0 or greater than or equal to size())
Object remove(int pos) remove and return the item at position pos in the List, moving the items originally in positions pos+1 through size() one place to the left to fill in the gap (error if pos is less than 0 or greater than or equal to size())

Many other operations are possible; when designing an ADT, you should try to provide enough operations to make the ADT useful in many contexts, but not so many that it gets confusing. It is not always easy to achieve this goal; it will sometimes be necessary to add operations in order for a new application to use an existing ADT.


TEST YOURSELF #1

Question 1: What other operations on Lists might be useful? Define them by writing descriptions like those in the table above.

Question 2: Note that the second add method (the one that adds an item at a given position) can be called with a position that is equal to size(), but for the get and remove methods, the position has to be less than size(). Why?

Question 3: Another useful abstract data type is called a Map. A Map stores unique "key" values with associated information. For example, you can think of a dictionary as a Map, where the keys are the words, and the associated information is the definitions.

What are some other examples of Maps that you use?

What are the useful operations on Maps? Define them by writing descriptions like those in the table above.


Java Interfaces

The List ADT operations given in the table above describe the public interface of the List ADT, that is, the information the someone would need to know in order to be able to use a List (and keep in mind that the user of a List does not - and should not - need to know any details about the private implementation).

Now let's consider how we can create an ADT using Java. Java provides a mechanism for separating public interface from private implementation: interfaces. A Java interface (i.e., an interface as defined by the Java language) is very closely tied to the concept of a public interface. A Java interface is a reference type (like a class) that contains only (public) method signatures and class constants. Java interfaces are defined very similarly to the way classes are defined. A ListADT interface with the List ADT operations we've described would be defined as follows:

public interface ListADT {
    void add(Object ob);
    void add(int pos, Object ob);
    boolean contains(Object ob);
    int size();
    boolean isEmpty();
    Object get(int pos);
    Object remove(int pos);
}

Everything contained in a Java interface is public so you do not need to include the public keyword in the method signatures (although you can if you want). Notice that, unlike a class, an interface does not contain any method implementations (i.e., the bodies of the methods). The method implementations are provided by a Java class that implements the interface. To implement an interface named InterfaceName, a Java class must do two things:

  1. include "implements InterfaceName" after the name of the class at the beginning of the class's declaration, and
  2. for each method signature given in the interface InterfaceName, define a public method with the exact same signature

We will see very shortly how to write a class that implements the ListADT interface.

For example, the following class implements the ListADT interface (although it is a trivial implementation):

public class ListImplementation implements ListADT {
    private Object myOb;
    
    public ListImplementation() {
        myOb = null;
    }
    
    public void add(Object ob) {
        myOb = ob;
    }
    
    public void add(int pos, Object ob) {
        myOb = ob;
    }
    
    public boolean contains(Object ob) {
        return false;
    }
    
    public int size() {
        return 0;
    }
    
    public boolean isEmpty() {
        return false;
    }
    
    public Object get(int pos) {
        return null;
    }
    
    public Object remove(int pos) {

        return null;
    }
}

Lists vs. Arrays

 

In some ways, a List is similar to an array: both Lists and arrays are ordered collections of objects and in both cases you can add or access items at a particular position (and in both cases we consider the first position to be position zero). You can also find out how many items are in a List (using its size method), and how large an array is (using its length field).

The main advantage of a List compared to an array is that, whereas the size of an array is fixed when it is created (e.g., int[] A = new int[10] creates an array of integers of size 10 and you cannot store more than 10 integers in that array), the size of a List can change: the size increases by one each time a new item is added (using either version of the add method) and the size decreases by one each time an item is removed (using the remove method).

For example, suppose we have an ArrayList class that implements the ListADT interface. Here's some code that reads strings from a file called data.txt and stores them in a ListADT named L, initialized to be an ArrayList:

ListADT L = new ArrayList();
File infile = new File("data.txt");
Scanner sc = new Scanner(infile);
while (sc.hasNext()) {
    String str = sc.next();
    L.add(str);
}
If we wanted to store the strings in an array, we'd have to know how many strings there were so that we could create an array large enough to hold all of them.

One disadvantage of a List compared to an array is that whereas you can create an array of any size and then you can fill in any element in that array, a new List always has size zero and you can never add an object at a position greater than the size. For example, the following code is fine:

String[] strList = new String[10];
strList[5] = "hello"; !
but this code will cause a runtime exception:
ListADT strList = new ArrayList();
strList.add(5, "hello");       // error! can only add at position 0

Another (small) disadvantage of a List compared to an array is that you can declare an array to hold any type of item, including primitive types (e.g., int, char), but a List only holds Objects. This can make your code slightly more complicated in the following ways:


TEST YOURSELF #2

Question 1: Assume that variable L is a List containing k strings, for some integer k greater than or equal to zero. Write code that changes L to contain 2*k strings by adding a new copy of each string right after the old copy. For example, if L is like this before your code executes:

"happy", "birthday", "to", "you" then after your code executes L should be like this: "happy", "happy", "birthday", "birthday", "to", "to", "you", "you"

Question 2: Again, assume that variable L is a List containing zero or more strings. Write code that removes from L all copies of the string "hello". Be sure that your code works when L has more than one "hello" in a row.

solution


Implementing the List ADT

Now let's consider the "private" or "internal" part of a List ADT. We will consider two ways to implement the ListADT interface: using an array and using a linked list (the former is covered in this set of notes, the latter in another set of notes). Here's an outline of an ArrayList class, which uses an array to store the items in the List (the bodies of the methods are not filled in for now):

public class ArrayList implements ListADT {

  // *** fields ***
    private Object[] items; // the items in the List
    private int numItems;   // the number of items in the List

  // *** constructor ***
    public ArrayList() { ... }      

  //*** required ListADT methods ***

    // add items
    public void add(Object ob) { ... }  
    public void add(int pos, Object ob) { ... }   

    // remove items
    public Object remove(int pos) { ... }  

    // get items
    public Object get (int pos) { ... }  

    // other methods
    public boolean contains (Object ob) { ... }
    public int size() { ... }      
    public boolean isEmpty() { ... }  
}  

Note that the public methods provide the "external" view of the List ADT. Looking only at the signatures of the public methods of the ListADT interface (the method names, return types, and parameters), plus the descriptions of what they do (e.g., as provided in the table above), a programmer should be able to write code that uses ListADTs and ArrayLists. It is not necessary for a client of the ArrayList class to see how the ArrayList methods are actually implemented. The private fields and the bodies of the methods of the ArrayList class provide the "internal" view - the actual implementation - of the List ADT.

Implementing the add methods

Now let's think about how to implement some of the ArrayList methods. First we'll consider the add method that has just one parameter (the object to be added to the List). That method adds the object to the end of the List. Here is a conceptual picture of what add does:
BEFORE:  item 0    item 1    item 2    ...    item n
 

AFTER:   item 0    item 1    item 2    ...    item n   NEW ITEM

Now let's think about the actual code. First, note that the array that stores the items in the List (the items array) may be full. If it is, we'll need to:

  1. get a new, larger array, and
  2. copy the items from the old array to the new array.
Then we can add the new item to the end of the List.

Note that we'll also need to deal with a full array when we implement the other add method (the one that adds a new item in a given position). Therefore, it makes sense to implement handling that case (the two steps listed above) in a separate method, expandArray, that can be called from both add methods. Since handling a full array is part of the implementation of the ArrayList class (it is not an operation that users of the class need to know about), the expandArray method should be a private method.

Here's the code for the first add method and for the expandArray method.

//**********************************************************************
// add
//
// Given: Object ob
//
// Do:    Add ob to the end of the List
//
// Implementation:
//   If the array is full, replace it with a new, larger array;
//   store the new item after the last item
//   and increment the count of the number of items in the List.
//**********************************************************************
public void add(Object ob) {
    // if array is full, get new array of double size,
    // and copy items from old array to new array
    if (items.length == numItems) expandArray();

    // add new item; update numItems
    items[numItems] = ob;
    numItems++;
}


//**********************************************************************
// expandArray
//
// Do:
//   o Get a new array of twice the current size.
//   o Copy the items from the old array to the new array.
//   o Make the new array be this List's "items" array.
//**********************************************************************
private void expandArray() {
    Object[] newArray = new Object[numItems*2];
    for (int k = 0; k < numItems; k++) {
        newArray[k] = items[k];
    }
    items = newArray;
}

In general, when you write code it is a good idea to think about special cases. For example, does add work when the List is empty? When there is just one item? When there is more than one item? You should think through these cases (perhaps drawing pictures to illustrate what the List looks like before the call to add and how the call to add affects the List). You can then decide if the code works as is or if some modifications are needed.

Now let's think about implementing the second version of the add method (the one that adds an item at a specified position in the List). An important difference between this version and the one we already implemented is that for this version a bad value for the pos parameter is considered an error. In general, if a method detects an error that it doesn't know how to handle, it should throw an exception. (Note that exceptions should not be used for other purposes like exiting a loop.) More information about exceptions is provided in a separate set of notes.

So the first thing our add method should do is check whether parameter pos is in range and if not, throw an IndexOutOfBoundsException. If pos is OK, we must check whether the items array is full and if so, we must call expandArray. Then we must move the items in positions pos through numItems - 1 over one place to the right to make room for the new item. Finally, we can insert the new item at position pos and increment numItems. Here is the code:

//**********************************************************************
// add
//
// Given: int pos, Object ob
//
// Do:    Add ob to the List in position pos (moving items over to the right
//        to make room).
//
// Exceptions:
//        Throw IndexOutOfBoundsException if pos<0 or pos>numItems
//
// Implementation:
//   1. check for bad pos
//   2. if the array is full, replace it with a new, larger array
//   3. move items over to the right
//   4. store the new item in position pos
//   5. increment the count of the number of items in the List
// **********************************************************************
public void add(int pos, Object ob) {
    // check for bad pos and for full array
    if (pos < 0 || pos > numItems) throw new IndexOutOfBoundsException();
    if (items.length == numItems) expandArray();

    // move items over and insert new item
    for (int k = numItems; k > pos; k--) {
        items[k] = items[k-1];
    }
    items[pos] = ob;
    numItems++;
}

TEST YOURSELF #3

Question 1: Write the remove and get methods.

solution


Implementing the constructor

Now let's think about the constructor; it should initialize the fields so that the List starts out empty. Clearly, numItems should be set to zero. How about the items field? It could be set to null, but that would mean another special case in the add methods. A better idea would be to initialize items to refer to an array with some initial size, perhaps specified using a class constant (i.e., a static final field), so that the initial size could be easily changed.

Below is the code for the constructor (including the declaration of the class constant for the initial size). This code uses 10 as the initial size. In practice, the appropriate initial size will probably depend on the context in which the ArrayList class is used. The advantage of a larger initial size is that more add operations can be performed before it is necessary to expand the array (which requires copying all items). The disadvantage is that if the initial array is never filled, then memory is wasted. The requirements for memory usage and runtime performance of the application that uses the ArrayList class, as well as the expected sizes of the Lists that it uses, should be used to determine the appropriate initial size.

private static final int INITSIZE = 10;

//**********************************************************************
// ArrayList constructor
//
// initialize the List to be empty
//**********************************************************************
public ArrayList() {
    numItems = 0;
    items = new Object[INITSIZE];
}

The Java API and Lists

The Java API contains many interfaces, including interfaces that are very similar to many of the ADTs we will be talking about. For example, the Java.util package provides a List interface (with many more methods than the ones given in our ListADT interface), as well as a number of classes that implement that interface, including the ArrayList class and the LinkedList class. The ArrayList implementation we just developed outlines a way that Java's ArrayList class might be implemented.

Iterators

What are iterators?

When a client uses an abstract data type that represents a collection of items (as the List interface does), they often need a way to iterate through the collection, i.e., to access each of the items in turn. Our get method can be used to support this operation. Given a List L, we can iterate through the items in the List as follows:

for (int k = 0; k < L.size(); k++) {
    Object ob = L.get(k);
    ... do something to ob here ...
}
However, a more standard way to iterate through a collection of items is to use an Iterator, which is an interface defined in java.util. Every Java class that implements the Iterable interface (which includes classes that implement the subinterface Collection) provides an iterator method that returns an Iterator for that collection.

The way to think about an Iterator is that it is a finger that points to each item in the collection in turn. When an Iterator is first created, it is pointing to the first item; a next method is provided that lets you get the item pointed to (also advancing the pointer to point to the next item) and a hasNext method lets you ask whether you've run out of items. For example, assume that we have added an iterator method (that returns an Iterator) to our ArrayList class and that we have the following list of words:

If we create an Iterator for the List, we can picture it as follows, pointing to the first item in the List: Now if we call next we get back the word "apple" and the picture changes to: After two more calls to next (returning "pear" and "banana") we'll have: A call to hasNext now returns true (because there's still one more item we haven't accessed yet). A call to next returns "strawberry" and our picture looks like this: The iterator has fallen off the end of the List. Now a call to hasNext returns false and if we call next, we'll get a NoSuchElementException.

The Iterator interface is defined in java.util, so you need to include:

import java.util.*;
at the beginning of your program in order to use Iterators. Assuming that we've done that, and that we've added an iterator method to our ArrayList class, we can write code like the following that uses an iterator to iterate through a List L:
Iterator it = L.iterator();
while (it.hasNext()) {
    Object ob = it.next();
    ... do something to ob here ...
}

Implementing iterators

The easiest way to implement an iterator for the ArrayList class is to define a new class (e.g., called ArrayListIterator) with two fields:

  1. the List that's being iterated over, and
  2. the index of the current item (the item the "finger" is currently pointing to).
We also define a new iterator method for the ArrayList class; that method calls the ArrayListIterator class's constructor, passing the ArrayList itself:
//**********************************************************************
// iterator
//
// return an iterator for this List
//**********************************************************************
public Iterator iterator() {
    return new ArrayListIterator(this);
}
The ArrayListIterator class is defined to implement Java's Iterator interface and therefore must implement three methods: hasNext and next (both discussed above) plus an optional remove method. If you choose not to implement the remove method, you still have to define it, but it should simply throw an UnsupportedOperationException. If you choose to implement the remove method, then it should remove from the ArrayList the last (i.e., most recent) item returned by the iterator's next method or it should throw an IllegalStateException if the next method hasn't yet been called.

Here is code that defines the ArrayListIterator class (note that we have chosen not to implement the remove operation):

public class ArrayListIterator implements Iterator {
    // *** fields ***
    private ArrayList myList;  // the list we're iterating over
    private int myPos;         // the position of the next item

    // *** constructor ***
    public ArrayListIterator(List L) {
        myList = L;
        myPos = 0;
    }
    
    // *** methods ***

    public boolean hasNext() {
        return (myPos < myList.size());
    }

    public Object next() {
        if (!hasNext()) throw new NoSuchElementException();
        myPos++;
        return (myList.get(myPos-1));
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }
}

Testing

One of the most important (but unfortunately most difficult) parts of programming is testing. A program that doesn't work as specified is not a good program and in some contexts can even be dangerous and/or cause significant financial loss.

There are two general approaches to testing: black-box testing and white-box testing. For black-box testing, the testers know nothing about how the code is actually implemented (or if you're doing the testing yourself, you pretend that you know nothing about it). Tests are written based on what the code is supposed to do and the tester thinks about issues like:

So if you were to do black-box testing of the ArrayList class, you should be sure to:

For white-box testing, the testers have access to the code and the usual goal is to make sure that every line of code executes at least once. So for example, if you were to do white-box testing of the ArrayList class, you should be sure to write a test that causes the array to become full so that the expandArray method is tested (something a black-box tester may not test, since they don't even know that ArrayLists are implemented using arrays).

You will be a much better programmer if you learn to be a good tester. Therefore, testing will be an important factor in the grades you receive for the programming projects in this class.