BAGS


          |   |
          |   |
         / 17  \
        /    23 \
       | 4       |
       |   11  17|
        \_______/
      

What can we do with bags?

Making and throwing away bags

Running a lottery


Summary of our "Bag of tricks" (bag operations)

Some new operations:

Specifying the Bag class

The specifications for the bag operations can be found in the file bag.h.

The details of how some of the bag operations are specified can be found in these postscript slides of Main & Savitch.

Some reminders about C++ member functions:

Implementing the Bag class

The implementations for the bag operations can be found in the file bag.C.

The details of how some of the bag operations are implemented can also be found in these postscript slides of Main & Savitch.

Other details are discussed below.

Algorithm for Bag::occurrences(int target):

  1. Initialize an answer-counter to 0
  2. For each element in the used part of the array representing the bag (data), check whether it is equal to the target item we are counting. If so, increment answer-counter.
  3. Return the answer-counter.
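
The three steps above might be sketched as follows. The class name SketchBag, its CAPACITY value, and the insert helper are assumptions made for illustration; only the data array and the count member come from these notes, and the real declarations are the ones in bag.h.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the Bag class; the real one is declared in bag.h.
class SketchBag {
public:
    static const std::size_t CAPACITY = 30;  // assumed fixed capacity

    SketchBag() : count(0) {}

    // Assumed helper so the sketch can be exercised: append one item.
    void insert(int item) {
        assert(count < CAPACITY);  // precondition: room for one more item
        data[count++] = item;
    }

    // Count how many times target appears in the used part of data.
    std::size_t occurrences(int target) const {
        std::size_t answer = 0;                    // step 1
        for (std::size_t i = 0; i < count; ++i) {  // step 2
            if (data[i] == target)
                ++answer;
        }
        return answer;                             // step 3
    }

private:
    int data[CAPACITY];
    std::size_t count;
};
```

For the bag pictured at the top of these notes (17, 23, 4, 11, 17), occurrences(17) in this sketch returns 2.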

Bag::grab() - first attempt:

  1. (Check the precondition: the bag is not empty.)
  2. Select a `random' location in the used part of the data array.
  3. Remember the item at that location.
  4. Decrement the member variable count.
  5. For each item past that location, move the item one position backwards in the array.
  6. Return the `remembered' item.
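
int Bag::grab()
{
  int x;
  size_t i;

  assert(count > 0);   // precondition: the bag is not empty (needs <cassert>)
  i = rand() % count;  // i will be in range 0<=i<count
  x = data[i];         // remember the item at that location
  count--;
  for(size_t j = i; j < count; j++)
    data[j] = data[j+1];  // slide each later item back one position
  return x;
}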
  

An example of calling Bag::grab(). Consider the Bag represented as:

8 4 17 33 4 11 16
      

If the random int i is chosen to be 0 then the resulting array is:

4 17 33 4 11 16
      

This means that every remaining item (roughly count of them) had to be `slid over' by one position. This makes grab slow for large Bags: the loop is O(n), which is slower than necessary (see the notes on computational complexity).

Building a Better Grab

Bag::grab() can be implemented more efficiently. We can simply move the last item in the data array into the position of the removed item since the order of the items does not matter:
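
int Bag::grab()
{
  int x;
  size_t i;

  assert(count > 0);      // precondition: the bag is not empty (needs <cassert>)
  i = rand() % count;     // i will be in range 0<=i<count
  x = data[i];            // remember the item at that location
  count--;
  data[i] = data[count];  // the last item fills the hole
  return x;
}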
    
The algorithm has a constant running time - it performs the same number of operations regardless of how many items are in the bag.
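
To compare the two versions of grab on a fixed example, the sketch below takes the index i as a parameter instead of calling rand(). The function names remove_by_sliding and remove_by_swap are hypothetical and are not part of bag.C; they only isolate the removal step of each version.

```cpp
#include <cassert>
#include <cstddef>

// Removal step of the first attempt: slide later items back, O(n) moves.
int remove_by_sliding(int data[], std::size_t& count, std::size_t i) {
    assert(i < count);  // precondition: i is in the used part of the array
    int x = data[i];
    --count;
    for (std::size_t j = i; j < count; ++j)
        data[j] = data[j + 1];
    return x;
}

// Removal step of the better grab: the last item fills the hole, O(1).
int remove_by_swap(int data[], std::size_t& count, std::size_t i) {
    assert(i < count);  // precondition: i is in the used part of the array
    int x = data[i];
    --count;
    data[i] = data[count];
    return x;
}
```

For the bag 8 4 17 33 4 11 16 with i = 0, the sliding version leaves 4 17 33 4 11 16 after six moves, while the second version leaves 16 4 17 33 4 11 after a single move. Both hold the same bag contents.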

We would like to be able to formalize the notion that one algorithm is more efficient than another. This motivates our next topic: computational complexity.


Bag equality

Suppose b and c are bags. Is b==c?
  1. We can quickly check whether the bags have the same total number of elements. If not, we return false. If so, we continue on to the next step.
  2. For each item x in b, check whether x occurs the same number of times in b as it does in c. If not, return false. If so, continue checking items in b.
  3. If we make it all the way through b, then b and c must be equal, so we return true.
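Running-time analysis: Step 2 examines up to n items, and for each one it counts occurrences in both bags, which is O(n) per item. So bag equality, as defined, is O(n^2) in the worst case.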
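
The three-step equality check above could be sketched as below. SketchBag, its CAPACITY value, and the insert helper are assumptions made for illustration, not the declarations from bag.h; the members are left public only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the Bag class; the real one is declared in bag.h.
struct SketchBag {
    static const std::size_t CAPACITY = 30;  // assumed fixed capacity
    int data[CAPACITY];
    std::size_t count;

    SketchBag() : count(0) {}

    void insert(int item) {        // assumed helper for building bags
        assert(count < CAPACITY);  // precondition: room for one more item
        data[count++] = item;
    }

    std::size_t occurrences(int target) const {  // one O(n) scan
        std::size_t answer = 0;
        for (std::size_t i = 0; i < count; ++i)
            if (data[i] == target)
                ++answer;
        return answer;
    }
};

// Up to n loop iterations, each doing two O(n) scans: O(n^2) worst case.
bool operator==(const SketchBag& b, const SketchBag& c) {
    if (b.count != c.count)                      // step 1
        return false;
    for (std::size_t i = 0; i < b.count; ++i) {  // step 2
        int x = b.data[i];
        if (b.occurrences(x) != c.occurrences(x))
            return false;
    }
    return true;                                 // step 3
}
```

Checking only the items of b is enough: once the totals match and every item of b has matching counts, c has no room left for an item that b lacks.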

Better Bag equality?

Can we do better? Yes. We can make arrays B,C of the size of the two bags and sort b.data into B and c.data into C. Then b==c if and only if B and C have the same elements in the same order.
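
A minimal sketch of the sorting approach follows. It uses std::sort from the standard library as the sorting routine and std::vector for the copies B and C; both are choices made here for brevity, not something these notes prescribe. SketchBag, CAPACITY, insert, and equal_by_sorting are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Bag class; the real one is declared in bag.h.
struct SketchBag {
    static const std::size_t CAPACITY = 30;  // assumed fixed capacity
    int data[CAPACITY];
    std::size_t count;

    SketchBag() : count(0) {}

    void insert(int item) {        // assumed helper for building bags
        assert(count < CAPACITY);  // precondition: room for one more item
        data[count++] = item;
    }
};

// Sort copies of both bags, then compare them position by position.
bool equal_by_sorting(const SketchBag& b, const SketchBag& c) {
    if (b.count != c.count)
        return false;
    std::vector<int> B(b.data, b.data + b.count);  // copy of b's items
    std::vector<int> C(c.data, c.data + c.count);  // copy of c's items
    std::sort(B.begin(), B.end());
    std::sort(C.begin(), C.end());
    return B == C;  // same elements in the same order: an O(n) walk
}
```

The bags themselves are left untouched because only the copies are sorted, which is where the extra memory goes.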

How fast is this version of bag equality? It takes 2 sorting operations plus a walk through arrays B and C (which is O(n)), so the complexity is 2*sort-time + O(n). In general, sorting is worse than O(n), so, asymptotically, the worst-case time is proportional to the sorting complexity.

For selection sort, bag equality would be O(n^2) again. However, there are O(n*lg(n)) sorting algorithms (which we will learn about later in the semester). In the case of the faster sorting, bag equality can be done in O(n*lg(n)).

Drawbacks of the sorting method of bag equality include the memory overhead of the additional arrays and the fact that, for small bags, it may actually be slower.