Notes on Red-Black Trees

The efficiency of binary search trees

Improvements and alternatives

Rather than rely on luck, we'd like to be able to guarantee that we can perform the search tree operations in logarithmic time. There are several ways this can be accomplished:

As read in Main & Savitch, we can use B-trees to store the data in a way that is wide rather than deep, and still allows for efficient manipulations. B-trees are not binary search trees --- they aren't even binary trees. In order to implement them we need to start from the ground up with a fresh data structure.

We'd like instead to be able to reuse binary search trees, but in such a way that their height remains logarithmic. The are several ways to do this:

Since the same binary search tree can be represented by binary trees of varying height, there must be someway to transform trees that are imbalanced into ones that are balanced.

Our goal is to reuse the same algorithms for manipulating binary search trees, but in addition, when the tree is changed (i.e. with insert or remove) we apply an additional rebalancing phase. As in:

  Modified-insert:
    0. usual BST insert
    1. rebalance the tree
Of course, this will only be useful if the rebalancing is itself efficient (we'll get back to that).

Before proceeding into further details of how to keep binary search trees balanced, let's explore more thoroughly how we can build new, but similar data structures (for example, red-black trees) from existing ones (i.e. from binary search trees). This is our motivation for introducing inheritance.

Internal and external nodes

Thus far, we have considered only internal tree nodes --- nodes that contain data. We can also consider NULL pointers to be nodes without data (and without children) these are so called external leaf nodes:
       (2)    
      /   \   
   (1)     (5) 
   / \     / \
  x   x  (3)  x
         / \
        x   x

(the x's are external nodes)

Red-black trees

A red-black tree is a binary search tree such that each node (internal and external) is assigned a color (either red or black). The coloring of the tree must satisfy the following red-black properties:
  1. Every external leaf (NULL node) is considered to be black.
  2. If a node is red, then both its children are black.
  3. For a given node, the number of black nodes between it and an external leaf is the same regardless of which path is taken.

Red-black trees are well balanced

It can be proven that the height of a red-black tree is never more than 2*lg(n+1) where n is the total number of internal nodes in the tree. Thus, the height is O(lg(n)).

The intuition: By property 2, no two red nodes can be parent and child. This means on any path from root to an external leaf, there are at least as many black nodes as there are red. Property three says that regardless of which path is taken, the same number of black nodes are encountered. So, in the worst-cast, there is a path that is twice as long as the shortest path. So the height of the tree is no worse than twice the height of the shortest path. That's fairly balanced.

Implementing red-black trees

Full code for binary search trees may be found here. Full code for red-black trees may be found here.

template <class Item, class Key>
class BST {
public:
  // Constructors
  BST();
  BST(const BST<Item,Key>& source); // copy constructor

  // Destructor
  ~BST();
  
  // Constant member functions
  bool isEmpty() const;
  size_t size() const;
  size_t height() const;
  bool search(const Key& k, Item& returnVal) const;

  // Modification member functions
  void operator=(const BST& source);
  bool insert(const Item& v, const Key& k);
  bool remove(const Key& v);
  
protected:
  typedef BinaryTree<pair<Item,Key> > BT;
  BT *root;

  // protected member functions
  Item itemOf(const pair<Item,Key>& ikp) const {return ikp.first;}
  Key keyOf(const pair<Item,Key>& ikp) const {return ikp.second;}
  bool isRightChild(BT *node) const;
  BT *removeNode(BT *node);
  BT *successor(BT *node) const;
  BT *_search(BT *t, const Key& k) const;
  BT *_insert(const pair<Item,Key>& ikp);
  void _remove(BT *node);
};
enum color {RED, BLACK};

template < class Item, class Key >
class RBT : public BST<pair<Item, color>, Key> {
public:
  ...
  bool search(const Key& k, Item& returnVal) const;
  bool insert(const Item& v, const Key& k);
  bool remove(const Key& v);
  
private:
  typedef BinaryTree<pair<pair<Item, color>,Key> > BT;

protected:
  // Protected member functions
  Item itemOf(const pair<pair<Item,color>,Key>& ick) const;
  color colorOf(BT *node) const;
  void setColor(BT *node, color c);
  BT *rotateLeft(BT *node);
  BT *rotateRight(BT *node);
  void fixInsert(BT *node);
  void fixRemove(BT *x, BT *p);
};
template <class Item, class Key>
bool RBT<Item,Key>::insert(const Item& v, const Key& k)
{
  BT *node = _insert(pair<pair<Item,color>,Key>(pair<Item,color>(v, RED), k));
  if (node != NULL)
    fixInsert(node);
  return (node != NULL);
}

Binary search property preserving transformations

rotateLeft(x):

     (x)                   (y)     
     / \     		   / \     
   'a   (y)      ===>    (x) 'c
        / \  	         / \
      'b  'c  	       'a  'b
rotateRight(y):

       (y)                (x)     
       / \                / \    	    
     (x) 'c      ===>   'a  (y) 	 
     / \  	            / \ 	 
   'a  'b                 'b  'c

Restoring red-black properties after insert

New node (other than root) always colored red.

Restoring red-black properties after remove