Balanced Search Trees

The complexity of find, add and delete in a BST is O(log(n)) on average and O(n) in the worst case.

The worst case can happen, for example, when you insert sorted data into a BST.
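To make the worst case concrete, here is a minimal sketch of a plain (unbalanced) BST insert. The names `Node`, `insert`, `height` and `build` are illustrative assumptions, not code from these notes. It compares the height after inserting 15 keys in sorted order with the height after inserting the same keys level by level.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insert with no rebalancing (iterative to avoid deep recursion)."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right


def height(root):
    """Number of nodes on the longest root-to-leaf path (0 for an empty tree)."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


# Sorted input degenerates into a chain; level-order input gives a full tree.
sorted_height = height(build(range(1, 16)))
mixed_height = height(build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]))
print(sorted_height, mixed_height)   # prints "15 4"
```

With sorted input every node has only a right child, so a find has to walk all 15 nodes in the worst case instead of 4.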

Potential fixes:

This is known as a benevolent side effect, since only the internal representation changes and not what the user sees in terms of data (although the access costs do change).

The cost of this operation is O(n)

This would be good if you can do it. It turns out you can do this with O(log(n)) cost.

The second technique is called balanced trees (possibly more accurate to say balanced search trees). There are many variants:

We will discuss red-black trees as an example. The details are involved. Keep in mind the important result that the worst case for add, delete and find will be O(log(n)).

An important technique to rebalance the tree is a rotation:

 

Notice how rotate right makes the right side of the tree higher. Thus, if the height of the left subtree is too large then you rotate right. You rotate left for the opposite case.

How do you know you still have a BST? Look at the right rotation and assume you started with a BST on the left:

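On right: A is the right child of B ⇒ B < A

On right: T2 is in right subtree of B ⇒ T2 > B

On right: T2 is left child of A ⇒ T2 < A

All three conditions already hold in the BST on the left, where B and T2 are in the left subtree of A and T2 is in the right subtree of B.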

All balanced trees are complex:

Red-black trees

A BST is a red-black tree if (it also has the standard ordering rules of a BST):

  1. every node is either red or black
  2. The child of a leaf node is normally null. We make this into a special "null" value where the node has no data but is colored black. (These nodes are just for counting and aren't normally stored in the tree.)
  3. If a node is red then both of its children are black. (Note the converse is not true. A black node can have children of any color.)
  4. Every path from a node to a descendant leaf contains the same number of black nodes.

It may be hard to see, but these rules imply the height is no worse than 2 * log(n + 1), where the log is base 2. Here are a few intuitive reasons this is true:
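One standard way to see the bound is sketched below. The symbols are assumptions introduced for this sketch: n is the number of data nodes, h is the number of nodes on the longest path from the root down to a leaf, and b is the number of black nodes on any path from the root down to a null node, not counting the root.

```latex
\begin{align*}
n &\ge 2^{b} - 1
  && \text{(by induction: a subtree with $b$ black levels below it holds at least $2^{b}-1$ data nodes)}\\
h &\le 2b
  && \text{(rule 3: no path has two red nodes in a row, so at least half of a path is black)}\\
\Rightarrow\quad h &\le 2\log_2(n+1)
\end{align*}
```

The first line says the black nodes alone form a full tree of b levels, and the second says red nodes can at most double the length of any path.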

Rule 4 is easier to keep track of with this term: the black-height of a node x is the number of black nodes (not counting x itself) on any path from x down to a leaf (including the null black "node"). Rule 4 guarantees this number is the same for every such path.
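As a rough sketch of how the definition can be used, the function below computes the black-height of a node and reports a violation of rule 3 or rule 4 by returning `None`. The `Node` layout and the convention that a missing child (`None`) stands for the black null "node" are assumptions made for this example.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Return the black-height of node, or None if rule 3 or rule 4 fails below it.

    A missing child (None) stands for the black null "node" and has black-height 0.
    """
    if node is None:
        return 0
    counts = []
    for child in (node.left, node.right):
        if node.color == RED and child is not None and child.color == RED:
            return None                      # rule 3: red node with a red child
        h = black_height(child)
        if h is None:
            return None
        # Count the child itself if it is black (the null node always is).
        counts.append(h + (1 if child is None or child.color == BLACK else 0))
    if counts[0] != counts[1]:
        return None                          # rule 4: unequal black counts
    return counts[0]


good = Node(10, BLACK, Node(5, RED), Node(15, RED))
red_red = Node(10, BLACK, Node(5, RED, Node(2, RED)), Node(15, RED))
uneven = Node(10, BLACK, Node(5, BLACK), None)
```

Here `good` has black-height 1 (only the null nodes are counted below the root), while `red_red` breaks rule 3 and `uneven` breaks rule 4.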

 

The black-height of 26 (root) is 3

The black-height of 17 is 3

The black-height of 41 is 2

The black-height of whole tree is 4

Insertion into a red-black tree

Here are the basic steps:

  1. Apply a BST insert and color the node red. Note this does not change the black-height, so rule 4 still holds.
  2. Recolor and rotate as necessary to fix up any red node that has a red child. The cases for this are below. This is what keeps the tree balanced.
  3. At the very end, color the root of the tree black.

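Here is some notation: X is the node currently being fixed up, XP is X's parent, XGP is X's grandparent and XA is X's auncle (the sibling of XP).

Initially X will be at a leaf since this is where you insert. However, it can move up the tree as you apply the rules.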

There are 6 cases that must be covered in the recolor-and-rotate step above:

  A. XP (X's parent) is a left child
    1. XA (X's auncle) is red
    2. XA is black and X is a right child
    3. XA is black and X is a left child
  B. XP (X's parent) is a right child
    1. XA (X's auncle) is red
    2. XA is black and X is a left child
    3. XA is black and X is a right child

Let's look at the cases:

A.1-3) XP (X's parent) is a left child

A general picture is:

 

 

In A.3) X is a left child.

Case A.1) XP (X's parent) is a left child and XA (X's auncle) is red

Here are the steps:

  1. color XGP (X's grandparent) red. It must have started black because XP is red and it was a red-black tree at the start.
  2. color XP (X's parent) and XA (X's auncle) black
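The recoloring for case A.1 can be sketched as follows. The `Node` class with a `parent` pointer and the helper `link` are assumptions for the demo. Returning XGP as the next node to examine is also an assumption, based on the earlier remark that X can move up the tree.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color):
        self.key = key
        self.color = color
        self.left = self.right = self.parent = None


def link(parent, child, side):
    """Attach child as parent's "left" or "right" child (demo helper)."""
    setattr(parent, side, child)
    child.parent = parent


def case_a1(x):
    """Recolor for case A.1 and return the node to examine next."""
    xp = x.parent
    xgp = xp.parent
    xa = xgp.right               # XP is a left child, so the auncle is on the right
    xgp.color = RED
    xp.color = BLACK
    xa.color = BLACK
    return xgp                   # XGP may now have a red parent of its own


xgp, xp, xa, x = Node(30, BLACK), Node(20, RED), Node(40, RED), Node(10, RED)
link(xgp, xp, "left")
link(xgp, xa, "right")
link(xp, x, "left")
next_x = case_a1(x)
```

No rotation is needed in this case. The black count on every path through XGP is unchanged, because one black node (XGP) was traded for one black node a level lower (XP or XA).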

Here is what it looks like:

 

 

A few points:

 

Case A.2) XP (X's parent) is a left child and XA (X's auncle) is black and X is a right child

Here are the steps:

  1. apply a left rotation to X and XP
  2. apply case A.3)

Here is what it looks like:

 

 

A few points:

B becomes X

X becomes XP

 

Case A.3) XP (X's parent) is a left child and XA (X's auncle) is black and X is a left child

Here are the steps:

  1. reverse the coloring on XP (X's parent) and XGP (X's grandparent)
  2. apply a right rotation to XP and XGP
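Cases A.2 and A.3 can be sketched together, since A.2 only sets things up for A.3. This is a simplified illustration that works on the subtree rooted at XGP and returns its new root. The function name `fix_a2_a3` and the parent-free `Node` layout are assumptions made to keep the example short.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def rotate_left(node):
    top = node.right
    node.right = top.left
    top.left = node
    return top


def rotate_right(node):
    top = node.left
    node.left = top.right
    top.right = node
    return top


def fix_a2_a3(xgp):
    """XP is XGP's left child, XA is black (or null), and XP has a red child X.

    Returns the new root of the subtree that used to be rooted at XGP.
    """
    xp = xgp.left
    if xp.right is not None and xp.right.color == RED:
        # Case A.2: X is a right child, so rotate X above XP first.
        xgp.left = rotate_left(xp)
        xp = xgp.left            # the old X now plays the role of XP
    # Case A.3: swap the colors of XP and XGP, then rotate right.
    xp.color, xgp.color = xgp.color, xp.color
    return rotate_right(xgp)


# XGP = 30 (black), XP = 10 (red), X = 20 (red, a right child): case A.2.
subtree = Node(30, BLACK, Node(10, RED, None, Node(20, RED)), None)
root = fix_a2_a3(subtree)
```

After the fix the subtree root is black with two red children, so there is no longer a red node with a red child and the number of black nodes on each path is the same as before the insert.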

Here is what it looks like:

 

 

A few points:

 

B.1-3) XP (X's parent) is a right child

As you might expect, the cases are similar to those in A, but with left and right reversed.

A general picture is:

 

 

In case B.2) X is a left child.

 

Case B.1) XP (X's parent) is a right child and XA (X's auncle) is red

Here are the steps:

  1. color XGP (X's grandparent) red. It must have started black because XP is red and it was a red-black tree at the start.
  2. color XP (X's parent) and XA (X's auncle) black

Here is what it looks like:

 

 

 

Case B.2) XP (X's parent) is a right child and XA (X's auncle) is black and X is a left child

Here are the steps:

  1. apply a right rotation to X and XP
  2. apply case B.3).

Here is what it looks like:

 

 

 

Case B.3) XP (X's parent) is a right child and XA (X's auncle) is black and X is a right child

Here are the steps:

  1. reverse the coloring on XP (X's parent) and XGP (X's grandparent)
  2. apply a left rotation to XP and XGP
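Putting the six cases together, a complete insertion might look like the sketch below. It is one possible arrangement, not the implementation from any particular textbook. The class and method names are assumptions, and the A and B cases share code by treating "left" and "right" as data. `black_count` is a hypothetical checker that counts black nodes including the node itself and the null node.

```python
import random

RED, BLACK = "red", "black"


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED                      # new nodes start out red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, node, side):
        """Rotate node down toward `side`; its child on the other side moves up."""
        other = "left" if side == "right" else "right"
        top = getattr(node, other)
        moved = getattr(top, side)            # subtree that changes parent
        setattr(node, other, moved)
        if moved is not None:
            moved.parent = node
        top.parent = node.parent
        if node.parent is None:
            self.root = top
        elif node.parent.left is node:
            node.parent.left = top
        else:
            node.parent.right = top
        setattr(top, side, node)
        node.parent = top

    def insert(self, key):
        parent, cur = None, self.root
        while cur is not None:                # ordinary BST descent
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        x = Node(key)
        x.parent = parent
        if parent is None:
            self.root = x
        elif key < parent.key:
            parent.left = x
        else:
            parent.right = x
        self._fix_up(x)
        self.root.color = BLACK               # at the very end, blacken the root

    def _fix_up(self, x):
        while x.parent is not None and x.parent.color == RED:
            xp, xgp = x.parent, x.parent.parent
            side = "left" if xgp.left is xp else "right"   # A cases or B cases
            other = "right" if side == "left" else "left"
            xa = getattr(xgp, other)
            if xa is not None and xa.color == RED:         # case A.1 / B.1
                xgp.color = RED
                xp.color = xa.color = BLACK
                x = xgp                                    # X moves up the tree
            else:
                if getattr(xp, other) is x:                # case A.2 / B.2
                    self._rotate(xp, side)
                    x, xp = xp, x                          # old XP is the new X
                xp.color, xgp.color = BLACK, RED           # case A.3 / B.3
                self._rotate(xgp, other)


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def black_count(node):
    """Black nodes on every path below node, or -1 if rule 3 or rule 4 is broken."""
    if node is None:
        return 1                              # the black null "node"
    kids = (node.left, node.right)
    if node.color == RED and any(c is not None and c.color == RED for c in kids):
        return -1
    left, right = black_count(node.left), black_count(node.right)
    if left == -1 or left != right:
        return -1
    return left + (1 if node.color == BLACK else 0)


sorted_tree = RedBlackTree()
for key in range(1, 16):                      # sorted input, the plain BST worst case
    sorted_tree.insert(key)

shuffled = random.Random(0).sample(range(100), 60)   # fixed seed, distinct keys
mixed_tree = RedBlackTree()
for key in shuffled:
    mixed_tree.insert(key)
```

The loop in `_fix_up` can rely on XGP existing: the root is black at the start of every insert, so a red XP is never the root. Sorted input, which turns a plain BST into a chain, stays within the 2 * log(n + 1) height bound here.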

Here is what it looks like:

 

 

Here are a few examples of insertion:

 

 

 

 

Red-Black deletion

We will not discuss deletion in detail. However, consider the possibilities:

  1. node to delete has no children:

    S1 is an appropriate subtree so this is a red-black tree at start.

    If you delete D and it is black then you change the black-height and this is a problem. Since the node to delete has no children (null black) you don't have a red following red problem.

  2. node to delete has 1 child:

  3. node to delete has 2 children:

Here you have more possibilities. Recall that you find the in-order successor (the smallest node in the right subtree) and replace the deleted node with that.

 

Note: Weiss makes all null black "nodes" (nB) into the same node, called the null node. This node is black and has no element (data value). When doing a search, the code sets the element in the null node to the key so that it acts as a sentinel. This stops the search from going past the end of the tree.
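The sentinel idea can be sketched as follows. This illustrates the technique as described in the note and is not Weiss's actual code. The names `NULL`, `leaf` and `contains` are assumptions, and colors are left out because they do not affect searching.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


# One shared null node stands in for every missing child.
NULL = Node(None)
NULL.left = NULL.right = NULL


def leaf(key):
    return Node(key, NULL, NULL)


def contains(root, key):
    NULL.key = key                 # sentinel: the loop below is guaranteed to stop
    cur = root
    while cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur is not NULL         # stopping at NULL means the key was absent


root = Node(20, Node(10, leaf(5), leaf(15)), leaf(30))
found = contains(root, 15)
missing = contains(root, 17)
```

Because the null node always matches the key being searched for, the loop needs only one comparison per step and no separate test for the end of the tree.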

Quick Highlight

The insert, delete and find operations on red-black trees are all O(log(n)) in the worst case.