Why B Trees?

Referencing a tree node requires a memory access, and memory access operations can be slow. The goal is to keep the number of memory accesses as low as possible. The solution is to increase the number of values stored in each node and increase the number of children at each node.

Definition

B Trees are self balancing trees where each node may have multiple children and multiple values. B Trees are not binary search trees, but they do follow an order property. 2-3-4 Trees are a kind of B tree where nodes can be 2 nodes, 3 nodes, or 4 nodes. Insertion in a B tree is always done in a leaf node. When a new level is needed, the tree grows at the root level.

Properties

Rules Every 2-3-4 Tree Follows

Internal Node Types

2-Nodes, 3-Nodes, and 4-Nodes

Tree Diagram

A Sample 2-3-4 Tree

Diagram of a 2-3-4 tree showing a 4-node root with keys 10, 20, 30 and children that are 2-nodes and 3-nodes, with all leaves at the same depth.
All leaf nodes sit at the same depth. Internal nodes hold multiple values and have one more child than they have values.

Insert Operation

Pre-emptive Splitting

Pre emptive splitting is when we split full nodes along the search path during insertion. This keeps the tree balanced without needing to walk back up.

  1. Search for where the new value should go.
  2. For every node along the search path, check if the node is full (a 4 node). If it is, split it pre emptively before continuing.
  3. Once we reach a leaf node it will always have an open spot because we split any full nodes on the way down. Add the new value there.

Further Reading

For more detail on B Trees and 2-3-4 Trees including formal definitions and additional examples:

Wikipedia: 2-3-4 Tree →