An Exposition on B-Trees

B-Trees are a type of balanced search tree and can be though of as an extension of BSTs. Computer scientists realized that if a BST was stored on the hard drive, the time cost it would take to retrieve each node's data from the hard drive and store it in memory for processing was too expensive to just get one piece of data. So, computer scientists came up with a new form of balanced search tree which allowed us to store more data per node which helps maximize the amount of data we retrieve per memory call and also reduces our tree depth so we don't have to make as many calls to memory.

B-Tree Structure

B-Trees have a property called order. This order tells you the maximum and minimum number of children a node can have. A node can have between order and order*2 children. And a node will have one less data values than its number of children. Now, you may be asking if your node has multiple children, how do you organize them. That's where those n-1 children come in. In our node, the children are stored and listed in increasing order. The children nodes are kind of "stored" in between our values. Take for example a b-tree with order 2: so it has between 2 and 4 children and thus nodes can have between 1 and 3 values.

B-Tree Insertion Algorithm

The B-Tree Insertion Algorithm is very similar to a search operation. We'll be covering the algorithm for B-Trees of order 2 (i.e. 2-3-4 Tress). We start at the root node

  1. First, we check if our node is full. If it is, we split it and push its middle value up, adding it to the node above it or creating a new node is this node was the root. We also let our moved up middle key's former neighbours becomes its left and right children. Then we continue our search at the top node, but don't split even if it's full
  2. If our node isn't full, we check if it's a leaf. If it's a leaf, we can insert there and then place our new key, making sure to keep the keys in order. If it isn't, we check which of it's children we could fit into. That is, if we had values: a,b,c. If our node is less than a, then it would be going into the left-most child, if it was between a and b, it would be going to the child between them, and so on.
  3. Then we continue or search there. Repeating step 1 again until we reach a leaf we can insert into.