# (a, b)-Trees

**Aim:** Allow efficient insertion, deletion, resizing of data in extremely large databases

{% embed url="<https://www.youtube.com/watch?v=aZjYr87r1b8>" %}
Great video to watch
{% endembed %}

## Invariants

In an $$(a,b)$$ tree, $$a$$ and $$b$$ are the parameters where $$2 \leq a \leq (b+1)/2$$.

Respectively, $$a$$ and $$b$$ refer to the minimum and maximum number of **children (NOT keys)** an internal (i.e., non-root, non-leaf) node can have.

| Binary Tree                      | (a,b)-trees                             |
| -------------------------------- | --------------------------------------- |
| Each node has at most 2 children | Each node can have more than 2 children |
| Each node stores exactly 1 key   | Each node can store multiple keys       |

Every $$(a,b)$$-tree must satisfy the following 3 rules:

#### 1. $$(a,b)$$ *- child Policy*

| Node Type | Min Keys  | Max Keys  | Min Children | Max Children |
| --------- | --------- | --------- | ------------ | ------------ |
| Root      | 1         | $$b - 1$$ | 2            | $$b$$        |
| Internal  | $$a - 1$$ | $$b - 1$$ | $$a$$        | $$b$$        |
| Leaf      | $$a - 1$$ | $$b - 1$$ | 0            | 0            |

With the exception of leaves, note that the **number of children is always one more than the number of keys.**

#### 2. *Key Ranges*

A **non-leaf** node (i.e, root or internal) must have one more child than its number of keys. This is to ensure that all value ranges due to its keys are covered in its subtrees. The permitted range of keys within a subtree is called its **key range.**

In particular, for a non-leaf node with $$k$$ keys and $$k + 1$$ children,

* Its keys in sorted order are $$v\_1, v\_2, \dots , v\_k$$
* The subtrees due to its keys are $$t\_1, t\_2, \dots, t\_{k+1}$$

Then,

* First child $$t\_1$$ has key range $$\leq v\_1$$
* Final child $$t\_{k+1}$$ has key range $$> v\_k$$
* All other children $$t\_i$$ where $$i \in \[2,k]$$ has key range $$(v\_{i-1}, v\_i]$$

#### 3. *Leaf Depth*

All leaf nodes must be at the same depth from the root.

You should realize that a regular BST is just an $$(a,b)$$ tree with $$a = 1, b = 2$$.

## B-Trees

Firstly, it is important to note that the B in B-Tree does not refer to Binary. Actually, no one really knows what the B stands for - it could be Balanced?

B-trees are simply $$(B,2B)$$ trees. That is, they are a subcategory of $$(a,b)$$ trees such that $$a = B, b = 2B$$. For instance, when $$B = 2$$, we have a $$(2,4)$$-tree. (This is sometimes referred to as a $$(2,3,4)$$ tree. An example of a $$(2,3,4)$$-tree is given below:

<figure><img src="/files/BsCScdC93H0KzZ4GXgHu" alt=""><figcaption></figcaption></figure>

**Are B-trees balanced?**

Yes! This is because of Rule 3: All leaves must be at the same depth from the root. This ensures height-balance, which implies balance. In fact, any two siblings (in terms of depth too!) are at exactly the same height. This is due to the fact that a B-tree grows upward from the leaves.

**What is the minimum and maximum height of an** $$(a,b)$$ **tree with n keys?**

The minimum height would be when each leaf has maximum possible number of children (i.e, the tree is dense) and so, minimum height is $$O(log\_bn)$$ and maximum height would be $$O(log\_an)$$.

Now that we have established that an $$(a,b)$$ tree is balanced and height is $$O(log\_an)$$, we see how it helps us for searching, insertion and deletion.

## Searching

What data structure should we use for storing the keys and children in a node?

One possible way is to use an array of (key, subtree) pairs as follows:

<figure><img src="/files/9VCUp0sciPKQV6qrxxGz" alt=""><figcaption></figcaption></figure>

Total search cost?

Height of the tree is $$O(log\_an)$$.

Binary search (woah binary search to the rescue again!) for the key at every node takes $$O(log\_2b)$$ time. (Note that $$b$$ is a constant, and so we consider this as constant time)

Hence, total cost: $$O(log\_2b \times log\_an) = O(log\_an) = O(logn)$$.

## Insertion

The key idea is that just like in BSTs, we only insert at leaves here. So, the idea is to navigate to a suitable leaf and add the new key to its key list.

For example, consider the insertion of the key 71 to the $$(2,4)$$-tree below:

<figure><img src="/files/5ay0gRHzOlGjkh3HZYqr" alt="" width="494"><figcaption></figcaption></figure>

<figure><img src="/files/WGcQQvL7tsdcb0kKTOBo" alt="" width="491"><figcaption></figcaption></figure>

But now what happens when you insert 72? You cannot have 4 keys at a single node! Something needs to be done to solve this problem!

Insertion may violate Rule 1: Recall that rule 1 enforces that nodes can have at most $$b - 1$$ keys. Insertion may cause the leaf nodes to grow too large. We need to have some operation to handle such cases.

**Idea**: Redistribute out the keys!

How can we redistribute the keys?

1. Over sibling nodes?
   * No! Cannot guarantee that the rule is still not violated (i.e., now the sibling may be too large)!
2. Over a new node?
   * Yes! We can always guarantee that the postcondition adheres rule 1.

### Split Operation (Key-redistribution)

1. Choose the median key in the overpopulated node.
2. Use that to split the keylist into 2 halves
3. Left half goes into a new node
4. Move median key to the parent node (what do we do when the root gets overpopulated? it has no parents! don’t worry, we will deal with this too)
5. Link the two nodes to the parent.

If we perform the split operation on the example tree above, the following steps are executed:

<figure><img src="/files/PMz0QF6GwKAvcLpfg0SC" alt="" width="503"><figcaption></figcaption></figure>

<figure><img src="/files/8voBFytylhWXOueH0DZ1" alt="" width="486"><figcaption></figcaption></figure>

<figure><img src="/files/MXIjqmkJyip3zc2Vlr8t" alt="" width="494"><figcaption></figcaption></figure>

<figure><img src="/files/rlqmkIWIr36u6u2yoeTi" alt="" width="486"><figcaption></figcaption></figure>

Why must the `split` operation “offer” one key to the parent?

After splitting, the parent will have one more child than before, therefore it must also have one more key. Taking the (median) key from the node to be split is convenient.

Will the split procedure ever end up with nodes too small (i.e, too few keys) such that they violate rule 1?

No. this is because:

1. Keylist size of LHS is $$\lfloor (b-1)/2 \rfloor = \lfloor b/2 - 1/2 \rfloor = \lfloor a - 1/2 \rfloor$$
2. Keylist size of RHS is $$\lceil(b-1)/2 \rceil = \lceil b/2 - 1/2 \rceil = \lceil a - 1/2\rceil$$
3. In both cases, the size of the keylist is at least $$a -1$$.

But there’s another problem.. What happens if the parent node becomes over-filled after one of its children is split? Umm.. no big deal. Just perform split on the parent node then. So, during insertion, in the worst case, you might have to perform a split at **every** node on the root-to-leaf path on the way back from the recursive call. But all split operations take $$O(1)$$ time (essentially its something like a rotation - just moving some pointers here and there and so the total time of insertion remains $$O(logn)$$).

What about when the root is overpopulated? How do you split the root?

1. Follow the same procedure as before
2. Instead of elevating split key $$v\_m$$ to the parent (there’s no parent for root), make it the new root! (This is exactly why we allow the root to have 1 key and there’s no minimum condition on the number of keys in the root)

In other words, if $$z$$ was the old root that become overpopulated, then the following steps explain how to split a root node:

1. Pick the median key $$v\_m$$ as the split key
2. Split $$z$$ into LHS and RHS using $$v\_m$$
3. Create a new node $$y$$
4. Move LHS split from $$z$$ to $$y$$
5. Create a new empty node $$r$$
6. Insert $$v\_m$$ to $$r$$
7. Promote $$r$$ to be the new root node
8. Assign $$y$$ and $$z$$ to be the left and right child of
9. Assign previous subtree $$t\_m$$ associated with $$v\_m$$ to be the final child of $$y$$

Will the split operation ever violate rule 3?

No! A split on node $$z$$ will create a new node $$y$$ so:

* If $$z$$ is an internal node, $$y$$ will be at the same level and nothing will be pushed down since $$y$$ was split from $$z$$
* If $$z$$ is a leaf node, $$y$$ will also be a leaf node at the same level
* If $$z$$ is the root node, everything will be pushed down 1 level since a new root is created above (maintaing the invariant that all leaf nodes are at the same depth).

**Unlike BSTs which grows the leaves downwards,** $$(a,b)$$**-trees grows the root upwards! Therefore, leaves are always guaranteed to be on the same level as one another.**

It should be pretty clear that the **cost of insertion into an** $$(a,b)$$**-tree with** $$n$$ **nodes is** $$O(logn)$$ **and the number of split operations that need to be performed during insertion is also** $$O(logn)$$**.**

## Deletion

We only explain deleting keys from leaves here. To delete a key from an internal node, swap with its successor, and then delete it off.

We first find the key. Then delete it from the keylist. But wait... what if the number of keys in the node falls below $$a - 1$$?

Deletion may cause internal nodes to shrink too small (and hence, violate rule 1).

**Idea: Join with an adjacent sibling!**

### Merge Operation

Let us explain this with an example. Consider a $$(2,4)$$-tree in which you wanted to delete the element with key = 30. That would go as follows:

<figure><img src="/files/ORQ9XX4mEvK0EsLx1wJv" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/v41s77CFxqW0RjtgESDX" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/yWAzXi4ojs8FkPy5FNZp" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/zwjsByjMuUi1oYG8iVX2" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/XBfY4oTJ2qRtTfGjK9XD" alt=""><figcaption></figcaption></figure>

`Merge` is intuitively the reverse operation of `split`.

In `split`, we **promote** the median key to form LHS and RHS nodes. In `merge`, we **demote** the key in parent separating LHS and RHS to join them together.

Why can’t we just join the 2 children directly? Why do we need to bring down 1 key from the parent?

For the same reason as `split`. After merging, the parent will have 1 less child than before, therefore, it must have one less key than before. Moving that key into the newly merged node is convenient

Can a merged node be too large such that it violates rule 1?

Yes, but that is not a problem because then we can perform the `split` operation on that node to make it normal. FYI, when `merge` is followed by `split`, we call that a `share` operation. That is, `share` = `merge` + `split`. In summary,

<figure><img src="/files/IiQyCZ45UWvOcQLSBWQg" alt=""><figcaption></figcaption></figure>

<mark style="background-color:red;">What about deleting a key from an internal node or root?</mark>

Think about AVL trees. Swap the key to be deleted off with its predecessor/successor **which will be in a leaf node (this is critical since we are explaining only deleting keys from leaf nodes, it would be a huge problem if the successor/predecessor of a key in an internal node was not in a leaf node since our algorithm wouldn’t be able to delete keys from internal nodes then)**. The predecessor would be the right-most key in the left subtree and the successor would be the left-most key in the right subtree.

### Handling Duplicates

Instead of just storing keys, store a pair of (key, insertion order) where insertion order refers to the order when said key is inserted (e.g. timestamp)

## Why B-Trees over BST?

Both B-trees and BST have $$O(logn)$$ time for search, insert and delete operations. Then why do we need to learn about B-trees when we can use BSTs anyway?

In general, data is stored on disk in blocks, e.g. $$B\_1, B\_2, \dots, B\_m$$. Each block stores a chunk of memory of size $$B$$. (Think of each block as an array of size $$B$$). When accessing a memory location in some block $$B\_j$$, the entire block is read into memory. You can assume that your memory can hold some $$M$$ blocks at a time.

<figure><img src="/files/0FSmAgvsSnEof2pANFnF" alt=""><figcaption></figcaption></figure>

<mark style="background-color:red;">If you had an array of length</mark> $$n$$ <mark style="background-color:red;">how many blocks would you need to transfer to the memory to do a linear search for an element?</mark>

Ans: $$n/B$$ (since $$B$$ is the amount of data in any 1 block - let’s say that each element takes up 1 unit of data)

<mark style="background-color:red;">What about doing a binary search on an array of length</mark> $$n$$<mark style="background-color:red;">?</mark>

Ans: $$log(n/B)$$ block transfers. Think of this as doing a binary search on $$n/B$$ blocks. Once your search finds the right block, it is loaded into memory and the rest of the search is free.

<mark style="background-color:red;">Now imagine you are storing your data in a</mark> $$B$$<mark style="background-color:red;">-tree (Notice that you might choose</mark> $$a = B/2, b =B$$ <mark style="background-color:red;">depending on how you want to optimize). Notice that each node in your B-tree can be stored in</mark> $$O(1)$$ <mark style="background-color:red;">blocks. For example, one block stores the key list, one block stores the subtree links and one block stores the other information (e.g. parent pointer, auxiliary information, etc.) Now what is the cost of searching a keylist in a B-tree node? What is the cost of splitting a B-tree node? What is the cost or merging or sharing a B-tree node?</mark>

Ans: $$O(1)$$!!! Since after you load the block in memory, everything happens soo fast that the time taken for that is negligible compared to accessing data stored in disk.

<mark style="background-color:red;">So, what is the overall cost of searching a</mark> $$B$$<mark style="background-color:red;">-tree? Wht is the cost of inserting or deleting in a</mark> $$B$$<mark style="background-color:red;">-tree? (here</mark> $$B$$ <mark style="background-color:red;">refers to the block size)</mark>

Ans: $$O(log\_Bn)$$, i.e., the cost only depends on the height of the tree, since each of the operations to access a single node is only cost $$O(1)$$. The important thing here is in the value of $$B$$. Searching in a BST is $$O(log\_2n)$$ while searching in a $$B$$-tree is $$O(log\_Bn)$$. In practice, $$B$$ is a pretty large number. For instance, if your disk has 16KB blocks (which is reasonably normal) and you set $$B = 16k$$, then given a 10TB database, your $$B$$-tree just requires 3 levels. The root of your B-tree will always stay in memory. For typical memory sizes (e.g. 256MB disk cache), the first level of your B-tree will also always be in memory. Thus, the cost of searching your 10TB database is typically one block transfer. For a 1000TB disk, you have 4 levels and 2 block transfers. That’s why its really hard to beat a well-implemented B-tree.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://cs2040s.devanshshah.dev/data-structures/a-b-trees.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
