(a, b)-Trees

Aim: Allow efficient searching, insertion, and deletion of data in extremely large databases


Invariants

In an (a,b)-tree, a and b are the parameters, where 2 ≤ a ≤ (b+1)/2.

Respectively, a and b refer to the minimum and maximum number of children (NOT keys) an internal (i.e., non-root, non-leaf) node can have.

| Binary Tree | (a,b)-Trees |
| --- | --- |
| Each node has at most 2 children | Each node can have more than 2 children |
| Each node stores exactly 1 key | Each node can store multiple keys |

Every (a,b)-tree must satisfy the following 3 rules:

1. (a,b)-Child Policy

| Node Type | Min Keys | Max Keys | Min Children | Max Children |
| --- | --- | --- | --- | --- |
| Root | 1 | b - 1 | 2 | b |
| Internal | a - 1 | b - 1 | a | b |
| Leaf | a - 1 | b - 1 | 0 | 0 |

With the exception of leaves, note that the number of children is always one more than the number of keys.
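The child policy above can be expressed as a small validity check. The sketch below is illustrative only; the `Node` class and the function name are assumptions, not part of the notes:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list                                    # sorted keylist
    children: list = field(default_factory=list)  # empty for leaves

def satisfies_child_policy(node, a, b, is_root):
    """Check one node against the (a,b)-child policy table."""
    k, c = len(node.keys), len(node.children)
    if node.children:                   # non-leaf: children = keys + 1
        if c != k + 1:
            return False
        return (2 if is_root else a) <= c <= b
    # leaf: only the key count is constrained
    return (1 if is_root else a - 1) <= k <= b - 1

# In a (2,4)-tree, an internal node may have 2-4 children (1-3 keys):
ok = Node([10, 20], [Node([5]), Node([15]), Node([25])])
assert satisfies_child_policy(ok, 2, 4, False)
assert not satisfies_child_policy(Node([1, 2, 3, 4]), 2, 4, False)  # too many keys
```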

2. Key Ranges

A non-leaf node (i.e., root or internal) must have one more child than its number of keys. This ensures that all value ranges defined by its keys are covered by its subtrees. The permitted range of keys within a subtree is called its key range.

In particular, for a non-leaf node with k keys and k + 1 children,

  • Its keys in sorted order are v_1, v_2, …, v_k

  • Its subtrees are t_1, t_2, …, t_{k+1}

Then,

  • First child t_1 has key range ≤ v_1

  • Final child t_{k+1} has key range > v_k

  • All other children t_i, where i ∈ [2, k], have key range (v_{i-1}, v_i]
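Under these ranges, locating the subtree for a key x reduces to finding the first key v_i with x ≤ v_i. A minimal sketch (the function name is hypothetical):

```python
from bisect import bisect_left

def child_index(keys, x):
    """Index of the subtree whose key range contains x: the first
    child covers <= keys[0], child i covers (keys[i-1], keys[i]],
    and the last child covers > keys[-1]."""
    return bisect_left(keys, x)

assert child_index([10, 20, 30], 5) == 0    # <= 10, so first child
assert child_index([10, 20, 30], 20) == 1   # in (10, 20], so second child
assert child_index([10, 20, 30], 99) == 3   # > 30, so final child
```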

3. Leaf Depth

All leaf nodes must be at the same depth from the root.

You should realize that a regular BST is just an (a,b)-tree with a = 1, b = 2.

B-Trees

Firstly, it is important to note that the B in B-Tree does not refer to Binary. Actually, no one really knows what the B stands for - it could be Balanced?

B-trees are simply (B,2B)-trees. That is, they are a subcategory of (a,b)-trees such that a = B, b = 2B. For instance, when B = 2, we have a (2,4)-tree (this is sometimes referred to as a (2,3,4)-tree). An example of a (2,3,4)-tree is given below:

Are B-trees balanced?

Yes! This is because of Rule 3: all leaves must be at the same depth from the root. This ensures height-balance, which implies balance. In fact, any two sibling subtrees have exactly the same height. This is due to the fact that a B-tree grows upward from the leaves.

What is the minimum and maximum height of an (a,b)-tree with n keys?

The minimum height occurs when each node has the maximum possible number of children (i.e., the tree is dense), giving a minimum height of O(log_b n); the maximum height occurs when each node has the minimum number of children, giving O(log_a n).
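As a quick numeric illustration of these two bounds (not a tree implementation; the helper is purely illustrative):

```python
import math

def height_range(n, a, b):
    """Approximate (min, max) height of an (a,b)-tree on n keys:
    about log_b(n) when every node is full, log_a(n) when sparse."""
    return math.log(n, b), math.log(n, a)

dense, sparse = height_range(10**6, 2, 4)
assert dense < sparse   # larger branching factor means a shorter tree
```

For a (2,4)-tree on a million keys this gives roughly 10 versus 20 levels.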

Now that we have established that an (a,b)-tree is balanced with height O(log_a n), let us see how this helps us with searching, insertion and deletion.

Searching

What data structure should we use for storing the keys and children in a node?

One possible way is to use an array of (key, subtree) pairs as follows:

Total search cost?

Height of the tree is O(log_a n).

Binary search (woah, binary search to the rescue again!) for the key at every node takes O(log_2 b) time. (Note that b is a constant, so we treat this as constant time.)

Hence, total cost: O(log_2 b × log_a n) = O(log_a n) = O(log n).
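The whole search can be sketched in a few lines. Representing nodes as dicts with a sorted "keys" list and a "children" list is an assumption made for illustration, not the notes' storage layout:

```python
from bisect import bisect_left

def search(node, x):
    i = bisect_left(node["keys"], x)       # binary search: O(log b) per node
    if i < len(node["keys"]) and node["keys"][i] == x:
        return True                        # key found at this node
    if not node["children"]:               # reached a leaf without finding x
        return False
    return search(node["children"][i], x)  # descend into the matching key range

tree = {"keys": [20], "children": [
    {"keys": [5, 10], "children": []},
    {"keys": [30, 40], "children": []},
]}
assert search(tree, 30) and not search(tree, 25)
```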

Insertion

The key idea is that just like in BSTs, we only insert at leaves here. So, the idea is to navigate to a suitable leaf and add the new key to its key list.

For example, consider the insertion of the key 71 into the (2,4)-tree below:

But now what happens when you insert 72? You cannot have 4 keys at a single node! Something needs to be done to solve this problem!

Insertion may violate Rule 1: Recall that rule 1 enforces that nodes can have at most b - 1 keys. Insertion may cause the leaf nodes to grow too large. We need to have some operation to handle such cases.

Idea: Redistribute out the keys!

How can we redistribute the keys?

  1. Over sibling nodes?

    • No! We cannot guarantee that the rule still holds (i.e., now the sibling may become too large)!

  2. Over a new node?

    • Yes! We can always guarantee that the postcondition adheres to rule 1.

Split Operation (Key-redistribution)

  1. Choose the median key in the overpopulated node.

  2. Use that to split the keylist into 2 halves

  3. Left half goes into a new node

  4. Move median key to the parent node (what do we do when the root gets overpopulated? it has no parents! don’t worry, we will deal with this too)

  5. Link the two nodes to the parent.
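The key-redistribution part of these steps can be sketched on the keylist alone (the function name is hypothetical):

```python
def split_keys(keys):
    """Split a sorted, overpopulated keylist around its median:
    returns (left half, median, right half)."""
    m = len(keys) // 2
    return keys[:m], keys[m], keys[m + 1:]

# A (2,4)-tree leaf that grew to 4 keys (one too many):
left, median, right = split_keys([70, 71, 72, 73])
assert (left, median, right) == ([70, 71], 72, [73])
```

The median (here 72) is what gets moved up to the parent in step 4.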

If we perform the split operation on the example tree above, the following steps are executed:

Why must the split operation “offer” one key to the parent?

After splitting, the parent will have one more child than before, therefore it must also have one more key. Taking the (median) key from the node to be split is convenient.

Will the split procedure ever end up with nodes too small (i.e., too few keys) such that they violate rule 1?

No. When a node becomes overpopulated it holds b keys; after the median is removed:

  1. Keylist size of the LHS is ⌊(b-1)/2⌋

  2. Keylist size of the RHS is ⌈(b-1)/2⌉

  3. Since a ≤ (b+1)/2, i.e., b ≥ 2a - 1, both sizes are at least a - 1.

But there’s another problem: what happens if the parent node becomes over-filled after one of its children is split? No big deal; just perform a split on the parent node too. So, during insertion, in the worst case, you might have to perform a split at every node on the root-to-leaf path on the way back from the recursive calls. But each split operation takes O(1) time (essentially it is something like a rotation: just moving a few pointers around), so the total time of insertion remains O(log n).

What about when the root is overpopulated? How do you split the root?

  1. Follow the same procedure as before

  2. Instead of elevating split key v_m to the parent (the root has no parent), make it the new root! (This is exactly why we allow the root to have as few as 1 key, with no stronger minimum condition on its number of keys.)

In other words, if z was the old root that became overpopulated, then the following steps explain how to split a root node:

  1. Pick the median key v_m as the split key

  2. Split z into LHS and RHS using v_m

  3. Create a new node y

  4. Move the LHS keys from z to y

  5. Create a new empty node r

  6. Insert v_m into r

  7. Promote r to be the new root node

  8. Assign y and z to be the left and right children of r

  9. Assign the subtree t_m previously associated with v_m to be the final child of y
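The steps above can be sketched as one function. The dict layout for nodes is assumed for illustration only:

```python
def split_root(z):
    """Split an overpopulated root z and return the new root r."""
    m = len(z["keys"]) // 2
    median = z["keys"][m]                        # the split key v_m
    y = {"keys": z["keys"][:m],                  # LHS keys move to y...
         "children": z["children"][:m + 1]}      # ...along with t_1..t_m
    z = {"keys": z["keys"][m + 1:],
         "children": z["children"][m + 1:]}      # z keeps the RHS
    return {"keys": [median], "children": [y, z]}  # r, the new root

# A (2,4)-tree root that grew to 4 keys (placeholder child slots):
root = split_root({"keys": [10, 20, 30, 40], "children": [[]] * 5})
assert root["keys"] == [30] and len(root["children"]) == 2
```

If z was a leaf root, the child slices are simply empty and both halves become leaves.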

Will the split operation ever violate rule 3?

No! A split on node z creates a new node y, so:

  • If z is an internal node, y will be at the same level, and nothing is pushed down since y was split off from z

  • If z is a leaf node, y will also be a leaf node at the same level

  • If z is the root node, everything is pushed down 1 level since a new root is created above (maintaining the invariant that all leaf nodes are at the same depth)

Unlike BSTs, which grow their leaves downward, (a,b)-trees grow their root upward! Therefore, leaves are always guaranteed to be on the same level as one another.

It should be pretty clear that the cost of insertion into an (a,b)-tree with n keys is O(log n), and the number of split operations performed during an insertion is also O(log n) in the worst case.

Deletion

We only explain deleting keys from leaves here. To delete a key from an internal node, swap it with its successor (which lives in a leaf), then delete it from the leaf.

We first find the key. Then delete it from the keylist. But wait... what if the number of keys in the node falls below a - 1?

Deletion may cause nodes to shrink too small (and hence, violate rule 1).

Idea: Join with an adjacent sibling!

Merge Operation

Let us explain this with an example. Consider a (2,4)-tree in which you wanted to delete the element with key = 30. That would go as follows:

Merge is intuitively the reverse operation of split.

In split, we promote the median key to form the LHS and RHS nodes. In merge, we demote the key in the parent separating the LHS and RHS to join them back together.

Why can’t we just join the 2 children directly? Why do we need to bring down 1 key from the parent?

For the same reason as in split: after merging, the parent has one fewer child than before, therefore it must also have one fewer key. Moving that key into the newly merged node is convenient.

Can a merged node be too large such that it violates rule 1?

Yes, but that is not a problem: we can then perform the split operation on that node to restore the invariant. FYI, when a merge is followed by a split, we call that a share operation. In summary, share = merge + split.
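A merge (without the optional follow-up split) can be sketched as follows, again assuming a hypothetical dict layout for nodes:

```python
def merge(parent, i):
    """Join parent's children i and i+1 around separating key i."""
    left, right = parent["children"][i], parent["children"][i + 1]
    merged = {"keys": left["keys"] + [parent["keys"][i]] + right["keys"],
              "children": left["children"] + right["children"]}
    parent["keys"].pop(i)                   # the separating key was demoted
    parent["children"][i:i + 2] = [merged]  # two children become one
    return merged

p = {"keys": [20], "children": [{"keys": [10], "children": []},
                                {"keys": [40], "children": []}]}
assert merge(p, 0)["keys"] == [10, 20, 40]
```

If the merged keylist exceeds b - 1 keys, a split follows, which is exactly the share operation.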

What about deleting a key from an internal node or root?

Think back to AVL trees: swap the key to be deleted with its predecessor or successor, which is guaranteed to lie in a leaf node. (This is critical, since our algorithm only knows how to delete keys from leaves; it would be a huge problem if the predecessor/successor were not in a leaf.) The predecessor is the right-most key in the left subtree; the successor is the left-most key in the right subtree.

Handling Duplicates

Instead of just storing keys, store a pair (key, insertion order), where insertion order refers to the order in which the key was inserted (e.g., a timestamp).
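Since tuples compare lexicographically, (key, insertion order) pairs keep duplicate keys distinct while preserving key order. A quick illustration:

```python
# Two inserts of key 50 stay distinguishable via their insertion order,
# and sorting still groups them by key first.
entries = [(50, 2), (30, 3), (50, 1)]
entries.sort()
assert entries == [(30, 3), (50, 1), (50, 2)]
```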

Why B-Trees over BST?

Both B-trees and BSTs have O(log n) time for search, insert and delete operations. Then why do we need to learn about B-trees when we can use BSTs anyway?

In general, data is stored on disk in blocks, e.g., B_1, B_2, …, B_m. Each block stores a chunk of memory of size B. (Think of each block as an array of size B.) When accessing a memory location in some block B_j, the entire block is read into memory. You can assume that your memory can hold some M blocks at a time.

If you had an array of length n, how many blocks would you need to transfer to memory to do a linear search for an element?

Ans: n/B (since B is the amount of data in any one block; let’s say that each element takes up 1 unit of data)

What about doing a binary search on an array of length n?

Ans: log(n/B) block transfers. Think of this as doing a binary search over n/B blocks. Once your search narrows down to the right block, it is loaded into memory and the rest of the search is free.
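Both counts can be spelled out numerically (illustrative helpers; each element is assumed to occupy one unit of a block, and n > B):

```python
import math

def linear_scan_transfers(n, B):
    return math.ceil(n / B)              # touch every block once

def binary_search_transfers(n, B):
    return math.ceil(math.log2(n // B))  # ~log(n/B) blocks to narrow down

assert linear_scan_transfers(10**6, 1000) == 1000
assert binary_search_transfers(10**6, 1000) == 10
```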

Now imagine you are storing your data in a B-tree (notice that you might choose a = B/2, b = B depending on how you want to optimize). Notice that each node in your B-tree can be stored in O(1) blocks. For example, one block stores the keylist, one block stores the subtree links, and one block stores the other information (e.g., parent pointer, auxiliary information, etc.). Now what is the cost of searching a keylist in a B-tree node? What is the cost of splitting a B-tree node? What is the cost of merging or sharing a B-tree node?

Ans: O(1)! Once the block is loaded into memory, everything happens so fast that the time taken is negligible compared to accessing data stored on disk.

So, what is the overall cost of searching a B-tree? What is the cost of inserting or deleting in a B-tree? (Here B refers to the block size.)

Ans: O(log_B n), i.e., the cost depends only on the height of the tree, since each operation on a single node costs only O(1) block transfers. The important thing here is the value of B. Searching in a BST is O(log_2 n), while searching in a B-tree is O(log_B n). In practice, B is a pretty large number. For instance, if your disk has 16KB blocks (which is reasonably normal) and you set B = 16K, then given a 10TB database, your B-tree requires just 3 levels. The root of your B-tree will always stay in memory. For typical memory sizes (e.g., a 256MB disk cache), the first level of your B-tree will also always be in memory. Thus, the cost of searching your 10TB database is typically one block transfer. For a 1000TB disk, you have 4 levels and 2 block transfers. That’s why it’s really hard to beat a well-implemented B-tree.
