B-Trees

Why Disk and Block Storage

How does a large branching factor minimize the number of expensive disk-block reads, and why is the cost model here “block transfers,” not comparisons?

Order / Minimum Degree \( t \)

Defining t

Define \( t \) — how many keys and children may a node hold, and what is special about the root’s lower bound?

Why Half-Full

Why does the “at least half-full” floor exist, and what would go wrong if nodes could get arbitrarily sparse?

Keys and Children Layout

With \( k \) keys, why are there exactly \( k+1 \) child pointers, and how do the keys partition the value range so each pointer covers one interval?

The B-Tree Invariants

All leaves at the same depth, every non-root at least half-full, keys sorted within a node — what do these rules guarantee together about balance and node utilization?

Search Within and Across Nodes

How does a key scan (linear or binary) inside one fat node combine with descending a single child pointer to reach the next block?

Split-on-Full Insert

The Proactive Split

Why split any full node you encounter on the way DOWN, before you ever need to — what problem does proactive splitting avoid on the way back up?

Promotion

When a full node splits, which median key is promoted to the parent, and how do the remaining keys divide into two nodes?

Deletion: Borrow and Merge

Borrowing from a Sibling

When a node underflows but an adjacent sibling has spare keys, how does a key rotate up through the parent and another down?

Merging Two Nodes

When neither sibling can spare a key, how do you fuse two nodes and pull the separator key down — and how can this shrink the height?

How Balance Is Maintained

Why does growing only by splitting the root and shrinking only by merging at the root keep every leaf on exactly the same level?

Time Complexity

Search

Why is search \( O(\log_t n) \) block reads with \( O(t) \) work per node — and why do we care far more about the block-read count than the in-node scan?

Insert

Why does proactive split-on-the-way-down keep insert to a single root-to-leaf pass of \( O(\log_t n) \) blocks?

Delete

Why does borrow/merge also stay within \( O(\log_t n) \) blocks, even when a merge cascades upward?

Why a High Branching Factor Flattens the Tree (Intuition)

If each node fans out to \( t \) children, why does height grow like \( \log_t n \) — and why does a large \( t \) make that height tiny in practice?

Space Complexity

Node Storage and Fill Factor

Why is total storage \( \Theta(n) \), and how does the half-full floor bound wasted space inside under-occupied nodes?

Recursion / Path Buffering

Why does a top-down insert or delete only need \( O(\log_t n) \) nodes “in flight,” and why does that matter when nodes are disk blocks?

Implementation Walkthrough

Node Layout

How do you model a node’s sorted key array, child-pointer array, and key count, and how do you mark a leaf?

Search

How does the in-node scan find the first key \( \ge \) the target and pick the child pointer to follow?

Insert with Proactive Split

How do you split a full child before descending into it, and how do you promote the median into the current node?

Delete with Borrow and Merge

How do you ensure a child has at least \( t \) keys before recursing into it, choosing borrow or merge as needed?

B-Tree vs B+ Tree

Why do databases store values only in leaves and chain the leaves together — what range and scan queries does that speed up?

When to Use It / Common Pitfalls

When does a B-tree beat an in-memory balanced BST, and where do node-full edge cases, the half-full rule, and off-by-one key/child counts trip you up?

Implementation

// implement from scratch here

Summary

Recap the whole topic in a few lines — the core idea, the key complexities and trade-offs, and when to reach for it. The cheat-sheet you’d skim before an exam or interview.

My Notes

Fill this in as you learn

Keyboard shortcuts

Data Structures and Algorithms