FREE E LEARNING PLATFORM
HOMEEXCEPTIONSOOPSJVMINTRO
 

B-Tree




Introduction

A B-Tree is a self-balancing multi-way search tree in which every node can contain multiple keys and multiple child nodes. Unlike a Binary Search Tree, where every node has at most two children, a B-Tree allows many children, making it highly efficient for handling very large datasets.

B-Trees are widely used in database management systems, file systems, and storage devices because they minimize disk accesses while maintaining sorted data.



B-Tree Illustration

B-Tree Structure

Figure 1 : Structure of a B-Tree of Order 4


Why Do We Need a B-Tree?

When millions of records are stored in databases, ordinary Binary Search Trees become inefficient because their height may increase significantly. A B-Tree stores multiple keys in one node, reducing the tree height and minimizing disk reads.

Binary Search Tree B-Tree
Maximum 2 children. Multiple children.
Large height. Small height.
Used mainly in memory. Used for external storage.
Less efficient for databases. Ideal for databases.

Properties of a B-Tree

Property Description
Balanced Tree All leaf nodes are at the same level.
Multiple Keys Each node may contain multiple keys.
Multiple Children A node may have more than two child nodes.
Sorted Keys Keys inside every node remain sorted.
Efficient Operations Searching, insertion and deletion require logarithmic time.

Terminology

Term Description
Order (m) Maximum number of child nodes allowed.
Root Node The topmost node of the B-Tree.
Internal Node A node having one or more child nodes.
Leaf Node A node without children.
Key A value stored inside a node.

Solved Example

Construct a B-Tree of Order 4 by inserting the following keys:

10, 20, 5, 6, 12, 30, 7, 17

Initially, the tree is empty. Keys are inserted one by one. Whenever a node becomes full, it is split into two nodes and the middle key is promoted to the parent node. This process continues until all keys are inserted while maintaining the balanced structure of the B-Tree.


Working Principle

  • Search the correct leaf node.
  • Insert the new key in sorted order.
  • If the node overflows, split the node.
  • Move the middle key to the parent.
  • If the parent also overflows, continue splitting upward.
  • The root may split and create a new level.

B-Tree Insertion Algorithm

Insert(Key) Step 1 : Search the appropriate leaf node. Step 2 : Insert the key in sorted order. Step 3 : If the node is not full, Insertion is complete. Step 4 : If the node becomes full, Split the node into two parts. Step 5 : Move the middle key to the parent node. Step 6 : Repeat the splitting process if the parent also overflows. Step 7 : If the root splits, Create a new root.

Searching in a B-Tree

Searching starts from the root node. The required key is compared with all keys present in the current node. If the key is found, the search terminates. Otherwise, the algorithm follows the appropriate child pointer and repeats the process until the key is found or a leaf node is reached.


Insertion Example

Step Operation
1 Insert 10.
2 Insert 20.
3 Insert 5.
4 Insert 6 (Node Overflow).
5 Split the node and promote key 10.
6 Continue inserting remaining keys.

Node Splitting Illustration

B Tree Node Split

Figure 2 : Splitting a Full Node in B-Tree


Searching Example

Suppose the key 17 is to be searched.

  • Start from the Root node.
  • Compare 17 with available keys.
  • Move to the appropriate child.
  • Continue until the key is found.
  • Searching finishes successfully.

Complexity Analysis

Operation Time Complexity Explanation
Searching O(log n) Tree height remains very small.
Insertion O(log n) May require node splitting.
Deletion O(log n) May require merge or borrowing.
Traversal O(n) Visits every key.
Space Complexity O(n) Storage for all keys.

Advantages of B-Tree

  • Maintains a balanced tree automatically.
  • Supports efficient searching, insertion and deletion.
  • Reduces disk accesses.
  • Suitable for storing millions of records.
  • Height remains very small even for huge datasets.

Limitations of B-Tree

  • Implementation is more complex than Binary Search Tree.
  • Node splitting and merging increase implementation complexity.
  • Consumes extra memory for child pointers.
  • Not suitable for very small datasets.

Applications of B-Tree

Application Description
Database Systems Database indexing.
File Systems NTFS, FAT and Ext file systems.
Search Engines Large-scale indexing.
Cloud Storage Fast data retrieval.
Operating Systems Directory management.

B-Tree vs Binary Search Tree

B-Tree Binary Search Tree
Multi-way tree. Binary tree.
Multiple keys per node. One key per node.
Balanced. May become skewed.
Small height. Height depends upon insertion order.
Preferred for databases. Preferred for in-memory searching.

B-Tree vs B+ Tree

B-Tree B+ Tree
Keys and records stored in all nodes. Records stored only in leaf nodes.
Leaf nodes are not linked. Leaf nodes are linked together.
Searching is efficient. Range searching is more efficient.
Used in databases. Widely used in modern databases.

💡 Remember This

Overflow ↓ Split Node ↓ Promote Middle Key ↓ Tree Remains Balanced


Important Points

  • A B-Tree is a balanced multi-way search tree.
  • Each node may contain multiple keys.
  • All leaf nodes remain at the same level.
  • Node overflow is handled by splitting.
  • Searching, insertion and deletion require O(log n) time.
  • B-Trees are widely used in database indexing.

💼 Interview Corner

  • Why is a B-Tree preferred in databases?
  • What happens when a node overflows?
  • Differentiate B-Tree and B+ Tree.
  • What is the order of a B-Tree?
  • State the time complexity of insertion.

🎓 AKTU / Placement Tip

AKTU examinations frequently ask students to construct a B-Tree after inserting a given sequence of keys, explain node splitting, state the properties of a B-Tree and compare it with a Binary Search Tree and B+ Tree.


Summary

A B-Tree is a balanced multi-way search tree that stores multiple keys in each node. It minimizes tree height and significantly reduces disk access, making it highly suitable for database management systems and file systems. By splitting overflowing nodes and promoting middle keys, the B-Tree maintains balance and guarantees efficient searching, insertion and deletion in O(log n) time.