Bounding the Fragmentation of B-Trees Subject to Batched Insertions

Authors: Michael A. Bender, Aaron Bernstein, Nairen Cao, Alex Conway, Martín Farach-Colton et al.

Year: 2026

cs.DScs.DB

0
Citations
2026
Published
8
Authors

Abstract

The issue of internal fragmentation in data structures is a fundamental challenge in database design. A seminal result of Yao in this field shows that evenly splitting the leaves of a B-tree against a workload of uniformly random insertions achieves space utilization of around 69%. However, many database applications perform batched insertions, where a small run of consecutive keys is inserted at a single position. We develop a generalization of Yao's analysis to provide rigorous treatment of such batched workloads. Our approach revisits and reformulates the analytical structure underlying Yao's result in a way that enables generalization and is used to argue that even splitting works well for many workloads in our extended class. For the remaining workloads, we develop simple alternative strategies that provably maintain good space utilization.

Read PDF