Detecting Underspecification in Software Requirements via k-NN Coverage Geometry

Authors: Wenyan Yang, Tomáš Janovec, Samantha Bavautdin

Year: 2026

cs.SE

0
Citations
2026
Published
3
Authors

Abstract

We propose \geogap{}, a geometric method for detecting missing requirement types in software specifications. The method represents each requirement as a unit vector via a pretrained sentence encoder, then measures coverage deficits through $k$-nearest-neighbour distances z-scored against per-project baselines. Three complementary scoring components -- per-point geometric coverage, type-restricted distributional coverage, and annotation-free population counting -- fuse into a unified gap score controlled by two hyperparameters. On the PROMISE NFR benchmark, \geogap{} achieves 0.935 AUROC for detecting completely absent requirement types in projects with $N \geq 50$ requirements, matching a ground-truth count oracle that requires human annotation. Six baselines confirm that each pipeline component -- per-project normalisation, neural embeddings, and geometric scoring -- contributes measurable value.

Read PDF