Y-Trim: Evidence-gated Adaptase tail trimming for single-stranded bisulfite sequencing

Authors: Yihan Fang

Year: 2026

q-bio.GN


Abstract

Background: Single-stranded whole-genome bisulfite sequencing (ssWGBS) enables DNA methylation profiling in low-input and highly fragmented material, including cell-free DNA. In widely used post-bisulfite protocols, Adaptase-mediated tailing adds stochastic, template-free end sequence. Unlike adapter-defined junctions, these tails lack a fixed sequence template, so trimming must be decided from FASTQ-stage observables under intrinsic uncertainty.
Results: We show that bisulfite-induced compositional degeneracy implies a strictly positive error floor for any fixed per-read boundary rule under a finite nucleotide alphabet. Guided by this limit, we introduce Y-Trim, an evidence-gated framework that separates admission (whether to trim) from inference (where to trim). For Read 2, Y-Trim performs per-read adaptive cut placement via a fixed, chemistry-typed matrix-linear texture scoring scheme; for Read 1, it uses automated sample-level anchoring when read-level localization is feasibility-limited. Across modules, Y-Trim is an explicit, chemistry-specific decision rule with interpretable operating points. On a curated 34-run public cohort (CCGB-34) and simulator stress tests with known latent boundaries, Y-Trim exhibits stable Read 2 operating behavior and Read 1 feasibility-limited behavior consistent with conditional read-through.
Conclusions: Template-free Adaptase tail trimming is best viewed as an evidence-limited FASTQ-stage decision rather than a generic preprocessing knob. By making admissibility and abstention explicit and exposing interpretable genomic-retention versus residual-carryover trade-offs, Y-Trim provides a practical uncertainty-aware preprocessing strategy for ssWGBS.
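
Illustrative sketch: The separation of admission from inference described in the Results can be pictured with a toy example. The code below is not the Y-Trim implementation and does not reproduce its texture scoring scheme. It is a minimal sketch assuming a hypothetical per-base weight table (`TAIL_WEIGHTS`), a hypothetical admission threshold (`admit_threshold`), and a hypothetical search window (`max_tail`), all chosen arbitrarily for illustration. Inference picks the 5' prefix length with the highest cumulative score; admission then decides whether that evidence is strong enough to trim, and otherwise abstains and leaves the read untouched.

```python
from typing import Dict, Tuple

# Hypothetical weights, arbitrary placeholders for illustration only:
# positive = "tail-like" evidence, negative = "genomic-like" evidence.
TAIL_WEIGHTS: Dict[str, float] = {"G": 1.0, "A": 0.5, "C": -2.0, "T": -1.0, "N": 0.0}


def best_prefix_cut(read: str, max_tail: int = 20) -> Tuple[int, float]:
    """Inference step: find the prefix length with the highest cumulative score."""
    best_cut, best_score, running = 0, 0.0, 0.0
    for i, base in enumerate(read[:max_tail], start=1):
        running += TAIL_WEIGHTS.get(base, 0.0)
        if running > best_score:
            best_cut, best_score = i, running
    return best_cut, best_score


def gated_trim(read: str, admit_threshold: float = 3.0, max_tail: int = 20) -> Tuple[str, int]:
    """Admission step: trim only if the best cut carries enough evidence."""
    cut, score = best_prefix_cut(read, max_tail)
    if score < admit_threshold:
        return read, 0  # abstain: evidence too weak, keep the read as is
    return read[cut:], cut


if __name__ == "__main__":
    for r in ["GGGGGTTCATTA", "GGTTTT", "TTCATTAGG"]:
        print(r, gated_trim(r))
```

Raising `admit_threshold` in this toy makes the rule abstain more often, which retains more genomic bases at the cost of leaving more tail sequence in place; lowering it does the opposite. This mirrors, in a very simplified form, the genomic-retention versus residual-carryover trade-off named in the Conclusions.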
