Anatomically constrained deep learning for clinical-grade volumetric pancreatic cancer segmentation: development, validation, and architectural benchmarking.
Authors
Affiliations (5)
- Department of Radiology, Mayo Clinic, Rochester, MN, USA.
- Department of Radiology, Kobe University Graduate School of Medicine, Hyogo, Japan.
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA.
- Department of Gastroenterology, Hepatology and Nutrition, The University of Texas MD Anderson Cancer Center, Houston, TX, USA.
- Department of Radiology, Mayo Clinic, Rochester, MN, USA. [email protected].
Abstract
Automated segmentation of pancreatic ductal adenocarcinoma (PDAC) is a prerequisite for realizing the promise of precision oncology, yet existing approaches have not been validated at the cohort scale, or across the acquisition heterogeneity, required for clinical deployment. We developed and validated a pancreas-localized three-dimensional convolutional neural network model (Model-BB) using 1859 multi-institutional, treatment-naïve, biopsy-confirmed PDAC CT examinations. Model-BB achieved a Dice similarity coefficient (DSC) of 0.76 ± 0.13 on internal testing and 0.76 ± 0.09 on external validation, with stable performance across acquisition sites, scanner vendors, slice thicknesses, and temporal epochs. Analysis of the upstream localization step showed complete tumor enclosure in 224 of 241 test cases (92.9%), with preserved performance (DSC 0.78) among cases with partial peripheral exclusion, indicating that localization error was not the principal source of reduced performance. In a controlled architectural benchmark, Model-BB outperformed Swin UNETR, a 3D vision transformer (DSC, 0.76 ± 0.13 versus 0.68 ± 0.18; p < 0.001). On a difficulty-enriched subset (n = 50), Model-BB achieved a DSC of 0.71 against a STAPLE-derived consensus, exceeding individual reader-pair agreement (DSC range, 0.57-0.65), with a concordance correlation coefficient of 0.93 for tumor volume. These findings support anatomically constrained, task-specific segmentation as a reproducible geometric substrate for volumetric tumor-burden quantification, treatment-response assessment, and multimodal outcomes modeling, pending prospective validation in clinical-trial workflows.
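For readers less familiar with the two agreement metrics reported above, the following is a minimal sketch of how the Dice similarity coefficient (DSC) between two binary masks and Lin's concordance correlation coefficient between two sets of volume measurements are conventionally computed. It uses the standard textbook definitions and is not the authors' evaluation code; the function names, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / total


def concordance_ccc(x, y):
    """Lin's CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2).

    Uses population (1/n) moments, as in Lin's original definition.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return 2.0 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)


# Hypothetical toy volumes: an 8-voxel prediction inside a 12-voxel reference.
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True
ref = np.zeros((4, 4, 4), dtype=bool)
ref[1:3, 1:3, 1:4] = True

dsc = dice_coefficient(pred, ref)  # 2 * 8 / (8 + 12) = 0.8
```

Unlike a Pearson correlation, the concordance coefficient penalizes systematic offsets: two volume series that differ by a constant shift correlate perfectly but have a concordance below 1, which is why it is a common choice for comparing automated and reference tumor volumes.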