Multi-class transformer-based segmentation of pancreatic ductal adenocarcinoma and surrounding structures in CT imaging: a multi-center evaluation.
Authors
Affiliations (2)
Affiliations (2)
- Oncology Department, Ganzhou People's Hospital, Jiangxi, 341000, Ganzhou, China.
- Department of Hepatopancreatobiliary Surgery, Ganzhou People's Hospital, Jiangxi, 341000, Ganzhou, China. [email protected].
Abstract
Accurate segmentation of pancreatic ductal adenocarcinoma (PDAC) and surrounding anatomical structures is critical for diagnosis, treatment planning, and outcome assessment. This study proposes a deep learning-based framework to automate multi-class segmentation in CT images, comparing the performance of four state-of-the-art architectures. This retrospective multi-center study included 3265 patients from six institutions. Four deep learning models-UNet, nnU-Net, UNETR, and Swin-UNet-were trained using five-fold cross-validation on data from five centers and tested independently on a sixth center (n = 569). Preprocessing included intensity normalization, voxel resampling, and standardized annotation for six structures: PDAC lesion, pancreas, veins, arteries, pancreatic duct, and common bile duct. Evaluation metrics included Dice Similarity Coefficient (DSC), Intersection over Union (IoU), directed Hausdorff Distance (dHD), Average Symmetric Surface Distance (ASSD), and Volume Overlap Error (VOE). Statistical comparisons were made using Wilcoxon signed-rank tests with Bonferroni correction. Swin-UNet outperformed all models with a mean validation DSC of 92.4% and test DSC of 90.8%, showing minimal overfitting. It also achieved the lowest dHD (4.3 mm), ASSD (1.2 mm), and VOE (6.0%) in cross-validation. Per-class DSCs for Swin-UNet were consistently higher across all anatomical targets, including challenging structures like the pancreatic duct (91.0%) and bile duct (91.8%). Statistical analysis confirmed the superiority of Swin-UNet (p < 0.001). All models showed generalization capability, but Swin-UNet provided the most accurate and robust segmentation across datasets. Transformer-based architectures, particularly Swin-UNet, enable precise and generalizable multi-class segmentation of PDAC and surrounding anatomy. This framework has potential for clinical integration in PDAC diagnosis, staging, and therapy planning.