Automated segmentation of esophageal squamous cell carcinoma in contrast-enhanced free-breathing 3D-GRE: a comparative study of UNet, nnUNet, and UMamba for tumor delineation.
Authors
Affiliations (8)
- Department of Information, The Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, No. 127 Dongming Road, Zhengzhou, Henan, 450008, China.
- Department of Radiology, The Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, Henan, 450008, China.
- HNHC Key Laboratory of Oncology Medical Imaging Response Assessment, Henan Cancer Hospital, Zhengzhou, Henan, 450008, China.
- MR Research Collaboration Team, Siemens Healthineers Ltd, Beijing, China.
- Department of Radiology, Guangdong General Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China.
- Department of Information, The Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, No. 127 Dongming Road, Zhengzhou, Henan, 450008, China. [email protected].
- Department of Radiology, The Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou, Henan, 450008, China. [email protected].
- HNHC Key Laboratory of Oncology Medical Imaging Response Assessment, Henan Cancer Hospital, Zhengzhou, Henan, 450008, China. [email protected].
Abstract
Esophageal squamous cell carcinoma (ESCC) is a highly prevalent and aggressive malignancy associated with poor prognosis. Accurate tumor segmentation is critical for the detection and characterization of ESCC, but manual segmentation remains labor-intensive and time-consuming. Deep learning presents a promising approach for automating this process. This retrospective study assessed the performance of three deep learning models (UNet, nnUNet in 2D and 3D configurations, and UMamba) for ESCC segmentation using contrast-enhanced free-breathing 3D-GRE MRI with 1 mm isotropic resolution. The dataset comprised 192 patients, divided into 171 for training and 21 for validation. Manual annotations excluded calcifications and hemorrhage; the maximum tumor cross-section was used for 2D model training, and the full volumetric tumor layers were used for the 3D architectures. Segmentation performance was evaluated using the Dice Similarity Coefficient (DSC), Hausdorff Distance (HD), and mean surface distance (MSD). UMamba achieved the highest DSC of 0.764, significantly outperforming nnUNet-3D (0.738) and nnUNet-2D (0.702). UMamba also demonstrated superior boundary delineation, with the lowest HD (5.048 mm) and MSD (1.088 mm). These results highlight the importance of 3D contextual analysis over traditional 2D approaches, as illustrated by the 5.1% relative improvement in DSC of nnUNet-3D over nnUNet-2D (0.738 vs. 0.702). The computational efficiency of each model was also evaluated, revealing that the 3D models had longer inference times than their 2D counterparts. UMamba demonstrates superior performance for automated ESCC segmentation on high-resolution MRI, providing a robust tool that has the potential to reduce manual segmentation time and inter-observer variability in clinical practice.
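The three evaluation metrics named in the abstract (DSC, HD, MSD) can be illustrated with a short Python sketch. This is not the authors' evaluation code: the abstract does not say how the metrics were implemented, so the function names, the brute-force nearest-neighbor search, the 1 mm default spacing (taken from the stated isotropic resolution), and the choice to define MSD as the average of the two directed mean surface distances are all assumptions made here for illustration.

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import cdist


def dice(pred, truth):
    """Dice Similarity Coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def surface_points(mask, spacing):
    """Physical coordinates (mm) of the boundary voxels of a binary mask."""
    mask = mask.astype(bool)
    border = mask & ~binary_erosion(mask)  # voxels removed by one erosion step
    return np.argwhere(border) * np.asarray(spacing, dtype=float)


def surface_distances(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Nearest-surface distances in both directions (pred->truth, truth->pred).

    Brute force via a full pairwise distance matrix; adequate for small
    masks, but memory-hungry for large surfaces.
    """
    d = cdist(surface_points(pred, spacing), surface_points(truth, spacing))
    return d.min(axis=1), d.min(axis=0)


def hausdorff(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance: the worst-case surface disagreement."""
    d_pt, d_tp = surface_distances(pred, truth, spacing)
    return max(d_pt.max(), d_tp.max())


def mean_surface_distance(pred, truth, spacing=(1.0, 1.0, 1.0)):
    """Average of the two directed mean surface distances (one common
    definition; others pool all distances before averaging)."""
    d_pt, d_tp = surface_distances(pred, truth, spacing)
    return (d_pt.mean() + d_tp.mean()) / 2.0


if __name__ == "__main__":
    # Toy example: two 4x4x4 cubes offset by one voxel along the first axis.
    truth = np.zeros((10, 10, 10), dtype=bool)
    pred = np.zeros((10, 10, 10), dtype=bool)
    truth[2:6, 2:6, 2:6] = True
    pred[3:7, 2:6, 2:6] = True
    print("DSC:", dice(pred, truth))
    print("HD (mm):", hausdorff(pred, truth))
    print("MSD (mm):", mean_surface_distance(pred, truth))
```

DSC rewards volumetric overlap, whereas HD is driven by the single worst boundary error and MSD by the average boundary error, which is why a model can rank differently on overlap and on boundary metrics. Both surface functions assume non-empty masks; an empty prediction would need separate handling.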