ES-UNet: efficient 3D medical image segmentation with enhanced skip connections in 3D UNet.

Authors

Park M, Oh S, Park J, Jeong T, Yu S

Affiliations (3)

  • School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-Ro, Dongjak-Gu, Seoul, 06974, Republic of Korea.
  • School of Artificial Intelligence Convergence, Hallym University, 1 Hallymdaehak-Gil, Chuncheon, 24252, Republic of Korea.
  • School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-Ro, Dongjak-Gu, Seoul, 06974, Republic of Korea.

Abstract

Deep learning has significantly advanced medical image analysis, particularly in semantic segmentation, which is essential for clinical decisions. However, existing 3D segmentation models, like the traditional 3D UNet, face challenges in balancing computational efficiency and accuracy when processing volumetric medical data. This study aims to develop an improved architecture for 3D medical image segmentation with enhanced learning strategies to improve accuracy and address challenges related to limited training data. We propose ES-UNet, a 3D segmentation architecture that achieves superior segmentation performance while offering competitive efficiency across multiple computational metrics, including memory usage, inference time, and parameter count. The model builds upon the full-scale skip connection design of UNet3+ by integrating channel attention modules into each encoder-to-decoder path and incorporating full-scale deep supervision to enhance multi-resolution feature learning. We further introduce Region Specific Scaling (RSS), a data augmentation method that adaptively applies geometric transformations to annotated regions, and a Dynamically Weighted Dice (DWD) loss to improve the balance between precision and recall. The model was evaluated on the MICCAI HECKTOR dataset, and additional validation was conducted on selected tasks from the Medical Segmentation Decathlon (MSD). On the HECKTOR dataset, ES-UNet achieved a Dice Similarity Coefficient (DSC) of 76.87%, outperforming baseline models including 3D UNet, 3D UNet 3+, nnUNet, and Swin UNETR. Ablation studies showed that RSS and DWD contributed up to 1.22% and 1.06% improvement in DSC, respectively. A sensitivity analysis demonstrated that the chosen scaling range in RSS offered a favorable trade-off between deformation and anatomical plausibility. Cross-dataset evaluation on MSD Heart and Spleen tasks also indicated strong generalization. 
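For readers unfamiliar with the reported metric, the Dice Similarity Coefficient between a predicted mask P and a reference mask G is 2|P ∩ G| / (|P| + |G|). The sketch below computes it for binary 3D volumes with NumPy. The function name, the toy volumes, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def dice_similarity_coefficient(pred, target):
    """Standard Dice overlap between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)

# Toy 3D volumes (hypothetical data): the prediction covers 8 voxels,
# the reference covers 12 voxels, and they share 8 voxels.
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:4] = True

dsc = dice_similarity_coefficient(pred, target)  # 2 * 8 / (8 + 12) = 0.8
```

A DSC of 76.87% therefore corresponds to a value of about 0.7687 on this 0-to-1 scale.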
Computational analysis revealed that ES-UNet achieves superior segmentation performance with moderate computational demands. Specifically, the enhanced skip connection design with lightweight channel attention modules integrated throughout the network architecture enables this favorable balance between high segmentation accuracy and computational efficiency. ES-UNet integrates architectural and algorithmic improvements to achieve robust 3D medical image segmentation. While the framework incorporates established components, its core contributions lie in the optimized skip connection strategy and supporting techniques like RSS and DWD. Future work will explore adaptive scaling strategies and broader validation across diverse imaging modalities.
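The abstract does not specify which channel attention module is placed on the skip paths. As a rough illustration of the general idea only, the sketch below applies a squeeze-and-excitation style gate to a 3D feature map in NumPy. The function name, the reduction ratio, the random weights, and the choice of a squeeze-and-excitation form are all assumptions made for this example and should not be read as the authors' implementation.

```python
import numpy as np

def channel_attention_3d(features, w1, w2):
    """Squeeze-and-excitation style gating (assumed form, not from the paper).

    features: array of shape (C, D, H, W)
    w1:       array of shape (C // r, C), bottleneck projection
    w2:       array of shape (C, C // r), expansion projection
    """
    # Squeeze: global average pool over the three spatial axes.
    squeezed = features.mean(axis=(1, 2, 3))           # shape (C,)
    # Excite: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)            # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))       # shape (C,)
    # Rescale each channel of the skip feature map by its gate.
    return features * gates[:, None, None, None], gates

rng = np.random.default_rng(0)
C, r = 8, 4                                            # hypothetical sizes
features = rng.standard_normal((C, 4, 6, 6))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

reweighted, gates = channel_attention_3d(features, w1, w2)
```

A gate of this kind adds only two small projections per skip path, which is consistent with the abstract's description of the modules as lightweight, though the actual parameter cost depends on details the abstract does not give.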

Topics

Journal Article
