SPMFE-UNet: shape perception and multi-scale features enhancement UNet for robust abdominal organ and skin lesion segmentation.
Authors
Affiliations (2)
- School of Information Science and Engineering, Yunnan University, Kunming, People's Republic of China.
- Department of Colorectal Surgery, The Third Affiliated Hospital of Kunming Medical University, Kunming, People's Republic of China.
Abstract
Convolutional neural networks demonstrate strong performance in medical image segmentation but face clinically significant challenges due to the morphological diversity of anatomical targets, including substantial variations in shape, scale, and position. To overcome these limitations, we propose the shape perception and multi-scale features enhancement UNet (SPMFE-UNet), a novel architecture designed to jointly learn discriminative geometric features (shapes and scales) for robust target perception. The proposed framework introduces two synergistic core modules to address the challenges of anatomical shape and scale variability: a shape perception module (SPM), which employs a dynamic gating mechanism to adaptively sharpen crucial contour features and suppress irrelevant background interference, and a multi-scale features enhancement module (MFEM), which leverages a parallel multi-branch convolutional architecture with varied receptive fields to capture and fuse hierarchical patterns, from local textures to global semantics. These co-optimized modules form an integrated feature learning pipeline in which the SPM purifies shape-related features and the MFEM enriches them with contextual information, enabling joint geometric perception for robust and accurate segmentation across heterogeneous clinical imaging scenarios. Experiments demonstrate competitive performance across three datasets: on Synapse, our method achieves a Dice score of 84.67% (surpassing CCViM by 2.02%) and an HD95 of 16.37 mm; on ISIC-2017, it attains a Dice of 92.36% (outperforming EMCAD by 1.15%) and a mean Intersection over Union (mIoU) of 86.93%; and on ISIC-2018, it reaches a Dice of 90.81% (exceeding CCViM by 0.75%) and an mIoU of 84.31%. Our method effectively mitigates mis-segmentation artifacts stemming from scale mismatches and shape irregularities, ultimately delivering superior robustness and accuracy in complex clinical imaging scenarios.
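The abstract describes two mechanisms, a dynamic gate that modulates shape-related features and a parallel multi-branch design with varied receptive fields whose outputs are fused, but this excerpt gives no implementation details. The NumPy sketch below is therefore only an illustration of these two general ideas, not the authors' actual modules: the function names, the 1x1-projection gate, the box-filter branches, and the averaging fusion are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def shape_perception_gate(feat, w_gate):
    """Illustrative gating (NOT the paper's SPM): a 1x1 channel
    projection produces a sigmoid gate in [0, 1] that multiplicatively
    emphasizes or suppresses features at each spatial location.

    feat:   (C, H, W) feature map
    w_gate: (C, C) weights of an assumed 1x1 convolution
    """
    gate = sigmoid(np.einsum('oc,chw->ohw', w_gate, feat))
    return gate * feat

def multi_scale_enhance(feat, scales=(1, 3, 5)):
    """Illustrative multi-branch fusion (NOT the paper's MFEM):
    box filters of different kernel sizes stand in for convolutional
    branches with varied receptive fields; branch outputs are fused
    here by simple averaging.
    """
    C, H, W = feat.shape
    branches = []
    for k in scales:
        pad = k // 2
        padded = np.pad(feat, ((0, 0), (pad, pad), (pad, pad)), mode='edge')
        out = np.zeros_like(feat)
        for dy in range(k):          # accumulate the k x k neighborhood
            for dx in range(k):
                out += padded[:, dy:dy + H, dx:dx + W]
        branches.append(out / (k * k))
    return np.mean(branches, axis=0)

# Toy usage: gate a random feature map, then fuse multi-scale context.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 8, 8))
w = rng.standard_normal((4, 4)) * 0.1
fused = multi_scale_enhance(shape_perception_gate(feat, w))
print(fused.shape)  # (4, 8, 8): spatial and channel dimensions preserved
```

The key property both toy functions share with the described pipeline is that each stage preserves the feature-map shape, so the gated and enhanced features can be passed along a U-Net-style decoder unchanged.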