Multi-scheme cross-level attention embedded U-shape transformer for MRI semantic segmentation.
Authors
Affiliations (3)
- UAV Industry Academy, Chengdu Aeronautic Polytechnic, Chengdu, 610100, China.
- Department of Electrics and Information Engineering, Beihang University, Beijing, 100191, China.
- Chengdu University of Information Technology, Chengdu, 610225, China.
Abstract
Accurate MRI image segmentation is crucial for disease diagnosis, but current Transformer-based methods face two key challenges. First, their limited capability to capture detailed information leads to blurred boundaries and false localization. Second, attention modules lack MRI-specific embedding paradigms, which limits their potential and representation capability. To address these challenges, this paper proposes a multi-scheme cross-level attention embedded U-shape Transformer (MSCL-SwinUNet). The model integrates three attention schemes: cross-level spatial-wise attention (SW-Attention) to transfer detailed information from encoder to decoder, cross-stage channel-wise attention (CW-Attention) to filter out redundant features and enhance task-related channels, and multi-stage scale-wise attention (ScaleW-Attention) to adaptively process multi-scale features. Extensive experiments on the ACDC, MM-WHS and Synapse datasets demonstrate that the proposed MSCL-SwinUNet surpasses state-of-the-art methods in accuracy and generalizability. Visualization further confirms the superiority of our model in preserving detailed boundaries. This work not only advances Transformer-based segmentation in medical imaging but also provides new insights into designing MRI-specific attention embedding paradigms. Our code is available at https://github.com/waylans/MSCL-SwinUNet.
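The abstract does not say how CW-Attention is computed, so the following is only a rough illustration of what channel-wise attention generally means: each channel of a feature map is rescaled by a learned weight between 0 and 1, which suppresses redundant channels and keeps task-related ones. The sketch uses squeeze-and-excitation-style gating, a standard technique swapped in here for illustration, and is not the authors' module. The function name `channel_gate`, the weight matrices `w1` and `w2`, and the reduction ratio `r` are all hypothetical; the weights are random rather than trained.

```python
import numpy as np


def channel_gate(features, w1, w2):
    """Rescale the channels of a (C, H, W) feature map (SE-style gating).

    Assumption: this is a generic channel-attention sketch, not the
    CW-Attention module from the paper.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))            # shape (C,)
    # Excite: a small bottleneck MLP with ReLU, then a sigmoid gate.
    hidden = np.maximum(w1 @ squeezed, 0.0)          # shape (C // r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # shape (C,), each in (0, 1)
    # Broadcast the per-channel weights over the spatial dimensions.
    return features * weights[:, None, None], weights


# Hypothetical sizes and random (untrained) parameters for demonstration.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

gated, weights = channel_gate(features, w1, w2)
print(gated.shape, weights.shape)
```

In a U-shaped network a gate of this kind would typically be applied to features before they are merged across stages. How MSCL-SwinUNet actually wires its attention modules is described in the paper and the linked repository, not here.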