Vision Mamba Empowered by Dynamic Domain Generalization for Cross-Modality Medical Segmentation.
Affiliations (4)
- School of Computer Science and Information Engineering, Bengbu University, Bengbu, Anhui, China.
- Anhui Heneng Technology Co., LTD., Bengbu, Anhui, China.
- Universiti Malaysia Sarawak (UNIMAS), Kota Samarahan, Sarawak, Malaysia.
- School of Computer Science and Information Engineering, Bengbu University, Bengbu, Anhui, China. [email protected].
Abstract
Deep learning techniques have achieved significant advances in medical image segmentation in recent years. However, model generalization remains severely constrained by domain shift, particularly in cross-modality medical image segmentation tasks. Because the joint probability distributions of different imaging modalities differ, traditional segmentation models struggle to generalize to unseen target domains. Existing methods primarily rely on unsupervised domain adaptation (UDA) and domain generalization (DG). UDA methods face practical limitations because target-domain data are difficult to obtain, while current DG approaches often overlook the anatomical priors inherent in medical images as well as the heterogeneity and sparsity of lesion regions. To address these challenges, this paper proposes a cross-modality medical image segmentation framework that integrates the Vision Mamba model with dynamic domain generalization. The framework achieves cross-domain feature alignment and multi-scale feature fusion by leveraging bidirectional state-space sequence modeling, Bézier curve-based style augmentation, and a dual-normalization strategy. In addition, the VEBlock module is introduced, which combines the dynamic sequence modeling capability of the Mamba model with non-local attention mechanisms to better capture cross-modality global dependencies. Experimental results on the BraTS 2018 and cross-modality cardiac datasets demonstrate significant improvements in cross-modality segmentation. For example, on the T2 → T1 task, the framework achieves an average Dice score of 56.22%, outperforming baseline methods by 1.78% while reducing the Hausdorff distance for tumor boundaries to 13.26 mm. Furthermore, on the cardiac CT → MRI task, the Hausdorff distance is reduced to 27.34 mm, validating the proposed framework's strong generalization to complex anatomical structures.
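The abstract mentions Bézier curve-based style augmentation, a common ingredient of dual-normalization DG pipelines: pixel intensities are remapped through a random cubic Bézier curve so that the source domain yields both source-similar and source-dissimilar styles. The sketch below is a minimal NumPy illustration of that general idea, not the paper's implementation; the function name, control-point convention, and the assumption that intensities are normalized to [0, 1] are all ours.

```python
import numpy as np

def bezier_intensity_map(image, p1, p2, num_points=1000):
    """Remap intensities in [0, 1] through a cubic Bezier curve.

    The curve is anchored at (0, 0) and (1, 1); p1 and p2 are the two
    free control points that determine the (random) intensity transform.
    This is a generic sketch of Bezier-curve style augmentation, not the
    exact transform used in the paper.
    """
    t = np.linspace(0.0, 1.0, num_points)
    xs = np.array([0.0, p1[0], p2[0], 1.0])
    ys = np.array([0.0, p1[1], p2[1], 1.0])
    # Cubic Bezier via Bernstein basis polynomials of degree 3.
    basis = np.stack([(1 - t) ** 3,
                      3 * (1 - t) ** 2 * t,
                      3 * (1 - t) * t ** 2,
                      t ** 3])
    curve_x = basis.T @ xs
    curve_y = basis.T @ ys
    # np.interp requires increasing x; sort in case the curve folds back.
    order = np.argsort(curve_x)
    return np.interp(image, curve_x[order], curve_y[order])

# Source-similar style: control points near the diagonal (mild change);
# source-dissimilar style: control points far from it (strong change).
img = np.random.rand(16, 16)
mild = bezier_intensity_map(img, (0.3, 0.3), (0.7, 0.7))
strong = bezier_intensity_map(img, (0.9, 0.1), (0.1, 0.9))
```

In a dual-normalization setup, the mildly and strongly augmented copies would each be routed through their own batch-normalization statistics during training, which is what allows the model to match an unseen target domain's style at test time.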