CFG-MambaNet: Contextual and Frequency-Guided Mamba Network for medical image segmentation.
Authors
Affiliations (10)
- Department of Cardiology, The Affiliated Lihuili Hospital of Ningbo University, Ningbo, China.
- Department of Cardiology, Xuzhou Central Hospital, Xuzhou, China.
- School of Software, Nanchang University, Nanchang, China.
- School of Mathematics and Computer Sciences, Nanchang University, Nanchang, China.
- Department of Cardiology, The Affiliated Hospital of Xuzhou Medical University, Institute of Cardiovascular Disease Research, Xuzhou Medical University, Xuzhou, China.
- Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China.
- Xuzhou Health Information Center, Xuzhou, China.
- Department of Cardiology, The Affiliated Lihuili Hospital of Ningbo University, Ningbo, China.
- Department of Cardiology, The Affiliated Lihuili Hospital of Ningbo University, Ningbo, China.
- Department of Cardiology, The Affiliated Hospital of Xuzhou Medical University, Institute of Cardiovascular Disease Research, Xuzhou Medical University, Xuzhou, China.
Abstract
Accurate medical image segmentation remains challenging because existing methods often struggle to achieve efficient global context modeling, precise boundary delineation, and robust generalization at the same time. To address these issues, a novel framework named Contextual and Frequency-Guided Mamba Network (CFG-MambaNet) is presented. Specifically, a variable-scale state space block based on Mamba is employed to capture long-range dependencies with linear complexity, avoiding the computational cost that Transformer-based models incur on high-resolution medical images. Moreover, a frequency-guided representation module is incorporated to explicitly separate global low-frequency structures from high-frequency boundary details, which alleviates the difficulty of segmenting lesions with blurred contours or weak textures. Furthermore, an adaptive context aggregation mechanism is introduced to integrate heterogeneous semantic cues and to highlight clinically critical regions, improving robustness across diverse anatomical scales and morphologies. To further stabilize training and improve boundary adherence, a composite loss combined with deep supervision is employed. Extensive experiments were conducted on four publicly available datasets (ACDC, Kvasir-SEG, ISIC, and SEED), covering cardiac MRI, endoscopy, dermoscopy, and pathology images.
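The idea of separating low-frequency structure from high-frequency boundary detail can be illustrated with a minimal, dependency-free sketch. The abstract does not specify how the frequency-guided module performs this separation, so the approach below (a box blur as the low-pass filter, with the residual taken as the high-frequency component) is an assumption chosen for simplicity, and the names `box_blur` and `split_frequencies` are hypothetical rather than taken from CFG-MambaNet.

```python
def box_blur(image, radius=1):
    """Low-pass filter: replace each pixel by the mean of its neighbourhood.

    Assumption for illustration only; windows are truncated at the borders.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += image[yy][xx]
                        count += 1
            out[y][x] = total / count
    return out


def split_frequencies(image, radius=1):
    """Return (low, high) such that low + high reproduces the input.

    low  : smooth, global structure (blurred image)
    high : residual detail, largest in magnitude near edges and boundaries
    """
    low = box_blur(image, radius)
    high = [
        [pixel - smooth for pixel, smooth in zip(row, low_row)]
        for row, low_row in zip(image, low)
    ]
    return low, high


# Toy 4x4 "image" with a vertical edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
low, high = split_frequencies(image)
```

In this toy example the high-frequency component is zero in the flat region far from the edge and non-zero next to it, which is the kind of boundary cue a frequency-guided branch could exploit. A learned module would more likely operate on feature maps and might use a Fourier or wavelet transform instead of a fixed blur.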