Multimodal hybrid Mamba classification model for tumor pathological grade prediction using magnetic resonance images.
Affiliations (8)
- The Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China.
- The School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China. Electronic address: [email protected].
- The Department of Radiology, Beijing Tongren Hospital, Capital Medical University, Beijing 100005, China. Electronic address: [email protected].
- The School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China.
- The School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou 325027, China.
- The School of Computer Science, Beijing Institute of Technology, Beijing 100081, China; The School of Biomedical Engineering, Hainan University, Hainan 570228, China.
- The Department of Radiology, Beijing Tongren Hospital, Capital Medical University, Beijing 100005, China.
- The Beijing Engineering Research Center of Mixed Reality and Advanced Display, School of Optics and Electronics, Beijing Institute of Technology, Beijing 100081, China; The School of Biomedical Engineering, Hainan University, Hainan 570228, China; The School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou 325027, China. Electronic address: [email protected].
Abstract
Malignant tumors present a significant global health challenge, and accurate pathological grading is essential for personalized treatment. Traditional grading methods, which rely on invasive biopsies, are limited by tumor location. In contrast, magnetic resonance imaging (MRI) offers a non-invasive, high-resolution tool, with multi-sequence MRI (e.g., T1, T2, T1C) enabling comprehensive tumor assessment. However, existing methods often struggle to capture cross-modal correlations and global dependencies. To address this limitation, we propose the Multimodal Hybrid Mamba (MSHM) classification model for tumor pathological grade prediction. The model integrates convolutional neural networks for shallow feature extraction, Mamba encoders for modeling global dependencies, and cross-modal attention to fuse multi-sequence MRI data. The Mamba-Fusion module further refines the global features, enhancing lesion recognition and computational efficiency. Experimental results demonstrate that MSHM outperforms existing methods, achieving 98.36 ± 1.00% AUC and 92.08 ± 3.26% F1-score on a private multi-center orbital adnexal lymphoma dataset, and 98.93 ± 0.19% AUC and 95.82 ± 0.62% F1-score on the public glioma BraTS 2024 dataset. Additionally, MSHM performs exceptionally well on the LLD-MMRI dataset, achieving 99.25 ± 0.26% AUC and 96.97 ± 0.55% F1-score in distinguishing between benign and malignant liver lesions, further validating the model's robust performance across diverse datasets. Ablation studies confirm the effectiveness of the proposed modules. Overall, MSHM strikes a balance between high performance and efficiency, advancing both tumor pathological grade prediction and multimodal medical image analysis.
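The cross-modal attention fusion described in the abstract can be illustrated with a minimal numpy sketch. This is an illustrative approximation only, not the authors' implementation: token counts, feature dimensions, and the absence of learned projection matrices are all assumptions made here for brevity. Queries come from one MRI sequence (e.g., T1) and keys/values from another (e.g., T2), so each T1 token attends over the T2 tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(q_feat, kv_feat, d_k):
    # q_feat:  (N_q, d)  tokens from the query modality (e.g., T1)
    # kv_feat: (N_kv, d) tokens from the key/value modality (e.g., T2)
    # Scaled dot-product attention; learned Q/K/V projections omitted
    # for simplicity (an assumption of this sketch).
    scores = q_feat @ kv_feat.T / np.sqrt(d_k)   # (N_q, N_kv)
    weights = softmax(scores, axis=-1)           # rows sum to 1
    return weights @ kv_feat                     # (N_q, d) fused tokens

# Toy example: 16 tokens of dimension 32 per modality.
t1 = rng.standard_normal((16, 32))
t2 = rng.standard_normal((16, 32))
fused = cross_modal_attention(t1, t2, d_k=32)
print(fused.shape)  # (16, 32)
```

In a full model such as the one the abstract describes, this fusion would sit between the per-modality encoders and the classifier head, and each modality pair would use learned projections rather than raw features.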