An interpretable cross-attentive multi-modal MRI fusion framework for schizophrenia identification.
Authors
Affiliations (5)
- Tulane University, Department of Computer Science, 6823 St. Charles Ave, New Orleans, 70118, LA, USA.
- Tulane University, Department of Biomedical Engineering, 6823 St. Charles Ave, New Orleans, 70118, LA, USA.
- Georgia State University, Georgia Institute of Technology, Emory University, Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), 55 Park Pl NE, Atlanta, 30303, GA, USA.
- Mind Research Network, 1101 Yale Blvd NE, Albuquerque, 87106, NM, USA.
- Boys Town National Research Hospital, Institute for Human Neuroscience, 14000 Boys Town Hospital Rd, Boys Town, 68010, NE, USA.
Abstract
Functional MRI (fMRI) and structural MRI (sMRI) offer complementary insights into brain function and anatomy, but integrating them for schizophrenia identification remains challenging because of modality heterogeneity. Many existing methods do not effectively model the interaction between the two modalities. We propose CAMF, a Cross-Attentive Multi-modal Fusion framework that employs self-attention to capture intra-modal patterns and cross-attention to learn inter-modal relationships. In addition, we introduce a gradient-guided score-class activation map to enhance interpretability by highlighting salient features. Evaluated on multi-modal brain imaging datasets from four schizophrenia cohorts, our approach significantly improves schizophrenia classification accuracy. Furthermore, the model identifies functional networks and anatomical regions that align with established biomarkers. CAMF provides an accurate and interpretable framework for multi-modal brain imaging analysis, offering new insights into schizophrenia-related alterations.
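The self-attention and cross-attention steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the CAMF implementation: the token counts, the feature dimension, the absence of learned query/key/value projections, and the mean-pool-and-concatenate fusion are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
d_model = 16                                   # hypothetical feature dimension
fmri_tokens = rng.normal(size=(10, d_model))   # hypothetical: 10 functional-network tokens
smri_tokens = rng.normal(size=(6, d_model))    # hypothetical: 6 anatomical-region tokens

# Intra-modal step: each modality attends to its own tokens (self-attention).
fmri_self, _ = attention(fmri_tokens, fmri_tokens, fmri_tokens)
smri_self, _ = attention(smri_tokens, smri_tokens, smri_tokens)

# Inter-modal step: queries come from one modality, keys and values from the other.
fmri_cross, w_fmri_to_smri = attention(fmri_self, smri_self, smri_self)
smri_cross, w_smri_to_fmri = attention(smri_self, fmri_self, fmri_self)

# Assumed fusion: mean-pool each stream and concatenate into one subject-level vector
# that a downstream classifier could consume.
fused = np.concatenate([fmri_cross.mean(axis=0), smri_cross.mean(axis=0)])
```

Each row of `w_fmri_to_smri` is a probability distribution over the sMRI tokens, so it indicates which anatomical tokens a given functional token draws on. A trained model would add learned projections, multiple heads, and a classification head on top of `fused`.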