AlzFormer: Multi-modal framework for Alzheimer's classification using MRI and graph-embedded demographics guided by adaptive attention gating.
Authors
Affiliations (5)
- School of Automation, Central South University, Changsha, 410083, China.
- School of Automation, Central South University, Changsha, 410083, China. Electronic address: [email protected].
- School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
- Department of Pathology, College of Medicine, The Ohio State University Wexner Medical Center, Columbus, OH 43210, USA.
- School of Intelligent Manufacturing and Smart Transportation, Suzhou City University, Suzhou City, 215000, China.
Abstract
Alzheimer's disease (AD) is the most common progressive neurodegenerative disorder and the fifth-leading cause of death in older people. Detecting AD is challenging for clinicians and radiologists because of the complex nature of the disease, which motivates automatic, data-driven machine learning models that enhance diagnostic accuracy and support expert decision-making. In AD classification, however, machine learning models are hindered by three key limitations: (i) diffuse and subtle structural changes in the brain, which make it difficult to capture global pathology; (ii) non-uniform alterations across MRI planes, which limit single-view learning; and (iii) the lack of deep integration of demographic context, which is often ignored despite its clinical importance. To address these challenges, we propose a novel multi-modal deep learning framework, named AlzFormer, that dynamically integrates 3D MRI with demographic features represented as knowledge graph embeddings for AD classification. Specifically, (i) to capture global and volumetric features, a 3D CNN is employed; (ii) to model plane-specific information, three parallel 2D CNNs are used for tri-planar processing (axial, coronal, sagittal), combined with a Transformer encoder; and (iii) to incorporate demographic context, we integrate demographic features as knowledge graph embeddings through a novel Adaptive Attention Gating mechanism that balances the contributions of the two modalities (i.e., MRI and demographics). Comprehensive experiments on two real-world datasets, including generalization tests, ablation studies, and robustness evaluation under noisy conditions, demonstrate that the proposed model provides a robust and effective solution for AD diagnosis. These results suggest strong potential for integration into Clinical Decision Support Systems (CDSS), offering a more interpretable and personalized approach to early Alzheimer's detection.
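To make the idea of balancing the MRI and demographic modalities more concrete, the following is a minimal sketch of one common form of gated fusion. It is not the authors' Adaptive Attention Gating implementation, whose details the abstract does not give. The function name gated_fusion, the per-dimension sigmoid gate, and the assumption that both feature vectors have already been projected to the same length are all illustrative choices.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(mri_feat, demo_feat, weights, bias):
    """Fuse two same-length feature vectors with a per-dimension gate.

    Hypothetical formulation (an assumption, not taken from the paper):
        gate  = sigmoid(W @ concat(mri, demo) + b)
        fused = gate * mri + (1 - gate) * demo
    `weights` has one row per output dimension, each of length 2 * d.
    """
    if len(mri_feat) != len(demo_feat):
        raise ValueError("both modalities must share the same feature length")
    d = len(mri_feat)
    if len(weights) != d or len(bias) != d:
        raise ValueError("need one weight row and one bias per feature dimension")

    joint = list(mri_feat) + list(demo_feat)
    gate = []
    for row, b in zip(weights, bias):
        if len(row) != 2 * d:
            raise ValueError("each weight row must have length 2 * d")
        gate.append(sigmoid(sum(w * x for w, x in zip(row, joint)) + b))

    # A gate near 1 favours the MRI feature, a gate near 0 the demographic one.
    fused = [g * m + (1.0 - g) * v for g, m, v in zip(gate, mri_feat, demo_feat)]
    return fused, gate
```

In this sketch the gate is computed from both modalities jointly, so the weighting can differ per subject and per feature dimension. In a trained model the weights and bias would be learned together with the MRI and demographic encoders; here they are plain inputs so the behaviour can be inspected directly.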