Modeling inter-slice dependencies with temporal graph learning for Alzheimer's disease.
Authors
Affiliations (5)
- Department of Medicine, LSU Health Shreveport, Shreveport, LA, USA; Department of Software Engineering, Faculty of Engineering, Topkapı University, Istanbul, Turkey.
- Department of Pathology and Translational Pathobiology, LSU Health Shreveport, Shreveport, LA 71103, USA.
- Department of Medicine, LSU Health Shreveport, Shreveport, LA, USA; Department of Pediatrics, LSU Health Shreveport, Shreveport, LA 71103, USA.
- Department of Pediatrics, LSU Health Shreveport, Shreveport, LA 71103, USA.
- Department of Medicine, LSU Health Shreveport, Shreveport, LA, USA; Department of Pathology and Translational Pathobiology, LSU Health Shreveport, Shreveport, LA 71103, USA. Electronic address: [email protected].
Abstract
Accurate and early diagnosis of Alzheimer's disease (AD) remains a major clinical challenge, particularly in distinguishing mild cognitive impairment (MCI) from cognitively normal (CN) aging. Conventional approaches that rely solely on pre-trained 2D models often fail to capture the full spatial context of three-dimensional MRI volumes, as well as the temporal dependencies that exist across consecutive slices. In this study, we propose TGL-AD, a novel framework that integrates Vision Transformer (ViT) slice embeddings with temporal graph learning for subject-level AD classification. Standardized 3D MRI volumes were decomposed into 2D slices, encoded by a pre-trained ViT to generate 768-dimensional feature embeddings, and subsequently modeled as temporal graphs to preserve inter-slice continuity. Graph neural networks (GNNs) propagated contextual information across slices, and global pooling produced subject-level representations for the final classification. On the ADNI1 dataset, TGL-AD achieved an overall accuracy of 0.92 on the Complete 1Yr 1.5 T cohort and 0.98 on the Complete 3Yr 3 T cohort. For all diagnostic categories (AD, MCI, and CN), precision, recall, and F1-scores remained consistently high across both cohorts. Both the macro-averaged and weighted-averaged F1-scores were 0.92 for the 1.5 T data and 0.98 for the 3 T data. A comparison against state-of-the-art architectures further shows that TGL-AD outperforms CNN-based and transformer-based sequential baselines, attaining the highest recall (0.93 for 1.5 T and 0.98 for 3 T) and F1-score (0.92 for 1.5 T and 0.98 for 3 T). These findings show that integrating transformer-based slice encoders with temporal graph modeling effectively captures inter-slice dependencies and enhances classification performance across different acquisition settings.
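The pipeline described in the abstract (slice embeddings, a temporal graph over consecutive slices, message passing, global pooling, classification) can be illustrated with a minimal NumPy sketch. The abstract does not specify the graph construction, the GNN variant, the pooling operator, or any layer sizes, so everything below is an assumption made for illustration: a chain graph linking adjacent slices, one GCN-style layer with symmetric normalization, mean pooling, a hidden width of 64, and random arrays standing in for the 768-dimensional ViT embeddings and for trained weights. It is not the authors' implementation.

```python
import numpy as np


def chain_adjacency(num_slices: int) -> np.ndarray:
    """Adjacency of a chain graph: each slice links to its neighbours, plus self-loops.

    Assumption: "temporal" edges connect only consecutive slices.
    """
    adj = np.eye(num_slices)
    idx = np.arange(num_slices - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj


def normalize(adj: np.ndarray) -> np.ndarray:
    """Symmetric normalization D^{-1/2} A D^{-1/2}, as used in GCN-style layers."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat: np.ndarray, h: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One round of message passing: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_hat @ h @ w, 0.0)


def classify_subject(slice_embeddings: np.ndarray,
                     w_hidden: np.ndarray,
                     w_out: np.ndarray) -> np.ndarray:
    """Map per-slice embeddings of one subject to class probabilities."""
    a_hat = normalize(chain_adjacency(slice_embeddings.shape[0]))
    h = gcn_layer(a_hat, slice_embeddings, w_hidden)
    subject_vec = h.mean(axis=0)           # global mean pooling over slices
    logits = subject_vec @ w_out
    exp = np.exp(logits - logits.max())    # numerically stable softmax
    return exp / exp.sum()


# Placeholder data: 32 slices with 768-dimensional embeddings (random, not real ViT output).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(32, 768))
w_hidden = rng.normal(scale=0.02, size=(768, 64))  # untrained, hypothetical weights
w_out = rng.normal(scale=0.02, size=(64, 3))       # three classes: AD, MCI, CN
probs = classify_subject(embeddings, w_hidden, w_out)
print(probs)
```

In this sketch, each slice's representation after the layer mixes information from its immediate neighbours, which is one simple way inter-slice continuity could be preserved; stacking more layers would widen that neighbourhood. A trained system would learn `w_hidden` and `w_out` from labelled subjects rather than sampling them.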