HDFT-MViT: a progressive core-enhanced mix framework for Alzheimer's disease classification using MRI images.
Authors
Affiliations (2)
- College of Information Engineering, Henan University of Science and Technology, Luoyang, China.
- The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, Luoyang, China.
Abstract
Early and accurate diagnosis of Alzheimer's disease (AD) is critical. In MRI-based computer-aided diagnosis, convolutional neural networks (CNNs) excel at extracting local features but struggle to model long-range dependencies, whereas Vision Transformers (ViTs) offer strong global modeling but incur high computational cost, which limits their deployment in resource-constrained settings. This paper proposes HDFT-MViT, a lightweight hybrid architecture based on MobileViT that integrates a hierarchical dynamic filter with a lightweight Transformer. The model adopts a progressive Core-Enhanced Mix design: shallow layers employ MobileNetV2 inverted residual blocks for efficient local feature extraction, while intermediate and deep layers incorporate a dual-branch module that combines a dynamic filter for frequency-domain global modulation with a lightweight Transformer for spatial long-range dependency modeling; the two branches are then fused hierarchically via learnable weights. A channel attention mechanism is further introduced to enhance feature discriminability. Evaluations on the public ADNI-1 (3-class) and ADNI-2 (4-class) MRI datasets show that HDFT-MViT achieves state-of-the-art classification accuracies of 98.85 ± 0.27% and 98.07 ± 0.54%, respectively, with only 3.46 M parameters, supporting both its effectiveness and its efficiency. HDFT-MViT thus balances local detail perception and global semantic understanding within a computationally efficient framework, offering a promising tool for clinical AD diagnosis. Code will be released upon acceptance.
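The dual-branch module described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which has not been released; it only shows one plausible reading of the idea. All names (`dynamic_filter`, `self_attention`, `channel_attention`, `dual_branch_block`), the basis-mixing form of the dynamic filter, the single-head attention, the squeeze-and-excitation style channel gate, and every shape and parameter below are assumptions, and the weights are random rather than trained.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_filter(x, basis, w_gen):
    """Frequency-domain branch (assumed form).
    x: (C, H, W) real; basis: (K, H, W//2+1) complex; w_gen: (C, K)."""
    C, H, W = x.shape
    desc = x.mean(axis=(1, 2))                # per-channel global descriptor
    mix = softmax(desc @ w_gen)               # input-dependent mixing weights, (K,)
    filt = np.tensordot(mix, basis, axes=1)   # mixed filter, (H, W//2+1)
    spec = np.fft.rfft2(x, axes=(1, 2))       # spatial -> frequency domain
    return np.fft.irfft2(spec * filt, s=(H, W), axes=(1, 2))

def self_attention(x, wq, wk, wv):
    """Spatial branch: single-head attention over H*W tokens (assumed form)."""
    C, H, W = x.shape
    tokens = x.reshape(C, H * W).T            # (N, C)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return (attn @ v).T.reshape(C, H, W)

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel gate (assumed form)."""
    z = x.mean(axis=(1, 2))
    gate = 1.0 / (1.0 + np.exp(-(np.maximum(z @ w1, 0.0) @ w2)))
    return x * gate[:, None, None]

def dual_branch_block(x, p):
    freq = dynamic_filter(x, p["basis"], p["w_gen"])
    attn = self_attention(x, p["wq"], p["wk"], p["wv"])
    w = softmax(p["fusion_logits"])           # learnable fusion weights, sum to 1
    fused = w[0] * freq + w[1] * attn
    return channel_attention(fused, p["w1"], p["w2"])

# Toy feature map and random (untrained) parameters, for shape checking only.
rng = np.random.default_rng(0)
C, H, W, K = 8, 6, 6, 4
x = rng.standard_normal((C, H, W))
params = {
    "basis": rng.standard_normal((K, H, W // 2 + 1))
             + 1j * rng.standard_normal((K, H, W // 2 + 1)),
    "w_gen": rng.standard_normal((C, K)),
    "wq": rng.standard_normal((C, C)),
    "wk": rng.standard_normal((C, C)),
    "wv": rng.standard_normal((C, C)),
    "fusion_logits": np.zeros(2),             # equal weighting at initialization
    "w1": rng.standard_normal((C, C // 2)),
    "w2": rng.standard_normal((C // 2, C)),
}
out = dual_branch_block(x, params)
```

In this sketch the output keeps the input shape `(C, H, W)`, so such a block could in principle be stacked at several depths. Passing the fusion logits through a softmax keeps the two branch contributions on a comparable scale; whether the paper normalizes its learnable weights this way is not stated in the abstract.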