Early detection of Alzheimer's disease progression stages using hybrid of CNN and transformer encoder models.

Authors

Almalki H, Khadidos AO, Alhebaishi N, Senan EM

Affiliations (6)

  • Department of Information Technology, College of Technology for Communications and Information, Technical and Vocational Training Corporation, Riyadh, Saudi Arabia. [email protected].
  • Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia. [email protected].
  • Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia.
  • Center of Research Excellence in Artificial Intelligence and Data Science, King Abdulaziz University, Jeddah, Saudi Arabia.
  • Department of Computer Science, College of Applied Sciences, Hajjah University, Hajjah, Yemen.
  • Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Al-Razi University, Sana'a, Yemen.

Abstract

Alzheimer's disease (AD) is a neurodegenerative disorder that affects memory and cognitive functions. Manual diagnosis is prone to human error, often leading to misdiagnosis or delayed detection. MRI helps visualize fine brain tissue structures that indicate the stage of disease progression, and artificial intelligence techniques can analyze MRI scans with high accuracy, extracting subtle features that are difficult to assess manually. In this study, a methodology was designed that combines the strength of CNN models (ResNet101 and GoogLeNet) in extracting local deep features with the strength of Vision Transformer (ViT) models in extracting global features and modeling relationships between image patches. First, the MRI images of the Open Access Series of Imaging Studies (OASIS) dataset were enhanced with two filters: an adaptive median filter (AMF) and a Laplacian filter. The ResNet101 and GoogLeNet models were modified to suit the feature-extraction task and reduce computational cost. The ViT architecture was likewise modified to reduce computational cost while increasing the number of attention heads, so that it can better discover global features and relationships between image patches. The enhanced images were fed to the modified ResNet101 and GoogLeNet models to extract deep feature maps, which were then passed to the modified ViT model. The feature maps were partitioned into 32 patches (ResNet101) or 16 patches (GoogLeNet), each with 64 features, and positionally encoded so that the self-attention layers could recognize the spatial arrangement of each patch and preserve the relationships between patches. The encoded patches were fed to the transformer encoder, which consists of six blocks with multiple attention heads that focus on different patterns or regions simultaneously.
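The preprocessing stage described above (adaptive median filtering followed by Laplacian edge enhancement) can be sketched roughly as follows. This is an illustrative NumPy/SciPy sketch, not the authors' implementation; the `max_window` and `alpha` parameters are assumptions, and the paper does not specify how its AMF and Laplacian filters are configured.

```python
import numpy as np
from scipy.ndimage import laplace

def adaptive_median_filter(img, max_window=7):
    """Adaptive median filter: at each pixel, grow the window until the
    local median is not itself an extreme (impulse) value, then keep the
    pixel if it is not an impulse, otherwise replace it with the median."""
    img = img.astype(np.float64)
    pad = max_window // 2
    padded = np.pad(img, pad, mode="reflect")
    out = img.copy()
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            ci, cj = i + pad, j + pad              # centre in padded coords
            for k in range(1, pad + 1):
                win = padded[ci - k:ci + k + 1, cj - k:cj + k + 1]
                med, lo, hi = np.median(win), win.min(), win.max()
                if lo < med < hi:                  # median is reliable
                    out[i, j] = img[i, j] if lo < img[i, j] < hi else med
                    break
            else:                                  # window maxed out
                out[i, j] = med
    return out

def laplacian_sharpen(img, alpha=0.5):
    """Edge enhancement: subtract a scaled Laplacian from the image."""
    return img.astype(np.float64) - alpha * laplace(img.astype(np.float64))
```

Impulse noise is removed first so that the Laplacian, which amplifies high-frequency content, sharpens tissue boundaries rather than noise.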
Finally, MLP classification layers assign each image to one of the four dataset classes. The hybrid ResNet101-ViT methodology outperformed the hybrid GoogLeNet-ViT methodology, achieving 98.7% accuracy, 95.05% AUC, 96.45% precision, 99.68% sensitivity, and 97.78% specificity.
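As a rough illustration of the patch-encoding and multi-head self-attention step described in the abstract, the NumPy sketch below uses random matrices in place of learned weights, so it shows only the data flow, not a trained model. The 32 patches of 64 features follow the ResNet101 branch; the number of heads and the positional-encoding scale are assumptions.

```python
import numpy as np

def multi_head_self_attention(patches, num_heads, rng):
    """One multi-head self-attention pass over CNN patch embeddings.

    patches: (num_patches, dim) array of per-patch feature vectors.
    Random projections stand in for learned Q/K/V weights."""
    n, d = patches.shape
    dh = d // num_heads                        # per-head dimension
    # positional encodings let attention distinguish patches by position
    pos = rng.standard_normal((n, d)) * 0.02
    x = patches + pos
    heads = []
    for _ in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((d, dh)) for _ in range(3))
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(dh)         # (n, n) patch-to-patch affinities
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)     # softmax over patches
        heads.append(w @ v)                    # each head attends globally
    return np.concatenate(heads, axis=-1)      # back to (n, d)

rng = np.random.default_rng(0)
patches = rng.standard_normal((32, 64))        # ResNet101 branch: 32 patches x 64 features
out = multi_head_self_attention(patches, num_heads=8, rng=rng)
```

Because every patch attends to every other patch, each head captures a different global relationship; stacking six such encoder blocks, as the paper describes, lets later blocks refine the relationships found by earlier ones.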

Topics

Alzheimer Disease; Neural Networks, Computer; Journal Article