Early Dementia Diagnosis in Older Adults through Machine Learning: A Cross-Sectional fMRI Data Analysis
Authors
Affiliations (1)
- Arizona State University
Abstract
Background: Early diagnosis of dementia can significantly improve care planning and patient outcomes while delaying progression. Machine learning algorithms can identify patterns in clinical and neuroimaging data that may aid in the early detection of dementia risk factors.
Objective: To evaluate the performance of an ensemble machine learning pipeline for classifying dementia status using demographic, clinical, and imaging features, and to identify the most predictive variables contributing to model accuracy.
Methods: A cross-sectional study analyzed 373 MRI scans from 150 subjects aged 60-98 years. Variables included cognitive scores (MMSE, CDR), volumetric brain measures (eTIV, nWBV, ASF), demographic features (age, sex, education), and socioeconomic status. After preprocessing and imputing missing values with random forests, tree-based variable selection was performed, and the dataset was split into training and test sets, with 5-fold cross-validation used for model validation. An ensemble of 8 machine learning models was used to classify patients as demented or non-demented.
Results: Model performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, sensitivity, specificity, precision, F1 score, and Matthews Correlation Coefficient (MCC). Random Forest achieved the highest AUC (0.963), while the multilayer perceptron (MLP) achieved the highest accuracy (94.6%), F1 score (0.943), and MCC (0.893). CDR, MMSE, and ASF were identified as the top predictors. Performance was robust across the 5 cross-validation folds, and feature importance analyses supported clinical relevance.
Conclusions: Ensemble ML approaches offer high predictive performance in dementia classification. ML frameworks have the potential to be integrated into diagnostic support tools, enabling more accurate and earlier detection of dementia using clinical and imaging data.
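The threshold-based metrics named in the Results (accuracy, sensitivity, specificity, precision, F1 score, and MCC) can all be derived from the four cells of a binary confusion matrix. The sketch below shows one way to compute them. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not the study's code or data. AUC is omitted because it requires predicted scores rather than counts.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn: demented cases predicted correctly / missed
    tn/fp: non-demented cases predicted correctly / flagged in error
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # positive predictive value

    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0

    # MCC uses all four cells; it is conventionally set to 0 when undefined.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT results from the study.
example = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, MCC accounts for all four cells of the confusion matrix, so it is less easily inflated when one class dominates. This is one reason it is often reported alongside accuracy and F1.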