Predicting future dementia from routine clinical MRI and linked healthcare data.
Authors
Affiliations (8)
- Division of Population Health and Genomics, University of Dundee, Dundee, UK.
- Health Informatics Centre, University of Dundee, Dundee, UK.
- Division of Neuroscience, University of Dundee, Dundee, UK.
- College of Engineering, University of Wasit, Kut, Iraq.
- Division of Population Health and Genomics, University of Dundee, Dundee, UK.
- Health Informatics Centre, University of Dundee, Dundee, UK.
- School of Science and Engineering, Computing, VAMPIRE Project, University of Dundee, Dundee, UK.
- Health Data Research UK, London, UK.
Abstract
Early identification of individuals at risk of dementia is essential for preventive care and timely enrolment into disease-modifying interventions. However, most existing prediction approaches rely on invasive, costly, or research-only biomarkers that do not scale within public healthcare systems. Routinely acquired National Health Service (NHS) brain magnetic resonance imaging (MRI) scans, when linked with electronic health records, represent a widely available and privacy-preserving resource for population-level dementia risk stratification. A key challenge for clinical translation is ensuring that machine-learning predictions are reliable, interpretable, and safe to apply, particularly when models are used years before clinical diagnosis.
We conducted a retrospective case-control study entirely within a secure NHS Trusted Research Environment, using routine T1-weighted brain MRI scans linked to electronic health records from Tayside and Fife, Scotland. The study included 518 participants: 259 individuals who subsequently developed dementia and 259 age- and sex-matched controls. Structural brain features were derived from the MRI data and analysed using a support-vector-machine classifier, with nested cross-validation to minimise overfitting. Prediction confidence was quantified using distance-from-hyperplane (DFH) calibration, which allowed model outputs to be stratified by certainty. Primary outcomes were classification accuracy and area under the receiver-operating-characteristic curve (AUC). Secondary analyses examined DFH-stratified performance and the relationship between prediction accuracy and time from scan to first recorded dementia diagnosis.
The model predicted future dementia up to five years before the first recorded NHS diagnosis with an AUC of 0.71, a level of performance consistent with real-world clinical imaging rather than research-optimised datasets.
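
The nested cross-validation design mentioned above can be illustrated with a minimal sketch. It assumes scikit-learn, a linear kernel, a small grid over the regularisation parameter `C`, and 5 outer and 3 inner folds; none of these choices is reported in the abstract, and the data are synthetic stand-ins rather than MRI features. The point is only that hyperparameters are tuned on inner folds while performance is estimated on outer folds the tuning never saw.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for structural brain features (NOT the study data):
# 518 rows and roughly balanced classes, mirroring only the cohort size.
X, y = make_classification(
    n_samples=518, n_features=20, n_informative=5,
    weights=[0.5, 0.5], random_state=0,
)

# Scaling sits inside the pipeline so it is re-fitted on each training fold.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="linear"))

# Hypothetical hyperparameter grid; the abstract does not report one.
param_grid = {"svc__C": [0.01, 0.1, 1.0]}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)

# Inner loop: select C using only the outer-training portion of the data.
search = GridSearchCV(pipeline, param_grid, cv=inner_cv, scoring="roc_auc")

# Outer loop: score each tuned model on a fold that played no part in tuning.
outer_auc = cross_val_score(search, X, y, cv=outer_cv, scoring="roc_auc")

print(f"outer-fold AUC: {outer_auc.mean():.2f} +/- {outer_auc.std():.2f}")
```

In a matched case-control design it may also be sensible to keep each matched pair in the same fold (for example with a group-aware splitter); whether the study did so is not stated in the abstract.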
Model sensitivity increased for scans acquired closer to diagnosis, indicating a stronger predictive signal as disease onset approached. Confidence-based stratification identified a high-confidence subgroup comprising approximately 35% of scans, within which prediction accuracy increased to around 80%. Performance was consistent across heterogeneous routine NHS scanners and imaging protocols, demonstrating robustness and generalisability to real-world clinical data.
Routinely collected NHS brain MRI data can be used to predict future dementia several years before clinical diagnosis. Incorporating confidence calibration turns a conventional machine-learning classifier into a safety-aware and clinically interpretable framework by enabling selective use of high-certainty predictions. This approach supports scalable early detection, population-level risk stratification, and targeted recruitment into preventive or disease-modifying clinical trials, with clear potential for integration into public health systems.
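
Confidence-based stratification of this kind can be sketched in a few lines. The example below assumes each scan has a signed decision value from the classifier (for instance, the output of scikit-learn's `SVC.decision_function`, which is proportional to the distance from the hyperplane) and that a prediction counts as high-confidence when its absolute value exceeds a threshold. The function name `selective_accuracy`, the threshold of 1.0, and the ten toy values are illustrative assumptions, not the study's calibration procedure or data.

```python
from typing import Sequence, Tuple


def selective_accuracy(
    decision_values: Sequence[float],
    labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) over predictions with |value| >= threshold.

    A positive decision value is read as class 1, anything else as class 0.
    Coverage is the fraction of all predictions that pass the threshold.
    """
    kept = [(d, y) for d, y in zip(decision_values, labels) if abs(d) >= threshold]
    if not kept:
        return 0.0, float("nan")
    correct = sum(1 for d, y in kept if (1 if d > 0 else 0) == y)
    return len(kept) / len(decision_values), correct / len(kept)


# Toy decision values and true labels, made up for illustration only.
decision_values = [-2.1, -1.4, -0.3, 0.2, 0.4, 1.6, 2.3, -0.5, 0.1, 1.2]
labels = [0, 0, 1, 0, 1, 1, 1, 0, 0, 0]

all_cov, all_acc = selective_accuracy(decision_values, labels, threshold=0.0)
hi_cov, hi_acc = selective_accuracy(decision_values, labels, threshold=1.0)

print(f"all predictions: coverage={all_cov:.2f}, accuracy={all_acc:.2f}")
print(f"high confidence: coverage={hi_cov:.2f}, accuracy={hi_acc:.2f}")
```

Raising the threshold trades coverage for accuracy: fewer scans receive a prediction, but those that do sit further from the decision boundary. In practice the threshold would need to be chosen on data separate from the data used to report the stratified accuracy.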