Machine Learning-Derived Cardiovascular Aging Phenotypes From Cardiac Function and Stroke Risk in the UK Biobank: Cohort Study.
Authors
Affiliations (3)
- Department of Neurology, Affiliated Jinling Hospital, Medical School of Nanjing University, 305 Zhongshan East Road, Xuanwu District, Nanjing, Jiangsu Province, 210002, China, 86 2584801861, 86 2584805169.
- Department of Neurology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
- Department of Neurology, Centre for Leading Medicine and Advanced Technologies of IHM, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.
Abstract
Cardiovascular magnetic resonance (CMR) is widely used across cardiac conditions and provides a systematic assessment of cardiac anatomical structure and functional dynamics. Machine learning (ML) can predict outcomes accurately and uncover the inherent features of clinical data. This study aimed to derive CMR phenotypes related to cardiovascular aging, investigate the relationship between these phenotypes and stroke risk, and relearn these phenotypes using supervised ML. We enrolled 36,467 participants without stroke from the UK Biobank and extracted their CMR parameters, with follow-up data extending until September 30, 2023. Using the generative topographic mapping technique, we identified latent grid nodes among participants and then derived phenotypes through agglomerative hierarchical clustering. We used supervised ML models to predict cardiac function phenotypes and Cox proportional hazards models to assess the association between these phenotypes and long-term stroke risk. The mean age of the participants was 54.9 (SD 7.5) years, and 17,442 (47.8%) were male. During a mean follow-up time of 14.7 (SD 1.1) years, 500 (1.4%) participants developed stroke and 664 (1.8%) participants died. After generative topographic mapping modeling, we identified 2 distinct phenotypes: phenotype 1, characterized by adverse cardiac function and an accumulation of cardiovascular risk factors, reflecting cardiovascular aging; and phenotype 2, which was associated with a lower risk of stroke relative to phenotype 1 (hazard ratio 0.695, 95% CI 0.559-0.864; P=.001), an association that remained significant after accounting for competing mortality (hazard ratio 0.578, 95% CI 0.484-0.691; P<.001).
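As a reading aid, the reported hazard ratios can be checked for internal consistency. The sketch below assumes the 95% CIs are Wald-type intervals on the log hazard ratio scale (an assumption, not stated in the abstract) and back-calculates the standard error, z statistic, and two-sided P value. The function name `wald_summary` is hypothetical and is not part of the study's analysis code.

```python
import math

def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE(log HR), z, and a two-sided P value from a
    reported hazard ratio and its 95% CI (assumes a Wald-type interval)."""
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as reported in the abstract for phenotype 2.
primary = wald_summary(0.695, 0.559, 0.864)    # Cox model
competing = wald_summary(0.578, 0.484, 0.691)  # competing-mortality model

for label, (se, z, p) in (("Cox", primary), ("Competing risk", competing)):
    print(f"{label}: SE(log HR)={se:.3f}, z={z:.2f}, P={p:.2g}")
```

Under that assumption, the first estimate gives a P value of roughly .001 and the second a value well below .001, which agrees with the figures reported.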
We selected the random forest model as the optimal model for predicting the phenotypes; it showed high discrimination (area under the curve 0.914, 95% CI 0.911-0.918 for training and 0.867, 95% CI 0.858-0.876 for validation) and good calibration (Brier score 0.111, 95% CI 0.109-0.113 for training and 0.132, 95% CI 0.127-0.137 for validation). By integrating unsupervised and supervised ML methods, we identified cardiovascular aging-related phenotypes that demonstrate robust predictive ability for incident stroke and may help improve preventive and therapeutic strategies for high-risk populations.
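For readers less familiar with the two metrics reported for the random forest model, the following minimal sketch computes the area under the ROC curve and the Brier score from scratch. The labels and predicted probabilities are made-up toy values, not study data, and the function names are illustrative only.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.
    Lower is better; 0 means perfect probabilistic predictions."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def roc_auc(y_true, y_prob):
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = phenotype 1, 0 = phenotype 2 (hypothetical predictions).
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"AUC={auc:.3f}, Brier={brier:.3f}")
```

The AUC summarizes how well the model ranks cases (discrimination), whereas the Brier score also penalizes predicted probabilities that are poorly calibrated, which is why the two are usually reported together.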