Multi-modal Risk Stratification in Heart Failure with Preserved Ejection Fraction Using Clinical and CMR-derived Features: An Approach Incorporating Model Explainability.
Authors
Affiliations (4)
- Department of General Practice, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China (S.Z.).
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, Jiangsu, China (Y.L., D.H., Y.P., T.G., H.G., J.Z.).
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, Jiangsu, China (Y.L., D.H., Y.P., T.G., H.G., J.Z.); Department of Radiology, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China (T.G.).
- School of Medical Imaging, Xuzhou Medical University, Xuzhou, Jiangsu, China (Y.L., D.H., Y.P., T.G., H.G., J.Z.); PET Center, Yale University, New Haven (J.Z.). Electronic address: [email protected].
Abstract
Heart failure with preserved ejection fraction (HFpEF) poses significant diagnostic and prognostic challenges due to its clinical heterogeneity. This study proposes a multi-modal, explainable machine learning framework that integrates clinical variables and cardiac magnetic resonance (CMR)-derived features, particularly epicardial adipose tissue (EAT) volume, to improve risk stratification and outcome prediction in patients with HFpEF. A retrospective cohort of 301 participants (171 in the HFpEF group and 130 in the control group) was analyzed. Baseline characteristics, CMR-derived EAT volume, and laboratory biomarkers were integrated into machine learning models. Model performance was evaluated using accuracy, precision, recall, and F1-score. In addition, the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (PR-AUC) were used to assess discriminative power across decision thresholds. Hyperparameter optimization and ensemble techniques were applied to enhance predictive performance. Compared with controls, HFpEF patients exhibited significantly higher EAT volume (70.9 ± 27.3 vs. 41.9 ± 18.3 mL, p<0.001) and NT-proBNP levels (1574 [963, 2722] vs. 33 [10, 100] pg/mL, p<0.001), along with a greater prevalence of comorbidities. The voting classifier achieved the highest accuracy for HFpEF diagnosis (0.94), with a precision of 0.96, a recall of 0.94, and an F1-score of 0.95. For prognostic tasks, AdaBoost, XGBoost, and Random Forest performed best in predicting adverse clinical outcomes, including rehospitalization and all-cause mortality (accuracy: 0.95). Key predictive features included EAT volume, right atrioventricular groove (Right AVG), tricuspid regurgitation velocity (TRV), and metabolic syndrome. Explainable models combining clinical and CMR-derived features, especially EAT volume, improve support for HFpEF diagnosis and outcome prediction.
These findings highlight the value of a data-driven, interpretable approach to characterizing HFpEF phenotypes and may facilitate individualized risk assessment in selected populations.
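As an illustration of the evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, F1-score, and ROC-AUC for a binary classifier in plain Python. It is not the authors' code: the function names (confusion_counts, classification_metrics, f1_score, roc_auc) are hypothetical, the positive class is assumed to be labeled 1, and ROC-AUC is computed with the pairwise rank definition rather than any particular library routine.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, assuming 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 at a single decision threshold."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Threshold-free ROC-AUC: the fraction of (positive, negative) pairs in
    which the positive case receives the higher score; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and scores for demonstration only; these are NOT study data.
    y_true = [1, 1, 1, 0, 0, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.6, 0.1]
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

As a consistency check on the reported diagnostic results, f1_score(0.96, 0.94) evaluates to roughly 0.95, which agrees with the F1-score stated for the voting classifier. Accuracy, precision, recall, and F1 depend on a single decision threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is presumably why the abstract reports both kinds of measure.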