Assessing the robustness of an artificial intelligence segmentation model for quantitative cardiovascular magnetic resonance imaging across cardiac phenotypes.
Authors
Affiliations (10)
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany.
- Working Group on CMR, Experimental and Clinical Research Center, Max Delbrück Center for Molecular Medicine in the Helmholtz Association and Charité - Universitätsmedizin Berlin, Berlin, Germany.
- Department of Cardiology and Nephrology, Helios Hospital Berlin-Buch, Berlin, Germany.
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany.
- Siemens Healthcare GmbH, Hamburg, Germany.
- Research & Clinical Translation, Magnetic Resonance, Siemens Healthineers AG, Erlangen, Germany.
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany.
- Working Group on CMR, Experimental and Clinical Research Center, Max Delbrück Center for Molecular Medicine in the Helmholtz Association and Charité - Universitätsmedizin Berlin, Berlin, Germany.
- Department of Cardiology and Nephrology, Helios Hospital Berlin-Buch, Berlin, Germany.
- DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany.
Abstract
To introduce an artificial intelligence-based cardiovascular magnetic resonance segmentation algorithm (Nick) for automated quantification of function and parametric mapping across cardiac phenotypes reflecting clinical routine. Nick was compared to manual gold standard (GS) segmentations in 359 multi-centre cases at 1.5 T and 3 T, consisting of 104 healthy individuals and 255 diseased patients with various cardiac phenotypes. Left and right ventricular (LV, RV) volumes and LV mass (LVM) were derived from short-axis segmentations. For parametric mapping, the LV myocardium was segmented to quantify T1 and T2 relaxation times. Statistical analysis comprised mean differences, correlation coefficients (R²), Bland-Altman analysis, tolerance range assessments, and paired boxplots. The number of slices and contours requiring manual correction was estimated based on slice-level differences. Nick demonstrated high agreement with the GS for LV and RV volume estimations (R²≥0.93) and LVM quantification (R²=0.86). For the ejection fractions, correlations were slightly lower (R²=0.85/0.72 for LV/RV) with small mean differences (+1.14%/-2.48% for LV/RV). T1 and T2 mapping values showed excellent agreement with manual reference values (R²≥0.92) and minimal biases (-1.64/0.14 ms for T1/T2). Nick underestimated LV volumes at end-diastole (-4.48 ml) and end-systole (-3.28 ml) as well as the RV end-diastolic volume (-5.14 ml) and RV stroke volume (-6.75 ml). Nonetheless, tolerance testing for mean deviations revealed clinically acceptable biases for all comparisons, and fewer than two slices per case required correction on average. Comparison to expert segmentations revealed robust performance of Nick in routine clinical cases with variable pathology, supporting its future integration into clinical workflows.
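The agreement measures named in the abstract (mean difference, R², Bland-Altman analysis) and the ejection fraction derived from ventricular volumes can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names and the sample volumes are hypothetical. The sketch also rests on three assumptions that the abstract does not state: differences are taken as automated minus manual, the limits of agreement use the conventional 1.96 standard deviations, and R² is the squared Pearson correlation.

```python
from statistics import mean, stdev


def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent: stroke volume (EDV - ESV) over EDV."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml


def bland_altman(automated, manual):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumption: difference = automated - manual, so a negative bias
    means the automated method underestimates the manual reference.
    Assumption: limits of agreement are bias +/- 1.96 * SD of differences.
    """
    diffs = [a - m for a, m in zip(automated, manual)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def r_squared(automated, manual):
    """Squared Pearson correlation between paired measurements (assumed definition)."""
    mean_a, mean_m = mean(automated), mean(manual)
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(automated, manual))
    var_a = sum((a - mean_a) ** 2 for a in automated)
    var_m = sum((m - mean_m) ** 2 for m in manual)
    return cov * cov / (var_a * var_m)


# Hypothetical end-diastolic volumes in ml, for illustration only.
manual_edv = [150.0, 120.0, 180.0, 200.0]
automated_edv = [146.0, 115.0, 176.0, 195.0]

bias, lower, upper = bland_altman(automated_edv, manual_edv)
print(f"bias = {bias:.2f} ml, limits of agreement = [{lower:.2f}, {upper:.2f}] ml")
print(f"R^2 = {r_squared(automated_edv, manual_edv):.3f}")
print(f"EF = {ejection_fraction(150.0, 60.0):.1f} %")
```

With this sign convention, a negative bias corresponds to the underestimation reported for the LV and RV volumes. A tolerance assessment would then compare the bias against a predefined clinically acceptable range, which the abstract does not specify.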