A comparison of vendor artificial intelligence solutions for automated post-processing of short-axis cine images in cardiovascular magnetic resonance imaging.
Authors
Affiliations (9)
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität Zu Berlin, ECRC Experimental and Clinical Research Center, Berlin, Germany. [email protected].
- Working Group On CMR, Experimental and Clinical Research Center, a cooperation between the Max Delbrück Center for Molecular Medicine in the Helmholtz Association and the Charité - Universitätsmedizin Berlin, Lindenberger Weg 80 13125, Berlin, Germany. [email protected].
- DZHK (German Centre for Cardiovascular Research), Berlin, Germany. [email protected].
- Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität Zu Berlin, ECRC Experimental and Clinical Research Center, Berlin, Germany.
- Working Group On CMR, Experimental and Clinical Research Center, a cooperation between the Max Delbrück Center for Molecular Medicine in the Helmholtz Association and the Charité - Universitätsmedizin Berlin, Lindenberger Weg 80 13125, Berlin, Germany.
- DZHK (German Centre for Cardiovascular Research), Berlin, Germany.
- Department of Cardiology and Nephrology, Helios Hospital Berlin-Buch, Berlin, Germany.
- Digital Technology and Innovation, Siemens Healthineers AG, Erlangen, Germany.
- Research & Clinical Translation, Magnetic Resonance, Siemens Healthineers AG, Erlangen, Germany.
Abstract
Automated segmentation of cardiac magnetic resonance (CMR) images is integrated into clinical workflows, yet the comparative performance of vendor AI solutions remains insufficiently characterized. This study assessed three models (two commercial, one research) for short-axis cine segmentation in a diverse cohort of 346 cases, including dilated cardiomyopathy (DCM), left ventricular hypertrophy (LVH), healthy volunteers, and other cardiac diseases. Agreement between AI-derived and expert-derived clinical parameters (ventricular volumes and left ventricular mass, LVM) was evaluated using correlations and mean differences, segmentation agreement was quantified with the Dice coefficient, and slice detection was characterized with false positive and false negative rates (FPR/FNR). Papillary muscle (PM) inclusion was examined in subgroup analyses. AI-derived clinical parameters agreed strongly with expert measurements (r > 0.8). Nevertheless, biases differed between models, including in ventricular volume estimates. Midventricular segmentation was reliable (Dice > 80%), whereas apical segmentation was poor (Dice < 65%), although with minor impact on area (< 1 cm²). Basal slice detection varied substantially, with AI1 and AI2 over-detecting and AI3 under-detecting slices (e.g. RV FPR: AI1 24%, AI2 14%; AI3 FNR: 32%), producing large area differences. Owing to PM exclusion, AI2 overestimated volumes and underestimated LVM, particularly in LVH cases. Although AI-expert agreement is high, the AI solutions are not interchangeable and differ from experts to a clinically relevant degree across cardiac regions and disease groups.
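The agreement metrics named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the function names (`dice`, `slice_detection_rates`, `mean_difference`) are hypothetical, masks are assumed to be binary 2D arrays with one mask per short-axis slice, a slice counts as "detected" when its mask contains at least one labelled pixel, and FPR and FNR are assumed to follow the standard definitions FP / (FP + TN) and FN / (FN + TP). The abstract does not state how the authors defined these rates.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice coefficient between two binary masks (1.0 if both are empty)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total


def slice_detection_rates(ai_masks, expert_masks):
    """Per-case slice detection FPR and FNR, with the expert as reference.

    Assumption: a slice is 'detected' if its mask has any labelled pixel.
    """
    ai = np.array([np.any(m) for m in ai_masks])
    ex = np.array([np.any(m) for m in expert_masks])
    tp = np.sum(ai & ex)
    tn = np.sum(~ai & ~ex)
    fp = np.sum(ai & ~ex)   # AI segments a slice the expert left empty
    fn = np.sum(~ai & ex)   # AI misses a slice the expert segmented
    fpr = fp / (fp + tn) if (fp + tn) > 0 else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) > 0 else float("nan")
    return float(fpr), float(fnr)


def mean_difference(ai_values, expert_values):
    """Mean AI-minus-expert difference (bias) for a clinical parameter."""
    ai = np.asarray(ai_values, dtype=float)
    ex = np.asarray(expert_values, dtype=float)
    return float(np.mean(ai - ex))


# Toy example: four 2x2 slices, values chosen only for illustration.
empty = np.zeros((2, 2), dtype=int)
full = np.ones((2, 2), dtype=int)
expert_stack = [full, full, empty, empty]
ai_stack = [full, empty, full, empty]

example_dice = dice([[1, 1], [0, 0]], [[1, 0], [0, 0]])
example_fpr, example_fnr = slice_detection_rates(ai_stack, expert_stack)
example_bias = mean_difference([150.0, 120.0], [140.0, 126.0])
```

Under these assumptions, an over-detecting model (extra basal slices) raises the FPR, while an under-detecting one raises the FNR; a positive mean difference indicates that the AI-derived value exceeds the expert value on average.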