Etiologic classification of suspected MINOCA using cardiovascular magnetic resonance reports: a comparison of a large language model and human readers.

March 23, 2026

Authors

Károlyi M, Wilzeck VC, Tramèr L, Groenhoff L, von Spiczak J, Bigvava T, Alkadhi H, Manka R

Affiliations (6)

  • Diagnostic and Interventional Radiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
  • Department of Cardiology, University Heart Center, University Hospital Zurich, University of Zurich, Raemistrasse 100, Zurich, 8091, Switzerland.
  • Institute for Biomedical Engineering, University and ETH Zurich, Zurich, Switzerland.
  • Diagnostic and Interventional Radiology, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
  • Department of Cardiology, University Heart Center, University Hospital Zurich, University of Zurich, Raemistrasse 100, Zurich, 8091, Switzerland.
  • Institute for Biomedical Engineering, University and ETH Zurich, Zurich, Switzerland.

Abstract

The study explored the feasibility of using a large language model (LLM) for etiologic classification in patients with suspected myocardial infarction with non-obstructive coronary arteries (MINOCA) based on cardiovascular magnetic resonance (CMR) reports. We included 156 patients with MINOCA, pooled from a prospective cohort (n = 50) and a retrospective cohort (n = 106). The LLM and three human readers with different experience levels independently classified CMR reports into eight predefined diagnostic categories, using the final expert CMR diagnosis as the reference standard. Performance and agreement were assessed using standard multiclass classification metrics. In the pooled cohort, the LLM achieved an exact-match diagnostic accuracy of 67.3% (95% CI 59.6-74.2%), lower than the expert reader (80.1%, 95% CI 73.2-85.6%) but comparable to the intermediate and junior readers (both 68.6%, 95% CI 60.9-75.4%). Diagnosis-specific analysis showed consistently high specificity for the LLM (mean 94.9%), with sensitivities up to 86.0% and F1-scores up to 85.1% for common etiologies. Agreement between the LLM and the reference standard was substantial (ICC 0.84, 95% CI 0.79-0.89) and comparable to that of the experienced readers, whereas agreement for the junior reader was markedly lower, indicating greater diagnostic variability despite similar accuracy. An LLM shows promise as a supportive tool for etiologic classification in suspected MINOCA from CMR reports, with performance comparable to less experienced readers and high agreement with the final expert CMR diagnosis. Integration into structured reporting workflows may enhance diagnostic consistency in routine clinical practice.
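The accuracy figures above are proportions of 156 cases with 95% confidence intervals. The abstract does not state how the intervals were computed, and it does not give the raw counts. The sketch below is therefore an illustration under two assumptions: that the counts of correct classifications were 105, 125, and 107 (back-calculated from 67.3%, 80.1%, and 68.6% of 156), and that a Wilson score interval is a reasonable choice for a binomial proportion. The function name `wilson_interval` and the `assumed_correct` dictionary are hypothetical, not taken from the paper.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed counts of exact-match diagnoses out of 156 reports.
# These are inferred from the reported percentages, not stated in the abstract.
N_CASES = 156
assumed_correct = {
    "LLM": 105,
    "expert reader": 125,
    "intermediate / junior reader": 107,
}

for reader, correct in assumed_correct.items():
    low, high = wilson_interval(correct, N_CASES)
    print(f"{reader}: {correct / N_CASES:.1%} (95% CI {low:.1%}-{high:.1%})")
```

Under these assumptions the Wilson interval agrees with the reported bounds to one decimal place, which suggests, but does not confirm, that a score-type interval was used rather than the simpler normal approximation.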

Topics

Journal Article