CMR-LLaMA: a Finetuned Large Language Model for Identifying Findings and Associated Attributes in CMR Reports.

November 13, 2025

Authors

Fang MZ,Nakashima M,Singh K,Galvani E,Sun X,Sorathia S,Dorocak K,Kwon D,Nguyen C,Chen D

Affiliations (4)

  • Cleveland Clinic Lerner College of Medicine at Case Western, Cleveland, Ohio, United States.
  • Heart Vascular and Thoracic Institute, Cleveland Clinic, Ohio, United States; Cardiovascular Innovation Research Center, Cleveland Clinic, Ohio, United States.
  • Heart Vascular and Thoracic Institute, Cleveland Clinic, Ohio, United States.
  • Cardiovascular Innovation Research Center, Cleveland Clinic, Ohio, United States; Diagnostics Institute, Cleveland Clinic, Ohio, United States. Electronic address: [email protected].

Abstract

Cardiac magnetic resonance imaging (CMR) studies contain a wealth of information on a patient's cardiovascular status. The ability to extract these data from free-text reports could help automate clinical decision support tools and generate data for retrospective clinical knowledge discovery and clinical operations. Few studies have examined automatic data extraction from free-text CMR reports, and those that exist have key limitations, including small sample sizes and disease-specific data extraction. Existing studies also fail to extract features of cardiovascular conditions that reflect nuances in natural language, such as the uncertainty, severity, subtype, and anatomical location of the condition. The goal of this study was to build a named entity recognition model that automatically extracts a broad variety of common CMR findings and their associated attributes from CMR reports. We fine-tuned a Large Language Model Meta AI (LLaMA) model to identify 34 cardiovascular conditions and their associated attributes, including certainty, severity, location, and subtype of the condition. The model was trained on 1,778 MRI reports and tested on 397 reports in a held-out test set and on 428 reports from another site in our hospital system with an independent radiology practice and scanners. Our model showed robust performance in predicting the mention of the 31 cardiovascular conditions (average F1 = 0.85). It also showed strong performance in predicting attributes including certainty (average F1 = 0.97) and severity (average F1 = 0.97). Model performance on the external validation set was generally slightly lower than on the internal validation set but remained strong (average F1 = 0.78 for mention, 0.97 for certainty, and 0.96 for severity). CMR-LLaMA has strong performance in identifying a variety of concept mentions and moderate accuracy in extracting a selection of other associated attributes. 
NLP models can be used to automate the extraction of data from CMR reports to potentially assist with clinical and research workflow.
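
For readers unfamiliar with the metric, the following is a minimal sketch of how a per-condition F1 score and its average across conditions might be computed for mention detection. It is an illustration only, not the authors' evaluation code: the abstract says "average F1" without naming the averaging scheme, so the unweighted (macro) mean used here is an assumption, and the function names, condition labels, and toy reports are all hypothetical.

```python
from typing import Dict, List, Set


def f1_per_label(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Dict[str, float]:
    """Report-level F1 for each condition label.

    gold[i] and pred[i] are the sets of conditions annotated and
    predicted for report i (hypothetical data layout).
    """
    scores: Dict[str, float] = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 here when the label never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of per-label F1 (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


# Toy example with two made-up conditions and three made-up reports.
labels = ["myocarditis", "pericardial effusion"]
gold = [{"myocarditis", "pericardial effusion"}, {"myocarditis"}, set()]
pred = [{"myocarditis"}, {"myocarditis", "pericardial effusion"}, set()]

scores = f1_per_label(gold, pred, labels)
average = macro_f1(scores)
print(scores, average)
```

An unweighted mean treats rare and common conditions equally; a support-weighted mean would give different numbers, which is why the averaging scheme matters when comparing reported results.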

Topics

Journal Article
