Role of Large Language Models for Suggesting Nerve Involvement in Upper Limbs MRI Reports with Muscle Denervation Signs.
Authors
Affiliations (6)
- MRI unit, Radiology department, HT medica, Carmelo Torres n°2, 23007, Jaén, Spain.
- NLP unit, HT medica, Carmelo Torres n°2, 23007, Jaén, Spain.
- MRI unit, Radiology department, HT medica, Carmelo Torres n°2, 23007, Jaén, Spain.
- Department of Nuclear Medicine, Virgen de las Nieves University Hospital, Av. de las Fuerzas Armadas, 2, 18014, Granada, Spain.
- IBS Granada Bio-Health Research Institute, Av. de Madrid, 15, 18012, Granada, Spain.
- Department of Signal Theory, Networking and Communications, University of Granada, Avenida de Fuente Nueva, s/n, 18071, Granada, Spain.
Abstract
Determining which peripheral nerves (PNs) of the upper limb are involved when signs of muscle denervation are present can be challenging. This study aims to develop, compare, and validate several large language models (LLMs) that automatically identify potential relationships between denervated muscles and their corresponding PNs. We retrospectively collected 300 Spanish-language MRI reports from upper limb examinations conducted between 2018 and 2024 that showed signs of muscle denervation. An expert radiologist manually annotated each report with the affected peripheral nerves (median, ulnar, radial, axillary, and suprascapular). BERT, DistilBERT, mBART, RoBERTa, and Medical-ELECTRA models were fine-tuned and evaluated on the reports. Additionally, an automatic voting system was implemented to consolidate the models' predictions through majority voting. The voting system achieved the highest F1 scores for the median, ulnar, and radial nerves (0.88, 1.00, and 0.90, respectively). Medical-ELECTRA also performed well, achieving F1 scores above 0.82 for the axillary and suprascapular nerves. In contrast, mBART showed lower performance, most notably an F1 score of 0.38 for the median nerve. Our voting system generally outperforms the individually tested LLMs in determining the specific PN likely associated with muscle denervation patterns described in upper limb MRI reports. The system can thereby assist radiologists by suggesting the implicated PN as they write their radiology reports.
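The majority-voting step and the F1 metric mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the task was framed or how ties were resolved, so the sketch assumes one binary label per nerve for each report and a strict majority rule. The function names `majority_vote` and `f1_score` and all example predictions are hypothetical and are not taken from the study.

```python
from typing import Dict, List

# The five nerves annotated in the study.
NERVES = ["median", "ulnar", "radial", "axillary", "suprascapular"]


def majority_vote(model_predictions: Dict[str, Dict[str, int]]) -> Dict[str, int]:
    """Combine per-nerve binary predictions (0/1) from several models.

    Assumption: a nerve is flagged only when strictly more than half of the
    models flag it; a tie (possible with an even number of models) yields 0.
    """
    n_models = len(model_predictions)
    voted = {}
    for nerve in NERVES:
        votes = sum(pred[nerve] for pred in model_predictions.values())
        voted[nerve] = 1 if 2 * votes > n_models else 0
    return voted


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for one nerve across a set of reports."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up predictions for a single report (illustration only, not study data).
example = {
    "BERT":            {"median": 1, "ulnar": 0, "radial": 0, "axillary": 0, "suprascapular": 0},
    "DistilBERT":      {"median": 1, "ulnar": 0, "radial": 1, "axillary": 0, "suprascapular": 0},
    "mBART":           {"median": 0, "ulnar": 1, "radial": 0, "axillary": 0, "suprascapular": 0},
    "RoBERTa":         {"median": 1, "ulnar": 0, "radial": 0, "axillary": 0, "suprascapular": 0},
    "Medical-ELECTRA": {"median": 1, "ulnar": 0, "radial": 1, "axillary": 0, "suprascapular": 0},
}

consensus = majority_vote(example)
print(consensus)
```

In this made-up case four of the five models flag the median nerve, so the consensus flags it, while the radial nerve (two of five votes) is not flagged. With five models a strict majority never ties, which is one plausible reason to use an odd number of voters.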