Large language models and large multimodal models in radiology: opportunities, challenges, and the path toward sustainable long-term clinical integration.
Authors
Affiliations (3)
- Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Department of Geomatics Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada.
- Department of Medical Imaging, Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
Abstract
Large language models (LLMs), built on the transformer architecture, have emerged as a fundamental tool in natural language processing and contextual reasoning, and have been extended to multimodal data interpretation in the form of large multimodal models (LMMs). Radiology, having undergone full digital transformation over the past two decades, is uniquely positioned at the forefront of medicine to benefit from this technological advancement. The integration of LLMs and LMMs into radiology workflows has demonstrated promise in improving reporting efficiency, decision support, and clinical communication. However, real-world adoption remains limited due to concerns about reliability, hallucinations, drift, workflow disruption, medicolegal uncertainty, and the absence of standardized integration with existing clinical systems and infrastructure. This review highlights the need for transparent model behavior, standardized software integration tools, and high-quality, radiologist-curated local datasets. While early studies suggest that LLMs and LMMs may reduce cognitive load and enhance reporting efficiency, their clinical value depends on alignment with radiologists' needs, well-planned deployment, and rigorous evaluation and maintenance. We conclude with suggestions for the effective integration of LLMs and LMMs into modern, constantly evolving radiology practices.