Rapid review: Growing usage of Multimodal Large Language Models in healthcare.

Authors

Gupta P, Zhang Z, Song M, Michalowski M, Hu X, Stiglic G, Topaz M

Affiliations (6)

  • Columbia University, School of Nursing, NY, United States.
  • Columbia University, School of Nursing, NY, United States; Columbia University, Data Science Institute, NY, United States.
  • University of Minnesota, School of Nursing, Minneapolis, United States.
  • Center for Data Science, Emory University, Atlanta, United States; Nell Hodgson Woodruff School of Nursing, Emory University, Atlanta, United States; Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, United States; Department of Computer Science, College of Arts and Sciences, Emory University, Atlanta, United States; Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, United States.
  • Faculty of Health Sciences, University of Maribor, Maribor, Slovenia; Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia; Usher Institute, University of Edinburgh, Edinburgh, United Kingdom.
  • Columbia University, School of Nursing, NY, United States; Columbia University, Data Science Institute, NY, United States; VNS Health, NY, United States.

Abstract

Recent advancements in large language models (LLMs) have led to multimodal LLMs (MLLMs), which integrate multiple data modalities beyond text. Although MLLMs show promise, there is a gap in the literature empirically demonstrating their impact in healthcare. This paper summarizes the applications of MLLMs in healthcare, highlighting their potential to transform health practices. A rapid literature review was conducted in August 2024 using World Health Organization (WHO) rapid-review methodology and PRISMA standards, with searches across four databases (Scopus, Medline, PubMed and ACM Digital Library) and top-tier conferences, including NeurIPS, ICML, AAAI, MICCAI, CVPR, ACL and EMNLP. Articles on healthcare applications of MLLMs were selected for analysis based on inclusion and exclusion criteria. The search yielded 115 articles, of which 39 were included in the final analysis. Of these, 77% appeared online (as preprints or published articles) in 2024, reflecting the recent emergence of MLLMs. Eighty percent of studies were from Asia and North America (mainly China and the US), with Europe lagging. Studies were split evenly between evaluations of pre-built MLLMs (60% of which focused on GPT versions) and development of custom MLLMs or frameworks with task-specific customizations. About 81% of studies examined MLLMs for diagnosis and reporting in radiology, pathology, and ophthalmology, with additional applications in education, surgery, and mental health. Prompting strategies, used in 80% of studies, improved performance in nearly half. However, evaluation practices were inconsistent, with 67% of studies reporting accuracy. Error analysis was mostly anecdotal, with only 18% of studies categorizing failure types. Only 13% validated explainability through clinician feedback. Clinical deployment was demonstrated in just 3% of studies, and workflow integration, governance, and safety were rarely addressed. MLLMs offer substantial potential for healthcare transformation through multimodal data integration. Yet methodological inconsistencies, limited validation, and underdeveloped deployment strategies highlight the need for standardized evaluation metrics, structured error analysis, and human-centered design to support safe, scalable, and trustworthy clinical adoption.

Topics

Journal Article, Review
