Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data.
Authors
Affiliations (6)
- Shanghai Jiao Tong University, Shanghai, China.
- Shanghai Artificial Intelligence Laboratory, Shanghai, China.
- Shanghai Jiao Tong University, Shanghai, China.
- Shanghai Artificial Intelligence Laboratory, Shanghai, China.
- Shanghai Jiao Tong University, Shanghai, China.
- Shanghai Artificial Intelligence Laboratory, Shanghai, China.
Abstract
In this study, as a proof of concept, we aim to initiate the development of a Radiology Foundation Model, termed RadFM. We consider three perspectives: dataset construction, model design, and thorough evaluation, summarized as follows: (i) we contribute four multimodal datasets with 13M 2D images and 615K 3D scans, which, combined with a vast collection of existing datasets, form our training dataset, termed the Medical Multi-modal Dataset (MedMD); (ii) we propose an architecture that integrates text input with 2D or 3D medical scans and generates responses for diverse radiologic tasks, including diagnosis, visual question answering, report generation, and rationale diagnosis; (iii) beyond evaluation on nine existing datasets, we propose a new benchmark, RadBench, comprising three tasks that aim to assess foundation models comprehensively. We conduct both automatic and human evaluations on RadBench, where RadFM outperforms previously accessible multimodal foundation models, including GPT-4V. Additionally, we adapt RadFM to diverse public benchmarks, surpassing various existing state-of-the-art methods.