Development of a large-scale grounded vision language dataset for chest CT analysis.

October 10, 2025

DOI: 10.1038/s41597-025-05922-9 PMID: 41073455

Authors

Zhang X, Wu C, Zhao Z, Lei J, Tian W, Zhang Y, Xie W, Wang Y

Affiliations (8)

Shanghai Jiao Tong University, Shanghai, China.
Shanghai AI Laboratory, Shanghai, China.
University of Science and Technology of China, Anhui, China.
Fudan University, Shanghai, China.
Shanghai Jiao Tong University, Shanghai, China.
Shanghai AI Laboratory, Shanghai, China.
Shanghai Jiao Tong University, Shanghai, China.
Shanghai AI Laboratory, Shanghai, China.

Abstract

Developing generalist foundation models has recently attracted tremendous attention in the field of AI for medicine, and it requires open-source medical image datasets that incorporate diverse supervision signals across various imaging modalities. In this paper, we introduce RadGenome-Chest CT, a comprehensive, large-scale, region-guided 3D chest CT interpretation dataset based on CT-RATE. Specifically, we leverage the latest powerful universal segmentation model and large language models to extend the original dataset in three respects: (1) organ-level segmentation masks covering 197 categories, which provide intermediate visual clues for interpretation; (2) 665K multi-granularity grounded reports, where each sentence of the report is linked to the corresponding anatomical region of the CT volume through a segmentation mask; and (3) 1.2M grounded VQA pairs, where questions and answers are all linked to reference segmentation masks, enabling models to associate visual evidence with textual explanations. We believe that RadGenome-Chest CT can significantly advance the development of multimodal medical foundation models by enabling training to generate text based on given segmentation regions, which is unattainable with previous related datasets.
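
To make the idea of "grounded" text concrete, the following is a minimal sketch of how a report sentence or a VQA pair could be tied to a segmentation mask. Everything in it is an assumption made for illustration: the class names, the `REGION_IDS` mapping, the toy label volume, and the example sentence are hypothetical and do not reflect the actual file layout or label scheme of RadGenome-Chest CT.

```python
from dataclasses import dataclass

# Hypothetical label ids for illustration only; the dataset itself
# defines 197 categories, which are not reproduced here.
REGION_IDS = {"lung": 1, "heart": 2}


@dataclass
class GroundedSentence:
    """One report sentence plus the anatomical region it refers to."""
    text: str
    region: str


@dataclass
class GroundedVQA:
    """One question-answer pair plus the region that supports it."""
    question: str
    answer: str
    region: str


def region_mask(label_volume, region):
    """Return a binary mask with the same nested shape as label_volume."""
    rid = REGION_IDS[region]
    return [[[1 if v == rid else 0 for v in row] for row in slc]
            for slc in label_volume]


def voxel_count(mask):
    """Count the voxels set to 1 in a binary mask."""
    return sum(v for slc in mask for row in slc for v in row)


# Toy 2 x 2 x 3 label volume (slices x rows x columns), 0 = background.
label_volume = [
    [[1, 1, 0], [0, 2, 2]],
    [[1, 0, 0], [0, 0, 0]],
]

# Invented example text, not taken from the dataset.
sentence = GroundedSentence("No focal consolidation in the lung.", "lung")
qa = GroundedVQA("Is the heart enlarged?", "No.", "heart")

sentence_mask = region_mask(label_volume, sentence.region)
qa_mask = region_mask(label_volume, qa.region)
```

In this sketch a model trained on such records would receive the mask alongside the CT volume and be asked to produce the linked text. Real volumes would normally be stored as arrays rather than nested lists; nested lists are used here only to keep the example dependency-free.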

Topics

Tomography, X-Ray Computed; Thorax; Radiography, Thoracic; Journal Article; Dataset
