
A fully open AI foundation model applied to chest radiography.

Ma D, Pang J, Gotway MB, Liang J

PubMed · Jun 11, 2025
Chest radiography frequently serves as baseline imaging for most lung diseases [1]. Deep learning has great potential for automating the interpretation of chest radiography [2]. However, existing chest radiographic deep learning models are limited in diagnostic scope, generalizability, adaptability, robustness and extensibility. To overcome these limitations, we have developed Ark+, a foundation model applied to chest radiography and pretrained by cyclically accruing and reusing the knowledge from heterogeneous expert labels in numerous datasets. Ark+ excels in diagnosing thoracic diseases. It expands the diagnostic scope and addresses potential misdiagnosis. It can adapt to evolving diagnostic needs and respond to novel diseases. It can learn rare conditions from a few samples and transfer to new diagnostic settings without training. It tolerates data biases and long-tailed distributions, and it supports federated learning to preserve privacy. All code and pretrained models have been released, so that Ark+ is open for fine-tuning, local adaptation and improvement. It is extensible to several modalities. Thus, it is a foundation model for medical imaging. The exceptional capabilities of Ark+ stem from our insight: aggregating various datasets diversifies the patient populations and accrues knowledge from many experts to yield unprecedented performance while reducing annotation costs [3]. The development of Ark+ reveals that open models trained by accruing and reusing knowledge from heterogeneous expert annotations with a multitude of public (big or small) datasets can surpass the performance of proprietary models trained on large data. We hope that our findings will inspire more researchers to share code and datasets or federate privacy-preserving data to create open foundation models with diverse, global expertise and patient populations, thus accelerating open science and democratizing AI for medicine.
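
The cyclic accrual and reuse of heterogeneous expert labels described above can be pictured as a multi-dataset pretraining loop in which a shared backbone feeds one classification head per source dataset. The sketch below is only a minimal illustration of that idea; the backbone choice, head sizes, optimizer and loss are assumptions, not the released Ark+ code.

```python
# Minimal sketch of multi-dataset pretraining with one head per label space.
# Illustrative only: backbone, head sizes, optimizer and loss are assumptions,
# not the released Ark+ implementation.
import torch
import torch.nn as nn
import torchvision.models as tvm

class MultiHeadModel(nn.Module):
    def __init__(self, num_labels_per_dataset: dict[str, int]):
        super().__init__()
        backbone = tvm.resnet50(weights=None)
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()              # shared feature extractor
        self.backbone = backbone
        # One multi-label head per source dataset (heterogeneous label spaces).
        self.heads = nn.ModuleDict(
            {name: nn.Linear(feat_dim, n) for name, n in num_labels_per_dataset.items()}
        )

    def forward(self, x: torch.Tensor, dataset_name: str) -> torch.Tensor:
        return self.heads[dataset_name](self.backbone(x))

def pretrain_cyclically(model: MultiHeadModel, loaders: dict, epochs: int = 1) -> None:
    """Cycle through the datasets, reusing the shared backbone for every label space."""
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for name, loader in loaders.items():     # one cyclic pass over all datasets
            for images, labels in loader:
                loss = bce(model(images, name), labels.float())
                opt.zero_grad()
                loss.backward()
                opt.step()
```

Cycling over datasets in this way lets a single backbone accrue knowledge from every label space without forcing the heterogeneous labels into one unified schema.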

A Privacy-Preserving Federated Learning Framework for Generalizable CBCT to Synthetic CT Translation in Head and Neck

Ciro Benito Raggio, Paolo Zaffino, Maria Francesca Spadea

arXiv preprint · Jun 10, 2025
Shortened abstract: Cone-beam computed tomography (CBCT) has become a widely adopted modality for image-guided radiotherapy (IGRT). However, CBCT suffers from increased noise, limited soft-tissue contrast, and artifacts, resulting in unreliable Hounsfield unit values and hindering direct dose calculation. Synthetic CT (sCT) generation from CBCT addresses these issues, especially using deep learning (DL) methods. Existing approaches are limited by institutional heterogeneity, scanner-dependent variations, and data privacy regulations that prevent multi-center data sharing. To overcome these challenges, we propose a cross-silo horizontal federated learning (FL) approach for CBCT-to-sCT synthesis in the head and neck region, extending our FedSynthCT framework. A conditional generative adversarial network was collaboratively trained on data from three European medical centers in the public SynthRAD2025 challenge dataset. The federated model demonstrated effective generalization across centers, with mean absolute error (MAE) ranging from 64.38 ± 13.63 to 85.90 ± 7.10 HU, structural similarity index (SSIM) from 0.882 ± 0.022 to 0.922 ± 0.039, and peak signal-to-noise ratio (PSNR) from 32.86 ± 0.94 to 34.91 ± 1.04 dB. Notably, on an external validation dataset of 60 patients, comparable performance was achieved (MAE: 75.22 ± 11.81 HU, SSIM: 0.904 ± 0.034, PSNR: 33.52 ± 2.06 dB) without additional training, confirming robust generalization despite protocol and scanner differences and registration errors. These findings demonstrate the technical feasibility of FL for CBCT-to-sCT synthesis while preserving data privacy, and offer a collaborative solution for developing generalizable models across institutions without centralized data sharing or site-specific fine-tuning.
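
Cross-silo horizontal FL of the kind described here is commonly implemented by averaging locally trained weights. The sketch below shows one communication round under that assumption; it is not the FedSynthCT code, and the client interface (train_locally, num_samples) is hypothetical.

```python
# Minimal federated-averaging sketch for cross-silo training.
# Illustrative only: the client API (train_locally, num_samples) is hypothetical,
# and this is not the FedSynthCT implementation.
import copy
import torch

def federated_average(local_state_dicts, num_samples):
    """Average client weights, weighted by each client's local dataset size."""
    total = float(sum(num_samples))
    avg = copy.deepcopy(local_state_dicts[0])
    for key in avg:
        avg[key] = sum(
            sd[key].float() * (n / total)
            for sd, n in zip(local_state_dicts, num_samples)
        )
    return avg

def run_round(global_model: torch.nn.Module, clients) -> torch.nn.Module:
    """One round: broadcast the global model, train locally, aggregate."""
    states, sizes = [], []
    for client in clients:                 # e.g. the three participating centres
        local = copy.deepcopy(global_model)
        client.train_locally(local)        # hypothetical client-side training call
        states.append(local.state_dict())
        sizes.append(client.num_samples)   # hypothetical attribute
    global_model.load_state_dict(federated_average(states, sizes))
    return global_model
```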

The RSNA Lumbar Degenerative Imaging Spine Classification (LumbarDISC) Dataset

Tyler J. Richards, Adam E. Flanders, Errol Colak, Luciano M. Prevedello, Robyn L. Ball, Felipe Kitamura, John Mongan, Maryam Vazirabad, Hui-Ming Lin, Anne Kendell, Thanat Kanthawang, Salita Angkurawaranon, Emre Altinmakas, Hakan Dogan, Paulo Eduardo de Aguiar Kuriki, Arjuna Somasundaram, Christopher Ruston, Deniz Bulja, Naida Spahovic, Jennifer Sommer, Sirui Jiang, Eduardo Moreno Judice de Mattos Farina, Eduardo Caminha Nunes, Michael Brassil, Megan McNamara, Johanna Ortiz, Jacob Peoples, Vinson L. Uytana, Anthony Kam, Venkata N. S. Dola, Daniel Murphy, David Vu, Dataset Contributor Group, Dataset Annotator Group, Competition Data Notebook Group, Jason F. Talbott

arXiv preprint · Jun 10, 2025
The Radiological Society of North America (RSNA) Lumbar Degenerative Imaging Spine Classification (LumbarDISC) dataset is the largest publicly available dataset of adult MRI lumbar spine examinations annotated for degenerative changes. The dataset includes 2,697 patients with a total of 8,593 image series from 8 institutions across 6 countries and 5 continents. The dataset is available for free for non-commercial use via Kaggle and the RSNA Medical Imaging Resource of AI (MIRA). The dataset was created for the RSNA 2024 Lumbar Spine Degenerative Classification competition, in which competitors developed deep learning models to grade degenerative changes in the lumbar spine. The degree of spinal canal, subarticular recess, and neural foraminal stenosis was graded at each intervertebral disc level in the lumbar spine. The images were annotated by expert volunteer neuroradiologists and musculoskeletal radiologists from the RSNA, the American Society of Neuroradiology, and the American Society of Spine Radiology. This dataset aims to facilitate research and development in machine learning and lumbar spine imaging, ultimately leading to improved patient care and clinical efficiency.
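
For readers planning to build models on LumbarDISC, the annotations amount to one severity grade per condition per intervertebral level. The dataclass below is one hypothetical way to hold such a record; the field names and grade vocabulary are assumptions rather than the official Kaggle schema.

```python
# Hypothetical representation of a per-level LumbarDISC-style annotation record.
# Field names and grade vocabulary are assumptions, not the official schema.
from dataclasses import dataclass
from typing import Literal

Grade = Literal["normal_mild", "moderate", "severe"]
Level = Literal["L1/L2", "L2/L3", "L3/L4", "L4/L5", "L5/S1"]

@dataclass
class DiscLevelGrading:
    study_id: str
    level: Level
    spinal_canal_stenosis: Grade
    left_neural_foraminal_narrowing: Grade
    right_neural_foraminal_narrowing: Grade
    left_subarticular_stenosis: Grade
    right_subarticular_stenosis: Grade

example = DiscLevelGrading(
    study_id="case_0001",
    level="L4/L5",
    spinal_canal_stenosis="moderate",
    left_neural_foraminal_narrowing="normal_mild",
    right_neural_foraminal_narrowing="severe",
    left_subarticular_stenosis="normal_mild",
    right_subarticular_stenosis="moderate",
)
```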

HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

Shijie Wang, Yilun Zhang, Zeyu Lai, Dexing Kong

arXiv preprint · Jun 9, 2025
Multimodal large language models (MLLMs) have shown great potential in general domains but perform poorly in some specific domains due to a lack of domain-specific data, such as image-text or video-text data. In some specific domains, abundant graphic and textual data are scattered around, but they lack standardized arrangement. In the field of medical ultrasound, there are ultrasonic diagnostic books, ultrasonic clinical guidelines, ultrasonic diagnostic reports, and so on. However, these ultrasonic materials are often saved in the form of PDFs, images, etc., and cannot be directly used for the training of MLLMs. This paper proposes a novel image-text reasoning supervised fine-tuning data generation pipeline to create domain-specific quadruplets (image, question, thinking trace, and answer) from domain-specific materials. A medical ultrasound domain dataset, ReMUD, is established, containing over 45,000 reasoning and non-reasoning supervised fine-tuning Question Answering (QA) and Visual Question Answering (VQA) data. The ReMUD-7B model, fine-tuned on Qwen2.5-VL-7B-Instruct, outperforms general-domain MLLMs in the medical ultrasound field. To facilitate research, the ReMUD dataset, data generation codebase, and ReMUD-7B parameters will be released at https://github.com/ShiDaizi/ReMUD, addressing the data shortage issue in specific-domain MLLMs.
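
The generated quadruplets (image, question, thinking trace, answer) map naturally onto a JSONL supervised fine-tuning format. The snippet below is a hypothetical illustration of such a record; the exact field names and layout used by ReMUD may differ, so consult the linked repository.

```python
# Hypothetical JSONL layout for (image, question, thinking trace, answer) quadruplets.
# Field names are illustrative; the actual ReMUD format may differ.
import json

record = {
    "image": "ultrasound/liver_0001.png",   # path to the ultrasound frame
    "question": "What abnormality is visible in this liver ultrasound?",
    "thinking": "The lesion is hyperechoic with well-defined margins, "
                "consistent with a benign haemangioma rather than a cyst.",
    "answer": "Hepatic haemangioma.",
}

with open("remud_sft.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```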

Advancing respiratory disease diagnosis: A deep learning and vision transformer-based approach with a novel X-ray dataset.

Alghadhban A, Ramadan RA, Alazmi M

PubMed · Jun 9, 2025
With the increasing prevalence of respiratory diseases such as pneumonia and COVID-19, timely and accurate diagnosis is critical. This paper makes significant contributions to the field of respiratory disease classification by utilizing X-ray images and advanced machine learning techniques such as deep learning (DL) and Vision Transformers (ViT). First, the paper systematically reviews current diagnostic methodologies, analyzing recent advances in DL and ViT techniques through a comprehensive analysis of review articles published between 2017 and 2024, excluding short reviews and overviews. The review not only analyses existing knowledge but also identifies critical gaps in the field, including the lack of comprehensive and diverse datasets for training machine learning models. To address such limitations, the paper extensively evaluates DL-based models on publicly available datasets, analyzing key performance metrics such as accuracy, precision, recall, and F1-score. Our evaluations reveal that current datasets are mostly limited to narrow subsets of pulmonary diseases, which can lead to challenges including overfitting, poor generalization, and reduced applicability of advanced machine learning techniques in real-world settings; DL and ViT models, for instance, require extensive data for effective learning. The primary contribution of this paper is therefore not only the review of the most recent articles and surveys on respiratory diseases and DL models, including ViT, but also the introduction of a novel, diverse dataset comprising 7867 X-ray images from 5263 patients across three local hospitals, covering 49 distinct pulmonary diseases. The dataset is expected to enhance DL and ViT model training and improve the generalization of those models in various real-world medical imaging scenarios. By addressing the data scarcity issue, this paper paves the way for more reliable and robust disease classification, improving clinical decision-making. Additionally, the article highlights critical challenges that still need to be addressed, such as dataset bias and variations in X-ray image quality, as well as the need for further clinical validation. Furthermore, the study underscores the critical role of DL in medical diagnosis and the necessity of comprehensive, well-annotated datasets for improving model robustness and clinical reliability. Through these contributions, the paper provides a foundation for future research on respiratory disease diagnosis using AI-driven methodologies. Although the paper aims to cover all work published between 2017 and 2024, this research has some limitations: foundational work published before 2017 falls outside the review period, and the rapid development of AI may make earlier methods less relevant.
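
The reported performance metrics (accuracy, precision, recall and F1-score) can be reproduced for any classifier on these datasets with a few lines of scikit-learn; the sketch below assumes multi-class ground-truth labels and model predictions are already available, with placeholder values.

```python
# Computing the reported evaluation metrics for a multi-class chest X-ray classifier.
# y_true / y_pred are placeholders; plug in real labels and model predictions.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = [0, 2, 1, 1, 0, 2]      # ground-truth class indices (placeholder)
y_pred = [0, 2, 1, 0, 0, 2]      # model predictions (placeholder)

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```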

APTOS-2024 challenge report: Generation of synthetic 3D OCT images from fundus photographs

Bowen Liu, Weiyi Zhang, Peranut Chotcomwongse, Xiaolan Chen, Ruoyu Chen, Pawin Pakaymaskul, Niracha Arjkongharn, Nattaporn Vongsa, Xuelian Cheng, Zongyuan Ge, Kun Huang, Xiaohui Li, Yiru Duan, Zhenbang Wang, BaoYe Xie, Qiang Chen, Huazhu Fu, Michael A. Mahr, Jiaqi Qu, Wangyiyang Chen, Shiye Wang, Yubo Tan, Yongjie Li, Mingguang He, Danli Shi, Paisan Ruamviboonsuk

arXiv preprint · Jun 9, 2025
Optical Coherence Tomography (OCT) provides high-resolution, 3D, and non-invasive visualization of retinal layers in vivo, serving as a critical tool for lesion localization and disease diagnosis. However, its widespread adoption is limited by equipment costs and the need for specialized operators. In comparison, 2D color fundus photography offers faster acquisition and greater accessibility with less dependence on expensive devices. Although generative artificial intelligence has demonstrated promising results in medical image synthesis, translating 2D fundus images into 3D OCT images presents unique challenges due to inherent differences in data dimensionality and biological information between modalities. To advance generative models in the fundus-to-3D-OCT setting, the Asia Pacific Tele-Ophthalmology Society (APTOS-2024) organized a challenge titled Artificial Intelligence-based OCT Generation from Fundus Images. This paper details the challenge framework (referred to as the APTOS-2024 Challenge), including the benchmark dataset; the evaluation methodology, featuring two fidelity metrics: an image-based distance (pixel-level OCT B-scan similarity) and a video-based distance (semantic-level volumetric consistency); and an analysis of top-performing solutions. The challenge attracted 342 participating teams, with 42 preliminary submissions and 9 finalists. Leading methodologies incorporated innovations in hybrid data preprocessing or augmentation (cross-modality collaborative paradigms), pre-training on external ophthalmic imaging datasets, integration of vision foundation models, and model architecture improvements. The APTOS-2024 Challenge is the first benchmark demonstrating the feasibility of fundus-to-3D-OCT synthesis as a potential solution for improving ophthalmic care accessibility in under-resourced healthcare settings, while helping to expedite medical research and clinical applications.
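
The two fidelity metrics could be prototyped roughly as below: a pixel-level distance averaged over B-scans and a semantic distance between volume-level embeddings. This is a hedged sketch under assumed metric choices (MAE and cosine distance over pooled features), not the official challenge evaluation code.

```python
# Hedged sketch of the two kinds of fidelity metrics described for APTOS-2024:
# a pixel-level B-scan distance and a semantic, volume-level distance.
# Metric choices (MAE, cosine distance over pooled features) are assumptions,
# not the official challenge implementation.
import numpy as np

def image_based_distance(pred_volume: np.ndarray, gt_volume: np.ndarray) -> float:
    """Mean absolute error averaged over all B-scans of a (slices, H, W) volume."""
    return float(np.mean(np.abs(pred_volume.astype(np.float32) -
                                gt_volume.astype(np.float32))))

def video_based_distance(pred_feats: np.ndarray, gt_feats: np.ndarray) -> float:
    """Cosine distance between per-volume embeddings (e.g. pooled per-slice features)."""
    p = pred_feats.mean(axis=0)   # pool (slices, dim) features into one vector
    g = gt_feats.mean(axis=0)
    cos = np.dot(p, g) / (np.linalg.norm(p) * np.linalg.norm(g) + 1e-8)
    return float(1.0 - cos)
```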

Reliable Evaluation of MRI Motion Correction: Dataset and Insights

Kun Wang, Tobit Klug, Stefan Ruschke, Jan S. Kirschke, Reinhard Heckel

arXiv preprint · Jun 6, 2025
Correcting motion artifacts in MRI is important, as they can hinder accurate diagnosis. However, evaluating deep learning-based and classical motion correction methods remains fundamentally difficult due to the lack of accessible ground-truth target data. To address this challenge, we study three evaluation approaches: real-world evaluation based on reference scans, simulated motion, and reference-free evaluation, each with its merits and shortcomings. To enable evaluation with real-world motion artifacts, we release PMoC3D, a dataset consisting of unprocessed Paired Motion-Corrupted 3D brain MRI data. To advance evaluation quality, we introduce MoMRISim, a feature-space metric trained for evaluating motion reconstructions. We assess each evaluation approach and find that real-world evaluation together with MoMRISim, while not perfect, is the most reliable. Evaluation based on simulated motion systematically exaggerates algorithm performance, and reference-free evaluation overrates oversmoothed deep learning outputs.
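
A feature-space metric of the kind MoMRISim represents compares reconstructions in the embedding space of a trained network rather than in pixel space. The sketch below illustrates the general idea with a generic ImageNet backbone; it is not the MoMRISim metric, whose feature extractor is trained specifically for motion reconstructions.

```python
# Generic feature-space distance between a motion-corrected image and a reference.
# Illustrative only: MoMRISim uses its own purpose-trained feature extractor,
# not an ImageNet ResNet.
import torch
import torchvision.models as tvm

_backbone = tvm.resnet18(weights=tvm.ResNet18_Weights.DEFAULT)
_backbone.fc = torch.nn.Identity()      # keep pooled features, drop the classifier
_backbone.eval()

@torch.no_grad()
def feature_space_distance(recon: torch.Tensor, reference: torch.Tensor) -> float:
    """L2 distance between backbone features of two (1, 3, H, W) image tensors."""
    f_recon = _backbone(recon)
    f_ref = _backbone(reference)
    return torch.linalg.norm(f_recon - f_ref).item()
```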

Deep learning-enabled MRI phenotyping uncovers regional body composition heterogeneity and disease associations in two European population cohorts

Mertens, C. J., Haentze, H., Ziegelmayer, S., Kather, J. N., Truhn, D., Kim, S. H., Busch, F., Weller, D., Wiestler, B., Graf, M., Bamberg, F., Schlett, C. L., Weiss, J. B., Ringhof, S., Can, E., Schulz-Menger, J., Niendorf, T., Lammert, J., Molwitz, I., Kader, A., Hering, A., Meddeb, A., Nawabi, J., Schulze, M. B., Keil, T., Willich, S. N., Krist, L., Hadamitzky, M., Hannemann, A., Bassermann, F., Rueckert, D., Pischon, T., Hapfelmeier, A., Makowski, M. R., Bressem, K. K., Adams, L. C.

medRxiv preprint · Jun 6, 2025
Body mass index (BMI) does not account for substantial inter-individual differences in regional fat and muscle compartments, which are relevant for the prevalence of cardiometabolic and cancer conditions. We applied a validated deep learning pipeline for automated segmentation of whole-body MRI scans in 45,851 adults from the UK Biobank and German National Cohort, enabling harmonized quantification of visceral (VAT), gluteofemoral (GFAT), and abdominal subcutaneous adipose tissue (ASAT), liver fat fraction (LFF), and trunk muscle volume. Associations with clinical conditions were evaluated using compartment measures adjusted for age, sex, height, and BMI. Our analysis demonstrates that regional adiposity and muscle volume show distinct associations with cardiometabolic and cancer prevalence, and that substantial disease heterogeneity exists within BMI strata. The analytic framework and reference data presented here will support future risk stratification efforts and facilitate the integration of automated MRI phenotyping into large-scale population and clinical research.
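
The association analysis described, compartment measures adjusted for age, sex, height and BMI, corresponds to a standard covariate-adjusted regression. The sketch below shows one hypothetical way to fit such a model with statsmodels on simulated data; the variable names, outcome and effect sizes are placeholders, not cohort fields or study results.

```python
# Hypothetical covariate-adjusted association between VAT and disease prevalence.
# Column names, the simulated outcome and effect sizes are placeholders,
# not UK Biobank / NAKO fields or study findings.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "age": rng.normal(55, 10, n),
    "sex": rng.choice(["F", "M"], n),
    "height_cm": rng.normal(172, 9, n),
    "bmi": rng.normal(27, 4, n),
    "vat_litres": rng.normal(4.0, 1.5, n),
})
# Simulated binary outcome with a positive VAT effect, for illustration only.
logit_p = -6 + 0.04 * df["age"] + 0.5 * df["vat_litres"] + 0.05 * df["bmi"]
df["has_condition"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Logistic regression of disease status on VAT, adjusted for age, sex, height and BMI.
model = smf.logit(
    "has_condition ~ vat_litres + age + C(sex) + height_cm + bmi", data=df
)
print(model.fit(disp=0).summary())
```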

Prenatal detection of congenital heart defects using the deep learning-based image and video analysis: protocol for Clinical Artificial Intelligence in Fetal Echocardiography (CAIFE), an international multicentre multidisciplinary study.

Patey O, Hernandez-Cruz N, D'Alberti E, Salovic B, Noble JA, Papageorghiou AT

PubMed · Jun 5, 2025
Congenital heart defect (CHD) is a significant, rapidly emerging global problem in child health and a leading cause of neonatal and childhood death. Prenatal detection of CHDs with the help of ultrasound allows better perinatal management of such pregnancies, leading to reduced neonatal mortality, morbidity and developmental complications. However, there is wide variation in reported fetal heart problem detection rates, from 34% to 85%, with some low- and middle-income countries detecting as few as 9.3% of cases before birth. Research has shown that deep learning-based or more general artificial intelligence (AI) models can support the detection of fetal CHDs more rapidly than humans performing ultrasound scans. Progress in this AI-based research depends on the availability of large, well-curated and diverse data of ultrasound images and videos of normal and abnormal fetal hearts. Currently, CHD detection based on AI models is not accurate enough for practical clinical use, in part due to the lack of ultrasound data available for machine learning as CHDs are rare and heterogeneous, the retrospective nature of published studies, the lack of multicentre and multidisciplinary collaboration, and the utilisation of mostly still images of standard planes of the fetal heart for AI models. Our aim is to develop AI models that could support clinicians in detecting fetal CHDs in real time, particularly in nonspecialist or low-resource settings where fetal echocardiography expertise is not readily available. We have designed the Clinical Artificial Intelligence Fetal Echocardiography (CAIFE) study as an international multicentre multidisciplinary collaboration led by a clinical and an engineering team at the University of Oxford. This study involves five multicountry hospital sites for data collection (Oxford, UK (n=1), London, UK (n=3) and Southport, Australia (n=1)). We plan to curate 14 000 retrospective ultrasound scans of fetuses with normal hearts (n=13 000) and fetuses with CHDs (n=1000), as well as 2400 prospective ultrasound cardiac scans, including the proposed research-specific CAIFE 10 s video sweeps, from fetuses with normal hearts (n=2000) and fetuses diagnosed with major CHDs (n=400). This gives a total of 16 400 retrospective and prospective ultrasound scans from the participating hospital sites. We will build, train and validate computational models capable of differentiating between normal fetal hearts and those diagnosed with CHDs and of recognising specific types of CHDs. Data will be analysed using statistical metrics, namely sensitivity, specificity and accuracy, including positive and negative predictive values for each outcome, compared with manual assessment. We will disseminate the findings through regional, national and international conferences and through peer-reviewed journals. The study was approved by the Health Research Authority, Care Research Wales and the Research Ethics Committee (Ref: 23/EM/0023; IRAS Project ID: 317510) on 8 March 2023. All collaborating hospitals have obtained the local trust research and development approvals.
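
The planned statistical metrics (sensitivity, specificity and positive/negative predictive values against manual assessment) follow directly from a 2x2 confusion table, as in the short sketch below with placeholder labels.

```python
# Sensitivity, specificity, PPV and NPV of model output vs. manual assessment.
# Labels here are placeholders (1 = CHD detected, 0 = normal heart).
from sklearn.metrics import confusion_matrix

y_manual = [1, 0, 1, 1, 0, 0, 1, 0]    # expert assessment (placeholder)
y_model  = [1, 0, 1, 0, 0, 0, 1, 1]    # model prediction (placeholder)

tn, fp, fn, tp = confusion_matrix(y_manual, y_model).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"PPV={ppv:.2f} NPV={npv:.2f}")
```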

ReXVQA: A Large-scale Visual Question Answering Benchmark for Generalist Chest X-ray Understanding

Ankit Pal, Jung-Oh Lee, Xiaoman Zhang, Malaikannan Sankarasubbu, Seunghyeon Roh, Won Jung Kim, Meesun Lee, Pranav Rajpurkar

arXiv preprint · Jun 4, 2025
We present ReXVQA, the largest and most comprehensive benchmark for visual question answering (VQA) in chest radiology, comprising approximately 696,000 questions paired with 160,000 chest X-ray studies across training, validation, and test sets. Unlike prior efforts that rely heavily on template-based queries, ReXVQA introduces a diverse and clinically authentic task suite reflecting five core radiological reasoning skills: presence assessment, location analysis, negation detection, differential diagnosis, and geometric reasoning. We evaluate eight state-of-the-art multimodal large language models, including MedGemma-4B-it, Qwen2.5-VL, Janus-Pro-7B, and Eagle2-9B. The best-performing model (MedGemma) achieves 83.24% overall accuracy. To bridge the gap between AI performance and clinical expertise, we conducted a comprehensive human reader study involving 3 radiology residents on 200 randomly sampled cases. Our evaluation demonstrates that MedGemma achieved superior performance (83.84% accuracy) compared to human readers (best radiology resident: 77.27%), representing a significant milestone where AI performance exceeds expert human evaluation on chest X-ray interpretation. The reader study reveals distinct performance patterns between AI models and human experts, with strong inter-reader agreement among radiologists and more variable agreement between human readers and AI models. ReXVQA establishes a new standard for evaluating generalist radiological AI systems, offering public leaderboards, fine-grained evaluation splits, structured explanations, and category-level breakdowns. This benchmark lays the foundation for next-generation AI systems capable of mimicking expert-level clinical reasoning beyond narrow pathology classification. Our dataset will be open-sourced at https://huggingface.co/datasets/rajpurkarlab/ReXVQA
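
Once the dataset is released at the Hugging Face URL above, scoring a model's overall accuracy could look roughly like the sketch below. The split name and column names ("test", "answer") are assumptions about the eventual release, not a documented schema.

```python
# Hedged sketch: loading ReXVQA from the Hugging Face Hub and scoring accuracy.
# The split and column names ("test", "answer") are assumptions about the
# eventual release, not a documented schema.
from datasets import load_dataset

def evaluate(predict_fn, split: str = "test") -> float:
    ds = load_dataset("rajpurkarlab/ReXVQA", split=split)
    correct = 0
    for example in ds:
        prediction = predict_fn(example)          # your multimodal model call
        correct += int(prediction.strip().lower() == example["answer"].strip().lower())
    return correct / len(ds)

# Example usage with a trivial baseline that always answers "no finding":
# accuracy = evaluate(lambda ex: "no finding")
```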
