
Performance of GPT-5 in Brain Tumor MRI Reasoning

Mojtaba Safari, Shansong Wang, Mingzhe Hu, Zach Eidex, Qiang Li, Xiaofeng Yang

arXiv preprint · Aug 14, 2025
Accurate differentiation of brain tumor types on magnetic resonance imaging (MRI) is critical for guiding treatment planning in neuro-oncology. Recent advances in large language models (LLMs) have enabled visual question answering (VQA) approaches that integrate image interpretation with natural language reasoning. In this study, we evaluated GPT-4o, GPT-5-nano, GPT-5-mini, and GPT-5 on a curated brain tumor VQA benchmark derived from 3 Brain Tumor Segmentation (BraTS) datasets - glioblastoma (GLI), meningioma (MEN), and brain metastases (MET). Each case included multi-sequence MRI triplanar mosaics and structured clinical features transformed into standardized VQA items. Models were assessed in a zero-shot chain-of-thought setting for accuracy on both visual and reasoning tasks. Results showed that GPT-5-mini achieved the highest macro-average accuracy (44.19%), followed by GPT-5 (43.71%), GPT-4o (41.49%), and GPT-5-nano (35.85%). Performance varied by tumor subtype, with no single model dominating across all cohorts. These findings suggest that GPT-5 family models can achieve moderate accuracy in structured neuro-oncological VQA tasks, but not at a level acceptable for clinical use.
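Macro-averaged accuracy, the headline metric above, scores each tumor cohort separately and then averages the cohort accuracies, so each tumor type counts equally regardless of how many questions it contributes. A minimal sketch (the per-cohort counts below are illustrative, not the paper's data):

```python
import numpy as np

# Hypothetical per-cohort results (correct answers, questions asked);
# the counts are illustrative, not figures from the paper.
cohort_results = {
    "GLI": (52, 120),  # glioblastoma
    "MEN": (41, 100),  # meningioma
    "MET": (33, 90),   # brain metastases
}

# Macro-average: per-cohort accuracy first, then the mean over cohorts.
per_cohort_acc = [correct / total for correct, total in cohort_results.values()]
macro_acc = np.mean(per_cohort_acc)
print(f"macro-average accuracy: {macro_acc:.2%}")
```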

SimAQ: Mitigating Experimental Artifacts in Soft X-Ray Tomography using Simulated Acquisitions

Jacob Egebjerg, Daniel Wüstner

arXiv preprint · Aug 14, 2025
Soft X-ray tomography (SXT) provides detailed structural insight into whole cells but is hindered by experimental artifacts such as the missing wedge and by the limited availability of annotated datasets. We present SimAQ, a simulation pipeline that generates realistic cellular phantoms and applies synthetic artifacts to produce paired noisy volumes, sinograms, and reconstructions. We validate our approach by training a neural network primarily on synthetic data and demonstrate effective few-shot and zero-shot transfer learning on real SXT tomograms. Our model delivers accurate segmentations, enabling quantitative analysis of noisy tomograms without relying on large labeled datasets or complex reconstruction methods.
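The missing wedge arises because the tilt range of the stage is limited, leaving a wedge of unmeasured projection angles. A rough 2D sketch of simulating it with scikit-image (the ±60° tilt range and the Shepp-Logan stand-in are assumptions for illustration, not SimAQ's actual parameters):

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, resize

# 2D stand-in for a cellular phantom; SimAQ generates 3D cell phantoms.
phantom = resize(shepp_logan_phantom(), (256, 256))

# A limited tilt range (assumed -60..+60 degrees here) leaves a
# "missing wedge" of projection angles that were never measured.
theta_limited = np.arange(-60.0, 60.0, 1.0) + 90.0  # shift into 0..180
sinogram = radon(phantom, theta=theta_limited)

# Filtered back-projection from the incomplete sinogram reproduces the
# characteristic elongation/streaking artifacts along the wedge axis.
recon = iradon(sinogram, theta=theta_limited, filter_name="ramp")
```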

Medico 2025: Visual Question Answering for Gastrointestinal Imaging

Sushant Gautam, Vajira Thambawita, Michael Riegler, Pål Halvorsen, Steven Hicks

arXiv preprint · Aug 14, 2025
The Medico 2025 challenge addresses Visual Question Answering (VQA) for Gastrointestinal (GI) imaging, organized as part of the MediaEval task series. The challenge focuses on developing Explainable Artificial Intelligence (XAI) models that answer clinically relevant questions based on GI endoscopy images while providing interpretable justifications aligned with medical reasoning. It introduces two subtasks: (1) answering diverse types of visual questions using the Kvasir-VQA-x1 dataset, and (2) generating multimodal explanations to support clinical decision-making. The Kvasir-VQA-x1 dataset, created from 6,500 images and 159,549 complex question-answer (QA) pairs, serves as the benchmark for the challenge. By combining quantitative performance metrics and expert-reviewed explainability assessments, this task aims to advance trustworthy Artificial Intelligence (AI) in medical image analysis. Instructions, data access, and an updated guide for participation are available in the official competition repository: https://github.com/simula/MediaEval-Medico-2025
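Purely as an illustration of how a quantitative metric and expert-reviewed explainability might be folded into one leaderboard score (the weighting and field names below are hypothetical; the official scoring is defined in the competition repository linked above):

```python
from dataclasses import dataclass

@dataclass
class Submission:
    answer_accuracy: float       # exact-match accuracy on Kvasir-VQA-x1, 0..1
    explainability_score: float  # mean expert rating of explanations, 0..1

def leaderboard_score(sub: Submission, w_acc: float = 0.6) -> float:
    # Hypothetical convex combination; not the official Medico 2025 formula.
    return w_acc * sub.answer_accuracy + (1 - w_acc) * sub.explainability_score

print(leaderboard_score(Submission(answer_accuracy=0.72, explainability_score=0.65)))
```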

Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis.

Jiang Y, Lemarechal Y, Bafaro J, Abi-Rjeile J, Joubert P, Despres P, Manem V

PubMed paper · Aug 14, 2025
With the rapid development of artificial intelligence (AI), AI-assisted medical imaging analysis demonstrates remarkable performance in early lung cancer screening. However, the costly annotation process and privacy concerns limit the construction of large-scale medical datasets, hampering the further application of AI in healthcare. To address the data scarcity in lung cancer screening, we propose Lung-DDPM, a thoracic CT image synthesis approach that effectively generates high-fidelity 3D synthetic CT images, which prove helpful in downstream lung nodule segmentation tasks. Our method is based on semantic layout-guided denoising diffusion probabilistic models (DDPM), enabling anatomically reasonable, seamless, and consistent sample generation even from incomplete semantic layouts. Our results suggest that the proposed method outperforms other state-of-the-art (SOTA) generative models in image quality evaluation and downstream lung nodule segmentation tasks. Specifically, Lung-DDPM achieved superior performance on our large validation cohort, with a Fréchet inception distance (FID) of 0.0047, maximum mean discrepancy (MMD) of 0.0070, and mean squared error (MSE) of 0.0024. These results were 7.4×, 3.1×, and 29.5× better than the second-best competitors, respectively. Furthermore, the lung nodule segmentation model, trained on a dataset combining real and Lung-DDPM-generated synthetic samples, attained a Dice Coefficient (Dice) of 0.3914 and sensitivity of 0.4393. This represents 8.8% and 18.6% improvements in Dice and sensitivity compared to the model trained solely on real samples. The experimental results highlight Lung-DDPM's potential for a broader range of medical imaging applications, such as general tumor segmentation, cancer survival estimation, and risk prediction. The code and pretrained models are available at https://github.com/Manem-Lab/Lung-DDPM/.
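For reference, the two segmentation metrics quoted above are computed from binary masks as follows (a generic sketch, not the authors' evaluation code):

```python
import numpy as np

def dice_and_sensitivity(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8):
    """Dice coefficient and sensitivity (recall) for binary nodule masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()          # correctly segmented voxels
    dice = 2.0 * tp / (pred.sum() + gt.sum() + eps)
    sensitivity = tp / (gt.sum() + eps)          # fraction of nodule voxels found
    return dice, sensitivity
```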

A novel hybrid convolutional and recurrent neural network model for automatic pituitary adenoma classification using dynamic contrast-enhanced MRI.

Motamed M, Bastam M, Tabatabaie SM, Elhaie M, Shahbazi-Gahrouei D

PubMed paper · Aug 14, 2025
Pituitary adenomas, ranging from subtle microadenomas to mass-effect macroadenomas, pose diagnostic challenges for radiologists due to increasing scan volumes and the complexity of dynamic contrast-enhanced MRI interpretation. A hybrid CNN-LSTM model was trained and validated on a multi-center dataset of 2,163 samples from Tehran and Babolsar, Iran. Transfer learning and preprocessing techniques (e.g., Wiener filters) were utilized to improve classification performance for microadenomas (< 10 mm) and macroadenomas (> 10 mm). The model achieved 90.5% accuracy, an area under the receiver operating characteristic curve (AUROC) of 0.92, and 89.6% sensitivity (93.5% for microadenomas, 88.3% for macroadenomas), outperforming standard CNNs by 5-18% across metrics. With a processing time of 0.17 s per scan, the model demonstrated robustness to variations in imaging conditions, including scanner differences and contrast variations, excelling in real-time detection and differentiation of adenoma subtypes. This dual-path approach, the first to synergize spatial and temporal MRI features for pituitary diagnostics, offers high precision and efficiency. Supported by comparisons with existing models, it provides a scalable, reproducible tool to improve patient outcomes, with potential adaptability to broader neuroimaging challenges.
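The abstract describes the general hybrid pattern rather than exact layers, so the following is only a minimal PyTorch sketch of the idea: a shared CNN encodes each dynamic contrast-enhanced frame, and an LSTM aggregates the per-frame features over time before classification (all sizes are illustrative, not the paper's configuration):

```python
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    """Shared per-frame CNN -> LSTM over the DCE time axis -> class head."""

    def __init__(self, n_classes: int = 2, feat_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(                       # encodes one MRI frame
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(32 * 4 * 4, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                               # x: (B, T, 1, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)  # per-frame features
        _, (h_n, _) = self.lstm(feats)                    # temporal aggregation
        return self.head(h_n[-1])                         # class logits

logits = CNNLSTMClassifier()(torch.randn(2, 6, 1, 64, 64))  # 6 DCE time points
```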

Optimized AI-based Neural Decoding from BOLD fMRI Signal for Analyzing Visual and Semantic ROIs in the Human Visual System.

Veronese L, Moglia A, Pecco N, Della Rosa P, Scifo P, Mainardi LT, Cerveri P

PubMed paper · Aug 14, 2025
AI-based neural decoding reconstructs visual perception by leveraging generative models to map brain activity, measured through functional MRI (fMRI), into the observed visual stimulus. Traditionally, ridge linear models transform fMRI into a latent space, which is then decoded using variational autoencoders (VAE) or latent diffusion models (LDM). Owing to the complexity and noisiness of fMRI data, newer approaches split the reconstruction into two sequential stages: the first provides a rough visual approximation using a VAE, and the second incorporates semantic information through an LDM guided by contrastive language-image pre-training (CLIP) embeddings. This work addressed key scientific and technical gaps of two-stage neural decoding by: 1) implementing a gated recurrent unit (GRU)-based architecture to establish a non-linear mapping between the fMRI signal and the VAE latent space, 2) optimizing the dimensionality of the VAE latent space, 3) systematically evaluating the contribution of the first reconstruction stage, and 4) analyzing the impact of different brain regions of interest (ROIs) on reconstruction quality. Experiments on the Natural Scenes Dataset, containing 73,000 unique natural images along with fMRI from eight subjects, demonstrated that the proposed architecture maintained competitive performance while reducing the complexity of its first stage by 85%. The sensitivity analysis showed that the first reconstruction stage is essential for preserving high structural similarity in the final reconstructions. Restricting the analysis to semantic ROIs, while excluding early visual areas, diminished visual coherence but preserved semantic content. The inter-subject repeatability across ROIs was about 92% and 98% for visual and semantic metrics, respectively. This study represents a key step toward optimized neural decoding architectures leveraging non-linear models for stimulus prediction. Sensitivity analysis highlighted the interplay between the two reconstruction stages, while ROI-based analysis provided strong evidence that the two-stage AI model reflects the brain's hierarchical processing of visual information.
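The paper's first contribution, replacing the ridge mapping with a GRU, might look roughly like this PyTorch sketch (the voxel count, latent dimensionality, and the chunking of voxels into a sequence are assumptions for illustration, not the authors' configuration):

```python
import torch
import torch.nn as nn

class GRUFMRIDecoder(nn.Module):
    """Non-linear map from an fMRI pattern to a VAE latent vector.
    The voxel vector is split into fixed-size chunks and fed to the GRU
    as a sequence; chunking is an illustrative choice, not the paper's."""

    def __init__(self, n_voxels: int = 4096, chunk: int = 256,
                 hidden: int = 512, latent_dim: int = 1024):
        super().__init__()
        assert n_voxels % chunk == 0
        self.chunk = chunk
        self.gru = nn.GRU(chunk, hidden, batch_first=True)
        self.to_latent = nn.Linear(hidden, latent_dim)

    def forward(self, fmri):                            # fmri: (B, n_voxels)
        seq = fmri.view(fmri.shape[0], -1, self.chunk)  # (B, steps, chunk)
        _, h_n = self.gru(seq)                          # final hidden state
        return self.to_latent(h_n[-1])                  # predicted VAE latent

z_hat = GRUFMRIDecoder()(torch.randn(8, 4096))          # -> (8, 1024)
```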

Multimodal artificial intelligence for subepithelial lesion classification and characterization: a multicenter comparative study (with video).

Li J, Jing X, Zhang Q, Wang X, Wang L, Shan J, Zhou Z, Fan L, Gong X, Sun X, He S

PubMed paper · Aug 14, 2025
Subepithelial lesions (SELs) present significant diagnostic challenges in gastrointestinal endoscopy, particularly in differentiating malignant types, such as gastrointestinal stromal tumors (GISTs) and neuroendocrine tumors, from benign types like leiomyomas. Misdiagnosis can lead to unnecessary interventions or delayed treatment. To address this challenge, we developed ECMAI-WME, a parallel fusion deep learning model integrating white light endoscopy (WLE) and microprobe endoscopic ultrasonography (EUS), to improve SEL classification and lesion characterization. A total of 523 SELs from four hospitals were used to develop serial and parallel fusion AI models. The parallel model, which demonstrated superior performance, was designated ECMAI-WME. The model was tested on an external validation cohort (n = 88) and a multicenter test cohort (n = 274). Diagnostic performance, lesion characterization, and clinical decision-making support were comprehensively evaluated and compared with endoscopists' performance. The ECMAI-WME model significantly outperformed endoscopists in diagnostic accuracy (96.35% vs. 63.87-86.13%, p < 0.001) and treatment decision-making accuracy (96.35% vs. 78.47-86.13%, p < 0.001). It achieved 98.72% accuracy in internal validation, 94.32% in external validation, and 96.35% in multicenter testing. For distinguishing gastric GISTs from leiomyomas, the model reached 91.49% sensitivity, 100% specificity, and 96.38% accuracy. Lesion characteristics were identified with a mean accuracy of 94.81% (range: 90.51-99.27%). The model maintained robust performance despite class imbalance, as confirmed by five complementary analyses. Subgroup analyses showed consistent accuracy across lesion sizes, locations, and types (p > 0.05), demonstrating strong generalizability. The ECMAI-WME model demonstrates excellent diagnostic performance and robustness in multiclass SEL classification and characterization, supporting its potential for real-time deployment to enhance diagnostic consistency and guide clinical decision-making.
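A minimal sketch of the parallel-fusion pattern the abstract describes: one encoder branch per modality (WLE and EUS) whose features are concatenated before a joint classification head (layer sizes are illustrative, not ECMAI-WME's):

```python
import torch
import torch.nn as nn

def branch():  # small per-modality encoder; sizes are illustrative
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
    )

class ParallelFusionSEL(nn.Module):
    """WLE and EUS images pass through separate encoders in parallel;
    their features are concatenated and classified jointly."""

    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.wle_branch, self.eus_branch = branch(), branch()
        self.head = nn.Linear(32 + 32, n_classes)

    def forward(self, wle, eus):
        fused = torch.cat([self.wle_branch(wle), self.eus_branch(eus)], dim=1)
        return self.head(fused)

model = ParallelFusionSEL()
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
```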

Contrast-enhanced ultrasound radiomics model for predicting axillary lymph node metastasis and prognosis in breast cancer: a multicenter study.

Li SY, Li YM, Fang YQ, Jin ZY, Li JK, Zou XM, Huang SS, Niu RL, Fu NQ, Shao YH, Gong XT, Li MR, Wang W, Wang ZL

PubMed paper · Aug 14, 2025
To construct a multimodal ultrasound (US) radiomics model for predicting axillary lymph node metastasis (ALNM) in breast cancer and to evaluate its value in predicting ALNM and patient prognosis. From March 2014 to December 2022, data from 682 breast cancer patients from four hospitals were collected, including preoperative grayscale US, color Doppler flow imaging (CDFI), and contrast-enhanced ultrasound (CEUS) imaging data, as well as clinical information. Data from the First Medical Center of PLA General Hospital were used as the training and internal validation sets, while data from Peking University First Hospital, the Cancer Hospital of the Chinese Academy of Medical Sciences, and the Fourth Medical Center of PLA General Hospital were used as the external validation set. LASSO regression was employed to select radiomic features (RFs), while eight machine learning algorithms were used to construct radiomic models based on US, CDFI, and CEUS. The prediction efficiency for ALNM was assessed to identify the optimal model. In parallel, the Radscore was computed and integrated with immunoinflammatory markers to forecast disease-free survival (DFS) in breast cancer patients. Follow-up methods included telephone outreach and in-person hospital visits. The analysis employed Cox regression to identify prognostic factors, and clinical-imaging models were developed accordingly. Model performance was evaluated using the C-index, receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). In the training cohort (n = 400), 40% of patients had ALNM, with a mean age of 55 ± 10 years. The US + CDFI + CEUS-based radiomics model achieved areas under the curve (AUCs) of 0.88, 0.81, and 0.77 for predicting N0 versus N+ (≥ 1) in the training, internal, and external validation sets, respectively, outperforming the US-only model (P < 0.05). For distinguishing N+ (1-2) from N+ (≥ 3), the model achieved AUCs of 0.89, 0.74, and 0.75. Combining radiomics scores with clinical immunoinflammatory markers (platelet count and neutrophil-to-lymphocyte ratio) yielded a clinical-radiomics model predicting DFS, with C-indices of 0.80, 0.73, and 0.79 across the three cohorts. In the external validation cohort, the clinical-radiomics model achieved higher AUCs for predicting 2-, 3-, and 5-year DFS than the clinical model alone (2-year: 0.79 vs. 0.66; 3-year: 0.83 vs. 0.70; 5-year: 0.78 vs. 0.64; all P < 0.05). Calibration and decision curve analyses demonstrated good model agreement and clinical utility. The multimodal ultrasound radiomics model based on US, CDFI, and CEUS could effectively predict ALNM in breast cancer. Furthermore, the combined application of radiomics and immune inflammation markers might predict the DFS of breast cancer patients to some extent.
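The LASSO feature-selection step can be sketched with scikit-learn; the synthetic data and feature counts below are placeholders, not the study's radiomic features:

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 1200))              # 1200 candidate radiomic features
signal = X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.5, size=400)
y = (signal > 0).astype(float)                # ALNM label (0 = N0, 1 = N+)

# The L1 penalty drives most coefficients to exactly zero; the surviving
# features form the radiomic signature, and their weighted sum is the Radscore.
X_std = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=5).fit(X_std, y)
selected = np.flatnonzero(lasso.coef_)
radscore = X_std @ lasso.coef_ + lasso.intercept_   # linear radiomics score
print(f"{selected.size} features retained")
```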

Artificial Intelligence based fractional flow reserve.

Bednarek A, Gąsior P, Jaguszewski M, Buszman PP, Milewski K, Hawranek M, Gil R, Wojakowski W, Kochman J, Tomaniak M

PubMed paper · Aug 14, 2025
Fractional flow reserve (FFR) - a physiological indicator of coronary stenosis significance - has become a widely used parameter in guiding percutaneous coronary intervention (PCI). Several studies have shown the superiority of FFR over visual assessment alone, contributing to a reduction in adverse clinical endpoints. However, the current approach to FFR assessment requires coronary instrumentation with a dedicated pressure wire, increasing the invasiveness, cost, and duration of the procedure. Alternative, noninvasive methods of FFR assessment based on computational fluid dynamics are being widely tested; these approaches are generally not fully automated and may require substantial computational power. One of the most rapidly expanding fields in medicine today is the use of artificial intelligence (AI) in therapy optimization, diagnosis, treatment, and risk stratification. AI contributes to the development of more sophisticated methods of imaging analysis and allows clinically important parameters to be derived faster and more accurately. In recent years, AI's utility in deriving FFR noninvasively has been increasingly reported. In this review, we critically summarize current knowledge in the field of AI-derived FFR based on data from computed tomography angiography, invasive angiography, optical coherence tomography, and intravascular ultrasound. Available solutions, possible future directions in optimizing cathlab performance, including the use of mixed reality, and current limitations hindering the wide adoption of these techniques are reviewed.
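For context, invasively measured FFR is the ratio of mean distal coronary pressure to mean aortic pressure during maximal hyperemia; this is the quantity the AI methods estimate noninvasively. A trivial sketch with illustrative pressure traces:

```python
import numpy as np

def ffr(p_distal: np.ndarray, p_aortic: np.ndarray) -> float:
    """FFR = mean distal coronary pressure / mean aortic pressure,
    measured during maximal hyperemia."""
    return float(np.mean(p_distal) / np.mean(p_aortic))

# Illustrative pressure samples (mmHg), not patient data.
value = ffr(np.array([68.0, 70.0, 69.0]), np.array([92.0, 95.0, 93.0]))
significant = value <= 0.80   # commonly used cutoff for hemodynamic significance
print(f"FFR = {value:.2f}, significant: {significant}")
```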

Enhancing cardiac MRI reliability at 3 T using motion-adaptive B0 shimming.

Huang Y, Malagi AV, Li X, Guan X, Yang CC, Huang LT, Long Z, Zepeda J, Zhang X, Yoosefian G, Bi X, Gao C, Shang Y, Binesh N, Lee HL, Li D, Dharmakumar R, Han H, Yang HR

PubMed paper · Aug 14, 2025
Magnetic susceptibility differences at the heart-lung interface introduce B0-field inhomogeneities that challenge cardiac MRI at high field strengths (≥ 3 T). Although hardware-based shimming has advanced, conventional approaches often neglect dynamic variations in thoracic anatomy caused by cardiac and respiratory motion, leading to residual off-resonance artifacts. This study aims to characterize motion-induced B0-field fluctuations in the heart and evaluate a deep learning-enabled motion-adaptive B0 shimming pipeline to mitigate them. A motion-resolved B0 mapping sequence was implemented at 3 T to quantify cardiac- and respiratory-induced B0 variations. A motion-adaptive shimming framework was then developed and validated through numerical simulations and human imaging studies. B0-field homogeneity and T2* mapping accuracy were assessed in multiple breath-hold positions using standard and motion-adaptive shimming. Respiratory motion significantly altered myocardial B0 fields (p < 0.01), whereas cardiac motion had minimal impact (p = 0.49). Compared with conventional scanner shimming, motion-adaptive B0 shimming yielded significantly improved field uniformity across both inspiratory (post-shim SD ratio: 0.68 ± 0.10 vs. 0.89 ± 0.11; p < 0.05) and expiratory (0.65 ± 0.16 vs. 0.84 ± 0.20; p < 0.05) breath-hold states. Corresponding improvements in myocardial T2* map homogeneity were observed, with a reduced coefficient of variation (0.44 ± 0.19 vs. 0.39 ± 0.22; 0.59 ± 0.30 vs. 0.46 ± 0.21; both p < 0.01). The proposed motion-adaptive B0 shimming approach effectively compensates for respiration-induced B0 fluctuations, enhancing field homogeneity and reducing off-resonance artifacts. This strategy improves the robustness and reproducibility of T2* mapping, enabling more reliable high-field cardiac MRI.
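Static shimming is commonly posed as a least-squares fit of shim-coil basis field maps to the measured B0 map; a motion-adaptive scheme would redo this fit per respiratory state. A generic numpy sketch (not the authors' deep learning pipeline):

```python
import numpy as np

def shim_currents(b0_map: np.ndarray, basis_maps: np.ndarray,
                  roi: np.ndarray) -> np.ndarray:
    """Least-squares shim fit: find coil currents c minimizing
    || b0 + basis @ c || over the ROI (e.g. the myocardium)."""
    A = basis_maps[:, roi].T            # (n_voxels_in_roi, n_coils)
    b = b0_map[roi]                     # measured off-resonance (Hz)
    c, *_ = np.linalg.lstsq(A, -b, rcond=None)
    return c

# Illustrative sizes: 8 shim coils over a 64x64 field map.
rng = np.random.default_rng(1)
basis = rng.normal(size=(8, 64, 64))
b0 = rng.normal(size=(64, 64)) * 50.0
roi = np.zeros((64, 64), dtype=bool); roi[20:44, 20:44] = True
c = shim_currents(b0, basis, roi)
residual = b0[roi] + basis[:, roi].T @ c
print(f"SD ratio (post/pre): {residual.std() / b0[roi].std():.2f}")
```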