Latest Papers on Radiology AI. Tags: Image Synthesis

Synthetizing SWI from 3T to 7T by generative diffusion network for deep medullary veins visualization.

Li S, Deng X, Li Q, Zhen Z, Han L, Chen K, Zhou C, Chen F, Huang P, Zhang R, Chen H, Zhang T, Chen W, Tan T, Liu C

•papers•Sep 19 2025

Ultrahigh-field susceptibility-weighted imaging (SWI) provides excellent tissue contrast and anatomical detail of the brain. However, ultrahigh-field magnetic resonance (MR) scanners are often expensive and expose patients to uncomfortable noise. Several deep learning approaches have therefore been proposed to synthesize high-field MR images from low-field MR images; most existing methods rely on generative adversarial networks (GANs) and achieve acceptable results. However, the widely recognized difficulties in training GANs limit synthesis performance on SWI images because of their microvascular structure. Diffusion models are a promising alternative: they map Gaussian noise to the target image indirectly, through slow sampling over a considerable number of steps. To address this limitation, we present a generative diffusion-based deep learning imaging model, named conditional denoising diffusion probabilistic model (CDDPM), for synthesizing high-field (7 Tesla) SWI images from low-field (3 Tesla) SWI images, and assess its clinical applicability. The experimental results demonstrate that a diffusion-based model that synthesizes 7T SWI from 3T SWI images could provide an alternative way to obtain the advantages of ultrahigh-field 7T MR images for deep medullary vein visualization.
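
As a rough illustration of the diffusion idea in this abstract, the sketch below implements the standard DDPM forward (noising) step in NumPy and shows one common way of conditioning a denoiser: stacking the 3T image with the noisy 7T image as channels. The linear beta schedule, the array shapes, the random placeholder images, and all function names are assumptions for illustration; the abstract does not describe how CDDPM is actually built.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t from q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

def denoiser_input(xt, condition):
    """Stack the noisy target and the conditioning image along a channel axis."""
    return np.stack([xt, condition], axis=0)

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
swi_7t = rng.random((64, 64))  # placeholder standing in for a 7T SWI slice
swi_3t = rng.random((64, 64))  # placeholder standing in for the paired 3T slice
xt, noise = q_sample(swi_7t, t=500, alpha_bar=alpha_bar, rng=rng)
net_in = denoiser_input(xt, swi_3t)
print(net_in.shape)
```

A denoising network would be trained to predict `noise` from `net_in` and `t`; that network is omitted here.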

MRI Image Synthesis Neurological Methodology In Silico

Radiology Report Conditional 3D CT Generation with Multi Encoder Latent diffusion Model

Sina Amirrajab, Zohaib Salahuddin, Sheng Kuang, Henry C. Woodruff, Philippe Lambin

•preprint•Sep 18 2025

Text-to-image latent diffusion models have recently advanced medical image synthesis, but applications to 3D CT generation remain limited. Existing approaches rely on simplified prompts, neglecting the rich semantic detail in full radiology reports, which reduces text-image alignment and clinical fidelity. We propose Report2CT, a radiology report-conditional latent diffusion framework for synthesizing 3D chest CT volumes directly from free-text radiology reports, incorporating both findings and impression sections using multiple text encoders. Report2CT integrates three pretrained medical text encoders (BiomedVLP-CXR-BERT, MedEmbed, and ClinicalBERT) to capture nuanced clinical context. Radiology reports and voxel spacing information condition a 3D latent diffusion model trained on 20,000 CT volumes from the CT-RATE dataset. Model performance was evaluated using Fréchet Inception Distance (FID) for real-synthetic distributional similarity and CLIP-based metrics for semantic alignment, with additional qualitative and quantitative comparisons against the GenerateCT model. Report2CT generated anatomically consistent CT volumes with excellent visual quality and text-image alignment. Multi-encoder conditioning improved CLIP scores, indicating stronger preservation of fine-grained clinical details in the free-text radiology reports. Classifier-free guidance further enhanced alignment with only a minor trade-off in FID. We ranked first in the VLM3D Challenge at MICCAI 2025 on Text-Conditional CT Generation and achieved state-of-the-art performance across all evaluation metrics. By leveraging complete radiology reports and multi-encoder text conditioning, Report2CT advances 3D CT synthesis, producing clinically faithful and high-quality synthetic data.
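
The classifier-free guidance step mentioned in this abstract is usually written as a blend of two noise predictions, one made with the text condition and one without. The sketch below shows that standard formula in NumPy. The array shapes, the guidance scale of 3.0, and the function name are assumptions for illustration; the abstract does not give Report2CT's settings.

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, guidance_scale):
    """Blend unconditional and text-conditioned noise predictions.

    guidance_scale = 0 returns the unconditional prediction,
    guidance_scale = 1 returns the conditional one, and larger
    values push further toward the text condition.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Placeholder predictions for a small latent volume (channels, depth, height, width).
eps_uncond = np.zeros((4, 8, 8, 8))
eps_cond = np.ones((4, 8, 8, 8))
guided = classifier_free_guidance(eps_uncond, eps_cond, guidance_scale=3.0)
print(guided.mean())
```

Larger scales typically tighten text alignment at some cost in sample diversity, which is consistent with the FID trade-off the abstract reports.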

CT Image Synthesis Chest Methodology In Silico Academic Lab GenAI Benchmark SOTA

A Deep Learning Framework for Synthesizing Longitudinal Infant Brain MRI during Early Development.

Fang Y, Xiong H, Huang J, Liu F, Shen Z, Cai X, Zhang H, Wang Q

•papers•Sep 17 2025

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.
Purpose To develop a three-stage, age- and modality-conditioned framework to synthesize longitudinal infant brain MRI scans and account for rapid structural and contrast changes during early brain development.
Materials and Methods This retrospective study used T1- and T2-weighted MRI scans (848 scans) from 139 infants in the Baby Connectome Project, collected since September 2016. The framework models three critical, related image cues (volumetric expansion, cortical folding, and myelination) and predicts missing time points with age and modality as predictive factors. The method was compared with LGAN, CounterSyn, and a diffusion-based approach using peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and the Dice similarity coefficient (DSC).
Results The framework was trained on 119 participants (average age: 11.25 ± 6.16 months, 60 female, 59 male) and tested on 20 (average age: 12.98 ± 6.59 months, 11 female, 9 male). For T1-weighted images, PSNRs were 25.44 ± 1.95 and 26.93 ± 2.50 for forward and backward MRI synthesis, and SSIMs were 0.87 ± 0.03 and 0.90 ± 0.02. For T2-weighted images, PSNRs were 26.35 ± 2.30 and 26.40 ± 2.56, with SSIMs of 0.87 ± 0.03 and 0.89 ± 0.02, significantly outperforming competing methods (P < .001). The framework also excelled in tissue segmentation (P < .001) and cortical reconstruction, achieving DSC of 0.85 for gray matter and 0.86 for white matter, with intraclass correlation coefficients exceeding 0.8 in most cortical regions.
Conclusion The proposed three-stage framework effectively synthesized age-specific infant brain MRI scans, outperforming competing methods in image quality and tissue segmentation with strong performance in cortical reconstruction, demonstrating potential for developmental modeling and longitudinal analyses. ©RSNA, 2025.
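
Two of the metrics reported in this abstract, PSNR and the Dice similarity coefficient, follow directly from their textbook definitions. The sketch below is a minimal NumPy version. The function names, the assumed intensity range of 1.0, and the toy arrays are illustrative and are not taken from the study.

```python
import numpy as np

def psnr(reference, synthesized, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    ref = np.asarray(reference, dtype=np.float64)
    syn = np.asarray(synthesized, dtype=np.float64)
    mse = np.mean((ref - syn) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(2.0 * np.logical_and(a, b).sum() / total)

real = np.zeros((8, 8))
fake = np.full((8, 8), 0.1)          # constant error of 0.1 gives MSE = 0.01
print(psnr(real, fake))              # 10 * log10(1 / 0.01)
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```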

MRI Image Synthesis Neurological Retrospective Clinical In Silico Benchmark SOTA

Exploring the Capabilities of LLM Encoders for Image-Text Retrieval in Chest X-rays

Hanbin Ko, Gihun Cho, Inhyeok Baek, Donguk Kim, Joonbeom Koo, Changi Kim, Dongheon Lee, Chang Min Park

•preprint•Sep 17 2025

Vision-language pretraining has advanced image-text alignment, yet progress in radiology remains constrained by the heterogeneity of clinical reports, including abbreviations, impression-only notes, and stylistic variability. Unlike general-domain settings where more data often leads to better performance, naively scaling to large collections of noisy reports can plateau or even degrade model learning. We ask whether large language model (LLM) encoders can provide robust clinical representations that transfer across diverse styles and better guide image-text alignment. We introduce LLM2VEC4CXR, a domain-adapted LLM encoder for chest X-ray reports, and LLM2CLIP4CXR, a dual-tower framework that couples this encoder with a vision backbone. LLM2VEC4CXR improves clinical text understanding over BERT-based baselines, handles abbreviations and style variation, and achieves strong clinical alignment on report-level metrics. LLM2CLIP4CXR leverages these embeddings to boost retrieval accuracy and clinically oriented scores, with stronger cross-dataset generalization than prior medical CLIP variants. Trained on 1.6M CXR studies from public and private sources with heterogeneous and noisy reports, our models demonstrate that robustness -- not scale alone -- is the key to effective multimodal learning. We release models to support further research in medical image-text representation learning.
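
Retrieval accuracy in a dual-tower setup like the one described here is commonly scored with recall@k over cosine similarities between image and text embeddings. The sketch below shows that computation in NumPy. It assumes that row i of each embedding matrix belongs to the same study; the function names and toy embeddings are hypothetical, and the abstract does not say which retrieval metric the authors report.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def recall_at_k(image_emb, text_emb, k):
    """Fraction of images whose paired report (same row index) is in the top k."""
    sims = l2_normalize(image_emb) @ l2_normalize(text_emb).T
    topk = np.argsort(-sims, axis=1)[:, :k]            # best-matching reports per image
    targets = np.arange(sims.shape[0])[:, None]        # index of the true report
    return float(np.mean(np.any(topk == targets, axis=1)))

image_emb = np.eye(3)
text_emb = np.eye(3)[[1, 0, 2]]   # reports 0 and 1 swapped, report 2 correct
print(recall_at_k(image_emb, text_emb, k=1))
print(recall_at_k(image_emb, text_emb, k=3))
```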

X-Ray Image Synthesis Chest Methodology In Silico Academic Lab Open Code

Data fusion of medical imaging in neurological disorders.

Mirzaei G, Gupta A, Adeli H

•papers•Sep 16 2025

Medical imaging plays a crucial role in the accurate diagnosis and prognosis of various medical conditions, with each modality offering unique and complementary insights into the body's structure and function. However, no single imaging technique can capture the full spectrum of necessary information. Data fusion has emerged as a powerful tool to integrate information from different perspectives, including multiple modalities, views, temporal sequences, and spatial scales. By combining data, fusion techniques provide a more comprehensive understanding, significantly enhancing the precision and reliability of clinical analyses. This paper presents an overview of data fusion approaches - covering multi-view, multi-modal, and multi-scale strategies - across imaging modalities such as MRI, CT, PET, SPECT, EEG, and MEG, with a particular emphasis on applications in neurological disorders. Furthermore, we highlight the latest advancements in data fusion methods and key studies published since 2016, illustrating the progress and growing impact of this interdisciplinary field.

Mixed Modality Image Synthesis Neurological Review Concept

Enhancement Without Contrast: Stability-Aware Multicenter Machine Learning for Glioma MRI Imaging

Sajad Amiri, Shahram Taeb, Sara Gharibi, Setareh Dehghanfard, Somayeh Sadat Mehrnia, Mehrdad Oveisi, Ilker Hacihaliloglu, Arman Rahmim, Mohammad R. Salmanpour

•preprint•Sep 13 2025

Gadolinium-based contrast agents (GBCAs) are central to glioma imaging but raise safety, cost, and accessibility concerns. Predicting contrast enhancement from non-contrast MRI using machine learning (ML) offers a safer alternative, as enhancement reflects tumor aggressiveness and informs treatment planning. Yet scanner and cohort variability hinder robust model selection. We propose a stability-aware framework to identify reproducible ML pipelines for multicenter prediction of glioma MRI contrast enhancement. We analyzed 1,446 glioma cases from four TCIA datasets (UCSF-PDGM, UPENN-GB, BRATS-Africa, BRATS-TCGA-LGG). Non-contrast T1WI served as input, with enhancement derived from paired post-contrast T1WI. Using PyRadiomics under IBSI standards, 108 features were extracted and combined with 48 dimensionality reduction methods and 25 classifiers, yielding 1,200 pipelines. Rotational validation was trained on three datasets and tested on the fourth. Cross-validation prediction accuracies ranged from 0.91 to 0.96, with external testing achieving 0.87 (UCSF-PDGM), 0.98 (UPENN-GB), and 0.95 (BRATS-Africa), with an average of 0.93. F1, precision, and recall were stable (0.87 to 0.96), while ROC-AUC varied more widely (0.50 to 0.82), reflecting cohort heterogeneity. The MI linked with ETr pipeline consistently ranked highest, balancing accuracy and stability. This framework demonstrates that stability-aware model selection enables reliable prediction of contrast enhancement from non-contrast glioma MRI, reducing reliance on GBCAs and improving generalizability across centers. It provides a scalable template for reproducible ML in neuro-oncology and beyond.
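
The pipeline count in this abstract follows from a full grid: 48 dimensionality reduction methods times 25 classifiers gives 1,200 combinations. The sketch below enumerates such a grid and a rotational (leave-one-dataset-out) split over the four named datasets. The reducer and classifier names are placeholders, and holding out each dataset in turn is an assumption about how the rotation works.

```python
from itertools import product

datasets = ["UCSF-PDGM", "UPENN-GB", "BRATS-Africa", "BRATS-TCGA-LGG"]
reducers = [f"reducer_{i}" for i in range(48)]          # placeholder names
classifiers = [f"classifier_{j}" for j in range(25)]    # placeholder names

# Every (reducer, classifier) pair is one candidate pipeline.
pipelines = list(product(reducers, classifiers))

def rotational_splits(names):
    """Yield (training datasets, held-out dataset), holding out each once."""
    for held_out in names:
        train = [name for name in names if name != held_out]
        yield train, held_out

splits = list(rotational_splits(datasets))
print(len(pipelines), len(splits))
for train, held_out in splits:
    print(f"train on {train}, test on {held_out}")
```

A stability-aware selection would then rank each pipeline by how consistently it scores across the held-out datasets, not only by its mean score.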

MRI Image Synthesis Neurological Retrospective Clinical In Silico Academic Lab Reproducibility Benchmark SOTA

Mechanistic Learning with Guided Diffusion Models to Predict Spatio-Temporal Brain Tumor Growth

Daria Laslo, Efthymios Georgiou, Marius George Linguraru, Andreas Rauschecker, Sabine Muller, Catherine R. Jutzeler, Sarah Bruningk

•preprint•Sep 11 2025

Predicting the spatio-temporal progression of brain tumors is essential for guiding clinical decisions in neuro-oncology. We propose a hybrid mechanistic learning framework that combines a mathematical tumor growth model with a guided denoising diffusion implicit model (DDIM) to synthesize anatomically feasible future MRIs from preceding scans. The mechanistic model, formulated as a system of ordinary differential equations, captures temporal tumor dynamics including radiotherapy effects and estimates future tumor burden. These estimates condition a gradient-guided DDIM, enabling image synthesis that aligns with both predicted growth and patient anatomy. We train our model on the BraTS adult and pediatric glioma datasets and evaluate on 60 axial slices of in-house longitudinal pediatric diffuse midline glioma (DMG) cases. Our framework generates realistic follow-up scans based on spatial similarity metrics. It also introduces tumor growth probability maps, which capture both clinically relevant extent and directionality of tumor growth as shown by 95th percentile Hausdorff Distance. The method enables biologically informed image generation in data-limited scenarios, offering generative-space-time predictions that account for mechanistic priors.
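
The 95th percentile Hausdorff Distance cited in this abstract can be sketched for two sets of boundary points as follows. This is one common variant, taking the larger of the two directed 95th-percentile nearest-neighbor distances; other definitions exist, and the abstract does not specify which one the authors use. The function name and toy points are illustrative.

```python
import numpy as np

def hd95(points_a, points_b):
    """95th percentile Hausdorff distance between two point sets of shape (n, d)."""
    a = np.asarray(points_a, dtype=np.float64)
    b = np.asarray(points_b, dtype=np.float64)
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1)   # nearest point in b for each point in a
    b_to_a = dists.min(axis=0)   # nearest point in a for each point in b
    return float(max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95)))

predicted = [[0.0, 0.0]]
observed = [[3.0, 4.0]]
print(hd95(predicted, observed))
```

For real masks, the point sets would be the surface voxel coordinates of the predicted and observed tumor regions, scaled by voxel spacing.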

MRI Image Synthesis Neurological Methodology In Silico GenAI

FlexiD-Fuse: Flexible number of inputs multi-modal medical image fusion based on diffusion model

Yushen Xu, Xiaosong Li, Yuchun Wang, Xiaoqi Cheng, Huafeng Li, Haishu Tan

•preprint•Sep 11 2025

Different modalities of medical images provide unique physiological and anatomical information for diseases. Multi-modal medical image fusion integrates useful information from complementary medical images of different modalities, producing a fused image that comprehensively and objectively reflects lesion characteristics to assist doctors in clinical diagnosis. However, existing fusion methods can only handle a fixed number of modality inputs, such as accepting only two-modal or tri-modal inputs, and cannot directly process varying input quantities, which hinders their application in clinical settings. To tackle this issue, we introduce FlexiD-Fuse, a diffusion-based image fusion network designed to accommodate flexible quantities of input modalities. It can process two-modal and tri-modal medical image fusion end to end with the same weights. FlexiD-Fuse transforms the diffusion fusion problem, which supports only fixed-condition inputs, into a maximum likelihood estimation problem based on the diffusion process and hierarchical Bayesian modeling. By incorporating the Expectation-Maximization algorithm into the diffusion sampling iteration process, FlexiD-Fuse can generate high-quality fused images with cross-modal information from source images, independently of the number of input images. We compared our method with the latest two- and tri-modal medical image fusion methods, tested them on Harvard datasets, and evaluated them using nine popular metrics. The experimental results show that our method achieves the best performance in medical image fusion with varying inputs. We also conducted extensive extension experiments on infrared-visible, multi-exposure, and multi-focus image fusion tasks with arbitrary numbers of inputs, and compared the results with the respective SOTA methods. The results of the extension experiments consistently demonstrate the effectiveness and superiority of our method.

Mixed Modality Image Synthesis Methodology In Silico

Virtual staining for 3D X-ray histology of bone implants

Sarah C. Irvine, Christian Lucas, Diana Krüger, Bianca Guedert, Julian Moosmann, Berit Zeller-Plumhoff

•preprint•Sep 11 2025

Three-dimensional X-ray histology techniques offer a non-invasive alternative to conventional 2D histology, enabling volumetric imaging of biological tissues without the need for physical sectioning or chemical staining. However, the inherent greyscale image contrast of X-ray tomography limits its biochemical specificity compared to traditional histological stains. Within digital pathology, deep learning-based virtual staining has demonstrated utility in simulating stained appearances from label-free optical images. In this study, we extend virtual staining to the X-ray domain by applying cross-modality image translation to generate artificially stained slices from synchrotron-radiation-based micro-CT scans. Using over 50 co-registered image pairs of micro-CT and toluidine blue-stained histology from bone-implant samples, we trained a modified CycleGAN network tailored for limited paired data. Whole slide histology images were downsampled to match the voxel size of the CT data, with on-the-fly data augmentation for patch-based training. The model incorporates pixelwise supervision and greyscale consistency terms, producing histologically realistic colour outputs while preserving high-resolution structural detail. Our method outperformed Pix2Pix and standard CycleGAN baselines across SSIM, PSNR, and LPIPS metrics. Once trained, the model can be applied to full CT volumes to generate virtually stained 3D datasets, enhancing interpretability without additional sample preparation. While features such as new bone formation could be reproduced, some variability in the depiction of implant degradation layers highlights the need for further training data and refinement. This work introduces virtual staining to 3D X-ray imaging and offers a scalable route for chemically informative, label-free tissue characterisation in biomedical research.

CT Image Synthesis Musculoskeletal Methodology In Silico Academic Lab GenAI

RoentMod: A Synthetic Chest X-Ray Modification Model to Identify and Correct Image Interpretation Model Shortcuts

Lauren H. Cooke, Matthias Jung, Jan M. Brendel, Nora M. Kerkovits, Borek Foldyna, Michael T. Lu, Vineet K. Raghu

•preprint•Sep 10 2025

Chest radiographs (CXRs) are among the most common tests in medicine. Automated image interpretation may reduce radiologists' workload and expand access to diagnostic expertise. Deep learning multi-task and foundation models have shown strong performance for CXR interpretation but are vulnerable to shortcut learning, where models rely on spurious and off-target correlations rather than clinically relevant features to make decisions. We introduce RoentMod, a counterfactual image editing framework that generates anatomically realistic CXRs with user-specified, synthetic pathology while preserving unrelated anatomical features of the original scan. RoentMod combines an open-source medical image generator (RoentGen) with an image-to-image modification model without requiring retraining. In reader studies with board-certified radiologists and radiology residents, RoentMod-produced images appeared realistic in 93% of cases, correctly incorporated the specified finding in 89-99% of cases, and preserved native anatomy comparable to real follow-up CXRs. Using RoentMod, we demonstrate that state-of-the-art multi-task and foundation models frequently exploit off-target pathology as shortcuts, limiting their specificity. Incorporating RoentMod-generated counterfactual images during training mitigated this vulnerability, improving model discrimination across multiple pathologies by 3-19% AUC in internal validation and by 1-11% for 5 out of 6 tested pathologies in external testing. These findings establish RoentMod as a broadly applicable tool for probing and correcting shortcut learning in medical AI. By enabling controlled counterfactual interventions, RoentMod enhances the robustness and interpretability of CXR interpretation models and provides a generalizable strategy for improving foundation models in medical imaging.

X-Ray Image Synthesis Chest Methodology In Silico Academic Lab GenAI Benchmark SOTA Open Code
