Page 17 of 26256 results

Multi-modal MRI synthesis with conditional latent diffusion models for data augmentation in tumor segmentation.

Kebaili A, Lapuyade-Lahorgue J, Vera P, Ruan S

PubMed · Jul 1, 2025
Multimodality is often necessary for improving object segmentation tasks, especially in the case of multilabel tasks, such as tumor segmentation, which is crucial for clinical diagnosis and treatment planning. However, a major challenge in utilizing multimodality with deep learning remains: the limited availability of annotated training data, primarily due to the time-consuming acquisition process and the necessity for expert annotations. Although deep learning has significantly advanced many tasks in medical imaging, conventional augmentation techniques are often insufficient due to the inherent complexity of volumetric medical data. To address this problem, we propose an innovative slice-based latent diffusion architecture for the generation of 3D multi-modal images and their corresponding multi-label masks. Our approach enables the simultaneous generation of the image and mask in a slice-by-slice fashion, leveraging a positional encoding and a Latent Aggregation module to maintain spatial coherence and capture slice sequentiality. This method effectively reduces the computational complexity and memory demands typically associated with diffusion models. Additionally, we condition our architecture on tumor characteristics to generate a diverse array of tumor variations and enhance texture using a refining module that acts like a super-resolution mechanism, mitigating the inherent blurriness caused by data scarcity in the autoencoder. We evaluate the effectiveness of our synthesized volumes using the BRATS2021 dataset to segment the tumor with three tissue labels and compare them with other state-of-the-art diffusion models through a downstream segmentation task, demonstrating the superior performance and efficiency of our method. While our primary application is tumor segmentation, this method can be readily adapted to other modalities. Code is available at: https://github.com/Arksyd96/multi-modal-mri-and-mask-synthesis-with-conditional-slice-based-ldm.
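Slice-by-slice generation only stays spatially coherent if the model knows where each 2D slice sits in the 3D volume. A minimal sketch of one common way to encode slice order, a transformer-style sinusoidal positional encoding over slice indices (the paper's exact encoding and its Latent Aggregation module may be implemented differently), is:

```python
import numpy as np

def slice_positional_encoding(num_slices: int, dim: int) -> np.ndarray:
    """Sinusoidal encoding of slice position, one vector per slice.

    A generic transformer-style encoding, used here only to illustrate
    how slice sequentiality can be injected into a slice-wise generator.
    """
    pos = np.arange(num_slices)[:, None]        # (S, 1) slice indices
    i = np.arange(dim // 2)[None, :]            # (1, D/2) frequency indices
    angles = pos / (10000.0 ** (2 * i / dim))   # (S, D/2)
    pe = np.zeros((num_slices, dim))
    pe[:, 0::2] = np.sin(angles)                # even dims: sine
    pe[:, 1::2] = np.cos(angles)                # odd dims: cosine
    return pe

# Each slice's latent can be conditioned on pe[slice_index] so the model
# can keep neighboring slices consistent.
pe = slice_positional_encoding(num_slices=8, dim=16)
```

Conditioning each slice's latent on its encoding gives the aggregation step a notion of slice adjacency without paying the memory cost of full 3D diffusion.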

A systematic review of generative AI approaches for medical image enhancement: Comparing GANs, transformers, and diffusion models.

Oulmalme C, Nakouri H, Jaafar F

PubMed · Jul 1, 2025
Medical imaging is a vital diagnostic tool that provides detailed insights into human anatomy but faces challenges affecting its accuracy and efficiency. Advanced generative AI models offer promising solutions. Unlike previous reviews with a narrow focus, a comprehensive evaluation across techniques and modalities is necessary. This systematic review integrates the three leading state-of-the-art approaches, GANs, Diffusion Models, and Transformers, examining their applicability, methodologies, and clinical implications in improving medical image quality. Using the PRISMA framework, 63 of 989 candidate studies were selected via Google Scholar and PubMed, focusing on GANs, Transformers, and Diffusion Models. Articles from ACM, IEEE Xplore, and Springer were analyzed. Generative AI techniques show promise in improving image resolution, reducing noise, and enhancing fidelity. GANs generate high-quality images, Transformers utilize global context, and Diffusion Models are effective in denoising and reconstruction. Challenges include high computational costs, limited dataset diversity, and issues with generalizability, as well as a focus on quantitative metrics over clinical applicability. This review highlights the transformative impact of GANs, Transformers, and Diffusion Models in advancing medical imaging. Future research must address computational and generalization challenges, emphasize open science, and validate these techniques in diverse clinical settings to unlock their full potential. These efforts could enhance diagnostic accuracy, lower costs, and improve patient outcomes.

Quantitative Ischemic Lesions of Portable Low-Field Strength MRI Using Deep Learning-Based Super-Resolution.

Bian Y, Wang L, Li J, Yang X, Wang E, Li Y, Liu Y, Xiang L, Yang Q

PubMed · Jul 1, 2025
Deep learning-based synthetic super-resolution magnetic resonance imaging (SynthMRI) may improve the quantitative lesion performance of portable low-field strength magnetic resonance imaging (LF-MRI). The aim of this study is to evaluate whether SynthMRI improves the diagnostic performance of LF-MRI in assessing ischemic lesions. We retrospectively included 178 stroke patients and 104 healthy controls with both LF-MRI and high-field strength magnetic resonance imaging (HF-MRI) examinations. Using HF-MRI as the ground truth, the deep learning-based super-resolution framework (SCUNet [Swin-Conv-UNet]) was pretrained using large-scale open-source data sets to generate SynthMRI images from LF-MRI images. Participants were split into a training set (64.2%) to fine-tune the pretrained SCUNet and a testing set (35.8%) to evaluate the performance of SynthMRI. Sensitivity and specificity of LF-MRI and SynthMRI were assessed. Agreement with HF-MRI for Alberta Stroke Program Early CT Score in the anterior and posterior circulation (diffusion-weighted imaging-Alberta Stroke Program Early CT Score and diffusion-weighted imaging-posterior circulation Alberta Stroke Program Early CT Score) was evaluated using intraclass correlation coefficients (ICCs). Agreement with HF-MRI for lesion volume and mean apparent diffusion coefficient (ADC) within lesions was assessed using both ICCs and Pearson correlation coefficients. SynthMRI demonstrated significantly higher sensitivity and specificity than LF-MRI (89.0% [83.3%-94.6%] versus 77.1% [69.5%-84.7%]; <i>P</i><0.001 and 91.3% [84.7%-98.0%] versus 71.0% [60.3%-81.7%]; <i>P</i><0.001, respectively). The ICCs of diffusion-weighted imaging-Alberta Stroke Program Early CT Score between SynthMRI and HF-MRI were also better than those between LF-MRI and HF-MRI (0.952 [0.920-0.972] versus 0.797 [0.678-0.876], <i>P</i><0.001). For lesion volume and mean apparent diffusion coefficient within lesions, SynthMRI showed significantly higher agreement (<i>P</i><0.001) with HF-MRI (ICC>0.85, <i>r</i>>0.78) than LF-MRI (ICC>0.45, <i>r</i>>0.35). Furthermore, for lesions during various poststroke phases, SynthMRI exhibited significantly higher agreement with HF-MRI than LF-MRI during the early hyperacute and subacute phases. SynthMRI demonstrates high agreement with HF-MRI in detecting and quantifying ischemic lesions and is better than LF-MRI, particularly for lesions during the early hyperacute and subacute phases.
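The agreement statistics above are intraclass correlation coefficients. As a reference point, here is a minimal NumPy implementation of ICC(3,1) (two-way mixed effects, consistency, single measurement; the study may use a different ICC variant):

```python
import numpy as np

def icc_3_1(x: np.ndarray, y: np.ndarray) -> float:
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    x and y are paired measurements of the same subjects by two "raters"
    (e.g. lesion volumes from SynthMRI and from HF-MRI).
    """
    data = np.stack([x, y], axis=1)          # (n subjects, k=2 raters)
    n, k = data.shape
    subj_mean = data.mean(axis=1)
    rater_mean = data.mean(axis=0)
    grand = data.mean()
    ss_rows = k * ((subj_mean - grand) ** 2).sum()   # between-subject SS
    ss_err = ((data - subj_mean[:, None]
                    - rater_mean[None, :] + grand) ** 2).sum()
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
```

Because the consistency form removes the systematic rater offset, a method that is biased but linearly consistent with HF-MRI can still score a high ICC; absolute-agreement variants penalize such offsets.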

Multi-label pathology editing of chest X-rays with a Controlled Diffusion Model.

Chu H, Qi X, Wang H, Liang Y

PubMed · Jul 1, 2025
Large-scale generative models have garnered significant attention in the field of medical imaging, particularly for image editing utilizing diffusion models. However, current research has predominantly concentrated on pathological editing involving single or a limited number of labels, making it challenging to achieve precise modifications. Inaccurate alterations may lead to substantial discrepancies between the generated and original images, thereby impacting the clinical applicability of these models. This paper presents a diffusion model with untangling capabilities applied to chest X-ray image editing, incorporating a mask-based mechanism for bone and organ information. We successfully perform multi-label pathological editing of chest X-ray images without compromising the integrity of the original thoracic structure. The proposed technology comprises a chest X-ray image classifier and an intricate organ mask; the classifier supplies essential feature labels that require untangling for the stabilized diffusion model, while the complex organ mask facilitates directed and controllable edits to chest X-rays. We assessed the outcomes of our proposed algorithm, named Chest X-rays_Mpe, using MS-SSIM and CLIP scores alongside qualitative evaluations conducted by radiology experts. The results indicate that our approach surpasses existing algorithms across both quantitative and qualitative metrics.
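The role of the organ mask, keeping bone and organ structure fixed while the pathology region is edited, can be illustrated with a generic masked composite (a sketch of mask-guided editing in general, not this paper's exact blending procedure):

```python
import numpy as np

def mask_preserving_edit(original: np.ndarray,
                         generated: np.ndarray,
                         organ_mask: np.ndarray) -> np.ndarray:
    """Composite an edited image into the original chest X-ray.

    organ_mask is 1 where anatomy must be preserved (bones, organs) and
    0 where the diffusion model is allowed to introduce pathology edits.
    This is a generic illustration, not the paper's implementation.
    """
    return organ_mask * original + (1.0 - organ_mask) * generated
```

In diffusion-based editors this blend is typically applied at every denoising step rather than once at the end, which is what keeps the masked anatomy from drifting during sampling.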

Medical image translation with deep learning: Advances, datasets and perspectives.

Chen J, Ye Z, Zhang R, Li H, Fang B, Zhang LB, Wang W

PubMed · Jul 1, 2025
Traditional medical image generation often lacks patient-specific clinical information, limiting its clinical utility despite enhancing downstream task performance. In contrast, medical image translation precisely converts images from one modality to another, preserving both anatomical structures and cross-modal features, thus enabling efficient and accurate modality transfer and offering unique advantages for model development and clinical practice. This paper reviews the latest advancements in deep learning (DL)-based medical image translation. Initially, it elaborates on the diverse tasks and practical applications of medical image translation. Subsequently, it provides an overview of fundamental models, including convolutional neural networks (CNNs), transformers, and state space models (SSMs). Additionally, it delves into generative models such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Autoregressive Models (ARs), Diffusion Models, and Flow Models. Evaluation metrics for assessing translation quality are discussed, emphasizing their importance. Commonly used datasets in this field are also analyzed, highlighting their unique characteristics and applications. Looking ahead, the paper identifies future trends and challenges, and proposes research directions and solutions in medical image translation. It aims to serve as a valuable reference and inspiration for researchers, driving continued progress and innovation in this area.

A lung structure and function information-guided residual diffusion model for predicting idiopathic pulmonary fibrosis progression.

Jiang C, Xing X, Nan Y, Fang Y, Zhang S, Walsh S, Yang G, Shen D

PubMed · Jul 1, 2025
Idiopathic Pulmonary Fibrosis (IPF) is a progressive lung disease that continuously scars and thickens lung tissue, leading to respiratory difficulties. Timely assessment of IPF progression is essential for developing treatment plans and improving patient survival rates. However, current clinical standards require multiple (usually two) CT scans at certain intervals to assess disease progression. This presents a dilemma: the disease progression is identified only after the disease has already progressed. To address this issue, a feasible solution is to generate the follow-up CT image from the patient's initial CT image to achieve early prediction of IPF progression. To this end, we propose a lung structure and function information-guided residual diffusion model. The key components of our model include (1) using a 2.5D generation strategy to reduce the computational cost of generating 3D images with the diffusion model; (2) designing structural attention to mitigate the negative impact of spatial misalignment between the two CT images on generation performance; (3) employing residual diffusion to accelerate model training and inference while focusing more on the differences between the two CT images (i.e., the lesion areas); and (4) developing a CLIP-based text extraction module to extract lung function test information and further using such extracted information to guide the generation. Extensive experiments demonstrate that our method can effectively predict IPF progression and achieve superior generation performance compared to state-of-the-art methods.
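A toy sketch of the residual-diffusion idea in component (3): the forward noising process is applied to the difference between follow-up and baseline CT rather than to the full image, so the model only has to denoise the (typically sparse) change map. The noise schedule values below are illustrative assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

def forward_noise_residual(baseline: np.ndarray,
                           followup: np.ndarray,
                           t: int, T: int = 1000):
    """DDPM-style forward noising applied to the residual image.

    Toy illustration of residual diffusion: noise is injected into
    (followup - baseline), not into followup itself. The linear beta
    schedule is a common default, assumed here for illustration.
    """
    betas = np.linspace(1e-4, 0.02, T)
    alpha_bar = np.cumprod(1.0 - betas)[t]        # cumulative signal level
    residual = followup - baseline                # the change map
    noise = rng.standard_normal(residual.shape)
    noisy = np.sqrt(alpha_bar) * residual + np.sqrt(1.0 - alpha_bar) * noise
    return noisy, noise
```

At sampling time the learned reverse process denoises back to a predicted residual, which is added to the baseline CT to obtain the predicted follow-up scan.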

Magnetic resonance image generation using enhanced TransUNet in temporomandibular disorder patients.

Ha EG, Jeon KJ, Lee C, Kim DH, Han SS

PubMed · Jul 1, 2025
Temporomandibular disorder (TMD) patients experience a variety of clinical symptoms, and MRI is the most effective tool for diagnosing temporomandibular joint (TMJ) disc displacement. This study aimed to develop a transformer-based deep learning model to generate T2-weighted (T2w) images from proton density-weighted (PDw) images, reducing MRI scan time for TMD patients. A dataset of 7226 images from 178 patients who underwent TMJ MRI examinations was used. The proposed model employed a generative adversarial network framework with a TransUNet architecture as the generator for image translation. Additionally, a disc segmentation decoder was integrated to improve image quality in the TMJ disc region. The model performance was evaluated using metrics such as the structural similarity index measure (SSIM), learned perceptual image patch similarity (LPIPS), and Fréchet inception distance (FID). Three experienced oral radiologists also performed a qualitative assessment through the mean opinion score (MOS). The model demonstrated high performance in generating T2w images from PDw images, achieving average SSIM, LPIPS, and FID values of 82.28%, 2.46, and 23.85, respectively, in the disc region. The model also obtained an average MOS score of 4.58, surpassing other models. Additionally, the model showed robust segmentation capabilities for the TMJ disc. The proposed model, integrating a transformer and a disc segmentation task, demonstrated strong performance in MR image generation, both quantitatively and qualitatively. This suggests its potential clinical significance in reducing MRI scan times for TMD patients while maintaining high image quality.
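Of the quality metrics quoted above, SSIM is the simplest to sketch. A single-window (global) variant with the standard stabilizing constants follows; the full metric averages this quantity over local sliding windows:

```python
import numpy as np

def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Single-window SSIM between two images in [0, data_range].

    Simplified global form of the structural similarity index; real
    evaluations (e.g. scikit-image) compute it over local windows.
    """
    c1 = (0.01 * data_range) ** 2     # standard luminance constant
    c2 = (0.03 * data_range) ** 2     # standard contrast constant
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score exactly 1; anticorrelated images drive the covariance term negative and the score well below zero.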

Deep Learning Estimation of Small Airway Disease from Inspiratory Chest Computed Tomography: Clinical Validation, Repeatability, and Associations with Adverse Clinical Outcomes in Chronic Obstructive Pulmonary Disease.

Chaudhary MFA, Awan HA, Gerard SE, Bodduluri S, Comellas AP, Barjaktarevic IZ, Barr RG, Cooper CB, Galban CJ, Han MK, Curtis JL, Hansel NN, Krishnan JA, Menchaca MG, Martinez FJ, Ohar J, Vargas Buonfiglio LG, Paine R, Bhatt SP, Hoffman EA, Reinhardt JM

PubMed · Jul 1, 2025
<b>Rationale:</b> Quantifying functional small airway disease (fSAD) requires additional expiratory computed tomography (CT) scans, limiting clinical applicability. Artificial intelligence (AI) could enable fSAD quantification from chest CT scans at total lung capacity (TLC) alone (fSAD<sup>TLC</sup>). <b>Objectives:</b> To evaluate an AI model for estimating fSAD<sup>TLC</sup>, compare it with dual-volume parametric response mapping fSAD (fSAD<sup>PRM</sup>), and assess its clinical associations and repeatability in chronic obstructive pulmonary disease (COPD). <b>Methods:</b> We analyzed 2,513 participants from SPIROMICS (the Subpopulations and Intermediate Outcome Measures in COPD Study). Using a randomly sampled subset (<i>n</i> = 1,055), we developed a generative model to produce virtual expiratory CT scans for estimating fSAD<sup>TLC</sup> in the remaining 1,458 SPIROMICS participants. We compared fSAD<sup>TLC</sup> with dual-volume fSAD<sup>PRM</sup>. We investigated univariate and multivariable associations of fSAD<sup>TLC</sup> with FEV<sub>1</sub>, FEV<sub>1</sub>/FVC ratio, 6-minute-walk distance, St. George's Respiratory Questionnaire score, and FEV<sub>1</sub> decline. The results were validated in a subset of patients from the COPDGene (Genetic Epidemiology of COPD) study (<i>n</i> = 458). Multivariable models were adjusted for age, race, sex, body mass index, baseline FEV<sub>1</sub>, smoking pack-years, smoking status, and percent emphysema. <b>Measurements and Main Results:</b> Inspiratory fSAD<sup>TLC</sup> showed a strong correlation with fSAD<sup>PRM</sup> in SPIROMICS (Pearson's <i>R</i> = 0.895) and COPDGene (<i>R</i> = 0.897) cohorts. Higher fSAD<sup>TLC</sup> levels were significantly associated with lower lung function, including lower postbronchodilator FEV<sub>1</sub> (in liters) and FEV<sub>1</sub>/FVC ratio, and poorer quality of life reflected by higher total St. George's Respiratory Questionnaire scores independent of percent CT emphysema. In SPIROMICS, individuals with higher fSAD<sup>TLC</sup> experienced a decline in FEV<sub>1</sub> of 1.156 ml (relative decrease; 95% confidence interval [CI], 0.613-1.699; <i>P</i> < 0.001) per year for every 1% increase in fSAD<sup>TLC</sup>. The rate of decline in the COPDGene cohort was slightly lower at 0.866 ml/yr (relative decrease; 95% CI, 0.345-1.386; <i>P</i> < 0.001) per 1% increase in fSAD<sup>TLC</sup>. Inspiratory fSAD<sup>TLC</sup> demonstrated greater consistency between repeated measurements, with a higher intraclass correlation coefficient of 0.99 (95% CI, 0.98-0.99) compared with fSAD<sup>PRM</sup> (0.83; 95% CI, 0.76-0.88). <b>Conclusions:</b> Small airway disease can be reliably assessed from a single inspiratory CT scan using generative AI, eliminating the need for an additional expiratory CT scan. fSAD estimation from inspiratory CT correlates strongly with fSAD<sup>PRM</sup>, demonstrates a significant association with FEV<sub>1</sub> decline, and offers greater repeatability.
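For context, the dual-volume parametric response mapping (PRM) that fSAD<sup>PRM</sup> refers to classifies each voxel from paired inspiratory/expiratory attenuation values; the commonly cited cutoffs are -950 HU on inspiration and -856 HU on expiration. This is a generic PRM sketch under those assumed thresholds, not this study's code:

```python
import numpy as np

def prm_classify(insp_hu: np.ndarray, exp_hu: np.ndarray) -> np.ndarray:
    """Voxelwise PRM labels from registered inspiratory/expiratory CT.

    Commonly cited thresholds: emphysema if inspiratory HU < -950;
    fSAD (gas trapping without emphysema) if inspiratory HU >= -950
    and expiratory HU < -856; otherwise normal parenchyma.
    Labels: 0 = normal, 1 = fSAD, 2 = emphysema.
    """
    labels = np.zeros(insp_hu.shape, dtype=np.uint8)
    labels[(insp_hu >= -950) & (exp_hu < -856)] = 1   # fSAD
    labels[insp_hu < -950] = 2                        # emphysema
    return labels
```

The study's contribution is replacing the expiratory scan in this pairing with a generated (virtual) expiratory CT, so only the inspiratory acquisition is needed.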

In-silico CT simulations of deep learning generated heterogeneous phantoms.

Salinas CS, Magudia K, Sangal A, Ren L, Segars PW

PubMed · Jun 30, 2025
Current virtual imaging phantoms primarily emphasize geometric accuracy of anatomical structures. However, to enhance realism, it is also important to incorporate intra-organ detail. Because biological tissues are heterogeneous in composition, virtual phantoms should reflect this by including realistic intra-organ texture and material variation. We propose training two 3D Double U-Net conditional generative adversarial networks (3D DUC-GAN) to generate sixteen unique textures that encompass organs found within the torso. The model was trained on 378 CT image-segmentation pairs taken from a publicly available dataset, with 18 additional pairs reserved for testing. Textured phantoms were generated and imaged using DukeSim, a virtual CT simulation platform. Results showed that the deep learning model was able to synthesize realistic heterogeneous phantoms from a set of homogeneous phantoms. These phantoms were compared with original CT scans and had a mean absolute difference of 46.15 ± 1.06 HU. The structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) were 0.86 ± 0.004 and 28.62 ± 0.14, respectively. The maximum mean discrepancy between the generated and actual distributions was 0.0016. These metrics marked an improvement of 27%, 5.9%, 6.2%, and 28%, respectively, compared to current homogeneous texture methods. The generated phantoms that underwent a virtual CT scan had a closer visual resemblance to the true CT scan compared to the previous method. The resulting heterogeneous phantoms offer a significant step toward more realistic in silico trials, enabling enhanced simulation of imaging procedures with greater fidelity to true anatomical variation.
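The PSNR figure quoted above follows directly from the mean squared error and the chosen data range (for CT, the range depends on the HU window used, which the abstract does not state). A minimal implementation for checking such numbers:

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, data_range: float) -> float:
    """Peak signal-to-noise ratio in dB.

    data_range is the maximum possible pixel difference (e.g. 1.0 for
    normalized images); its choice directly shifts the reported value.
    """
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```

For example, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of exactly 20 dB.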

VAP-Diffusion: Enriching Descriptions with MLLMs for Enhanced Medical Image Generation

Peng Huang, Junhu Fu, Bowen Guo, Zeju Li, Yuanyuan Wang, Yi Guo

arXiv preprint · Jun 30, 2025
As the appearance of medical images is influenced by multiple underlying factors, generative models require rich attribute information beyond labels to produce realistic and diverse images. For instance, generating an image of skin lesion with specific patterns demands descriptions that go beyond diagnosis, such as shape, size, texture, and color. However, such detailed descriptions are not always accessible. To address this, we explore a framework, termed Visual Attribute Prompts (VAP)-Diffusion, to leverage external knowledge from pre-trained Multi-modal Large Language Models (MLLMs) to improve the quality and diversity of medical image generation. First, to derive descriptions from MLLMs without hallucination, we design a series of prompts following Chain-of-Thoughts for common medical imaging tasks, including dermatologic, colorectal, and chest X-ray images. Generated descriptions are utilized during training and stored across different categories. During testing, descriptions are randomly retrieved from the corresponding category for inference. Moreover, to make the generator robust to unseen combination of descriptions at the test time, we propose a Prototype Condition Mechanism that restricts test embeddings to be similar to those from training. Experiments on three common types of medical imaging across four datasets verify the effectiveness of VAP-Diffusion.
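One plausible reading of the Prototype Condition Mechanism is that a test-time description embedding is pulled toward its nearest stored training prototype, so unseen description combinations stay close to the training distribution. The sketch below assumes a simple nearest-prototype interpolation with an assumed weight `alpha`; the paper's actual mechanism may differ:

```python
import numpy as np

def prototype_condition(test_emb: np.ndarray,
                        prototypes: np.ndarray,
                        alpha: float = 0.5) -> np.ndarray:
    """Pull a test embedding toward its nearest training prototype.

    Hypothetical illustration: prototypes is (P, D) class-wise stored
    training embeddings, test_emb is (D,). alpha=1 keeps the raw test
    embedding; alpha=0 snaps fully onto the nearest prototype.
    """
    dists = np.linalg.norm(prototypes - test_emb, axis=1)
    nearest = prototypes[np.argmin(dists)]
    return alpha * test_emb + (1.0 - alpha) * nearest
```

With alpha between 0 and 1 the conditioned embedding trades off fidelity to the retrieved description against staying inside the region the generator saw during training.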