Latest Papers on Radiology AI. Tags: Image Synthesis

Influence of high-performance image-to-image translation networks on clinical visual assessment and outcome prediction: utilizing ultrasound to MRI translation in prostate cancer.

Salmanpour MR, Mousavi A, Xu Y, Weeks WB, Hacihaliloglu I

•papers•Jul 19 2025

Image-to-image (I2I) translation networks have emerged as promising tools for generating synthetic medical images; however, their clinical reliability and ability to preserve diagnostically relevant features remain underexplored. This study evaluates the performance of state-of-the-art 2D/3D I2I networks for converting ultrasound (US) images to synthetic MRI in prostate cancer (PCa) imaging. The novelty lies in combining radiomics, expert clinical evaluation, and classification performance to comprehensively benchmark these models for potential integration into real-world diagnostic workflows. A dataset of 794 PCa patients was analyzed using ten leading I2I networks to synthesize MRI from US input. Radiomics feature (RF) analysis was performed using Spearman correlation to assess whether high-performing networks (SSIM > 0.85) preserved quantitative imaging biomarkers. A qualitative evaluation by seven experienced physicians assessed the anatomical realism, presence of artifacts, and diagnostic interpretability of synthetic images. Additionally, classification tasks using synthetic images were conducted using two machine learning and one deep learning model to assess the practical diagnostic benefit. Among all networks, 2D-Pix2Pix achieved the highest SSIM (0.855 ± 0.032). RF analysis showed that 76 out of 186 features were preserved post-translation, while the remainder were degraded or lost. Qualitative feedback revealed consistent issues with low-level feature preservation and artifact generation, particularly in lesion-rich regions. These evaluations were conducted to assess whether synthetic MRI retained clinically relevant patterns, supported expert interpretation, and improved diagnostic accuracy. Importantly, classification performance using synthetic MRI significantly exceeded that of US-based input, achieving average accuracy and AUC of ~ 0.93 ± 0.05. Although 2D-Pix2Pix showed the best overall performance in similarity and partial RF preservation, improvements are still required in lesion-level fidelity and artifact suppression. The combination of radiomics, qualitative, and classification analyses offered a holistic view of the current strengths and limitations of I2I models, supporting their potential in clinical applications pending further refinement and validation.

Mixed Modality Image Synthesis Abdominal Retrospective Clinical In Silico

Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation

Andrea Moschetto, Lemuel Puglisi, Alec Sargood, Pierluigi Dell'Acqua, Francesco Guarnera, Sebastiano Battiato, Daniele Ravì

•preprint•Jul 19 2025

Magnetic Resonance Imaging (MRI) enables the acquisition of multiple image contrasts, such as T1-weighted (T1w) and T2-weighted (T2w) scans, each offering distinct diagnostic insights. However, acquiring all desired modalities increases scan time and cost, motivating research into computational methods for cross-modal synthesis. To address this, recent approaches aim to synthesize missing MRI contrasts from those already acquired, reducing acquisition time while preserving diagnostic quality. Image-to-image (I2I) translation provides a promising framework for this task. In this paper, we present a comprehensive benchmark of generative models$\unicode{x2013}$specifically, Generative Adversarial Networks (GANs), diffusion models, and flow matching (FM) techniques$\unicode{x2013}$for T1w-to-T2w 2D MRI I2I translation. All frameworks are implemented with comparable settings and evaluated on three publicly available MRI datasets of healthy adults. Our quantitative and qualitative analyses show that the GAN-based Pix2Pix model outperforms diffusion and FM-based methods in terms of structural fidelity, image quality, and computational efficiency. Consistent with existing literature, these results suggest that flow-based models are prone to overfitting on small datasets and simpler tasks, and may require more data to match or surpass GAN performance. These findings offer practical guidance for deploying I2I translation techniques in real-world MRI workflows and highlight promising directions for future research in cross-modal medical image synthesis. Code and models are publicly available at https://github.com/AndreaMoschetto/medical-I2I-benchmark.

MRI Image Synthesis Neurological Methodology In Silico Academic Lab Open Code Benchmark SOTA

Diagnostic interchangeability of deep-learning based Synth-STIR images generated from T1 and T2 weighted spine images.

Li J, Xu M, Jiang B, Dong Q, Xia Y, Zhou T, Lin X, Ma Y, Jiang S, Zhang Z, Xiang L, Fan L, Liu S

•papers•Jul 18 2025

To evaluate image quality and diagnostic interchangeability of synth short-tau inversion recovery (STIR) generated by deep learning in comparison with standard STIR. This prospective study recruited participants between July 2023 and August 2023. Participants were scanned with T1WI and T2WI, then generated Synth-STIR. Signal-to-noise ratios (SNR), contrast-to-noise ratios (CNR) were calculated for quantitative evaluation. Four independent, blinded radiologists performed subjective quality and lesion characteristic assessment. Wilcoxon tests were used to assess the differences in SNR, CNR, and subjective image quality. Various diagnostic findings pertinent to the spine were tested for interchangeability using the individual equivalence index (IEI). Inter-reader and intra-reader agreement and concordance were computed, and McNemar tests were performed for comprehensive evaluation. One hundred ninety-nine participants (106 male patients, mean age 46.8 ± 16.9 years) were included. Compared to standard-STIR, Synth-STIR reduces sequence scanning time by approximately 180 s, has significantly higher SNR and CNR (p < 0.001). For artifacts, noise, sharpness, and diagnostic confidence, all readers agreed that Synth-STIR was significantly better than standard-STIR (all p < 0.001). In addition, the IEI was less than 1.61%. Kappa and Kendall showed a moderate to excellent agreement in the range of 0.52-0.97. There was no significant difference in the frequencies of the major features as reported with standard-STIR and Synth-STIR (p = 0.211-1). Synth-STIR shows significantly higher SNR and CNR, and is diagnostically interchangeable with standard-STIR with a substantial overall reduction in the imaging time, thereby improving efficiency without sacrificing diagnostic value. Question Can generating STIR improve image quality while reducing spine MRI acquisition time in order to increase clinical spine MRI throughput? Findings With reduced acquisition time, Synth-STIR has significantly higher SNR and CNR than standard-STIR and can be interchangeably diagnosed with standard-STIR in detecting spinal abnormalities. Clinical relevance Our Synth-STIR provides the same high-quality images for clinical diagnosis as standard-STIR, while reducing scanning time for spine MRI protocols. Increase clinical spine MRI throughput.

MRI Image Synthesis Musculoskeletal Prospective Clinical Pilot Academic Lab Benchmark SOTA

Converting T1-weighted MRI from 3T to 7T quality using deep learning

Malo Gicquel, Ruoyi Zhao, Anika Wuestefeld, Nicola Spotorno, Olof Strandberg, Kalle Åström, Yu Xiao, Laura EM Wisse, Danielle van Westen, Rik Ossenkoppele, Niklas Mattsson-Carlgren, David Berron, Oskar Hansson, Gabrielle Flood, Jacob Vogel

•preprint•Jul 18 2025

Ultra-high resolution 7 tesla (7T) magnetic resonance imaging (MRI) provides detailed anatomical views, offering better signal-to-noise ratio, resolution and tissue contrast than 3T MRI, though at the cost of accessibility. We present an advanced deep learning model for synthesizing 7T brain MRI from 3T brain MRI. Paired 7T and 3T T1-weighted images were acquired from 172 participants (124 cognitively unimpaired, 48 impaired) from the Swedish BioFINDER-2 study. To synthesize 7T MRI from 3T images, we trained two models: a specialized U-Net, and a U-Net integrated with a generative adversarial network (GAN U-Net). Our models outperformed two additional state-of-the-art 3T-to-7T models in image-based evaluation metrics. Four blinded MRI professionals judged our synthetic 7T images as comparable in detail to real 7T images, and superior in subjective visual quality to 7T images, apparently due to the reduction of artifacts. Importantly, automated segmentations of the amygdalae of synthetic GAN U-Net 7T images were more similar to manually segmented amygdalae (n=20), than automated segmentations from the 3T images that were used to synthesize the 7T images. Finally, synthetic 7T images showed similar performance to real 3T images in downstream prediction of cognitive status using MRI derivatives (n=3,168). In all, we show that synthetic T1-weighted brain images approaching 7T quality can be generated from 3T images, which may improve image quality and segmentation, without compromising performance in downstream tasks. Future directions, possible clinical use cases, and limitations are discussed.

MRI Image Synthesis Neurological Methodology In Silico Academic Lab Benchmark SOTA

Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images

Zahra TehraniNasab, Amar Kumar, Tal Arbel

•preprint•Jul 17 2025

Medical image synthesis presents unique challenges due to the inherent complexity and high-resolution details required in clinical contexts. Traditional generative architectures such as Generative Adversarial Networks (GANs) or Variational Auto Encoder (VAEs) have shown great promise for high-resolution image generation but struggle with preserving fine-grained details that are key for accurate diagnosis. To address this issue, we introduce Pixel Perfect MegaMed, the first vision-language foundation model to synthesize images at resolutions of 1024x1024. Our method deploys a multi-scale transformer architecture designed specifically for ultra-high resolution medical image generation, enabling the preservation of both global anatomical context and local image-level details. By leveraging vision-language alignment techniques tailored to medical terminology and imaging modalities, Pixel Perfect MegaMed bridges the gap between textual descriptions and visual representations at unprecedented resolution levels. We apply our model to the CheXpert dataset and demonstrate its ability to generate clinically faithful chest X-rays from text prompts. Beyond visual quality, these high-resolution synthetic images prove valuable for downstream tasks such as classification, showing measurable performance gains when used for data augmentation, particularly in low-data regimes. Our code is accessible through the project website - https://tehraninasab.github.io/pixelperfect-megamed.

X-Ray Image Synthesis Chest Methodology In Silico Academic Lab GenAI Open Code

Single Inspiratory Chest CT-based Generative Deep Learning Models to Evaluate Functional Small Airway Disease.

Zhang D, Zhao M, Zhou X, Li Y, Guan Y, Xia Y, Zhang J, Dai Q, Zhang J, Fan L, Zhou SK, Liu S

•papers•Jul 16 2025

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a deep learning model that uses a single inspiratory chest CT scan to generate parametric response maps (PRM) and predict functional small airway disease (fSAD). Materials and Methods In this retrospective study, predictive and generative deep learning models for PRM using inspiratory chest CT were developed using a model development dataset with fivefold cross-validation, with PRM derived from paired respiratory CT as the reference standard. Voxel-wise metrics, including sensitivity, area under the receiver operating characteristic curve (AUC), and structural similarity, were used to evaluate model performance in predicting PRM and expiratory CT images. The best performing model was tested on three internal test sets and an external test set. Results The model development dataset of 308 patients (median age, 67 years, [IQR: 62-70 years]; 113 female) was divided into the training set (n = 216), the internal validation set (n = 31), and the first internal test set (n = 61). The generative model outperformed the predictive model in detecting fSAD (sensitivity 86.3% vs 38.9%; AUC 0.86 vs 0.70). The generative model performed well in the second internal (AUCs of 0.64, 0.84, 0.97 for emphysema, fSAD and normal lung tissue), the third internal (AUCs of 0.63, 0.83, 0.97), and the external (AUCs of 0.58, 0.85, 0.94) test sets. Notably, the model exhibited exceptional performance in the PRISm group of the fourth internal test set (AUC = 0.62, 0.88, and 0.96). Conclusion The proposed generative model, using a single inspiratory CT, outperformed existing algorithms in PRM evaluation, achieved comparable results to paired respiratory CT. Published under a CC BY 4.0 license.

CT Image Synthesis Chest Retrospective Clinical In Silico Benchmark SOTA

Cross-Modal conditional latent diffusion model for Brain MRI to Ultrasound image translation.

Jiang S, Wang L, Li Y, Yang Z, Zhou Z, Li B

•papers•Jul 16 2025

Intraoperative brain ultrasound (US) provides real-time information on lesions and tissues, making it crucial for brain tumor resection. However, due to limitations such as imaging angles and operator techniques, US data is limited in size and difficult to annotate, hindering advancements in intelligent image processing. In contrast, Magnetic Resonance Imaging (MRI) data is more abundant and easier to annotate. If MRI data and models can be effectively transferred to the US domain, generating high-quality US data would greatly enhance US image processing and improve intraoperative US readability.Approach. We propose a Cross-Modal Conditional Latent Diffusion Model (CCLD) for brain MRI-to-US image translation. We employ a noise mask restoration strategy to pretrain an efficient encoder-decoder, enhancing feature extraction, compression, and reconstruction capabilities while reducing computational costs. Furthermore, CCLD integrates the Frequency-Decomposed Feature Optimization Module (FFOM) and the Adaptive Multi-Frequency Feature Fusion Module (AMFM) to effectively leverage MRI structural information and US texture characteristics, ensuring structural accuracy while enhancing texture details in the synthetic US images.Main results. Compared with state-of-the-art methods, our approach achieves superior performance on the ReMIND dataset, obtaining the best Learned Perceptual Image Patch Similarity (LPIPS) score of 19.1%, Mean Absolute Error (MAE) of 4.21%, as well as the highest Peak Signal-to-Noise Ratio (PSNR) of 25.36 dB and Structural Similarity Index (SSIM) of 86.91%. Significance. Experimental results demonstrate that CCLD effectively improves the quality and realism of synthetic ultrasound images, offering a new research direction for the generation of high-quality US datasets and the enhancement of ultrasound image readability.&#xD.

Mixed Modality Image Synthesis Neurological Methodology In Silico GenAI Open Dataset

A diffusion model for universal medical image enhancement.

Fei B, Li Y, Yang W, Gao H, Xu J, Ma L, Yang Y, Zhou P

•papers•Jul 15 2025

The development of medical imaging techniques has made a significant contribution to clinical decision-making. However, the existence of suboptimal imaging quality, as indicated by irregular illumination or imbalanced intensity, presents significant obstacles in automating disease screening, analysis, and diagnosis. Existing approaches for natural image enhancement are mostly trained with numerous paired images, presenting challenges in data collection and training costs, all while lacking the ability to generalize effectively. Here, we introduce a pioneering training-free Diffusion Model for Universal Medical Image Enhancement, named UniMIE. UniMIE demonstrates its unsupervised enhancement capabilities across various medical image modalities without the need for any fine-tuning. It accomplishes this by relying solely on a single pre-trained model from ImageNet. We conduct a comprehensive evaluation on 13 imaging modalities and over 15 medical types, demonstrating better qualities, robustness, and accuracy than other modality-specific and data-inefficient models. By delivering high-quality enhancement and corresponding accuracy downstream tasks across a wide range of tasks, UniMIE exhibits considerable potential to accelerate the advancement of diagnostic tools and customized treatment plans. UniMIE represents a transformative approach to medical image enhancement, offering a versatile and robust solution that adapts to diverse imaging conditions. By improving image quality and facilitating better downstream analyses, UniMIE has the potential to revolutionize clinical workflows and enhance diagnostic accuracy across a wide range of medical applications.

Mixed Modality Image Synthesis Methodology In Silico Academic Lab Breakthrough

Human-Guided Shade Artifact Suppression in CBCT-to-MDCT Translation via Schrödinger Bridge with Conditional Diffusion

Sung Ho Kang, Hyun-Cheol Park

•preprint•Jul 15 2025

We present a novel framework for CBCT-to-MDCT translation, grounded in the Schrodinger Bridge (SB) formulation, which integrates GAN-derived priors with human-guided conditional diffusion. Unlike conventional GANs or diffusion models, our approach explicitly enforces boundary consistency between CBCT inputs and pseudo targets, ensuring both anatomical fidelity and perceptual controllability. Binary human feedback is incorporated via classifier-free guidance (CFG), effectively steering the generative process toward clinically preferred outcomes. Through iterative refinement and tournament-based preference selection, the model internalizes human preferences without relying on a reward model. Subtraction image visualizations reveal that the proposed method selectively attenuates shade artifacts in key anatomical regions while preserving fine structural detail. Quantitative evaluations further demonstrate superior performance across RMSE, SSIM, LPIPS, and Dice metrics on clinical datasets -- outperforming prior GAN- and fine-tuning-based feedback methods -- while requiring only 10 sampling steps. These findings underscore the effectiveness and efficiency of our framework for real-time, preference-aligned medical image translation.

CT Image Synthesis Methodology In Silico Academic Lab Breakthrough

X-ray2CTPA: leveraging diffusion models to enhance pulmonary embolism classification.

Cahan N, Klang E, Aviram G, Barash Y, Konen E, Giryes R, Greenspan H

•papers•Jul 14 2025

Chest X-rays or chest radiography (CXR), commonly used for medical diagnostics, typically enables limited imaging compared to computed tomography (CT) scans, which offer more detailed and accurate three-dimensional data, particularly contrast-enhanced scans like CT Pulmonary Angiography (CTPA). However, CT scans entail higher costs, greater radiation exposure, and are less accessible than CXRs. In this work, we explore cross-modal translation from a 2D low contrast-resolution X-ray input to a 3D high contrast and spatial-resolution CTPA scan. Driven by recent advances in generative AI, we introduce a novel diffusion-based approach to this task. We employ the synthesized 3D images in a classification framework and show improved AUC in a Pulmonary Embolism (PE) categorization task, using the initial CXR input. Furthermore, we evaluate the model's performance using quantitative metrics, ensuring diagnostic relevance of the generated images. The proposed method is generalizable and capable of performing additional cross-modality translations in medical imaging. It may pave the way for more accessible and cost-effective advanced diagnostic tools. The code for this project is available: https://github.com/NoaCahan/X-ray2CTPA .

Mixed Modality Image Synthesis Chest Methodology In Silico Academic Lab Open Code GenAI

Filter Papers

Tags