Sort by:
Page 1 of 984 results
Next

Contrast-enhanced image synthesis using latent diffusion model for precise online tumor delineation in MRI-guided adaptive radiotherapy for brain metastases.

Ma X, Ma Y, Wang Y, Li C, Liu Y, Chen X, Dai J, Bi N, Men K

pubmed logopapersJun 25 2025
&#xD;Magnetic resonance imaging-guided adaptive radiotherapy (MRIgART) is a promising technique for long-course RT of large-volume brain metastasis (BM), due to the capacity to track tumor changes throughout treatment course. Contrast-enhanced T1-weighted (T1CE) MRI is essential for BM delineation, yet is often unavailable during online treatment concerning the requirement of contrast agent injection. This study aims to develop a synthetic T1CE (sT1CE) generation method to facilitate accurate online adaptive BM delineation.&#xD;Approach:&#xD;We developed a novel ControlNet-coupled latent diffusion model (CTN-LDM) combined with a personalized transfer learning strategy and a denoising diffusion implicit model (DDIM) inversion method to generate high quality sT1CE images from online T2-weighted (T2) or fluid attenuated inversion recovery (FLAIR) images. Visual quality of sT1CE images generated by the CTN-LDM was compared with classical deep learning models. BM delineation results using the combination of our sT1CE images and online T2/FLAIR images were compared with the results solely using online T2/FLAIR images, which is the current clinical method.&#xD;Main results:&#xD;Visual quality of sT1CE images from our CTN-LDM was superior to classical models both quantitatively and qualitatively. Leveraging sT1CE images, radiation oncologists achieved significant higher precision of adaptive BM delineation, with average Dice similarity coefficient of 0.93 ± 0.02 vs. 0.86 ± 0.04 (p < 0.01), compared with only using online T2/FLAIR images. &#xD;Significance:&#xD;The proposed method could generate high quality sT1CE images and significantly improve accuracy of online adaptive tumor delineation for long-course MRIgART of large-volume BM, potentially enhancing treatment outcomes and minimizing toxicity.

Med-Art: Diffusion Transformer for 2D Medical Text-to-Image Generation

Changlu Guo, Anders Nymark Christensen, Morten Rieger Hannemose

arxiv logopreprintJun 25 2025
Text-to-image generative models have achieved remarkable breakthroughs in recent years. However, their application in medical image generation still faces significant challenges, including small dataset sizes, and scarcity of medical textual data. To address these challenges, we propose Med-Art, a framework specifically designed for medical image generation with limited data. Med-Art leverages vision-language models to generate visual descriptions of medical images which overcomes the scarcity of applicable medical textual data. Med-Art adapts a large-scale pre-trained text-to-image model, PixArt-$\alpha$, based on the Diffusion Transformer (DiT), achieving high performance under limited data. Furthermore, we propose an innovative Hybrid-Level Diffusion Fine-tuning (HLDF) method, which enables pixel-level losses, effectively addressing issues such as overly saturated colors. We achieve state-of-the-art performance on two medical image datasets, measured by FID, KID, and downstream classification performance.

BronchoGAN: anatomically consistent and domain-agnostic image-to-image translation for video bronchoscopy.

Soliman A, Keuth R, Himstedt M

pubmed logopapersJun 25 2025
Purpose The limited availability of bronchoscopy images makes image synthesis particularly interesting for training deep learning models. Robust image translation across different domains-virtual bronchoscopy, phantom as well as in vivo and ex vivo image data-is pivotal for clinical applications. Methods This paper proposes BronchoGAN introducing anatomical constraints for image-to-image translation being integrated into a conditional GAN. In particular, we force bronchial orifices to match across input and output images. We further propose to use foundation model-generated depth images as intermediate representation ensuring robustness across a variety of input domains establishing models with substantially less reliance on individual training datasets. Moreover, our intermediate depth image representation allows to easily construct paired image data for training. Results Our experiments showed that input images from different domains (e.g., virtual bronchoscopy, phantoms) can be successfully translated to images mimicking realistic human airway appearance. We demonstrated that anatomical settings (i.e., bronchial orifices) can be robustly preserved with our approach which is shown qualitatively and quantitatively by means of improved FID, SSIM and dice coefficients scores. Our anatomical constraints enabled an improvement in the Dice coefficient of up to 0.43 for synthetic images. Conclusion Through foundation models for intermediate depth representations and bronchial orifice segmentation integrated as anatomical constraints into conditional GANs, we are able to robustly translate images from different bronchoscopy input domains. BronchoGAN allows to incorporate public CT scan data (virtual bronchoscopy) in order to generate large-scale bronchoscopy image datasets with realistic appearance. BronchoGAN enables to bridge the gap of missing public bronchoscopy images.

Generalizable medical image enhancement using structure-preserved diffusion models.

Chen L, Yu X, Li H, Lin H, Niu K, Li H

pubmed logopapersJun 25 2025
Clinical medical images often suffer from compromised quality, which negatively impacts the diagnostic process by both clinicians and AI algorithms. While GAN-based enhancement methods have been commonly developed in recent years, delicate model training is necessary due to issues with artifacts, mode collapse, and instability. Diffusion models have shown promise in generating high-quality images superior to GANs, but challenges in training data collection and domain gaps hinder applying them for medical image enhancement. Additionally, preserving fine structures in enhancing medical images with diffusion models is still an area that requires further exploration. To overcome these challenges, we propose structure-preserved diffusion models for generalizable medical image enhancement (GEDM). GEDM leverages joint supervision from enhancement and segmentation to boost structure preservation and generalizability. Specifically, synthetic data is used to collect high-low quality paired training data with structure masks, and the Laplace transform is employed to reduce domain gaps and introduce multi-scale conditions. GEDM conducts medical image enhancement and segmentation jointly, supervised by high-quality references and structure masks from the training data. Four datasets of two medical imaging modalities were collected to implement the experiments, where GEDM outperformed state-of-the-art methods in image enhancement, as well as follow-up medical analysis tasks.

Angio-Diff: Learning a Self-Supervised Adversarial Diffusion Model for Angiographic Geometry Generation

Zhifeng Wang, Renjiao Yi, Xin Wen, Chenyang Zhu, Kai Xu, Kunlun He

arxiv logopreprintJun 24 2025
Vascular diseases pose a significant threat to human health, with X-ray angiography established as the gold standard for diagnosis, allowing for detailed observation of blood vessels. However, angiographic X-rays expose personnel and patients to higher radiation levels than non-angiographic X-rays, which are unwanted. Thus, modality translation from non-angiographic to angiographic X-rays is desirable. Data-driven deep approaches are hindered by the lack of paired large-scale X-ray angiography datasets. While making high-quality vascular angiography synthesis crucial, it remains challenging. We find that current medical image synthesis primarily operates at pixel level and struggles to adapt to the complex geometric structure of blood vessels, resulting in unsatisfactory quality of blood vessel image synthesis, such as disconnections or unnatural curvatures. To overcome this issue, we propose a self-supervised method via diffusion models to transform non-angiographic X-rays into angiographic X-rays, mitigating data shortages for data-driven approaches. Our model comprises a diffusion model that learns the distribution of vascular data from diffusion latent, a generator for vessel synthesis, and a mask-based adversarial module. To enhance geometric accuracy, we propose a parametric vascular model to fit the shape and distribution of blood vessels. The proposed method contributes a pipeline and a synthetic dataset for X-ray angiography. We conducted extensive comparative and ablation experiments to evaluate the Angio-Diff. The results demonstrate that our method achieves state-of-the-art performance in synthetic angiography image quality and more accurately synthesizes the geometric structure of blood vessels. The code is available at https://github.com/zfw-cv/AngioDiff.

Enabling PSO-Secure Synthetic Data Sharing Using Diversity-Aware Diffusion Models

Mischa Dombrowski, Bernhard Kainz

arxiv logopreprintJun 22 2025
Synthetic data has recently reached a level of visual fidelity that makes it nearly indistinguishable from real data, offering great promise for privacy-preserving data sharing in medical imaging. However, fully synthetic datasets still suffer from significant limitations: First and foremost, the legal aspect of sharing synthetic data is often neglected and data regulations, such as the GDPR, are largley ignored. Secondly, synthetic models fall short of matching the performance of real data, even for in-domain downstream applications. Recent methods for image generation have focused on maximising image diversity instead of fidelity solely to improve the mode coverage and therefore the downstream performance of synthetic data. In this work, we shift perspective and highlight how maximizing diversity can also be interpreted as protecting natural persons from being singled out, which leads to predicate singling-out (PSO) secure synthetic datasets. Specifically, we propose a generalisable framework for training diffusion models on personal data which leads to unpersonal synthetic datasets achieving performance within one percentage point of real-data models while significantly outperforming state-of-the-art methods that do not ensure privacy. Our code is available at https://github.com/MischaD/Trichotomy.

Generative deep-learning-model based contrast enhancement for digital subtraction angiography using a text-conditioned image-to-image model.

Takata T, Yamada K, Yamamoto M, Kondo H

pubmed logopapersJun 20 2025
Digital subtraction angiography (DSA) is an essential imaging technique in interventional radiology, enabling detailed visualization of blood vessels by subtracting pre- and post-contrast images. However, reduced contrast, either accidental or intentional, can impair the clarity of vascular structures. This issue becomes particularly critical in patients with chronic kidney disease (CKD), where minimizing iodinated contrast is necessary to reduce the risk of contrast-induced nephropathy (CIN). This study explored the potential of using a generative deep-learning-model based contrast enhancement technique for DSA. A text-conditioned image-to-image model was developed using Stable Diffusion, augmented with ControlNet to reduce hallucinations and Low-Rank Adaptation for model fine-tuning. A total of 1207 DSA series were used for training and testing, with additional low-contrast images generated through data augmentation. The model was trained using tagged text labels and evaluated using metrics such as Root Mean Square (RMS) contrast, Michelson contrast, signal-to-noise ratio (SNR), and entropy. Evaluation results indicated significant improvements, with RMS contrast, Michelson contrast, and entropy respectively increased from 7.91 to 17.7, 0.875 to 0.992, and 3.60 to 5.60, reflecting enhanced detail. However, SNR decreased from 21.3 to 8.50, indicating increased noise. This study demonstrated the feasibility of deep learning-based contrast enhancement for DSA images and highlights the potential for generative deep-learning-model to improve angiographic imaging. Further refinements, particularly in artifact suppression and clinical validation, are necessary for practical implementation in medical settings.

D2Diff : A Dual Domain Diffusion Model for Accurate Multi-Contrast MRI Synthesis

Sanuwani Dayarathna, Himashi Peiris, Kh Tohidul Islam, Tien-Tsin Wong, Zhaolin Chen

arxiv logopreprintJun 18 2025
Multi contrast MRI synthesis is inherently challenging due to the complex and nonlinear relationships among different contrasts. Each MRI contrast highlights unique tissue properties, but their complementary information is difficult to exploit due to variations in intensity distributions and contrast specific textures. Existing methods for multi contrast MRI synthesis primarily utilize spatial domain features, which capture localized anatomical structures but struggle to model global intensity variations and distributed patterns. Conversely, frequency domain features provide structured inter contrast correlations but lack spatial precision, limiting their ability to retain finer details. To address this, we propose a dual domain learning framework that integrates spatial and frequency domain information across multiple MRI contrasts for enhanced synthesis. Our method employs two mutually trained denoising networks, one conditioned on spatial domain and the other on frequency domain contrast features through a shared critic network. Additionally, an uncertainty driven mask loss directs the models focus toward more critical regions, further improving synthesis accuracy. Extensive experiments show that our method outperforms SOTA baselines, and the downstream segmentation performance highlights the diagnostic value of the synthetic results.

DM-FNet: Unified multimodal medical image fusion via diffusion process-trained encoder-decoder

Dan He, Weisheng Li, Guofen Wang, Yuping Huang, Shiqiang Liu

arxiv logopreprintJun 18 2025
Multimodal medical image fusion (MMIF) extracts the most meaningful information from multiple source images, enabling a more comprehensive and accurate diagnosis. Achieving high-quality fusion results requires a careful balance of brightness, color, contrast, and detail; this ensures that the fused images effectively display relevant anatomical structures and reflect the functional status of the tissues. However, existing MMIF methods have limited capacity to capture detailed features during conventional training and suffer from insufficient cross-modal feature interaction, leading to suboptimal fused image quality. To address these issues, this study proposes a two-stage diffusion model-based fusion network (DM-FNet) to achieve unified MMIF. In Stage I, a diffusion process trains UNet for image reconstruction. UNet captures detailed information through progressive denoising and represents multilevel data, providing a rich set of feature representations for the subsequent fusion network. In Stage II, noisy images at various steps are input into the fusion network to enhance the model's feature recognition capability. Three key fusion modules are also integrated to process medical images from different modalities adaptively. Ultimately, the robust network structure and a hybrid loss function are integrated to harmonize the fused image's brightness, color, contrast, and detail, enhancing its quality and information density. The experimental results across various medical image types demonstrate that the proposed method performs exceptionally well regarding objective evaluation metrics. The fused image preserves appropriate brightness, a comprehensive distribution of radioactive tracers, rich textures, and clear edges. The code is available at https://github.com/HeDan-11/DM-FNet.

Frequency-Calibrated Membership Inference Attacks on Medical Image Diffusion Models

Xinkai Zhao, Yuta Tokuoka, Junichiro Iwasawa, Keita Oda

arxiv logopreprintJun 17 2025
The increasing use of diffusion models for image generation, especially in sensitive areas like medical imaging, has raised significant privacy concerns. Membership Inference Attack (MIA) has emerged as a potential approach to determine if a specific image was used to train a diffusion model, thus quantifying privacy risks. Existing MIA methods often rely on diffusion reconstruction errors, where member images are expected to have lower reconstruction errors than non-member images. However, applying these methods directly to medical images faces challenges. Reconstruction error is influenced by inherent image difficulty, and diffusion models struggle with high-frequency detail reconstruction. To address these issues, we propose a Frequency-Calibrated Reconstruction Error (FCRE) method for MIAs on medical image diffusion models. By focusing on reconstruction errors within a specific mid-frequency range and excluding both high-frequency (difficult to reconstruct) and low-frequency (less informative) regions, our frequency-selective approach mitigates the confounding factor of inherent image difficulty. Specifically, we analyze the reverse diffusion process, obtain the mid-frequency reconstruction error, and compute the structural similarity index score between the reconstructed and original images. Membership is determined by comparing this score to a threshold. Experiments on several medical image datasets demonstrate that our FCRE method outperforms existing MIA methods.
Page 1 of 984 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.