Sort by:
Page 4 of 19185 results

Enhancing Synthetic Pelvic CT Generation from CBCT using Vision Transformer with Adaptive Fourier Neural Operators.

Bhaskara R, Oderinde OM

pubmed logopapersJul 28 2025
This study introduces a novel approach to improve Cone Beam CT (CBCT) image quality by developing a synthetic CT (sCT) generation method using CycleGAN with a Vision Transformer (ViT) and an Adaptive Fourier Neural Operator (AFNO). 

Approach: A dataset of 20 prostate cancer patients who received stereotactic body radiation therapy (SBRT) was used, consisting of paired CBCT and planning CT (pCT) images. The dataset was preprocessed by registering pCTs to CBCTs using deformation registration techniques, such as B-spline, followed by resampling to uniform voxel sizes and normalization. The model architecture integrates a CycleGAN with bidirectional generators, where the UNet generator is enhanced with a ViT at the bottleneck. AFNO functions as the attention mechanism for the ViT, operating on the input data in the Fourier domain. AFNO's innovations handle varying resolutions, mesh invariance, and efficient long-range dependency capture.

Main Results: Our model improved significantly in preserving anatomical details and capturing complex image dependencies. The AFNO mechanism processed global image information effectively, adapting to interpatient variations for accurate sCT generation. Evaluation metrics like Mean Absolute Error (MAE), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Normalized Cross Correlation (NCC), demonstrated the superiority of our method. Specifically, the model achieved an MAE of 9.71, PSNR of 37.08 dB, SSIM of 0.97, and NCC of 0.99, confirming its efficacy. 

Significance: The integration of AFNO within the CycleGAN UNet framework addresses Cone Beam CT image quality limitations. The model generates synthetic CTs that allow adaptive treatment planning during SBRT, enabling adjustments to the dose based on tumor response, thus reducing radiotoxicity from increased doses. This method's ability to preserve both global and local anatomical features shows potential for improving tumor targeting, adaptive radiotherapy planning, and clinical decision-making.

Dosimetric evaluation of synthetic kilo-voltage CT images generated from megavoltage CT for head and neck tomotherapy using a conditional GAN network.

Choghazardi Y, Tavakoli MB, Abedi I, Roayaei M, Hemati S, Shanei A

pubmed logopapersJul 28 2025
The lower image contrast of megavoltage computed tomography (MVCT), which corresponds to kilovoltage computed tomography (kVCT), can inhibit accurate dosimetric assessments. This study proposes a deep learning approach, specifically the pix2pix network, to generate high-quality synthetic kVCT (skVCT) images from MVCT data. The model was trained on a dataset of 25 paired patient images and evaluated on a test set of 15 paired images. We performed visual inspections to assess the quality of the generated skVCT images and calculated the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Dosimetric equivalence was evaluated by comparing the gamma pass rates of treatment plans derived from skVCT and kVCT images. Results showed that skVCT images exhibited significantly higher quality than MVCT images, with PSNR and SSIM values of 31.9 ± 1.1 dB and 94.8% ± 1.3%, respectively, compared to 26.8 ± 1.7 dB and 89.5% ± 1.5% for MVCT-to-kVCT comparisons. Furthermore, treatment plans based on skVCT images achieved excellent gamma pass rates of 99.78 ± 0.14% and 99.82 ± 0.20% for 2 mm/2% and 3 mm/3% criteria, respectively, comparable to those obtained from kVCT-based plans (99.70 ± 0.31% and 99.79 ± 1.32%). This study demonstrates the potential of pix2pix models for generating high-quality skVCT images, which could significantly enhance Adaptive Radiation Therapy (ART).

A novel multimodal medical image fusion model for Alzheimer's and glioma disease detection based on hybrid fusion strategies in non-subsampled shearlet transform domain.

Alabduljabbar A, Khan SU, Altherwy YN, Almarshad F, Alsuhaibani A

pubmed logopapersJul 27 2025
BackgroundMedical professionals may increase diagnostic accuracy using multimodal medical image fusion techniques to peer inside organs and tissues.ObjectiveThis research work aims to propose a solution for diverse medical diagnostic challenges.MethodsWe propose a dual-purpose model. Initially, we developed a pair of images using the intensity, hue, and saturation (IHS) approach. Next, we applied non-subsampled shearlet transform (NSST) decomposition to these images to obtain the low-frequency and high-frequency coefficients. We then enhanced the structure and background details of the low-frequency coefficients using a novel structure feature modification technique. For the high-frequency coefficients, we utilized the layer-weighted pulse coupled neural network fusion technique to acquire complementary pixel-level information. Finally, we employed reversed NSST and IHS to generate the fused resulting image.ResultsThe proposed approach has been verified on 1350 image sets from two different diseases, Alzheimer's and glioma, across numerous imaging modalities. Our proposed method beats existing cutting-edge models, as proven by both qualitative and quantitative evaluations, and provides valuable information for medical diagnosis. In the majority of cases, our proposed method performed well in terms of entropy, structure similarity index, standard deviation, average distance, and average pixel intensity due to the careful selection of unique fusion strategies in our model. However, in a few cases, NSSTSIPCA performs better than our proposed work in terms of intensity variations (mean absolute error and average distance).ConclusionsThis research work utilized various fusion strategies in the NSST domain to efficiently enhance structural, anatomical, and spectral information.

Unpaired T1-weighted MRI synthesis from T2-weighted data using unsupervised learning.

Zhao J, Zeng N, Zhao L, Li N

pubmed logopapersJul 27 2025
Magnetic Resonance Imaging (MRI) is indispensable for modern diagnostics because of its detailed anatomical and functional information without the use of ionizing radiation. However, acquiring multiple imaging sequences - such as T1-weighted (T1w) and T2-weighted (T2w) scans - can prolong scan times, increase patient discomfort, and raise healthcare costs. In this study, we propose an unsupervised framework based on a contrast-sensitive domain translation network with adaptive feature normalization to translate unpaired T2w MRI images into clinically acceptable T1w images. Our method employs adversarial training, along with cycle consistency, identity, and attention-guided loss functions. These components ensure that the generated images not only preserve essential anatomical details but also exhibit high visual fidelity compared to ground truth T1w images. Quantitative evaluation on a publicly available MRI dataset yielded a mean Peak Signal-to-Noise Ratio (PSNR) of 22.403 dB, a mean Structural Similarity Index (SSIM) of 0.775, Root Mean Squared Error (RMSE) of 0.078, and Mean Absolute Error (MAE) of 0.036. Additional analysis of pixel intensity and grayscale distributions further supported the consistency between the generated and ground truth images. Qualitative assessment included visual comparison to assess perceptual fidelity. These promising results suggest that a contrast-sensitive domain translation network with an adaptive feature normalization framework can effectively generate realistic T1w images from T2w inputs, potentially reducing the need for acquiring multiple sequences and thereby streamlining MRI protocols.

KC-UNIT: Multi-kernel conversion using unpaired image-to-image translation with perceptual guidance in chest computed tomography imaging.

Choi C, Kim D, Park S, Lee H, Kim H, Lee SM, Kim N

pubmed logopapersJul 26 2025
Computed tomography (CT) images are reconstructed from raw datasets including sinogram using various convolution kernels through back projection. Kernels are typically chosen depending on the anatomical structure being imaged and the specific purpose of the scan, balancing the trade-off between image sharpness and pixel noise. Generally, a sinogram requires large storage capacity, and storage space is often limited in clinical settings. Thus, CT images are generally reconstructed with only one specific kernel in clinical settings, and the sinogram is typically discarded after a week. Therefore, many researchers have proposed deep learning-based image-to-image translation methods for CT kernel conversion. However, transferring the style of the target kernel while preserving anatomical structure remains challenging, particularly when translating CT images from a source domain to a target domain in an unpaired manner, which is often encountered in real-world settings. Thus, we propose a novel kernel conversion method using unpaired image-to-image translation (KC-UNIT). This approach utilizes discriminator regularization, using feature maps from the generator to improve semantic representation learning. To capture content and style features, cosine similarity content and contrastive style losses were defined between the feature map of generator and semantic label map of discriminator. This can be easily incorporated by modifying the discriminator's architecture without requiring any additional learnable or pre-trained networks. The KC-UNIT demonstrated the ability to preserve fine-grained anatomical structure from the source domain during transfer. Our method outperformed existing generative adversarial network-based methods across most kernel conversion methods in three kernel domains. The code is available at https://github.com/cychoi97/KC-UNIT.

XVertNet: Unsupervised Contrast Enhancement of Vertebral Structures with Dynamic Self-Tuning Guidance and Multi-Stage Analysis.

Eidlin E, Hoogi A, Rozen H, Badarne M, Netanyahu NS

pubmed logopapersJul 25 2025
Chest X-ray is one of the main diagnostic tools in emergency medicine, yet its limited ability to capture fine anatomical details can result in missed or delayed diagnoses. To address this, we introduce XVertNet, a novel deep-learning framework designed to enhance vertebral structure visualization in X-ray images significantly. Our framework introduces two key innovations: (1) an unsupervised learning architecture that eliminates reliance on manually labeled training data-a persistent bottleneck in medical imaging, and (2) a dynamic self-tuned internal guidance mechanism featuring an adaptive feedback loop for real-time image optimization. Extensive validation across four major public datasets revealed that XVertNet outperforms state-of-the-art enhancement methods, as demonstrated by improvements in evaluation measures such as entropy, the Tenengrad criterion, LPC-SI, TMQI, and PIQE. Furthermore, clinical validation conducted by two board-certified clinicians confirmed that the enhanced images enabled more sensitive examination of vertebral structural changes. The unsupervised nature of XVertNet facilitates immediate clinical deployment without requiring additional training overhead. This innovation represents a transformative advancement in emergency radiology, providing a scalable and time-efficient solution to enhance diagnostic accuracy in high-pressure clinical environments.

Reconstruct or Generate: Exploring the Spectrum of Generative Modeling for Cardiac MRI

Niklas Bubeck, Yundi Zhang, Suprosanna Shit, Daniel Rueckert, Jiazhen Pan

arxiv logopreprintJul 25 2025
In medical imaging, generative models are increasingly relied upon for two distinct but equally critical tasks: reconstruction, where the goal is to restore medical imaging (usually inverse problems like inpainting or superresolution), and generation, where synthetic data is created to augment datasets or carry out counterfactual analysis. Despite shared architecture and learning frameworks, they prioritize different goals: generation seeks high perceptual quality and diversity, while reconstruction focuses on data fidelity and faithfulness. In this work, we introduce a "generative model zoo" and systematically analyze how modern latent diffusion models and autoregressive models navigate the reconstruction-generation spectrum. We benchmark a suite of generative models across representative cardiac medical imaging tasks, focusing on image inpainting with varying masking ratios and sampling strategies, as well as unconditional image generation. Our findings show that diffusion models offer superior perceptual quality for unconditional generation but tend to hallucinate as masking ratios increase, whereas autoregressive models maintain stable perceptual performance across masking levels, albeit with generally lower fidelity.

Joint Holistic and Lesion Controllable Mammogram Synthesis via Gated Conditional Diffusion Model

Xin Li, Kaixiang Yang, Qiang Li, Zhiwei Wang

arxiv logopreprintJul 25 2025
Mammography is the most commonly used imaging modality for breast cancer screening, driving an increasing demand for deep-learning techniques to support large-scale analysis. However, the development of accurate and robust methods is often limited by insufficient data availability and a lack of diversity in lesion characteristics. While generative models offer a promising solution for data synthesis, current approaches often fail to adequately emphasize lesion-specific features and their relationships with surrounding tissues. In this paper, we propose Gated Conditional Diffusion Model (GCDM), a novel framework designed to jointly synthesize holistic mammogram images and localized lesions. GCDM is built upon a latent denoising diffusion framework, where the noised latent image is concatenated with a soft mask embedding that represents breast, lesion, and their transitional regions, ensuring anatomical coherence between them during the denoising process. To further emphasize lesion-specific features, GCDM incorporates a gated conditioning branch that guides the denoising process by dynamically selecting and fusing the most relevant radiomic and geometric properties of lesions, effectively capturing their interplay. Experimental results demonstrate that GCDM achieves precise control over small lesion areas while enhancing the realism and diversity of synthesized mammograms. These advancements position GCDM as a promising tool for clinical applications in mammogram synthesis. Our code is available at https://github.com/lixinHUST/Gated-Conditional-Diffusion-Model/

Semantics versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification

Yuan Tian, Shuo Wang, Rongzhao Zhang, Zijian Chen, Yankai Jiang, Chunyi Li, Xiangyang Zhu, Fang Yan, Qiang Hu, XiaoSong Wang, Guangtao Zhai

arxiv logopreprintJul 25 2025
Medical imaging has significantly advanced computer-aided diagnosis, yet its re-identification (ReID) risks raise critical privacy concerns, calling for de-identification (DeID) techniques. Unfortunately, existing DeID methods neither particularly preserve medical semantics, nor are flexibly adjustable towards different privacy levels. To address these issues, we propose a divide-and-conquer framework comprising two steps: (1) Identity-Blocking, which blocks varying proportions of identity-related regions, to achieve different privacy levels; and (2) Medical-Semantics-Compensation, which leverages pre-trained Medical Foundation Models (MFMs) to extract medical semantic features to compensate the blocked regions. Moreover, recognizing that features from MFMs may still contain residual identity information, we introduce a Minimum Description Length principle-based feature decoupling strategy, to effectively decouple and discard such identity components. Extensive evaluations against existing approaches across seven datasets and three downstream tasks, demonstrates our state-of-the-art performance.

RealDeal: Enhancing Realism and Details in Brain Image Generation via Image-to-Image Diffusion Models

Shen Zhu, Yinzhu Jin, Tyler Spears, Ifrah Zawar, P. Thomas Fletcher

arxiv logopreprintJul 24 2025
We propose image-to-image diffusion models that are designed to enhance the realism and details of generated brain images by introducing sharp edges, fine textures, subtle anatomical features, and imaging noise. Generative models have been widely adopted in the biomedical domain, especially in image generation applications. Latent diffusion models achieve state-of-the-art results in generating brain MRIs. However, due to latent compression, generated images from these models are overly smooth, lacking fine anatomical structures and scan acquisition noise that are typically seen in real images. This work formulates the realism enhancing and detail adding process as image-to-image diffusion models, which refines the quality of LDM-generated images. We employ commonly used metrics like FID and LPIPS for image realism assessment. Furthermore, we introduce new metrics to demonstrate the realism of images generated by RealDeal in terms of image noise distribution, sharpness, and texture.
Page 4 of 19185 results
Show
per page

Ready to Sharpen Your Edge?

Join hundreds of your peers who rely on RadAI Slice. Get the essential weekly briefing that empowers you to navigate the future of radiology.

We respect your privacy. Unsubscribe at any time.