Page 16 of 19185 results

Optimizing MR-based attenuation correction in hybrid PET/MR using deep learning: validation with a flatbed insert and consistent patient positioning.

Wang H, Wang Y, Xue Q, Zhang Y, Qiao X, Lin Z, Zheng J, Zhang Z, Yang Y, Zhang M, Huang Q, Huang Y, Cao T, Wang J, Li B

PubMed · Jun 1 2025
To address the challenges of verifying MR-based attenuation correction (MRAC) in PET/MR due to CT positional mismatches and alignment issues, this study utilized a flatbed insert and arms-down positioning during PET/CT scans to achieve precise MR-CT matching for accurate MRAC evaluation. A validation dataset of 21 patients underwent whole-body [18F]FDG PET/CT followed by [18F]FDG PET/MR. A flatbed insert ensured consistent positioning, allowing direct comparison of four MRAC methods (four-tissue and five-tissue models with discrete and continuous μ-maps) against CT-based attenuation correction (CTAC). A deep learning-based framework, trained on a dataset of 300 patients, was used to generate synthesized CTs from MR images, forming the basis for all MRAC methods. Quantitative analyses were conducted at the whole-body, region-of-interest, and lesion levels, with lesion-distance analysis evaluating the impact of bone proximity on standardized uptake value (SUV) quantification. Distinct differences were observed among MRAC methods in the spine and femur regions. Joint histogram analysis showed that MRAC-4 (continuous μ-map) aligned closely with CTAC. Lesion-distance analysis revealed that MRAC-4 minimized bone-induced SUV interference (r = 0.01, p = 0.8643). However, tissues prone to bone segmentation interference, such as the spine and liver, exhibited greater SUV variability and lower reproducibility with MRAC-4 than with MRAC-2 (2D bone segmentation, discrete μ-map) and MRAC-3 (3D bone segmentation, discrete μ-map). Using a flatbed insert, this study validated MRAC with high precision. The continuous μ-map method (MRAC-4) demonstrated superior accuracy and minimized bone-related SUV errors but faced challenges in reproducibility, particularly in bone-rich regions.
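As a point of reference for the region- and lesion-level comparisons described above, the sketch below shows one way the per-region SUV bias against CTAC and the lesion-distance correlation could be computed. The array names, region-label encoding, and precomputed lesion columns are illustrative assumptions, not the authors' code.

```python
# Hedged sketch: region-level MRAC-vs-CTAC SUV bias and lesion-distance analysis.
import numpy as np
from scipy.stats import pearsonr

def regional_suv_bias(pet_mrac, pet_ctac, region_mask, region_labels):
    """Mean percent SUV deviation of MRAC-corrected PET vs. the CTAC reference,
    evaluated inside each labelled region of interest."""
    bias = {}
    for name, label in region_labels.items():
        roi = region_mask == label
        ref = pet_ctac[roi].mean()
        bias[name] = 100.0 * (pet_mrac[roi].mean() - ref) / ref
    return bias

def lesion_distance_correlation(suv_error, bone_distance_mm):
    """Correlate per-lesion SUV error with distance to the nearest bone voxel
    (both assumed precomputed); the paper reports r = 0.01, p = 0.8643 for MRAC-4."""
    r, p = pearsonr(bone_distance_mm, suv_error)
    return r, p
```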

Eliminating the second CT scan of dual-tracer total-body PET/CT via deep learning-based image synthesis and registration.

Lin Y, Wang K, Zheng Z, Yu H, Chen S, Tang W, He Y, Gao H, Yang R, Xie Y, Yang J, Hou X, Wang S, Shi H

PubMed · Jun 1 2025
This study aims to develop and validate a deep learning framework designed to eliminate the second CT scan of dual-tracer total-body PET/CT imaging. We retrospectively included three cohorts of 247 patients who underwent dual-tracer total-body PET/CT imaging on two separate days (time interval: 1-11 days). Of these, 167 underwent [68Ga]Ga-DOTATATE/[18F]FDG, 50 underwent [68Ga]Ga-PSMA-11/[18F]FDG, and 30 underwent [68Ga]Ga-FAPI-04/[18F]FDG. A deep learning framework was developed that integrates a registration generative adversarial network (RegGAN) with non-rigid registration techniques. This approach allows for the transformation of attenuation-correction CT (ACCT) images from the first scan into pseudo-ACCT images for the second scan, which are then used for attenuation and scatter correction (ASC) of the second tracer's PET images. Additionally, the derived registration transform facilitates dual-tracer image fusion and analysis. The deep learning-based ASC PET images were evaluated using quantitative metrics, including mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM), across the whole body and specific regions. Furthermore, the quantitative accuracy of PET images was assessed by calculating standardized uptake value (SUV) bias in normal organs and lesions. The MAE for whole-body pseudo-ACCT images ranged from 97.64 to 112.59 HU across the four tracers. The deep learning-based ASC PET images demonstrated high similarity to the ground-truth PET images. The MAE of SUV for whole-body PET images was 0.06 for [68Ga]Ga-DOTATATE, 0.08 for [68Ga]Ga-PSMA-11, 0.06 for [68Ga]Ga-FAPI-04, and 0.05 for [18F]FDG. Additionally, the median absolute percent deviation of SUV was less than 2.6% for all normal organs, while the mean absolute percent deviation of SUV was less than 3.6% for lesions across the four tracers. The proposed deep learning framework, combining RegGAN and non-rigid registration, shows promise in reducing CT radiation dose for dual-tracer total-body PET/CT imaging, with successful validation across multiple tracers.
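A minimal sketch of the image-similarity metrics (MAE, PSNR, SSIM) used here to grade pseudo-ACCT volumes against the acquired second-day CT, assuming NumPy arrays in Hounsfield units; the function name and HU window are assumptions for illustration, not taken from the paper.

```python
# Hedged sketch: CT-to-CT similarity metrics on whole-body volumes.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def ct_similarity(pseudo_ct, real_ct, hu_range=(-1024.0, 3071.0)):
    data_range = hu_range[1] - hu_range[0]
    mae = np.mean(np.abs(pseudo_ct - real_ct))                      # in HU
    psnr = peak_signal_noise_ratio(real_ct, pseudo_ct, data_range=data_range)
    ssim = structural_similarity(real_ct, pseudo_ct, data_range=data_range)
    return {"MAE_HU": mae, "PSNR_dB": psnr, "SSIM": ssim}
```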

Text-to-CT Generation via 3D Latent Diffusion Model with Contrastive Vision-Language Pretraining

Daniele Molino, Camillo Maria Caruso, Filippo Ruffini, Paolo Soda, Valerio Guarrasi

arXiv preprint · May 31 2025
Objective: While recent advances in text-conditioned generative models have enabled the synthesis of realistic medical images, progress has been largely confined to 2D modalities such as chest X-rays. Extending text-to-image generation to volumetric Computed Tomography (CT) remains a significant challenge, due to its high dimensionality, anatomical complexity, and the absence of robust frameworks that align vision-language data in 3D medical imaging. Methods: We introduce a novel architecture for Text-to-CT generation that combines a latent diffusion model with a 3D contrastive vision-language pretraining scheme. Our approach leverages a dual-encoder CLIP-style model trained on paired CT volumes and radiology reports to establish a shared embedding space, which serves as the conditioning input for generation. CT volumes are compressed into a low-dimensional latent space via a pretrained volumetric VAE, enabling efficient 3D denoising diffusion without requiring external super-resolution stages. Results: We evaluate our method on the CT-RATE dataset and conduct a comprehensive assessment of image fidelity, clinical relevance, and semantic alignment. Our model achieves competitive performance across all tasks, significantly outperforming prior baselines for text-to-CT generation. Moreover, we demonstrate that CT scans synthesized by our framework can effectively augment real data, improving downstream diagnostic performance. Conclusion: Our results show that modality-specific vision-language alignment is a key component for high-quality 3D medical image generation. By integrating contrastive pretraining and volumetric diffusion, our method offers a scalable and controllable solution for synthesizing clinically meaningful CT volumes from text, paving the way for new applications in data augmentation, medical education, and automated clinical simulation.
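The contrastive pretraining stage pairs a volumetric CT encoder with a report encoder in a shared embedding space. The sketch below shows a generic symmetric InfoNCE objective of the kind a CLIP-style dual encoder uses; the encoder outputs, batch construction, and temperature value are assumptions, not the authors' implementation.

```python
# Minimal sketch of a CLIP-style contrastive objective over paired
# CT-volume and report embeddings.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(ct_emb, txt_emb, temperature=0.07):
    # ct_emb, txt_emb: (B, D) outputs of the volumetric and text encoders
    ct_emb = F.normalize(ct_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = ct_emb @ txt_emb.t() / temperature        # (B, B) similarity matrix
    targets = torch.arange(ct_emb.size(0), device=ct_emb.device)
    loss_i2t = F.cross_entropy(logits, targets)        # CT -> report direction
    loss_t2i = F.cross_entropy(logits.t(), targets)    # report -> CT direction
    return 0.5 * (loss_i2t + loss_t2i)
```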

ABCDEFGH: An Adaptation-Based Convolutional Neural Network-CycleGAN Disease-Courses Evolution Framework Using Generative Models in Health Education

Ruiming Min, Minghao Liu

arXiv preprint · May 31 2025
With the advancement of modern medicine and the development of technologies such as MRI, CT, and cellular analysis, it has become increasingly critical for clinicians to accurately interpret various diagnostic images. However, modern medical education often faces challenges due to limited access to high-quality teaching materials, stemming from privacy concerns and a shortage of educational resources (Balogh et al., 2015). In this context, image data generated by machine learning models, particularly generative models, presents a promising solution. These models can create diverse and comparable imaging datasets without compromising patient privacy, thereby supporting modern medical education. In this study, we explore the use of convolutional neural networks (CNNs) and CycleGAN (Zhu et al., 2017) for generating synthetic medical images. The source code is available at https://github.com/mliuby/COMP4211-Project.
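For readers unfamiliar with CycleGAN, the sketch below illustrates its cycle-consistency term (Zhu et al., 2017), which keeps unpaired translation anchored to the source image; the generator modules and weighting are placeholders, not the code from the linked repository.

```python
# Hedged sketch of the CycleGAN cycle-consistency loss.
import torch.nn as nn

def cycle_consistency_loss(G, F_inv, real_a, real_b, lam=10.0):
    # G maps domain A -> B, F_inv maps domain B -> A (stand-in generators)
    l1 = nn.L1Loss()
    rec_a = F_inv(G(real_a))     # A -> B -> A reconstruction
    rec_b = G(F_inv(real_b))     # B -> A -> B reconstruction
    return lam * (l1(rec_a, real_a) + l1(rec_b, real_b))
```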

pyMEAL: A Multi-Encoder Augmentation-Aware Learning for Robust and Generalizable Medical Image Translation

Abdul-mojeed Olabisi Ilyas, Adeleke Maradesa, Jamal Banzi, Jianpan Huang, Henry K. F. Mak, Kannie W. Y. Chan

arXiv preprint · May 30 2025
Medical imaging is critical for diagnostics, but clinical adoption of advanced AI-driven imaging faces challenges due to patient variability, image artifacts, and limited model generalization. While deep learning has transformed image analysis, 3D medical imaging still suffers from data scarcity and inconsistencies due to acquisition protocols, scanner differences, and patient motion. Traditional augmentation uses a single pipeline for all transformations, disregarding the unique traits of each augmentation and struggling with large data volumes. To address these challenges, we propose a Multi-encoder Augmentation-Aware Learning (MEAL) framework that leverages four distinct augmentation variants processed through dedicated encoders. Three fusion strategies, concatenation (CC), fusion layer (FL), and adaptive controller block (BD), are integrated to build multi-encoder models that combine augmentation-specific features before decoding. MEAL-BD uniquely preserves augmentation-aware representations, enabling robust, protocol-invariant feature learning. As demonstrated in a Computed Tomography (CT)-to-T1-weighted Magnetic Resonance Imaging (MRI) translation study, MEAL-BD consistently achieved the best performance on both unseen and predefined test data. On both geometrically transformed inputs (such as rotations and flips) and non-augmented inputs, MEAL-BD outperformed the competing methods, achieving higher mean peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) scores. These results establish MEAL as a reliable framework for preserving structural fidelity and generalizing across clinically relevant variability. By reframing augmentation as a source of diverse, generalizable features, MEAL supports robust, protocol-invariant learning, advancing clinically reliable medical imaging solutions.
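A minimal sketch of the concatenation (CC) fusion variant, assuming one encoder per augmentation view and channel-wise concatenation before a shared decoder; module shapes and the number of views are assumptions rather than the released pyMEAL code.

```python
# Hedged sketch: multi-encoder model with concatenation (CC) fusion.
import torch
import torch.nn as nn

class MultiEncoderCC(nn.Module):
    def __init__(self, encoder_fn, decoder, n_views=4):
        super().__init__()
        # one dedicated encoder per augmentation variant
        self.encoders = nn.ModuleList([encoder_fn() for _ in range(n_views)])
        self.decoder = decoder  # expects the channel-concatenated feature map

    def forward(self, views):
        # views: list of n_views augmented versions of the same input volume
        feats = [enc(v) for enc, v in zip(self.encoders, views)]
        fused = torch.cat(feats, dim=1)   # concatenate along the channel axis
        return self.decoder(fused)
```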

RNN-AHF Framework: Enhancing Multi-focal Nature of Hypoxic Ischemic Encephalopathy Lesion Region in MRI Image Using Optimized Rough Neural Network Weight and Anti-Homomorphic Filter.

Thangeswari M, Muthucumaraswamy R, Anitha K, Shanker NR

PubMed · May 29 2025
Image enhancement of the Hypoxic-Ischemic Encephalopathy (HIE) lesion region in neonatal brain MR images is a challenging task due to the diffuse (i.e., multi-focal) nature, small size, and low contrast of the lesions. Classifying the stages of HIE is also difficult because of the unclear boundaries and edges of the lesions, which are dispersed throughout the brain. These unclear boundaries and edges arise from chemical shifts, partial volume artifacts, and motion artifacts; in addition, voxels may reflect signals from adjacent tissues. Existing algorithms perform poorly at HIE lesion enhancement because of these artifacts, mixed voxel signals, and the diffuse nature of the lesions. In this paper, we propose a Rough Neural Network and Anti-Homomorphic Filter (RNN-AHF) framework for the enhancement of the HIE lesion region. The RNN-AHF framework reduces the pixel dimensionality of the feature space, eliminates unnecessary pixels, and preserves essential pixels for lesion enhancement. The RNN efficiently learns and identifies pixel patterns and facilitates adaptive enhancement based on different weights in the neural network. The proposed RNN-AHF framework operates using optimized neural weights and an optimized training function. The hybridization of optimized weights and the training function enhances the lesion region with high contrast while preserving the boundaries and edges. The proposed RNN-AHF framework achieves a lesion image enhancement and classification accuracy of approximately 93.5%, higher than that of traditional algorithms.
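The anti-homomorphic filter is specific to this paper; as a hedged point of reference, the sketch below shows the classical homomorphic filter that such enhancement schemes build on (log transform, frequency-domain high-emphasis filtering, exponentiation), with assumed parameter values rather than the authors' formulation.

```python
# Illustrative classical homomorphic filter for 2D MR slices (reference only).
import numpy as np

def homomorphic_filter(img, gamma_l=0.5, gamma_h=2.0, c=1.0, d0=30.0):
    img = np.clip(img.astype(np.float64), 0, None)
    log_img = np.log1p(img)                        # illumination-reflectance split
    F_img = np.fft.fftshift(np.fft.fft2(log_img))
    rows, cols = img.shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2         # squared distance from center
    H = (gamma_h - gamma_l) * (1 - np.exp(-c * D2 / d0 ** 2)) + gamma_l
    filtered = np.fft.ifft2(np.fft.ifftshift(H * F_img)).real
    return np.expm1(filtered)                      # back from the log domain
```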

ImmunoDiff: A Diffusion Model for Immunotherapy Response Prediction in Lung Cancer

Moinak Bhattacharya, Judy Huang, Amna F. Sher, Gagandeep Singh, Chao Chen, Prateek Prasanna

arXiv preprint · May 29 2025
Accurately predicting immunotherapy response in Non-Small Cell Lung Cancer (NSCLC) remains a critical unmet need. Existing radiomics and deep learning-based predictive models rely primarily on pre-treatment imaging to predict categorical response outcomes, limiting their ability to capture the complex morphological and textural transformations induced by immunotherapy. This study introduces ImmunoDiff, an anatomy-aware diffusion model designed to synthesize post-treatment CT scans from baseline imaging while incorporating clinically relevant constraints. The proposed framework integrates anatomical priors, specifically lobar and vascular structures, to enhance fidelity in CT synthesis. Additionally, we introduce a novel cbi-Adapter, a conditioning module that ensures pairwise-consistent multimodal integration of imaging and clinical data embeddings. A clinical variable conditioning mechanism further refines the generative process by leveraging demographic data, blood-based biomarkers, and PD-L1 expression. Evaluations on an in-house NSCLC cohort treated with immune checkpoint inhibitors demonstrate a 21.24% improvement in balanced accuracy for response prediction and a 0.03 increase in c-index for survival prediction. Code will be released soon.
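A brief sketch of the two reported evaluation metrics, balanced accuracy for the response-prediction head and Harrell's c-index for survival, using standard scikit-learn and lifelines calls; the prediction and risk-score arrays are placeholders, not the study's evaluation code.

```python
# Hedged sketch: outcome metrics for response and survival prediction.
from sklearn.metrics import balanced_accuracy_score
from lifelines.utils import concordance_index

def evaluate_outcomes(y_true, y_pred, surv_time, surv_event, risk_score):
    bal_acc = balanced_accuracy_score(y_true, y_pred)
    # concordance_index expects higher scores to mean longer survival,
    # so a risk score is negated before being passed in.
    c_index = concordance_index(surv_time, -risk_score, event_observed=surv_event)
    return bal_acc, c_index
```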

Multimodal medical image-to-image translation via variational autoencoder latent space mapping.

Liang Z, Cheng M, Ma J, Hu Y, Li S, Tian X

PubMed · May 29 2025
Medical image translation has become an essential tool in modern radiotherapy, providing complementary information for target delineation and dose calculation. However, current approaches are constrained by their modality-specific nature, requiring separate model training for each pair of imaging modalities. This limitation hinders the efficient deployment of comprehensive multimodal solutions in clinical practice. This work aims to develop a unified image translation method using variational autoencoder (VAE) latent space mapping that enables flexible conversion between different medical imaging modalities to meet clinical demands. We propose a three-stage approach to construct a unified image translation model. Initially, a VAE is trained to learn a shared latent space for various medical images. A stacked bidirectional transformer is subsequently utilized to learn the mapping between different modalities within the latent space under the guidance of the image modality. Finally, the VAE decoder is fine-tuned to improve image quality. Our internal dataset comprises paired imaging data from 87 head-and-neck cases, with each case containing cone beam computed tomography (CBCT), computed tomography (CT), MR T1c, and MR T2w images. The effectiveness of this strategy is quantitatively evaluated on our internal dataset and a public dataset by the mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM). Additionally, the dosimetry characteristics of the synthetic CT images are evaluated, and subjective quality assessments of the synthetic MR images are conducted to determine their clinical value. The VAE with the Kullback-Leibler (KL)-16 image tokenizer demonstrates superior image reconstruction ability, achieving a Fréchet inception distance (FID) of 4.84, a PSNR of 32.80 dB, and an SSIM of 92.33%. In synthetic CT tasks, the model shows greater accuracy in intramodality translations than in cross-modality translations, as evidenced by an MAE of 21.60 ± 8.80 Hounsfield units (HU) in the CBCT-to-CT task and 45.23 ± 13.21 HU and 47.55 ± 13.88 HU in the MR T1c-to-CT and T2w-to-CT tasks, respectively. For the cross-contrast MR translation tasks, the two directions perform comparably, with mean PSNR and SSIM values of 26.33 ± 1.36 dB and 85.21% ± 2.21% for the T1c-to-T2w translation and 26.03 ± 1.67 dB and 85.73% ± 2.66% for the T2w-to-T1c translation. Dosimetric results indicate that all gamma pass rates for the synthetic CTs are higher than 99% for photon intensity-modulated radiation therapy (IMRT) planning. However, the subjective quality assessment scores for synthetic MR images are lower than those for real MR images. The proposed three-stage approach successfully develops a unified image translation model that can effectively handle a wide range of medical image translation tasks. This flexibility and effectiveness make it a valuable tool for clinical applications.
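A structural sketch of the three-stage pipeline described above, assuming a shared VAE encoder/decoder and a modality-conditioned transformer operating in the latent space; every module here is a stand-in, and only the data flow follows the abstract.

```python
# Hedged sketch: unified latent-space translation between imaging modalities.
import torch.nn as nn

class LatentTranslator(nn.Module):
    def __init__(self, vae_encoder, vae_decoder, latent_mapper, n_modalities=4):
        super().__init__()
        self.encode = vae_encoder            # image -> shared latent tokens
        self.decode = vae_decoder            # latent tokens -> image (fine-tuned)
        self.mapper = latent_mapper          # stacked bidirectional transformer
        self.modality_emb = nn.Embedding(n_modalities, latent_mapper.d_model)

    def forward(self, src_img, src_mod, tgt_mod):
        z_src = self.encode(src_img)
        # condition the mapping on source and target modality identities
        cond = self.modality_emb(src_mod) + self.modality_emb(tgt_mod)
        z_tgt = self.mapper(z_src, cond)     # translate within the latent space
        return self.decode(z_tgt)
```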

Cascaded 3D Diffusion Models for Whole-body 3D 18-F FDG PET/CT synthesis from Demographics

Siyeop Yoon, Sifan Song, Pengfei Jin, Matthew Tivnan, Yujin Oh, Sekeun Kim, Dufan Wu, Xiang Li, Quanzheng Li

arXiv preprint · May 28 2025
We propose a cascaded 3D diffusion model framework to synthesize high-fidelity 3D PET/CT volumes directly from demographic variables, addressing the growing need for realistic digital twins in oncologic imaging, virtual trials, and AI-driven data augmentation. Unlike deterministic phantoms, which rely on predefined anatomical and metabolic templates, our method employs a two-stage generative process. An initial score-based diffusion model synthesizes low-resolution PET/CT volumes from demographic variables alone, providing global anatomical structures and approximate metabolic activity. This is followed by a super-resolution residual diffusion model that refines spatial resolution. Our framework was trained on 18-F FDG PET/CT scans from the AutoPET dataset and evaluated using organ-wise volume and standardized uptake value (SUV) distributions, comparing synthetic and real data between demographic subgroups. The organ-wise comparison demonstrated strong concordance between synthetic and real images. In particular, most deviations in metabolic uptake values remained within 3-5% of the ground truth in subgroup analysis. These findings highlight the potential of cascaded 3D diffusion models to generate anatomically and metabolically accurate PET/CT images, offering a robust alternative to traditional phantoms and enabling scalable, population-informed synthetic imaging for clinical and research applications.
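The subgroup analysis above compares organ-wise SUV statistics between synthetic and real cohorts; the sketch below shows one hedged way to compute the reported percent deviations, with the per-organ dictionaries assumed rather than taken from the AutoPET evaluation code.

```python
# Hedged sketch: organ-wise SUV deviation between synthetic and real cohorts.
import numpy as np

def organwise_suv_deviation(real_suv, synth_suv):
    """real_suv / synth_suv: dicts mapping organ name -> array of per-subject
    mean SUV values; returns the percent deviation of the synthetic cohort mean
    (the paper reports most deviations within 3-5% of the ground truth)."""
    deviation = {}
    for organ in real_suv:
        real_mean = np.mean(real_suv[organ])
        synth_mean = np.mean(synth_suv[organ])
        deviation[organ] = 100.0 * (synth_mean - real_mean) / real_mean
    return deviation
```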