Latest Papers on Radiology AI. Tags: Whole Body, Order: Best Match, Limit: 10.

Large medical image database impact on generalizability of synthetic CT scan generation.

Boily C, Mazellier JP, Meyer P

•papers•May 21 2025

This study systematically examines the impact of training database size on the generalizability of deep learning models for synthetic medical image generation. Specifically, we employ a Cycle-Consistency Generative Adversarial Network (CycleGAN) with softly paired data to synthesize kilovoltage computed tomography (kVCT) images from megavoltage computed tomography (MVCT) scans. Unlike previous works, which were constrained by limited data availability, our study uses an extensive database comprising 4,000 patient CT scans, an order of magnitude larger than prior research, allowing for a more rigorous assessment of database size in medical image translation. We quantitatively evaluate the fidelity of the generated synthetic images using established image similarity metrics, including Mean Absolute Error (MAE) and Structural Similarity Index Measure (SSIM). Beyond assessing image quality, we investigate the model's capacity for generalization by analyzing its performance across diverse patient subgroups, considering factors such as sex, age, and anatomical region. This approach enables a more granular understanding of how dataset composition influences model robustness.

CT Image Synthesis Whole Body Methodology In Silico Academic Lab Benchmark SOTA
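
The subgroup evaluation described in the synthetic CT abstract above (MAE and SSIM reported per sex, age, or anatomical region) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: images are flattened lists of intensities, the SSIM is a simplified single-window (global) variant rather than the usual sliding-window one, and the names `cases`, `group`, `ref`, and `syn` are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def mae(ref, syn):
    """Mean absolute error between two equally sized intensity lists."""
    return fmean(abs(r - s) for r, s in zip(ref, syn))


def global_ssim(ref, syn, data_range):
    """Single-window SSIM (assumption: no sliding window, whole image at once)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_r, mu_s = fmean(ref), fmean(syn)
    var_r = fmean((r - mu_r) ** 2 for r in ref)
    var_s = fmean((s - mu_s) ** 2 for s in syn)
    cov = fmean((r - mu_r) * (s - mu_s) for r, s in zip(ref, syn))
    num = (2 * mu_r * mu_s + c1) * (2 * cov + c2)
    den = (mu_r ** 2 + mu_s ** 2 + c1) * (var_r + var_s + c2)
    return num / den


def metrics_by_subgroup(cases, data_range):
    """Average MAE and SSIM over all cases that share a subgroup label."""
    grouped = defaultdict(list)
    for case in cases:
        scores = (mae(case["ref"], case["syn"]),
                  global_ssim(case["ref"], case["syn"], data_range))
        grouped[case["group"]].append(scores)
    return {
        group: {"mae": fmean(m for m, _ in scores),
                "ssim": fmean(s for _, s in scores)}
        for group, scores in grouped.items()
    }


# Toy data: two hypothetical cases with made-up intensities.
cases = [
    {"group": "pelvis", "ref": [0, 100, 200, 300], "syn": [10, 90, 210, 290]},
    {"group": "thorax", "ref": [0, 100, 200, 300], "syn": [0, 100, 200, 300]},
]
report = metrics_by_subgroup(cases, data_range=300)
```

Grouping before averaging is what makes it possible to see whether a model that looks good overall underperforms on a particular subgroup.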

Synthesizing [18F]PSMA-1007 PET bone images from CT images with GAN for early detection of prostate cancer bone metastases: a pilot validation study.

Chai L, Yao X, Yang X, Na R, Yan W, Jiang M, Zhu H, Sun C, Dai Z, Yang X

•papers•May 21 2025

[18F]FDG PET/CT scan combined with [18F]PSMA-1007 PET/CT scan is commonly conducted for detecting bone metastases in prostate cancer (PCa). However, it is expensive and may expose patients to more radiation hazards. This study explores deep learning (DL) techniques to synthesize [18F]PSMA-1007 PET bone images from CT bone images for the early detection of bone metastases in PCa, which may reduce additional PET/CT scans and relieve the burden on patients. We retrospectively collected paired whole-body (WB) [18F]PSMA-1007 PET/CT images from 152 patients with clinical and pathological diagnosis results, including 123 PCa and 29 cases of benign lesions. The average age of the patients was 67.48 ± 10.87 years, and the average lesion size was 8.76 ± 15.5 mm. The paired low-dose CT and PET images were preprocessed and segmented to construct the WB bone structure images. The 152 subjects were randomly stratified into training, validation, and test groups in a 92:41:19 split. Two generative adversarial network (GAN) models, Pix2pix and CycleGAN, were trained to synthesize [18F]PSMA-1007 PET bone images from paired CT bone images. The performance of the two synthesis models was evaluated using quantitative metrics of mean absolute error (MAE), mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM), as well as the target-to-background ratio (TBR). The results of DL-based image synthesis indicated that the synthesis of [18F]PSMA-1007 PET bone images from low-dose CT bone images was highly feasible. The Pix2pix model performed better, with an SSIM of 0.97, PSNR of 44.96, MSE of 0.80, and MAE of 0.10. The TBRs of bone metastasis lesions calculated on DL-synthesized PET bone images were highly correlated with those of real PET bone images (Pearson's r > 0.90) and had no significant differences (p < 0.05).
It is feasible to generate synthetic [18F]PSMA-1007 PET bone images from CT bone images by using DL techniques with reasonable accuracy, which can provide information for early detection of PCa bone metastases.

Mixed Modality Image Synthesis Whole Body Retrospective Clinical In Silico Academic Lab GenAI
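
For the PSMA PET synthesis study above, the MSE, PSNR, TBR, and Pearson correlation it reports can be illustrated with a short sketch. The definitions below are the common textbook ones and are assumptions here; in particular, the abstract does not state how the TBR regions were drawn, so `lesion_idx` and `background_idx` are hypothetical index lists into a flattened image.

```python
import math
from statistics import fmean


def mse(real, synth):
    """Mean squared error between two flattened images."""
    return fmean((r - s) ** 2 for r, s in zip(real, synth))


def psnr(real, synth, data_range):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(real, synth)
    return math.inf if err == 0 else 10 * math.log10(data_range ** 2 / err)


def target_to_background_ratio(image, lesion_idx, background_idx):
    """Mean uptake in the lesion divided by mean uptake in the background."""
    lesion = fmean(image[i] for i in lesion_idx)
    background = fmean(image[i] for i in background_idx)
    return lesion / background


def pearson_r(xs, ys):
    """Pearson correlation, e.g. between real and synthetic TBR values."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy example with made-up uptake values (not data from the study).
toy_image = [1.0, 1.0, 1.0, 1.0, 8.0, 8.0]
toy_tbr = target_to_background_ratio(toy_image, [4, 5], [0, 1, 2, 3])
```

Comparing TBR values lesion by lesion with `pearson_r` checks whether the synthetic images preserve lesion contrast, which pixel-wise metrics such as PSNR do not capture on their own.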

VET-DINO: Learning Anatomical Understanding Through Multi-View Distillation in Veterinary Imaging

Andre Dourson, Kylie Taylor, Xiaoli Qiao, Michael Fitzke

•preprint•May 21 2025

Self-supervised learning has emerged as a powerful paradigm for training deep neural networks, particularly in medical imaging where labeled data is scarce. While current approaches typically rely on synthetic augmentations of single images, we propose VET-DINO, a framework that leverages a unique characteristic of medical imaging: the availability of multiple standardized views from the same study. Using a series of clinical veterinary radiographs from the same patient study, we enable models to learn view-invariant anatomical structures and develop an implied 3D understanding from 2D projections. We demonstrate our approach on a dataset of 5 million veterinary radiographs from 668,000 canine studies. Through extensive experimentation, including view synthesis and downstream task performance, we show that learning from real multi-view pairs leads to superior anatomical understanding compared to purely synthetic augmentations. VET-DINO achieves state-of-the-art performance on various veterinary imaging tasks. Our work establishes a new paradigm for self-supervised learning in medical imaging that leverages domain-specific properties rather than merely adapting natural image techniques.

X-Ray Classification Whole Body Methodology In Silico Academic Lab Benchmark SOTA Open Dataset

Intelligent health model for medical imaging to guide laymen using neural cellular automata.

Sharma SK, Chowdhary CL, Sharma VS, Rasool A, Khan AA

•papers•May 20 2025

A layman in health systems is a person who has no knowledge of health data such as X-ray, MRI, and CT scans or health examination reports. The motivation behind the proposed invention is to make medical images understandable to laymen. The health model is trained using a neural network approach that analyzes user health examination data, predicts the type and level of the disease, and advises precautions to the user. Cellular Automata (CA) technology has been integrated with the neural networks to segment the medical image. The CA analyzes the medical images pixel by pixel and generates a robust threshold value, which helps to segment the image efficiently and to identify abnormal spots accurately. The proposed method has been trained and tested on more than 10,000 medical images taken from various open datasets. Text analysis measures (BLEU, ROUGE, and WER) are used to validate the produced report. BLEU and ROUGE measure how similar the generated text report is to the original report. The BLEU and ROUGE scores on the tested images are approximately 0.62 and 0.90, respectively, indicating that the produced report is very close to the original report. The WER score of 0.14 indicates that the generated report contains most of the relevant words. Overall, the proposed research provides laymen with a useful medical report that identifies the disease and recommends precautions.

Mixed Modality Report Generation Whole Body Methodology In Silico Academic Lab Open Dataset
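
Of the three report metrics named in the abstract above, WER is the simplest to show in full: it is the word-level edit distance between the generated and reference reports, divided by the number of reference words, so lower is better. The sketch below uses the standard definition; the example sentences are made up and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """(substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words; curr is the row for i words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


example_wer = word_error_rate("no fracture seen", "no fractures seen")
```

Because WER counts exact word matches only, a report that paraphrases the reference correctly can still score poorly, which is one reason it is usually reported alongside BLEU and ROUGE.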

Feasibility study of a general model for synthetic CT generation in MRI-guided extracranial radiotherapy.

Hsu SH, Han Z, Hu YH, Ferguson D, van Dams R, Mak RH, Leeman JE, Sudhyadhom A

•papers•May 19 2025

This study aims to investigate the feasibility of a single general model to synthesize CT images across body sites (thorax, abdomen, and pelvis) to support treatment planning for MRI-only radiotherapy. A total of 157 patients who received MRI-guided radiation therapy in the thorax, abdomen, and pelvis on a 0.35T MRIdian Linac were included. A subset of 122 cases was used for model training and the remaining 35 cases were used for model validation. All patient datasets had a semi-paired CT-simulation image and a 0.35T MR image acquired using TrueFISP. A conditional generative adversarial network with a multi-planar method was used to generate synthetic CT images from 0.35T MR images. The effect of preprocessing methods (with and without bias field corrections) on the quality of synthetic CT was evaluated and found to be insignificant. The general models trained on all cases performed comparably to the site-specific models trained on individual body sites. For all models, the peak signal-to-noise ratios ranged from 31.7 to 34.9 and the structural similarity index measures ranged from 0.9547 to 0.9758. For the datasets with bias field corrections, the mean-absolute-errors in HU (general model versus site-specific model) were 49.7 ± 9.4 versus 49.5 ± 8.9, 48.7 ± 7.6 versus 43 ± 7.8 and 32.8 ± 5.5 versus 31.8 ± 5.3 for the thorax, abdomen, and pelvis, respectively. When comparing plans between synthetic CTs and ground truth CTs, the dosimetric difference was on average less than 0.5% (0.2 Gy) for target coverage and less than 2.1% (0.4 Gy) for organ-at-risk metrics for all body sites with either the general or specific models. Synthetic CT plans showed good agreement with mean gamma pass rates of >94% and >99% for 1%/1 mm and 2%/2 mm, respectively. This study has demonstrated the feasibility of using a general model for multiple body sites and the potential of using synthetic CT to support an MRI-guided radiotherapy workflow.

Mixed Modality Image Synthesis Whole Body Retrospective Clinical In Silico Academic Lab
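
The gamma pass rates quoted in the abstract above (1%/1 mm and 2%/2 mm) combine a dose-difference tolerance with a distance-to-agreement tolerance. The sketch below shows the idea on a 1D dose profile. It is a simplified illustration under stated assumptions, not the study's analysis: global normalization to the maximum reference dose, no interpolation between sample points, no low-dose threshold, and a made-up profile.

```python
import math


def gamma_pass_rate(ref_dose, eval_dose, spacing_mm, dose_pct, dta_mm):
    """Fraction of reference points whose gamma index is at most 1.

    For each reference point, gamma is the minimum over all evaluated
    points of sqrt((distance / dta)^2 + (dose difference / tolerance)^2).
    """
    dose_tol = dose_pct / 100.0 * max(ref_dose)  # global normalization
    passed = 0
    for i, d_ref in enumerate(ref_dose):
        gamma = min(
            math.hypot((j - i) * spacing_mm / dta_mm,
                       (d_eval - d_ref) / dose_tol)
            for j, d_eval in enumerate(eval_dose)
        )
        passed += gamma <= 1.0
    return passed / len(ref_dose)


# Toy profile in Gy on a 1 mm grid; the evaluated dose is offset by 0.5 Gy.
reference = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
evaluated = [d + 0.5 for d in reference]
rate = gamma_pass_rate(reference, evaluated, spacing_mm=1.0,
                       dose_pct=2.0, dta_mm=2.0)
```

Here 2% of the 50 Gy maximum gives a 1 Gy tolerance, so a uniform 0.5 Gy offset passes everywhere. Clinical tools typically interpolate the evaluated dose and work in 3D, which this sketch omits.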

Learning Wavelet-Sparse FDK for 3D Cone-Beam CT Reconstruction

Yipeng Sun, Linda-Sophie Schneider, Chengze Ye, Mingxuan Gu, Siyuan Mei, Siming Bayer, Andreas Maier

•preprint•May 19 2025

Cone-Beam Computed Tomography (CBCT) is essential in medical imaging, and the Feldkamp-Davis-Kress (FDK) algorithm is a popular choice for reconstruction due to its efficiency. However, FDK is susceptible to noise and artifacts. While recent deep learning methods offer improved image quality, they often increase computational complexity and lack the interpretability of traditional methods. In this paper, we introduce an enhanced FDK-based neural network that maintains the classical algorithm's interpretability by selectively integrating trainable elements into the cosine weighting and filtering stages. Recognizing the challenge of a large parameter space inherent in 3D CBCT data, we leverage wavelet transformations to create sparse representations of the cosine weights and filters. This strategic sparsification reduces the parameter count by 93.75% without compromising performance, accelerates convergence, and importantly, maintains the inference computational cost equivalent to the classical FDK algorithm. Our method not only ensures volumetric consistency and boosts robustness to noise, but is also designed for straightforward integration into existing CT reconstruction pipelines. This presents a pragmatic enhancement that can benefit clinical applications, particularly in environments with computational limitations.

CT Reconstruction Whole Body Methodology In Silico
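
A 93.75% reduction corresponds to keeping 1 parameter in 16. One way such a figure can arise, offered purely as an assumption since the abstract above does not give the scheme, is to keep only the coarse approximation coefficients of a four-level wavelet decomposition as trainable parameters and expand them to a full-length filter when needed. The sketch below does this with a 1D Haar wavelet; the 32-tap ramp filter is a made-up stand-in.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal, levels):
    """Multi-level 1D Haar transform: (approximation, details coarse-to-fine)."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_inverse(approx, details):
    """Invert haar_forward; details must be ordered coarse-to-fine."""
    signal = list(approx)
    for detail in details:
        expanded = []
        for a, d in zip(signal, detail):
            expanded += [(a + d) / SQRT2, (a - d) / SQRT2]
        signal = expanded
    return signal


def filter_from_sparse(approx, length, levels):
    """Expand the kept coefficients to a full filter; dropped details are zero."""
    zero_details = [[0.0] * (length >> (levels - k)) for k in range(levels)]
    return haar_inverse(approx, zero_details)


full_filter = [float(i) for i in range(32)]     # hypothetical dense filter
trainable, dropped = haar_forward(full_filter, levels=4)
reduction = 1 - len(trainable) / len(full_filter)
dense_again = filter_from_sparse(trainable, len(full_filter), levels=4)
```

With these toy sizes, 2 of 32 coefficients remain, so `reduction` is 0.9375. Because the sparse coefficients are expanded back into an ordinary filter, applying it costs the same as applying the classical one, which is consistent with the abstract's claim about inference cost.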

A Comprehensive Review of Techniques, Algorithms, Advancements, Challenges, and Clinical Applications of Multi-modal Medical Image Fusion for Improved Diagnosis

Muhammad Zubair, Muzammil Hussai, Mousa Ahmad Al-Bashrawi, Malika Bendechache, Muhammad Owais

•preprint•May 18 2025

Multi-modal medical image fusion (MMIF) is increasingly recognized as an essential technique for enhancing diagnostic precision and facilitating effective clinical decision-making within computer-aided diagnosis systems. MMIF combines data from X-ray, MRI, CT, PET, SPECT, and ultrasound to create detailed, clinically useful images of patient anatomy and pathology. These integrated representations significantly advance diagnostic accuracy, lesion detection, and segmentation. This comprehensive review meticulously surveys the evolution, methodologies, algorithms, current advancements, and clinical applications of MMIF. We present a critical comparative analysis of traditional fusion approaches, including pixel-, feature-, and decision-level methods, and delve into recent advancements driven by deep learning, generative models, and transformer-based architectures. A critical comparative analysis is presented between these conventional methods and contemporary techniques, highlighting differences in robustness, computational efficiency, and interpretability. The article addresses extensive clinical applications across oncology, neurology, and cardiology, demonstrating MMIF's vital role in precision medicine through improved patient-specific therapeutic outcomes. Moreover, the review thoroughly investigates the persistent challenges affecting MMIF's broad adoption, including issues related to data privacy, heterogeneity, computational complexity, interpretability of AI-driven algorithms, and integration within clinical workflows. It also identifies significant future research avenues, such as the integration of explainable AI, adoption of privacy-preserving federated learning frameworks, development of real-time fusion systems, and standardization efforts for regulatory compliance.

Mixed Modality Image Synthesis Whole Body Review Concept Ethics Policy GenAI

MedSG-Bench: A Benchmark for Medical Image Sequences Grounding

Jingkun Yue, Siqi Zhang, Zinan Jia, Huihuan Xu, Zongbo Han, Xiaohong Liu, Guangyu Wang

•preprint•May 17 2025

Visual grounding is essential for precise perception and reasoning in multimodal large language models (MLLMs), especially in medical imaging domains. While existing medical visual grounding benchmarks primarily focus on single-image scenarios, real-world clinical applications often involve sequential images, where accurate lesion localization across different modalities and temporal tracking of disease progression (e.g., pre- vs. post-treatment comparison) require fine-grained cross-image semantic alignment and context-aware reasoning. To remedy the underrepresentation of image sequences in existing medical visual grounding benchmarks, we propose MedSG-Bench, the first benchmark tailored for Medical Image Sequences Grounding. It comprises eight VQA-style tasks, formulated into two paradigms of the grounding tasks, including 1) Image Difference Grounding, which focuses on detecting change regions across images, and 2) Image Consistency Grounding, which emphasizes detection of consistent or shared semantics across sequential images. MedSG-Bench covers 76 public datasets, 10 medical imaging modalities, and a wide spectrum of anatomical structures and diseases, totaling 9,630 question-answer pairs. We benchmark both general-purpose MLLMs (e.g., Qwen2.5-VL) and medical-domain specialized MLLMs (e.g., HuatuoGPT-vision), observing that even the advanced models exhibit substantial limitations in medical sequential grounding tasks. To advance this field, we construct MedSG-188K, a large-scale instruction-tuning dataset tailored for sequential visual grounding, and further develop MedSeq-Grounder, an MLLM designed to facilitate future research on fine-grained understanding across medical sequential images. The benchmark, dataset, and model are available at https://huggingface.co/MedSG-Bench

Mixed Modality Detection Whole Body Dataset Release In Silico Academic Lab Open Dataset Open Code GenAI

Uncertainty quantification for deep learning-based metastatic lesion segmentation on whole body PET/CT.

Schott B, Santoro-Fernandes V, Klanecek Z, Perlman S, Jeraj R

•papers•May 16 2025

Deep learning models are increasingly being implemented for automated medical image analysis to inform patient care. Most models, however, lack uncertainty information, without which the reliability of model outputs cannot be ensured. Several uncertainty quantification (UQ) methods exist to capture model uncertainty. Yet, it is not clear which method is optimal for a given task. The purpose of this work was to investigate several commonly used UQ methods for the critical yet understudied task of metastatic lesion segmentation on whole body PET/CT. Approach: 59 whole body 68Ga-DOTATATE PET/CT images of patients undergoing theranostic treatment of metastatic neuroendocrine tumors were used in this work. A 3D U-Net was trained for lesion segmentation following five-fold cross validation. Uncertainty measures derived from four UQ methods (probability entropy, Monte Carlo dropout, deep ensembles, and test time augmentation) were investigated. Each uncertainty measure was assessed across four quantitative evaluations: (1) its ability to detect artificially degraded image data at low, medium, and high degradation magnitudes; (2) to detect false-positive (FP) predicted regions; (3) to recover false-negative (FN) predicted regions; and (4) to establish correlations with model biomarker extraction and segmentation performance metrics. Results: Test time augmentation and probability entropy respectively achieved the highest and lowest degraded image detection at low (AUC=0.54 vs. 0.68), medium (AUC=0.70 vs. 0.82), and high (AUC=0.83 vs. 0.90) degradation magnitudes. For detecting FPs, all UQ methods achieve strong performance, with AUC values ranging narrowly between 0.77 and 0.81. FN region recovery performance was strongest for test time augmentation and weakest for probability entropy.
Performance for the correlation analysis was mixed, where the strongest performance was achieved by test time augmentation for SUVtotal capture (ρ=0.57) and segmentation Dice coefficient (ρ=0.72), by Monte Carlo dropout for SUVmean capture (ρ=0.35), and by probability entropy for segmentation cross entropy (ρ=0.96). Significance: Overall, test time augmentation demonstrated superior uncertainty quantification performance and is recommended for use in metastatic lesion segmentation tasks. It also offers the advantage of being post hoc and computationally efficient. In contrast, probability entropy performed the worst, highlighting the need for advanced UQ approaches for this task.

Mixed Modality Segmentation Whole Body Retrospective Clinical In Silico Academic Lab
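
Two of the four UQ methods compared in the abstract above need no retraining and are easy to sketch. Probability entropy scores each voxel from a single forward pass, while test time augmentation (TTA) runs the model on transformed copies of the input, maps the predictions back, and measures their spread. The code below is a toy 1D illustration: `toy_model`, the flip augmentation, and the use of variance as the TTA uncertainty measure are all assumptions, not details from the study.

```python
import math
from statistics import fmean, pvariance


def binary_entropy(p, eps=1e-12):
    """Entropy of a foreground probability; highest at p = 0.5."""
    p = min(max(p, eps), 1 - eps)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def tta_uncertainty(model, image, augmentations):
    """Run the model under each (forward, inverse) pair and compare outputs.

    Returns the per-voxel mean probability and per-voxel variance across
    the augmented runs, after mapping each prediction back with `inverse`.
    """
    runs = [inverse(model(forward(image))) for forward, inverse in augmentations]
    mean_prob = [fmean(vals) for vals in zip(*runs)]
    variance = [pvariance(vals) for vals in zip(*runs)]
    return mean_prob, variance


def toy_model(image):
    """Hypothetical stand-in for a segmentation network (not flip-equivariant)."""
    return [
        1 / (1 + math.exp(-(v + 0.5 * (image[i - 1] if i else 0.0))))
        for i, v in enumerate(image)
    ]


def identity(x):
    return list(x)


def flip(x):
    return list(reversed(x))


image = [-2.0, -1.0, 0.0, 3.0, 1.0]
mean_prob, variance = tta_uncertainty(
    toy_model, image, [(identity, identity), (flip, flip)]
)
entropy_map = [binary_entropy(p) for p in toy_model(image)]
```

Both measures are post hoc in the sense that they only need a trained model. TTA costs one forward pass per augmentation, whereas deep ensembles require training several models.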

2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction.

Chen T, Hou J, Zhou Y, Xie H, Chen X, Liu Q, Guo X, Xia M, Duncan JS, Liu C, Zhou B

•papers•May 15 2025

Positron Emission Tomography (PET) is an important clinical imaging tool but inevitably introduces radiation exposure to patients and healthcare providers. Reducing the tracer injection dose and eliminating the CT acquisition for attenuation correction can reduce the overall radiation dose, but often results in PET with high noise and bias. Thus, it is desirable to develop 3D methods to translate the non-attenuation-corrected low-dose PET (NAC-LDPET) into attenuation-corrected standard-dose PET (AC-SDPET). Recently, diffusion models have emerged as a new state-of-the-art deep learning method for image-to-image translation, better than traditional CNN-based methods. However, due to their high computation cost and memory burden, they are largely limited to 2D applications. To address these challenges, we developed a novel 2.5D Multi-view Averaging Diffusion Model (MADM) for 3D image-to-image translation with application to NAC-LDPET to AC-SDPET translation. Specifically, MADM employs separate diffusion models for axial, coronal, and sagittal views, whose outputs are averaged in each sampling step to ensure the 3D generation quality from multiple views. To accelerate the 3D sampling process, we also proposed a strategy to use the CNN-based 3D generation as a prior for the diffusion model. Our experimental results on human patient studies suggested that MADM can generate high-quality 3D translation images, outperforming previous CNN-based and diffusion-based baseline methods. The code is available at https://github.com/tianqic/MADM.

PET Reconstruction Whole Body Methodology In Silico Academic Lab Open Code Benchmark SOTA
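
The multi-view averaging idea in the MADM abstract above (one 2D model per view, outputs averaged at every sampling step) can be sketched structurally as follows. This shows only the slicing and averaging. The per-view "denoisers" are toy smoothing functions standing in for trained diffusion models, there is no noise schedule, and all function names are hypothetical. The volume is a nested list indexed as `volume[z][y][x]`.

```python
from statistics import fmean


def apply_view(volume, denoise_2d, axis):
    """Apply a 2D function to every slice taken along one axis of the volume."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    if axis == 0:    # axial: one slice per z
        for z in range(nz):
            sl = denoise_2d([row[:] for row in volume[z]])
            for y in range(ny):
                for x in range(nx):
                    out[z][y][x] = sl[y][x]
    elif axis == 1:  # coronal: one slice per y
        for y in range(ny):
            sl = denoise_2d([volume[z][y][:] for z in range(nz)])
            for z in range(nz):
                for x in range(nx):
                    out[z][y][x] = sl[z][x]
    else:            # sagittal: one slice per x
        for x in range(nx):
            sl = denoise_2d([[volume[z][y][x] for y in range(ny)]
                             for z in range(nz)])
            for z in range(nz):
                for y in range(ny):
                    out[z][y][x] = sl[z][y]
    return out


def multi_view_step(volume, denoisers):
    """One sampling step: run each view's model, then average the three volumes."""
    views = [apply_view(volume, d, axis) for axis, d in enumerate(denoisers)]
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [[[fmean(v[z][y][x] for v in views) for x in range(nx)]
             for y in range(ny)] for z in range(nz)]


def sample_multi_view(initial_volume, denoisers, steps):
    """Iterate the averaged step, starting from a given 3D volume."""
    volume = initial_volume
    for _ in range(steps):
        volume = multi_view_step(volume, denoisers)
    return volume


def pull_toward_slice_mean(sl):
    """Toy stand-in for a 2D denoiser: halve each pixel's distance to the mean."""
    mean = fmean(v for row in sl for v in row)
    return [[0.5 * (v + mean) for v in row] for row in sl]


start = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]
result = sample_multi_view(start, [pull_toward_slice_mean] * 3, steps=3)
```

Averaging at every step, rather than once at the end, keeps the three 2D models working on the same intermediate volume, which is what gives each slice context from the other two orientations.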