Latest Papers on Radiology AI. Tags: Whole Body, Order: Best Match, Limit: 10.

LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models

Zhihao Chen, Tao Chen, Chenhui Wang, Qi Gao, Huidong Xie, Chuang Niu, Ge Wang, Hongming Shan

•preprint•Jul 8 2025

Low-dose computed tomography (LDCT) reduces radiation exposure but often degrades image quality, potentially compromising diagnostic accuracy. Existing deep learning-based denoising methods focus primarily on pixel-level mappings, overlooking the potential benefits of high-level semantic guidance. Recent advances in vision-language models (VLMs) suggest that language can serve as a powerful tool for capturing structured semantic information, offering new opportunities to improve LDCT reconstruction. In this paper, we introduce LangMamba, a Language-driven Mamba framework for LDCT denoising that leverages VLM-derived representations to enhance supervision from normal-dose CT (NDCT). LangMamba follows a two-stage learning strategy. First, we pre-train a Language-guided AutoEncoder (LangAE) that leverages frozen VLMs to map NDCT images into a semantic space enriched with anatomical information. Second, we synergize LangAE with two key components to guide LDCT denoising: Semantic-Enhanced Efficient Denoiser (SEED), which enhances NDCT-relevant local semantic while capturing global features with efficient Mamba mechanism, and Language-engaged Dual-space Alignment (LangDA) Loss, which ensures that denoised images align with NDCT in both perceptual and semantic spaces. Extensive experiments on two public datasets demonstrate that LangMamba outperforms conventional state-of-the-art methods, significantly improving detail preservation and visual fidelity. Remarkably, LangAE exhibits strong generalizability to unseen datasets, thereby reducing training costs. Furthermore, LangDA loss improves explainability by integrating language-guided insights into image reconstruction and offers a plug-and-play fashion. Our findings shed new light on the potential of language as a supervisory signal to advance LDCT denoising. The code is publicly available on https://github.com/hao1635/LangMamba.

CT Reconstruction Whole Body Methodology In Silico Academic Lab Open Code GenAI

AI lesion tracking in PET/CT imaging: a proposal for a Siamese-based CNN pipeline applied to PSMA PET/CT scans.

Hein SP, Schultheiss M, Gafita A, Zaum R, Yagubbayli F, Tauber R, Rauscher I, Eiber M, Pfeiffer F, Weber WA

•papers•Jul 8 2025

Assessing tumor response to systemic therapies is one of the main applications of PET/CT. Routinely, only a small subset of index lesions out of multiple lesions is analyzed. However, this operator dependent selection may bias the results due to possible significant inter-metastatic heterogeneity of response to therapy. Automated, AI-based approaches for lesion tracking hold promise in enabling the analysis of many more lesions and thus providing a better assessment of tumor response. This work introduces a Siamese CNN approach for lesion tracking between PET/CT scans. Our approach is applied on the laborious task of tracking a high number of bone lesions in full-body baseline and follow-up [68Ga]Ga- or [18F]F-PSMA PET/CT scans after two cycles of [177Lu]Lu-PSMA therapy of metastatic castration resistant prostate cancer patients. Data preparation includes lesion segmentation and affine registration. Our algorithm extracts suitable lesion patches and forwards them into a Siamese CNN trained to classify the lesion patch pairs as corresponding or non-corresponding lesions. Experiments have been performed with different input patch types and a Siamese network in 2D and 3D. The CNN model successfully learned to classify lesion assignments, reaching an accuracy of 83 % in its best configuration with an AUC = 0.91. For corresponding lesions the pipeline accomplished lesion tracking accuracy of even 89 %. We proved that a CNN may facilitate the tracking of multiple lesions in PSMA PET/CT scans. Future clinical studies are necessary if this improves the prediction of the outcome of therapies.

PET Registration Whole Body Methodology In Silico

Foundation models for radiology: fundamentals, applications, opportunities, challenges, risks, and prospects.

Akinci D'Antonoli T, Bluethgen C, Cuocolo R, Klontzas ME, Ponsiglione A, Kocak B

•papers•Jul 8 2025

Foundation models (FMs) represent a significant evolution in artificial intelligence (AI), impacting diverse fields. Within radiology, this evolution offers greater adaptability, multimodal integration, and improved generalizability compared with traditional narrow AI. Utilizing large-scale pre-training and efficient fine-tuning, FMs can support diverse applications, including image interpretation, report generation, integrative diagnostics combining imaging with clinical/laboratory data, and synthetic data creation, holding significant promise for advancements in precision medicine. However, clinical translation of FMs faces several substantial challenges. Key concerns include the inherent opacity of model decision-making processes, environmental and social sustainability issues, risks to data privacy, complex ethical considerations, such as bias and fairness, and navigating the uncertainty of regulatory frameworks. Moreover, rigorous validation is essential to address inherent stochasticity and the risk of hallucination. This international collaborative effort provides a comprehensive overview of the fundamentals, applications, opportunities, challenges, and prospects of FMs, aiming to guide their responsible and effective adoption in radiology and healthcare.

Mixed Modality LLM Radiology Report Whole Body Review Concept Consortium GenAI Ethics Policy

AI-enhanced patient-specific dosimetry in I-131 planar imaging with a single oblique view.

Jalilifar M, Sadeghi M, Emami-Ardekani A, Bitarafan-Rajabi A, Geravand K, Geramifar P

•papers•Jul 8 2025

This study aims to enhance the dosimetry accuracy in 131I planar imaging by utilizing a single oblique view and Monte Carlo (MC) validated dose point kernels (DPKs) alongside the integration of artificial intelligence (AI) for accurate dose prediction within planar imaging. Forty patients with thyroid cancers post-thyroidectomy surgery and 30 with neuroendocrine tumors underwent planar and SPECT/CT imaging. Using whole-body (WB) planar images with an additional oblique view, organ thicknesses were estimated. DPKs and organ-specific S-values were used to estimate the absorbed doses. Four AI algorithms- multilayer perceptron (MLP), linear regression, support vector regression model, decision tree, convolution neural network, and U-Net were used for dose estimation. Planar image counts, body thickness, patient BMI, age, S-values, and tissue attenuation coefficients were imported as input into the AI algorithm. To provide the ground truth, the CT-based segmentation generated binary masks for each organ, and the corresponding SPECT images were used for GATE MC dosimetry. The MLP-predicted dose values across all organs represented superior performance with the lowest mean absolute error in the liver but higher in the spleen and salivary glands. Notably, MLP-based dose estimations closely matched ground truth data with < 15% differences in most tissues. The MLP-estimated dose values present a robust patient-specific dosimetry approach capable of swiftly predicting absorbed doses in different organs using WB planar images and a single oblique view. This approach facilitates the implementation of 2D planar imaging as a pre-therapeutic technique for a more accurate assessment of the administrated activity.

SPECT Registration Whole Body Retrospective Clinical In Silico Academic Lab Benchmark SOTA

A Pan-Organ Vision-Language Model for Generalizable 3D CT Representations.

Beeche C, Kim J, Tavolinejad H, Zhao B, Sharma R, Duda J, Gee J, Dako F, Verma A, Morse C, Hou B, Shen L, Sagreiya H, Davatzikos C, Damrauer S, Ritchie MD, Rader D, Long Q, Chen T, Kahn CE, Chirinos J, Witschey WR

•papers•Jul 3 2025

Generalizable foundation models for computed tomographic (CT) medical imaging data are emerging AI tools anticipated to vastly improve clinical workflow efficiency. However, existing models are typically trained within narrow imaging contexts, including limited anatomical coverage, contrast settings, and clinical indications. These constraints reduce their ability to generalize across the broad spectrum of real-world presentations encountered in volumetric CT imaging data. We introduce Percival, a vision-language foundation model trained on over 400,000 CT volumes and paired radiology reports from more than 50,000 participants enrolled in the Penn Medicine BioBank. Percival employs a dual-encoder architecture with a transformer-based image encoder and a BERT-style language encoder, aligned via symmetric contrastive learning. Percival was validated on over 20,000 participants imaging data encompassing over 100,000 CT volumes. In image-text recall tasks, Percival outperforms models trained on limited anatomical windows. To assess Percival's clinical knowledge, we evaluated the biologic, phenotypic and prognostic relevance using laboratory-wide, phenome-wide association studies and survival analyses, uncovering a rich latent structure aligned with physiological measurements and disease phenotypes.

CT LLM Radiology Report Whole Body Methodology In Silico Academic Lab GenAI Open Dataset

A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation.

Jin H, Che H, He S, Chen H

•papers•Jul 3 2025

Despite the progress of radiology report generation (RRG), existing works face two challenges: 1) The performances in clinical efficacy are unsatisfactory, especially for lesion attributes description; 2) the generated text lacks explainability, making it difficult for radiologists to trust the results. To address the challenges, we focus on a trustworthy RRG model, which not only generates accurate descriptions of abnormalities, but also provides basis of its predictions. To this end, we propose a framework named chain of diagnosis (CoD), which maintains a chain of diagnostic process for clinically accurate and explainable RRG. It first generates question-answer (QA) pairs via diagnostic conversation to extract key findings, then prompts a large language model with QA diagnoses for accurate generation. To enhance explainability, a diagnosis grounding module is designed to match QA diagnoses and generated sentences, where the diagnoses act as a reference. Moreover, a lesion grounding module is designed to locate abnormalities in the image, further improving the working efficiency of radiologists. To facilitate label-efficient training, we propose an omni-supervised learning strategy with clinical consistency to leverage various types of annotations from different datasets. Our efforts lead to 1) an omni-labeled RRG dataset with QA pairs and lesion boxes; 2) a evaluation tool for assessing the accuracy of reports in describing lesion location and severity; 3) extensive experiments to demonstrate the effectiveness of CoD, where it outperforms both specialist and generalist models consistently on two RRG benchmarks and shows promising explainability by accurately grounding generated sentences to QA diagnoses and images.

Mixed Modality Report Generation Whole Body Methodology In Silico Academic Lab Open Dataset Open Code

Embedding-Based Federated Data Sharing via Differentially Private Conditional VAEs

Francesco Di Salvo, Hanh Huyen My Nguyen, Christian Ledig

•preprint•Jul 3 2025

Deep Learning (DL) has revolutionized medical imaging, yet its adoption is constrained by data scarcity and privacy regulations, limiting access to diverse datasets. Federated Learning (FL) enables decentralized training but suffers from high communication costs and is often restricted to a single downstream task, reducing flexibility. We propose a data-sharing method via Differentially Private (DP) generative models. By adopting foundation models, we extract compact, informative embeddings, reducing redundancy and lowering computational overhead. Clients collaboratively train a Differentially Private Conditional Variational Autoencoder (DP-CVAE) to model a global, privacy-aware data distribution, supporting diverse downstream tasks. Our approach, validated across multiple feature extractors, enhances privacy, scalability, and efficiency, outperforming traditional FL classifiers while ensuring differential privacy. Additionally, DP-CVAE produces higher-fidelity embeddings than DP-CGAN while requiring $5{\times}$ fewer parameters.

Mixed Modality Image Synthesis Whole Body Methodology In Silico Academic Lab Open Code Ethics

3D MedDiffusion: A 3D Medical Latent Diffusion Model for Controllable and High-quality Medical Image Generation.

Wang H, Liu Z, Sun K, Wang X, Shen D, Cui Z

•papers•Jul 2 2025

The generation of medical images presents significant challenges due to their high-resolution and three-dimensional nature. Existing methods often yield suboptimal performance in generating high-quality 3D medical images, and there is currently no universal generative framework for medical imaging. In this paper, we introduce a 3D Medical Latent Diffusion (3D MedDiffusion) model for controllable, high-quality 3D medical image generation. 3D MedDiffusion incorporates a novel, highly efficient Patch-Volume Autoencoder that compresses medical images into latent space through patch-wise encoding and recovers back into image space through volume-wise decoding. Additionally, we design a new noise estimator to capture both local details and global structural information during diffusion denoising process. 3D MedDiffusion can generate fine-detailed, high-resolution images (up to 512x512x512) and effectively adapt to various downstream tasks as it is trained on large-scale datasets covering CT and MRI modalities and different anatomical regions (from head to leg). Experimental results demonstrate that 3D MedDiffusion surpasses state-of-the-art methods in generative quality and exhibits strong generalizability across tasks such as sparse-view CT reconstruction, fast MRI reconstruction, and data augmentation for segmentationand classification. Source code and checkpoints are available at https://github.com/ShanghaiTech-IMPACT/3D-MedDiffusion.

Mixed Modality Image Synthesis Whole Body Methodology In Silico Academic Lab Open Code Benchmark SOTA

Deep learning-based time-of-flight (ToF) enhancement of non-ToF PET scans for different radiotracers.

Mehranian A, Wollenweber SD, Bradley KM, Fielding PA, Huellner M, Iagaru A, Dedja M, Colwell T, Kotasidis F, Johnsen R, Jansen FP, McGowan DR

•papers•Jul 1 2025

To evaluate a deep learning-based time-of-flight (DLToF) model trained to enhance the image quality of non-ToF PET images for different tracers, reconstructed using BSREM algorithm, towards ToF images. A 3D residual U-NET model was trained using 8 different tracers (FDG: 75% and non-FDG: 25%) from 11 sites from US, Europe and Asia. A total of 309 training and 33 validation datasets scanned on GE Discovery MI (DMI) ToF scanners were used for development of DLToF models of three strengths: low (L), medium (M) and high (H). The training and validation pairs consisted of target ToF and input non-ToF BSREM reconstructions using site-preferred regularisation parameters (beta values). The contrast and noise properties of each model were defined by adjusting the beta value of target ToF images. A total of 60 DMI datasets, consisting of a set of 4 tracers (18F-FDG, 18F-PSMA, 68Ga-PSMA, 68Ga-DOTATATE) and 15 exams each, were collected for testing and quantitative analysis of the models based on standardized uptake value (SUV) in regions of interest (ROI) placed in lesions, lungs and liver. Each dataset includes 5 image series: ToF and non-ToF BSREM and three DLToF images. The image series (300 in total) were blind scored on a 5-point Likert score by 4 readers based on lesion detectability, diagnostic confidence, and image noise/quality. In lesion SUVmax quantification with respect to ToF BSREM, DLToF-H achieved the best results among the three models by reducing the non-ToF BSREM errors from -39% to -6% for 18F-FDG (38 lesions); from -42% to -7% for 18F-PSMA (35 lesions); from -34% to -4% for 68Ga-PSMA (23 lesions) and from -34% to -12% for 68Ga-DOTATATE (32 lesions). Quantification results in liver and lung also showed ToF-like performance of DLToF models. Clinical reader resulted showed that DLToF-H results in an improved lesion detectability on average for all four radiotracers whereas DLToF-L achieved the highest scores for image quality (noise level). The results of DLToF-M however showed that this model results in the best trade-off between lesion detection and noise level and hence achieved the highest score for diagnostic confidence on average for all radiotracers. This study demonstrated that the DLToF models are suitable for both FDG and non-FDG tracers and could be utilized for digital BGO PET/CT scanners to provide an image quality and lesion detectability comparable and close to ToF.

PET Reconstruction Whole Body Retrospective Clinical In Silico Big Tech

18F-FDG dose reduction using deep learning-based PET reconstruction.

Akita R, Takauchi K, Ishibashi M, Kondo S, Ono S, Yokomachi K, Ochi Y, Kiguchi M, Mitani H, Nakamura Y, Awai K

•papers•Jul 1 2025

A deep learning-based image reconstruction (DLR) algorithm that can reduce the statistical noise has been developed for PET/CT imaging. It may reduce the administered dose of 18F-FDG and minimize radiation exposure while maintaining diagnostic quality. This retrospective study evaluated whether the injected 18F-FDG dose could be reduced by applying DLR to PET images. To this aim, we compared the quantitative image quality metrics and the false-positive rate between DLR with a reduced 18F-FDG dose and Ordered Subsets Expectation Maximization (OSEM) with a standard dose. This study included 90 oncology patients who underwent 18F-FDG PET/CT. They were divided into 3 groups (30 patients each): group A (18F-FDG dose per body weight [BW]: 2.00-2.99 MBq/kg; PET image reconstruction: DLR), group B (3.00-3.99 MBq/kg; DLR), and group C (standard dose group; 4.00-4.99 MBq/kg; OSEM). The evaluation was performed using the signal-to-noise ratio (SNR), target-to-background ratio (TBR), and false-positive rate. DLR yielded significantly higher SNRs in groups A and B than group C (p < 0.001). There was no significant difference in the TBR between groups A and C, and between groups B and C (p = 0.983 and 0.605, respectively). In group B, more than 80% of patients weighing less than 75 kg had at most one false positive result. In contrast, in group B patients weighing 75 kg or more, as well as in group A, less than 80% of patients had at most one false-positives. Our findings suggest that the injected 18F-FDG dose can be reduced to 3.0 MBq/kg in patients weighing less than 75 kg by applying DLR. Compared to the recommended dose in the European Association of Nuclear Medicine (EANM) guidelines for 90 s per bed position (4.7 MBq/kg), this represents a dose reduction of 36%. Further optimization of DLR algorithms is required to maintain comparable diagnostic accuracy in patients weighing 75 kg or more.

PET Reconstruction Whole Body Retrospective Clinical In Silico Academic Lab Benchmark SOTA

LangMamba: A Language-driven Mamba Framework for Low-dose CT Denoising with Vision-language Models

AI lesion tracking in PET/CT imaging: a proposal for a Siamese-based CNN pipeline applied to PSMA PET/CT scans.

Foundation models for radiology: fundamentals, applications, opportunities, challenges, risks, and prospects.

AI-enhanced patient-specific dosimetry in I-131 planar imaging with a single oblique view.

A Pan-Organ Vision-Language Model for Generalizable 3D CT Representations.

A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation.

Embedding-Based Federated Data Sharing via Differentially Private Conditional VAEs

3D MedDiffusion: A 3D Medical Latent Diffusion Model for Controllable and High-quality Medical Image Generation.

Deep learning-based time-of-flight (ToF) enhancement of non-ToF PET scans for different radiotracers.

<sup>18</sup>F-FDG dose reduction using deep learning-based PET reconstruction.

Ready to Sharpen Your Edge?