RoentMod: A Synthetic Chest X-Ray Modification Model to Identify and Correct Image Interpretation Model Shortcuts

Lauren H. Cooke, Matthias Jung, Jan M. Brendel, Nora M. Kerkovits, Borek Foldyna, Michael T. Lu, Vineet K. Raghu

arXiv preprint, Sep 10, 2025
Chest radiographs (CXRs) are among the most common tests in medicine. Automated image interpretation may reduce radiologists' workload and expand access to diagnostic expertise. Deep learning multi-task and foundation models have shown strong performance for CXR interpretation but are vulnerable to shortcut learning, where models rely on spurious and off-target correlations rather than clinically relevant features to make decisions. We introduce RoentMod, a counterfactual image editing framework that generates anatomically realistic CXRs with user-specified, synthetic pathology while preserving unrelated anatomical features of the original scan. RoentMod combines an open-source medical image generator (RoentGen) with an image-to-image modification model without requiring retraining. In reader studies with board-certified radiologists and radiology residents, RoentMod-produced images appeared realistic in 93% of cases, correctly incorporated the specified finding in 89-99% of cases, and preserved native anatomy comparable to real follow-up CXRs. Using RoentMod, we demonstrate that state-of-the-art multi-task and foundation models frequently exploit off-target pathology as shortcuts, limiting their specificity. Incorporating RoentMod-generated counterfactual images during training mitigated this vulnerability, improving model discrimination across multiple pathologies by 3-19% AUC in internal validation and by 1-11% for 5 out of 6 tested pathologies in external testing. These findings establish RoentMod as a broadly applicable tool for probing and correcting shortcut learning in medical AI. By enabling controlled counterfactual interventions, RoentMod enhances the robustness and interpretability of CXR interpretation models and provides a generalizable strategy for improving foundation models in medical imaging.
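
As a concrete illustration of how counterfactuals like these can be folded into training, here is a minimal PyTorch sketch of shortcut-mitigating augmentation. `roentmod_edit` is a hypothetical stand-in for the paper's editing model (its real interface is not given in the abstract): it inserts a named off-target finding while leaving the target label unchanged.

```python
import torch.nn.functional as F

def counterfactual_training_step(model, optimizer, images, labels,
                                 roentmod_edit, off_target_findings):
    """One training step with counterfactual augmentation (sketch only).

    roentmod_edit(images, finding) is assumed to return copies of `images`
    with the named off-target finding synthetically inserted, leaving the
    rest of the anatomy (and hence the target-pathology label) intact.
    """
    model.train()
    # Loss on the original images.
    loss = F.binary_cross_entropy_with_logits(model(images), labels)
    # Counterfactuals: adding an off-target finding must not change the
    # target label, which penalizes reliance on shortcut features.
    for finding in off_target_findings:
        cf_images = roentmod_edit(images, finding)
        loss = loss + F.binary_cross_entropy_with_logits(model(cf_images), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```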

LD-ViCE: Latent Diffusion Model for Video Counterfactual Explanations

Payal Varshney, Adriano Lucieri, Christoph Balada, Sheraz Ahmed, Andreas Dengel

arXiv preprint, Sep 10, 2025
Video-based AI systems are increasingly adopted in safety-critical domains such as autonomous driving and healthcare. However, interpreting their decisions remains challenging due to the inherent spatiotemporal complexity of video data and the opacity of deep learning models. Existing explanation techniques often suffer from limited temporal coherence, insufficient robustness, and a lack of actionable causal insights. Current counterfactual explanation methods typically do not incorporate guidance from the target model, reducing semantic fidelity and practical utility. We introduce Latent Diffusion for Video Counterfactual Explanations (LD-ViCE), a novel framework designed to explain the behavior of video-based AI models. Compared to previous approaches, LD-ViCE reduces the computational cost of generating explanations by operating in latent space using a state-of-the-art diffusion model, while producing realistic and interpretable counterfactuals through an additional refinement step. Our experiments demonstrate the effectiveness of LD-ViCE across three diverse video datasets: EchoNet-Dynamic (cardiac ultrasound), FERV39k (facial expression), and Something-Something V2 (action recognition). LD-ViCE outperforms a recent state-of-the-art method, achieving an increase in R² score of up to 68% while reducing inference time by half. Qualitative analysis confirms that LD-ViCE generates semantically meaningful and temporally coherent explanations, offering valuable insights into the target model's behavior. LD-ViCE represents a valuable step toward the trustworthy deployment of AI in safety-critical domains.
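
The abstract notes that LD-ViCE, unlike prior methods, incorporates guidance from the target model, though the exact mechanism is not described there. The sketch below shows generic classifier guidance applied to a single latent denoising step, which is one standard way to realize model-guided counterfactual generation; the `denoiser` and `classifier` interfaces are assumptions, not the paper's API.

```python
import torch

def guided_latent_step(denoiser, classifier, z_t, t, target_class, scale=1.0):
    """One classifier-guided denoising step in latent space (generic sketch).

    denoiser(z_t, t) predicts the noise residual; classifier(z_t) returns
    class logits on latents. Both interfaces are hypothetical.
    """
    z_t = z_t.detach().requires_grad_(True)
    # Gradient of the target-class log-probability w.r.t. the latent.
    log_prob = torch.log_softmax(classifier(z_t), dim=-1)[:, target_class].sum()
    grad = torch.autograd.grad(log_prob, z_t)[0]
    with torch.no_grad():
        eps = denoiser(z_t, t)
        # Shift the predicted noise toward the target class; the usual
        # sqrt(1 - alpha_bar_t) factor of classifier guidance is folded
        # into `scale` for brevity.
        eps_guided = eps - scale * grad
    return eps_guided
```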

A comprehensive review of techniques, algorithms, advancements, challenges, and clinical applications of multi-modal medical image fusion for improved diagnosis.

Zubair M, Hussain M, Albashrawi MA, Bendechache M, Owais M

PubMed paper, Sep 9, 2025
Multi-modal medical image fusion (MMIF) is increasingly recognized as an essential technique for enhancing diagnostic precision and facilitating effective clinical decision-making within computer-aided diagnosis systems. MMIF combines data from X-ray, MRI, CT, PET, SPECT, and ultrasound to create detailed, clinically useful images of patient anatomy and pathology. These integrated representations significantly advance diagnostic accuracy, lesion detection, and segmentation. This comprehensive review meticulously surveys the evolution, methodologies, algorithms, current advancements, and clinical applications of MMIF. We present a critical comparative analysis of traditional fusion approaches, including pixel-, feature-, and decision-level methods, and delve into recent advancements driven by deep learning, generative models, and transformer-based architectures, highlighting differences in robustness, computational efficiency, and interpretability between conventional and contemporary techniques. The article addresses extensive clinical applications across oncology, neurology, and cardiology, demonstrating MMIF's vital role in precision medicine through improved patient-specific therapeutic outcomes. Moreover, the review thoroughly investigates the persistent challenges affecting MMIF's broad adoption, including issues related to data privacy, heterogeneity, computational complexity, interpretability of AI-driven algorithms, and integration within clinical workflows. It also identifies significant future research avenues, such as the integration of explainable AI, adoption of privacy-preserving federated learning frameworks, development of real-time fusion systems, and standardization efforts for regulatory compliance. This review organizes key knowledge, outlines challenges, and highlights opportunities, guiding researchers, clinicians, and developers in advancing MMIF for routine clinical use and promoting personalized healthcare. To support further research, we provide a shared GitHub repository that includes popular multi-modal medical imaging datasets along with recent models.
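
For readers unfamiliar with the taxonomy, the sketch below illustrates the simplest of the three levels the review compares: a classic pixel-level wavelet fusion of two co-registered slices (e.g., MRI and PET). It is a textbook baseline rather than a method from the review, and assumes both inputs are same-sized arrays normalized to [0, 1].

```python
import numpy as np
import pywt  # PyWavelets

def pixel_level_fusion(img_a, img_b, wavelet="db2", level=2):
    """Classic pixel-level wavelet fusion of two co-registered images.

    The low-frequency (approximation) bands are averaged, while in each
    high-frequency band the coefficient with the larger magnitude is
    kept, preserving the sharper detail from either modality.
    """
    ca = pywt.wavedec2(img_a, wavelet, level=level)
    cb = pywt.wavedec2(img_b, wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]  # approximation band: average
    for da, db in zip(ca[1:], cb[1:]):  # detail bands: max-magnitude rule
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(da, db)))
    return pywt.waverec2(fused, wavelet)
```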

New imaging techniques and trends in radiology.

Kantarcı M, Aydın S, Oğul H, Kızılgöz V

PubMed paper, Sep 8, 2025
Radiography is a field of medicine inherently intertwined with technology. The dependency on technology is very high for obtaining images in ultrasound (US), computed tomography (CT), and magnetic resonance imaging (MRI). Although radiation dose reduction is not applicable to US and MRI, advancements in technology have made it possible in CT, with ongoing studies aimed at further optimization. The resolution and diagnostic quality of images obtained through advancements in each modality are steadily improving. Additionally, technological progress has significantly shortened acquisition times for CT and MRI. Artificial intelligence (AI), which is becoming increasingly widespread worldwide, has also been incorporated into radiography. This technology can produce more accurate and reproducible results in US examinations. Machine learning offers great potential for improving image quality, creating more distinct and useful images, and even developing new US imaging modalities. Furthermore, AI technologies are increasingly prevalent in CT and MRI for image evaluation, image generation, and enhanced image quality.

RetinaGuard: Obfuscating Retinal Age in Fundus Images for Biometric Privacy Preserving

Zhengquan Luo, Chi Liu, Dongfu Xiao, Zhen Yu, Yueye Wang, Tianqing Zhu

arXiv preprint, Sep 7, 2025
The integration of AI with medical images enables the extraction of implicit image-derived biomarkers for precise health assessment. Retinal age, a biomarker predicted from fundus images, has recently proven to be a predictor of systemic disease risks, behavioral patterns, aging trajectory, and even mortality. However, the capability to infer such sensitive biometric data raises significant privacy risks, where unauthorized use of fundus images could lead to bioinformation leakage, breaching individual privacy. In response, we formulate a new research problem of biometric privacy associated with medical images and propose RetinaGuard, a novel privacy-enhancing framework that employs a feature-level generative adversarial masking mechanism to obscure retinal age while preserving image visual quality and disease diagnostic utility. The framework further utilizes a novel multiple-to-one knowledge distillation strategy incorporating a retinal foundation model and diverse surrogate age encoders to enable a universal defense against black-box age prediction models. Comprehensive evaluations confirm that RetinaGuard successfully obfuscates retinal age prediction with minimal impact on image quality and pathological feature representation. RetinaGuard is also flexible for extension to other medical image-derived biomarkers.
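
The abstract describes the masking objective only at a high level. A plausible shape for such a feature-level adversarial objective, written as a hedged PyTorch sketch, appears below; `masker` and the frozen `surrogates` list are assumed interfaces, and the weighting and distillation details of the actual framework are not reproduced.

```python
import torch.nn.functional as F

def masking_loss(masker, surrogates, images, true_age, alpha=1.0):
    """Sketch of a feature-level adversarial masking objective.

    masker(images) returns a protected image; surrogates is a list of
    frozen age encoders standing in for the paper's diverse surrogate
    models. The masker is pushed to (i) drive every surrogate's age
    estimate away from the true age and (ii) stay visually close to the
    original image. Details beyond this general shape are assumptions.
    """
    protected = masker(images)
    # Adversarial term: reward large errors in the surrogates' age estimates.
    adv = sum(-F.l1_loss(s(protected).squeeze(-1), true_age)
              for s in surrogates) / len(surrogates)
    # Fidelity term: keep the protected image close to the input.
    fid = F.mse_loss(protected, images)
    return adv + alpha * fid
```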

Imagining Alternatives: Towards High-Resolution 3D Counterfactual Medical Image Generation via Language Guidance

Mohamed Mohamed, Brennan Nichyporuk, Douglas L. Arnold, Tal Arbel

arXiv preprint, Sep 7, 2025
Vision-language models have demonstrated impressive capabilities in generating 2D images under various conditions; however, this impressive performance in 2D is largely enabled by extensive, readily available pretrained foundation models. Critically, comparable pretrained foundation models do not exist for 3D, significantly limiting progress in this domain. As a result, the potential of vision-language models to produce high-resolution 3D counterfactual medical images conditioned solely on natural language descriptions remains completely unexplored. Addressing this gap would enable powerful clinical and research applications, such as personalized counterfactual explanations, simulation of disease progression scenarios, and enhanced medical training by visualizing hypothetical medical conditions in realistic detail. Our work takes a meaningful step toward addressing this challenge by introducing a framework capable of generating high-resolution 3D counterfactual medical images of synthesized patients guided by free-form language prompts. We adapt state-of-the-art 3D diffusion models with enhancements from Simple Diffusion and incorporate augmented conditioning to improve text alignment and image quality. To our knowledge, this represents the first demonstration of a language-guided native-3D diffusion model applied specifically to neurological imaging data, where faithful three-dimensional modeling is essential to represent the brain's structure. Through results on two distinct neurological MRI datasets, our framework successfully simulates varying counterfactual lesion loads in Multiple Sclerosis (MS) and cognitive states in Alzheimer's disease, generating high-quality images while preserving subject fidelity. Our results lay the groundwork for prompt-driven disease progression analysis within 3D medical imaging.
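
To make the generation step concrete, here is a generic text-conditioned DDPM sampling loop over a 3D volume. The `denoiser(x_t, t, text_emb)` interface is hypothetical, and the paper's Simple Diffusion enhancements and augmented conditioning are not reproduced; this is only the standard ancestral-sampling skeleton such a model would plug into.

```python
import torch

@torch.no_grad()
def sample_3d_volume(denoiser, text_emb, shape, betas):
    """Generic text-conditioned DDPM sampling for a 3D volume (sketch).

    denoiser(x_t, t, text_emb) is a hypothetical 3D noise-prediction
    network conditioned on a language embedding (e.g., of a prompt
    describing lesion load); betas is the 1-D noise schedule.
    """
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(shape)  # (B, C, D, H, W) volume of pure noise
    for t in reversed(range(len(betas))):
        eps = denoiser(x, torch.full((shape[0],), t), text_emb)
        # Standard DDPM posterior mean computed from the predicted noise.
        coef = betas[t] / torch.sqrt(1.0 - alpha_bar[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x
```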

Generation of realistic cardiac ultrasound sequences with ground truth motion and speckle decorrelation

Thierry Judge, Nicolas Duchateau, Khuram Faraz, Pierre-Marc Jodoin, Olivier Bernard

arXiv preprint, Sep 5, 2025
Simulated ultrasound image sequences are key for training and validating machine learning algorithms for left ventricular strain estimation. Several simulation pipelines have been proposed to generate sequences with corresponding ground truth motion, but they suffer from limited realism as they do not consider speckle decorrelation. In this work, we address this limitation by proposing an improved simulation framework that explicitly accounts for speckle decorrelation. Our method builds on an existing ultrasound simulation pipeline by incorporating a dynamic model of speckle variation. Starting from real ultrasound sequences and myocardial segmentations, we generate meshes that guide image formation. Instead of applying a fixed ratio of myocardial and background scatterers, we introduce a coherence map that adapts locally over time. This map is derived from correlation values measured directly from the real ultrasound data, ensuring that simulated sequences capture the characteristic temporal changes observed in practice. We evaluated the realism of our approach using ultrasound data from 98 patients in the CAMUS database. Performance was assessed by comparing correlation curves from real and simulated images. The proposed method achieved lower mean absolute error compared to the baseline pipeline, indicating that it more faithfully reproduces the decorrelation behavior seen in clinical data.
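
The paper's coherence map is derived from correlation values measured on real sequences; its exact estimator is not specified in the abstract. As an illustration, one common way to measure such per-pixel temporal coherence is local normalized cross-correlation between consecutive frames, sketched below with NumPy/SciPy.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def coherence_map(frame_a, frame_b, window=9):
    """Local normalized cross-correlation between consecutive frames.

    Values near 1 indicate stable speckle; lower values indicate
    decorrelation. This estimator is illustrative, not necessarily the
    one used in the paper.
    """
    mu_a = uniform_filter(frame_a, window)
    mu_b = uniform_filter(frame_b, window)
    cov = uniform_filter(frame_a * frame_b, window) - mu_a * mu_b
    var_a = uniform_filter(frame_a ** 2, window) - mu_a ** 2
    var_b = uniform_filter(frame_b ** 2, window) - mu_b ** 2
    # Clip the denominator to avoid division by zero in flat regions.
    return cov / np.sqrt(np.clip(var_a * var_b, 1e-12, None))
```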

AI-based synthetic simulation CT generation from diagnostic CT for simulation-free workflow of spinal palliative radiotherapy

Han, Y., Hanania, A. N., Siddiqui, Z. A., Ugarte, V., Zhou, B., Mohamed, A. S. R., Pathak, P., Hamstra, D. A., Sun, B.

medRxiv preprint, Sep 5, 2025
Purpose/Objective: Current radiotherapy (RT) planning workflows rely on a pre-treatment simulation CT (sCT), which can significantly delay treatment initiation, particularly in resource-constrained settings. While diagnostic CT (dCT) offers a potential alternative for expedited planning, inherent geometric discrepancies from sCT in patient positioning and table curvature limit its direct use for accurate RT planning. This study presents a novel AI-based method designed to overcome these limitations by generating synthetic simulation CT (ssCT) directly from standard dCT for spinal palliative RT, aiming to eliminate the need for sCT and accelerate the treatment workflow.

Materials/Methods: ssCTs were generated using two neural network models that adjust spine position and correct table curvature. The networks use a three-layer structure with ReLU activations, optimized by Adam with an MSE loss and MAE as the evaluation metric. The models were trained on paired dCT and sCT images from 30 patients undergoing palliative spine radiotherapy at a safety-net hospital, with 22 cases used for training and 8 for testing. To explore institutional dependence, the models were also tested on 7 patients from an academic medical center (AMC). To evaluate ssCT accuracy, both ssCT and dCT were aligned with sCT using the same frame-of-reference rigid registration on bone windows. Dosimetric differences were assessed by comparing dCT vs. sCT and ssCT vs. sCT, quantifying deviations in dose-volume histogram (DVH) metrics, including Dmean, Dmax, D95, D99, V100, V107, and root-mean-square (RMS) differences. Image and plan quality were assessed by four radiation oncologists using a Likert score. The Wilcoxon signed-rank test was used to determine whether there was a significant difference between the two methods.

Results: For the safety-net hospital cases, the generated ssCT demonstrated significantly improved geometric and dosimetric accuracy compared to dCT. ssCT reduced the mean difference in key dosimetric parameters (e.g., the Dmean difference decreased from 2.0% for dCT vs. sCT to 0.57% for ssCT vs. sCT, a significant improvement under the Wilcoxon signed-rank test) and achieved a significant reduction in the RMS difference of DVH curves (from 6.4% to 2.2%). Furthermore, physician evaluations showed that ssCT was consistently rated as significantly superior for treatment planning (mean scores improving from "Acceptable" for dCT to "Good to Perfect" for ssCT), reflecting improved confidence in target and tissue positioning. In the AMC cohort, where technologists already apply meticulous pre-scan alignment, ssCT still yielded statistically significant, though smaller, improvements in several dosimetric endpoints and in observer ratings.

Conclusion: Our AI-driven approach successfully generates ssCT from dCT with geometric and dosimetric accuracy comparable to sCT for spinal palliative RT planning. By specifically addressing critical discrepancies such as spine position and table curvature, this method offers a robust way to bypass dedicated sCT simulations. This advancement has the potential to significantly streamline the RT workflow, reduce treatment uncertainties, and accelerate time to treatment, offering a highly promising solution for improving access to timely and accurate radiotherapy, especially in limited-resource environments.
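
The architecture stated above (three layers, ReLU, Adam, MSE loss, MAE metric) is concrete enough to sketch in PyTorch. What the networks map from and to (e.g., dCT spine landmark coordinates to their corrected positions) is not stated in the abstract, so the input/output dimensions below are placeholders.

```python
import torch
import torch.nn as nn

class CorrectionNet(nn.Module):
    """Three-layer ReLU network matching the abstract's description.

    The in/out dimensions are placeholders; the abstract does not state
    what quantities the networks map between.
    """
    def __init__(self, in_dim=3, hidden=64, out_dim=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

def train(model, loader, epochs=100, lr=1e-3):
    """Adam + MSE training with MAE as the reported metric, per the abstract."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            loss = nn.functional.mse_loss(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    with torch.no_grad():
        mae = sum(nn.functional.l1_loss(model(x), y).item()
                  for x, y in loader) / len(loader)
    return mae
```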

Diffusion Generative Models Meet Compressed Sensing, with Applications to Image Data and Financial Time Series

Zhengyi Guo, Jiatu Li, Wenpin Tang, David D. Yao

arXiv preprint, Sep 4, 2025
This paper develops dimension reduction techniques for accelerating diffusion model inference in the context of synthetic data generation. The idea is to integrate compressed sensing into diffusion models: (i) compress the data into a latent space, (ii) train a diffusion model in the latent space, and (iii) apply a compressed sensing algorithm to the samples generated in the latent space, improving the efficiency of both model training and inference. Under suitable sparsity assumptions on the data, the proposed algorithm is proven to enjoy faster convergence by combining diffusion model inference with sparse recovery. As a byproduct, we obtain an optimal value for the latent space dimension. We also conduct numerical experiments on a range of datasets, including image data (handwritten digits, medical images, and climate data) and financial time series for stress testing.
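
Step (iii) of the pipeline, sparse recovery from the latent sample, can be carried out with any standard compressed-sensing solver; the abstract does not name one. Below is plain ISTA as a representative choice, with a hypothetical sensing matrix `A`.

```python
import numpy as np

def ista(A, y, lam=0.1, steps=200):
    """Iterative soft-thresholding (ISTA) for min ||Ax - y||^2 / 2 + lam * ||x||_1.

    Here y plays the role of a latent sample produced by the diffusion
    model and A the sensing matrix; both are assumptions, since the
    abstract does not specify the recovery algorithm.
    """
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - y)        # gradient of the quadratic term
        z = x - grad / L                # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x
```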