
Are Vision Foundation Models Ready for Out-of-the-Box Medical Image Registration?

Hanxue Gu, Yaqian Chen, Nicholas Konz, Qihang Li, Maciej A. Mazurowski

arXiv preprint · Jul 15 2025
Foundation models, pre-trained on large image datasets and capable of capturing rich feature representations, have recently shown potential for zero-shot image registration. However, their performance has mostly been tested on rigid or less complex structures, such as the brain or abdominal organs, and it remains unclear whether these models can handle more challenging, deformable anatomy. Breast MRI registration is particularly difficult due to significant anatomical variation between patients, deformation caused by patient positioning, and the thin, complex internal structure of fibroglandular tissue, where accurate alignment is crucial. Whether foundation model-based registration algorithms can address this level of complexity remains an open question. In this study, we provide a comprehensive evaluation of foundation model-based registration algorithms for breast MRI. We assess five pre-trained encoders, including DINO-v2, SAM, MedSAM, SSLSAM, and MedCLIP, across four key breast registration tasks that capture variation across acquisition years and dates, sequences, modalities, and patient disease status (lesion versus no lesion). Our results show that foundation model-based algorithms such as SAM outperform traditional registration baselines for overall breast alignment, especially under large domain shifts, but struggle to capture fine details of fibroglandular tissue. Interestingly, additional pre-training or fine-tuning on medical or breast-specific images, as in MedSAM and SSLSAM, does not improve registration performance and may even decrease it in some cases. Further work is needed to understand how domain-specific training influences registration and to explore targeted strategies that improve both global alignment and fine structure accuracy. We publicly release our code at https://github.com/mazurowski-lab/Foundation-based-reg.
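
For intuition, a minimal sketch of the zero-shot idea the paper evaluates: freeze a pre-trained encoder and optimize a dense displacement field so that encoder features of the warped moving image match those of the fixed image. The encoder interface (returning a spatial feature map), the smoothness weight, and the optimizer settings are illustrative assumptions, not the paper's protocol.

```python
import torch
import torch.nn.functional as F

def feature_similarity_registration(encoder, fixed, moving, iters=200, lr=1e-2):
    """Optimize a dense 2D displacement field so that encoder features of the
    warped moving image match those of the fixed image. `encoder` is any
    frozen pre-trained backbone assumed to return a spatial feature map."""
    B, C, H, W = fixed.shape
    # Displacement field initialized to zero (identity transform).
    disp = torch.zeros(B, 2, H, W, requires_grad=True)
    opt = torch.optim.Adam([disp], lr=lr)

    # Base sampling grid in [-1, 1] as expected by grid_sample.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    base_grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(B, -1, -1, -1)

    with torch.no_grad():
        feat_fixed = encoder(fixed)

    for _ in range(iters):
        opt.zero_grad()
        grid = base_grid + disp.permute(0, 2, 3, 1)
        warped = F.grid_sample(moving, grid, align_corners=True)
        loss = F.mse_loss(encoder(warped), feat_fixed)
        # Total-variation-style smoothness penalty on the displacement field.
        loss = loss + 0.1 * (disp.diff(dim=2).abs().mean()
                             + disp.diff(dim=3).abs().mean())
        loss.backward()
        opt.step()
    return disp.detach()
```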

Latent Space Consistency for Sparse-View CT Reconstruction

Duoyou Chen, Yunqing Chen, Can Zhang, Zhou Wang, Cheng Chen, Ruoxiu Xiao

arXiv preprint · Jul 15 2025
Computed Tomography (CT) is a widely utilized imaging modality in clinical settings. Using densely acquired rotational X-ray arrays, CT can capture 3D spatial features. However, it is confronted with challenges such as significant time consumption and high radiation exposure. CT reconstruction methods based on sparse-view X-ray images have garnered substantial attention from researchers as they present a means to mitigate costs and risks. In recent years, diffusion models, particularly the Latent Diffusion Model (LDM), have demonstrated promising potential in the domain of 3D CT reconstruction. Nonetheless, due to the substantial differences between the 2D latent representation of X-ray modalities and the 3D latent representation of CT modalities, the vanilla LDM is incapable of achieving effective alignment within the latent space. To address this issue, we propose the Consistent Latent Space Diffusion Model (CLS-DM), which incorporates cross-modal feature contrastive learning to efficiently extract latent 3D information from 2D X-ray images and achieve latent space alignment between modalities. Experimental results indicate that CLS-DM outperforms classical and state-of-the-art generative models in terms of standard voxel-level metrics (PSNR, SSIM) on the LIDC-IDRI and CTSpine1K datasets. This methodology not only aids in enhancing the effectiveness and economic viability of sparse X-ray reconstructed CT but can also be generalized to other cross-modal transformation tasks, such as text-to-image synthesis. We have made our code publicly available at https://anonymous.4open.science/r/CLS-DM-50D6/ to facilitate further research and applications in other domains.
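
The cross-modal contrastive component can be pictured with a standard symmetric InfoNCE objective over paired X-ray/CT latents; the pooled-embedding interface and temperature below are assumptions for illustration, not CLS-DM's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_modal_infonce(z_xray, z_ct, temperature=0.07):
    """Symmetric InfoNCE loss pulling paired X-ray / CT latent vectors
    together and pushing unpaired ones apart. `z_xray` and `z_ct` are
    (batch, dim) embeddings pooled from the respective encoders."""
    z_xray = F.normalize(z_xray, dim=-1)
    z_ct = F.normalize(z_ct, dim=-1)
    logits = z_xray @ z_ct.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(z_xray.size(0))     # diagonal entries = positive pairs
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```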

Semantically Informed Salient Regions Guided Radiology Report Generation

Zeyi Hou, Zeqiang Wei, Ruixin Yan, Ning Lang, Xiuzhuang Zhou

arXiv preprint · Jul 15 2025
Recent advances in automated radiology report generation from chest X-rays using deep learning algorithms have the potential to significantly reduce the arduous workload of radiologists. However, due to the inherent massive data bias in radiology images, where abnormalities are typically subtle and sparsely distributed, existing methods often produce fluent yet medically inaccurate reports, limiting their applicability in clinical practice. To address this issue effectively, we propose a Semantically Informed Salient Regions-guided (SISRNet) report generation method. Specifically, our approach explicitly identifies salient regions with medically critical characteristics using fine-grained cross-modal semantics. Then, SISRNet systematically focuses on these high-information regions during both image modeling and report generation, effectively capturing subtle abnormal findings, mitigating the negative impact of data bias, and ultimately generating clinically accurate reports. Compared to its peers, SISRNet demonstrates superior performance on widely used IU-Xray and MIMIC-CXR datasets.
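
As a rough illustration of selecting salient regions via cross-modal semantics (a hypothetical stand-in, not SISRNet's actual module), one can rank region features by similarity to a report-level text embedding and keep the top-k for decoding:

```python
import torch
import torch.nn.functional as F

def select_salient_regions(region_feats, text_emb, top_k=8):
    """Score image regions by cosine similarity to a text embedding and keep
    the top-k as 'salient' regions for downstream report decoding.
    region_feats: (num_regions, dim); text_emb: (dim,)."""
    sims = F.cosine_similarity(region_feats, text_emb.unsqueeze(0), dim=-1)
    top = sims.topk(top_k).indices
    return region_feats[top], top
```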

Human-Guided Shade Artifact Suppression in CBCT-to-MDCT Translation via Schrödinger Bridge with Conditional Diffusion

Sung Ho Kang, Hyun-Cheol Park

arXiv preprint · Jul 15 2025
We present a novel framework for CBCT-to-MDCT translation, grounded in the Schrödinger Bridge (SB) formulation, which integrates GAN-derived priors with human-guided conditional diffusion. Unlike conventional GANs or diffusion models, our approach explicitly enforces boundary consistency between CBCT inputs and pseudo targets, ensuring both anatomical fidelity and perceptual controllability. Binary human feedback is incorporated via classifier-free guidance (CFG), effectively steering the generative process toward clinically preferred outcomes. Through iterative refinement and tournament-based preference selection, the model internalizes human preferences without relying on a reward model. Subtraction image visualizations reveal that the proposed method selectively attenuates shade artifacts in key anatomical regions while preserving fine structural detail. Quantitative evaluations further demonstrate superior performance across RMSE, SSIM, LPIPS, and Dice metrics on clinical datasets -- outperforming prior GAN- and fine-tuning-based feedback methods -- while requiring only 10 sampling steps. These findings underscore the effectiveness and efficiency of our framework for real-time, preference-aligned medical image translation.
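
The classifier-free guidance step follows the standard recipe of blending unconditional and conditional noise predictions with a guidance weight; a generic sketch (the model signature and null-conditioning convention are assumed, not taken from the paper):

```python
import torch

def cfg_noise_prediction(model, x_t, t, cond, guidance_scale=3.0):
    """Standard classifier-free guidance: blend the unconditional and
    conditional noise predictions. `cond=None` is assumed to signal the
    unconditional (null) branch of the model."""
    eps_uncond = model(x_t, t, cond=None)
    eps_cond = model(x_t, t, cond=cond)
    # eps = eps_uncond + w * (eps_cond - eps_uncond)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)
```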

Assessing MRI-based Artificial Intelligence Models for Preoperative Prediction of Microvascular Invasion in Hepatocellular Carcinoma: A Systematic Review and Meta-analysis.

Han X, Shan L, Xu R, Zhou J, Lu M

PubMed paper · Jul 15 2025
To evaluate the performance of magnetic resonance imaging (MRI)-based artificial intelligence (AI) in the preoperative prediction of microvascular invasion (MVI) in patients with hepatocellular carcinoma (HCC). A systematic search of PubMed, Embase, and Web of Science was conducted up to May 2025, following PRISMA guidelines. Studies using MRI-based AI models with histopathologically confirmed MVI were included. Study quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool and the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework. Statistical synthesis used bivariate random-effects models. Twenty-nine studies were included, totaling 2838 internal and 1161 external validation cases. Pooled internal validation showed a sensitivity of 0.81 (95% CI: 0.76-0.85), specificity of 0.82 (95% CI: 0.78-0.85), diagnostic odds ratio (DOR) of 19.33 (95% CI: 13.15-28.42), and area under the curve (AUC) of 0.88 (95% CI: 0.85-0.91). External validation yielded a comparable AUC of 0.85. Traditional machine learning methods achieved higher sensitivity than deep learning approaches in both internal and external validation cohorts (both P < 0.05). Studies incorporating both radiomics and clinical features demonstrated superior sensitivity and specificity compared to radiomics-only models (P < 0.01). MRI-based AI demonstrates high performance for preoperative prediction of MVI in HCC, particularly for MRI-based models that combine multimodal imaging and clinical variables. However, substantial heterogeneity and low GRADE levels may affect the strength of the evidence, highlighting the need for methodological standardization and multicenter prospective validation to ensure clinical applicability.
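
The pooled diagnostic odds ratio is consistent with the pooled sensitivity and specificity via the identity DOR = [sens/(1-sens)] / [(1-spec)/spec]; a quick check on the reported point estimates:

```python
# Sanity check of the pooled diagnostic odds ratio from the point estimates:
# DOR = [sens / (1 - sens)] / [(1 - spec) / spec].
sens, spec = 0.81, 0.82
dor = (sens / (1 - sens)) / ((1 - spec) / spec)
print(round(dor, 2))  # ~19.42, close to the pooled 19.33 (CI rounding aside)
```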

Multimodal Radiopathomics Signature for Prediction of Response to Immunotherapy-based Combination Therapy in Gastric Cancer Using Interpretable Machine Learning.

Huang W, Wang X, Zhong R, Li Z, Zhou K, Lyu Q, Han JE, Chen T, Islam MT, Yuan Q, Ahmad MU, Chen S, Chen C, Huang J, Xie J, Shen Y, Xiong W, Shen L, Xu Y, Yang F, Xu Z, Li G, Jiang Y

PubMed paper · Jul 15 2025
Immunotherapy has become a cornerstone in the treatment of advanced gastric cancer (GC). However, identifying reliable predictive biomarkers remains a considerable challenge. This study demonstrates the potential of integrating multimodal baseline data, including computed tomography scan images and digital H&E-stained pathology images, with biological interpretation to predict the response to immunotherapy-based combination therapy using a multicenter cohort of 298 GC patients. By employing seven machine learning approaches, we developed a radiopathomics signature (RPS) to predict treatment response and stratify prognostic risk in GC. The RPS demonstrated areas under the receiver-operating-characteristic curve (AUCs) of 0.978 (95% CI, 0.950-1.000), 0.863 (95% CI, 0.744-0.982), and 0.822 (95% CI, 0.668-0.975) in the training, internal validation, and external validation cohorts, respectively, outperforming conventional biomarkers such as CPS, MSI-H, EBV, and HER-2. Kaplan-Meier analysis revealed significant differences in survival between high- and low-risk groups, especially in advanced-stage and non-surgical patients. Additionally, genetic analyses revealed that the RPS correlates with enhanced immune regulation pathways and increased infiltration of memory B cells. The interpretable RPS provides accurate predictions of treatment response and prognosis in GC and holds potential for guiding more precise, patient-specific treatment strategies while offering insights into immune-related mechanisms.
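
For orientation, the simplest form of such a fusion model concatenates radiomics and pathomics feature vectors and fits a regularized linear classifier; the published RPS ensembles seven learners, so this sketch (with hypothetical array inputs) is only a baseline of the idea:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_radiopathomics_signature(radiomics, pathomics, labels):
    """Minimal fusion baseline: concatenate CT radiomics and H&E pathomics
    feature matrices (n_samples x n_features each) and fit a regularized
    linear model on the response labels."""
    X = np.concatenate([radiomics, pathomics], axis=1)
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    return model.fit(X, labels)
```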

Learning homeomorphic image registration via conformal-invariant hyperelastic regularisation.

Zou J, Debroux N, Liu L, Qin J, Schönlieb CB, Aviles-Rivero AI

PubMed paper · Jul 15 2025
Deformable image registration is a fundamental task in medical image analysis and plays a crucial role in a wide range of clinical applications. Recently, deep learning-based approaches have been widely studied for deformable medical image registration and have achieved promising results. However, existing deep learning registration techniques do not theoretically guarantee topology-preserving transformations, a key property for preserving anatomical structures and achieving plausible transformations that can be used in real clinical settings. We propose a novel framework for deformable image registration. Firstly, we introduce a novel regulariser based on conformal-invariant properties in a nonlinear elasticity setting. Our regulariser enforces the deformation field to be smooth, invertible, and orientation-preserving. More importantly, we strictly guarantee topology preservation, yielding a clinically meaningful registration. Secondly, we boost the performance of our regulariser through coordinate MLPs, in which the to-be-registered images are viewed as continuously differentiable entities. We demonstrate, through numerical and visual experiments, that our framework outperforms current techniques for image registration.
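
A coordinate MLP of the kind mentioned here maps a spatial coordinate to a displacement, so the deformation field is continuous and differentiable everywhere; a minimal sketch (positional/Fourier encoding of the inputs is omitted, though real implementations usually add it):

```python
import torch
import torch.nn as nn

class CoordinateDeformationMLP(nn.Module):
    """Coordinate MLP parameterizing a continuous 2D deformation: it maps a
    spatial coordinate (x, y) to a displaced position (x + dx, y + dy), so
    the deformation can be queried at arbitrary resolution."""
    def __init__(self, hidden=128, layers=3):
        super().__init__()
        dims = [2] + [hidden] * layers + [2]
        blocks = []
        for i in range(len(dims) - 1):
            blocks.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                blocks.append(nn.ReLU())
        self.net = nn.Sequential(*blocks)

    def forward(self, coords):            # coords: (N, 2) in [-1, 1]
        return coords + self.net(coords)  # residual displacement
```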

Flatten Wisely: How Patch Order Shapes Mamba-Powered Vision for MRI Segmentation

Osama Hardan, Omar Elshenhabi, Tamer Khattab, Mohamed Mabrok

arXiv preprint · Jul 15 2025
Vision Mamba models promise transformer-level performance at linear computational cost, but their reliance on serializing 2D images into 1D sequences introduces a critical, yet overlooked, design choice: the patch scan order. In medical imaging, where modalities like brain MRI contain strong anatomical priors, this choice is non-trivial. This paper presents the first systematic study of how scan order impacts MRI segmentation. We introduce Multi-Scan 2D (MS2D), a parameter-free module for Mamba-based architectures that facilitates exploring diverse scan paths without additional computational cost. We conduct a large-scale benchmark of 21 scan strategies on three public datasets (BraTS 2020, ISLES 2022, LGG), covering over 70,000 slices. Our analysis shows conclusively that scan order is a statistically significant factor (Friedman test: $\chi^{2}_{20}=43.9, p=0.0016$), with performance varying by as much as 27 Dice points. Spatially contiguous paths -- simple horizontal and vertical rasters -- consistently outperform disjointed diagonal scans. We conclude that scan order is a powerful, cost-free hyperparameter, and provide an evidence-based shortlist of optimal paths to maximize the performance of Mamba models in medical imaging.
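
Since scan order is a cost-free design choice, it is easy to experiment with directly; a small sketch of three raster-style serializations of a patch-embedding grid (the MS2D module itself is more general than this):

```python
import numpy as np

def flatten_patches(patch_grid, order="horizontal"):
    """Serialize an (H, W, dim) grid of patch embeddings into a 1D sequence
    under different scan orders. Only the raster variants studied in the
    paper are shown; MS2D explores a larger family of paths."""
    H, W, D = patch_grid.shape
    if order == "horizontal":            # row-major raster
        return patch_grid.reshape(H * W, D)
    if order == "vertical":              # column-major raster
        return patch_grid.transpose(1, 0, 2).reshape(H * W, D)
    if order == "diagonal":              # anti-diagonals, top-left outward
        idx = sorted(((i, j) for i in range(H) for j in range(W)),
                     key=lambda ij: (ij[0] + ij[1], ij[0]))
        return np.stack([patch_grid[i, j] for i, j in idx])
    raise ValueError(f"unknown scan order: {order}")
```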

Region Uncertainty Estimation for Medical Image Segmentation with Noisy Labels.

Han K, Wang S, Chen J, Qian C, Lyu C, Ma S, Qiu C, Sheng VS, Huang Q, Liu Z

PubMed paper · Jul 14 2025
The success of deep learning in 3D medical image segmentation hinges on training with a large dataset of fully annotated 3D volumes, which are difficult and time-consuming to acquire. Although recent foundation models (e.g., the segment anything model, SAM) can utilize sparse annotations to reduce annotation costs, segmentation tasks involving organs and tissues with blurred boundaries remain challenging. To address this issue, we propose a region uncertainty estimation framework for Computed Tomography (CT) image segmentation using noisy labels. Specifically, we propose a sample-stratified training strategy that stratifies samples according to their varying label quality, prioritizing confident and fine-grained information at each training stage. This sample-to-voxel level processing enables more reliable supervision to propagate to noisy-label data, effectively mitigating the impact of noisy annotations. Moreover, we design a boundary-guided regional uncertainty estimation module that complements the hierarchical sample training by evaluating sample confidence. Experiments conducted across multiple CT datasets demonstrate the superiority of our proposed method over several competitive approaches under various noise conditions. Our reliable label propagation strategy not only significantly reduces the cost of medical image annotation and of robust model training, but also improves segmentation performance in scenarios with imperfect annotations, paving the way for applying medical segmentation foundation models in low-resource and remote settings. Code will be available at https://github.com/KHan-UJS/NoisyLabel.
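
A minimal sketch of the sample-stratification idea, using per-sample loss as a noisy-label confidence proxy (both the proxy and the number of strata are assumptions; the paper's criterion is its boundary-guided uncertainty estimate):

```python
import numpy as np

def stratify_by_label_quality(per_sample_losses, n_strata=3):
    """Split training samples into quality strata using their current loss
    as a confidence proxy: low-loss samples are treated as cleanly
    annotated and scheduled into earlier training stages."""
    order = np.argsort(per_sample_losses)   # cleanest (lowest-loss) first
    return np.array_split(order, n_strata)  # stratum 0 = most confident

# Usage sketch: train on strata[0] first, then progressively add
# strata[1], strata[2] with down-weighted or pseudo-labeled supervision.
```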

Self-supervised Upsampling for Reconstructions with Generalized Enhancement in Photoacoustic Computed Tomography.

Deng K, Luo Y, Zuo H, Chen Y, Gu L, Liu MY, Lan H, Luo J, Ma C

PubMed paper · Jul 14 2025
Photoacoustic computed tomography (PACT) is an emerging hybrid imaging modality with potential applications in biomedicine. A major roadblock to the widespread adoption of PACT is the limited number of detectors, which gives rise to spatial aliasing and manifests as streak artifacts in the reconstructed image. A brute-force solution to the problem is to increase the number of detectors, which, however, is often undesirable due to escalated costs. In this study, we present a novel self-supervised learning approach, to overcome this long-standing challenge. We found that small blocks of PACT channel data show similarity at various downsampling rates. Based on this observation, a neural network trained on downsampled data can reliably perform accurate interpolation without requiring densely-sampled ground truth data, which is typically unavailable in real practice. Our method has undergone validation through numerical simulations, controlled phantom experiments, as well as ex vivo and in vivo animal tests, across multiple PACT systems. We have demonstrated that our technique provides an effective and cost-efficient solution to address the under-sampling issue in PACT, thereby enhancing the capabilities of this imaging technology.
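
The self-supervised trick can be sketched as follows: further downsample the sparse channel data you already have, and train a network to restore the dropped detectors, so densely-sampled ground truth is never needed. The array layout, the 2x factor, and the training loop below are illustrative assumptions, not the paper's exact pipeline:

```python
import torch
import torch.nn.functional as F

def self_supervised_upsampling_step(net, sparse_channels, opt):
    """One training step of the downsample-to-supervise idea: the sparse
    sinogram we *do* have (batch x 1 x detectors x time, even detector
    count assumed) is decimated 2x along the detector axis, and `net`
    (assumed to double the detector count) learns to restore the dropped
    channels. At inference, the same net is applied to the original sparse
    data to synthesize denser virtual detectors."""
    target = sparse_channels                # acquired data acts as ground truth
    inputs = sparse_channels[:, :, ::2, :]  # keep every other detector
    pred = net(inputs)
    loss = F.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```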