Latest Papers on Radiology AI. Order: Best Match, Limit: 10.

Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance

Xuesong Li, Dianye Huang, Yameng Zhang, Nassir Navab, Zhongliang Jiang

•preprint•Jun 24 2025

Understanding medical ultrasound imaging remains a long-standing challenge due to significant visual variability caused by differences in imaging and acquisition parameters. Recent advancements in large language models (LLMs) have been used to automatically generate terminology-rich summaries orientated to clinicians with sufficient physiological knowledge. Nevertheless, the increasing demand for improved ultrasound interpretability and basic scanning guidance among non-expert users, e.g., in point-of-care settings, has not yet been explored. In this study, we first introduce the scene graph (SG) for ultrasound images to explain image content to ordinary and provide guidance for ultrasound scanning. The ultrasound SG is first computed using a transformer-based one-stage method, eliminating the need for explicit object detection. To generate a graspable image explanation for ordinary, the user query is then used to further refine the abstract SG representation through LLMs. Additionally, the predicted SG is explored for its potential in guiding ultrasound scanning toward missing anatomies within the current imaging view, assisting ordinary users in achieving more standardized and complete anatomical exploration. The effectiveness of this SG-based image explanation and scanning guidance has been validated on images from the left and right neck regions, including the carotid and thyroid, across five volunteers. The results demonstrate the potential of the method to maximally democratize ultrasound by enhancing its interpretability and usability for ordinaries.

Ultrasound Classification Vascular Methodology Prototype Academic Lab GenAI

SAM2-SGP: Enhancing SAM2 for Medical Image Segmentation via Support-Set Guided Prompting

Yang Xing, Jiong Wu, Yuheng Bu, Kuang Gong

•preprint•Jun 24 2025

Although new vision foundation models such as Segment Anything Model 2 (SAM2) have significantly enhanced zero-shot image segmentation capabilities, reliance on human-provided prompts poses significant challenges in adapting SAM2 to medical image segmentation tasks. Moreover, SAM2's performance in medical image segmentation was limited by the domain shift issue, since it was originally trained on natural images and videos. To address these challenges, we proposed SAM2 with support-set guided prompting (SAM2-SGP), a framework that eliminated the need for manual prompts. The proposed model leveraged the memory mechanism of SAM2 to generate pseudo-masks using image-mask pairs from a support set via a Pseudo-mask Generation (PMG) module. We further introduced a novel Pseudo-mask Attention (PMA) module, which used these pseudo-masks to automatically generate bounding boxes and enhance localized feature extraction by guiding attention to relevant areas. Furthermore, a low-rank adaptation (LoRA) strategy was adopted to mitigate the domain shift issue. The proposed framework was evaluated on both 2D and 3D datasets across multiple medical imaging modalities, including fundus photography, X-ray, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and ultrasound. The results demonstrated a significant performance improvement over state-of-the-art models, such as nnUNet and SwinUNet, as well as foundation models, such as SAM2 and MedSAM2, underscoring the effectiveness of the proposed approach. Our code is publicly available at https://github.com/astlian9/SAM_Support.

Mixed Modality Segmentation Methodology In Silico Academic Lab Open Code Benchmark SOTA

Filling of incomplete sinograms from sparse PET detector configurations using a residual U-Net

Klara Leffler, Luigi Tommaso Luppino, Samuel Kuttner, Karin Söderkvist, Jan Axelsson

•preprint•Jun 24 2025

Long axial field-of-view PET scanners offer increased field-of-view and sensitivity compared to traditional PET scanners. However, a significant cost is associated with the densely packed photodetectors required for the extended-coverage systems, limiting clinical utilisation. To mitigate the cost limitations, alternative sparse system configurations have been proposed, allowing an extended field-of-view PET design with detector costs similar to a standard PET system, albeit at the expense of image quality. In this work, we propose a deep sinogram restoration network to fill in the missing sinogram data. Our method utilises a modified Residual U-Net, trained on clinical PET scans from a GE Signa PET/MR, simulating the removal of 50% of the detectors in a chessboard pattern (retaining only 25% of all lines of response). The model successfully recovers missing counts, with a mean absolute error below two events per pixel, outperforming 2D interpolation in both sinogram and reconstructed image domain. Notably, the predicted sinograms exhibit a smoothing effect, leading to reconstructed images lacking sharpness in finer details. Despite these limitations, the model demonstrates a substantial capacity for compensating for the undersampling caused by the sparse detector configuration. This proof-of-concept study suggests that sparse detector configurations, combined with deep learning techniques, offer a viable alternative to conventional PET scanner designs. This approach supports the development of cost-effective, total body PET scanners, allowing a significant step forward in medical imaging technology.

PET Reconstruction Whole Body Methodology In Silico Academic Lab Breakthrough

General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

Jakob Ambsdorf, Asbjørn Munk, Sebastian Llambias, Anders Nymark Christensen, Kamil Mikolaj, Randall Balestriero, Martin Tolsgaard, Aasa Feragen, Mads Nielsen

•preprint•Jun 24 2025

With access to large-scale, unlabeled medical datasets, researchers are confronted with two questions: Should they attempt to pretrain a custom foundation model on this medical data, or use transfer-learning from an existing generalist model? And, if a custom model is pretrained, are novel methods required? In this paper we explore these questions by conducting a case-study, in which we train a foundation model on a large regional fetal ultrasound dataset of 2M images. By selecting the well-established DINOv2 method for pretraining, we achieve state-of-the-art results on three fetal ultrasound datasets, covering data from different countries, classification, segmentation, and few-shot tasks. We compare against a series of models pretrained on natural images, ultrasound images, and supervised baselines. Our results demonstrate two key insights: (i) Pretraining on custom data is worth it, even if smaller models are trained on less data, as scaling in natural image pretraining does not translate to ultrasound performance. (ii) Well-tuned methods from computer vision are making it feasible to train custom foundation models for a given medical domain, requiring no hyperparameter tuning and little methodological adaptation. Given these findings, we argue that a bias towards methodological innovation should be avoided when developing domain specific foundation models under common computational resource constraints.

Ultrasound Classification Abdominal Methodology In Silico Academic Lab Benchmark SOTA

AI-based large-scale screening of gastric cancer from noncontrast CT imaging.

Hu C, Xia Y, Zheng Z, Cao M, Zheng G, Chen S, Sun J, Chen W, Zheng Q, Pan S, Zhang Y, Chen J, Yu P, Xu J, Xu J, Qiu Z, Lin T, Yun B, Yao J, Guo W, Gao C, Kong X, Chen K, Wen Z, Zhu G, Qiao J, Pan Y, Li H, Gong X, Ye Z, Ao W, Zhang L, Yan X, Tong Y, Yang X, Zheng X, Fan S, Cao J, Yan C, Xie K, Zhang S, Wang Y, Zheng L, Wu Y, Ge Z, Tian X, Zhang X, Wang Y, Zhang R, Wei Y, Zhu W, Zhang J, Qiu H, Su M, Shi L, Xu Z, Zhang L, Cheng X

•papers•Jun 24 2025

Early detection through screening is critical for reducing gastric cancer (GC) mortality. However, in most high-prevalence regions, large-scale screening remains challenging due to limited resources, low compliance and suboptimal detection rate of upper endoscopic screening. Therefore, there is an urgent need for more efficient screening protocols. Noncontrast computed tomography (CT), routinely performed for clinical purposes, presents a promising avenue for large-scale designed or opportunistic screening. Here we developed the Gastric Cancer Risk Assessment Procedure with Artificial Intelligence (GRAPE), leveraging noncontrast CT and deep learning to identify GC. Our study comprised three phases. First, we developed GRAPE using a cohort from 2 centers in China (3,470 GC and 3,250 non-GC cases) and validated its performance on an internal validation set (1,298 cases, area under curve = 0.970) and an independent external cohort from 16 centers (18,160 cases, area under curve = 0.927). Subgroup analysis showed that the detection rate of GRAPE increased with advancing T stage but was independent of tumor location. Next, we compared the interpretations of GRAPE with those of radiologists and assessed its potential in assisting diagnostic interpretation. Reader studies demonstrated that GRAPE significantly outperformed radiologists, improving sensitivity by 21.8% and specificity by 14.0%, particularly in early-stage GC. Finally, we evaluated GRAPE in real-world opportunistic screening using 78,593 consecutive noncontrast CT scans from a comprehensive cancer center and 2 independent regional hospitals. GRAPE identified persons at high risk with GC detection rates of 24.5% and 17.7% in 2 regional hospitals, with 23.2% and 26.8% of detected cases in T1/T2 stage. Additionally, GRAPE detected GC cases that radiologists had initially missed, enabling earlier diagnosis of GC during follow-up for other diseases. In conclusion, GRAPE demonstrates strong potential for large-scale GC screening, offering a feasible and effective approach for early detection. ClinicalTrials.gov registration: NCT06614179 .

CT Detection Abdominal Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA GenAI

Multimodal Deep Learning Based on Ultrasound Images and Clinical Data for Better Ovarian Cancer Diagnosis.

Su C, Miao K, Zhang L, Yu X, Guo Z, Li D, Xu M, Zhang Q, Dong X

•papers•Jun 24 2025

This study aimed to develop and validate a multimodal deep learning model that leverages 2D grayscale ultrasound (US) images alongside readily available clinical data to improve diagnostic performance for ovarian cancer (OC). A retrospective analysis was conducted involving 1899 patients who underwent preoperative US examinations and subsequent surgeries for adnexal masses between 2019 and 2024. A multimodal deep learning model was constructed for OC diagnosis and extracting US morphological features from the images. The model's performance was evaluated using metrics such as receiver operating characteristic (ROC) curves, accuracy, and F1 score. The multimodal deep learning model exhibited superior performance compared to the image-only model, achieving areas under the curves (AUCs) of 0.9393 (95% CI 0.9139-0.9648) and 0.9317 (95% CI 0.9062-0.9573) in the internal and external test sets, respectively. The model significantly improved the AUCs for OC diagnosis by radiologists and enhanced inter-reader agreement. Regarding US morphological feature extraction, the model demonstrated robust performance, attaining accuracies of 86.34% and 85.62% in the internal and external test sets, respectively. Multimodal deep learning has the potential to enhance the diagnostic accuracy and consistency of radiologists in identifying OC. The model's effective feature extraction from ultrasound images underscores the capability of multimodal deep learning to automate the generation of structured ultrasound reports.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Determination of Kennedy's classification in panoramic X-rays by automated tooth labeling.

Meine H, Metzger MC, Weingart P, Wüster J, Schmelzeisen R, Rörich A, Georgii J, Brandenburg LS

•papers•Jun 24 2025

Panoramic X-rays (PX) are extensively utilized in dental and maxillofacial diagnostics, offering comprehensive imaging of teeth and surrounding structures. This study investigates the automatic determination of Kennedy's classification in partially edentulous jaws. A retrospective study involving 209 PX images from 206 patients was conducted. The established Mask R-CNN, a deep learning-based instance segmentation model, was trained for the automatic detection, position labeling (according to the international dental federation's scheme), and segmentation of teeth in PX. Subsequent post-processing steps filter duplicate outputs by position label and by geometric overlap. Finally, a rule-based determination of Kennedy's class of partially edentulous jaws was performed. In a fivefold cross-validation, Kennedy's classification was correctly determined in 83.0% of cases, with the most common errors arising from the mislabeling of morphologically similar teeth. The underlying algorithm demonstrated high sensitivity (97.1%) and precision (98.1%) in tooth detection, with an F1 score of 97.6%. FDI position label accuracy was 94.7%. Ablation studies indicated that post-processing steps, such as duplicate filtering, significantly improved algorithm performance. Our findings show that automatic dentition analysis in PX images can be extended to include clinically relevant jaw classification, reducing the workload associated with manual labeling and classification.

X-Ray Segmentation Retrospective Clinical In Silico Academic Lab

VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration

Hang Zhang, Yuxi Zhang, Jiazheng Wang, Xiang Chen, Renjiu Hu, Xin Tian, Gaolei Li, Min Liu

•preprint•Jun 24 2025

Recent developments in neural networks have improved deformable image registration (DIR) by amortizing iterative optimization, enabling fast and accurate DIR results. However, learning-based methods often face challenges with limited training data, large deformations, and tend to underperform compared to iterative approaches when label supervision is unavailable. While iterative methods can achieve higher accuracy in such scenarios, they are considerably slower than learning-based methods. To address these limitations, we propose VoxelOpt, a discrete optimization-based DIR framework that combines the strengths of learning-based and iterative methods to achieve a better balance between registration accuracy and runtime. VoxelOpt uses displacement entropy from local cost volumes to measure displacement signal strength at each voxel, which differs from earlier approaches in three key aspects. First, it introduces voxel-wise adaptive message passing, where voxels with lower entropy receives less influence from their neighbors. Second, it employs a multi-level image pyramid with 27-neighbor cost volumes at each level, avoiding exponential complexity growth. Third, it replaces hand-crafted features or contrastive learning with a pretrained foundational segmentation model for feature extraction. In abdominal CT registration, these changes allow VoxelOpt to outperform leading iterative in both efficiency and accuracy, while matching state-of-the-art learning-based methods trained with label supervision. The source code will be available at https://github.com/tinymilky/VoxelOpt

CT Registration Abdominal Methodology In Silico Academic Lab Open Code

Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks

Ankita Raj, Harsh Swaika, Deepankar Varma, Chetan Arora

•preprint•Jun 24 2025

The success of deep learning in medical imaging applications has led several companies to deploy proprietary models in diagnostic workflows, offering monetized services. Even though model weights are hidden to protect the intellectual property of the service provider, these models are exposed to model stealing (MS) attacks, where adversaries can clone the model's functionality by querying it with a proxy dataset and training a thief model on the acquired predictions. While extensively studied on general vision tasks, the susceptibility of medical imaging models to MS attacks remains inadequately explored. This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic conditions where the adversary lacks access to the victim model's training data and operates with limited query budgets. We demonstrate that adversaries can effectively execute MS attacks by using publicly available datasets. To further enhance MS capabilities with limited query budgets, we propose a two-step model stealing approach termed QueryWise. This method capitalizes on unlabeled data obtained from a proxy distribution to train the thief model without incurring additional queries. Evaluation on two medical imaging models for Gallbladder Cancer and COVID-19 classification substantiates the effectiveness of the proposed attack. The source code is available at https://github.com/rajankita/QueryWise.

Mixed Modality Classification Abdominal Methodology In Silico Academic Lab Open Code

ReMAR-DS: Recalibrated Feature Learning for Metal Artifact Reduction and CT Domain Transformation

Mubashara Rehman, Niki Martinel, Michele Avanzo, Riccardo Spizzo, Christian Micheloni

•preprint•Jun 24 2025

Artifacts in kilo-Voltage CT (kVCT) imaging degrade image quality, impacting clinical decisions. We propose a deep learning framework for metal artifact reduction (MAR) and domain transformation from kVCT to Mega-Voltage CT (MVCT). The proposed framework, ReMAR-DS, utilizes an encoder-decoder architecture with enhanced feature recalibration, effectively reducing artifacts while preserving anatomical structures. This ensures that only relevant information is utilized in the reconstruction process. By infusing recalibrated features from the encoder block, the model focuses on relevant spatial regions (e.g., areas with artifacts) and highlights key features across channels (e.g., anatomical structures), leading to improved reconstruction of artifact-corrupted regions. Unlike traditional MAR methods, our approach bridges the gap between high-resolution kVCT and artifact-resistant MVCT, enhancing radiotherapy planning. It produces high-quality MVCT-like reconstructions, validated through qualitative and quantitative evaluations. Clinically, this enables oncologists to rely on kVCT alone, reducing repeated high-dose MVCT scans and lowering radiation exposure for cancer patients.

CT Reconstruction Methodology In Silico Academic Lab

Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance

SAM2-SGP: Enhancing SAM2 for Medical Image Segmentation via Support-Set Guided Prompting

Filling of incomplete sinograms from sparse PET detector configurations using a residual U-Net

General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

AI-based large-scale screening of gastric cancer from noncontrast CT imaging.

Multimodal Deep Learning Based on Ultrasound Images and Clinical Data for Better Ovarian Cancer Diagnosis.

Determination of Kennedy's classification in panoramic X-rays by automated tooth labeling.

VoxelOpt: Voxel-Adaptive Message Passing for Discrete Optimization in Deformable Abdominal CT Registration

Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks

ReMAR-DS: Recalibrated Feature Learning for Metal Artifact Reduction and CT Domain Transformation

Ready to Sharpen Your Edge?