Latest Papers on Radiology AI. Tags: Benchmark SOTA

AI-Assisted Detection of Amyloid-related Imaging Abnormalities (ARIA): Promise and Pitfalls.

Petrella JR, Liu AJ, Wang LA, Doraiswamy PM

•papers•Jul 30 2025

The advent of anti-amyloid therapies (AATs) for Alzheimer's disease (AD) has elevated the importance of MRI surveillance for amyloid-related imaging abnormalities (ARIA) such as microhemorrhages and siderosis (ARIA-H) and edema (ARIA-E). We report a literature review and early quality assurance experience with an FDA-cleared assistive AI tool intended for detection of ARIA in MRI clinical workflows. The AI system improved sensitivity for detection of subtle ARIA-E and ARIA-H lesions but at the cost of a reduction in specificity. We propose a tiered workflow combining protocol harmonization and expert interpretation with AI overlay review. AI-assisted ARIA detection is a paradigm shift that offers great promise to enhance patient safety as disease-modifying therapies for AD gain broader clinical use; however, some pitfalls need to be considered. ABBREVIATIONS: AAT = anti-amyloid therapy; ARIA = amyloid-related imaging abnormalities; ARIA-H = amyloid-related imaging abnormality-hemorrhage; ARIA-E = amyloid-related imaging abnormality-edema.

MRI Detection Neurological Review FDA Cleared FDA 510(k) Benchmark SOTA

Modality-Aware Feature Matching: A Comprehensive Review of Single- and Cross-Modality Techniques

Weide Liu, Wei Zhou, Jun Liu, Ping Hu, Jun Cheng, Jungong Han, Weisi Lin

•preprint•Jul 30 2025

Feature matching is a cornerstone task in computer vision, essential for applications such as image retrieval, stereo matching, 3D reconstruction, and SLAM. This survey comprehensively reviews modality-based feature matching, exploring traditional handcrafted methods and emphasizing contemporary deep learning approaches across various modalities, including RGB images, depth images, 3D point clouds, LiDAR scans, medical images, and vision-language interactions. Traditional methods, leveraging detectors like Harris corners and descriptors such as SIFT and ORB, demonstrate robustness under moderate intra-modality variations but struggle with significant modality gaps. Contemporary deep learning-based methods, exemplified by detector-free strategies like CNN-based SuperPoint and transformer-based LoFTR, substantially improve robustness and adaptability across modalities. We highlight modality-aware advancements, such as geometric and depth-specific descriptors for depth images, sparse and dense learning methods for 3D point clouds, attention-enhanced neural networks for LiDAR scans, and specialized solutions like the MIND descriptor for complex medical image matching. Cross-modal applications, particularly in medical image registration and vision-language tasks, underscore the evolution of feature matching to handle increasingly diverse data interactions.

Mixed Modality Registration Review Concept Benchmark SOTA

Optimizing Federated Learning Configurations for MRI Prostate Segmentation and Cancer Detection: A Simulation Study

Ashkan Moradi, Fadila Zerka, Joeran S. Bosma, Mohammed R. S. Sunoqrot, Bendik S. Abrahamsen, Derya Yakar, Jeroen Geerdink, Henkjan Huisman, Tone Frost Bathen, Mattijs Elschot

•preprint•Jul 30 2025

Purpose: To develop and optimize a federated learning (FL) framework across multiple clients for biparametric MRI prostate segmentation and clinically significant prostate cancer (csPCa) detection. Materials and Methods: A retrospective study was conducted using Flower FL to train an nnU-Net-based architecture for MRI prostate segmentation and csPCa detection, using data collected from January 2010 to August 2021. Model development included training and optimizing local epochs, federated rounds, and aggregation strategies for FL-based prostate segmentation on T2-weighted MRIs (four clients, 1294 patients) and csPCa detection using biparametric MRIs (three clients, 1440 patients). Performance was evaluated on independent test sets using the Dice score for segmentation and the Prostate Imaging: Cancer Artificial Intelligence (PI-CAI) score, defined as the average of the area under the receiver operating characteristic curve and average precision, for csPCa detection. P-values for performance differences were calculated using permutation testing. Results: The FL configurations were independently optimized for both tasks, showing improved performance at 1 epoch and 300 rounds using FedMedian for prostate segmentation and at 5 epochs and 200 rounds using FedAdagrad for csPCa detection. Compared with the average performance of the clients, the optimized FL model significantly improved performance in prostate segmentation and csPCa detection on the independent test set. The optimized FL model showed higher lesion detection performance compared to the FL-baseline model, but no evidence of a difference was observed for prostate segmentation. Conclusions: FL enhanced the performance and generalizability of MRI prostate segmentation and csPCa detection compared with local models, and optimizing its configuration further improved lesion detection performance.

MRI Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA
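The federated learning abstract above names FedMedian as the best-performing aggregation strategy for segmentation and defines the PI-CAI score as the average of AUROC and average precision. The following is a minimal sketch of both ideas, assuming each client's model is a flat list of parameters and that evaluation is done on one list of binary labels and scores with at least one positive and one negative case. The function names are illustrative only; this is not Flower's API and not the challenge's official evaluation code.

```python
from statistics import median


def fed_median(client_weights):
    """Coordinate-wise median across clients (the idea behind FedMedian).

    client_weights: list of equal-length parameter lists, one per client.
    """
    return [median(values) for values in zip(*client_weights)]


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits


def picai_style_score(labels, scores):
    """Average of AUROC and average precision, as the abstract defines it."""
    return 0.5 * (auroc(labels, scores) + average_precision(labels, scores))
```

The median makes the aggregate insensitive to a single outlying client: in `fed_median([[1, 2], [3, 0], [100, 1]])` the extreme value 100 does not pull the first coordinate away from the other clients.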

Cardiac-CLIP: A Vision-Language Foundation Model for 3D Cardiac CT Images

Yutao Hu, Ying Zheng, Shumei Miao, Xiaolei Zhang, Jiahao Xia, Yaolei Qi, Yiyang Zhang, Yuting He, Qian Chen, Jing Ye, Hongyan Qiao, Xiuhua Hu, Lei Xu, Jiayin Zhang, Hui Liu, Minwen Zheng, Yining Wang, Daimin Zhang, Ji Zhang, Wenqi Shao, Yun Liu, Longjiang Zhang, Guanyu Yang

•preprint•Jul 29 2025

Foundation models have demonstrated remarkable potential in the medical domain. However, their application to complex cardiovascular diagnostics remains underexplored. In this paper, we present Cardiac-CLIP, a multi-modal foundation model designed for 3D cardiac CT images. Cardiac-CLIP is developed through a two-stage pre-training strategy. The first stage employs a 3D masked autoencoder (MAE) to perform self-supervised representation learning from large-scale unlabeled volumetric data, enabling the visual encoder to capture rich anatomical and contextual features. In the second stage, contrastive learning is introduced to align visual and textual representations, facilitating cross-modal understanding. To support the pre-training, we collect 16,641 real clinical CT scans, supplemented by 114k publicly available data. Meanwhile, we standardize free-text radiology reports into unified templates and construct the pathology vectors according to diagnostic attributes, based on which the soft-label matrix is generated to supervise the contrastive learning process. On the other hand, to comprehensively evaluate the effectiveness of Cardiac-CLIP, we collect 6,722 real-clinical data from 12 independent institutions, along with the open-source data to construct the evaluation dataset. Specifically, Cardiac-CLIP is comprehensively evaluated across multiple tasks, including cardiovascular abnormality classification, information retrieval and clinical analysis. Experimental results demonstrate that Cardiac-CLIP achieves state-of-the-art performance across various downstream tasks in both internal and external data. Particularly, Cardiac-CLIP exhibits great effectiveness in supporting complex clinical tasks such as the prospective prediction of acute coronary syndrome, which is notoriously difficult in real-world scenarios.

CT Classification Cardiac Methodology In Silico Academic Lab Benchmark SOTA Open Dataset
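The Cardiac-CLIP abstract above describes supervising contrastive learning with a soft-label matrix built from pathology vectors. The abstract does not say how that matrix is computed, so the sketch below makes an assumption: soft labels are the row-normalized, non-negative cosine similarities between pathology vectors. The function names, the temperature of 0.07, and the requirement that all vectors are non-zero are likewise illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def soft_label_matrix(pathology_vectors):
    """ASSUMPTION: targets are row-normalized cosine similarities."""
    rows = []
    for u in pathology_vectors:
        sims = [max(cosine(u, v), 0.0) for v in pathology_vectors]
        total = sum(sims)
        rows.append([s / total for s in sims])
    return rows


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def soft_cross_entropy(logits, targets):
    """Mean cross-entropy between soft target rows and softmax(logits) rows."""
    loss = 0.0
    for logit_row, target_row in zip(logits, targets):
        log_probs = log_softmax(logit_row)
        loss -= sum(t * lp for t, lp in zip(target_row, log_probs))
    return loss / len(logits)


def soft_contrastive_loss(image_emb, text_emb, targets, temperature=0.07):
    """Symmetric image-to-text and text-to-image loss with soft targets."""
    logits = [[cosine(i, t) / temperature for t in text_emb] for i in image_emb]
    logits_t = [list(col) for col in zip(*logits)]
    targets_t = [list(col) for col in zip(*targets)]
    return 0.5 * (soft_cross_entropy(logits, targets)
                  + soft_cross_entropy(logits_t, targets_t))
```

With one-hot pathology vectors the targets reduce to the identity matrix and the loss matches the usual hard-label contrastive objective; scans that share findings instead receive partial credit for matching each other's reports.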

Deep learning aging marker from retinal images unveils sex-specific clinical and genetic signatures

Trofimova, O., Böttger, L., Bors, S., Pan, Y., Liefers, B., Beyeler, M. J., Presby, D. M., Bontempi, D., Hastings, J., Klaver, C. C. W., Bergmann, S.

•preprint•Jul 29 2025

Retinal fundus images offer a non-invasive window into systemic aging. Here, we fine-tuned a foundation model (RETFound) to predict chronological age from color fundus images in 71,343 participants from the UK Biobank, achieving a mean absolute error of 2.85 years. The resulting retinal age gap (RAG), i.e., the difference between predicted and chronological age, was associated with cardiometabolic traits, inflammation, cognitive performance, mortality, dementia, cancer, and incident cardiovascular disease. Genome-wide analyses identified genes related to longevity, metabolism, neurodegeneration, and age-related eye diseases. Sex-stratified models revealed consistent performance but divergent biological signatures: males had younger-appearing retinas and stronger links to metabolic syndrome, while in females, both model attention and genetic associations pointed to a greater involvement of retinal vasculature. Our study positions retinal aging as a biologically meaningful and sex-sensitive biomarker that can support more personalized approaches to risk assessment and aging-related healthcare.

OCT Registration Retrospective Clinical In Silico Academic Lab Benchmark SOTA
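The retinal aging abstract above reports a mean absolute error and defines the retinal age gap (RAG) as the difference between predicted and chronological age. A minimal sketch of both quantities follows; the function names and the example ages in the usage note are made up for illustration and are not study data.

```python
def mean_absolute_error(predicted_ages, chronological_ages):
    """Average absolute difference between predicted and true age, in years."""
    errors = [abs(p - c) for p, c in zip(predicted_ages, chronological_ages)]
    return sum(errors) / len(errors)


def retinal_age_gap(predicted_ages, chronological_ages):
    """Per-participant gap: positive means the retina looks older than expected."""
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]
```

For hypothetical predictions `[62.0, 55.5, 70.0]` against chronological ages `[60.0, 58.0, 70.0]`, the gaps are `[2.0, -2.5, 0.0]` and the mean absolute error is 1.5 years.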

Distribution-Based Masked Medical Vision-Language Model Using Structured Reports

Shreyank N Gowda, Ruichi Zhang, Xiao Gu, Ying Weng, Lu Yang

•preprint•Jul 29 2025

Medical image-language pre-training aims to align medical images with clinically relevant text to improve model performance on various downstream tasks. However, existing models often struggle with the variability and ambiguity inherent in medical data, limiting their ability to capture nuanced clinical information and uncertainty. This work introduces an uncertainty-aware medical image-text pre-training model that enhances generalization capabilities in medical image analysis. Building on previous methods and focusing on Chest X-Rays, our approach utilizes structured text reports generated by a large language model (LLM) to augment image data with clinically relevant context. These reports begin with a definition of the disease, followed by the 'appearance' section to highlight critical regions of interest, and finally 'observations' and 'verdicts' that ground model predictions in clinical semantics. By modeling both inter- and intra-modal uncertainty, our framework captures the inherent ambiguity in medical images and text, yielding improved representations and performance on downstream tasks. Our model demonstrates significant advances in medical image-text pre-training, obtaining state-of-the-art performance on multiple downstream tasks.

X-Ray Classification Chest Methodology In Silico Benchmark SOTA GenAI

SwinECAT: A Transformer-based fundus disease classification model with Shifted Window Attention and Efficient Channel Attention

Peiran Gu, Teng Yao, Mengshen He, Fuhao Duan, Feiyan Liu, RenYuan Peng, Bao Ge

•preprint•Jul 29 2025

In recent years, artificial intelligence has been increasingly applied in the field of medical imaging. Among these applications, fundus image analysis presents special challenges, including small lesion areas in certain fundus diseases and subtle inter-disease differences, which can lead to reduced prediction accuracy and overfitting in the models. To address these challenges, this paper proposes the Transformer-based model SwinECAT, which combines Shifted Window (Swin) Attention with Efficient Channel Attention (ECA). SwinECAT leverages the Swin Attention mechanism in the Swin Transformer backbone to effectively capture local spatial structures and long-range dependencies within fundus images. The lightweight ECA mechanism is incorporated to guide SwinECAT's attention toward critical feature channels, enabling more discriminative feature representation. In contrast to previous studies that typically classify fundus images into 4 to 6 categories, this work expands fundus disease classification to 9 distinct types, thereby enhancing the granularity of diagnosis. We evaluate our method on the Eye Disease Image Dataset (EDID) containing 16,140 fundus images for 9-category classification. Experimental results demonstrate that SwinECAT achieves 88.29% accuracy, with a weighted F1-score of 0.88 and a macro F1-score of 0.90. The classification results of our proposed model SwinECAT significantly outperform the baseline Swin Transformer and multiple compared baseline models. To our knowledge, this represents the highest reported performance for 9-category classification on this public dataset.

OCT Classification Methodology In Silico Academic Lab Benchmark SOTA
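The SwinECAT abstract above reports both a weighted and a macro F1-score, which differ only in how per-class F1 values are averaged. The sketch below shows that difference under the standard definitions; the function names and the toy labels in the usage note are illustrative and unrelated to the EDID dataset.

```python
def per_class_f1(y_true, y_pred):
    """Return ({class: F1}, {class: support}) for a multi-class prediction."""
    classes = sorted(set(y_true) | set(y_pred))
    scores, support = {}, {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores[c] = 2 * tp / (2 * tp + fp + fn)
        support[c] = sum(t == c for t in y_true)
    return scores, support


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores, _ = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true samples per class."""
    scores, support = per_class_f1(y_true, y_pred)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)
```

With toy labels `y_true = [0, 0, 0, 1, 1, 2]` and `y_pred = [0, 0, 1, 1, 1, 2]`, the per-class F1 values are 0.8, 0.8, and 1.0, so the macro average (about 0.867) exceeds the weighted average (about 0.833). A macro score above the weighted score, as in the abstract, is what one would expect when smaller classes are classified at least as well as larger ones.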

Multi-Faceted Consistency learning with active cross-labeling for barely-supervised 3D medical image segmentation.

Wu X, Xu Z, Tong RK

•papers•Jul 29 2025

Deep learning-driven 3D medical image segmentation generally necessitates dense voxel-wise annotations, which are expensive and labor-intensive to acquire. Cross-annotation, which labels only a few orthogonal slices per scan, has recently emerged as a cost-effective alternative that better preserves the shape and precise boundaries of the 3D object than traditional weak labeling methods such as bounding boxes and scribbles. However, learning from such sparse labels, referred to as barely-supervised learning (BSL), remains challenging due to less fine-grained object perception, less compact class features and inferior generalizability. To tackle these challenges and foster collaboration between model training and human expertise, we propose a Multi-Faceted ConSistency learning (MF-ConS) framework with a Diversity and Uncertainty Sampling-based Active Learning (DUS-AL) strategy, specifically designed for the active BSL scenario. This framework combines a cross-annotation BSL strategy, where only three orthogonal slices are labeled per scan, with an AL paradigm guided by DUS to direct human-in-the-loop annotation toward the most informative volumes under a fixed budget. Built upon a teacher-student architecture, MF-ConS integrates three complementary consistency regularization modules: (i) neighbor-informed object prediction consistency for advancing fine-grained object perception by encouraging the student model to infer complete segmentation from masked inputs; (ii) prototype-driven consistency, which enhances intra-class compactness and discriminativeness by aligning latent feature and decision spaces using fused prototypes; and (iii) stability constraint that promotes model robustness against input perturbations. Extensive experiments on three benchmark datasets demonstrate that MF-ConS (DUS-AL) consistently outperforms state-of-the-art methods under extremely limited annotation.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA
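The MF-ConS abstract above relies on cross-annotation, where only three orthogonal slices are labeled per scan. The sketch below illustrates how sparse that supervision is by building a boolean mask of labeled voxels for a volume. The function names, the (depth, height, width) axis order, and the choice of slice indices are assumptions made for illustration; the abstract does not specify how slices are selected.

```python
def cross_annotation_mask(shape, z_idx, y_idx, x_idx):
    """Mark voxels lying on one axial, one coronal, or one sagittal slice.

    shape: (depth, height, width) of the volume.
    Returns a nested list mask[z][y][x] of booleans.
    """
    depth, height, width = shape
    return [[[z == z_idx or y == y_idx or x == x_idx
              for x in range(width)]
             for y in range(height)]
            for z in range(depth)]


def labeled_fraction(mask):
    """Share of voxels that carry a label under the given mask."""
    flat = [voxel for plane in mask for row in plane for voxel in row]
    return sum(flat) / len(flat)
```

For a 4×4×4 toy volume the three slices cover 37 of 64 voxels, but the share falls quickly with size: in a 128×128×128 volume it is roughly 2.3%, which is why a supervised loss restricted to this mask is usually paired with consistency terms on the unlabeled voxels.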

Radiomics, machine learning, and deep learning for hippocampal sclerosis identification: a systematic review and diagnostic meta-analysis.

Baptista JM, Brenner LO, Koga JV, Ohannesian VA, Ito LA, Nabarro PH, Santos LP, Henrique A, de Oliveira Almeida G, Berbet LU, Paranhos T, Nespoli V, Bertani R

•papers•Jul 29 2025

Hippocampal sclerosis (HS) is the primary pathological finding in temporal lobe epilepsy (TLE) and a common cause of refractory seizures. Conventional diagnostic methods, such as EEG and MRI, have limitations. Artificial intelligence (AI) and radiomics, utilizing machine learning and deep learning, offer a non-invasive approach to enhance diagnostic accuracy. This study synthesized recent AI and radiomics research to improve HS detection in TLE. PubMed/Medline, Embase, and Web of Science were systematically searched following PRISMA-DTA guidelines until May 2024. Statistical analysis was conducted using STATA 14. A bivariate model was used to pool sensitivity (SEN) and specificity (SPE) for HS detection, with I² assessing heterogeneity. Six studies were included. The pooled sensitivity and specificity of AI-based models for HS detection in medial temporal lobe epilepsy (MTLE) were 0.91 (95% CI: 0.83-0.96; I² = 71.48%) and 0.9 (95% CI: 0.83-0.94; I² = 69.62%), with an AUC of 0.96. AI alone showed higher sensitivity (0.92) and specificity (0.93) than AI combined with radiomics (sensitivity: 0.88; specificity: 0.9). Among algorithms, support vector machine (SVM) had the highest performance (SEN: 0.92; SPE: 0.95), followed by convolutional neural networks (CNN) and logistic regression (LR). AI models, particularly SVM, demonstrate high accuracy in detecting HS, with AI alone outperforming its combination with radiomics. These findings support the integration of AI into non-invasive diagnostic workflows, potentially enabling earlier detection and more personalized clinical decision-making in epilepsy care, ultimately contributing to improved patient outcomes and behavioral management.

MRI Classification Neurological Meta Analysis In Silico Academic Lab Benchmark SOTA
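The meta-analysis abstract above pools sensitivity and specificity across studies. The sketch below only shows how those two quantities are obtained for a single study from its confusion counts; it is not the bivariate pooling model used in the paper. The function names and the counts in the usage note are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Share of diseased cases the model flags: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of healthy cases the model clears: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)
```

For a hypothetical study with 45 true positives, 5 false negatives, 40 true negatives, and 10 false positives, sensitivity is 0.9 and specificity is 0.8. Lowering the decision threshold typically moves cases from the false-negative to the true-positive cell and from the true-negative to the false-positive cell, raising sensitivity while reducing specificity.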

MFFBi-Unet: Merging Dynamic Sparse Attention and Multi-scale Feature Fusion for Medical Image Segmentation.

Sun B, Liu C, Wang Q, Bi K, Zhang W

•papers•Jul 29 2025

The advancement of deep learning has driven extensive research validating the effectiveness of U-Net-style symmetric encoder-decoder architectures based on Transformers for medical image segmentation. However, the inherent design requiring attention mechanisms to compute token affinities across all spatial locations leads to prohibitive computational complexity and substantial memory demands. Recent efforts have attempted to address these limitations through sparse attention mechanisms. However, existing approaches employing artificial, content-agnostic sparse attention patterns demonstrate limited capability in modeling long-range dependencies effectively. We propose MFFBi-Unet, a novel architecture incorporating dynamic sparse attention through bi-level routing, enabling context-aware computation allocation with enhanced adaptability. The encoder-decoder module integrates BiFormer to optimize semantic feature extraction and facilitate high-fidelity feature map reconstruction. A novel Multi-scale Feature Fusion (MFF) module in skip connections synergistically combines multi-level contextual information with processed multi-scale features. Extensive evaluations on multiple public medical benchmarks demonstrate that our method consistently exhibits significant advantages. Notably, our method achieves statistically significant improvements, outperforming state-of-the-art approaches like MISSFormer by 2.02% and 1.28% Dice scores on respective benchmarks.

Mixed Modality Segmentation Methodology In Silico Benchmark SOTA
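The MFFBi-Unet abstract above reports its gains in Dice score, the overlap metric also used in several other segmentation entries on this page. A minimal sketch of the standard definition follows, assuming masks are given as flat lists of 0/1 values of equal length; the function name and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_score(pred_mask, true_mask):
    """Dice coefficient: 2 * |A ∩ B| / (|A| + |B|) for binary masks."""
    intersection = sum(p and t for p, t in zip(pred_mask, true_mask))
    total = sum(pred_mask) + sum(true_mask)
    if total == 0:
        return 1.0  # ASSUMPTION: two empty masks count as a perfect match
    return 2 * intersection / total
```

For example, `dice_score([1, 1, 0, 0], [1, 0, 1, 0])` is 0.5: each mask has two foreground voxels and they share one.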
