Latest Papers on Radiology AI. Sources: pubmed, Order: Best Match, Limit: 10.

Explanation and Elaboration with Examples for METRICS (METRICS-E3): an initiative from the EuSoMII Radiomics Auditing Group.

Kocak B, Ammirabile A, Ambrosini I, Akinci D'Antonoli T, Borgheresi A, Cavallo AU, Cannella R, D'Anna G, Díaz O, Doniselli FM, Fanni SC, Ghezzo S, Groot Lipman KBW, Klontzas ME, Ponsiglione A, Stanzione A, Triantafyllou M, Vernuccio F, Cuocolo R

•papers•Aug 13 2025

Radiomics research has been hindered by inconsistent and often poor methodological quality, limiting its potential for clinical translation. To address this challenge, the METhodological RadiomICs Score (METRICS) was recently introduced as a tool for systematically assessing study rigor. However, its effective application requires clearer guidance. The METRICS-E3 (Explanation and Elaboration with Examples) resource was developed by the European Society of Medical Imaging Informatics-Radiomics Auditing Group in response. This international initiative provides comprehensive support for users by offering detailed rationales, interpretive guidance, scoring recommendations, and illustrative examples for each METRICS item and condition. Each criterion includes positive examples from peer-reviewed, open-access studies and hypothetical negative examples. In total, the finalized METRICS-E3 includes over 200 examples. The complete resource is publicly available through an interactive website. CRITICAL RELEVANCE STATEMENT: METRICS-E3 offers deeper insights into each METRICS item and condition, providing concrete examples with accompanying commentary and recommendations to enhance the evaluation of methodological quality in radiomics research. KEY POINTS: As a complementary initiative to METRICS, METRICS-E3 is intended to support stakeholders in evaluating the methodological aspects of radiomics studies. In METRICS-E3, each METRICS item and condition is supplemented with interpretive guidance, positive literature-based examples, hypothetical negative examples, and scoring recommendations. The complete METRICS-E3 explanation and elaboration resource is accessible at its interactive website.

Review Concept Consortium Policy

MammosighTR: Nationwide Breast Cancer Screening Mammogram Dataset with BI-RADS Annotations for Artificial Intelligence Applications.

Koç U, Beşler MS, Sezer EA, Karakaş E, Özkaya YA, Evrimler Ş, Yalçın A, Kızıloğlu A, Kesimal U, Oruç M, Çankaya İ, Koç Keleş D, Merd N, Özkan E, Çevik Nİ, Gökhan MB, Boyraz Hayat B, Özer M, Tokur O, Işık F, Tezcan A, Battal F, Yüzkat M, Sebik NB, Karademir F, Topuz Y, Sezer Ö, Varlı S, Ülgü MM, Akdoğan E, Birinci Ş

•papers•Aug 13 2025

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. The MammosighTR dataset, derived from Türkiye's national breast cancer screening mammography program, provides BI-RADS-labeled mammograms with detailed annotations on breast composition and lesion quadrant location, which may be useful for developing and testing AI models in breast cancer detection. ©RSNA, 2025.

Mammography Detection Breast Dataset Release In Silico Open Dataset

Differentiation Between Fibro-Adipose Vascular Anomaly and Intramuscular Venous Malformation Using Grey-Scale Ultrasound-Based Radiomics and Machine Learning.

Hu WJ, Wu G, Yuan JJ, Ma BX, Liu YH, Guo XN, Dong CX, Kang H, Yang X, Li JC

•papers•Aug 13 2025

To establish an ultrasound-based radiomics model to differentiate fibro-adipose vascular anomaly (FAVA) from intramuscular venous malformation (VM). The clinical data of 65 patients with VM and 31 patients with FAVA who were treated and pathologically confirmed were retrospectively analyzed. Dimensionality reduction was performed on the extracted radiomics features using the least absolute shrinkage and selection operator (LASSO). An ultrasound-based radiomics model was established using support vector machine (SVM) and random forest (RF) models. The diagnostic efficiency of this model was evaluated using the receiver operating characteristic (ROC) curve. A total of 851 features were obtained by feature extraction, and 311 features were screened out using the t-test and Mann-Whitney U test. LASSO was then applied to the remaining features for dimensionality reduction. Finally, seven features were included to establish the diagnostic prediction model. In the testing group, the AUC, accuracy and specificity of the SVM model were higher than those of the RF model (0.841 [0.815-0.867] vs. 0.791 [0.759-0.824], 96.6% vs. 93.1%, and 100.0% vs. 90.5%, respectively). However, the sensitivity of the SVM model was lower than that of the RF model (88.9% vs. 100.0%). In this study, a prediction model based on ultrasound radiomics was developed to distinguish FAVA from VM. The study achieved high classification accuracy, sensitivity, and specificity. The SVM model is superior to the RF model and provides a new perspective and tool for clinical diagnosis.
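
The abstract reports screening features with the t-test and the Mann-Whitney U test before LASSO. As a rough illustration of the second test, the sketch below computes only the U statistic for a single feature. The function name and the feature values are hypothetical and are not taken from the study, and the p-value step that an actual screening would need is omitted.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: number of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical values of one radiomics feature per class (not study data).
fava_values = [0.82, 0.91, 0.77, 0.88]
vm_values = [0.61, 0.70, 0.79, 0.58, 0.66]

u_stat = mann_whitney_u(fava_values, vm_values)
max_u = len(fava_values) * len(vm_values)
print(f"U = {u_stat} out of a maximum of {max_u}")
```

A U value close to 0 or to the maximum suggests the feature separates the two groups well; values near half the maximum suggest little separation.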

Ultrasound Classification Musculoskeletal Retrospective Clinical In Silico

Development of a multimodal vision transformer model for predicting traumatic versus degenerative rotator cuff tears on magnetic resonance imaging: A single-centre retrospective study.

Oettl FC, Malayeri AB, Furrer PR, Wieser K, Fürnstahl P, Bouaicha S

•papers•Aug 13 2025

The differentiation between traumatic and degenerative rotator cuff tears (RCTs) remains a diagnostic challenge with significant implications for treatment planning. While magnetic resonance imaging (MRI) is standard practice, traditional radiological interpretation has shown limited reliability in distinguishing these etiologies. This study evaluates the potential of artificial intelligence (AI) models, specifically a multimodal vision transformer (ViT), to differentiate between traumatic and degenerative RCT. In this retrospective, single-centre study, 99 shoulder MRIs were analysed from patients who underwent surgery at a specialised university shoulder unit between 2016 and 2019. The cohort was divided into training (n = 79) and validation (n = 20) sets. The traumatic group required a documented relevant trauma (excluding simple lifting injuries), a previously asymptomatic shoulder and MRI within 3 months posttrauma. The degenerative group was of similar age and injured tendon, with patients presenting with at least 1 year of constant shoulder pain prior to imaging and no trauma history. The ViT was subsequently combined with demographic data to form the final multimodal ViT. Saliency maps were used as an explainability tool. The multimodal ViT model achieved an accuracy of 0.75 ± 0.08 with a recall of 0.8 ± 0.08, a specificity of 0.71 ± 0.11 and an F1 score of 0.76 ± 0.1. The model maintained consistent performance across different patient subsets, demonstrating robust generalisation. Saliency maps did not show a consistent focus on the rotator cuff. AI shows potential in supporting the challenging differentiation between traumatic and degenerative RCT on MRI. The achieved accuracy of 75% is particularly significant given the similarity of the groups, which presented a challenging diagnostic scenario. Saliency maps were utilised to ensure explainability; the lack of consistent focus on the rotator cuff tendons hints at underappreciated aspects in the differentiation.
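
The abstract reports accuracy, recall, specificity and F1 score. The sketch below shows how these four metrics follow from a binary confusion matrix. The counts are hypothetical, chosen only to fit a 20-case validation set, and are not the study's actual confusion matrix. The function name is likewise an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)            # sensitivity: true positives among all positives
    specificity = tn / (tn + fp)       # true negatives among all negatives
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a 20-case validation set (not the study's data).
metrics = classification_metrics(tp=8, fp=3, tn=7, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With only 20 validation cases, a single reclassified case shifts accuracy by 0.05, which is worth keeping in mind when reading the reported spreads.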

MRI Classification Musculoskeletal Retrospective Clinical In Silico Academic Lab

PPEA: Personalized positioning and exposure assistant based on multi-task shared pose estimation transformer.

Zhao J, Liu J, Yang C, Tang H, Chen Y, Zhang Y

•papers•Aug 13 2025

Hand and foot digital radiography (DR) is an indispensable tool in medical imaging, with varying diagnostic requirements necessitating different hand and foot positionings. Accurate positioning is crucial for obtaining diagnostically valuable images. Furthermore, adjusting exposure parameters such as exposure area based on patient conditions helps minimize the likelihood of image retakes. We propose a personalized positioning and exposure assistant capable of automatically recognizing hand and foot positionings and recommending appropriate exposure parameters to achieve these objectives. The assistant comprises three modules: (1) Progressive Iterative Hand-Foot Tracker (PIHFT) to iteratively locate hands or feet in RGB images, providing the foundation for accurate pose estimation; (2) Multi-Task Shared Pose Estimation Transformer (MTSPET), a Transformer-based model that encompasses hand and foot estimation branches with similar network architectures, sharing a common backbone. MTSPET outperformed MediaPipe in the hand pose estimation task and successfully transferred this capability to the foot pose estimation task; (3) Domain Expertise-embedded Positioning and Exposure Assistant (DEPEA), which combines the key-point coordinates of hands and feet with specific positioning and exposure parameter requirements, capable of checking patient positioning and inferring exposure areas and Regions of Interest (ROIs) of Digital Automatic Exposure Control (DAEC). Additionally, two datasets were collected and used to train MTSPET. A preliminary clinical trial showed strong agreement between PPEA's outputs and manual annotations, indicating the system's effectiveness in typical clinical scenarios. The contributions of this study lay the foundation for personalized, patient-specific imaging strategies, ultimately enhancing diagnostic outcomes and minimizing the risk of errors in clinical settings.

X-Ray Detection Musculoskeletal Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

Quantitative Prostate MRI, From the AJR Special Series on Quantitative Imaging.

Margolis DJA, Chatterjee A, deSouza NM, Fedorov A, Fennessy F, Maier SE, Obuchowski N, Punwani S, Purysko AS, Rakow-Penner R, Shukla-Dave A, Tempany CM, Boss M, Malyarenko D

•papers•Aug 13 2025

Prostate MRI has traditionally relied on qualitative interpretation. However, quantitative components hold the potential to markedly improve performance. The ADC from DWI is probably the most widely recognized quantitative MRI biomarker and has shown strong discriminatory value for clinically significant prostate cancer as well as for recurrent cancer after treatment. Advanced diffusion techniques, including intravoxel incoherent motion imaging, diffusion kurtosis imaging, diffusion-tensor imaging, and specific implementations such as restriction spectrum imaging, purport even better discrimination but are more technically challenging. The inherent T1 and T2 of tissue also provide diagnostic value, with more advanced techniques deriving luminal water fraction and hybrid multidimensional MRI metrics. Dynamic contrast-enhanced imaging, primarily using a modified Tofts model, also shows independent discriminatory value. Finally, quantitative lesion size and shape features can be combined with the aforementioned techniques and can be further refined using radiomics, texture analysis, and artificial intelligence. Which technique will ultimately find widespread clinical use will depend on validation across a myriad of platforms and use cases.
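
The ADC mentioned above is commonly estimated from the mono-exponential diffusion model S(b) = S0 · exp(-b · ADC). The sketch below applies the two-point form of that model. The function name, b-values and signal intensities are hypothetical illustrations and do not come from the article.

```python
import math


def adc_two_point(signal_low, signal_high, b_low, b_high):
    """Two-point ADC estimate under the mono-exponential model.

    b-values are in s/mm^2, so the returned ADC is in mm^2/s.
    """
    return math.log(signal_low / signal_high) / (b_high - b_low)


# Hypothetical voxel: simulate a signal at b = 800 s/mm^2 from an assumed ADC.
assumed_adc = 1.0e-3                      # mm^2/s, illustrative value only
s_b0 = 1000.0                             # signal at b = 0 (arbitrary units)
s_b800 = s_b0 * math.exp(-800 * assumed_adc)

estimated_adc = adc_two_point(s_b0, s_b800, b_low=0, b_high=800)
print(f"Estimated ADC: {estimated_adc:.2e} mm^2/s")
```

The more advanced techniques the abstract lists (intravoxel incoherent motion, kurtosis imaging) fit richer signal models to more b-values and are not covered by this two-point estimate.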

MRI Classification Abdominal Review Concept GenAI

BSA-Net: Boundary-prioritized spatial adaptive network for efficient left atrial segmentation.

Xu F, Tu W, Feng F, Yang J, Gunawardhana M, Gu Y, Huang J, Zhao J

•papers•Aug 13 2025

Atrial fibrillation, a common cardiac arrhythmia with rapid and irregular atrial electrical activity, requires accurate left atrial segmentation for effective treatment planning. Recently, deep learning methods have gained encouraging success in left atrial segmentation. However, current methodologies critically depend on the assumption of a consistently complete, centered left atrium as input, which neglects the structural incompleteness and boundary discontinuities arising from random-crop operations during inference. In this paper, we propose BSA-Net, which exploits an adaptive adjustment strategy in both feature position and loss optimization to establish long-range feature relationships and strengthen robust intermediate feature representations in boundary regions. Specifically, we propose a Spatial-adaptive Convolution (SConv) that employs a shuffle operation combined with lightweight convolution to directly establish cross-positional relationships within regions of potential relevance. Moreover, we develop the dual Boundary Prioritized loss, which enhances boundary precision by differentially weighting foreground and background boundaries, thus optimizing complex boundary regions. With the above technologies, the proposed method enjoys a better speed-accuracy trade-off compared to current methods. BSA-Net attains Dice scores of 92.55%, 91.42%, and 84.67% on the LA, Utah, and Waikato datasets, respectively, with a mere 2.16 M parameters, approximately 80% fewer than other contemporary state-of-the-art models. Extensive experimental results on three benchmark datasets have demonstrated that BSA-Net consistently and significantly outperforms existing state-of-the-art methods.
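
The Dice scores reported above measure overlap between a predicted and a reference mask, Dice = 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened binary masks. The masks and the function name are hypothetical, and this is the standard metric only, not the paper's Boundary Prioritized loss.

```python
def dice_score(prediction, reference):
    """Dice coefficient for two equally sized, flattened binary masks (0/1 values)."""
    if len(prediction) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(prediction, reference) if p == 1 and r == 1)
    total = sum(prediction) + sum(reference)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * intersection / total


# Hypothetical 8-voxel masks (not data from the paper).
predicted_mask = [0, 1, 1, 1, 0, 0, 1, 0]
reference_mask = [0, 1, 1, 0, 0, 1, 1, 0]

dice = dice_score(predicted_mask, reference_mask)
print(f"Dice: {dice:.2%}")
```

Returning 1.0 for two empty masks is one common convention; some evaluation code skips such cases instead.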

CT Segmentation Cardiac Methodology In Silico Academic Lab Benchmark SOTA

Automatic detection of arterial input function for brain DCE-MRI in multi-site cohorts.

Saca L, Gaggar R, Pappas I, Benzinger T, Reiman EM, Shiroishi MS, Joe EB, Ringman JM, Yassine HN, Schneider LS, Chui HC, Nation DA, Zlokovic BV, Toga AW, Chakhoyan A, Barnes S

•papers•Aug 13 2025

Arterial input function (AIF) extraction is a crucial step in quantitative pharmacokinetic modeling of DCE-MRI. This work proposes a robust deep learning model that can precisely extract an AIF from DCE-MRI images. A diverse dataset of human brain DCE-MRI images from 289 participants, totaling 384 scans, from five different institutions with extracted gadolinium-based contrast agent curves from large penetrating arteries, and with most data collected for blood-brain barrier (BBB) permeability measurement, was retrospectively analyzed. A 3D UNet model was implemented and trained on manually drawn AIF regions. The testing cohort was compared using the proposed AIF quality metric, AIFitness, and Ktrans values from a standard DCE pipeline. This UNet was then applied to a separate dataset of 326 participants with a total of 421 DCE-MRI images with analyzed AIF quality and Ktrans values. The resulting 3D UNet model achieved an average AIFitness score of 93.9 compared to 99.7 for manually selected AIFs, and white matter Ktrans values were 0.45 × 10⁻³/min and 0.45 × 10⁻³/min, respectively. The intraclass correlation between automated and manual Ktrans values was 0.89. The separate replication dataset yielded an AIFitness score of 97.0 and white matter Ktrans of 0.44 × 10⁻³/min. Findings suggest a 3D UNet model with additional convolutional neural network kernels and a modified Huber loss function achieves superior performance for identifying AIF curves from DCE-MRI in a diverse multi-center cohort. AIFitness scores and DCE-MRI-derived metrics, such as Ktrans maps, showed no significant differences in gray and white matter between manually drawn and automated AIFs.
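
The abstract mentions a modified Huber loss but does not describe the modification. For orientation only, the sketch below implements the standard Huber loss, which is quadratic for small residuals and linear for large ones. The threshold `delta` and the example residuals are assumptions, and this should not be read as the authors' loss.

```python
def huber_loss(residual, delta=1.0):
    """Standard Huber loss for a single residual (prediction minus target)."""
    magnitude = abs(residual)
    if magnitude <= delta:
        return 0.5 * residual ** 2               # quadratic region near zero
    return delta * (magnitude - 0.5 * delta)     # linear region for large errors


def mean_huber_loss(predictions, targets, delta=1.0):
    """Average Huber loss over paired predictions and targets."""
    losses = [huber_loss(p - t, delta) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)


# Hypothetical residuals illustrating both regions of the loss.
small_error = huber_loss(0.5)
large_error = huber_loss(3.0)
print(small_error, large_error)
```

Compared with squared error, the linear tail makes the loss less sensitive to outlying voxels, which is a common reason for choosing it.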

MRI Detection Neurological Retrospective Clinical In Silico

Comparative evaluation of CAM methods for enhancing explainability in veterinary radiography.

Dusza P, Banzato T, Burti S, Bendazzoli M, Müller H, Wodzinski M

•papers•Aug 13 2025

Explainable Artificial Intelligence (XAI) encompasses a broad spectrum of methods that aim to enhance the transparency of deep learning models, with Class Activation Mapping (CAM) methods widely used for visual interpretability. However, systematic evaluations of these methods in veterinary radiography remain scarce. This study presents a comparative analysis of eleven CAM methods, including GradCAM, XGradCAM, ScoreCAM, and EigenCAM, on a dataset of 7362 canine and feline X-ray images. A ResNet18 model was chosen based on the specificity of the dataset and preliminary results where it outperformed other models. Quantitative and qualitative evaluations were performed to determine how well each CAM method produced interpretable heatmaps relevant to clinical decision-making. Among the techniques evaluated, EigenGradCAM achieved the highest mean score of 2.571 (standard deviation [SD] = 1.256), closely followed by EigenCAM at 2.519 (SD = 1.228) and GradCAM++ at 2.512 (SD = 1.277), with FullGrad and XGradCAM achieving the lowest scores of 2.000 (SD = 1.300) and 1.858 (SD = 1.198), respectively. Despite variations in saliency visualization, no single method universally improved veterinarians' diagnostic confidence. While certain CAM methods provided better visual cues for some pathologies, they generally offered limited explainability and did not substantially improve veterinarians' diagnostic confidence.
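
To make the CAM family more concrete, the sketch below shows the basic Grad-CAM combination step: each feature map is weighted by the spatial mean of its gradient, the weighted maps are summed, and negative values are clipped. The toy activations and gradients are hypothetical. A real pipeline would obtain them from a trained network such as the ResNet18 mentioned above, and the other CAM variants in the study weight the maps differently.

```python
def grad_cam(activations, gradients):
    """Basic Grad-CAM heatmap from per-channel feature maps and their gradients.

    Both arguments are lists of channels, each channel a 2D list of equal size.
    """
    # Channel weight = global average of that channel's gradient map.
    weights = [
        sum(sum(row) for row in grad) / (len(grad) * len(grad[0]))
        for grad in gradients
    ]
    height, width = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, channel in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * channel[i][j]
    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Hypothetical 2-channel, 2x2 feature maps and gradients (toy values).
toy_activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
toy_gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]

cam = grad_cam(toy_activations, toy_gradients)
print(cam)
```

In practice the heatmap is then upsampled to the input image size and overlaid on the radiograph.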

X-Ray Classification Methodology In Silico Reproducibility

Multimodal ensemble machine learning predicts neurological outcome within three hours after out of hospital cardiac arrest.

Kawai Y, Yamamoto K, Tsuruta K, Miyazaki K, Asai H, Fukushima H

•papers•Aug 13 2025

This study aimed to determine if an ensemble (stacking) model that integrates three independently developed base models can reliably predict patients' neurological outcomes following out-of-hospital cardiac arrest (OHCA) within 3 h of arrival and outperform each individual model. This retrospective study included patients with OHCA (≥ 18 years) admitted directly to Nara Medical University between April 2015 and March 2024 who remained comatose for ≥ 3 h after arrival and had suitable head computed tomography (CT) images. The area under the receiver operating characteristic curve (AUC) and Brier scores were used to evaluate the performance of four models (resuscitation-related background OHCA score factors, bilateral pupil diameter, single-slice head CT within 3 h of arrival, and an ensemble stacked model combining these three models) in predicting favourable neurological outcomes at hospital discharge or 1 month, as defined by a Cerebral Performance Category scale of 1-2. Among 533 patients, 82 (15%) had favourable outcomes. The OHCA, pupil, and head CT models yielded AUCs of 0.76, 0.65, and 0.68 with Brier scores of 0.11, 0.13, and 0.12, respectively. The ensemble model outperformed the other models (AUC, 0.82; Brier score, 0.10), thereby supporting its application for early clinical decision-making and optimising resource allocation.
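
The Brier score used above is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values indicate better calibrated and more accurate predictions. The sketch below computes it for a handful of hypothetical predictions. The values and the function name are illustrative and are not study data.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predicted probabilities of a favourable outcome and observed labels.
predicted = [0.9, 0.2, 0.6, 0.1, 0.4]
observed = [1, 0, 0, 0, 1]

score = brier_score(predicted, observed)
print(f"Brier score: {score:.3f}")
```

Because only 15% of patients had favourable outcomes, a model that always predicts the base rate would already reach a Brier score of about 0.15 × 0.85 ≈ 0.13, which gives context for the reported values of 0.10 to 0.13.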

CT Classification Neurological Retrospective Clinical In Silico Academic Lab
