Latest Papers on Radiology AI

Comparative evaluation of CAM methods for enhancing explainability in veterinary radiography.

Dusza P, Banzato T, Burti S, Bendazzoli M, Müller H, Wodzinski M

•papers•Aug 13 2025

Explainable Artificial Intelligence (XAI) encompasses a broad spectrum of methods that aim to enhance the transparency of deep learning models, with Class Activation Mapping (CAM) methods widely used for visual interpretability. However, systematic evaluations of these methods in veterinary radiography remain scarce. This study presents a comparative analysis of eleven CAM methods, including GradCAM, XGradCAM, ScoreCAM, and EigenCAM, on a dataset of 7362 canine and feline X-ray images. A ResNet18 model was chosen based on the specificity of the dataset and preliminary results where it outperformed other models. Quantitative and qualitative evaluations were performed to determine how well each CAM method produced interpretable heatmaps relevant to clinical decision-making. Among the techniques evaluated, EigenGradCAM achieved the highest mean score of 2.571 (SD = 1.256), closely followed by EigenCAM at 2.519 (SD = 1.228) and GradCAM++ at 2.512 (SD = 1.277), while FullGrad and XGradCAM achieved the lowest scores of 2.000 (SD = 1.300) and 1.858 (SD = 1.198), respectively. Despite variations in saliency visualization, no single method universally improved veterinarians' diagnostic confidence. While certain CAM methods provided better visual cues for some pathologies, they generally offered limited explainability and did not substantially improve veterinarians' diagnostic confidence.
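For readers unfamiliar with how a CAM heatmap is formed, the following is a minimal NumPy sketch of the basic Grad-CAM combination step. It is not the study's code: the function name `grad_cam` and the assumption that activations and gradients from one convolutional layer are already available as `(C, H, W)` arrays are illustrative only.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Combine one layer's feature maps into a Grad-CAM style heatmap.

    activations, gradients: arrays of shape (C, H, W); the gradients are
    those of the class score with respect to the activations (assumed to
    have been computed elsewhere, e.g. by a deep learning framework).
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    # Channel weights: global average of the gradients over space.
    weights = gradients.mean(axis=(1, 2))               # shape (C,)
    # Weighted sum of feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)    # shape (H, W)
    # Keep only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    # Scale to [0, 1] for display; leave an all-zero map untouched.
    return cam / peak if peak > 0 else cam
```

The other CAM variants named in the abstract differ mainly in how the channel weights or the combination are computed; this sketch covers only the plain Grad-CAM case.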

X-Ray Classification Methodology In Silico Reproducibility

Multimodal ensemble machine learning predicts neurological outcome within three hours after out of hospital cardiac arrest.

Kawai Y, Yamamoto K, Tsuruta K, Miyazaki K, Asai H, Fukushima H

•papers•Aug 13 2025

This study aimed to determine if an ensemble (stacking) model that integrates three independently developed base models can reliably predict patients' neurological outcomes following out-of-hospital cardiac arrest (OHCA) within 3 h of arrival and outperform each individual model. This retrospective study included patients with OHCA (≥ 18 years) admitted directly to Nara Medical University between April 2015 and March 2024 who remained comatose for ≥ 3 h after arrival and had suitable head computed tomography (CT) images. The area under the receiver operating characteristic curve (AUC) and Brier scores were used to evaluate the performance of four models (resuscitation-related background OHCA score factors, bilateral pupil diameter, single-slice head CT within 3 h of arrival, and an ensemble stacked model combining these three models) in predicting favourable neurological outcomes at hospital discharge or 1 month, as defined by a Cerebral Performance Category scale of 1-2. Among 533 patients, 82 (15%) had favourable outcomes. The OHCA, pupil, and head CT models yielded AUCs of 0.76, 0.65, and 0.68 with Brier scores of 0.11, 0.13, and 0.12, respectively. The ensemble model outperformed the other models (AUC, 0.82; Brier score, 0.10), thereby supporting its application for early clinical decision-making and optimising resource allocation.
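The two metrics reported above can be sketched in a few lines of NumPy. The helper names `brier_score` and `auc` and the toy labels are illustrative assumptions; only the 82/533 prevalence is taken from the abstract. The last lines show the Brier score a constant prediction at that prevalence would obtain, which gives a reference point for reading the reported values.

```python
import numpy as np

def brier_score(y_true, prob):
    """Mean squared difference between predicted probability and outcome."""
    y_true = np.asarray(y_true, dtype=float)
    prob = np.asarray(prob, dtype=float)
    return float(np.mean((prob - y_true) ** 2))

def auc(y_true, scores):
    """Probability that a random positive scores above a random negative
    (ties count as one half); equivalent to the area under the ROC curve."""
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    diff = s[y][:, None] - s[~y][None, :]   # every positive/negative pair
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Toy example (hypothetical values).
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.4, 0.35, 0.8]
print(auc(y_toy, p_toy))          # 0.75
print(brier_score(y_toy, p_toy))  # 0.158125

# Reference point: always predicting the cohort prevalence (82 of 533).
y_cohort = np.array([1] * 82 + [0] * 451)
constant = np.full(y_cohort.shape, 82 / 533)
print(round(brier_score(y_cohort, constant), 3))  # 0.13
```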

CT Classification Neurological Retrospective Clinical In Silico Academic Lab

In vivo variability of MRI radiomics features in prostate lesions assessed by a test-retest study with repositioning.

Zhang KS, Neelsen CJO, Wennmann M, Hielscher T, Kovacs B, Glemser PA, Görtz M, Stenzinger A, Maier-Hein KH, Huber J, Schlemmer HP, Bonekamp D

•papers•Aug 13 2025

Despite academic success, radiomics-based machine learning algorithms have not reached clinical practice, partially due to limited repeatability/reproducibility. To address this issue, this work aims to identify a stable subset of radiomics features in prostate MRI for radiomics modelling. A prospective study was conducted in 43 patients who received a clinical MRI examination and a research exam with repetition of T2-weighted and two different diffusion-weighted imaging (DWI) sequences with repositioning in between. Radiomics feature (RF) extraction was performed from MRI segmentations accounting for intra-rater and inter-rater effects, and three different image normalization methods were compared. Stability of RFs was assessed using the concordance correlation coefficient (CCC) for different comparisons: rater effects, inter-scan (before and after repositioning) and inter-sequence (between the two diffusion-weighted sequences) variability. In total, only 64 out of 321 (~ 20%) extracted features demonstrated stability, defined as CCC ≥ 0.75 in all settings (5 high-b value, 7 ADC- and 52 T2-derived features). For DWI, primarily intensity-based features proved stable with no shape feature passing the CCC threshold. T2-weighted images possessed the largest number of stable features with multiple shape (7), intensity-based (7) and texture features (28). Z-score normalization for high-b value images and muscle-normalization for T2-weighted images were identified as suitable.
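As an illustration of the stability criterion, here is a minimal sketch of Lin's concordance correlation coefficient for one feature measured twice. The function name `ccc` and the toy values are assumptions for illustration; the 0.75 threshold is the one stated in the abstract.

```python
import numpy as np

def ccc(x, y):
    """Lin's concordance correlation coefficient between two measurements.

    Unlike Pearson correlation, it penalises systematic offsets and scale
    differences between the first and the repeated measurement.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))     # population covariance
    return float(2 * cov / (x.var() + y.var() + (mx - my) ** 2))

scan_1 = [1.0, 2.0, 3.0, 4.0]              # hypothetical feature values
scan_2 = [2.0, 3.0, 4.0, 5.0]              # same values shifted by +1
print(round(ccc(scan_1, scan_1), 3))       # 1.0
print(round(ccc(scan_1, scan_2), 3))       # 0.714, below the 0.75 threshold
```

The second case has a Pearson correlation of exactly 1, which shows why a concordance measure is the stricter choice for test-retest work.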

MRI Segmentation Abdominal Prospective In Silico Academic Lab

Machine Learning-Driven Radiomic Profiling of Thalamus-Amygdala Nuclei for Prediction of Postoperative Delirium After STN-DBS in Parkinson's Disease Patients: A Pilot Study.

Radziunas A, Davidavicius G, Reinyte K, Pranckeviciene A, Fedaravicius A, Kucinskas V, Laucius O, Tamasauskas A, Deltuva V, Saudargiene A

•papers•Aug 13 2025

Postoperative delirium is a common complication following sub-thalamic nucleus deep brain stimulation surgery in Parkinson's disease patients. Postoperative delirium has been shown to prolong hospital stays, harm cognitive function, and negatively impact outcomes. Utilizing radiomics as a predictive tool for identifying patients at risk of delirium is a novel and personalized approach. This pilot study analyzed preoperative T1-weighted and T2-weighted magnetic resonance images from 34 Parkinson's disease patients, which were used to segment the thalamus, amygdala, and hippocampus, resulting in 10,680 extracted radiomic features. Feature selection using the minimum redundancy maximal relevance method identified the 20 most informative features, which were input into eight different machine learning algorithms. High predictive accuracy for postoperative delirium was achieved by applying regularized binary logistic regression and linear discriminant analysis using the 10 most informative radiomic features. Regularized logistic regression resulted in 96.97% (±6.20) balanced accuracy, 99.5% (±4.97) sensitivity, 94.43% (±10.70) specificity, and area under the receiver operating characteristic curve of 0.97 (±0.06). Linear discriminant analysis showed 98.42% (±6.57) balanced accuracy, 98.00% (±9.80) sensitivity, 98.83% (±4.63) specificity, and area under the receiver operating characteristic curve of 0.98 (±0.07). The feed-forward neural network also demonstrated strong predictive capacity, achieving 96.17% (±10.40) balanced accuracy, 94.5% (±19.87) sensitivity, 97.83% (±7.87) specificity, and an area under the receiver operating characteristic curve of 0.96 (±0.10). 
However, when the feature set was extended to 20 features, both logistic regression and linear discriminant analysis showed reduced performance, while the feed-forward neural network achieved the highest predictive accuracy of 99.28% (±2.71), with 100.0% (±0.00) sensitivity, 98.57% (±5.42) specificity, and an area under the receiver operating characteristic curve of 0.99 (±0.03). Selected radiomic features might indicate network dysfunction between thalamic laterodorsal, reuniens medial ventral, and amygdala basal nuclei with hippocampus cornu ammonis 4 in these patients. This finding expands previous research suggesting the importance of the thalamic-hippocampal-amygdala network for postoperative delirium due to alterations in neuronal activity.
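The reported balanced accuracies can be checked against the reported sensitivities and specificities, assuming the usual binary definition (the mean of the two rates). The sketch below uses only figures from the abstract; the dictionary keys and the function name are illustrative.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy for a binary task: mean of the two class-wise rates."""
    return (sensitivity + specificity) / 2.0

# (sensitivity %, specificity %, reported balanced accuracy %)
reported = {
    "logistic regression, 10 features": (99.5, 94.43, 96.97),
    "LDA, 10 features": (98.00, 98.83, 98.42),
    "neural network, 10 features": (94.5, 97.83, 96.17),
    "neural network, 20 features": (100.0, 98.57, 99.28),
}

for name, (sens, spec, stated) in reported.items():
    computed = balanced_accuracy(sens, spec)
    # Every row agrees with the stated value to within rounding.
    print(f"{name}: computed {computed:.3f}, stated {stated:.2f}")
```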

MRI Classification Neurological Retrospective Clinical In Silico Academic Lab

A stacking ensemble framework integrating radiomics and deep learning for prognostic prediction in head and neck cancer.

Wang B, Liu J, Zhang X, Lin J, Li S, Wang Z, Cao Z, Wen D, Liu T, Ramli HRH, Harith HH, Hasan WZW, Dong X

•papers•Aug 13 2025

Radiomics models frequently face challenges related to reproducibility and robustness. To address these issues, we propose a multimodal, multi-model fusion framework utilizing stacking ensemble learning for prognostic prediction in head and neck cancer (HNC). This approach seeks to improve the accuracy and reliability of survival predictions. A total of 806 cases from nine centers were collected; 143 cases from two centers were assigned as the external validation cohort, while the remaining 663 were stratified and randomly split into training (n = 530) and internal validation (n = 133) sets. Radiomics features were extracted according to IBSI standards, and deep learning features were obtained using a 3D DenseNet-121 model. Following feature selection, the selected features were input into Cox, SVM, RSF, DeepCox, and DeepSurv models. A stacking fusion strategy was employed to develop the prognostic model. Model performance was evaluated using Kaplan-Meier survival curves and time-dependent ROC curves. On the external validation set, the model using combined PET and CT radiomics features achieved superior performance compared to single-modality models, with the RSF model obtaining the highest concordance index (C-index) of 0.7302. When using deep features extracted by 3D DenseNet-121, the PET + CT-based models demonstrated significantly improved prognostic accuracy, with DeepSurv and DeepCox achieving C-indices of 0.9217 and 0.9208, respectively. In stacking models, the PET + CT model using only radiomics features reached a C-index of 0.7324, while the deep feature-based stacking model achieved 0.9319. The best performance was obtained by the multi-feature fusion model, which integrated both radiomics and deep learning features from PET and CT, yielding a C-index of 0.9345. Kaplan-Meier survival analysis further confirmed the fusion model's ability to distinguish between high-risk and low-risk groups. 
The stacking-based ensemble model demonstrates superior performance compared to individual machine learning models, markedly improving the robustness of prognostic predictions.
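The C-index figures above measure how often a model ranks patients in the right order of risk. A minimal sketch of Harrell's concordance index for right-censored data follows; the function name, the handling of tied times (ignored) and the toy cohort are assumptions for illustration, not the paper's implementation.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risk:   predicted risk score (higher means earlier expected event)
    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied predictions count as one half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical four-patient cohort; the third patient is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.2, 0.4]))  # 1.0
print(concordance_index(times, events, [0.9, 0.1, 0.2, 0.4]))  # 0.6
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.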

Mixed Modality Classification Retrospective Clinical In Silico Academic Lab Benchmark SOTA

ES-UNet: efficient 3D medical image segmentation with enhanced skip connections in 3D UNet.

Park M, Oh S, Park J, Jeong T, Yu S

•papers•Aug 13 2025

Deep learning has significantly advanced medical image analysis, particularly in semantic segmentation, which is essential for clinical decisions. However, existing 3D segmentation models, like the traditional 3D UNet, face challenges in balancing computational efficiency and accuracy when processing volumetric medical data. This study aims to develop an improved architecture for 3D medical image segmentation with enhanced learning strategies to improve accuracy and address challenges related to limited training data. We propose ES-UNet, a 3D segmentation architecture that achieves superior segmentation performance while offering competitive efficiency across multiple computational metrics, including memory usage, inference time, and parameter count. The model builds upon the full-scale skip connection design of UNet3+ by integrating channel attention modules into each encoder-to-decoder path and incorporating full-scale deep supervision to enhance multi-resolution feature learning. We further introduce Region Specific Scaling (RSS), a data augmentation method that adaptively applies geometric transformations to annotated regions, and a Dynamically Weighted Dice (DWD) loss to improve the balance between precision and recall. The model was evaluated on the MICCAI HECKTOR dataset, and additional validation was conducted on selected tasks from the Medical Segmentation Decathlon (MSD). On the HECKTOR dataset, ES-UNet achieved a Dice Similarity Coefficient (DSC) of 76.87%, outperforming baseline models including 3D UNet, 3D UNet 3+, nnUNet, and Swin UNETR. Ablation studies showed that RSS and DWD contributed up to 1.22% and 1.06% improvement in DSC, respectively. A sensitivity analysis demonstrated that the chosen scaling range in RSS offered a favorable trade-off between deformation and anatomical plausibility. Cross-dataset evaluation on MSD Heart and Spleen tasks also indicated strong generalization. 
Computational analysis revealed that ES-UNet achieves superior segmentation performance with moderate computational demands. Specifically, the enhanced skip connection design with lightweight channel attention modules integrated throughout the network architecture enables this favorable balance between high segmentation accuracy and computational efficiency. ES-UNet integrates architectural and algorithmic improvements to achieve robust 3D medical image segmentation. While the framework incorporates established components, its core contributions lie in the optimized skip connection strategy and supporting techniques like RSS and DWD. Future work will explore adaptive scaling strategies and broader validation across diverse imaging modalities.
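For reference, the Dice Similarity Coefficient used as the headline metric can be sketched as below for binary 3D masks. The function name and the toy volumes are illustrative; this is the plain overlap metric, not the paper's Dynamically Weighted Dice loss, whose weighting scheme is not described in the abstract.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    overlap = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else float(2.0 * overlap / total)

# Hypothetical 4x4x4 volumes whose foreground slabs overlap by half.
pred = np.zeros((4, 4, 4), dtype=bool)
target = np.zeros((4, 4, 4), dtype=bool)
pred[0:2] = True     # slices 0 and 1
target[1:3] = True   # slices 1 and 2
print(dice_coefficient(pred, target))  # 0.5
```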

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA

An optimized multi-task contrastive learning framework for HIFU lesion detection and segmentation.

Zavar M, Ghaffari HR, Tabatabaee H

•papers•Aug 13 2025

Accurate detection and segmentation of lesions induced by High-Intensity Focused Ultrasound (HIFU) in medical imaging remain significant challenges in automated disease diagnosis. Traditional methods heavily rely on labeled data, which is often scarce, expensive, and time-consuming to obtain. Moreover, existing approaches frequently struggle with variations in medical data and the limited availability of annotated datasets, leading to suboptimal performance. To address these challenges, this paper introduces an innovative framework called the Optimized Multi-Task Contrastive Learning Framework (OMCLF), which leverages self-supervised learning (SSL) and genetic algorithms (GA) to enhance HIFU lesion detection and segmentation. OMCLF integrates classification and segmentation into a unified model, utilizing a shared backbone to extract common features. The framework systematically optimizes feature representations, hyperparameters, and data augmentation strategies tailored for medical imaging, ensuring that critical information, such as lesion details, is preserved. By employing a genetic algorithm, OMCLF explores and optimizes augmentation techniques suitable for medical data, avoiding distortions that could compromise diagnostic accuracy. Experimental results demonstrate that OMCLF outperforms single-task methods in both classification and segmentation tasks while significantly reducing dependency on labeled data. Specifically, OMCLF achieves an accuracy of 93.3% in lesion detection and a Dice score of 92.5% in segmentation, surpassing state-of-the-art methods such as SimCLR and MoCo. The proposed approach achieves superior accuracy in identifying and delineating HIFU-induced lesions, marking a substantial advancement in medical image interpretation and automated diagnosis. OMCLF represents a significant step forward in the evolutionary optimization of self-supervised learning, with potential applications across various medical imaging domains.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA

Explainable AI Technique in Lung Cancer Detection Using Convolutional Neural Networks

Nishan Rai, Sujan Khatri, Devendra Risal

•preprint•Aug 13 2025

Early detection of lung cancer is critical to improving survival outcomes. We present a deep learning framework for automated lung cancer screening from chest computed tomography (CT) images with integrated explainability. Using the IQ-OTH/NCCD dataset (1,197 scans across Normal, Benign, and Malignant classes), we evaluate a custom convolutional neural network (CNN) and three fine-tuned transfer learning backbones: DenseNet121, ResNet152, and VGG19. Models are trained with cost-sensitive learning to mitigate class imbalance and evaluated via accuracy, precision, recall, F1-score, and ROC-AUC. While ResNet152 achieved the highest accuracy (97.3%), DenseNet121 provided the best overall balance in precision, recall, and F1 (up to 92%, 90%, 91%, respectively). We further apply Shapley Additive Explanations (SHAP) to visualize evidence contributing to predictions, improving clinical transparency. Results indicate that CNN-based approaches augmented with explainability can provide fast, accurate, and interpretable support for lung cancer screening, particularly in resource-limited settings.

CT Classification Chest Retrospective Clinical In Silico GenAI

Deep Learning Enables Large-Scale Shape and Appearance Modeling in Total-Body DXA Imaging

Arianna Bunnell, Devon Cataldi, Yannik Glaser, Thomas K. Wolfgruber, Steven Heymsfield, Alan B. Zonderman, Thomas L. Kelly, Peter Sadowski, John A. Shepherd

•preprint•Aug 13 2025

Total-body dual X-ray absorptiometry (TBDXA) imaging is a relatively low-cost whole-body imaging modality, widely used for body composition assessment. We develop and validate a deep learning method for automatic fiducial point placement on TBDXA scans using 1,683 manually annotated TBDXA scans. The method achieves a percentage of correct keypoints of 99.5% in an external testing dataset. To demonstrate the value for shape and appearance modeling (SAM), our method is used to place keypoints on 35,928 scans for five different TBDXA imaging modes, then associations with health markers are tested in two cohorts not used for SAM model generation using two-sample Kolmogorov-Smirnov tests. SAM feature distributions associated with health biomarkers are shown to corroborate existing evidence and generate new hypotheses on body composition and shape's relationship to various frailty, metabolic, inflammation, and cardiometabolic health markers. Evaluation scripts, model weights, automatic point file generation code, and triangulation files are available at https://github.com/hawaii-ai/dxa-pointplacement.
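The keypoint metric can be sketched as follows. The abstract does not state the distance threshold, so the `alpha` fraction and the reference size used here are assumptions chosen for illustration, as are the function name and coordinates.

```python
import numpy as np

def percentage_correct_keypoints(pred, truth, ref_size, alpha=0.1):
    """Fraction of predicted keypoints within alpha * ref_size of the truth.

    pred, truth: arrays of shape (N, 2) holding (x, y) pixel coordinates
    ref_size:    normalising length, e.g. image height (an assumption here)
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    distances = np.linalg.norm(pred - truth, axis=-1)
    return float(np.mean(distances <= alpha * ref_size))

truth = np.zeros((4, 2))
pred = np.array([[0, 0], [3, 4], [12, 16], [4, 3]])  # distances 0, 5, 20, 5
print(percentage_correct_keypoints(pred, truth, ref_size=100))  # 0.75
```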

X-Ray Detection Whole Body Methodology In Silico Academic Lab Open Code

Ultrasonic Texture Analysis for Predicting Acute Myocardial Infarction.

Jamthikar AD, Hathaway QA, Maganti K, Hamirani Y, Bokhari S, Yanamala N, Sengupta PP

•papers•Aug 13 2025

Acute myocardial infarction (MI) alters cardiomyocyte geometry and architecture, leading to changes in the acoustic properties of the myocardium. This study examines ultrasomics, a novel cardiac ultrasound-based radiomics technique to extract high-throughput pixel-level information from images, for identifying ultrasonic texture and morphologic changes associated with infarcted myocardium. We included 684 participants from multisource data: a) a retrospective single-center matched case-control dataset, b) a prospective multicenter matched clinical trial dataset, and c) an open-source international and multivendor dataset. Handcrafted and deep transfer learning-based ultrasomics features from 2- and 4-chamber echocardiographic views were used to train machine learning (ML) models with the use of leave-one-source-out cross-validation for external validation. The ML model showed a higher AUC than transfer learning-based deep features in identifying MI (AUCs: 0.87 [95% CI: 0.84-0.89] vs 0.74 [95% CI: 0.70-0.77]; P < 0.0001). ML probability was an independent predictor of MI even after adjusting for conventional echocardiographic parameters (adjusted OR: 1.03 [95% CI: 1.01-1.05]; P < 0.0001). ML probability showed diagnostic value in differentiating acute MI, even in the presence of myocardial dysfunction (averaged longitudinal strain [LS] <16%) (AUC: 0.84 [95% CI: 0.77-0.89]). In addition, combining averaged LS with ML probability significantly improved predictive performance compared with LS alone (AUCs: 0.86 [95% CI: 0.80-0.91] vs 0.80 [95% CI: 0.72-0.87]; P = 0.02). Visualization of ultrasomics features with the use of a Manhattan plot discriminated infarcted and noninfarcted segments (P < 0.001) and facilitated parametric visualization of infarcted myocardium. 
This study demonstrates the potential of cardiac ultrasomics to distinguish healthy from infarcted myocardium and highlights the need for validation in diverse populations to define its role and incremental value in myocardial tissue characterization beyond conventional echocardiography.
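The leave-one-source-out scheme mentioned above holds out every sample from one data source in turn, so each fold acts as an external test set. A minimal sketch of the index split follows; the generator name and the source labels are hypothetical and not taken from the study.

```python
import numpy as np

def leave_one_source_out(sources):
    """Yield (held_out_source, train_indices, test_indices) per data source."""
    sources = np.asarray(sources)
    for held_out in np.unique(sources):
        test_idx = np.flatnonzero(sources == held_out)
        train_idx = np.flatnonzero(sources != held_out)
        yield held_out, train_idx, test_idx

# Hypothetical labels for six samples drawn from three sources.
sources = ["case_control", "case_control", "trial", "open", "open", "open"]
for held_out, train_idx, test_idx in leave_one_source_out(sources):
    print(held_out, train_idx.tolist(), test_idx.tolist())
```

Because no source contributes to both sides of a fold, performance on the held-out source reflects transfer across centres and vendors rather than within-source fit.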

Ultrasound Classification Cardiac Retrospective Clinical In Silico Academic Lab Benchmark SOTA Open Dataset
