Page 48 of 346 · 3455 results

Impact of a deep learning image reconstruction algorithm on the robustness of abdominal computed tomography radiomics features using standard and low radiation doses.

Yang S, Bie Y, Zhao L, Luan K, Li X, Chi Y, Bian Z, Zhang D, Pang G, Zhong H

PubMed · Sep 1, 2025
Deep learning image reconstruction (DLIR) can enhance image quality and reduce radiation dose, yet its impact on radiomics features (RFs) remains unclear. This study aimed to compare the effects of DLIR and the conventional adaptive statistical iterative reconstruction-Veo (ASIR-V) algorithm on the robustness of RFs using standard- and low-dose abdominal clinical computed tomography (CT) scans. A total of 54 patients with hepatic masses who underwent abdominal contrast-enhanced CT scans were retrospectively analyzed. The raw data of the standard-dose venous phase and the low-dose delayed phase were reconstructed using five reconstruction settings: ASIR-V at 30% (ASIR-V30%) and 70% (ASIR-V70%) strength, and DLIR at low (DLIR-L), medium (DLIR-M), and high (DLIR-H) strength. The PyRadiomics platform was used to extract RFs from 18 regions of interest (ROIs) in different organs and tissues. The consistency of RFs across algorithms and strength levels was tested by the coefficient of variation (CV) and the quartile coefficient of dispersion (QCD). The consistency of RFs between strength levels of the same algorithm and between clinically comparable levels across algorithms was evaluated by the intraclass correlation coefficient (ICC). Robust features were identified by Kruskal-Wallis and Mann-Whitney U tests. Among the five reconstruction methods, the mean CV and QCD in the standard-dose group were 0.364 and 0.213, respectively; the corresponding values in the low-dose group were 0.444 and 0.245. The mean ICC values between ASIR-V30% and ASIR-V70%, DLIR-L and M, DLIR-M and H, DLIR-L and H, ASIR-V30% and DLIR-M, and ASIR-V70% and DLIR-H were 0.672, 0.734, 0.756, 0.629, 0.724, and 0.651, respectively, in the standard-dose group; the corresponding values in the low-dose group were 0.500, 0.567, 0.700, 0.474, 0.499, and 0.650. The ICC values between DLIR-M and H under low-dose conditions were even higher than those between ASIR-V30% and -V70% under standard-dose conditions. Among the five reconstruction settings, averages of 14.0% (117/837) and 10.3% (86/837) of RFs across the 18 ROIs were robust under standard-dose and low-dose conditions, respectively. Some 23.1% (193/837) of RFs were robust between the low-dose DLIR-M and H groups, higher than the 21.0% (176/837) observed between the standard-dose ASIR-V30% and -V70% groups. Most RFs lacked reproducibility across algorithms and dose levels. However, DLIR at medium (M) and high (H) strength significantly improved RF consistency and robustness, even at reduced doses.
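As a rough illustration of the consistency metrics used above, here is a minimal numpy sketch (not the authors' pipeline) computing the coefficient of variation and quartile coefficient of dispersion for a single radiomics feature measured under the five reconstruction settings; the feature values are invented for the example:

```python
import numpy as np

def coefficient_of_variation(x):
    """CV = sample standard deviation / mean (dimensionless spread measure)."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()

def quartile_coefficient_of_dispersion(x):
    """QCD = (Q3 - Q1) / (Q3 + Q1), a robust analogue of the CV."""
    q1, q3 = np.percentile(np.asarray(x, dtype=float), [25, 75])
    return (q3 - q1) / (q3 + q1)

# One hypothetical RF value per reconstruction setting:
# ASIR-V30%, ASIR-V70%, DLIR-L, DLIR-M, DLIR-H
feature = [1.00, 1.10, 0.95, 1.05, 0.90]
cv = coefficient_of_variation(feature)
qcd = quartile_coefficient_of_dispersion(feature)
print(f"CV={cv:.3f}, QCD={qcd:.3f}")
```

Lower CV and QCD across settings indicate a more reconstruction-robust feature; the study averages these over all features and ROIs.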

Impact of pre-test probability on AI-LVO detection: a systematic review of LVO prevalence across clinical contexts.

Olivé-Gadea M, Mayol J, Requena M, Rodrigo-Gisbert M, Rizzo F, Garcia-Tornel A, Simonetti R, Diana F, Muchada M, Pagola J, Rodriguez-Luna D, Rodriguez-Villatoro N, Rubiera M, Molina CA, Tomasello A, Hernandez D, de Dios Lascuevas M, Ribo M

PubMed · Aug 31, 2025
Rapid identification of large vessel occlusion (LVO) in acute ischemic stroke (AIS) is essential for reperfusion therapy. Screening tools, including artificial intelligence (AI)-based algorithms, have been developed to accelerate detection, but their performance depends heavily on pre-test LVO prevalence. This study aimed to review LVO prevalence across clinical contexts and analyze its impact on AI-algorithm performance. We systematically reviewed studies reporting consecutive suspected-AIS cohorts. Cohorts were grouped into four clinical scenarios based on patient selection criteria: (a) high suspicion of LVO by stroke specialists (direct-to-angiosuite candidates), (b) high suspicion of LVO according to pre-hospital scales, and (c) and (d) any suspected AIS without a severity cut-off, in a hospital or pre-hospital setting, respectively. We analyzed LVO prevalence in each scenario and assessed the false discovery rate (FDR, the proportion of positive calls that are false positives) that would result from applying eight commercially available LVO-detection algorithms. We included 87 cohorts from 80 studies. Median LVO prevalence was: (a) 84% (77-87%), (b) 35% (26-42%), (c) 19% (14-25%), and (d) 14% (8-22%). At the high prevalence of scenario (a), the FDR ranged between 0.007 (1 false positive in 142 positives) and 0.023 (1 in 43), whereas in the low-prevalence scenarios (c and d) it ranged between 0.168 (1 in 6) and 0.543 (over 1 in 2). To ensure meaningful clinical impact, AI algorithms must be evaluated within the specific populations and care pathways where they are applied.
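The dependence of the false discovery rate on pre-test prevalence can be illustrated with a short, self-contained Python sketch; the 90% sensitivity and 95% specificity figures below are hypothetical and are not taken from any of the eight evaluated algorithms:

```python
def false_discovery_rate(prevalence, sensitivity, specificity):
    """FDR = FP / (FP + TP) for a screening test applied at a given pre-test prevalence."""
    tp = prevalence * sensitivity              # expected true-positive fraction
    fp = (1.0 - prevalence) * (1.0 - specificity)  # expected false-positive fraction
    return fp / (fp + tp)

# Same hypothetical algorithm, run at the four median prevalences reported above
for prev in (0.84, 0.35, 0.19, 0.14):
    fdr = false_discovery_rate(prev, sensitivity=0.90, specificity=0.95)
    print(f"prevalence={prev:.0%}  FDR={fdr:.3f}  (~1 false positive in {1 / fdr:.0f} positives)")
```

Even with fixed sensitivity and specificity, the FDR grows sharply as prevalence falls, which is the mechanism behind the scenario-dependent figures in the review.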

Adaptive Contrast Adjustment Module: A Clinically-Inspired Plug-and-Play Approach for Enhanced Fetal Plane Classification

Yang Chen, Sanglin Zhao, Baoyu Chen, Mans Gustaf

arXiv preprint · Aug 31, 2025
Fetal ultrasound standard plane classification is essential for reliable prenatal diagnosis but faces inherent challenges, including low tissue contrast, boundary ambiguity, and operator-dependent variations in image quality. To overcome these limitations, we propose a plug-and-play adaptive contrast adjustment module (ACAM), whose core design is inspired by the clinical practice of adjusting image contrast to obtain clearer and more discriminative structural information. The module employs a shallow texture-sensitive network to predict clinically plausible contrast parameters, transforms input images into multiple contrast-enhanced views through differentiable mapping, and fuses them within downstream classifiers. Validated on a multi-center dataset of 12,400 images across six anatomical categories, the module consistently improves performance across diverse models, increasing accuracy by 2.02 percent for lightweight models, 1.29 percent for traditional models, and 1.15 percent for state-of-the-art models. The innovation of the module lies in its content-aware adaptation capability: it replaces random preprocessing with physics-informed transformations that align with sonographer workflows while improving robustness to imaging heterogeneity through multi-view fusion. This approach effectively bridges low-level image features with high-level semantics, establishing a new paradigm for medical image analysis under real-world variations in image quality.
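A conceptual numpy sketch of the multi-view idea (not the ACAM implementation, which predicts contrast parameters with a learned texture-sensitive network): each image is passed through several differentiable contrast curves, and the classifier scores of the views are fused by averaging. The gamma values and toy classifier below are invented for illustration:

```python
import numpy as np

def contrast_view(img, gamma):
    """Differentiable contrast mapping: a gamma curve applied to [0, 1] intensities."""
    return np.clip(img, 0.0, 1.0) ** gamma

def multi_view_fusion(img, gammas, classifier):
    """Average classifier scores over several contrast-enhanced views of one image."""
    views = [contrast_view(img, g) for g in gammas]
    scores = np.stack([classifier(v) for v in views])   # shape: (n_views, n_classes)
    return scores.mean(axis=0)

# Toy example: a stand-in "classifier" that scores mean intensity per class
rng = np.random.default_rng(0)
ultrasound = rng.random((32, 32))            # stand-in for a fetal ultrasound frame
toy_classifier = lambda x: np.array([x.mean(), 1.0 - x.mean()])
fused = multi_view_fusion(ultrasound, gammas=[0.5, 1.0, 2.0], classifier=toy_classifier)
print(fused)
```

Because the gamma mapping is differentiable, gradients can flow back through the views to whatever network proposes the contrast parameters, which is what makes the module trainable end to end.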

Can General-Purpose Omnimodels Compete with Specialists? A Case Study in Medical Image Segmentation

Yizhe Zhang, Qiang Chen, Tao Zhou

arXiv preprint · Aug 31, 2025
The emergence of powerful, general-purpose omnimodels capable of processing diverse data modalities has raised a critical question: can these "jack-of-all-trades" systems perform on par with highly specialized models in knowledge-intensive domains? This work investigates this question within the high-stakes field of medical image segmentation. We conduct a comparative study analyzing the zero-shot performance of a state-of-the-art omnimodel (Gemini 2.5 Pro, the "Nano Banana" model) against domain-specific deep learning models on three distinct tasks: polyp (endoscopy), retinal vessel (fundus), and breast tumor (ultrasound) segmentation. Our study focuses on performance at the extremes by curating subsets of the "easiest" and "hardest" cases based on the specialist models' accuracy. Our findings reveal a nuanced and task-dependent landscape. For polyp and breast tumor segmentation, specialist models excel on easy samples, but the omnimodel demonstrates greater robustness on hard samples where specialists fail catastrophically. Conversely, for the fine-grained task of retinal vessel segmentation, the specialist model maintains superior performance across both easy and hard cases. Intriguingly, qualitative analysis suggests omnimodels may possess higher sensitivity, identifying subtle anatomical features missed by human annotators. Our results indicate that while current omnimodels are not yet a universal replacement for specialists, their unique strengths suggest a potential complementary role with specialist models, particularly in enhancing robustness on challenging edge cases.

Noncontrast CT-based deep learning for predicting intracerebral hemorrhage expansion incorporating growth of intraventricular hemorrhage.

Ning Y, Yu Q, Fan X, Jiang W, Chen X, Jiang H, Xie K, Liu R, Zhou Y, Zhang X, Lv F, Xu X, Peng J

PubMed · Aug 31, 2025
Intracerebral hemorrhage (ICH) is a severe form of stroke with high mortality and disability, in which early hematoma expansion (HE) critically influences prognosis. Previous studies suggest that revised hematoma expansion (rHE), defined to include intraventricular hemorrhage (IVH) growth, provides improved prognostic accuracy. This study therefore aimed to develop a deep learning model based on noncontrast CT (NCCT) to identify ICH patients at high risk of rHE, enabling timely intervention. A retrospective dataset of 775 spontaneous ICH patients with baseline and follow-up CT scans was collected from two centers and split into training (n = 389), internal-testing (n = 167), and external-testing (n = 219) cohorts. 2D/3D convolutional neural network (CNN) models based on ResNet-101, ResNet-152, DenseNet-121, and DenseNet-201 were separately developed using baseline NCCT images, and the activation areas of the optimal deep learning model were visualized using gradient-weighted class activation mapping (Grad-CAM). Two baseline logistic regression clinical models, based on the BRAIN score and on independent clinical-radiologic predictors, were also developed, along with combined-logistic and combined-SVM models incorporating handcrafted radiomics features and clinical-radiologic factors. Model performance was assessed using the area under the receiver operating characteristic curve (AUC). The 2D-ResNet-101 model outperformed the others, with an AUC of 0.777 (95%CI, 0.716-0.830) in the external-testing set, surpassing the baseline clinical-radiologic model and the BRAIN score (AUC increases of 0.087, p = 0.022, and 0.119, p = 0.003). Compared to the combined-logistic and combined-SVM models, AUC increased by 0.083 (p = 0.029) and 0.074 (p = 0.058), respectively. The deep learning model can identify ICH patients at high risk of rHE with better predictive performance than traditional baseline models based on clinical-radiologic variables and radiomics features.
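Since the model comparison above rests on the AUC, a minimal pure-Python sketch of its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half) may help; this is a generic illustration, not the study's evaluation code:

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the normalized Mann-Whitney U statistic: P(score_pos > score_neg), ties = 1/2."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy checks with invented risk scores for rHE-positive vs. rHE-negative patients
print(auc_from_scores([0.9, 0.8], [0.2, 0.1]))   # perfectly separated -> 1.0
print(auc_from_scores([0.9, 0.3], [0.4, 0.1]))   # one discordant pair -> 0.75
```

An AUC of 0.777, as reported for 2D-ResNet-101, thus means the model ranks a random expander above a random non-expander about 78% of the time.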

MSFE-GallNet-X: a multi-scale feature extraction-based CNN Model for gallbladder disease analysis with enhanced explainability.

Nabil HR, Ahmed I, Das A, Mridha MF, Kabir MM, Aung Z

PubMed · Aug 30, 2025
This study introduces MSFE-GallNet-X, a domain-adaptive deep learning model utilizing multi-scale feature extraction (MSFE) to improve the classification accuracy of gallbladder diseases from grayscale ultrasound images, while integrating explainable artificial intelligence (XAI) methods to enhance clinical interpretability. We developed a convolutional neural network-based architecture that automatically learns multi-scale features from a dataset comprising 10,692 high-resolution ultrasound images from 1,782 patients, covering nine gallbladder disease classes, including gallstones, cholecystitis, and carcinoma. The model incorporated Gradient-Weighted Class Activation Mapping (Grad-CAM) and Local Interpretable Model-Agnostic Explanations (LIME) to provide visual interpretability of diagnostic predictions. Model performance was evaluated using standard metrics, including accuracy and F1 score. MSFE-GallNet-X achieved a classification accuracy of 99.63% and an F1 score of 99.50%, outperforming state-of-the-art models including VGG-19 (98.89%) and DenseNet121 (91.81%), while maintaining greater parameter efficiency (only 1.91 M parameters) in gallbladder disease classification. Visualization through Grad-CAM and LIME highlighted the critical image regions influencing model predictions, supporting explainability for clinical use. MSFE-GallNet-X demonstrates strong performance on a controlled and balanced dataset, suggesting its potential as an AI-assisted tool for clinical decision-making in gallbladder disease management.

Interpretable Auto Window setting for deep-learning-based CT analysis.

Zhang Y, Chen M, Zhang Z

PubMed · Aug 30, 2025
Whether during the early days of its popularization or in the present, window setting has always been an indispensable part of the computed tomography (CT) analysis process. Although research has investigated the capability of CT multi-window fusion to enhance neural networks, there remains a paucity of domain-invariant, intuitively interpretable methodologies for automatic window setting. In this work, we propose a plug-and-play module derived from the Tanh activation function. The module enables medical imaging neural network backbones to be deployed without manual CT window configuration, and its domain-invariant design allows the preference decisions rendered by the adaptive mechanism to be observed from a clinically intuitive perspective. We confirm the effectiveness of the proposed method on multiple open-source datasets: it allows direct training without manual window setting and yields improvements of 54%-127% in Dice, 14%-32% in Recall, and 94%-200% in Precision on hard segmentation targets. Experiments conducted in the NVIDIA NGC environment demonstrate that the module facilitates efficient deployment of AI-powered medical imaging tasks. The proposed method enables automatic determination of CT window settings for specific downstream tasks in the development and deployment of mainstream medical imaging neural networks, demonstrating the potential to reduce associated deployment costs.
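A minimal numpy sketch of what a Tanh-derived soft window could look like (an assumption about the general mechanism, not the paper's actual module): intensities far outside the window saturate smoothly instead of being hard-clipped, so the mapping stays differentiable and its center/width could in principle be learned:

```python
import numpy as np

def soft_window(hu, center, width):
    """Smooth, differentiable analogue of a CT display window: tanh saturates
    HU values outside [center - width/2, center + width/2] instead of clipping."""
    return np.tanh((hu - center) / (width / 2.0))

# Hypothetical soft-tissue window (center 40 HU, width 400 HU)
hu = np.array([-1000.0, 0.0, 40.0, 240.0, 3000.0])   # air .. bone-like values
out = soft_window(hu, center=40.0, width=400.0)
print(out)
```

Values at the window edges map near tanh(±1) ≈ ±0.76, while extreme HU values flatten toward ±1, mimicking the saturation of a conventional window without a non-differentiable clip.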

Brain Atrophy Does Not Predict Clinical Progression in Progressive Supranuclear Palsy.

Quattrone A, Franzmeier N, Huppertz HJ, Seneca N, Petzold GC, Spottke A, Levin J, Prudlo J, Düzel E, Höglinger GU

PubMed · Aug 30, 2025
Clinical progression rate is the typical primary endpoint measure in progressive supranuclear palsy (PSP) clinical trials. This longitudinal multicohort study investigated whether baseline clinical severity and regional brain atrophy could predict clinical progression in PSP-Richardson's syndrome (PSP-RS). PSP-RS patients (n = 309) from the placebo arms of clinical trials (NCT03068468, NCT01110720, NCT02985879, NCT01049399) and the DescribePSP cohort were included. We investigated associations of baseline clinical and volumetric magnetic resonance imaging (MRI) data with 1-year longitudinal PSP rating scale (PSPRS) change. Machine learning (ML) models were tested to predict individual clinical trajectories. PSP-RS patients showed a mean PSPRS score increase of 10.3 points/year. The frontal lobe volume showed the strongest association with subsequent clinical progression (β: -0.34, P < 0.001). However, ML models did not accurately predict individual progression rates (R² < 0.15). Baseline clinical severity and brain atrophy could not predict individual clinical progression, suggesting no need for MRI-based stratification of patients in future PSP trials. © 2025 The Author(s). Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.

External validation of deep learning-derived 18F-FDG PET/CT delta biomarkers for loco-regional control in head and neck cancer.

Kovacs DG, Aznar M, Van Herk M, Mohamed I, Price J, Ladefoged CN, Fischer BM, Andersen FL, McPartlin A, Osorio EMV, Abravan A

PubMed · Aug 30, 2025
Delta biomarkers that reflect changes in tumour burden over time can support personalised follow-up in head and neck cancer. However, their clinical use can be limited by the need for manual image segmentation. This study externally evaluates a deep learning model for automatic determination of volume change from serial 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography/computed tomography (PET/CT) scans to stratify patients by loco-regional outcome. Patient/material and methods: An externally developed deep learning algorithm for tumour segmentation was applied to pre- and post-radiotherapy (RT, with or without concomitant chemoradiotherapy) PET/CT scans of 50 consecutive head and neck cancer patients from The Christie NHS Foundation Trust, UK. The model, originally trained on pre-treatment scans from a different institution, was deployed to derive tumour volumes at both time points. The AI-derived change in tumour volume (ΔPET-Gross tumour volume (GTV)) was calculated for each patient. Kaplan-Meier analysis assessed loco-regional control based on ΔPET-GTV, dichotomised at the cohort median. In a separate secondary analysis confined to the pre-treatment scans, a radiation oncologist qualitatively evaluated the AI-generated PET-GTV contours. Patients with higher ΔPET-GTV (i.e. greater tumour shrinkage) had significantly improved loco-regional control (log-rank p = 0.02). At 2 years, control was 94.1% (95% CI: 83.6-100%) in the high ΔPET-GTV group vs. 53.6% (95% CI: 32.2-89.1%) in the low group. Only one of nine failures occurred in the high ΔPET-GTV group. Clinician review found the AI volumes acceptable for planning in 78% of cases. In two cases, the algorithm identified oropharyngeal primaries on pre-treatment PET/CT before clinical identification. Deep learning-derived ΔPET-GTV may support clinically meaningful assessment of post-treatment disease status and risk stratification, offering a scalable alternative to manual segmentation in PET/CT follow-up.
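How a ΔPET-GTV-style delta biomarker can be computed from serial segmentations is easy to sketch in numpy; the masks, voxel size, and sign convention (positive = shrinkage) below are illustrative assumptions, not the study's definitions:

```python
import numpy as np

def gtv_volume_ml(mask, voxel_volume_mm3):
    """Tumour volume in millilitres from a binary segmentation mask."""
    return mask.sum() * voxel_volume_mm3 / 1000.0

def delta_pet_gtv(pre_mask, post_mask, voxel_volume_mm3):
    """Relative change in GTV between pre- and post-treatment scans
    (positive values mean the tumour shrank)."""
    pre = gtv_volume_ml(pre_mask, voxel_volume_mm3)
    post = gtv_volume_ml(post_mask, voxel_volume_mm3)
    return (pre - post) / pre

# Toy masks: 1000 tumour voxels pre-RT, 100 post-RT, 2 x 2 x 2 mm voxels
pre = np.zeros((20, 20, 20), dtype=bool);  pre.flat[:1000] = True
post = np.zeros((20, 20, 20), dtype=bool); post.flat[:100] = True
print(delta_pet_gtv(pre, post, voxel_volume_mm3=8.0))   # 0.9, i.e. 90% shrinkage
```

Dichotomising such a per-patient value at the cohort median is then enough to form the two Kaplan-Meier strata compared in the study.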

Clinical Radiomics Nomogram Based on Ultrasound: A Tool for Preoperative Prediction of Uterine Sarcoma.

Zheng W, Lu A, Tang X, Chen L

PubMed · Aug 30, 2025
This study aims to develop a noninvasive preoperative predictive model utilizing ultrasound radiomics combined with clinical characteristics to differentiate uterine sarcoma from leiomyoma. This study included 212 patients with uterine mesenchymal lesions (102 sarcomas and 110 leiomyomas). Clinical characteristics were systematically selected through both univariate and multivariate logistic regression analyses. A clinical model was constructed using the selected clinical characteristics. Radiomics features were extracted from transvaginal ultrasound images, and 6 machine learning algorithms were used to construct radiomics models. Then, a clinical radiomics nomogram was developed integrating clinical characteristics with radiomics signature. The effectiveness of these models in predicting uterine sarcoma was thoroughly evaluated. The area under the curve (AUC) was used to compare the predictive efficacy of the different models. The AUC of the clinical model was 0.835 (95% confidence interval [CI]: 0.761-0.883) and 0.791 (95% CI: 0.652-0.869) in the training and testing sets, respectively. The logistic regression model performed best in the radiomics model construction, with AUC values of 0.878 (95% CI: 0.811-0.918) and 0.818 (95% CI: 0.681-0.895) in the training and testing sets, respectively. The clinical radiomics nomogram performed well in differentiation, with AUC values of 0.955 (95% CI: 0.911-0.973) and 0.882 (95% CI: 0.767-0.936) in the training and testing sets, respectively. The clinical radiomics nomogram can provide more comprehensive and personalized diagnostic information, which is highly important for selecting treatment strategies and ultimately improving patient outcomes in the management of uterine mesenchymal tumors.