Latest Papers on Radiology AI. Tags: Benchmark SOTA

Deep learning NTCP model for late dysphagia after radiotherapy for head and neck cancer patients based on 3D dose, CT and segmentations.

de Vette SPM, Neh H, van der Hoek L, MacRae DC, Chu H, Gawryszuk A, Steenbakkers RJHM, van Ooijen PMA, Fuller CD, Hutcheson KA, Langendijk JA, Sijtsema NM, van Dijk LV

•papers•Sep 29 2025

Late radiation-associated dysphagia after head and neck cancer (HNC) significantly impacts patients' health and quality of life. Conventional normal tissue complication probability (NTCP) models use discrete dose parameters to predict toxicity risk but fail to fully capture the complexity of this side effect. Deep learning (DL) offers potential improvements by incorporating 3D dose data for all anatomical structures involved in swallowing. This study aims to enhance dysphagia prediction with 3D DL NTCP models compared to conventional NTCP models. A multi-institutional cohort of 1484 HNC patients was used to train and validate a 3D DL model (Residual Network) incorporating 3D dose distributions, organ-at-risk segmentations, and CT scans, with or without patient- or treatment-related data. Predictions of grade ≥ 2 dysphagia (CTCAEv4) at six months post-treatment were evaluated using area under the curve (AUC) and calibration curves. Results were compared to a conventional NTCP model based on pre-treatment dysphagia, tumour location, and mean dose to swallowing organs. Attention maps highlighting regions of interest for individual patients were assessed. DL models outperformed the conventional NTCP model in AUC and calibration on both the independent test set (AUC = 0.80-0.84 versus 0.76) and the external test set (AUC = 0.73-0.74 versus 0.63). Attention maps showed a focus on the oral cavity and superior pharyngeal constrictor muscle. DL NTCP models performed significantly better than the conventional NTCP model, suggesting the benefit of using 3D input over the conventional discrete dose parameters. Attention maps highlighted relevant regions linked to dysphagia, supporting the utility of DL for improved predictions.
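
For readers unfamiliar with conventional NTCP models, the following is a minimal sketch of a logistic NTCP of the general kind the abstract contrasts with the DL model. The function name, the coefficient values, and the choice to omit tumour location are all assumptions made for illustration; none of them come from the study.

```python
import math

# Hypothetical coefficients, for illustration only (not from the study).
COEFFS = {"intercept": -4.0, "baseline": 1.0, "mean_dose": 0.05}


def ntcp_logistic(baseline_dysphagia: int, mean_dose_gy: float, coeffs: dict) -> float:
    """Logistic NTCP: 1 / (1 + exp(-s)), where s is a linear predictor.

    baseline_dysphagia: 1 if dysphagia was present before treatment, else 0.
    mean_dose_gy: mean dose (Gy) to a swallowing organ.
    Tumour location is left out here to keep the sketch short.
    """
    s = (
        coeffs["intercept"]
        + coeffs["baseline"] * baseline_dysphagia
        + coeffs["mean_dose"] * mean_dose_gy
    )
    return 1.0 / (1.0 + math.exp(-s))


low_risk = ntcp_logistic(0, 20.0, COEFFS)   # no baseline dysphagia, low dose
high_risk = ntcp_logistic(1, 60.0, COEFFS)  # baseline dysphagia, high dose
```

The point of the sketch is that such a model sees only a few scalar inputs per patient, whereas the DL model described above receives the full 3D dose, CT, and segmentation volumes.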

CT Classification Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Causal-Adapter: Taming Text-to-Image Diffusion for Faithful Counterfactual Generation

Lei Tong, Zhihua Liu, Chaochao Lu, Dino Oglic, Tom Diethe, Philip Teare, Sotirios A. Tsaftaris, Chen Jin

•preprint•Sep 29 2025

We present Causal-Adapter, a modular framework that adapts frozen text-to-image diffusion backbones for counterfactual image generation. Our method enables causal interventions on target attributes, consistently propagating their effects to causal dependents without altering the core identity of the image. In contrast to prior approaches that rely on prompt engineering without explicit causal structure, Causal-Adapter leverages structural causal modeling augmented with two attribute regularization strategies: prompt-aligned injection, which aligns causal attributes with textual embeddings for precise semantic control, and a conditioned token contrastive loss to disentangle attribute factors and reduce spurious correlations. Causal-Adapter achieves state-of-the-art performance on both synthetic and real-world datasets, with up to 91% MAE reduction on Pendulum for accurate attribute control and 87% FID reduction on ADNI for high-fidelity MRI image generation. These results show that our approach enables robust, generalizable counterfactual editing with faithful attribute modification and strong identity preservation.

MRI Image Synthesis Neurological Methodology In Silico Benchmark SOTA

Deep convolutional neural networks outperform vanilla machine learning when predicting language outcomes after stroke.

Hope TMH, Bowman H, Leff AP, Price CJ

•papers•Sep 29 2025

Current medicine cannot confidently predict patients' language skills after stroke. In recent years, researchers have sought to bridge this gap with machine learning. These models appear to benefit from access to features describing where and how much brain damage these patients have suffered. Given the very high dimensionality of structural brain imaging data, those brain lesion features are typically post-processed from the images themselves into tabular features. With the introduction of deep Convolutional Neural Networks (CNN), which appear to be much more robust to high dimensional data, it is natural to hope that much of this image post-processing might be unnecessary. But prior attempts to demonstrate this (in the area of post-stroke prognostics) have so far yielded only equivocal results - perhaps because the datasets that those studies could deploy were too small to properly constrain CNNs, which are famously 'data-hungry'. The study draws on a much larger dataset than has been employed in previous work like this, referring to patients whose language outcomes were assessed once during the chronic phase post-stroke, on or around the same days as they underwent high resolution MRI brain scans. Following the model of our own and others' past work, we use state-of-the-art 'vanilla' machine learning models (boosted ensembles) to predict a variety of language and cognitive outcome scores. These models employ both demographic variables and features derived from the brain imaging data, which represent where brain damage has occurred. These are our baseline models. Next, we use deep CNNs to predict the same language scores for the same patients, drawing on both the demographic variables and post-processed brain lesion images: i.e., multi-input models with one input for tabular features and another for 3-dimensional images. We compare the models using 5 × 2-fold cross-validation, with consistent folds. The CNN models consistently outperform the vanilla machine learning models in this domain. Deep CNNs offer state-of-the-art performance when predicting language outcomes after stroke, outperforming vanilla machine learning and obviating the need to post-process lesion images into lesion features.
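
The "5 × 2-fold cross-validation, with consistent folds" mentioned above can be sketched as follows. This is an assumed reading of the protocol, not the authors' code: the function name `five_by_two_folds` and the seeding scheme are hypothetical, and the sketch only shows how the same ten train/test splits can be regenerated for every model being compared.

```python
import numpy as np


def five_by_two_folds(n_samples: int, seed: int = 0):
    """Return 10 (train_idx, test_idx) pairs: 5 repeats of a 2-fold split.

    Using a fixed seed means every model can be evaluated on identical
    folds, which is what makes a paired comparison possible.
    """
    rng = np.random.default_rng(seed)
    half = n_samples // 2
    folds = []
    for _ in range(5):
        perm = rng.permutation(n_samples)
        first, second = perm[:half], perm[half:]
        folds.append((first, second))  # train on first half, test on second
        folds.append((second, first))  # then swap the roles
    return folds


# Both the baseline and the CNN would iterate over the same list of folds.
folds = five_by_two_folds(10, seed=1)
```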

MRI Classification Neurological Retrospective Clinical In Silico Benchmark SOTA

Revealing Shared Tumor Microenvironment Dynamics Related to Microsatellite Instability Across Different Cancers Using Cellular Social Network Analysis

Zamanitajeddin, N., Jahanifar, M., Eastwood, M., Gunesli, G., Arends, M. J., Rajpoot, N.

•preprint•Sep 29 2025

Microsatellite instability (MSI) is a key biomarker for immunotherapy response and prognosis across multiple cancers, yet its identification from routine Hematoxylin and Eosin (H&E) slides remains challenging. Current deep learning predictors often operate as black-box, weakly supervised models trained on individual slides, limiting interpretability, biological insight, and generalization, particularly in low-data regimes. Importantly, systematic quantitative analysis of shared MSI-associated characteristics across different cancer types has not been performed, representing a major gap in understanding conserved tumor microenvironment (TME) patterns linked to MSI. Here, we present a multi-cancer MSI prediction model that leverages pathology foundation models for robust feature extraction and cell-level social network analysis (SNA) to uncover TME patterns associated with MSI. For the MSI prediction task, we introduce a novel transformer-based embedding aggregation method, leveraging attention-guided, multi-case batch training to improve learning efficiency, stability, and interpretability. Our method achieves high predictive performance, with mean AUROCs of 0.86 ± 0.06 (colorectal cancer), 0.89 ± 0.06 (stomach adenocarcinoma), and 0.73 ± 0.06 (uterine corpus endometrial carcinoma) in internal cross-validation on the TCGA dataset and an AUROC of 0.99 on the external PAIP dataset, outperforming state-of-the-art weakly supervised methods (particularly in AUPRC with an average of 0.65 across three cancers). Multi-cancer training further improved generalization (by 3%) via exposing the model to diverse MSI manifestations, enabling robust learning of transferable, domain-invariant histological patterns. To investigate the TME, we constructed cell graphs from high-attention regions, classifying cells as epithelial, inflammatory, mitotic, or connective, and applied SNA metrics to quantify spatial interactions. Across cancers, MSI tumors exhibited increased epithelial cell density and stronger epithelial-inflammatory connectivity, with subtle, context-dependent changes in stromal organization. These features were consistent across univariate and multivariate analyses and supported by expert pathologist review, suggesting the presence of a conserved MSI-associated microenvironmental phenotype. Our proposed prediction algorithm and SNA-driven interpretation advance MSI prediction and uncover interpretable, biologically meaningful MSI signatures shared across colorectal, gastric, and endometrial cancers.
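
To make "epithelial-inflammatory connectivity" on a cell graph concrete, here is a small sketch under stated assumptions. The abstract does not say how edges are defined or which SNA metrics are used, so the distance-threshold graph, the function name `cross_type_edges`, and the edge-count metric below are all hypothetical stand-ins.

```python
import numpy as np


def cross_type_edges(coords, types, radius, type_a, type_b):
    """Count edges linking a type_a cell to a type_b cell.

    Assumption: two cells are connected when their centroids lie within
    `radius` of each other (the paper may build its graph differently).
    """
    coords = np.asarray(coords, dtype=float)
    types = np.asarray(types)
    # Pairwise Euclidean distances between all cell centroids.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adjacency = (dist <= radius) & ~np.eye(len(coords), dtype=bool)
    is_a, is_b = types == type_a, types == type_b
    cross = adjacency & (
        (is_a[:, None] & is_b[None, :]) | (is_b[:, None] & is_a[None, :])
    )
    # Each undirected edge appears twice in the matrix; count it once.
    return int(np.triu(cross, k=1).sum())


coords = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
types = ["epithelial", "inflammatory", "inflammatory"]
n_edges = cross_type_edges(coords, types, 2.0, "epithelial", "inflammatory")
```

A higher count, normalised by the number of cells or total edges, would be one simple way to express stronger epithelial-inflammatory connectivity.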

Mixed Modality Classification Methodology In Silico Academic Lab Benchmark SOTA

Artificial intelligence in carotid computed tomography angiography plaque detection: Decade of progress and future perspectives.

Wang DY, Yang T, Zhang CT, Zhan PC, Miao ZX, Li BL, Yang H

•papers•Sep 28 2025

The application of artificial intelligence (AI) in carotid atherosclerotic plaque detection via computed tomography angiography (CTA) has significantly advanced over the past decade. This mini-review consolidates recent innovations in deep learning architectures, domain adaptation techniques, and automated plaque characterization methodologies. Hybrid models, such as residual U-Net-Pyramid Scene Parsing Network, exhibit a remarkable precision of 80.49% in plaque segmentation, outperforming radiologists in diagnostic efficiency by reducing analysis time from minutes to mere seconds. Domain-adaptive frameworks, such as Lesion Assessment through Tracklet Evaluation, demonstrate robust performance across heterogeneous imaging datasets, achieving an area under the curve (AUC) greater than 0.88. Furthermore, novel approaches integrating U-Net and Efficient-Net architectures, enhanced by Bayesian optimization, have achieved impressive correlation coefficients (0.89) for plaque quantification. AI-powered CTA also enables high-precision three-dimensional vascular segmentation, with a Dice coefficient of 0.9119, and offers superior cardiovascular risk stratification compared to traditional Agatston scoring, yielding AUC values of 0.816 vs 0.729 at a 15-year follow-up. These breakthroughs address key challenges in plaque motion analysis, with systolic retractive motion biomarkers successfully identifying 80% of vulnerable plaques. Looking ahead, future directions focus on enhancing the interpretability of AI models through explainable AI and leveraging federated learning to mitigate data heterogeneity. This mini-review underscores the transformative potential of AI in carotid plaque assessment, offering substantial implications for stroke prevention and personalized cerebrovascular management strategies.

CT Segmentation Vascular Review In Silico Academic Lab Benchmark SOTA

A Novel Hybrid Deep Learning and Chaotic Dynamics Approach for Thyroid Cancer Classification

Nada Bouchekout, Abdelkrim Boukabou, Morad Grimes, Yassine Habchi, Yassine Himeur, Hamzah Ali Alkhazaleh, Shadi Atalla, Wathiq Mansoor

•preprint•Sep 28 2025

Timely and accurate diagnosis is crucial in addressing the global rise in thyroid cancer, ensuring effective treatment strategies and improved patient outcomes. We present an intelligent classification method that couples an Adaptive Convolutional Neural Network (CNN) with Cohen-Daubechies-Feauveau (CDF9/7) wavelets whose detail coefficients are modulated by an n-scroll chaotic system to enrich discriminative features. We evaluate on the public DDTI thyroid ultrasound dataset (n = 1,638 images; 819 malignant / 819 benign) using 5-fold cross-validation, where the proposed method attains 98.17% accuracy, 98.76% sensitivity, 97.58% specificity, 97.55% F1-score, and an AUC of 0.9912. A controlled ablation shows that adding chaotic modulation to CDF9/7 improves accuracy by +8.79 percentage points over a CDF9/7-only CNN (from 89.38% to 98.17%). To objectively position our approach, we trained state-of-the-art backbones on the same data and splits: EfficientNetV2-S (96.58% accuracy; AUC 0.987), Swin-T (96.41%; 0.986), ViT-B/16 (95.72%; 0.983), and ConvNeXt-T (96.94%; 0.987). Our method outperforms the best of these by +1.23 points in accuracy and +0.0042 in AUC, while remaining computationally efficient (28.7 ms per image; 1,125 MB peak VRAM). Robustness is further supported by cross-dataset testing on TCIA (accuracy 95.82%) and transfer to an ISIC skin-lesion subset (n = 28 unique images, augmented to 2,048; accuracy 97.31%). Explainability analyses (Grad-CAM, SHAP, LIME) highlight clinically relevant regions. Altogether, the wavelet-chaos-CNN pipeline delivers state-of-the-art thyroid ultrasound classification with strong generalization and practical runtime characteristics suitable for clinical integration.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Q-FSRU: Quantum-Augmented Frequency-Spectral For Medical Visual Question Answering

Rakesh Thakur, Yusra Tariq, Rakesh Chandra Joshi

•preprint•Sep 28 2025

Solving tough clinical questions that require both image and text understanding is still a major challenge in healthcare AI. In this work, we propose Q-FSRU, a new model that combines Frequency Spectrum Representation and Fusion (FSRU) with a method called Quantum Retrieval-Augmented Generation (Quantum RAG) for medical Visual Question Answering (VQA). The model takes in features from medical images and related text, then shifts them into the frequency domain using the Fast Fourier Transform (FFT). This helps it focus on more meaningful data and filter out noise or less useful information. To improve accuracy and ensure that answers are based on real knowledge, we add a quantum-inspired retrieval system. It fetches useful medical facts from external sources using quantum-based similarity techniques. These details are then merged with the frequency-based features for stronger reasoning. We evaluated our model using the VQA-RAD dataset, which includes real radiology images and questions. The results showed that Q-FSRU outperforms earlier models, especially on complex cases needing image-text reasoning. The mix of frequency and quantum information improves both performance and explainability. Overall, this approach offers a promising way to build smart, clear, and helpful AI tools for doctors.
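
As a rough illustration of moving features into the frequency domain with an FFT and suppressing part of the spectrum, consider the sketch below. It is not the Q-FSRU architecture: the abstract does not describe how the spectrum is filtered, so the hard low-pass cut-off, the function name `frequency_filter`, and the `keep` parameter are assumptions.

```python
import numpy as np


def frequency_filter(features: np.ndarray, keep: int) -> np.ndarray:
    """Keep only the `keep` lowest-frequency components of each feature row.

    Assumption: a plain low-pass mask stands in for whatever learned
    filtering the actual model applies in the frequency domain.
    """
    n = features.shape[-1]
    spectrum = np.fft.rfft(features, axis=-1)  # real-input FFT along features
    spectrum[..., keep:] = 0                   # zero out the higher frequencies
    return np.fft.irfft(spectrum, n=n, axis=-1)


# Toy embeddings: 2 samples with 8 features each, smooth signal plus noise.
rng = np.random.default_rng(0)
smooth = np.ones((2, 8))
noisy = smooth + 0.1 * rng.normal(size=(2, 8))
denoised = frequency_filter(noisy, keep=1)
```

With `keep=1` only the mean of each row survives, so the example is deliberately extreme; a real model would retain or reweight many more components.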

Mixed Modality LLM Radiology Report Methodology In Silico GenAI Benchmark SOTA

Evaluating the Impact of Radiographic Noise on Chest X-ray Semantic Segmentation and Disease Classification Using a Scalable Noise Injection Framework

Derek Jiu, Kiran Nijjer, Nishant Chinta, Ryan Bui, Ben Liu, Kevin Zhu

•preprint•Sep 28 2025

Deep learning models are increasingly used for radiographic analysis, but their reliability is challenged by the stochastic noise inherent in clinical imaging. A systematic, cross-task understanding of how different noise types impact these models is lacking. Here, we evaluate the robustness of state-of-the-art convolutional neural networks (CNNs) to simulated quantum (Poisson) and electronic (Gaussian) noise in two key chest X-ray tasks: semantic segmentation and pulmonary disease classification. Using a novel, scalable noise injection framework, we applied controlled, clinically-motivated noise severities to common architectures (UNet, DeepLabV3, FPN; ResNet, DenseNet, EfficientNet) on public datasets (Landmark, ChestX-ray14). Our results reveal a stark dichotomy in task robustness. Semantic segmentation models proved highly vulnerable, with lung segmentation performance collapsing under severe electronic noise (Dice Similarity Coefficient drop of 0.843), signifying a near-total model failure. In contrast, classification tasks demonstrated greater overall resilience, but this robustness was not uniform. We discovered a differential vulnerability: certain tasks, such as distinguishing Pneumothorax from Atelectasis, failed catastrophically under quantum noise (AUROC drop of 0.355), while others were more susceptible to electronic noise. These findings demonstrate that while classification models possess a degree of inherent robustness, pixel-level segmentation tasks are far more brittle. The task- and noise-specific nature of model failure underscores the critical need for targeted validation and mitigation strategies before the safe clinical deployment of diagnostic AI.
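
The two noise types and the Dice metric named above can be sketched in a few lines. This is a simplified illustration rather than the authors' framework: the function names, the `photons` and `sigma` parameters, and the assumption that images are scaled to [0, 1] are all hypothetical.

```python
import numpy as np


def add_poisson_noise(image, photons, rng):
    """Simulate quantum noise: pixel counts drawn from a Poisson distribution.

    `photons` is an assumed expected count at intensity 1.0; lower values
    give noisier images.
    """
    counts = rng.poisson(image * photons)
    return np.clip(counts / photons, 0.0, 1.0)


def add_gaussian_noise(image, sigma, rng):
    """Simulate electronic noise: additive zero-mean Gaussian noise."""
    return np.clip(image + rng.normal(0.0, sigma, image.shape), 0.0, 1.0)


def dice(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks."""
    a, b = np.asarray(mask_a).astype(bool), np.asarray(mask_b).astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom


rng = np.random.default_rng(0)
image = np.full((4, 4), 0.5)
quantum = add_poisson_noise(image, photons=100.0, rng=rng)
electronic = add_gaussian_noise(image, sigma=0.1, rng=rng)
```

A robustness study of the kind described would run a fixed model on clean and noised inputs at several severities and report the change in Dice or AUROC.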

X-Ray Segmentation Chest Methodology In Silico Academic Lab Benchmark SOTA

EWC-Guided Diffusion Replay for Exemplar-Free Continual Learning in Medical Imaging

Anoushka Harit, William Prew, Zhongtian Sun, Florian Markowetz

•preprint•Sep 28 2025

Medical imaging foundation models must adapt over time, yet full retraining is often blocked by privacy constraints and cost. We present a continual learning framework that avoids storing patient exemplars by pairing class conditional diffusion replay with Elastic Weight Consolidation. Using a compact Vision Transformer backbone, we evaluate across eight MedMNIST v2 tasks and CheXpert. On CheXpert our approach attains 0.851 AUROC, reduces forgetting by more than 30% relative to DER++, and approaches joint training at 0.869 AUROC, while remaining efficient and privacy preserving. Analyses connect forgetting to two measurable factors: fidelity of replay and Fisher weighted parameter drift, highlighting the complementary roles of replay diffusion and synaptic stability. The results indicate a practical route for scalable, privacy aware continual adaptation of clinical imaging models.
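
The Elastic Weight Consolidation term referred to above is usually written as a Fisher-weighted quadratic penalty on parameter drift, (λ/2) Σ F_i (θ_i − θ*_i)². The sketch below shows that standard form only; how the paper combines it with diffusion replay is not stated in the abstract, and the function name and toy values are assumptions.

```python
import numpy as np


def ewc_penalty(theta, theta_star, fisher, lam):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    theta:      current parameters
    theta_star: parameters learned on the previous task(s)
    fisher:     diagonal Fisher information estimate, one value per parameter
    lam:        strength of the consolidation term
    """
    theta, theta_star, fisher = map(np.asarray, (theta, theta_star, fisher))
    return 0.5 * lam * float(np.sum(fisher * (theta - theta_star) ** 2))


# Toy values chosen for easy hand-checking (not from the paper).
theta_star = np.array([0.0, 0.0])
theta = np.array([1.0, 2.0])
fisher = np.array([1.0, 0.5])
penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
```

The penalty is added to the loss on the new task, so parameters with high Fisher information are held close to their previous values while less important ones remain free to move.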

X-Ray Classification Chest Methodology In Silico Academic Lab Benchmark SOTA Open Dataset

Tunable-Generalization Diffusion Powered by Self-Supervised Contextual Sub-Data for Low-Dose CT Reconstruction

Guoquan Wei, Zekun Zhou, Liu Shi, Wenzhe Shan, Qiegen Liu

•preprint•Sep 28 2025

Current deep learning models for low-dose CT (LDCT) denoising rely heavily on paired data and generalize poorly. Even diffusion models, which have attracted more attention, need to learn the distribution of clean data for reconstruction, a requirement that is difficult to satisfy in clinical applications. At the same time, self-supervised methods suffer a significant loss of generalizability when a model pre-trained for one dose is extended to other doses. To address these issues, this paper proposes a novel method of tunable-generalization diffusion powered by self-supervised contextual sub-data for low-dose CT reconstruction, named SuperDiff. First, a contextual sub-data similarity adaptive sensing strategy is designed for denoising centered on the LDCT projection domain, which provides an initial prior for the subsequent steps. Subsequently, the initial prior is used to combine knowledge distillation with a deep combination of latent diffusion models for optimizing image details. The pre-trained model is used for inference reconstruction, and a pixel-level self-correcting fusion technique is proposed for fine-grained reconstruction in the image domain to enhance image fidelity, using the initial prior and the LDCT image as a guide. In addition, the technique can be flexibly applied to generalize to higher and lower doses, or even unseen doses. As a dual-domain cascaded strategy for self-supervised LDCT denoising, SuperDiff requires only LDCT projection-domain data for training and testing. Full qualitative and quantitative evaluations on both datasets and real data show that SuperDiff consistently outperforms existing state-of-the-art methods in terms of reconstruction and generalization performance.

CT Reconstruction Methodology In Silico Academic Lab Benchmark SOTA
