
High-performance Open-source AI for Breast Cancer Detection and Localization in MRI.

Hirsch L, Sutton EJ, Huang Y, Kayis B, Hughes M, Martinez D, Makse HA, Parra LC

PubMed | Jun 25, 2025
<i>"Just Accepted" papers have undergone full peer review and have been accepted for publication in <i>Radiology: Artificial Intelligence</i>. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.</i> Purpose To develop and evaluate an open-source deep learning model for detection and localization of breast cancer on MRI. Materials and Methods In this retrospective study, a deep learning model for breast cancer detection and localization was trained on the largest breast MRI dataset to date. Data included all breast MRIs conducted at a tertiary cancer center in the United States between 2002 and 2019. The model was validated on sagittal MRIs from the primary site (<i>n</i> = 6,615 breasts). Generalizability was assessed by evaluating model performance on axial data from the primary site (<i>n</i> = 7,058 breasts) and a second clinical site (<i>n</i> = 1,840 breasts). Results The primary site dataset included 30,672 sagittal MRI examinations (52,598 breasts) from 9,986 female patients (mean [SD] age, 53 [11] years). The model achieved an area under the receiver operating characteristic curve (AUC) of 0.95 for detecting cancer in the primary site. At 90% specificity (5717/6353), model sensitivity was 83% (217/262), which was comparable to historical performance data for radiologists. The model generalized well to axial examinations, achieving an AUC of 0.92 on data from the same clinical site and 0.92 on data from a secondary site. The model accurately located the tumor in 88.5% (232/262) of sagittal images, 92.8% (272/293) of axial images from the primary site, and 87.7% (807/920) of secondary site axial images. Conclusion The model demonstrated state-of-the-art performance on breast cancer detection. Code and weights are openly available to stimulate further development and validation. ©RSNA, 2025.

Generalizable medical image enhancement using structure-preserved diffusion models.

Chen L, Yu X, Li H, Lin H, Niu K, Li H

PubMed | Jun 25, 2025
Clinical medical images often suffer from compromised quality, which negatively impacts the diagnostic process for both clinicians and AI algorithms. While GAN-based enhancement methods have been commonly developed in recent years, delicate model training is necessary due to issues with artifacts, mode collapse, and instability. Diffusion models have shown promise in generating high-quality images superior to GANs, but challenges in training data collection and domain gaps hinder their application to medical image enhancement. Additionally, preserving fine structures when enhancing medical images with diffusion models remains underexplored. To overcome these challenges, we propose structure-preserved diffusion models for generalizable medical image enhancement (GEDM). GEDM leverages joint supervision from enhancement and segmentation to boost structure preservation and generalizability. Specifically, synthetic data is used to collect paired high- and low-quality training data with structure masks, and the Laplace transform is employed to reduce domain gaps and introduce multi-scale conditions. GEDM conducts medical image enhancement and segmentation jointly, supervised by high-quality references and structure masks from the training data. Four datasets spanning two medical imaging modalities were collected for the experiments, in which GEDM outperformed state-of-the-art methods in image enhancement as well as in downstream medical analysis tasks.
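
A minimal sketch of the kind of joint enhancement-plus-segmentation supervision the abstract describes is shown below; it is not the authors' GEDM objective, and the L1/BCE terms and the 0.5 weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def joint_loss(enhanced, reference, seg_logits, structure_mask, seg_weight=0.5):
    """L1 reconstruction against the high-quality reference plus a
    segmentation term against the structure mask (illustrative only)."""
    recon = F.l1_loss(enhanced, reference)
    seg = F.binary_cross_entropy_with_logits(seg_logits, structure_mask)
    return recon + seg_weight * seg

# Toy tensors standing in for an enhanced image, its reference, seg logits, and mask
img = torch.rand(1, 1, 64, 64)
print(joint_loss(img, torch.rand_like(img), torch.randn_like(img),
                 torch.randint(0, 2, img.shape).float()))
```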

Computed tomography-derived quantitative imaging biomarkers enable the prediction of disease manifestations and survival in patients with systemic sclerosis.

Sieren MM, Grasshoff H, Riemekasten G, Berkel L, Nensa F, Hosch R, Barkhausen J, Kloeckner R, Wegner F

PubMed | Jun 25, 2025
Systemic sclerosis (SSc) is a complex inflammatory vasculopathy with diverse symptoms and variable disease progression. Despite its known impact on body composition (BC), clinical decision-making has yet to incorporate these biomarkers. This study aims to extract quantitative BC imaging biomarkers from CT scans to assess disease severity, define BC phenotypes, track changes over time and predict survival. CT exams were extracted from a prospectively maintained cohort of 452 SSc patients; 128 patients with at least one CT exam were included. An artificial intelligence-based 3D body composition analysis (BCA) algorithm assessed muscle volume, different adipose tissue compartments, and bone mineral density. These parameters were analysed with regard to clinical, laboratory, and functional parameters as well as survival. Phenotypes were identified by K-means cluster analysis. Longitudinal evaluation of BCA changes employed regression analyses. A regression model using BCA parameters outperformed models based on body mass index and clinical parameters in predicting survival (area under the curve [AUC] = 0.75). Longitudinal change in the cardiac marker enabled prediction of survival with an AUC of 0.82. Patients with altered BCA parameters had increased ORs for various complications, including interstitial lung disease (p<0.05). Two distinct BCA phenotypes were identified, showing significant differences in gastrointestinal disease manifestations (p<0.01). This study highlights several parameters with the potential to reshape clinical pathways for SSc patients. Quantitative BCA biomarkers offer a means to predict survival and individual disease manifestations, in part outperforming established parameters. These insights open new avenues for research into the mechanisms driving body composition changes in SSc and for developing enhanced disease management tools, ultimately leading to more personalised and effective patient care.
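
The phenotype step (K-means clustering of body composition parameters into two groups) can be sketched as follows; the feature names and toy values are placeholders, not the study's BCA variables.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Toy stand-ins for muscle volume, adipose compartments, and bone density
bca = pd.DataFrame(rng.normal(size=(128, 5)),
                   columns=["muscle", "visceral_fat", "subcut_fat",
                            "intermuscular_fat", "bone_density"])

# Standardize, then cluster patients into two BCA phenotypes
X = StandardScaler().fit_transform(bca)
bca["phenotype"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(bca["phenotype"].value_counts())
```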

Machine Learning-Based Risk Assessment of Myasthenia Gravis Onset in Thymoma Patients and Analysis of Their Correlations and Causal Relationships.

Liu W, Wang W, Zhang H, Guo M

PubMed | Jun 25, 2025
This study aims to use interpretable machine learning models to predict the risk of myasthenia gravis onset in thymoma patients and to investigate the intrinsic correlations and causal relationships between them. A comprehensive retrospective analysis was conducted on 172 thymoma patients diagnosed at two medical centers between 2018 and 2024. The cohort was bifurcated into a training set (n = 134) and a test set (n = 38) to develop and validate risk predictive models. Radiomic and deep features were extracted from tumor regions across three CT phases: non-enhanced, arterial, and venous. Through rigorous feature selection employing Spearman's rank correlation coefficient and LASSO (Least Absolute Shrinkage and Selection Operator) regularization, 12 optimal imaging features were identified. These were integrated with 11 clinical parameters and one pathological subtype variable to form a multi-dimensional feature matrix. Six machine learning algorithms were subsequently implemented for model construction and comparative analysis. We utilized SHAP (SHapley Additive exPlanation) to interpret the model and employed a doubly robust learner to perform a potential causal analysis between thymoma and myasthenia gravis (MG). All six models demonstrated satisfactory predictive capabilities, with the support vector machine (SVM) model exhibiting superior performance on the test cohort. It achieved an area under the curve (AUC) of 0.904 (95% confidence interval [CI] 0.798-1.000), outperforming other models such as logistic regression and the multilayer perceptron (MLP). The model's predictive result substantiates the strong correlation between thymoma and MG. Additionally, our analysis revealed a significant causal relationship between them, with high-risk tumors elevating the risk of MG by an average treatment effect (ATE) of 9.2%. This implies that thymoma patients with types B2 and B3 face a considerably higher risk of developing MG compared to those with types A, AB, and B1. The model provides a novel and effective tool for evaluating the risk of MG development in patients with thymoma. Furthermore, correlation and causal analysis have unveiled pathways that connect the tumor to the risk of MG, with a notably higher incidence of MG observed in high-risk pathological subtypes. These insights contribute to a deeper understanding of MG and drive a paradigm shift in medical practice from passive treatment to proactive intervention.
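
A hedged sketch of the feature-selection-plus-classifier pipeline outlined above is given below: an L1-penalised selector stands in for the Spearman/LASSO step (keeping the 12 highest-weighted features, echoing the 12 imaging features in the abstract), followed by an SVM. The data are synthetic and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(172, 24))       # toy stand-ins for radiomic/deep + clinical features
y = rng.integers(0, 2, size=172)     # 1 = developed myasthenia gravis (toy labels)

pipe = Pipeline([
    ("scale", StandardScaler()),
    # L1-penalised selector as a stand-in for the LASSO step: keep top 12 features
    ("select", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
        max_features=12, threshold=-np.inf)),
    ("svm", SVC(kernel="rbf", probability=True)),
])
print("CV AUC:", cross_val_score(pipe, X, y, cv=5, scoring="roc_auc").mean())
```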

Few-Shot Learning for Prostate Cancer Detection on MRI: Comparative Analysis with Radiologists' Performance.

Yamagishi Y, Baba Y, Suzuki J, Okada Y, Kanao K, Oyama M

PubMed | Jun 25, 2025
Deep-learning models for prostate cancer detection typically require large datasets, limiting clinical applicability across institutions due to domain shift issues. This study aimed to develop a few-shot learning deep-learning model for prostate cancer detection on multiparametric MRI that requires minimal training data and to compare its diagnostic performance with that of experienced radiologists. In this retrospective study, we used 99 biopsy-confirmed cases (80 positive, 19 negative) from 2017-2022, with 20 cases for training, 5 for validation, and 74 for testing. A 2D transformer model was trained on T2-weighted, diffusion-weighted, and apparent diffusion coefficient map images. Model predictions were compared with two radiologists using the Matthews correlation coefficient (MCC) and F1 score, with 95% confidence intervals (CIs) calculated via the bootstrap method. The model achieved an MCC of 0.297 (95% CI: 0.095-0.474) and F1 score of 0.707 (95% CI: 0.598-0.847). Radiologist 1 had an MCC of 0.276 (95% CI: 0.054-0.484) and F1 score of 0.741; Radiologist 2 had an MCC of 0.504 (95% CI: 0.289-0.703) and F1 score of 0.871, showing that model performance was comparable to that of Radiologist 1. External validation on the Prostate158 dataset revealed that ImageNet pretraining substantially improved model performance, increasing study-level ROC-AUC from 0.464 to 0.636 and study-level PR-AUC from 0.637 to 0.773 across all architectures. Our findings demonstrate that few-shot deep-learning models can achieve clinically relevant performance when using pretrained transformer architectures, offering a promising approach to address domain shift challenges across institutions.
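
The bootstrap confidence intervals mentioned above can be reproduced with a short case-resampling sketch; the labels and predictions below are toy values, not the study's.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef, f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=74)
y_pred = np.where(rng.random(74) < 0.8, y_true, 1 - y_true)  # ~80% agreement (toy)

def bootstrap_ci(metric, y_true, y_pred, n_boot=2000, alpha=0.05):
    """Point estimate plus percentile bootstrap CI from case resampling."""
    stats = []
    n = len(y_true)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample cases with replacement
        stats.append(metric(y_true[idx], y_pred[idx]))
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return metric(y_true, y_pred), lo, hi

print("MCC:", bootstrap_ci(matthews_corrcoef, y_true, y_pred))
print("F1 :", bootstrap_ci(f1_score, y_true, y_pred))
```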

The evaluation of artificial intelligence in mammography-based breast cancer screening: Is breast-level analysis enough?

Taib AG, Partridge GJW, Yao L, Darker I, Chen Y

PubMed | Jun 25, 2025
To assess whether the diagnostic performance of a commercial artificial intelligence (AI) algorithm for mammography differs between breast-level and lesion-level interpretations and to compare its performance with that of a large population of specialised human readers. We retrospectively analysed 1200 mammograms from the NHS breast cancer screening programme using a commercial AI algorithm and assessments from 1258 trained human readers from the Personal Performance in Mammographic Screening (PERFORMS) external quality assurance programme. For breasts containing pathologically confirmed malignancies, breast-level and lesion-level analyses were performed. The latter considered the locations of marked regions of interest for AI and humans, and the highest score per lesion was recorded. For non-malignant breasts, a breast-level analysis recorded the highest score per breast. Area under the curve (AUC), sensitivity and specificity were calculated at the developer's recommended threshold for recall. The study was designed to detect a medium-sized effect (odds ratio 3.5 or 0.29) for sensitivity. The test set contained 882 non-malignant (73%) and 318 malignant breasts (27%), with 328 cancer lesions. The AI AUC was 0.942 at breast level and 0.929 at lesion level (difference -0.013, p < 0.01). The mean human AUC was 0.878 at breast level and 0.851 at lesion level (difference -0.027, p < 0.01). AI outperformed human readers by AUC at both the breast and lesion level (both p < 0.01). AI's diagnostic performance significantly decreased at the lesion level, indicating reduced accuracy in localising malignancies; however, its overall performance exceeded that of human readers. Question: AI often recalls mammography cases not recalled by humans; to understand why, we as humans must consider the regions of interest it has marked as cancerous. Findings: Evaluations of AI typically occur at the breast level, but performance decreases when AI is evaluated at the lesion level. This also occurs for humans. Clinical relevance: To improve human-AI collaboration, AI should be assessed at the lesion level; poor accuracy here may lead to automation bias and unnecessary patient procedures.
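
The scoring convention described above (the highest lesion score becomes the breast-level score before computing AUC) can be sketched as follows, with illustrative data and column names rather than the study's.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
# One row per marked region of interest, each belonging to a breast
marks = pd.DataFrame({
    "breast_id": rng.integers(0, 300, size=1000),
    "score": rng.random(1000),
})
truth = pd.Series(rng.integers(0, 2, size=300), name="malignant")  # breast-level ground truth

breast_scores = marks.groupby("breast_id")["score"].max()        # highest score per breast
breast_scores = breast_scores.reindex(range(300), fill_value=0)  # unmarked breasts score 0
print("breast-level AUC:", roc_auc_score(truth, breast_scores))
```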

Diagnostic Performance of Universal versus Stratified Computer-Aided Detection Thresholds for Chest X-Ray-Based Tuberculosis Screening

Sung, J., Kitonsa, P. J., Nalutaaya, A., Isooba, D., Birabwa, S., Ndyabayunga, K., Okura, R., Magezi, J., Nantale, D., Mugabi, I., Nakiiza, V., Dowdy, D. W., Katamba, A., Kendall, E. A.

medRxiv preprint | Jun 24, 2025
Background: Computer-aided detection (CAD) software analyzes chest X-rays for features suggestive of tuberculosis (TB) and provides a numeric abnormality score. However, estimates of CAD accuracy for TB screening are hindered by the lack of confirmatory data among people with lower CAD scores, including those without symptoms. Additionally, the appropriate CAD score thresholds for obtaining further testing may vary according to population and client characteristics.

Methods: We screened for TB in Ugandan individuals aged ≥15 years using portable chest X-rays with CAD (qXR v3). Participants were offered screening regardless of their symptoms. Those with X-ray scores above a threshold of 0.1 (range, 0-1) were asked to provide sputum for Xpert Ultra testing. We estimated the diagnostic accuracy of CAD for detecting Xpert-positive TB when using the same threshold for all individuals (under different assumptions about TB prevalence among people with X-ray scores <0.1), and compared this estimate to age- and/or sex-stratified approaches.

Findings: Of 52,835 participants screened for TB using CAD, 8,949 (16.9%) had X-ray scores ≥0.1. Of 7,219 participants with valid Xpert Ultra results, 382 (5.3%) were Xpert-positive, including 81 with trace results. Assuming 0.1% of participants with X-ray scores <0.1 would have been Xpert-positive if tested, qXR had an estimated AUC of 0.920 (95% confidence interval 0.898-0.941) for Xpert-positive TB. Stratifying CAD thresholds according to age and sex improved accuracy; for example, at 96.1% specificity, estimated sensitivity was 75.0% for a universal threshold (of ≥0.65) versus 76.9% for thresholds stratified by age and sex (p=0.046).

Interpretation: The accuracy of CAD for TB screening among all screening participants, including those without symptoms or abnormal chest X-rays, is higher than previously estimated. Stratifying CAD thresholds based on client characteristics such as age and sex could further improve accuracy, enabling a more effective and personalized approach to TB screening.

Funding: National Institutes of Health.

Research in context. Evidence before this study: The World Health Organization (WHO) has endorsed computer-aided detection (CAD) as a screening tool for tuberculosis (TB), but the appropriate CAD score that triggers further diagnostic evaluation for tuberculosis varies by population. The WHO recommends determining the appropriate CAD threshold for specific settings and populations and considering unique thresholds for specific populations, including older age groups, among whom CAD may perform poorly. We performed a PubMed literature search for articles published until September 9, 2024, using the search terms "tuberculosis" AND ("computer-aided detection" OR "computer aided detection" OR "CAD" OR "computer-aided reading" OR "computer aided reading" OR "artificial intelligence"), which resulted in 704 articles. Among them, we identified studies that evaluated the performance of CAD for tuberculosis screening and additionally reviewed relevant references. Most prior studies reported areas under the curve (AUC) ranging from 0.76 to 0.88 but limited their evaluations to individuals with symptoms or abnormal chest X-rays. Some prior studies identified subgroups (including older individuals and people with prior TB) among whom CAD had lower-than-average AUCs, and authors discussed how the prevalence of such characteristics could affect the optimal value of a population-wide CAD threshold; however, none estimated the accuracy that could be gained by adjusting CAD thresholds between individuals based on personal characteristics.

Added value of this study: In this study, all consenting individuals in a high-prevalence setting were offered chest X-ray screening, regardless of symptoms, if they were ≥15 years old, not pregnant, and not on TB treatment. A very low CAD score cutoff (qXR v3 score of 0.1 on a 0-1 scale) was used to select individuals for confirmatory sputum molecular testing, enabling the detection of radiographically mild forms of TB and facilitating comparisons of diagnostic accuracy at different CAD thresholds. With this more expansive, symptom-neutral evaluation of CAD, we estimated an AUC of 0.920, and we found that the qXR v3 threshold needed to decrease to under 0.1 to meet the WHO target product profile goal of ≥90% sensitivity and ≥70% specificity. Compared to using the same thresholds for all participants, adjusting CAD thresholds by age and sex strata resulted in a 1 to 2% increase in sensitivity without affecting specificity.

Implications of all the available evidence: To obtain high sensitivity with CAD screening in high-prevalence settings, low score thresholds may be needed. However, countries with a high burden of TB often do not have sufficient resources to test all individuals above a low threshold. In such settings, adjusting CAD thresholds based on individual characteristics associated with TB prevalence (e.g., male sex) and those associated with false-positive X-ray results (e.g., old age) can potentially improve the efficiency of TB screening programs.
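
The stratified-threshold idea above can be illustrated with a short sketch (toy data; the column names and the 96.1% specificity target are assumptions drawn from the abstract, not the authors' code): within each age/sex stratum, choose the CAD score cutoff that keeps the desired share of non-TB participants below it.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
df = pd.DataFrame({
    "cad_score": rng.random(5000),                       # toy CAD abnormality scores
    "xpert_positive": rng.random(5000) < 0.05,           # toy confirmatory results
    "sex": rng.choice(["F", "M"], size=5000),
    "age_group": rng.choice(["15-34", "35-54", "55+"], size=5000),
})

def threshold_for_specificity(neg_scores, target=0.961):
    """Score cutoff such that ~`target` of non-TB participants fall below it."""
    return np.quantile(neg_scores, target)

# One threshold per age/sex stratum, matched to the same specificity target
thresholds = (df[~df["xpert_positive"]]
              .groupby(["sex", "age_group"])["cad_score"]
              .apply(threshold_for_specificity))
print(thresholds)
```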

General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

Jakob Ambsdorf, Asbjørn Munk, Sebastian Llambias, Anders Nymark Christensen, Kamil Mikolaj, Randall Balestriero, Martin Tolsgaard, Aasa Feragen, Mads Nielsen

arXiv preprint | Jun 24, 2025
With access to large-scale, unlabeled medical datasets, researchers are confronted with two questions: Should they attempt to pretrain a custom foundation model on this medical data, or use transfer-learning from an existing generalist model? And, if a custom model is pretrained, are novel methods required? In this paper we explore these questions by conducting a case-study, in which we train a foundation model on a large regional fetal ultrasound dataset of 2M images. By selecting the well-established DINOv2 method for pretraining, we achieve state-of-the-art results on three fetal ultrasound datasets, covering data from different countries, classification, segmentation, and few-shot tasks. We compare against a series of models pretrained on natural images, ultrasound images, and supervised baselines. Our results demonstrate two key insights: (i) Pretraining on custom data is worth it, even if smaller models are trained on less data, as scaling in natural image pretraining does not translate to ultrasound performance. (ii) Well-tuned methods from computer vision are making it feasible to train custom foundation models for a given medical domain, requiring no hyperparameter tuning and little methodological adaptation. Given these findings, we argue that a bias towards methodological innovation should be avoided when developing domain specific foundation models under common computational resource constraints.
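
As a rough sketch of reusing a DINOv2 backbone for a downstream task (here the public ViT-S/14 weights from torch.hub, not the authors' regional fetal-ultrasound model), one can freeze the encoder and train a small classification head on top of its embeddings.

```python
import torch
import torch.nn as nn

# Downloads the public DINOv2 ViT-S/14 weights on first use (network access required)
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
for p in backbone.parameters():      # freeze the pretrained encoder
    p.requires_grad = False

head = nn.Linear(384, 2)             # ViT-S/14 embedding dim -> two toy classes

x = torch.randn(4, 3, 224, 224)      # batch of (toy) ultrasound frames
with torch.no_grad():
    feats = backbone(x)              # (4, 384) CLS-token embeddings
logits = head(feats)
print(logits.shape)                  # torch.Size([4, 2])
```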

SAM2-SGP: Enhancing SAM2 for Medical Image Segmentation via Support-Set Guided Prompting

Yang Xing, Jiong Wu, Yuheng Bu, Kuang Gong

arXiv preprint | Jun 24, 2025
Although new vision foundation models such as Segment Anything Model 2 (SAM2) have significantly enhanced zero-shot image segmentation capabilities, reliance on human-provided prompts poses significant challenges in adapting SAM2 to medical image segmentation tasks. Moreover, SAM2's performance in medical image segmentation was limited by the domain shift issue, since it was originally trained on natural images and videos. To address these challenges, we proposed SAM2 with support-set guided prompting (SAM2-SGP), a framework that eliminated the need for manual prompts. The proposed model leveraged the memory mechanism of SAM2 to generate pseudo-masks using image-mask pairs from a support set via a Pseudo-mask Generation (PMG) module. We further introduced a novel Pseudo-mask Attention (PMA) module, which used these pseudo-masks to automatically generate bounding boxes and enhance localized feature extraction by guiding attention to relevant areas. Furthermore, a low-rank adaptation (LoRA) strategy was adopted to mitigate the domain shift issue. The proposed framework was evaluated on both 2D and 3D datasets across multiple medical imaging modalities, including fundus photography, X-ray, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and ultrasound. The results demonstrated a significant performance improvement over state-of-the-art models, such as nnUNet and SwinUNet, as well as foundation models, such as SAM2 and MedSAM2, underscoring the effectiveness of the proposed approach. Our code is publicly available at https://github.com/astlian9/SAM_Support.
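
The low-rank adaptation (LoRA) strategy referenced above can be illustrated generically: a frozen linear layer augmented with a trainable low-rank update. This is a textbook sketch, not the SAM2-SGP implementation; the rank and scaling values are assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # keep pretrained weights frozen
            p.requires_grad = False
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a zero (identity) update
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

layer = LoRALinear(nn.Linear(256, 256))
print(layer(torch.randn(2, 256)).shape)    # torch.Size([2, 256])
```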

Non-invasive prediction of NSCLC immunotherapy efficacy and tumor microenvironment through unsupervised machine learning-driven CT Radiomic subtypes: a multi-cohort study.

Guo Y, Gong B, Li Y, Mo P, Chen Y, Fan Q, Sun Q, Miao L, Li Y, Liu Y, Tan W, Yang L, Zheng C

PubMed | Jun 24, 2025
Radiomics analyzes quantitative features from medical images to reveal tumor heterogeneity, offering new insights for diagnosis, prognosis, and treatment prediction. This study explored radiomics-based biomarkers to predict immunotherapy response and its association with the tumor microenvironment in non-small cell lung cancer (NSCLC) using unsupervised machine learning models derived from CT imaging. The study included 1539 NSCLC patients from seven independent cohorts. K-means unsupervised clustering was applied to 1834 radiomic features extracted from 869 NSCLC patients to identify radiomic subtypes. A random forest model extended subtype classification to external cohorts, and model accuracy, sensitivity, and specificity were evaluated. Bulk RNA sequencing (RNA-seq) and single-cell transcriptome sequencing (scRNA-seq) of tumors were used to characterize the immune microenvironment and to evaluate the association between radiomic subtypes and immunotherapy efficacy, immune scores, and immune cell infiltration. Unsupervised clustering stratified NSCLC patients into two subtypes (Cluster 1 and Cluster 2). Principal component analysis confirmed significant distinctions between subtypes across all cohorts. Cluster 2 exhibited significantly longer median overall survival (35 vs. 30 months, P = 0.006) and progression-free survival (19 vs. 16 months, P = 0.020) compared to Cluster 1. Multivariate Cox regression identified radiomic subtype as an independent predictor of overall survival (HR: 0.738, 95% CI 0.583-0.935, P = 0.012), validated in two external cohorts. Bulk RNA-seq showed elevated interaction signaling and immune scores in Cluster 2, and scRNA-seq demonstrated higher proportions of T cells, B cells, and NK cells in Cluster 2. This study establishes radiomic subtypes associated with NSCLC immunotherapy efficacy and the tumor immune microenvironment. The findings provide a non-invasive tool for personalized treatment, enabling early identification of immunotherapy-responsive patients and optimized therapeutic strategies.
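
The two-step design described above (K-means discovers radiomic subtypes in the discovery cohort, then a random forest reproduces the labels so they can be assigned in external cohorts) can be sketched as follows with toy features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
X_discovery = rng.normal(size=(869, 1834))   # toy radiomic features, discovery cohort
X_external = rng.normal(size=(300, 1834))    # toy external cohort

scaler = StandardScaler().fit(X_discovery)
subtype = KMeans(n_clusters=2, n_init=10, random_state=0) \
    .fit_predict(scaler.transform(X_discovery))

# Random forest learns to reproduce the subtype labels for new patients
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(scaler.transform(X_discovery), subtype)
external_subtype = rf.predict(scaler.transform(X_external))
print(np.bincount(external_subtype))
```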