Latest Papers on Radiology AI. Tags: Benchmark SOTA

Diff-Unfolding: A Model-Based Score Learning Framework for Inverse Problems

Yuanhao Wang, Shirin Shoushtari, Ulugbek S. Kamilov

•preprint•May 16 2025

Diffusion models are extensively used for modeling image priors for inverse problems. We introduce Diff-Unfolding, a principled framework for learning posterior score functions of conditional diffusion models by explicitly incorporating the physical measurement operator into a modular network architecture. Diff-Unfolding formulates posterior score learning as the training of an unrolled optimization scheme, where the measurement model is decoupled from the learned image prior. This design allows our method to generalize across inverse problems at inference time by simply replacing the forward operator without retraining. We theoretically justify our unrolling approach by showing that the posterior score can be derived from a composite model-based optimization formulation. Extensive experiments on image restoration and accelerated MRI show that Diff-Unfolding achieves state-of-the-art performance, improving PSNR by up to 2 dB and reducing LPIPS by 22.7%, while being both compact (47M parameters) and efficient (0.72 seconds per 256 × 256 image). An optimized C++/LibTorch implementation further reduces inference time to 0.63 seconds, underscoring the practicality of our approach.

MRI Reconstruction Methodology In Silico Academic Lab Benchmark SOTA Open Code
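
The idea of decoupling the measurement model from the prior, so the forward operator can be swapped at inference time, can be illustrated with a toy unrolled loop. This is a minimal sketch under stated assumptions and not the Diff-Unfolding architecture: `unrolled_reconstruct`, `make_mask_operator`, and `clip_prior` are hypothetical names, the prior step is a hand-written clipping function standing in for a learned network, and the diffusion/score component is not modeled at all.

```python
def unrolled_reconstruct(y, forward, adjoint, prior_step, x0, num_iters=10, step=1.0):
    """Alternate a data-consistency gradient step with a prior step.

    forward/adjoint are passed in as callables, so a different inverse
    problem only needs a different operator pair (hypothetical interface).
    """
    x = list(x0)
    for _ in range(num_iters):
        # Gradient of 0.5 * ||A x - y||^2 with respect to x is A^T (A x - y).
        residual = [fx - yi for fx, yi in zip(forward(x), y)]
        gradient = adjoint(residual)
        x = [xi - step * gi for xi, gi in zip(x, gradient)]
        # Stand-in for the learned image prior (assumption, not a trained model).
        x = prior_step(x)
    return x


def make_mask_operator(mask):
    """Sampling mask as a forward operator; it is diagonal, hence self-adjoint."""
    def apply(v):
        return [m * vi for m, vi in zip(mask, v)]
    return apply, apply


def clip_prior(x):
    """Toy prior: keep intensities inside [0, 1]."""
    return [min(max(xi, 0.0), 1.0) for xi in x]


mask_op, mask_adj = make_mask_operator([1.0, 0.0, 1.0])
measurements = mask_op([0.2, 0.7, 0.9])
estimate = unrolled_reconstruct(measurements, mask_op, mask_adj, clip_prior,
                                x0=[0.5, 0.5, 0.5], num_iters=5)
```

With this toy prior the unobserved middle entry simply stays at its initial value; in the setting the abstract describes, the learned prior is what would fill in such missing information.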

UGoDIT: Unsupervised Group Deep Image Prior Via Transferable Weights

Shijun Liang, Ismail R. Alkhouri, Siddhant Gautam, Qing Qu, Saiprasad Ravishankar

•preprint•May 16 2025

Recent advances in data-centric deep generative models have led to significant progress in solving inverse imaging problems. However, these models (e.g., diffusion models (DMs)) typically require large amounts of fully sampled (clean) training data, which is often impractical in medical and scientific settings such as dynamic imaging. On the other hand, training-data-free approaches like the Deep Image Prior (DIP) do not require clean ground-truth images but suffer from noise overfitting and can be computationally expensive as the network parameters need to be optimized for each measurement set independently. Moreover, DIP-based methods often overlook the potential of learning a prior using a small number of sub-sampled measurements (or degraded images) available during training. In this paper, we propose UGoDIT, an Unsupervised Group DIP via Transferable weights, designed for the low-data regime where only a very small number, M, of sub-sampled measurement vectors are available during training. Our method learns a set of transferable weights by optimizing a shared encoder and M disentangled decoders. At test time, we reconstruct the unseen degraded image using a DIP network, where part of the parameters are fixed to the learned weights, while the remaining are optimized to enforce measurement consistency. We evaluate UGoDIT on both medical (multi-coil MRI) and natural (super resolution and non-linear deblurring) image recovery tasks under various settings. Compared to recent standalone DIP methods, UGoDIT provides accelerated convergence and notable improvement in reconstruction quality. Furthermore, our method achieves performance competitive with SOTA DM-based and supervised approaches, despite not requiring large amounts of clean training data.

MRI Reconstruction Methodology In Silico Academic Lab Benchmark SOTA

CheX-DS: Improving Chest X-ray Image Classification with Ensemble Learning Based on DenseNet and Swin Transformer

Xinran Li, Yu Liu, Xiujuan Xu, Xiaowei Zhao

•preprint•May 16 2025

The automatic diagnosis of chest diseases is a popular and challenging task. Most current methods are based on convolutional neural networks (CNNs), which focus on local features while neglecting global features. Recently, self-attention mechanisms have been introduced into the field of computer vision, demonstrating superior performance. Therefore, this paper proposes an effective model, CheX-DS, for classifying long-tail multi-label data in the medical field of chest X-rays. The model is based on the excellent CNN model DenseNet for medical imaging and the newly popular Swin Transformer model, utilizing ensemble deep learning techniques to combine the two models and leverage the advantages of both CNNs and Transformers. The loss function of CheX-DS combines weighted binary cross-entropy loss with asymmetric loss, effectively addressing the issue of data imbalance. The NIH ChestX-ray14 dataset is selected to evaluate the model's effectiveness. The model outperforms previous studies with an excellent average AUC score of 83.76%, demonstrating its superior performance.

X-Ray Classification Chest Retrospective Clinical In Silico Benchmark SOTA
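
The abstract names the two loss terms but not how they are combined, so the following is only a sketch of one plausible combination for multi-label outputs. The asymmetric term follows the commonly used formulation with separate focusing exponents for positives and negatives and a probability margin on negatives; the mixing weight `mix`, the per-class `pos_weights`, and the default exponents and margin are illustrative assumptions, not values taken from CheX-DS.

```python
import math


def weighted_bce(p, y, pos_weight=1.0, eps=1e-7):
    """Binary cross-entropy with an extra weight on the positive class."""
    p = min(max(p, eps), 1.0 - eps)
    return -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))


def asymmetric_loss(p, y, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-7):
    """Asymmetric loss: negatives are shifted by a margin and focused harder."""
    p = min(max(p, eps), 1.0 - eps)
    if y == 1:
        return -((1.0 - p) ** gamma_pos) * math.log(p)
    # Negatives with probability below the margin contribute no loss.
    p_shifted = max(p - margin, 0.0)
    return -(p_shifted ** gamma_neg) * math.log(1.0 - p_shifted)


def combined_loss(probs, labels, pos_weights, mix=0.5):
    """Average over labels of a convex mix of the two terms (mix is an assumption)."""
    total = 0.0
    for p, y, w in zip(probs, labels, pos_weights):
        total += mix * weighted_bce(p, y, w) + (1.0 - mix) * asymmetric_loss(p, y)
    return total / len(probs)


# Three labels for one image: one rare positive, two negatives.
loss = combined_loss(probs=[0.3, 0.2, 0.9], labels=[1, 0, 0], pos_weights=[5.0, 1.0, 1.0])
```

The design point is that easy negatives (low predicted probability) are almost ignored by the asymmetric term, while the positive-class weight in the cross-entropy term raises the cost of missing rare findings.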

Impact of sarcopenia and obesity on mortality in older adults with SARS-CoV-2 infection: automated deep learning body composition analysis in the NAPKON-SUEP cohort.

Schluessel S, Mueller B, Tausendfreund O, Rippl M, Deissler L, Martini S, Schmidmaier R, Stoecklein S, Ingrisch M, Blaschke S, Brandhorst G, Spieth P, Lehnert K, Heuschmann P, de Miranda SMN, Drey M

•papers•May 16 2025

Severe respiratory infections pose a major challenge in clinical practice, especially in older adults. Body composition analysis could play a crucial role in risk assessment and therapeutic decision-making. This study investigates whether obesity or sarcopenia has a greater impact on mortality in patients with severe respiratory infections. The study focuses on the National Pandemic Cohort Network (NAPKON-SUEP) cohort, which includes patients over 60 years of age with confirmed severe COVID-19 pneumonia. An innovative approach was adopted, using pre-trained deep learning models for automated analysis of body composition based on routine thoracic CT scans. The study included 157 hospitalized patients (mean age 70 ± 8 years, 41% women, mortality rate 39%) from the NAPKON-SUEP cohort at 57 study sites. A pre-trained deep learning model was used to analyze body composition (muscle, bone, fat, and intramuscular fat volumes) from thoracic CT images of the NAPKON-SUEP cohort. Binary logistic regression was performed to investigate the association between obesity, sarcopenia, and mortality. Non-survivors exhibited lower muscle volume (p = 0.043), higher intramuscular fat volume (p = 0.041), and a higher BMI (p = 0.031) compared to survivors. Among all body composition parameters, muscle volume adjusted to weight was the strongest predictor of mortality in the logistic regression model (odds ratio = 0.516), even after adjusting for factors such as sex, age, diabetes, chronic lung disease, and chronic kidney disease. In contrast, BMI did not show significant differences after adjustment for comorbidities. This study identifies muscle volume derived from routine CT scans as a major predictor of survival in patients with severe respiratory infections. The results underscore the potential of AI-supported CT-based body composition analysis for risk stratification and clinical decision making, not only for COVID-19 patients but also for all patients over 60 years of age with severe acute respiratory infections. The innovative application of pre-trained deep learning models opens up new possibilities for automated and standardized assessment in clinical practice.

CT Segmentation Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA
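
As a reading aid for the reported odds ratio, the short calculation below converts it into a logistic regression coefficient and shows how the odds of death would scale. It is a worked arithmetic sketch only: the abstract does not state the unit of the predictor (for example per standard deviation or per raw unit of weight-adjusted muscle volume), so "one unit" here is an assumption, and `odds_after` is a hypothetical helper name.

```python
import math

odds_ratio = 0.516            # reported for muscle volume adjusted to weight
beta = math.log(odds_ratio)   # logistic regression coefficient, about -0.66


def odds_after(baseline_odds, delta_units):
    """Odds of death after increasing the predictor by delta_units (unit assumed)."""
    return baseline_odds * math.exp(beta * delta_units)


# Relative reduction in the odds of death per one-unit increase: about 48%.
relative_reduction = 1.0 - odds_ratio
```

An odds ratio below 1 means higher weight-adjusted muscle volume is associated with lower odds of death, which matches the direction described in the abstract.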

CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs

Raman Dutt, Pedro Sanchez, Yongchen Yao, Steven McDonagh, Sotirios A. Tsaftaris, Timothy Hospedales

•preprint•May 15 2025

We introduce CheXGenBench, a rigorous and multifaceted evaluation framework for synthetic chest radiograph generation that simultaneously assesses fidelity, privacy risks, and clinical utility across state-of-the-art text-to-image generative models. Despite rapid advancements in generative AI for real-world imagery, medical domain evaluations have been hindered by methodological inconsistencies, outdated architectural comparisons, and disconnected assessment criteria that rarely address the practical clinical value of synthetic samples. CheXGenBench overcomes these limitations through standardised data partitioning and a unified evaluation protocol comprising over 20 quantitative metrics that systematically analyse generation quality, potential privacy vulnerabilities, and downstream clinical applicability across 11 leading text-to-image architectures. Our results reveal critical inefficiencies in the existing evaluation protocols, particularly in assessing generative fidelity, leading to inconsistent and uninformative comparisons. Our framework establishes a standardised benchmark for the medical AI community, enabling objective and reproducible comparisons while facilitating seamless integration of both existing and future generative models. Additionally, we release a high-quality, synthetic dataset, SynthCheX-75K, comprising 75K radiographs generated by the top-performing model (Sana 0.6B) in our benchmark to support further research in this critical domain. Through CheXGenBench, we establish a new state-of-the-art and release our framework, models, and SynthCheX-75K dataset at https://raman1121.github.io/CheXGenBench/

X-Ray Image Synthesis Chest Dataset Release In Silico Academic Lab Open Dataset Open Code Benchmark SOTA

Uncertainty Co-estimator for Improving Semi-Supervised Medical Image Segmentation.

Zeng X, Xiong S, Xu J, Du G, Rong Y

•papers•May 15 2025

Recently, combining the strategy of consistency regularization with uncertainty estimation has shown promising performance on semi-supervised medical image segmentation tasks. However, most existing methods estimate the uncertainty solely based on the outputs of a single neural network, which results in imprecise uncertainty estimations and eventually degrades the segmentation performance. In this paper, we propose a novel Uncertainty Co-estimator (UnCo) framework to deal with this problem. Inspired by the co-training technique, UnCo establishes two different mean-teacher modules (i.e., two pairs of teacher and student models), and estimates three types of uncertainty from the multi-source predictions generated by these models. Through combining these uncertainties, their differences will help to filter out incorrect noise in each estimate, thus allowing the final fused uncertainty maps to be more accurate. These resulting maps are then used to enhance a cross-consistency regularization imposed between the two modules. In addition, UnCo also designs an internal consistency regularization within each module, so that the student models can aggregate diverse feature information from both modules, thus promoting the semi-supervised segmentation performance. Finally, an adversarial constraint is introduced to maintain the model diversity. Experimental results on four medical image datasets indicate that UnCo can achieve new state-of-the-art performance on both 2D and 3D semi-supervised segmentation tasks. The source code will be available at https://github.com/z1010x/UnCo.

Mixed Modality Segmentation Methodology In Silico Academic Lab Benchmark SOTA Open Code
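
In the standard mean-teacher setup that the abstract builds on, each teacher is an exponential moving average (EMA) of its student's weights. The sketch below shows only that update for two teacher/student pairs; it does not implement UnCo's uncertainty estimation or consistency losses, and the flat weight lists, the `decay` value, and the name `ema_update` are illustrative assumptions.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved a small step toward the student weights."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Two independent mean-teacher modules, each with its own student and teacher.
modules = [
    {"student": [0.0, 1.0], "teacher": [1.0, 0.0]},
    {"student": [2.0, 2.0], "teacher": [0.0, 0.0]},
]
for module in modules:
    module["teacher"] = ema_update(module["teacher"], module["student"], decay=0.5)
```

In practice the update runs after every student optimization step with a decay close to 1, so each teacher changes slowly and gives more stable predictions than its student.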

CLIF-Net: Intersection-guided Cross-view Fusion Network for Infection Detection from Cranial Ultrasound.

Yu M, Peterson MR, Burgoine K, Harbaugh T, Olupot-Olupot P, Gladstone M, Hagmann C, Cowan FM, Weeks A, Morton SU, Mulondo R, Mbabazi-Kabachelor E, Schiff SJ, Monga V

•papers•May 15 2025

This paper addresses the problem of detecting possible serious bacterial infection (pSBI) of infancy, i.e. a clinical presentation consistent with bacterial sepsis in newborn infants using cranial ultrasound (cUS) images. The captured image set for each patient enables multiview imagery: coronal and sagittal, with geometric overlap. To exploit this geometric relation, we develop a new learning framework, called the intersection-guided Crossview Local- and Image-level Fusion Network (CLIF-Net). Our technique employs two distinct convolutional neural network branches to extract features from coronal and sagittal images with newly developed multi-level fusion blocks. Specifically, we leverage the spatial position of these images to locate the intersecting region. We then identify and enhance the semantic features from this region across multiple levels using cross-attention modules, facilitating the acquisition of mutually beneficial and more representative features from both views. The final enhanced features from the two views are then integrated and projected through the image-level fusion layer, outputting pSBI and non-pSBI class probabilities. We contend that our method of exploiting multi-view cUS images enables a first of its kind, robust 3D representation tailored for pSBI detection. When evaluated on a dataset of 302 cUS scans from Mbale Regional Referral Hospital in Uganda, CLIF-Net demonstrates substantially enhanced performance, surpassing the prevailing state-of-the-art infection detection techniques.

Ultrasound Classification Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Data-Agnostic Augmentations for Unknown Variations: Out-of-Distribution Generalisation in MRI Segmentation

Puru Vaish, Felix Meister, Tobias Heimann, Christoph Brune, Jelmer M. Wolterink

•preprint•May 15 2025

Medical image segmentation models are often trained on curated datasets, leading to performance degradation when deployed in real-world clinical settings due to mismatches between training and test distributions. While data augmentation techniques are widely used to address these challenges, traditional visually consistent augmentation strategies lack the robustness needed for diverse real-world scenarios. In this work, we systematically evaluate alternative augmentation strategies, focusing on MixUp and Auxiliary Fourier Augmentation. These methods mitigate the effects of multiple variations without explicitly targeting specific sources of distribution shifts. We demonstrate how these techniques significantly improve out-of-distribution generalization and robustness to imaging variations across a wide range of transformations in cardiac cine MRI and prostate MRI segmentation. We quantitatively find that these augmentation methods enhance learned feature representations by promoting separability and compactness. Additionally, we highlight how their integration into nnU-Net training pipelines provides an easy-to-implement, effective solution for enhancing the reliability of medical segmentation models in real-world applications.

MRI Segmentation Cardiac Methodology In Silico Academic Lab Benchmark SOTA
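
MixUp, one of the two augmentations evaluated above, forms a convex combination of two training samples with a mixing weight drawn from a Beta distribution. The sketch below applies it to flattened images and soft segmentation masks; blending the masks with the same weight, the `alpha` value, and the list-based representation are assumptions made for illustration and are not taken from the paper or from nnU-Net.

```python
import random


def mixup(image_a, image_b, mask_a, mask_b, alpha=0.4, rng=random):
    """Blend two samples with a weight lam drawn from Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    mixed_image = [lam * a + (1.0 - lam) * b for a, b in zip(image_a, image_b)]
    # The targets are blended with the same weight, giving soft labels.
    mixed_mask = [lam * a + (1.0 - lam) * b for a, b in zip(mask_a, mask_b)]
    return mixed_image, mixed_mask, lam


image, mask, lam = mixup(
    image_a=[0.0, 4.0], image_b=[4.0, 0.0],
    mask_a=[1.0, 0.0], mask_b=[0.0, 1.0],
)
```

Because the mixing weight is random, the augmentation does not target any particular source of distribution shift, which is the data-agnostic property the abstract highlights.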

Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging

Xianrui Li, Yufei Cui, Jun Li, Antoni B. Chan

•preprint•May 15 2025

Advances in medical imaging and deep learning have propelled progress in whole slide image (WSI) analysis, with multiple instance learning (MIL) showing promise for efficient and accurate diagnostics. However, conventional MIL models often lack adaptability to evolving datasets, as they rely on static training that cannot incorporate new information without extensive retraining. Applying continual learning (CL) to MIL models is a possible solution, but often sees limited improvements. In this paper, we analyze CL in the context of attention MIL models and find that the model forgetting is mainly concentrated in the attention layers of the MIL model. Using the results of this analysis we propose two components for improving CL on MIL: Attention Knowledge Distillation (AKD) and the Pseudo-Bag Memory Pool (PMP). AKD mitigates catastrophic forgetting by focusing on retaining attention layer knowledge between learning sessions, while PMP reduces the memory footprint by selectively storing only the most informative patches, or "pseudo-bags", from WSIs. Experimental evaluations demonstrate that our method significantly improves both accuracy and memory efficiency on diverse WSI datasets, outperforming current state-of-the-art CL methods. This work provides a foundation for CL in large-scale, weakly annotated clinical datasets, paving the way for more adaptable and resilient diagnostic models.

Mixed Modality Classification Methodology In Silico Academic Lab Benchmark SOTA

Machine learning-based prognostic subgrouping of glioblastoma: A multicenter study.

Akbari H, Bakas S, Sako C, Fathi Kazerooni A, Villanueva-Meyer J, Garcia JA, Mamourian E, Liu F, Cao Q, Shinohara RT, Baid U, Getka A, Pati S, Singh A, Calabrese E, Chang S, Rudie J, Sotiras A, LaMontagne P, Marcus DS, Milchenko M, Nazeri A, Balana C, Capellades J, Puig J, Badve C, Barnholtz-Sloan JS, Sloan AE, Vadmal V, Waite K, Ak M, Colen RR, Park YW, Ahn SS, Chang JH, Choi YS, Lee SK, Alexander GS, Ali AS, Dicker AP, Flanders AE, Liem S, Lombardo J, Shi W, Shukla G, Griffith B, Poisson LM, Rogers LR, Kotrotsou A, Booth TC, Jain R, Lee M, Mahajan A, Chakravarti A, Palmer JD, DiCostanzo D, Fathallah-Shaykh H, Cepeda S, Santonocito OS, Di Stefano AL, Wiestler B, Melhem ER, Woodworth GF, Tiwari P, Valdes P, Matsumoto Y, Otani Y, Imoto R, Aboian M, Koizumi S, Kurozumi K, Kawakatsu T, Alexander K, Satgunaseelan L, Rulseh AM, Bagley SJ, Bilello M, Binder ZA, Brem S, Desai AS, Lustig RA, Maloney E, Prior T, Amankulor N, Nasrallah MP, O'Rourke DM, Mohan S, Davatzikos C

•papers•May 15 2025

Glioblastoma (GBM) is the most aggressive adult primary brain cancer, characterized by significant heterogeneity, posing challenges for patient management, treatment planning, and clinical trial stratification. We developed a highly reproducible, personalized prognostication, and clinical subgrouping system using machine learning (ML) on routine clinical data, magnetic resonance imaging (MRI), and molecular measures from 2838 demographically diverse patients across 22 institutions and 3 continents. Patients were stratified into favorable, intermediate, and poor prognostic subgroups (I, II, and III) using Kaplan-Meier analysis (Cox proportional model and hazard ratios [HR]). The ML model stratified patients into distinct prognostic subgroups with HRs between subgroups I-II and I-III of 1.62 (95% CI: 1.43-1.84, P < .001) and 3.48 (95% CI: 2.94-4.11, P < .001), respectively. Analysis of imaging features revealed several tumor properties contributing unique prognostic value, supporting the feasibility of a generalizable prognostic classification system in a diverse cohort. Our ML model demonstrates extensive reproducibility and online accessibility, utilizing routine imaging data rather than complex imaging protocols. This platform offers a unique approach to personalized patient management and clinical trial stratification in GBM.

MRI Classification Neurological Retrospective Clinical In Silico Consortium Benchmark SOTA
