Latest Papers on Radiology AI. Tags: Benchmark SOTA, Order: Best Match, Limit: 10.

Recovering Diagnostic Value: Super-Resolution-Aided Echocardiographic Classification in Resource-Constrained Imaging

Krishan Agyakari Raja Babu, Om Prabhu, Annu, Mohanasankar Sivaprakasam

•preprint•Jul 30 2025

Automated cardiac interpretation in resource-constrained settings (RCS) is often hindered by poor-quality echocardiographic imaging, limiting the effectiveness of downstream diagnostic models. While super-resolution (SR) techniques have shown promise in enhancing magnetic resonance imaging (MRI) and computed tomography (CT) scans, their application to echocardiography-a widely accessible but noise-prone modality-remains underexplored. In this work, we investigate the potential of deep learning-based SR to improve classification accuracy on low-quality 2D echocardiograms. Using the publicly available CAMUS dataset, we stratify samples by image quality and evaluate two clinically relevant tasks of varying complexity: a relatively simple Two-Chamber vs. Four-Chamber (2CH vs. 4CH) view classification and a more complex End-Diastole vs. End-Systole (ED vs. ES) phase classification. We apply two widely used SR models-Super-Resolution Generative Adversarial Network (SRGAN) and Super-Resolution Residual Network (SRResNet), to enhance poor-quality images and observe significant gains in performance metric-particularly with SRResNet, which also offers computational efficiency. Our findings demonstrate that SR can effectively recover diagnostic value in degraded echo scans, making it a viable tool for AI-assisted care in RCS, achieving more with less.

Ultrasound Classification Cardiac Methodology In Silico Benchmark SOTA Open Dataset

A generalizable diffusion framework for 3D low-dose and few-view cardiac SPECT imaging.

Xie H, Gan W, Ji W, Chen X, Alashi A, Thorn SL, Zhou B, Liu Q, Xia M, Guo X, Liu YH, An H, Kamilov US, Wang G, Sinusas AJ, Liu C

•papers•Jul 30 2025

Myocardial perfusion imaging using SPECT is widely utilized to diagnose coronary artery diseases, but image quality can be negatively affected in low-dose and few-view acquisition settings. Although various deep learning methods have been introduced to improve image quality from low-dose or few-view SPECT data, previous approaches often fail to generalize across different acquisition settings, limiting realistic applicability. This work introduced DiffSPECT-3D, a diffusion framework for 3D cardiac SPECT imaging that effectively adapts to different acquisition settings without requiring further network re-training or fine-tuning. Using both image and projection data, a consistency strategy is proposed to ensure that diffusion sampling at each step aligns with the low-dose/few-view projection measurements, the image data, and the scanner geometry, thus enabling generalization to different low-dose/few-view settings. Incorporating anatomical spatial information from CT and total variation constraint, we proposed a 2.5D conditional strategy to allow DiffSPECT-3D to observe 3D contextual information from the entire image volume, addressing the 3D memory/computational issues in diffusion model. We extensively evaluated the proposed method on 1,325 clinical <sup>99m</sup>Tc tetrofosmin stress/rest studies from 795 patients. Each study was reconstructed into 5 different low-count levels and 5 different projection few-view levels for model evaluations, ranging from 1% to 50% and from 1 view to 9 view, respectively. Validated against cardiac catheterization results and diagnostic review from nuclear cardiologists, the presented results show the potential to achieve low-dose and few-view SPECT imaging without compromising clinical performance. Additionally, DiffSPECT-3D could be directly applied to full-dose SPECT images to further improve image quality, especially in a low-dose stress-first cardiac SPECT imaging protocol.

SPECT Reconstruction Cardiac Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Optimizing Federated Learning Configurations for MRI Prostate Segmentation and Cancer Detection: A Simulation Study

Ashkan Moradi, Fadila Zerka, Joeran S. Bosma, Mohammed R. S. Sunoqrot, Bendik S. Abrahamsen, Derya Yakar, Jeroen Geerdink, Henkjan Huisman, Tone Frost Bathen, Mattijs Elschot

•preprint•Jul 30 2025

Purpose: To develop and optimize a federated learning (FL) framework across multiple clients for biparametric MRI prostate segmentation and clinically significant prostate cancer (csPCa) detection. Materials and Methods: A retrospective study was conducted using Flower FL to train a nnU-Net-based architecture for MRI prostate segmentation and csPCa detection, using data collected from January 2010 to August 2021. Model development included training and optimizing local epochs, federated rounds, and aggregation strategies for FL-based prostate segmentation on T2-weighted MRIs (four clients, 1294 patients) and csPCa detection using biparametric MRIs (three clients, 1440 patients). Performance was evaluated on independent test sets using the Dice score for segmentation and the Prostate Imaging: Cancer Artificial Intelligence (PI-CAI) score, defined as the average of the area under the receiver operating characteristic curve and average precision, for csPCa detection. P-values for performance differences were calculated using permutation testing. Results: The FL configurations were independently optimized for both tasks, showing improved performance at 1 epoch 300 rounds using FedMedian for prostate segmentation and 5 epochs 200 rounds using FedAdagrad, for csPCa detection. Compared with the average performance of the clients, the optimized FL model significantly improved performance in prostate segmentation and csPCa detection on the independent test set. The optimized FL model showed higher lesion detection performance compared to the FL-baseline model, but no evidence of a difference was observed for prostate segmentation. Conclusions: FL enhanced the performance and generalizability of MRI prostate segmentation and csPCa detection compared with local models, and optimizing its configuration further improved lesion detection performance.

MRI Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Use of imaging biomarkers and ambulatory functional endpoints in Duchenne muscular dystrophy clinical trials: Systematic review and machine learning-driven trend analysis.

Todd M, Kang S, Wu S, Adhin D, Yoon DY, Willcocks R, Kim S

•papers•Jul 29 2025

Duchenne muscular dystrophy (DMD) is a rare X-linked genetic muscle disorder affecting primarily pediatric males and leading to limited life expectancy. This systematic review of 85 DMD trials and non-interventional studies (2010-2022) evaluated how magnetic resonance imaging biomarkers-particularly fat fraction and T2 relaxation time-are currently being used to quantitatively track disease progression and how their use compares to traditional mobility-based functional endpoints. Imaging biomarker studies lasted on average 4.50 years, approximately 11 months longer than those using only ambulatory functional endpoints. While 93% of biologic intervention trials (n = 28) included ambulatory functional endpoints, only 13.3% (n = 4) incorporated imaging biomarkers. Small molecule trials and natural history studies were the predominant contributors to imaging biomarker use, each comprising 30.4% of such studies. Small molecule trials used imaging biomarkers more frequently than biologic trials, likely because biologics often target dystrophin, an established surrogate biomarker, while small molecules lack regulatory-approved biomarkers. Notably, following the 2018 FDA guidance finalization, we observed a significant decrease in new trials using imaging biomarkers despite earlier regulatory encouragement. This analysis demonstrates that while imaging biomarkers are increasingly used in natural history studies, their integration into interventional trials remains limited. From XGBoost machine learning analysis, trial duration and start year were the strongest predictors of biomarker usage, with a decline observed following the 2018 FDA guidance. Despite their potential to objectively track disease progression, imaging biomarkers have not yet been widely adopted as primary endpoints in therapeutic trials, likely due to regulatory and logistical challenges. Future research should examine whether standardizing imaging protocols or integrating hybrid endpoint models could bridge the regulatory gap currently limiting biomarker adoption in therapeutic trials.

MRI Musculoskeletal Review In Silico Benchmark SOTA

SwinECAT: A Transformer-based fundus disease classification model with Shifted Window Attention and Efficient Channel Attention

Peiran Gu, Teng Yao, Mengshen He, Fuhao Duan, Feiyan Liu, RenYuan Peng, Bao Ge

•preprint•Jul 29 2025

In recent years, artificial intelligence has been increasingly applied in the field of medical imaging. Among these applications, fundus image analysis presents special challenges, including small lesion areas in certain fundus diseases and subtle inter-disease differences, which can lead to reduced prediction accuracy and overfitting in the models. To address these challenges, this paper proposes the Transformer-based model SwinECAT, which combines the Shifted Window (Swin) Attention with the Efficient Channel Attention (ECA) Attention. SwinECAT leverages the Swin Attention mechanism in the Swin Transformer backbone to effectively capture local spatial structures and long-range dependencies within fundus images. The lightweight ECA mechanism is incorporated to guide the SwinECAT's attention toward critical feature channels, enabling more discriminative feature representation. In contrast to previous studies that typically classify fundus images into 4 to 6 categories, this work expands fundus disease classification to 9 distinct types, thereby enhancing the granularity of diagnosis. We evaluate our method on the Eye Disease Image Dataset (EDID) containing 16,140 fundus images for 9-category classification. Experimental results demonstrate that SwinECAT achieves 88.29\% accuracy, with weighted F1-score of 0.88 and macro F1-score of 0.90. The classification results of our proposed model SwinECAT significantly outperform the baseline Swin Transformer and multiple compared baseline models. To our knowledge, this represents the highest reported performance for 9-category classification on this public dataset.

OCT Classification Methodology In Silico Academic Lab Benchmark SOTA

Cardiac-CLIP: A Vision-Language Foundation Model for 3D Cardiac CT Images

Yutao Hu, Ying Zheng, Shumei Miao, Xiaolei Zhang, Jiahao Xia, Yaolei Qi, Yiyang Zhang, Yuting He, Qian Chen, Jing Ye, Hongyan Qiao, Xiuhua Hu, Lei Xu, Jiayin Zhang, Hui Liu, Minwen Zheng, Yining Wang, Daimin Zhang, Ji Zhang, Wenqi Shao, Yun Liu, Longjiang Zhang, Guanyu Yang

•preprint•Jul 29 2025

Foundation models have demonstrated remarkable potential in medical domain. However, their application to complex cardiovascular diagnostics remains underexplored. In this paper, we present Cardiac-CLIP, a multi-modal foundation model designed for 3D cardiac CT images. Cardiac-CLIP is developed through a two-stage pre-training strategy. The first stage employs a 3D masked autoencoder (MAE) to perform self-supervised representation learning from large-scale unlabeled volumetric data, enabling the visual encoder to capture rich anatomical and contextual features. In the second stage, contrastive learning is introduced to align visual and textual representations, facilitating cross-modal understanding. To support the pre-training, we collect 16641 real clinical CT scans, supplemented by 114k publicly available data. Meanwhile, we standardize free-text radiology reports into unified templates and construct the pathology vectors according to diagnostic attributes, based on which the soft-label matrix is generated to supervise the contrastive learning process. On the other hand, to comprehensively evaluate the effectiveness of Cardiac-CLIP, we collect 6,722 real-clinical data from 12 independent institutions, along with the open-source data to construct the evaluation dataset. Specifically, Cardiac-CLIP is comprehensively evaluated across multiple tasks, including cardiovascular abnormality classification, information retrieval and clinical analysis. Experimental results demonstrate that Cardiac-CLIP achieves state-of-the-art performance across various downstream tasks in both internal and external data. Particularly, Cardiac-CLIP exhibits great effectiveness in supporting complex clinical tasks such as the prospective prediction of acute coronary syndrome, which is notoriously difficult in real-world scenarios.

CT Classification Cardiac Methodology In Silico Academic Lab Benchmark SOTA Open Dataset

Distribution-Based Masked Medical Vision-Language Model Using Structured Reports

Shreyank N Gowda, Ruichi Zhang, Xiao Gu, Ying Weng, Lu Yang

•preprint•Jul 29 2025

Medical image-language pre-training aims to align medical images with clinically relevant text to improve model performance on various downstream tasks. However, existing models often struggle with the variability and ambiguity inherent in medical data, limiting their ability to capture nuanced clinical information and uncertainty. This work introduces an uncertainty-aware medical image-text pre-training model that enhances generalization capabilities in medical image analysis. Building on previous methods and focusing on Chest X-Rays, our approach utilizes structured text reports generated by a large language model (LLM) to augment image data with clinically relevant context. These reports begin with a definition of the disease, followed by the `appearance' section to highlight critical regions of interest, and finally `observations' and `verdicts' that ground model predictions in clinical semantics. By modeling both inter- and intra-modal uncertainty, our framework captures the inherent ambiguity in medical images and text, yielding improved representations and performance on downstream tasks. Our model demonstrates significant advances in medical image-text pre-training, obtaining state-of-the-art performance on multiple downstream tasks.

X-Ray Classification Chest Methodology In Silico Benchmark SOTA GenAI

BioAug-Net: a bioimage sensor-driven attention-augmented segmentation framework with physiological coupling for early prostate cancer detection in T2-weighted MRI.

Arshad M, Wang C, Us Sima MW, Ali Shaikh J, Karamti H, Alharthi R, Selecky J

•papers•Jul 29 2025

Accurate segmentation of the prostate peripheral zone (PZ) in T2-weighted MRI is critical for the early detection of prostate cancer. Existing segmentation methods are hindered by significant inter-observer variability (37.4 ± 5.6%), poor boundary localization, and the presence of motion artifacts, along with challenges in clinical integration. In this study, we propose BioAug-Net, a novel framework that integrates real-time physiological signal feedback with MRI data, leveraging transformer-based attention mechanisms and a probabilistic clinical decision support system (PCDSS). BioAug-Net features a dual-branch asymmetric attention mechanism: one branch processes spatial MRI features, while the other incorporates temporal sensor signals through a BiGRU-driven adaptive masking module. Additionally, a Markov Decision Process-based PCDSS maps segmentation outputs to clinical PI-RADS scores, with uncertainty quantification. We validated BioAug-Net on a multi-institutional dataset (n=1,542) and demonstrated state-of-the-art performance, achieving a Dice Similarity Coefficient of 89.7% (p < 0.001), sensitivity of 91.2% (p < 0.001), specificity of 88.4% (p < 0.001), and HD95 of 2.14 mm (p < 0.001), outperforming U-Net, Attention U-Net, and TransUNet. Sensor integration improved segmentation accuracy by 12.6% (p < 0.001) and reduced inter-observer variation by 48.3% (p < 0.001). Radiologist evaluations (n=3) confirmed a 15.0% reduction in diagnosis time (p = 0.003) and an increase in inter-reader agreement from K = 0.68 to K = 0.82 (p = 0.001). Our results show that BioAug-Net offers a clinically viable solution for early prostate cancer detection through enhanced physiological coupling and explainable AI diagnostics.

MRI Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

A data assimilation framework for predicting the spatiotemporal response of high-grade gliomas to chemoradiation.

Miniere HJM, Hormuth DA, Lima EABF, Farhat M, Panthi B, Langshaw H, Shanker MD, Talpur W, Thrower S, Goldman J, Ty S, Chung C, Yankeelov TE

•papers•Jul 29 2025

High-grade gliomas are highly invasive and respond variably to chemoradiation. Accurate, patient-specific predictions of tumor response could enhance treatment planning. We present a novel computational platform that assimilates MRI data to continually predict spatiotemporal tumor changes during chemoradiotherapy. Tumor growth and response to chemoradiation was described using a two-species reaction-diffusion model of enhancing and non-enhancing regions of the tumor. Two evaluation scenarios were used to test the predictive accuracy of this model. In scenario 1, the model was calibrated on a patient-specific basis (n = 21) to weekly MRI data during the course of chemoradiotherapy. A data assimilation framework was used to update model parameters with each new imaging visit which were then used to update model predictions. In scenario 2, we evaluated the predictive accuracy of the model when fewer data points are available by calibrating the same model using only the first two imaging visits and then predicted tumor response at the remaining five weeks of treatment. We investigated three approaches to assign model parameters for scenario 2: (1) predictions using only parameters estimated by fitting the data obtained from an individual patient's first two imaging visits, (2) predictions made by averaging the patient-specific parameters with the cohort-derived parameters, and (3) predictions using only cohort-derived parameters. Scenario 1 achieved a median [range] concordance correlation coefficient (CCC) between the predicted and measured total tumor cell counts of 0.91 [0.84, 0.95], and a median [range] percent error in tumor volume of -2.6% [-19.7, 8.0%], demonstrating strong agreement throughout the course of treatment. For scenario 2, the three approaches yielded CCCs of: (1) 0.65 [0.51, 0.88], (2) 0.74 [0.70, 0.91], (3) 0.76 [0.73, 0.92] with significant differences between the approach (1) that does not use the cohort parameters and the two approaches (2 and 3) that do. The proposed data assimilation framework enhances the accuracy of tumor growth forecasts by integrating patient-specific and cohort-based data. These findings show a practical method for identifying more personalized treatment strategies in high-grade glioma patients.

MRI Segmentation Neurological Retrospective Clinical In Silico Academic Lab Benchmark SOTA

MFFBi-Unet: Merging Dynamic Sparse Attention and Multi-scale Feature Fusion for Medical Image Segmentation.

Sun B, Liu C, Wang Q, Bi K, Zhang W

•papers•Jul 29 2025

The advancement of deep learning has driven extensive research validating the effectiveness of U-Net-style symmetric encoder-decoder architectures based on Transformers for medical image segmentation. However, the inherent design requiring attention mechanisms to compute token affinities across all spatial locations leads to prohibitive computational complexity and substantial memory demands. Recent efforts have attempted to address these limitations through sparse attention mechanisms. However, existing approaches employing artificial, content-agnostic sparse attention patterns demonstrate limited capability in modeling long-range dependencies effectively. We propose MFFBi-Unet, a novel architecture incorporating dynamic sparse attention through bi-level routing, enabling context-aware computation allocation with enhanced adaptability. The encoder-decoder module integrates BiFormer to optimize semantic feature extraction and facilitate high-fidelity feature map reconstruction. A novel Multi-scale Feature Fusion (MFF) module in skip connections synergistically combines multi-level contextual information with processed multi-scale features. Extensive evaluations on multiple public medical benchmarks demonstrate that our method consistently exhibits significant advantages. Notably, our method achieves statistically significant improvements, outperforming state-of-the-art approaches like MISSFormer by 2.02% and 1.28% Dice scores on respective benchmarks.

Mixed Modality Segmentation Methodology In Silico Benchmark SOTA

Recovering Diagnostic Value: Super-Resolution-Aided Echocardiographic Classification in Resource-Constrained Imaging

A generalizable diffusion framework for 3D low-dose and few-view cardiac SPECT imaging.

Optimizing Federated Learning Configurations for MRI Prostate Segmentation and Cancer Detection: A Simulation Study

Use of imaging biomarkers and ambulatory functional endpoints in Duchenne muscular dystrophy clinical trials: Systematic review and machine learning-driven trend analysis.

SwinECAT: A Transformer-based fundus disease classification model with Shifted Window Attention and Efficient Channel Attention

Cardiac-CLIP: A Vision-Language Foundation Model for 3D Cardiac CT Images

Distribution-Based Masked Medical Vision-Language Model Using Structured Reports

BioAug-Net: a bioimage sensor-driven attention-augmented segmentation framework with physiological coupling for early prostate cancer detection in T2-weighted MRI.

A data assimilation framework for predicting the spatiotemporal response of high-grade gliomas to chemoradiation.

MFFBi-Unet: Merging Dynamic Sparse Attention and Multi-scale Feature Fusion for Medical Image Segmentation.

Ready to Sharpen Your Edge?