
Bennett RD, Barrett T, Sushentsev N, Sanmugalingam N, Lee KL, Gnanapragasam VJ, Tse ZTH

PubMed · Sep 30, 2025
Proposed methods for prostate cancer screening are currently prohibitively expensive (owing to the high cost of imaging equipment such as magnetic resonance imaging and traditional ultrasound systems), inadequate in their detection rates, dependent on highly trained specialists, and/or invasive, causing patient discomfort. These limitations make population-wide screening for prostate cancer challenging. Machine learning applied to abdominal ultrasound scanning may alleviate some of these disadvantages: abdominal ultrasound is comparatively low cost, causes minimal patient discomfort, and machine learning can mitigate its high operator-dependent variability. In this study, a state-of-the-art machine learning model was compared with an expert radiologist and trainee radiology registrars of varying experience at estimating prostate volume from abdominal ultrasound images, a crucial step in detecting prostate cancer using prostate-specific antigen density. The model calculated prostatic volume by marking out the dimensions required by the prolate ellipsoid formula on two orthogonal images of the prostate acquired with abdominal ultrasound (which could be performed by operators with minimal experience in a primary care setting). While both the algorithm and the registrars correlated highly with the expert ([Formula: see text]), the model outperformed the trainees in both accuracy (lowest average volume error of [Formula: see text]) and consistency (lowest IQR of [Formula: see text] and lowest average volume standard deviation of [Formula: see text]). The results are promising for the future development of an automated prostate cancer screening workflow using machine learning and abdominal ultrasound scans.
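The volume step the model automates is a simple closed-form calculation. Below is a minimal sketch of the prolate ellipsoid formula, V = π/6 × L × W × H, and the downstream PSA density it enables; the measurement values are hypothetical and the function names are ours, not the paper's.

```python
import math

def prolate_ellipsoid_volume(length_cm: float, width_cm: float, height_cm: float) -> float:
    """Prostate volume (mL) from three orthogonal diameters via the
    prolate ellipsoid formula: V = pi/6 * L * W * H."""
    return math.pi / 6.0 * length_cm * width_cm * height_cm

def psa_density(psa_ng_ml: float, volume_ml: float) -> float:
    """PSA density = serum PSA (ng/mL) divided by prostate volume (mL)."""
    return psa_ng_ml / volume_ml

# Hypothetical measurements from two orthogonal abdominal ultrasound views:
# length and height from the sagittal view, width from the axial view.
volume = prolate_ellipsoid_volume(4.1, 4.8, 3.2)
print(f"Volume: {volume:.1f} mL, PSA density: {psa_density(6.0, volume):.3f} ng/mL/mL")
```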

Almattar W, Anwar S, Al-Azani S, Khan FA

PubMed · Sep 30, 2025
Diabetic retinopathy (DR) is a leading cause of vision loss, necessitating early, accurate detection. Automated deep learning models show promise but struggle with the complexity of retinal images and limited labeled data. Due to domain differences, traditional transfer learning from datasets like ImageNet often fails in medical imaging. Self-supervised learning (SSL) offers a solution by enabling models to learn directly from medical data, but its success depends on the backbone architecture. Convolutional Neural Networks (CNNs) focus on local features, which can be limiting. To address this, we propose the Multi-scale Self-Supervised Learning (MsSSL) model, combining Vision Transformers (ViTs) for global context and CNNs with a Feature Pyramid Network (FPN) for multi-scale feature extraction. These features are refined through a Deep Learner module, improving spatial resolution and capturing high-level and fine-grained information. The MsSSL model significantly enhances DR grading, outperforming traditional methods, and underscores the value of domain-specific pretraining and advanced model integration in medical imaging.
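To make the fusion idea concrete, here is a minimal PyTorch sketch of combining CNN feature maps at several scales with a ViT global embedding; this is our reading of the abstract, not the authors' code, and all layer widths are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    """Minimal sketch of the MsSSL fusion idea: CNN feature maps at several
    scales are reduced to a common width, pooled, and concatenated with a
    ViT-style global token before a final projection."""
    def __init__(self, cnn_channels=(256, 512, 1024), vit_dim=768, fused_dim=512):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, 256, kernel_size=1) for c in cnn_channels)
        self.head = nn.Linear(256 * len(cnn_channels) + vit_dim, fused_dim)

    def forward(self, cnn_maps, vit_global):
        pooled = [F.adaptive_avg_pool2d(l(m), 1).flatten(1)
                  for l, m in zip(self.lateral, cnn_maps)]
        fused = torch.cat(pooled + [vit_global], dim=1)
        return self.head(fused)

# Toy shapes: three CNN scales plus a 768-d ViT [CLS] embedding.
maps = [torch.randn(2, c, s, s) for c, s in [(256, 56), (512, 28), (1024, 14)]]
out = MultiScaleFusion()(maps, torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 512])
```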

Lee H, Kim J, Kwak S, Rehman A, Park SM, Chang J

PubMed · Sep 30, 2025
Carotid atherosclerosis is a key predictor of cardiovascular disease (CVD), necessitating early detection. While foundation models (FMs) show promise in medical imaging, their optimal selection and fine-tuning strategies for classifying carotid atherosclerosis from retinal images remain unclear. Using data from 39,620 individuals, we evaluated four vision FMs with three fine-tuning methods. Models were assessed on predictive performance, on clinical utility via survival analysis for future CVD mortality, and on explainability via Grad-CAM with vessel segmentation. DINOv2 with low-rank adaptation showed the best overall performance (area under the receiver operating characteristic curve = 0.71; sensitivity = 0.87; specificity = 0.44), prognostic relevance (hazard ratio = 2.20, P-trend < 0.05), and vascular alignment. While further external validation in a broader clinical context is necessary to establish the model's generalizability, these findings support the feasibility of opportunistic atherosclerosis and CVD screening using retinal imaging and highlight the importance of a multi-dimensional evaluation framework for optimal FM selection in medical artificial intelligence.
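As an illustration of the best-performing recipe (DINOv2 plus low-rank adaptation), the sketch below attaches LoRA adapters to a Hugging Face DINOv2 backbone with the PEFT library; the checkpoint name, target module names, and the binary head are assumptions on our part, not details from the paper.

```python
import torch
import torch.nn as nn
from transformers import AutoModel
from peft import LoraConfig, get_peft_model

# Backbone: Hugging Face DINOv2 (checkpoint name assumed here).
backbone = AutoModel.from_pretrained("facebook/dinov2-base")

# Low-rank adapters on the attention projections; module names follow the
# transformers ViT convention and may differ in other implementations.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.1,
                      target_modules=["query", "value"])
backbone = get_peft_model(backbone, lora_cfg)
backbone.print_trainable_parameters()  # only the LoRA weights train

# Hypothetical binary head for atherosclerosis present/absent.
head = nn.Linear(backbone.config.hidden_size, 2)
pixels = torch.randn(1, 3, 224, 224)           # dummy retinal image tensor
feats = backbone(pixel_values=pixels).pooler_output
print(head(feats).shape)                       # torch.Size([1, 2])
```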

Sriram S, Nivethitha V, Arun Kaarthic TP, Archita S, Murugan T

PubMed · Sep 30, 2025
Alzheimer's disease is a condition that affects the brain, causing behavioral changes and memory loss and making it hard to carry out everyday tasks. Early detection is vital for effective treatment. MRI-based detection of Alzheimer's has advanced through machine learning and deep learning models, which use neural networks to analyze brain MRI scans automatically and identify key indicators of the disease. In this study, we used MRI data and deep learning techniques to diagnose and categorize the four stages of Alzheimer's disease, which offers significant advantages in identifying patterns in medical imaging for this neurodegenerative condition compared to using a single CNN trained exclusively for this purpose. We evaluated ResNet50, InceptionResNetV2, and a CNN trained specifically for this study, and found that combining the models led to highly accurate results. The individual accuracy rates were 90.76% for the trained CNN, 86.84% for InceptionResNetV2, and 90.27% for ResNet50; combining all three models resulted in an accuracy of 94.27%, exceeding the accuracy of each model working individually.
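The abstract does not specify how the three models are combined; a common choice is soft voting, averaging each model's class probabilities, as in the toy sketch below (synthetic logits, four hypothetical stages).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical logits from the three models for 4 scans and 4 AD stages.
rng = np.random.default_rng(0)
logits = {m: rng.normal(size=(4, 4))
          for m in ["custom_cnn", "inception_resnet_v2", "resnet50"]}

# Unweighted soft voting: average the per-model class probabilities.
probs = np.mean([softmax(z) for z in logits.values()], axis=0)
print(probs.argmax(axis=1))  # ensemble stage prediction per scan
```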

Mirus M, Leitert E, Bockholt R, Heubner L, Löck S, Brei M, Biehler J, Kühn JP, Koch T, Wall W, Spieth PM

PubMed · Sep 30, 2025
Decisions regarding veno-venous extracorporeal membrane oxygenation (vv-ECMO) in patients with acute respiratory distress syndrome (ARDS) are often based solely on clinical and physiological parameters, which may insufficiently reflect the severity and heterogeneity of lung injury. This study aimed to develop a predictive model integrating machine learning-derived quantitative features from admission chest computed tomography (CT) with selected clinical variables to support early individualized decision-making regarding vv-ECMO therapy. In this retrospective single-center cohort study, 375 consecutive patients with COVID-19-associated ARDS admitted to the ICU between March 2020 and April 2022 were included. Lung segmentation from initial CTs was performed using a convolutional neural network (CNN) to generate high-resolution, anatomically accurate masks of the lungs. Subsequently, 592 radiomic features quantifying lung aeration, density, and morphology were extracted. Four clinical parameters (age, mean airway pressure, lactate, and C-reactive protein) were selected on the basis of clinical relevance. Three logistic regression models were developed: (1) an Imaging Model, (2) a Clinical Model, and (3) a Combined Model integrating both feature sets. Predictive performance was assessed via the area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, and specificity. A total of 375 patients were included: 172 in the training and 203 in the validation cohort. In the training cohort, the AUROCs were 0.743 (Imaging), 0.828 (Clinical), and 0.842 (Combined). In the validation cohort, the Combined Model achieved the highest AUROC (0.705), outperforming the Clinical (0.674) and Imaging (0.639) Models. Overall accuracy in the validation cohort was 64.0% (Combined), 66.5% (Clinical), and 59.1% (Imaging). The Combined Model showed 68.1% sensitivity and 58.9% specificity. Kaplan-Meier analysis confirmed a significantly greater cumulative incidence of ECMO therapy in patients predicted as high risk (p < 0.001), underscoring the model's potential to support individualized, timely ECMO decisions in ARDS by providing clinicians with objective, data-driven risk estimates. Quantitative CT features based on machine learning-derived lung segmentation allow early individualized prediction of the need for vv-ECMO in ARDS. While clinical data remain essential, radiomic markers enhance prognostic accuracy. The Combined Model demonstrates considerable potential to support timely and evidence-based ECMO initiation, facilitating individualized critical care in both specialized and general ICU environments. Trial registration: the study is registered with the German Clinical Trials Register under the number DRKS00027856. Registered 18.01.2022, retrospectively registered owing to the retrospective design of the study.
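Since the three models are plain logistic regressions over radiomic and clinical inputs, the Combined Model reduces to a familiar scikit-learn pattern. The sketch below reproduces only the shape of the setup (592 radiomic features, 4 clinical variables, 172/203 train/validation split); all feature values and labels are synthetic, invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_train, n_val = 172, 203                         # cohort sizes from the study
X_rad = rng.normal(size=(n_train + n_val, 592))   # radiomic features (synthetic)
X_clin = rng.normal(size=(n_train + n_val, 4))    # age, mean airway pressure, lactate, CRP
y = rng.integers(0, 2, size=n_train + n_val)      # vv-ECMO yes/no (synthetic)

# Combined Model: both feature sets feed one regularized logistic regression.
X = np.hstack([X_rad, X_clin])
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X[:n_train], y[:n_train])
print("AUROC:", roc_auc_score(y[n_train:], model.predict_proba(X[n_train:])[:, 1]))
```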

Zaheer AN, Farhan M, Min G, Alotaibi FA, Alnfiai MM

PubMed · Sep 30, 2025
Prostate cancer remains a leading cause of mortality, necessitating precise histopathological segmentation for accurate Gleason grade assessment. However, existing deep learning-based segmentation models lack contextual awareness and explainability, leading to inconsistent performance across heterogeneous tissue structures. Conventional U-Net architectures and CNN-based approaches struggle to capture long-range dependencies and fine-grained histopathological patterns, resulting in suboptimal boundary delineation and limited model generalizability. To address these limitations, we propose a transformer-attention hybrid U-Net (TAH U-Net), integrating hybrid CNN-transformer encoding, attention-guided skip connections, and a multi-stage guided loss mechanism for enhanced segmentation accuracy and model interpretability. The ResNet50-based convolutional layers efficiently capture local spatial features, while Vision Transformer (ViT) blocks model global contextual dependencies, improving segmentation consistency. Attention mechanisms are incorporated into skip connections and decoder pathways, refining feature propagation by suppressing irrelevant tissue noise while enhancing diagnostically significant regions. A novel hierarchical guided loss function optimizes segmentation masks at multiple decoder stages, improving boundary refinement and gradient stability. Additionally, Explainable AI (XAI) techniques such as LIME, Occlusion Sensitivity, and Partial Dependence Analysis (PDP) validate the model's decision-making transparency, ensuring clinical reliability. Experimental evaluation on the SICAPv2 dataset demonstrates state-of-the-art performance, surpassing traditional U-Net architectures with a 4.6% increase in Dice score, a 5.1% gain in IoU, and notable improvements in Precision (+4.2%) and Recall (+3.8%). This research significantly advances AI-driven prostate cancer diagnostics by providing an interpretable and highly accurate segmentation framework, enhancing clinical trust in histopathology-based grading within medical imaging and computational pathology.
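One building block named here, attention-guided skip connections, can be sketched compactly. The module below follows the well-known Attention U-Net gating scheme as a stand-in; the TAH U-Net's exact gate design may differ, and all channel sizes are assumptions.

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Sketch of an attention-guided skip connection: the decoder signal g
    weights the encoder skip features x, suppressing irrelevant tissue
    regions before the skip is concatenated into the decoder."""
    def __init__(self, g_ch, x_ch, inter_ch):
        super().__init__()
        self.wg = nn.Conv2d(g_ch, inter_ch, kernel_size=1)
        self.wx = nn.Conv2d(x_ch, inter_ch, kernel_size=1)
        self.psi = nn.Sequential(nn.Conv2d(inter_ch, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, g, x):
        a = self.psi(torch.relu(self.wg(g) + self.wx(x)))  # per-pixel weights in [0, 1]
        return x * a                                       # re-weighted skip features

gate = AttentionGate(g_ch=256, x_ch=128, inter_ch=64)
out = gate(torch.randn(1, 256, 32, 32), torch.randn(1, 128, 32, 32))
print(out.shape)  # torch.Size([1, 128, 32, 32])
```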

Raajasree K, Jaichandran R

PubMed · Sep 30, 2025
Parkinson's disease (PD) is a chronic neurodegenerative disorder characterized by progressive loss of dopaminergic neurons in the substantia nigra, resulting in both motor impairments and cognitive decline. Traditional PD classification methods are expert-dependent and time-intensive, while existing deep learning (DL) models often suffer from inconsistent accuracy, limited interpretability, and an inability to fully capture PD's clinical heterogeneity. This study proposes a novel framework, Enhanced EfficientNet-Extended Multimodal PD Classification with Hybrid Particle Swarm and Grey Wolf Optimizer (EEFN-XM-PDC-HybPS-GWO), to overcome these challenges. The model integrates T1-weighted MRI, DaTscan images, and gait scores from the NTUA and PhysioNet repositories. Denoising is achieved via Multiscale Attention Variational Autoencoders (MSA-VAE), and critical regions are segmented using Semantic Invariant Multi-View Clustering (SIMVC). The Enhanced EfficientNet-Extended Multimodal (EEFN-XM) model extracts and fuses image and gait features, while HybPS-GWO optimizes classification weights. The system classifies subjects into early-stage PD, advanced-stage PD, and healthy controls (HCs). Ablation analysis confirms the hybrid optimizer's contribution to performance gains. The proposed model achieved 99.2% accuracy with stratified 5-fold cross-validation, outperforming DMFEN-PDC, MMT-CA-PDC, and LSTM-PDD-GS by 7.3%, 15.97%, and 10.43%, respectively, and reduced execution time by 33.33%. EEFN-XM-PDC-HybPS-GWO demonstrates superior accuracy, computational efficiency, and clinical relevance, particularly in early-stage diagnosis and PD classification.
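The hybrid optimizer tunes classification weights by blending particle swarm and grey wolf updates. The toy sketch below is one illustrative way to hybridize them, steering a PSO velocity toward the GWO alpha/beta/delta leaders; it is our construction for intuition, not the paper's algorithm, and the objective is synthetic.

```python
import numpy as np

def hybrid_pso_gwo(fitness, dim, n=20, iters=100, seed=0):
    """Toy hybrid PSO-GWO: the PSO social term pulls particles toward the
    mean of the three best solutions (the GWO wolf leaders) instead of a
    single global best."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n, dim))
    vel = np.zeros((n, dim))
    pbest = pos.copy()
    pbest_f = np.array([fitness(p) for p in pos])
    for t in range(iters):
        leaders = pbest[np.argsort(pbest_f)[:3]]           # alpha, beta, delta
        w = 0.9 - 0.5 * t / iters                          # decaying inertia
        r1, r2 = rng.random((2, n, dim))
        vel = (w * vel + 1.5 * r1 * (pbest - pos)
               + 1.5 * r2 * (leaders.mean(axis=0) - pos))  # pull toward leaders
        pos = pos + vel
        f = np.array([fitness(p) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
    return pbest[pbest_f.argmin()], pbest_f.min()

# Example: tune 5 fusion weights on a synthetic quadratic objective.
best_w, best_f = hybrid_pso_gwo(lambda w: np.sum((w - 0.3) ** 2), dim=5)
print(best_w.round(3), round(best_f, 6))
```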

Zhang Y, Liu G, Chen Z, Huang Z, Kan S, Ji X, Luo S, Zhu S, Yang J, Chen Y

PubMed · Sep 30, 2025
In computed tomography (CT), non-uniform detector responses often lead to ring artifacts in reconstructed images. For conventional energy-integrating detectors (EIDs), such artifacts can be effectively addressed through dead-pixel correction and flat-dark field calibration. However, the response characteristics of photon-counting detectors (PCDs) are more complex, and standard calibration procedures can only partially mitigate ring artifacts. Consequently, developing high-performance ring artifact removal algorithms is essential for PCD-based CT systems. To this end, we propose the Inter-slice Complementarity Enhanced Ring Artifact Removal (ICE-RAR) algorithm. Since artifact removal in the central region is particularly challenging, ICE-RAR utilizes a dual-branch neural network that simultaneously performs global artifact removal and enhances restoration of the central region. Moreover, recognizing that the detector response is also non-uniform in the vertical direction, ICE-RAR extracts and utilizes inter-slice complementarity to enhance artifact elimination and image restoration. Experiments on simulated data and two real datasets acquired from PCD-based CT systems demonstrate the effectiveness of ICE-RAR in reducing ring artifacts while preserving structural details. More importantly, since system-specific characteristics are incorporated into the data simulation process, models trained on the simulated data can be directly applied to unseen real data from the target PCD-based CT system, demonstrating ICE-RAR's potential to address the ring artifact removal problem in practical CT systems. The implementation is publicly available at https://github.com/DarkBreakerZero/ICE-RAR.
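The dual-branch layout can be schematized in a few lines: a global branch cleans the whole slice, and a second branch re-processes a central crop where ring artifacts are hardest to remove. The sketch below is a deliberately tiny stand-in for intuition; the real model (see the linked repository) is far more elaborate, and all sizes here are assumptions.

```python
import torch
import torch.nn as nn

class DualBranchRAR(nn.Module):
    """Schematic dual-branch artifact removal: global cleanup plus a
    dedicated refinement of the image center, pasted back into place."""
    def __init__(self, ch=32, crop=64):
        super().__init__()
        self.crop = crop
        def branch():
            return nn.Sequential(
                nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
                nn.Conv2d(ch, 1, 3, padding=1))
        self.global_branch, self.center_branch = branch(), branch()

    def forward(self, x):
        out = self.global_branch(x)
        h, w = x.shape[-2:]
        t, l = (h - self.crop) // 2, (w - self.crop) // 2
        center = out[..., t:t + self.crop, l:l + self.crop]
        out = out.clone()
        out[..., t:t + self.crop, l:l + self.crop] = self.center_branch(center)
        return out

print(DualBranchRAR()(torch.randn(1, 1, 256, 256)).shape)  # torch.Size([1, 1, 256, 256])
```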

Taohan Weng, Chi zhang, Chaoran Yan, Siya Liu, Xiaoyang Liu, Yalun Wu, Boyang Wang, Boyan Wang, Jiren Ren, Kaiwen Yan, Jinze Yu, Kaibing Hu, Henan Liu, Haoyun Zheng, Zhenyu Liu, Duo Zhang, Xiaoqing Guo, Anjie Le, Hongcheng Guo

arXiv preprint · Sep 30, 2025
Ultrasound is crucial in modern medicine but faces challenges like operator dependence, image noise, and real-time scanning, hindering AI integration. While large multimodal models excel in other medical imaging areas, they struggle with ultrasound's complexities. To address this, we introduce Dolphin v1.0 (V1) and its reasoning-augmented version, Dolphin R1, the first large-scale multimodal ultrasound foundation models unifying diverse clinical tasks in a single vision-language framework. To tackle ultrasound variability and noise, we curated a 2-million-scale multimodal dataset, combining textbook knowledge, public data, synthetic samples, and general corpora. This ensures robust perception, generalization, and clinical adaptability. The Dolphin series employs a three-stage training strategy: domain-specialized pretraining, instruction-driven alignment, and reinforcement-based refinement. Dolphin v1.0 delivers reliable performance in classification, detection, regression, and report generation. Dolphin R1 enhances diagnostic inference, reasoning transparency, and interpretability through reinforcement learning with ultrasound-specific rewards. Evaluated on U2-Bench across eight ultrasound tasks, Dolphin R1 achieves a U2-score of 0.5835, over twice that of the second-best model (0.2968), setting a new state of the art. Dolphin v1.0 also performs competitively, validating the unified framework. Comparisons show reasoning-enhanced training significantly improves diagnostic accuracy, consistency, and interpretability, highlighting its importance for high-stakes medical AI.

Arvind Murari Vepa, Yannan Yu, Jingru Gan, Anthony Cuturrufo, Weikai Li, Wei Wang, Fabien Scalzo, Yizhou Sun

arXiv preprint · Sep 30, 2025
We introduce mpLLM, a prompt-conditioned hierarchical mixture-of-experts (MoE) architecture for visual question answering over multi-parametric 3D brain MRI (mpMRI). mpLLM routes across modality-level and token-level projection experts to fuse multiple interrelated 3D modalities, enabling efficient training without image-report pretraining. To address limited image-text paired supervision, mpLLM integrates a synthetic visual question answering (VQA) protocol that generates medically relevant VQA from segmentation annotations, and we collaborate with medical experts for clinical validation. mpLLM outperforms strong medical VLM baselines by 5.3% on average across multiple mpMRI datasets. Our study features three main contributions: (1) the first clinically validated VQA dataset for 3D brain mpMRI, (2) a novel multimodal LLM that handles multiple interrelated 3D modalities, and (3) strong empirical results that demonstrate the medical utility of our methodology. Ablations highlight the importance of modality-level and token-level experts and prompt-conditioned routing.
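The core routing idea, projection experts selected per token conditioned on the prompt, can be sketched briefly. The module below is a toy rendering of prompt-conditioned token-level routing; mpLLM also routes at the modality level, and the dimensions, expert count, and router design are not given in the abstract, so everything here is assumed.

```python
import torch
import torch.nn as nn

class PromptConditionedMoE(nn.Module):
    """Toy prompt-conditioned mixture-of-experts: a router sees each visual
    token together with a pooled prompt embedding and softly weights a set
    of linear projection experts."""
    def __init__(self, dim=512, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.router = nn.Linear(2 * dim, n_experts)  # token + prompt embedding

    def forward(self, tokens, prompt):
        # tokens: (B, T, D) visual tokens; prompt: (B, D) pooled question embedding.
        p = prompt.unsqueeze(1).expand(-1, tokens.size(1), -1)
        weights = torch.softmax(self.router(torch.cat([tokens, p], dim=-1)), dim=-1)
        expert_out = torch.stack([e(tokens) for e in self.experts], dim=-1)  # (B,T,D,E)
        return (expert_out * weights.unsqueeze(2)).sum(dim=-1)

moe = PromptConditionedMoE()
out = moe(torch.randn(2, 16, 512), torch.randn(2, 512))
print(out.shape)  # torch.Size([2, 16, 512])
```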
