Page 64 of 3463455 results

An Efficient Dual-Line Decoder Network with Multi-Scale Convolutional Attention for Multi-organ Segmentation

Riad Hassan, M. Rubaiyat Hossain Mondal, Sheikh Iqbal Ahamed, Fahad Mostafa, Md Mostafijur Rahman

arxiv logopreprintAug 23 2025
Proper segmentation of organs-at-risk is important for radiation therapy, surgical planning, and diagnostic decision-making in medical image analysis. While deep learning-based segmentation architectures have made significant progress, they often fail to balance segmentation accuracy with computational efficiency. Most current state-of-the-art methods either prioritize performance at the cost of high computational complexity or compromise accuracy for efficiency. This paper addresses this gap by introducing an efficient dual-line decoder segmentation network (EDLDNet). The proposed method features a noisy decoder, which learns to incorporate structured perturbation at training time for better model robustness, yet at inference time only the noise-free decoder is executed, leading to lower computational cost. Multi-Scale Convolutional Attention Modules (MSCAMs), Attention Gates (AGs), and Up-Convolution Blocks (UCBs) are further utilized to optimize feature representation and boost segmentation performance. By leveraging multi-scale segmentation masks from both decoders, we also utilize a mutation-based loss function to enhance the model's generalization. Our approach outperforms SOTA segmentation architectures on four publicly available medical imaging datasets. EDLDNet achieves SOTA performance with an 84.00% Dice score on the Synapse dataset, surpassing baseline models such as UNet by 13.89% in Dice score while significantly reducing Multiply-Accumulate Operations (MACs) by 89.7%. Compared to recent approaches like EMCAD, our EDLDNet not only achieves a higher Dice score but also maintains comparable computational efficiency. The outstanding performance across diverse datasets establishes EDLDNet's strong generalization, computational efficiency, and robustness. The source code, pre-processed data, and pre-trained weights will be available at https://github.com/riadhassan/EDLDNet.
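The Dice score reported throughout these abstracts measures the overlap between a predicted mask and a reference mask. A minimal stdlib-only sketch, representing each mask as a set of foreground voxel coordinates (the function name and representation are illustrative, not from the paper):

```python
def dice_score(pred, target):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|).

    `pred` and `target` are iterables of foreground voxel coordinates
    (e.g. (row, col) tuples); identical empty masks score 1.0.
    """
    pred, target = set(pred), set(target)
    if not pred and not target:
        return 1.0
    return 2 * len(pred & target) / (len(pred) + len(target))

# Two toy masks sharing 2 of their 3 foreground pixels: Dice = 4/6 ≈ 0.667
print(dice_score([(0, 0), (0, 1), (0, 2)], [(0, 1), (0, 2), (0, 3)]))
```

In practice the same formula is applied to dense binary arrays; the set form just keeps the arithmetic explicit.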

Towards expert-level autonomous carotid ultrasonography with large-scale learning-based robotic system.

Jiang H, Zhao A, Yang Q, Yan X, Wang T, Wang Y, Jia N, Wang J, Wu G, Yue Y, Luo S, Wang H, Ren L, Chen S, Liu P, Yao G, Yang W, Song S, Li X, He K, Huang G

pubmed logopapersAug 23 2025
Carotid ultrasound requires skilled operators due to small vessel dimensions and high anatomical variability, exacerbating sonographer shortages and diagnostic inconsistencies. Prior automation attempts, including rule-based approaches with manual heuristics and reinforcement learning trained in simulated environments, demonstrate limited generalizability and fail to complete real-world clinical workflows. Here, we present UltraBot, a fully learning-based autonomous carotid ultrasound robot, achieving human-expert-level performance through four innovations: (1) a unified imitation learning framework for acquiring anatomical knowledge and scanning operational skills; (2) a large-scale expert demonstration dataset (247,000 samples, a 100× scale-up), enabling embodied foundation models with strong generalization; (3) a comprehensive scanning protocol ensuring full anatomical coverage for biometric measurement and plaque screening; (4) clinically oriented validation showing over 90% success rates, expert-level accuracy, and up to 5.5× higher reproducibility across diverse unseen populations. Overall, we show that large-scale deep learning offers a promising pathway toward autonomous, high-precision ultrasonography in clinical practice.

Deep learning-based lightweight model for automated lumbar foraminal stenosis classification: sagittal CT diagnostic performance compared to clinical subspecialists.

Huang JW, Zhang YL, Li KY, Li HL, Ye HB, Chen YH, Lin XX, Tian NF

pubmed logopapersAug 23 2025
Magnetic resonance imaging (MRI) is essential for diagnosing lumbar foraminal stenosis (LFS). However, access remains limited in China due to uneven equipment distribution, high costs, and long waiting times. Therefore, this study developed a lightweight deep learning (DL) model using sagittal CT images to classify LFS severity as a potential clinical alternative where MRI is unavailable. A retrospective study included 868 sagittal CT images from 177 patients (2016-2025). Data were split at the patient level into training (n = 125), validation (n = 31), and test sets (n = 21), with annotations, based on the Lee grading system, provided by two spine surgeons. Two DL models were developed: DL1 (EfficientNet-B0) and DL2 (MobileNetV3-Large-100), both of which incorporated a Faster R-CNN with a ResNet-50-based region-of-interest (ROI) detector. Diagnostic performance was benchmarked against spine surgeons with different levels of clinical experience. DL1 achieved 82.35% diagnostic accuracy (matching the senior spine surgeon's 83.33%), with DL2 at 80.39% (mean 81.37%), both exceeding the junior spine surgeon's 62.75%. DL1 demonstrated near-perfect diagnostic agreement with the senior spine surgeon, as validated by Cohen's kappa analysis (κ = 0.815; 95% CI: 0.723-0.907), whereas DL2 showed substantial consistency (κ = 0.799; 95% CI: 0.703-0.895). Inter-model agreement yielded κ = 0.782 (95% CI: 0.682-0.882). The DL models achieved a mean diagnostic accuracy of 81.37%, comparable to that of the senior spine surgeon (83.33%) in grading LFS severity on sagittal CT. However, given the limited sample size and absence of external validation, their applicability and generalisability to other populations and in multi-centre, large-scale datasets remain uncertain.
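The Cohen's kappa values above quantify diagnostic agreement beyond what chance alone would produce. A small self-contained sketch of the statistic (illustrative, not the study's code):

```python
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is the agreement expected by chance from each
    rater's marginal label frequencies."""
    assert len(rater1) == len(rater2) and rater1
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[label] * c2[label] for label in set(c1) | set(c2)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Two raters agree on 3 of 4 cases; chance agreement is 0.5, so kappa = 0.5
print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))
```

A kappa around 0.8, as reported above, is conventionally read as near-perfect to substantial agreement.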

Utility of machine learning for predicting severe chronic thromboembolic pulmonary hypertension based on CT metrics in a surgical cohort.

Grubert Van Iderstine M, Kim S, Karur GR, Granton J, de Perrot M, McIntosh C, McInnis M

pubmed logopapersAug 23 2025
The aim of this study was to develop machine learning (ML) models to explore the relationship between chronic pulmonary embolism (PE) burden and severe pulmonary hypertension (PH) in surgical chronic thromboembolic pulmonary hypertension (CTEPH). CTEPH patients with a preoperative CT pulmonary angiogram and pulmonary endarterectomy between 01/2017 and 06/2022 were included. A mean pulmonary artery pressure of > 50 mmHg was classified as severe. CTs were scored by a blinded radiologist who recorded chronic pulmonary embolism extent in detail and measured the right ventricle (RV), left ventricle (LV), main pulmonary artery (PA), and ascending aorta (Ao) diameters. XGBoost models were developed to identify CTEPH feature importance and compared to a logistic regression model. There were 184 patients included; 54.9% were female, and 21.7% had severe PH. The average age was 57 ± 15 years. PE burden alone was not helpful in identifying severe PH. The RV/LV ratio logistic regression model performed well (AUC 0.76) with a cutoff of 1.4. A baseline ML model (Model 1) including only the RV, LV, PA, and Ao measures and their ratios yielded an average AUC of 0.66 ± 0.10. The addition of demographics and statistics summarizing the CT findings raised the AUC to 0.75 ± 0.08 (F1 score 0.41). While measures of PE burden had little bearing on PH severity independently, the RV/LV ratio, extent of disease in various segments, total webs observed, and patient demographics improved the performance of machine learning models in identifying severe PH. Question Can machine learning methods applied to CT-based cardiac measurements and detailed maps of chronic thromboembolism type and distribution predict pulmonary hypertension (PH) severity? Findings The right-to-left ventricle (RV/LV) ratio was predictive of PH severity with an optimal cutoff of 1.4, and detailed accounts of chronic thromboembolic burden improved model performance.
Clinical relevance The identification of a CT-based RV/LV ratio cutoff of 1.4 gives radiologists, clinicians, and patients a point of reference for chronic thromboembolic PH severity. Detailed chronic thromboembolic burden data are useful but cannot be used alone to predict PH severity.
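The reported RV/LV cutoff of 1.4 amounts to a simple decision rule at the point of reference. The function below is a hypothetical sketch of applying that threshold, not the study's logistic regression model:

```python
def predict_severe_ph(rv_diameter_mm: float, lv_diameter_mm: float,
                      cutoff: float = 1.4) -> bool:
    """Flag possible severe PH when the CT-measured RV/LV diameter
    ratio meets or exceeds the reported 1.4 cutoff.

    Function name and parameters are illustrative; the study's actual
    model combined this ratio with demographics and PE-burden features.
    """
    if lv_diameter_mm <= 0:
        raise ValueError("LV diameter must be positive")
    return (rv_diameter_mm / lv_diameter_mm) >= cutoff

# RV 49 mm / LV 34 mm ≈ 1.44 → above cutoff; RV 40 mm / LV 35 mm ≈ 1.14 → below
print(predict_severe_ph(49, 34), predict_severe_ph(40, 35))
```

As the abstract notes, the threshold is a point of reference, not a standalone diagnostic test.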

Quantitative Evaluation of AI-based Organ Segmentation Across Multiple Anatomical Sites Using Eight Commercial Software Platforms.

Yuan L, Chen Q, Al-Hallaq H, Yang J, Yang X, Geng H, Latifi K, Cai B, Wu QJ, Xiao Y, Benedict SH, Rong Y, Buchsbaum J, Qi XS

pubmed logopapersAug 23 2025
To evaluate organs-at-risk (OARs) segmentation variability across eight commercial AI-based segmentation software platforms using independent multi-institutional datasets, and to provide recommendations for clinical practices utilizing AI segmentation. 160 planning CT image sets from four anatomical sites (head-and-neck, thorax, abdomen, and pelvis) were retrospectively pooled from three institutions. Contours for 31 OARs generated by the software were compared to clinical contours using multiple accuracy metrics, including the Dice similarity coefficient (DSC), the 95th percentile of the Hausdorff distance (HD95), and surface DSC, as well as relative added path length (RAPL) as an efficiency metric. A two-factor analysis of variance was used to quantify variability in contouring accuracy across software platforms (inter-software) and patients (inter-patient). Pairwise comparisons were performed to categorize the software into different performance groups, and inter-software variations (ISV) were calculated as the average performance differences between the groups. Significant inter-software and inter-patient contouring accuracy variations (p<0.05) were observed for most OARs. The largest ISVs in DSC in each anatomical region were for the cervical esophagus (0.41), trachea (0.10), spinal cord (0.13), and prostate (0.17). Among the organs evaluated, 7 had mean DSC >0.9 (e.g., heart, liver), 15 had DSC ranging from 0.7 to 0.89 (e.g., parotid, esophagus). The remaining organs (e.g., optic nerves, seminal vesicle) had DSC<0.7. 16 of the 31 organs (52%) had RAPL less than 0.1. Our results reveal significant inter-software and inter-patient variability in the performance of AI segmentation software. These findings highlight the need for thorough software commissioning, testing, and quality assurance across disease sites, patient-specific anatomies, and image acquisition protocols.
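HD95, one of the metrics above, is the 95th percentile of surface distances between two contours, which makes it less sensitive to single outlier points than the plain Hausdorff distance. Several percentile conventions exist; the brute-force point-set sketch below (illustrative, not any vendor's implementation) uses one common nearest-index choice:

```python
import math

def hd95(contour_a, contour_b):
    """95th-percentile symmetric Hausdorff distance between two point
    sets (sequences of coordinate tuples). Brute force: O(|A|·|B|)."""
    def directed(points, other):
        # Nearest-neighbor distance from each point to the other set,
        # then take the value at the 95th-percentile rank.
        dists = sorted(min(math.dist(p, q) for q in other) for p in points)
        idx = min(len(dists) - 1, math.ceil(0.95 * len(dists)) - 1)
        return dists[idx]
    return max(directed(contour_a, contour_b), directed(contour_b, contour_a))

# Two parallel 2-point "contours" one unit apart → HD95 = 1.0
print(hd95([(0, 0), (1, 0)], [(0, 1), (1, 1)]))
```

Production tools compute this on mask surfaces with spatial indexing; the definition is the same.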

A novel MRI-based habitat analysis and deep learning for predicting perineural invasion in prostate cancer: a two-center study.

Deng S, Huang D, Han X, Zhang H, Wang H, Mao G, Ao W

pubmed logopapersAug 23 2025
To explore the efficacy of a deep learning (DL) model in predicting perineural invasion (PNI) in prostate cancer (PCa) by conducting multiparametric MRI (mpMRI)-based tumor heterogeneity analysis. This retrospective study included 397 patients with PCa from two medical centers. The patients were divided into training, internal validation (in-vad), and independent external validation (ex-vad) cohorts (n = 173, 74, and 150, respectively). mpMRI-based habitat analysis, comprising T2-weighted imaging, diffusion-weighted imaging, and apparent diffusion coefficient sequences, was performed followed by DL, deep feature selection, and filtration to compute a radscore. Subsequently, six models were constructed: one clinical model, four habitat models (habitats 1, 2, 3, and whole-tumor), and one combined model. Receiver operating characteristic curve analysis was performed to evaluate the models' ability to predict PNI. The four habitat models exhibited robust performance in predicting PNI, with area under the curve (AUC) values of 0.862-0.935, 0.802-0.957, and 0.859-0.939 in the training, in-vad, and ex-vad cohorts, respectively. The clinical model had AUC values of 0.832, 0.818, and 0.789 in the training, in-vad, and ex-vad cohorts, respectively. The combined model outperformed the clinical and habitat models, with AUC, sensitivity, and specificity values of 0.999, 1, and 0.955 for the training cohort. Decision curve analysis and clinical impact curve analysis indicated favorable clinical applicability and utility of the combined model. DL models constructed through mpMRI-based habitat analysis accurately predict the PNI status of PCa.
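The AUC values above come from receiver operating characteristic analysis; the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one (the Mann-Whitney U formulation). A stdlib-only sketch, not the study's implementation:

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparisons: fraction of (positive, negative)
    pairs where the positive scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    assert pos and neg, "need at least one positive and one negative"
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three positives vs. one negative: 2 of 3 pairs ranked correctly → AUC = 2/3
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 1]))
```

This O(|P|·|N|) form is fine for small cohorts; rank-based implementations scale better.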

Epicardial and paracardial adipose tissue quantification in short-axis cardiac cine MRI using deep learning.

Zhang R, Wang X, Zhou Z, Ni L, Jiang M, Hu P

pubmed logopapersAug 23 2025
Epicardial and paracardial adipose tissues (EAT and PAT) are two types of fat depots around the heart, and they have important roles in cardiac physiology. Manual quantification of EAT and PAT from cardiac MR (CMR) is time-consuming and prone to human bias. Leveraging the cardiac motion, we aimed to develop deep learning neural networks for automated segmentation and quantification of EAT and PAT in short-axis cine CMR. A modified U-Net equipped with modules of multi-resolution convolution, motion information extraction, feature fusion, and dual attention mechanisms was developed. Multiple steps of ablation studies were performed to verify the efficacy of each module. The performance of different networks was also compared. The final network incorporating all modules achieved segmentation Dice indices of 77.72% ± 2.53% and 77.18% ± 3.54% for EAT and PAT, respectively, which were significantly higher than the baseline U-Net. It also achieved the highest performance compared to other networks. With our model, the coefficients of determination (R²) between EAT and PAT volumes and the reference were 0.8550 and 0.8025, respectively. Our proposed network can provide accurate and quick quantification of EAT and PAT on routine short-axis cine CMR, which can potentially aid cardiologists in clinical settings.
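The coefficients of determination reported above measure how much of the variance in the reference volumes the model's volumes explain: R² = 1 − SS_res/SS_tot. A minimal illustrative sketch:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares over
    total sum of squares about the mean of the reference values."""
    assert len(y_true) == len(y_pred) and y_true
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Reference vs. predicted volumes (arbitrary units): SS_res=2, SS_tot=5 → R²=0.6
print(r_squared([1, 2, 3, 4], [2, 2, 3, 3]))
```

An R² near 0.85, as reported for EAT, means most of the reference-volume variance is captured by the automated measurement.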

Towards generalist foundation model for radiology by leveraging web-scale 2D&3D medical data.

Wu C, Zhang X, Zhang Y, Hui H, Wang Y, Xie W

pubmed logopapersAug 23 2025
In this study, as a proof of concept, we aim to initiate the development of a Radiology Foundation Model, termed RadFM. We consider three perspectives: dataset construction, model design, and thorough evaluation, summarized as follows: (i) we contribute 4 multimodal datasets with 13M 2D images and 615K 3D scans; when combined with a vast collection of existing datasets, this forms our training dataset, termed the Medical Multi-modal Dataset (MedMD); (ii) we propose an architecture that integrates text input with 2D or 3D medical scans and generates responses for diverse radiologic tasks, including diagnosis, visual question answering, report generation, and rationale diagnosis; (iii) beyond evaluation on 9 existing datasets, we propose a new benchmark, RadBench, comprising three tasks that aim to assess foundation models comprehensively. We conduct both automatic and human evaluations on RadBench. RadFM outperforms previously accessible multi-modal foundation models, including GPT-4V. Additionally, we adapt RadFM to diverse public benchmarks, surpassing various existing SOTAs.

The impact of a neuroradiologist on the report of a real-world CT perfusion imaging map derived by AI/ML-driven software.

De Rubeis G, Stasolla A, Piccoli C, Federici M, Cozzolino V, Lovullo G, Leone E, Pesapane F, Fabiano S, Bertaccini L, Pingi A, Galluzzo M, Saba L, Pampana E

pubmed logopapersAug 22 2025
According to guidelines, computed tomography perfusion (CTP) should be read and analyzed using computer-aided software. This study evaluates the efficacy of AI/ML (machine learning)-driven software in CTP imaging and the effect of neuroradiologists' interpretation of these automated results. We conducted a retrospective, single-center cohort study from June to December 2023 at a comprehensive stroke center. A total of 132 patients with suspected acute ischemic stroke underwent CTP. The AI software RAPID.AI was utilized for initial analysis, with subsequent validation and adjustments made by experienced neuroradiologists. The rate of CTP maps marked as "non-reportable," "reportable," and "reportable with correction" by the neuroradiologist was recorded. The degree of confidence in the report of the basal and angio-CT scan was assessed before and after CTP visualization. Statistical analysis included logistic regression and F1 score assessments to evaluate the predictive accuracy of AI-generated CTP maps. RESULTS: The study found that CTP maps derived from AI software were reportable in 65.2% of cases without artifacts, improving to 87.9% reportable cases when reviewed by neuroradiologists. Key predictive factors for artifact-free CTP maps included motion parameters and the timing of contrast peak distances. There was a significant shift to higher confidence scores for the angiographic phase of the CT after the CTP result. CONCLUSIONS: Neuroradiologists play an indispensable role in enhancing the reliability of CTP imaging by interpreting and correcting AI-processed maps. CTP = computed tomography perfusion; AI/ML = Artificial Intelligence/Machine Learning; LVO = large vessel occlusion.

Deep Learning-based Automated Coronary Plaque Quantification: First Demonstration With Ultra-high Resolution Photon-counting Detector CT at Different Temporal Resolutions.

Klambauer K, Burger SD, Demmert TT, Mergen V, Moser LJ, Gulsun MA, Schöbinger M, Schwemmer C, Wels M, Allmendinger T, Eberhard M, Alkadhi H, Schmidt B

pubmed logopapersAug 22 2025
The aim of this study was to evaluate the feasibility and reproducibility of a novel deep learning (DL)-based coronary plaque quantification tool with automatic case preparation in patients undergoing ultra-high resolution (UHR) photon-counting detector CT coronary angiography (CCTA), and to assess the influence of temporal resolution on plaque quantification. In this retrospective single-center study, 45 patients undergoing clinically indicated UHR CCTA were included. In each scan, 2 image data sets were reconstructed: one in the dual-source mode with 66 ms temporal resolution and one simulating a single-source mode with 125 ms temporal resolution. A novel, DL-based algorithm for fully automated coronary segmentation and intensity-based plaque quantification was applied to both data sets in each patient. Plaque volume quantification was performed at the vessel-level for the entire left anterior descending artery (LAD), left circumflex artery (CX), and right coronary artery (RCA), as well as at the lesion-level for the largest coronary plaque in each vessel. Diameter stenosis grade was quantified for the coronary lesion with the greatest longitudinal extent in each vessel. To assess reproducibility, the algorithm was rerun 3 times in 10 randomly selected patients, and all outputs were visually reviewed and confirmed by an expert reader. Paired Wilcoxon signed-rank tests with Benjamini-Hochberg correction were used for statistical comparisons. One hundred nineteen out of 135 (88.1%) coronary arteries showed atherosclerotic plaques and were included in the analysis. In the reproducibility analysis, repeated runs of the algorithm yielded identical results across all plaque and lumen measurements (P > 0.999). All outputs were confirmed to be anatomically correct, visually consistent, and did not require manual correction. 
At the vessel level, total plaque volumes were higher in the 125 ms reconstructions compared with the 66 ms reconstructions in 28 of 45 patients (62%), with both calcified and noncalcified plaque volumes being higher in 32 (71%) and 28 (62%) patients, respectively. Total plaque volumes in the LAD, CX, and RCA were significantly higher in the 125 ms reconstructions (681.3 vs. 647.8 mm³, P < 0.05). At the lesion level, total plaque volumes were higher in the 125 ms reconstructions in 44 of 45 patients (98%; 447.3 vs. 414.9 mm³, P < 0.001), with both calcified and noncalcified plaque volumes being higher in 42 of 45 patients (93%). The median diameter stenosis grades for all vessels were significantly higher in the 125 ms reconstructions (35.4% vs. 28.1%, P < 0.01). This study evaluated a novel DL-based tool with automatic case preparation for quantitative coronary plaque assessment in UHR CCTA data sets. The algorithm was technically robust and reproducible, delivering anatomically consistent outputs not requiring manual correction. Reconstructions with lower temporal resolution (125 ms) systematically overestimated plaque burden compared with higher temporal resolution (66 ms), underscoring that protocol standardization is essential for reliable DL-based plaque quantification.
