Latest Papers on Radiology AI. Tags: Benchmark SOTA

Data-efficient generalization of AI transformers for noise reduction in ultra-fast lung PET scans.

Wang J, Zhang X, Miao Y, Xue S, Zhang Y, Shi K, Guo R, Li B, Zheng G

•papers•Jul 1 2025

Respiratory motion during PET acquisition may produce lesion blurring. Ultra-fast 20-second breath-hold (U2BH) PET reduces respiratory motion artifacts, but the shortened scanning time increases statistical noise and may affect diagnostic quality. This study aims to denoise the U2BH PET images using a deep learning (DL)-based method. The study was conducted on two datasets collected from five scanners: the first dataset included 1272 retrospectively collected full-time PET data, while the second contained 46 prospectively collected U2BH and corresponding full-time PET/CT images. A robust and data-efficient DL method called mask vision transformer (Mask-ViT) was proposed which, after being fine-tuned on a limited amount of training data from a target scanner, was directly applied to unseen testing data from new scanners. The performance of Mask-ViT was compared with state-of-the-art DL methods including U-Net and C-Gan, taking the full-time PET images as the reference. Statistical analyses of image quality metrics were carried out with the Wilcoxon signed-rank test. For clinical evaluation, two readers scored image quality on a 5-point scale (5 = excellent) and provided a binary assessment of diagnostic quality. The U2BH PET images denoised by Mask-ViT showed statistically significant improvement over U-Net and C-Gan on image quality metrics (p < 0.05). For clinical evaluation, Mask-ViT exhibited lesion detection accuracies of 91.3%, 90.4% and 91.7% when evaluated on three different scanners. Mask-ViT can effectively enhance the quality of the U2BH PET images in a data-efficient generalization setup. The denoised images meet clinical diagnostic requirements of lesion detectability.
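The abstract compares methods on paired image quality metrics with the Wilcoxon signed-rank test. The sketch below shows how such a paired comparison might be computed; it is not the authors' analysis code. The function name `wilcoxon_signed_rank` and the metric values are hypothetical, and the sketch assumes a small sample with no zero or tied differences so that the exact null distribution can be enumerated.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Assumes no zero differences and no tied absolute differences, and a
    small n (the null distribution is enumerated over 2**n sign patterns).
    Returns (statistic, p_value), where statistic = min(W+, W-).
    """
    diffs = [a - b for a, b in zip(x, y)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied differences are not handled in this sketch")

    # Rank the absolute differences from 1 (smallest) to n (largest).
    rank = {value: i + 1 for i, value in enumerate(abs_sorted)}
    n = len(diffs)
    total = n * (n + 1) // 2
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    stat = min(w_plus, total - w_plus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2, so count sign patterns at least as extreme.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= stat:
            extreme += 1
    return stat, extreme / 2 ** n


# Hypothetical per-image quality scores for two denoising methods.
method_a = [30.1, 31.4, 29.8, 32.0, 30.6, 31.1, 29.5, 30.9]
method_b = [29.0, 30.0, 28.1, 31.8, 30.1, 30.2, 28.7, 29.4]
stat, p_value = wilcoxon_signed_rank(method_a, method_b)
print(f"W = {stat}, p = {p_value:.4f}")
```

For larger samples or data with ties, a statistics library with a tested implementation would be the safer choice.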

PET Reconstruction Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

The impact of multi-modality fusion and deep learning on adult age estimation based on bone mineral density.

Cao Y, Zhang J, Ma Y, Zhang S, Li C, Liu S, Chen F, Huang P

•papers•Jul 1 2025

Age estimation, especially in adults, presents substantial challenges in different contexts ranging from forensic to clinical applications. Bone mineral density (BMD), with its distinct age-related variations, has emerged as a critical marker in this domain. This study aims to enhance chronological age estimation accuracy using deep learning (DL) incorporating a multi-modality fusion strategy based on BMD. We conducted a retrospective analysis of 4296 CT scans from a Chinese population, covering August 2015 to November 2022, encompassing lumbar, femur, and pubis modalities. Our DL approach, integrating multi-modality fusion, was applied to predict chronological age automatically. The model's performance was evaluated using an internal real-world clinical cohort of 644 scans (December 2022 to May 2023) and an external cadaver validation cohort of 351 scans. In single-modality assessments, the lumbar modality excelled. However, multi-modality models demonstrated superior performance, evidenced by lower mean absolute errors (MAEs) and higher Pearson's R² values. The optimal multi-modality model exhibited outstanding R² values of 0.89 overall, 0.88 in females, and 0.90 in males, with MAEs of 4.05 overall, 3.69 in females, and 4.33 in males in the internal validation cohort. In the external cadaver validation, the model maintained favourable R² values (0.84 overall, 0.89 in females, 0.82 in males) and MAEs (5.01 overall, 4.71 in females, 5.09 in males), highlighting its generalizability across diverse scenarios. The integration of multi-modality fusion with DL significantly refines the accuracy of adult age estimation (AAE) based on BMD. The AI-based system effectively combines multi-modality BMD data, presenting a robust and innovative tool for accurate AAE that is poised to significantly improve both geriatric diagnostics and forensic investigations.
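The two metrics reported in this abstract, mean absolute error (MAE) and the square of Pearson's correlation coefficient, can be illustrated with a short sketch. The function names and the ages below are hypothetical and are not taken from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r_squared(y_true, y_pred):
    """Square of Pearson's correlation between true and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


# Hypothetical chronological ages and model predictions, in years.
true_age = [30, 40, 50, 60]
pred_age = [32, 38, 53, 59]
print(mean_absolute_error(true_age, pred_age))
print(round(pearson_r_squared(true_age, pred_age), 3))
```

Note that a squared Pearson correlation measures linear association only, so a model with a systematic offset can still score highly on it while having a large MAE, which is one reason to report both.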

CT Registration Musculoskeletal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Liver lesion segmentation in ultrasound: A benchmark and a baseline network.

Li J, Zhu L, Shen G, Zhao B, Hu Y, Zhang H, Wang W, Wang Q

•papers•Jul 1 2025

Accurate liver lesion segmentation in ultrasound is a challenging task due to high speckle noise, ambiguous lesion boundaries, and inhomogeneous intensity distribution inside the lesion regions. This work first collected and annotated a dataset for liver lesion segmentation in ultrasound. In this paper, we propose a novel convolutional neural network to learn dual self-attentive transformer features for boosting liver lesion segmentation by leveraging the complementary information among non-local features encoded at different layers of the transformer architecture. To do so, we devise a dual self-attention refinement (DSR) module to synergistically utilize self-attention and reverse self-attention mechanisms to extract complementary lesion characteristics between cascaded multi-layer feature maps, assisting the model to produce more accurate segmentation results. Moreover, we propose a False-Positive-Negative loss to enable our network to further suppress the non-liver-lesion noise at shallow transformer layers and enhance more target liver lesion details into CNN features at deep transformer layers. Experimental results show that our network outperforms state-of-the-art methods quantitatively and qualitatively.

Ultrasound Segmentation Abdominal Methodology In Silico Academic Lab Open Dataset Benchmark SOTA

CQENet: A segmentation model for nasopharyngeal carcinoma based on confidence quantitative evaluation.

Qi Y, Wei L, Yang J, Xu J, Wang H, Yu Q, Shen G, Cao Y

•papers•Jul 1 2025

Accurate segmentation of the tumor regions of nasopharyngeal carcinoma (NPC) is of significant importance for radiotherapy of NPC. However, the precision of existing automatic segmentation methods for NPC remains inadequate, primarily manifested in the difficulty of tumor localization and the challenges in delineating blurred boundaries. Additionally, the black-box nature of deep learning models leads to insufficient quantification of the confidence in the results, preventing users from directly understanding the model's confidence in its predictions, which severely impacts the clinical application of deep learning models. This paper proposes an automatic segmentation model for NPC based on confidence quantitative evaluation (CQENet). To address the issue of insufficient confidence quantification in NPC segmentation results, we introduce a confidence assessment module (CAM) that enables the model to output not only the segmentation results but also the confidence in those results, aiding users in understanding the uncertainty risks associated with model outputs. To address the difficulty in localizing the position and extent of tumors, we propose a tumor feature adjustment module (FAM) for precise tumor localization and extent determination. To address the challenge of delineating blurred tumor boundaries, we introduce a variance attention mechanism (VAM) to assist in edge delineation during fine segmentation. We conducted experiments on a multicenter NPC dataset, validating that our proposed method is effective and superior to existing state-of-the-art models, possessing considerable clinical application value.

CT Segmentation Neurological Methodology In Silico Academic Lab Benchmark SOTA

Interstitial-guided automatic clinical tumor volume segmentation network for cervical cancer brachytherapy.

Tan S, He J, Cui M, Gao Y, Sun D, Xie Y, Cai J, Zaki N, Qin W

•papers•Jul 1 2025

Automatic clinical tumor volume (CTV) delineation is pivotal to improving outcomes of interstitial brachytherapy for cervical cancer. However, the prominent differences in gray values caused by the interstitial needles pose great challenges for deep learning-based segmentation models. In this study, we propose a novel interstitial-guided segmentation network, termed advance reverse guided network (ARGNet), for cervical tumor segmentation with interstitial brachytherapy. First, the location information of interstitial needles is integrated into the deep learning framework via multi-task learning, using a cross-stitch approach to share encoder feature learning. Second, a spatial reverse attention mechanism is introduced to mitigate the distracting effect of needles on tumor segmentation. Furthermore, an uncertainty area module is embedded between the skip connections and the encoder of the tumor segmentation task to enhance the model's capability to discern ambiguous boundaries between the tumor and the surrounding tissue. Comprehensive experiments were conducted retrospectively on 191 CT scans acquired under multi-course interstitial brachytherapy. The experimental results demonstrate that the characteristics of interstitial needles help enhance the segmentation, achieving state-of-the-art performance, which is anticipated to be beneficial in radiotherapy planning.

CT Segmentation Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

CMT-FFNet: A CMT-based feature-fusion network for predicting TACE treatment response in hepatocellular carcinoma.

Wang S, Zhao Y, Cai X, Wang N, Zhang Q, Qi S, Yu Z, Liu A, Yao Y

•papers•Jun 30 2025

Accurately and preoperatively predicting tumor response to transarterial chemoembolization (TACE) treatment is crucial for individualized treatment decision-making in hepatocellular carcinoma (HCC). In this study, we propose a novel feature fusion network based on the Convolutional Neural Networks Meet Vision Transformers (CMT) architecture, termed CMT-FFNet, to predict TACE efficacy using preoperative multiphase Magnetic Resonance Imaging (MRI) scans. The CMT-FFNet combines local feature extraction with global dependency modeling through attention mechanisms, enabling the extraction of complementary information from multiphase MRI data. Additionally, we introduce an orthogonality loss to optimize the fusion of imaging and clinical features, further enhancing the complementarity of cross-modal features. Moreover, visualization techniques were employed to highlight key regions contributing to model decisions. Extensive experiments were conducted to evaluate the effectiveness of the proposed modules and network architecture. Experimental results demonstrate that our model effectively captures latent correlations among features extracted from multiphase MRI data and multimodal inputs, significantly improving the prediction performance of TACE treatment response in HCC patients.
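The abstract mentions an orthogonality loss for fusing imaging and clinical features but does not define it. As an illustration only, one common way to encourage two feature sets to carry complementary information is to penalize the squared cosine similarity between paired embeddings. The sketch below shows that generic idea; the function names are hypothetical, the embeddings are assumed to share one dimensionality, and this should not be read as the loss used in CMT-FFNet.

```python
import math


def squared_cosine(u, v):
    """Squared cosine similarity between two vectors (0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    cosine = dot / (norm_u * norm_v)
    return cosine * cosine


def orthogonality_penalty(image_feats, clinical_feats):
    """Mean squared cosine similarity over paired per-patient embeddings.

    The value is 0 when every pair is orthogonal and 1 when every pair
    is parallel, so minimizing it pushes the two embeddings apart.
    """
    pairs = list(zip(image_feats, clinical_feats))
    return sum(squared_cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical 2-D embeddings for two patients.
image_feats = [[1.0, 0.0], [1.0, 1.0]]
clinical_feats = [[0.0, 2.0], [1.0, 1.0]]
print(orthogonality_penalty(image_feats, clinical_feats))
```

In a training setup, a term like this would typically be added to the classification loss with a weighting factor, which is again an assumption rather than a detail from the abstract.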

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Multicenter Evaluation of Interpretable AI for Coronary Artery Disease Diagnosis from PET Biomarkers

Zhang, W., Kwiecinski, J., Shanbhag, A., Miller, R. J., Ramirez, G., Yi, J., Han, D., Dey, D., Grodecka, D., Grodecki, K., Lemley, M., Kavanagh, P., Liang, J. X., Zhou, J., Builoff, V., Hainer, J., Carre, S., Barrett, L., Einstein, A. J., Knight, S., Mason, S., Le, V., Acampa, W., Wopperer, S., Chareonthaitawee, P., Berman, D. S., Di Carli, M. F., Slomka, P.

•preprint•Jun 30 2025

Background: Positron emission tomography (PET)/CT for myocardial perfusion imaging (MPI) provides multiple imaging biomarkers, often evaluated separately. We developed an artificial intelligence (AI) model integrating key clinical PET MPI parameters to improve the diagnosis of obstructive coronary artery disease (CAD). Methods: From 17,348 patients undergoing cardiac PET/CT across four sites, we retrospectively enrolled 1,664 subjects who had invasive coronary angiography within 180 days and no prior CAD. Deep learning was used to derive coronary artery calcium score (CAC) from CT attenuation correction maps. An XGBoost machine learning model was developed using data from one site to detect CAD, defined as left main stenosis ≥50% or ≥70% in other arteries. The model utilized 10 image-derived parameters from clinical practice: CAC, stress/rest left ventricular ejection fraction, stress myocardial blood flow (MBF), myocardial flow reserve (MFR), ischemic and stress total perfusion deficit (TPD), transient ischemic dilation ratio, rate pressure product, and sex. Generalizability was evaluated in the remaining three sites (chosen to maximize testing power and capture inter-site variability), and model performance was compared with quantitative analyses using the area under the receiver operating characteristic curve (AUC). Patient-specific predictions were explained using Shapley additive explanations. Results: CAD prevalence was 61% in the training set (n=386) and 53% in the external testing set (n=1,278). In the external evaluation, the AI model achieved a higher AUC (0.83 [95% confidence interval (CI): 0.81-0.85]) compared to the clinical score by experienced physicians (0.80 [0.77-0.82], p=0.02), ischemic TPD (0.79 [0.77-0.82], p<0.001), MFR (0.75 [0.72-0.78], p<0.001), and CAC (0.69 [0.66-0.72], p<0.001). Model performance was consistent across sex, body mass index, and age groups. 
The top features driving the prediction were stress/ischemic TPD, CAC, and MFR. Conclusion: AI integrating perfusion, flow, and CAC scoring improves PET MPI diagnostic accuracy, offering automated and interpretable predictions for CAD diagnosis.
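The comparisons in this abstract are expressed as AUC values. As a reminder of what that number measures, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name, labels and scores are hypothetical; this is not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 (1 = disease present).
    scores: model outputs, where higher means more suspicious.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical CAD labels and model scores for six patients.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(round(roc_auc(labels, scores), 3))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; rank-based implementations are preferable for cohorts the size of the one described above.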

PET Classification Cardiac Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Multimodal, Multi-Disease Medical Imaging Foundation Model (MerMED-FM)

Yang Zhou, Chrystie Wan Ning Quek, Jun Zhou, Yan Wang, Yang Bai, Yuhe Ke, Jie Yao, Laura Gutierrez, Zhen Ling Teo, Darren Shu Jeng Ting, Brian T. Soetikno, Christopher S. Nielsen, Tobias Elze, Zengxiang Li, Linh Le Dinh, Lionel Tim-Ee Cheng, Tran Nguyen Tuan Anh, Chee Leong Cheng, Tien Yin Wong, Nan Liu, Iain Beehuat Tan, Tony Kiat Hon Lim, Rick Siow Mong Goh, Yong Liu, Daniel Shu Wei Ting

•preprint•Jun 30 2025

Current artificial intelligence models for medical imaging are predominantly single modality and single disease. Attempts to create multimodal and multi-disease models have resulted in inconsistent clinical accuracy. Furthermore, training these models typically requires large, labour-intensive, well-labelled datasets. We developed MerMED-FM, a state-of-the-art multimodal, multi-specialty foundation model trained using self-supervised learning and a memory module. MerMED-FM was trained on 3.3 million medical images from over ten specialties and seven modalities, including computed tomography (CT), chest X-rays (CXR), ultrasound (US), pathology patches, color fundus photography (CFP), optical coherence tomography (OCT) and dermatology images. MerMED-FM was evaluated across multiple diseases and compared against existing foundational models. Strong performance was achieved across all modalities, with AUROCs of 0.988 (OCT); 0.982 (pathology); 0.951 (US); 0.943 (CT); 0.931 (skin); 0.894 (CFP); 0.858 (CXR). MerMED-FM has the potential to be a highly adaptable, versatile, cross-specialty foundation model that enables robust medical imaging interpretation across diverse medical disciplines.

Mixed Modality Classification Whole Body Methodology In Silico Academic Lab Benchmark SOTA GenAI

Leveraging Representation Learning for Bi-parametric Prostate MRI to Disambiguate PI-RADS 3 and Improve Biopsy Decision Strategies.

Umapathy L, Johnson PM, Dutt T, Tong A, Chopra S, Sodickson DK, Chandarana H

•papers•Jun 30 2025

Despite its high negative predictive value (NPV) for clinically significant prostate cancer (csPCa), MRI suffers from a substantial number of false positives, especially for intermediate-risk cases. In this work, we determine whether a deep learning model trained with PI-RADS-guided representation learning can disambiguate the PI-RADS 3 classification, detect csPCa from bi-parametric prostate MR images, and avoid unnecessary benign biopsies. This study included 28,263 MR examinations and radiology reports from 21,938 men imaged for known or suspected prostate cancer between 2015 and 2023 at our institution (21 imaging locations with 34 readers), with 6352 subsequent biopsies. We trained a deep learning model, a representation learner (RL), to learn how radiologists interpret conventionally acquired T2-weighted and diffusion-weighted MR images, using exams in which the radiologists are confident in their risk assessments (PI-RADS 1 and 2 for the absence of csPCa vs. PI-RADS 4 and 5 for the presence of csPCa, n=21,465). We then trained biopsy-decision models to detect csPCa (Gleason score ≥7) using these learned image representations, and compared them to the performance of radiologists, and of models trained on other clinical variables (age, prostate volume, PSA, and PSA density) for treatment-naïve test cohorts consisting of only PI-RADS 3 (n=253, csPCa=103) and all PI-RADS (n=531, csPCa=300) cases. On the 2 test cohorts (PI-RADS-3-only, all-PI-RADS), RL-based biopsy-decision models consistently yielded higher AUCs in detecting csPCa (AUC=0.73 [0.66, 0.79], 0.88 [0.85, 0.91]) compared with radiologists (equivocal, AUC=0.79 [0.75, 0.83]) and the clinical model (AUCs=0.69 [0.62, 0.75], 0.78 [0.74, 0.82]). 
In the PI-RADS-3-only cohort, all of whom would be biopsied using our institution's standard of care, the RL decision model avoided 41% (62/150) of benign biopsies compared with the clinical model (26%, P<0.001), and improved biopsy yield by 10 percentage points compared with the PI-RADS ≥3 decision strategy (0.50 vs. 0.40). Furthermore, on the all-PI-RADS cohort, the RL decision model avoided 60% of benign biopsies (138/231), 27 percentage points more than radiologists (33%, P<0.001), with comparable sensitivity (93% vs. 92%), higher NPV (0.87 vs. 0.77), and higher biopsy yield (0.75 vs. 0.64). The combination of clinical and RL decision models further avoided benign biopsies (46% in PI-RADS-3-only and 62% in all-PI-RADS) while improving NPV (0.82, 0.88) and biopsy yields (0.52, 0.76) across the 2 test cohorts. Our PI-RADS-guided deep learning RL model learns summary representations from bi-parametric prostate MR images that can provide additional information to disambiguate intermediate-risk PI-RADS 3 assessments. The resulting RL-based biopsy decision models also outperformed radiologists in avoiding benign biopsies while maintaining comparable sensitivity to csPCa for the all-PI-RADS cohort. Such AI models can easily be integrated into clinical practice to supplement radiologists' reads in general and improve biopsy yield for any equivocal decisions.
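The avoided-biopsy fractions in this abstract follow from the cohort sizes it reports. The sketch below reproduces that arithmetic: benign counts are derived as cohort size minus csPCa cases, and the avoided counts (62 and 138) are taken from the text. The helper functions `biopsy_yield` and `npv` use the standard definitions; the counts passed to them in the final lines are hypothetical, since the abstract does not give full confusion matrices.

```python
def fraction(numerator, denominator):
    """Plain ratio with a guard against an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def biopsy_yield(true_positives, false_positives):
    """Share of performed biopsies that find csPCa."""
    return fraction(true_positives, true_positives + false_positives)


def npv(true_negatives, false_negatives):
    """Share of avoided biopsies that were truly benign."""
    return fraction(true_negatives, true_negatives + false_negatives)


# Cohort sizes and csPCa counts as reported in the abstract.
pirads3_benign = 253 - 103   # PI-RADS-3-only cohort: 150 benign cases
all_benign = 531 - 300       # all-PI-RADS cohort: 231 benign cases

avoided_pirads3 = fraction(62, pirads3_benign)   # about 41%
avoided_all = fraction(138, all_benign)          # about 60%
print(f"{avoided_pirads3:.1%} and {avoided_all:.1%} of benign biopsies avoided")

# Hypothetical counts, only to show how the two helpers are used.
print(biopsy_yield(true_positives=50, false_positives=50))
print(npv(true_negatives=90, false_negatives=10))
```

The second fraction, 138/231, is roughly 60%, which matches the radiologists' 33% plus the 27 additional percentage points quoted in the abstract.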

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Radiation Dose Reduction and Image Quality Improvement of UHR CT of the Neck by Novel Deep-learning Image Reconstruction.

Messerle DA, Grauhan NF, Leukert L, Dapper AK, Paul RH, Kronfeld A, Al-Nawas B, Krüger M, Brockmann MA, Othman AE, Altmann S

•papers•Jun 30 2025

We evaluated a dedicated dose-reduced ultra-high-resolution (UHR) CT protocol for head and neck imaging, combined with a novel deep learning reconstruction algorithm, to assess its impact on image quality and radiation exposure. This was a retrospective analysis of ninety-eight consecutive patients examined using a new body-weight-adapted protocol. Images were reconstructed using adaptive iterative dose reduction (AIDR) and the advanced intelligent Clear-IQ engine with an already established (DL-1) and a newly implemented reconstruction algorithm (DL-2). An additional thirty patients were scanned without body-weight-adapted dose reduction (DL-1-SD). Three readers rated subjective image quality, including overall quality and the assessment of several anatomic regions. For objective image quality, signal-to-noise ratio and contrast-to-noise ratio were calculated for the temporalis and masseteric muscles and the floor of the mouth. Radiation dose was evaluated by comparing the computed tomography dose index (CTDIvol) values. Deep learning-based reconstruction algorithms significantly improved subjective image quality (diagnostic acceptability: DL-1 vs AIDR OR of 25.16 [6.30;38.85], p < 0.001 and DL-2 vs AIDR 720.15 [410.14;> 999.99], p < 0.001). Although higher doses (DL-1-SD) resulted in significantly enhanced image quality, DL-2 demonstrated significant superiority over all other techniques across all defined parameters (p < 0.001). Similar results were demonstrated for objective image quality, e.g. image noise (DL-1 vs AIDR OR of 19.0 [11.56;31.24], p < 0.001 and DL-2 vs AIDR > 999.9 [825.81;> 999.99], p < 0.001). Using weight-adapted kV reduction, very low radiation doses could be achieved (CTDIvol: 7.4 ± 4.2 mGy). AI-based reconstruction algorithms in ultra-high resolution head and neck imaging provide excellent image quality while achieving very low radiation exposure.
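The abstract reports signal-to-noise ratio (SNR) and contrast-to-noise ratio (CNR) without stating the formulas used. Definitions vary between studies, so the sketch below assumes one common convention: SNR as the mean attenuation of a region of interest (ROI) divided by its standard deviation, and CNR as the absolute difference between two ROI means divided by the standard deviation of a noise ROI. The function names and Hounsfield unit values are hypothetical.

```python
import statistics


def snr(roi_values):
    """SNR as mean over sample standard deviation of one ROI (assumed definition)."""
    return statistics.mean(roi_values) / statistics.stdev(roi_values)


def cnr(roi_a, roi_b, noise_roi):
    """CNR as |mean(a) - mean(b)| over the noise ROI's standard deviation."""
    contrast = abs(statistics.mean(roi_a) - statistics.mean(roi_b))
    return contrast / statistics.stdev(noise_roi)


# Hypothetical attenuation samples in Hounsfield units.
muscle_roi = [98, 100, 102]
floor_of_mouth_roi = [58, 60, 62]
print(snr(muscle_roi))
print(cnr(muscle_roi, floor_of_mouth_roi, noise_roi=muscle_roi))
```

Because lower image noise shrinks the denominator in both ratios, a reconstruction that suppresses noise raises SNR and CNR even when mean attenuation is unchanged.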

CT Reconstruction Neurological Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA
