Latest Papers on Radiology AI. Tags: Classification

Preoperative Assessment of Lymph Node Metastasis in Rectal Cancer Using Deep Learning: Investigating the Utility of Various MRI Sequences.

Zhao J, Zheng P, Xu T, Feng Q, Liu S, Hao Y, Wang M, Zhang C, Xu J

•papers•Jun 24 2025

This study aimed to develop a deep learning (DL) model based on three-dimensional multi-parametric magnetic resonance imaging (mpMRI) for preoperative assessment of lymph node metastasis (LNM) in rectal cancer (RC) and to investigate the contribution of different MRI sequences. A total of 613 eligible patients with RC from four medical centres who underwent preoperative mpMRI were retrospectively enrolled and randomly assigned to training (n = 372), validation (n = 106), internal test (n = 88) and external test (n = 47) cohorts. A multi-parametric multi-scale EfficientNet (MMENet) was designed to effectively extract LNM-related features from mpMRI for preoperative LNM assessment. Its performance was compared with other DL models and radiologists using the area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity, specificity and average precision, each with a 95% confidence interval (CI). To investigate the utility of various MRI sequences, the performance of the mono-parametric model and of the MMENet with different sequence combinations as input was compared. The MMENet using a combination of T2WI, DWI and DCE sequences achieved an AUC of 0.808 (95% CI 0.720-0.897) with an ACC of 71.6% (95% CI 62.3-81.0) in the internal test cohort and an AUC of 0.782 (95% CI 0.636-0.925) with an ACC of 76.6% (95% CI 64.6-88.6) in the external test cohort, outperforming the mono-parametric model, the MMENet with other sequence combinations and the radiologists. The MMENet, leveraging a combination of T2WI, DWI and DCE sequences, can accurately assess LNM in RC preoperatively and holds great promise for automated evaluation of LNM in clinical practice.
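The abstract above reports AUCs with 95% confidence intervals but does not say how the intervals were obtained. As a rough illustration only, the sketch below computes an AUC as a pairwise ranking statistic and attaches a percentile bootstrap interval. The bootstrap choice, the function names and the toy scores are all assumptions, not details from the study.

```python
import random

def auc(labels, scores):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and recompute the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical toy scores, not data from the study.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.60]
point = auc(toy_labels, toy_scores)
low, high = bootstrap_ci(toy_labels, toy_scores)
print(f"AUC {point:.3f} (95% CI {low:.3f}-{high:.3f})")
```

With a test cohort as small as the external one (n = 47), an interval computed this way would be expected to be wide, which is consistent with the wider external-cohort interval quoted in the abstract.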

MRI Classification Abdominal Retrospective Clinical In Silico Academic Lab

General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

Jakob Ambsdorf, Asbjørn Munk, Sebastian Llambias, Anders Nymark Christensen, Kamil Mikolaj, Randall Balestriero, Martin Tolsgaard, Aasa Feragen, Mads Nielsen

•preprint•Jun 24 2025

With access to large-scale, unlabeled medical datasets, researchers are confronted with two questions: Should they attempt to pretrain a custom foundation model on this medical data, or use transfer learning from an existing generalist model? And, if a custom model is pretrained, are novel methods required? In this paper we explore these questions by conducting a case study, in which we train a foundation model on a large regional fetal ultrasound dataset of 2M images. By selecting the well-established DINOv2 method for pretraining, we achieve state-of-the-art results on three fetal ultrasound datasets, covering data from different countries, classification, segmentation, and few-shot tasks. We compare against a series of models pretrained on natural images, ultrasound images, and supervised baselines. Our results demonstrate two key insights: (i) Pretraining on custom data is worth it, even if smaller models are trained on less data, as scaling in natural image pretraining does not translate to ultrasound performance. (ii) Well-tuned methods from computer vision are making it feasible to train custom foundation models for a given medical domain, requiring no hyperparameter tuning and little methodological adaptation. Given these findings, we argue that a bias towards methodological innovation should be avoided when developing domain-specific foundation models under common computational resource constraints.

Ultrasound Classification Abdominal Methodology In Silico Academic Lab Benchmark SOTA

Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance

Xuesong Li, Dianye Huang, Yameng Zhang, Nassir Navab, Zhongliang Jiang

•preprint•Jun 24 2025

Understanding medical ultrasound imaging remains a long-standing challenge due to significant visual variability caused by differences in imaging and acquisition parameters. Recent advancements in large language models (LLMs) have been used to automatically generate terminology-rich summaries oriented toward clinicians with sufficient physiological knowledge. Nevertheless, the increasing demand for improved ultrasound interpretability and basic scanning guidance among non-expert users, e.g., in point-of-care settings, has not yet been explored. In this study, we first introduce the scene graph (SG) for ultrasound images to explain image content to ordinary users and provide guidance for ultrasound scanning. The ultrasound SG is first computed using a transformer-based one-stage method, eliminating the need for explicit object detection. To generate a graspable image explanation for ordinary users, the user query is then used to further refine the abstract SG representation through LLMs. Additionally, the predicted SG is explored for its potential in guiding ultrasound scanning toward missing anatomies within the current imaging view, assisting ordinary users in achieving more standardized and complete anatomical exploration. The effectiveness of this SG-based image explanation and scanning guidance has been validated on images from the left and right neck regions, including the carotid and thyroid, across five volunteers. The results demonstrate the potential of the method to maximally democratize ultrasound by enhancing its interpretability and usability for ordinary users.

Ultrasound Classification Vascular Methodology Prototype Academic Lab GenAI

Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks

Ankita Raj, Harsh Swaika, Deepankar Varma, Chetan Arora

•preprint•Jun 24 2025

The success of deep learning in medical imaging applications has led several companies to deploy proprietary models in diagnostic workflows, offering monetized services. Even though model weights are hidden to protect the intellectual property of the service provider, these models are exposed to model stealing (MS) attacks, where adversaries can clone the model's functionality by querying it with a proxy dataset and training a thief model on the acquired predictions. While extensively studied on general vision tasks, the susceptibility of medical imaging models to MS attacks remains inadequately explored. This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic conditions where the adversary lacks access to the victim model's training data and operates with limited query budgets. We demonstrate that adversaries can effectively execute MS attacks by using publicly available datasets. To further enhance MS capabilities with limited query budgets, we propose a two-step model stealing approach termed QueryWise. This method capitalizes on unlabeled data obtained from a proxy distribution to train the thief model without incurring additional queries. Evaluation on two medical imaging models for Gallbladder Cancer and COVID-19 classification substantiates the effectiveness of the proposed attack. The source code is available at https://github.com/rajankita/QueryWise.
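The general model stealing loop described above (query a black box with proxy data, then train a thief on the returned predictions) can be shown with a deliberately tiny toy. This is not QueryWise and not the authors' code: the one-dimensional victim, its hidden threshold, the evenly spaced proxy pool and every function name are hypothetical and chosen only to make the query-budget trade-off visible.

```python
def victim_predict(x: float) -> int:
    """Hypothetical black-box victim; the attacker only ever sees its output label."""
    hidden_threshold = 0.62  # unknown to the attacker
    return 1 if x > hidden_threshold else 0

def train_thief(query_fn, proxy_points):
    """Label proxy points by querying the victim, then fit a 1-D threshold thief."""
    labelled = [(x, query_fn(x)) for x in proxy_points]  # one query per proxy point
    zeros = [x for x, y in labelled if y == 0]
    ones = [x for x, y in labelled if y == 1]
    if not zeros or not ones:
        raise ValueError("queries must elicit both classes")
    threshold = (max(zeros) + min(ones)) / 2  # midpoint of the observed label flip
    return lambda x: 1 if x > threshold else 0

def agreement(thief, victim, points):
    """Fraction of points on which the thief reproduces the victim's label."""
    return sum(thief(x) == victim(x) for x in points) / len(points)

def proxy_pool(budget):
    """Evenly spaced stand-in for a public proxy dataset of size `budget`."""
    return [i / (budget - 1) for i in range(budget)]

held_out = [i / 1000 for i in range(1000)]
for budget in (11, 51):
    thief = train_thief(victim_predict, proxy_pool(budget))
    print(budget, agreement(thief, victim_predict, held_out))
```

Even in this toy, a larger query budget narrows the gap around the hidden decision boundary, which is the cost the abstract says the proposed method tries to reduce by also using unlabeled proxy data.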

Mixed Modality Classification Abdominal Methodology In Silico Academic Lab Open Code

Multimodal Deep Learning Based on Ultrasound Images and Clinical Data for Better Ovarian Cancer Diagnosis.

Su C, Miao K, Zhang L, Yu X, Guo Z, Li D, Xu M, Zhang Q, Dong X

•papers•Jun 24 2025

This study aimed to develop and validate a multimodal deep learning model that leverages 2D grayscale ultrasound (US) images alongside readily available clinical data to improve diagnostic performance for ovarian cancer (OC). A retrospective analysis was conducted involving 1899 patients who underwent preoperative US examinations and subsequent surgeries for adnexal masses between 2019 and 2024. A multimodal deep learning model was constructed for OC diagnosis and for extracting US morphological features from the images. The model's performance was evaluated using metrics such as receiver operating characteristic (ROC) curves, accuracy, and F1 score. The multimodal deep learning model exhibited superior performance compared to the image-only model, achieving areas under the curve (AUCs) of 0.9393 (95% CI 0.9139-0.9648) and 0.9317 (95% CI 0.9062-0.9573) in the internal and external test sets, respectively. The model significantly improved the AUCs for OC diagnosis by radiologists and enhanced inter-reader agreement. Regarding US morphological feature extraction, the model demonstrated robust performance, attaining accuracies of 86.34% and 85.62% in the internal and external test sets, respectively. Multimodal deep learning has the potential to enhance the diagnostic accuracy and consistency of radiologists in identifying OC. The model's effective feature extraction from ultrasound images underscores the capability of multimodal deep learning to automate the generation of structured ultrasound reports.

Ultrasound Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Non-invasive prediction of NSCLC immunotherapy efficacy and tumor microenvironment through unsupervised machine learning-driven CT Radiomic subtypes: a multi-cohort study.

Guo Y, Gong B, Li Y, Mo P, Chen Y, Fan Q, Sun Q, Miao L, Li Y, Liu Y, Tan W, Yang L, Zheng C

•papers•Jun 24 2025

Radiomics analyzes quantitative features from medical images to reveal tumor heterogeneity, offering new insights for diagnosis, prognosis, and treatment prediction. This study explored radiomics-based biomarkers to predict immunotherapy response and its association with the tumor microenvironment in non-small cell lung cancer (NSCLC) using unsupervised machine learning models derived from CT imaging. This study included 1539 NSCLC patients from seven independent cohorts. For 1834 radiomic features extracted from 869 NSCLC patients, K-means unsupervised clustering was applied to identify radiomic subtypes. A random forest model extended subtype classification to external cohorts, and its accuracy, sensitivity, and specificity were evaluated. Bulk RNA sequencing (RNA-seq) and single-cell transcriptome sequencing (scRNA-seq) of tumors were used to characterize the tumor immune microenvironment and to evaluate the association between radiomic subtypes and immunotherapy efficacy, immune scores, and immune cell infiltration. Unsupervised clustering stratified NSCLC patients into two subtypes (Cluster 1 and Cluster 2). Principal component analysis confirmed significant distinctions between subtypes across all cohorts. Cluster 2 exhibited significantly longer median overall survival (35 vs. 30 months, P = 0.006) and progression-free survival (19 vs. 16 months, P = 0.020) compared to Cluster 1. Multivariate Cox regression identified radiomic subtype as an independent predictor of overall survival (HR: 0.738, 95% CI 0.583-0.935, P = 0.012), validated in two external cohorts. Bulk RNA-seq showed elevated interaction signaling and immune scores in Cluster 2, and scRNA-seq demonstrated higher proportions of T cells, B cells, and NK cells in Cluster 2. This study establishes a radiomic subtype associated with NSCLC immunotherapy efficacy and tumor immune microenvironment. The findings provide a non-invasive tool for personalized treatment, enabling early identification of immunotherapy-responsive patients and optimized therapeutic strategies.
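The hazard ratio, confidence interval and P value quoted above can be checked against one another with a little arithmetic. The sketch below assumes the interval is a standard Wald interval on the log-hazard scale, which the abstract does not state; the function name is hypothetical.

```python
import math

def wald_p_from_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Two-sided Wald p-value implied by a hazard ratio and its 95% CI."""
    log_hr = math.log(hr)
    # A Wald CI is symmetric on the log scale: log(HR) +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Values quoted in the abstract for radiomic subtype and overall survival.
z, p = wald_p_from_ci(0.738, 0.583, 0.935)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under that assumption the implied z statistic is about -2.52 and the implied P value is about 0.012, in line with the reported figure.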

CT Classification Chest Retrospective Clinical In Silico Academic Lab Benchmark SOTA

Comprehensive predictive modeling in subarachnoid hemorrhage: integrating radiomics and clinical variables.

Urbanos G, Castaño-León AM, Maldonado-Luna M, Salvador E, Ramos A, Lechuga C, Sanz C, Juárez E, Lagares A

•papers•Jun 24 2025

Subarachnoid hemorrhage (SAH) is a severe condition with high morbidity and long-term neurological consequences. Radiomics, by extracting quantitative features from computed tomography (CT) scans, may reveal imaging biomarkers predictive of outcomes. This study evaluates the predictive value of radiomics in SAH for multiple outcomes and compares its performance to models based on clinical data. Radiomic features were extracted from admission CTs using segmentations of brain tissue (white and gray matter) and hemorrhage. Machine learning models with cross-validation were trained using clinical data, radiomics, or both, to predict 6-month mortality, Glasgow Outcome Scale (GOS), vasospasm, and long-term hydrocephalus. SHapley Additive exPlanations (SHAP) analysis was used to interpret feature contributions. The training dataset included 403 aneurysmal SAH patients; GOS predictions used all patients, while vasospasm and hydrocephalus predictions excluded those with incomplete data or early death, leaving 328 and 332 patients, respectively. Radiomics and clinical models demonstrated comparable performance, achieving validation-set AUCs above 85% for six-month mortality and clinical outcome, and 75% and 86% for vasospasm and hydrocephalus, respectively. In an independent cohort of 41 patients, the combined models yielded AUCs of 89% for mortality, 87% for clinical outcome, 66% for vasospasm, and 72% for hydrocephalus. SHAP analysis highlighted significant contributions of radiomic features from brain tissue and hemorrhage segmentation, alongside key clinical variables, in predicting SAH outcomes. This study underscores the potential of radiomics-based approaches for SAH outcome prediction, demonstrating predictive power comparable to traditional clinical models and enhancing understanding of SAH-related complications. Clinical trial number: not applicable.
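SHAP attributions are built on Shapley values, and for a handful of features those can be computed exactly by averaging each feature's marginal contribution over every ordering. The sketch below does that for a made-up three-feature risk score. The score, the feature names, the patient and the single-reference baseline are all hypothetical, and practical SHAP tooling typically uses approximations and a background dataset instead.

```python
from itertools import permutations

def shapley_values(value_fn, x, baseline):
    """Exact Shapley values: average marginal contributions over all feature orders."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start with every feature at its baseline
        prev = value_fn(current)
        for j in order:
            current[j] = x[j]         # reveal feature j's actual value
            new = value_fn(current)
            phi[j] += new - prev      # marginal contribution of j in this order
            prev = new
    return [total / len(orders) for total in phi]

def toy_risk(features):
    """Hypothetical risk score: two additive terms plus one interaction."""
    age, bleed_volume, texture = features
    return 0.02 * age + 0.5 * bleed_volume + 0.1 * bleed_volume * texture

x = [70.0, 2.0, 3.0]         # hypothetical patient
baseline = [60.0, 1.0, 1.0]  # hypothetical reference values
phi = shapley_values(toy_risk, x, baseline)
print(phi)
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is split between the two features that share it.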

CT Classification Neurological Retrospective Clinical In Silico Academic Lab

DeepSeek-assisted LI-RADS classification: AI-driven precision in hepatocellular carcinoma diagnosis.

Zhang J, Liu J, Guo M, Zhang X, Xiao W, Chen F

•papers•Jun 24 2025

The clinical utility of the DeepSeek-V3 (DSV3) model in enhancing the accuracy of Liver Imaging Reporting and Data System (LI-RADS, LR) classification remains underexplored. This study aimed to evaluate the diagnostic performance of DSV3 in LR classifications compared to radiologists with varying levels of experience and to assess its potential as a decision-support tool in clinical practice. A dual-phase retrospective-prospective study analyzed 426 liver lesions (300 retrospective, 126 prospective) in high-risk HCC patients who underwent magnetic resonance imaging (MRI) or computed tomography (CT). Three radiologists (one junior, two senior) independently classified lesions using LR v2018 criteria, while DSV3 analyzed unstructured radiology reports to generate corresponding classifications. In the prospective cohort, DSV3 processed inputs in both Chinese and English to evaluate language impact. Performance was compared using the chi-square test or Fisher's exact test, with pathology as the gold standard. In the retrospective cohort, DSV3 significantly outperformed junior radiologists in diagnostically challenging categories: LR-3 (17.8% vs. 39.7%, p<0.05), LR-4 (80.4% vs. 46.2%, p<0.05), and LR-5 (86.2% vs. 66.7%, p<0.05), while showing comparable accuracy in LR-1 (90.8% vs. 88.7%), LR-2 (11.9% vs. 25.6%), and LR-M (79.5% vs. 62.1%) classifications (all p>0.05). Prospective validation confirmed these findings, with DSV3 demonstrating superior performance for LR-3 (13.3% vs. 60.0%), LR-4 (93.3% vs. 66.7%), and LR-5 (93.5% vs. 67.7%) compared to junior radiologists (all p<0.05). Notably, DSV3 achieved diagnostic parity with senior radiologists across all categories (p>0.05) and maintained consistent performance between Chinese and English inputs. The DSV3 model effectively improves diagnostic accuracy of LR-3 to LR-5 classifications among junior radiologists. Its language-independent performance and ability to match senior-level expertise suggest strong potential for clinical implementation to standardize HCC diagnosis and optimize treatment decisions.

Mixed Modality Classification Abdominal Retrospective Clinical Clinical Pilot Academic Lab Benchmark SOTA

Machine learning-based construction and validation of an radiomics model for predicting ISUP grading in prostate cancer: a multicenter radiomics study based on [68Ga]Ga-PSMA PET/CT.

Zhang H, Jiang X, Yang G, Tang Y, Qi L, Chen M, Hu S, Gao X, Zhang M, Chen S, Cai Y

•papers•Jun 24 2025

The International Society of Urological Pathology (ISUP) grading of prostate cancer (PCa) is a crucial factor in the management and treatment planning for PCa patients. An accurate and non-invasive assessment of the ISUP grading group could significantly improve biopsy decisions and treatment planning. The use of PSMA-PET/CT radiomics for predicting ISUP has not been widely studied. The aim of this study is to investigate the role of 68Ga-PSMA PET/CT radiomics in predicting the ISUP grading of primary PCa. This study included 415 PCa patients who underwent 68Ga-PSMA PET/CT scans before prostate biopsy or radical prostatectomy. Patients were from three centers: Xiangya Hospital, Central South University (252 cases), Qilu Hospital of Shandong University (External Validation 1, 108 cases), and Qingdao University Medical College (External Validation 2, 55 cases). Xiangya Hospital cases were split into training and testing groups (1:1 ratio), with the other centers serving as external validation groups. Feature selection was performed using Minimum Redundancy Maximum Relevance (mRMR) and Least Absolute Shrinkage and Selection Operator (LASSO) algorithms. Eight machine learning classifiers were trained and tested with ten-fold cross-validation. Sensitivity, specificity, and AUC were calculated for each model. Additionally, we combined the radiomic features with maximum Standardized Uptake Value (SUVmax) and prostate-specific antigen (PSA) to create prediction models and tested the corresponding performances. The best-performing model in the Xiangya Hospital training cohort achieved an AUC of 0.868 (sensitivity 72.7%, specificity 96.0%). Similar trends were seen in the testing cohort and external validation centers (AUCs: 0.860, 0.827, and 0.812). After incorporating PSA and SUVmax, a more robust model was developed, achieving an AUC of 0.892 (sensitivity 77.9%, specificity 96.0%) in the training group. This study established and validated a radiomics model based on 68Ga-PSMA PET/CT, offering an accurate, non-invasive method for predicting ISUP grades in prostate cancer. A multicenter design with external validation ensured the model's robustness and broad applicability. This is the largest study to date on PSMA radiomics for predicting ISUP grades. Notably, integrating SUVmax and PSA metrics with radiomic features significantly improved prediction accuracy, providing new insights and tools for personalized diagnosis and treatment.
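The ten-fold cross-validation mentioned above can be pictured as partitioning the cases into ten disjoint test folds. The sketch below is a plain, non-stratified splitter written only for illustration. The abstract does not say whether folds were stratified or which subset was cross-validated, so the 126-case count (half of the 252 Xiangya cases under the stated 1:1 split) and the function name are assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once, then yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 126 is an assumed training-set size, not a figure stated in the abstract.
folds = list(k_fold_indices(126, k=10))
print([len(test) for _, test in folds])
```

Each case appears in exactly one test fold, so every model is evaluated on data it did not see during that fold's training.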

PET Classification Abdominal Retrospective Clinical In Silico Academic Lab Benchmark SOTA
