Development and validation of an artificial intelligence-based model for diagnosing benign, borderline, and malignant adnexal masses.

February 3, 2026

DOI: 10.1038/s41698-026-01320-5 PMID: 41634163

Authors

Wu Y, Dai W, Li X, Zhang S, Gong L, Wang J, Cui A, Li S, Zhu M, Dong S, Wang Y, Zhou L, Kong D, Zhao J, Sun L

Affiliations (7)

Cancer Centre, Department of Ultrasound Medicine, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
Key Discipline of Zhejiang Province in Public Health and Preventive Medicine (First Class, Category A), Hangzhou Medical College, Hangzhou, Zhejiang, China.
School of Mathematical Sciences, Zhejiang University, Zijingang Campus, Hangzhou, Zhejiang, China.
Institute of Pathology and Southwest Cancer Center, Southwest Hospital, Third Military Medical University (Army Medical University) and Key Laboratory of Tumor Immunopathology, Ministry of Education of China, Chongqing, China.
Department of Ultrasound Medicine, Second Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China.
Department of Ultrasound Medicine, Sichuan Provincial Maternity and Child Health Care Hospital, Chengdu, Sichuan, China.
Cancer Centre, Department of Ultrasound Medicine, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.

Abstract

Classification of benign, borderline, and malignant adnexal masses is critical to effective clinical management, but remains a challenge. We developed Clinical-Ovarian Multi-Task Attention (Clinical-OMTA), an artificial intelligence model based on a dual-backbone architecture (benign vs. non-benign, and borderline vs. malignant) that integrates ultrasound, age, and Carbohydrate Antigen 125 (CA125) for multi-class classification. The model's performance, generalisability, and clinical utility were evaluated. Retrospective data were collected from 23 hospitals (1882 patients for training, validation, and internal testing from 21 hospitals; 340 and 159 patients for external testing from two hospitals). In the external image dataset, Clinical-OMTA demonstrated diagnostic performance comparable to that of ADNEX (area under the receiver operating characteristic curve [AUC]: 0.950 vs. 0.953, 0.870 vs. 0.853, 0.930 vs. 0.938) and to subjective assessment by an expert examiner (accuracy: 85.6% vs. 87.4%). While Clinical-OMTA supported multimodal integration, it did not outperform Ovarian Multi-Task Attention (OMTA), which was trained only on images, indicating that including age and CA125 did not improve performance. Clinical-OMTA performed similarly across acquisition modes, equipment types, scanning methods, and centres (accuracy: 79.9%-87.7%). With Clinical-OMTA as a decision support tool, radiologists showed significantly improved inter-reader agreement (kappa: from 0.17-0.78 to 0.86-0.98) and diagnostic accuracy (from 72.3% to 88.0%). Clinical-OMTA appears generalisable and could be especially useful in low-resource or remote settings where expert ultrasound examiners are scarce.
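
For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of Cohen's kappa for two readers. It assumes pairwise Cohen's kappa; the abstract does not state which kappa variant was used. The function name cohen_kappa and the reader labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers labelling the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must label the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: fraction of cases given identical labels.
    p_o = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement, from each reader's marginal label frequencies.
    count_a, count_b = Counter(reader_a), Counter(reader_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both readers used one identical label for every case
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for illustration only; NOT data from the study.
reader_1 = ["benign", "benign", "malignant", "borderline", "malignant", "benign"]
reader_2 = ["benign", "borderline", "malignant", "borderline", "benign", "benign"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so a value near 0 means the readers agree little more than chance would predict and a value near 1 means near-perfect agreement.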

Topics

Journal Article
