Development and validation of an artificial intelligence-based model for diagnosing benign, borderline, and malignant adnexal masses.

February 3, 2026 | PubMed

Authors

Wu Y,Dai W,Li X,Zhang S,Gong L,Wang J,Cui A,Li S,Zhu M,Dong S,Wang Y,Zhou L,Kong D,Zhao J,Sun L

Affiliations (7)

  • Cancer Centre, Department of Ultrasound Medicine, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
  • Key Discipline of Zhejiang Province in Public Health and Preventive Medicine (First Class, Category A), Hangzhou Medical College, Hangzhou, Zhejiang, China.
  • School of Mathematical Sciences, Zhejiang University, Zijingang Campus, Hangzhou, Zhejiang, China.
  • Institute of Pathology and Southwest Cancer Center, Southwest Hospital, Third Military Medical University (Army Medical University) and Key Laboratory of Tumor Immunopathology, Ministry of Education of China, Chongqing, China.
  • Department of Ultrasound Medicine, Second Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China.
  • Department of Ultrasound Medicine, Sichuan Provincial Maternity and Child Health Care Hospital, Chengdu, Sichuan, China.
  • Cancer Centre, Department of Ultrasound Medicine, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.

Abstract

Classification of benign, borderline, and malignant adnexal masses is critical to effective clinical management but remains a challenge. We developed Clinical-Ovarian Multi-Task Attention (Clinical-OMTA), an artificial intelligence model based on a dual-backbone architecture (benign vs. non-benign, and borderline vs. malignant) that integrates ultrasound, age, and Carbohydrate Antigen 125 (CA125) for multi-class classification. The model's performance, generalisability, and clinical utility were evaluated. Retrospective data were collected from 23 hospitals (1882 patients for training, validation, and internal testing from 21 hospitals; 340 and 159 patients for external testing from two hospitals). In the external image dataset, Clinical-OMTA demonstrated diagnostic performance comparable to that of ADNEX (area under the receiver operating characteristic curve [AUC]: 0.950 vs. 0.953, 0.870 vs. 0.853, 0.930 vs. 0.938) and to subjective assessment by an expert examiner (accuracy: 85.6% vs. 87.4%). Although Clinical-OMTA supported multimodal integration, it did not outperform Ovarian Multi-Task Attention (OMTA), which was trained only on images, indicating that including age and CA125 did not improve performance. Clinical-OMTA performed similarly across acquisition modes, equipment types, scanning methods, and centres (accuracy: 79.9%-87.7%). With Clinical-OMTA as a decision support tool, radiologists showed significantly improved inter-reader agreement (kappa: 0.17-0.78 without vs. 0.86-0.98 with the tool) and diagnostic accuracy (72.3% without vs. 88.0% with the tool). Clinical-OMTA appears generalisable and could be especially useful in low-resource or remote settings where expert ultrasound examiners are scarce.
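
The abstract describes two binary decisions (benign vs. non-benign, then borderline vs. malignant) but does not say how their outputs are merged into a three-class prediction. The sketch below shows one common way such a two-stage scheme could be combined, by chaining the probabilities. It is an illustrative assumption, not the authors' method: the function names `combine_two_stage` and `predict_class` are hypothetical, and the input probabilities are made-up values.

```python
from typing import Dict


def combine_two_stage(p_non_benign: float,
                      p_malignant_given_non_benign: float) -> Dict[str, float]:
    """Chain two binary probabilities into a three-class distribution.

    Assumption (not from the paper): the second stage is read as a
    probability conditional on the mass being non-benign.
    """
    for p in (p_non_benign, p_malignant_given_non_benign):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return {
        "benign": 1.0 - p_non_benign,
        "borderline": p_non_benign * (1.0 - p_malignant_given_non_benign),
        "malignant": p_non_benign * p_malignant_given_non_benign,
    }


def predict_class(probs: Dict[str, float]) -> str:
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)


# Made-up example values, for illustration only.
example = combine_two_stage(p_non_benign=0.8,
                            p_malignant_given_non_benign=0.25)
label = predict_class(example)
```

Under this reading the three probabilities always sum to one, so the two binary outputs can be reported as a single multi-class result.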

Topics

Journal Article
