Artificial intelligence potential in ovarian endometriosis imaging: a comparative meta-analysis of transvaginal ultrasound-based AI models and human readers.
Authors
Affiliations (9)
- Danube Private University, Krems, Austria.
- Tehran University of Medical Sciences, Tehran, Iran, Islamic Republic of.
- Urmia University of Medical Sciences, Urmia, Iran, Islamic Republic of.
- Department of Gynaecology, Hospital St John of God, Vienna, Austria.
- Jagiellonian University, Krakow, Poland.
- Department of Gynaecology, Jagiellonian University, Krakow, Poland.
- University of Southern Queensland, Toowoomba, Australia.
- Danube Private University, Krems, Austria.
- Austrian Center for Medical Innovation and Technology (ACMIT), Wiener Neustadt, Austria.
Abstract
Transvaginal ultrasound (TVUS) is widely used for diagnosing ovarian endometriosis but remains limited by significant operator dependency. This systematic review and meta-analysis evaluated the diagnostic accuracy of ultrasound-based artificial intelligence (AI) models for ovarian endometriosis and directly compared their performance with that of human readers. We conducted a comprehensive search of five databases (PubMed, Embase, Scopus, Web of Science, and Cochrane Library) up to 5 December 2025 to identify studies reporting diagnostic metrics for both AI models and human readers in detecting ovarian endometriomas. Pooled sensitivity, specificity, and area under the curve (AUC) were calculated using a bivariate random-effects model. Seven studies with 2737 patients (6061 images) were included. AI models demonstrated a pooled sensitivity of 91% (confidence interval 81%-96%) and specificity of 95% (confidence interval 92%-96%). Human readers achieved a pooled sensitivity of 80% (95% confidence interval 65%-90%) and specificity of 85% (95% confidence interval 74%-92%). AI models significantly outperformed human readers in specificity (p = 0.001) and in overall diagnostic discrimination, with an AUC of 0.97 (95% confidence interval 0.95-0.98) compared with 0.84 (95% confidence interval 0.82-0.88) for human readers (p < 0.001), while sensitivity remained comparable (p = 0.10). Heterogeneity across included studies was minimal (0%). Relative to human readers, AI algorithms show promising but preliminary diagnostic performance in specificity and overall discrimination. These results raise the possibility that AI might serve as a supplementary tool in sonographic evaluation of ovarian endometriosis, potentially contributing to more standardized interpretation and fewer false-positive diagnoses.
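To illustrate what pooling sensitivity or specificity across studies involves, the following is a minimal sketch only. It does not reproduce the bivariate random-effects model named in the abstract; instead it swaps in a simpler univariate DerSimonian-Laird random-effects pooling of proportions on the logit scale, which ignores the correlation between sensitivity and specificity. The function name `pool_logit_proportions`, the 0.5 continuity correction applied to every study, the normal-approximation 95% interval, and the per-study counts in the usage section are all assumptions made for illustration; none of them come from the included studies.

```python
import math


def inv_logit(x):
    """Map a logit value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_logit_proportions(events, totals):
    """Univariate DerSimonian-Laird pooling of proportions on the logit scale.

    events[i] / totals[i] is one study's proportion, e.g. true positives over
    all diseased cases (sensitivity) or true negatives over all non-diseased
    cases (specificity).
    Returns (pooled, ci_low, ci_high, tau2) with a normal-approximation 95% CI.
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        # Assumption: add 0.5 to both cells of every study so the logit
        # stays finite even when e == 0 or e == n.
        a, b = e + 0.5, (n - e) + 0.5
        ys.append(math.log(a / b))      # study logit
        vs.append(1.0 / a + 1.0 / b)    # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sw - sum(wi ** 2 for wi in w) / sw
    k = len(ys)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights and pooled logit.
    w_re = [1.0 / (v + tau2) for v in vs]
    mu = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), tau2


if __name__ == "__main__":
    # Hypothetical counts for three imaginary studies (NOT data from the review).
    true_positives = [45, 88, 60]
    diseased = [50, 100, 64]
    pooled, low, high, tau2 = pool_logit_proportions(true_positives, diseased)
    print(f"pooled sensitivity {pooled:.3f} (95% CI {low:.3f}-{high:.3f}), tau^2 = {tau2:.4f}")
```

Calling the same function with true negatives and non-diseased totals would give a pooled specificity under the same simplifying assumptions. A bivariate model would instead estimate both quantities jointly, typically with dedicated meta-analysis software.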