Artificial intelligence applied to ultrasound diagnosis of pelvic gynecological tumors: a systematic review and meta-analysis.

May 8, 2025pubmed logopapers

Authors

Geysels A,Garofalo G,Timmerman S,Barreñada L,De Moor B,Timmerman D,Froyman W,Van Calster B

Abstract

To perform a systematic review on artificial intelligence (AI) studies focused on identifying and differentiating pelvic gynecological tumors on ultrasound scans. Studies developing or validating AI models for diagnosing gynecological pelvic tumors on ultrasound scans were eligible for inclusion. We systematically searched PubMed, Embase, Web of Science, and Cochrane Central from their database inception until April 30th, 2024. To assess the quality of the included studies, we adapted the QUADAS-2 risk of bias tool to address the unique challenges of AI in medical imaging. Using multi-level random effects models, we performed a meta-analysis to generate summary estimates of the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. To provide a reference point of current diagnostic support tools for ultrasound examiners, we descriptively compared the pooled performance to that of the well-recognized ADNEX model on external validation. Subgroup analyses were performed to explore sources of heterogeneity. From 9151 records retrieved, 44 studies were eligible: 40 on ovarian, three on endometrial, and one on myometrial pathology. Overall, 95% were at high risk of bias - primarily due to inappropriate study inclusion criteria, the absence of a patient-level split of training and testing image sets, and no calibration assessment. For ovarian tumors, the summary AUC for AI models distinguishing benign from malignant tumors was 0.89 (95% CI: 0.85-0.92). In lower-risk studies (at least three low-risk domains), the summary AUC dropped to 0.87 (0.83-0.90), with deep learning models outperforming radiomics-based machine learning approaches in this subset. Only five studies included an external validation, and six evaluated calibration performance. In a recent systematic review of external validation studies, the ADNEX model had a pooled AUC of 0.93 (0.91-0.94) in studies at low risk of bias. Studies on endometrial and myometrial pathologies were reported individually. Although AI models show promising discriminative performances for diagnosing gynecological tumors on ultrasound, most studies have methodological shortcomings that result in a high risk of bias. In addition, the ADNEX model appears to outperform most AI approaches for ovarian tumors. Future research should emphasize robust study designs - ideally large, multicenter, and prospective cohorts that mirror real-world populations - along with external validation, proper calibration, and standardized reporting. This study was pre-registered with Open Science Framework (OSF): https://doi.org/10.17605/osf.io/bhkst.

Topics

Journal ArticleSystematic Review
Get Started

Upload your X-ray image and get interpretation.

Upload now →

Disclaimer: X-ray Interpreter's AI-generated results are for informational purposes only and not a substitute for professional medical advice. Always consult a healthcare professional for medical diagnosis and treatment.