Artificial Intelligence-assisted Ultrasound for Developmental Dysplasia of the Hip: Systematic Review and Meta-analysis Across Segmentation, Angle Measurement, and Diagnostic Tasks.
Authors
Affiliations (2)
- Faculty of Medicine, TOBB University of Economics and Technology, Sogutozu Cad. No:43, Sogutozu, 06500, Ankara, Turkey (R.S., E.D.M., Y.D.K.).
- Department of Radiology, Faculty of Medicine, TOBB University of Economics and Technology, Sogutozu Cad. No:43, Sogutozu, 06500, Ankara, Turkey (A.O.). Electronic address: [email protected].
Abstract
Artificial intelligence (AI) applications for developmental dysplasia of the hip (DDH) assessment have emerged across multiple analytical tasks, yet comprehensive evidence synthesis remains limited. This systematic review and meta-analysis evaluated AI-assisted hip ultrasonography performance across segmentation, angle measurement, and diagnostic classification tasks in infants. Following PROSPERO registration (CRD420251133940), we searched Medline/PubMed, Web of Science, Scopus, and Cochrane Library databases. Studies reporting AI performance for DDH ultrasonographic evaluation were included. Quality assessment employed QUADAS-2 and CLAIM checklists. Random-effects meta-analyses synthesized Dice Similarity Coefficients (DSC) for segmentation, Intraclass Correlation Coefficients (ICC) for angle measurements, and diagnostic accuracy metrics. Heterogeneity was assessed using I² statistics and prediction intervals. Thirty-four studies met inclusion criteria. Segmentation performance (7 studies, n=3011) yielded pooled DSC of 0.876 (95% CI: 0.827-0.924). Angle measurement reliability (4 studies, n=2012) demonstrated pooled ICC of 0.883 (95% CI: 0.763-0.945) for alpha angles and 0.864 (95% CI: 0.653-0.951) for beta angles. Diagnostic classification showed pooled AUC of 0.937 (95% CI: 0.885-0.966) across 5 studies (n=3035), with pooled sensitivity 0.888 (95% CI: 0.840-0.923) and specificity 0.893 (95% CI: 0.735-0.961) from 10 studies (n=5341). Substantial heterogeneity (I²>75%) was observed across all domains. CLAIM adherence averaged 67.9%, with critical gaps in external validation (20.6%). AI-assisted hip ultrasonography demonstrates strong performance across multiple tasks, supporting potential clinical application. However, high heterogeneity, limited external validation, and reporting deficiencies necessitate standardization before clinical deployment.
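As an illustration of the random-effects pooling and I² heterogeneity statistic mentioned above, the following is a minimal sketch, not the authors' analysis code. It assumes the DerSimonian-Laird estimator and pooling on the raw scale; the abstract does not state which estimator or transformation was used. The function name `random_effects_pool` and the input values are hypothetical and are not data from the included studies.

```python
import math
from statistics import NormalDist


def random_effects_pool(effects, variances, level=0.95):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    Assumption: DerSimonian-Laird is used here for illustration only; the
    abstract does not name the estimator applied in the review.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q and its degrees of freedom
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau^2), truncated at zero
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights, pooled estimate, and its standard error
    w_re = [1.0 / (v + tau2) for v in variances]
    sw_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    se = math.sqrt(1.0 / sw_re)

    # Normal-approximation confidence interval
    z = NormalDist().inv_cdf(0.5 + level / 2.0)

    # I^2: percentage of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - z * se,
        "ci_high": pooled + z * se,
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical per-study Dice coefficients and sampling variances
    # (illustrative values only, not taken from the review).
    dsc = [0.80, 0.90, 0.95]
    var = [0.0004, 0.0004, 0.0004]
    print(random_effects_pool(dsc, var))
```

With widely scattered study estimates and small within-study variances, as in the hypothetical input, most of the observed variability is attributed to between-study differences, which is the situation an I² above 75% describes.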