Artificial Intelligence in Neonatal Hip Ultrasound: A Scoping Review of Methods and Clinical Evidence.
Authors
Affiliations (4)
- Program in Innovative Technologies in Biomedical Sciences, Faculty of Medicine and Surgery, Kore University of Enna, Enna, Italy.
- Pediatric Unit, Department of Pediatrics, Azienda Ospedaliero Universitaria Policlinico G. Rodolico-San Marco, University of Catania, Catania, Italy.
- Department of Engineering and Architecture, Kore University of Enna, Enna, Italy.
- Department of Pediatrics, Faculty of Medicine and Surgery, Kore University of Enna, Enna, Italy.
Abstract
Developmental dysplasia of the hip (DDH) is among the most common musculoskeletal disorders in infants. Ultrasound, especially the Graf method, is the gold standard for early diagnosis but suffers from operator dependence and variability. Artificial intelligence (AI) and machine learning (ML) have recently been applied to improve acquisition, measurement, and classification tasks in neonatal hip ultrasound. This review aimed to systematically map and synthesise the literature on AI/ML applications for neonatal hip ultrasound, identifying algorithmic approaches, tasks, study characteristics, performance metrics, and knowledge gaps.
A scoping review was conducted using the Arksey and O'Malley framework, as refined by the Joanna Briggs Institute, and reported according to PRISMA-ScR. The protocol was registered in PROSPERO (CRD420251150044), initially as a systematic review; the work was subsequently conducted as a scoping review, and this methodological adaptation is acknowledged. PubMed/MEDLINE, Embase, Scopus, and IEEE Xplore were searched for English-language studies published from January 1, 2015, to September 10, 2025. Eligible studies evaluated AI/ML algorithms for hip ultrasound in infants aged ≤ 12 months.
From 192 records, 41 studies met the inclusion criteria. Publications increased from 2020 onward, with China and Canada contributing over half. Most studies were retrospective, single-centre model developments using convolutional neural networks; a minority used YOLO detectors or transformer models. Tasks included landmark detection/segmentation (53.7%), automated α/β angle estimation (26.8%), classification (19.5%), and scan quality assessment (≈10%). Because some studies addressed multiple analytic tasks, these categories were not mutually exclusive and the percentages therefore sum to more than 100%. Internal performance was high (ICC up to 0.94, Dice > 0.90, AUC ≥ 0.95), but external validation was rare (≈5% of studies).
AI/ML has been applied to neonatal hip ultrasound primarily for landmark detection, angle estimation, classification, and scan-quality assessment. Reported internal performance is often high; however, clinically relevant diagnostic accuracy studies remain limited, external validation is rare, and most evidence derives from single-centre datasets. These limitations currently constrain clinical generalisability.
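For readers unfamiliar with the Dice coefficient cited among the internal performance metrics, the following is a minimal sketch of how it is commonly computed for a predicted and a reference segmentation mask. It is illustrative only and is not taken from any of the reviewed studies; the function name `dice_coefficient`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape.

    Masks are nested lists of 0/1 values (hypothetical representation
    chosen for this sketch; real pipelines typically use arrays).
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    # Count pixels labelled 1 in both masks.
    intersection = sum(
        p * t
        for p_row, t_row in zip(pred, truth)
        for p, t in zip(p_row, t_row)
    )
    # Total number of foreground pixels across both masks.
    total = sum(map(sum, pred)) + sum(map(sum, truth))

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: each has 3 foreground pixels, 2 of which overlap.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)
```

In this toy case the overlap is 2 pixels out of 3 + 3 foreground pixels, so the score is 2 × 2 / 6 ≈ 0.67; a value above 0.90, as reported for internal validation, indicates near-complete overlap between predicted and reference structures.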