Performance of adult-trained artificial intelligence models in paediatric imaging: a scoping review.
Authors
Affiliations (8)
- Mohn Medical Imaging and Visualization Centre, Department of Radiology, Haukeland University Hospital, Bergen, Norway. [email protected].
- Department of Clinical Medicine, University of Bergen, Bergen, Norway. [email protected].
- Department of Clinical Radiology, Great Ormond Street Hospital for Children, London, UK.
- Department of Radiology, St George's Hospital NHS Foundation Trust, London, UK.
- City St George's, University of London, London, UK.
- Medical Library, University of Bergen, Bergen, Norway.
- UCL Great Ormond Street Institute of Child Health, Great Ormond Street Hospital for Children, London, UK.
- NIHR Great Ormond Street Hospital Biomedical Research Centre, Bloomsbury, London, UK.
Abstract
This scoping review aims to evaluate the performance of artificial intelligence (AI) models designed for adults when applied to paediatric imaging datasets without additional adaptation, and to quantify performance degradation across different modalities, use-cases and age groups. A literature search was conducted covering the period 1 January 2014 to 23 June 2025, using terms relating to "child", "adult", "artificial intelligence", "radiology" and "validation/performance". Two reviewers independently extracted data using standardised templates and conducted a narrative analysis. Of 5642 abstracts, 20 studies met the inclusion criteria. The studies evaluated AI tools across 16 paediatric dataset cohorts ranging from 30 to 7357 subjects. Three datasets were used more than once to evaluate different AI model performance metrics; this cohort overlap carries a risk of duplication bias. The tools were applied to radiography (n = 7), CT (n = 7), MRI (n = 2), dual-energy X-ray absorptiometry (DEXA) (n = 2) and ultrasound (n = 2) across different AI tasks: segmentation (n = 9), classification (n = 4), detection (n = 3) and mixed tasks (n = 4). All but two studies reported reduced performance when adult-trained AI tools were applied to paediatric populations. Detection tasks showed the most severe deterioration, with sensitivity for pulmonary nodule detection dropping from 68-100% in adults to 26-68% in children. For segmentation tasks, Dice score reductions > 0.10 were noted across organs and imaging modalities. Children ≤ 2 years consistently showed the greatest performance deficits across all task types. AI tools intended for adult use do not perform to the same standard in a paediatric population without additional adaptation, particularly for children under 2 years. Careful model evaluation is required before clinical implementation.
Question: How do artificial intelligence-based radiology tools designed for adults perform when applied to paediatric imaging without additional adaptation?
Findings: Adult-trained AI models consistently demonstrated reduced performance in children, particularly in those under 2 years, with detection tasks showing the most severe deterioration.
Clinical relevance: Healthcare professionals should not assume that radiology AI tools trained on and intended for adults can be directly applied to the paediatric population without validation, additional training or fine-tuning, particularly for the youngest age groups.