ARTIFICIAL INTELLIGENCE ENHANCES DIAGNOSTIC ACCURACY OF CONTRAST ENEMAS IN HIRSCHSPRUNG DISEASE COMPARED TO CLINICAL EXPERTS.
Authors
Affiliations (3)
- Pediatric Surgery, Hospital Universitario Miguel Servet, Zaragoza, Spain.
- Institute of Experimental Physics, Slovak Academy of Sciences in Kosice, Košice, Slovakia.
- Pediatric Radiology, Hospital Universitario Miguel Servet, Zaragoza, Spain.
Abstract
Introduction: Contrast enema (CE) is widely used in the evaluation of suspected Hirschsprung disease (HD). Deep learning is a promising tool to standardize image assessment and support clinical decision-making. This study assesses the diagnostic performance of a deep neural network (DNN), with and without clinical data, and compares its interpretation with that of pediatric surgeons and radiologists.

Materials and Methods: In this retrospective study, 1471 contrast enema images from patients under 15 years of age were analysed, with 218 images used for testing. A deep neural network, pediatric radiologists, and surgeons independently reviewed the testing set, with and without clinical data. Diagnostic performance was assessed using receiver operating characteristic (ROC) and precision-recall (PR) curves, and interobserver agreement was evaluated using Fleiss' kappa.

Results: The deep neural network achieved high diagnostic accuracy (AUC-ROC = 0.87) in contrast enema interpretation, with improved performance when combining anteroposterior and lateral images (AUC-ROC = 0.92). Clinical data integration further enhanced model sensitivity and negative predictive value. The super-surgeon (majority voting of colorectal surgeons) outperformed most individual clinicians (sensitivity 81.8%, specificity 79.1%), while the super-radiologist (majority voting of radiologists) showed moderate accuracy. Interobserver analysis revealed strong agreement between the model and surgeons (Cohen's kappa = 0.73), and overall consistency among experts and the model (Fleiss' kappa = 0.62).

Conclusions: AI-assisted CE interpretation achieved higher specificity than the clinicians, with comparable sensitivity. Its consistent performance and substantial agreement with experts support its potential role in improving CE assessment in HD.
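The pooled-reader and agreement statistics named in the abstract (majority voting, sensitivity and specificity, Cohen's kappa, Fleiss' kappa) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the labels are invented toy data rather than study data, and binary calls (1 = HD, 0 = no HD) with an odd number of readers are assumed.

```python
from collections import Counter


def majority_vote(ratings):
    """Pool several readers' binary calls per image (assumes an odd reader count)."""
    n_readers = len(ratings)
    return [1 if sum(calls) * 2 > n_readers else 0 for calls in zip(*ratings)]


def sensitivity_specificity(truth, pred):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two raters."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings):
    """Chance-corrected agreement among several raters who all rate every item."""
    n_raters, n_items = len(ratings), len(ratings[0])
    categories = sorted({c for reader in ratings for c in reader})
    # counts[i][j] = number of raters assigning item i to category j
    counts = [[sum(1 for reader in ratings if reader[i] == c) for c in categories]
              for i in range(n_items)]
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(len(categories))]
    p_i = [(sum(x * x for x in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in counts]
    p_bar = sum(p_i) / n_items
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


# Invented toy labels for eight images (NOT data from the study).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
readers = [
    [1, 1, 1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
]

pooled = majority_vote(readers)            # the "super-reader" call per image
sens, spec = sensitivity_specificity(truth, pooled)
kappa_pooled_vs_truth = cohen_kappa(pooled, truth)
kappa_among_readers = fleiss_kappa(readers)

print(pooled, sens, spec, kappa_pooled_vs_truth, kappa_among_readers)
```

In this toy example the pooled call is correct more often than any single reader, which is the general idea behind a majority-vote "super-reader"; whether that holds in practice depends on how correlated the readers' errors are.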