Artificial Intelligence Enhances Diagnostic Accuracy of Contrast Enemas in Hirschsprung Disease Compared to Clinical Experts.

Authors

Vargova P, Varga M, Izquierdo Hernandez B, Gutierrez Alonso C, Gonzalez Esgueda A, Cobos Hernandez MV, Fernandez R, González-Ruiz Y, Bragagnini Rodriguez P, Del Peral Samaniego M, Corona Bellostas C

Affiliations (3)

  • Pediatric Surgery, Hospital Universitario Miguel Servet, Zaragoza, Spain.
  • Institute of Experimental Physics, Slovak Academy of Sciences in Kosice, Košice, Slovakia.
  • Pediatric Radiology, Hospital Universitario Miguel Servet, Zaragoza, Spain.

Abstract

Introduction: Contrast enema (CE) is widely used in the evaluation of suspected Hirschsprung disease (HD). Deep learning is a promising tool to standardize image assessment and support clinical decision-making. This study assesses the diagnostic performance of a deep neural network (DNN), with and without clinical data, and compares its interpretation with that of pediatric surgeons and radiologists.

Materials and Methods: In this retrospective study, 1471 contrast enema images from patients <15 years were analysed, with 218 images used for testing. A deep neural network, pediatric radiologists, and surgeons independently reviewed the testing set, with and without clinical data. Diagnostic performance was assessed using ROC and PR curves, and interobserver agreement was evaluated using Fleiss' kappa.

Results: The deep neural network achieved high diagnostic accuracy (AUC-ROC = 0.87) in contrast enema interpretation, with improved performance when combining anteroposterior and lateral images (AUC-ROC = 0.92). Clinical data integration further enhanced model sensitivity and negative predictive value. The super-surgeon (majority voting of colorectal surgeons) outperformed most individual clinicians (sensitivity 81.8%, specificity 79.1%), while the super-radiologist (majority voting of radiologists) showed moderate accuracy. Interobserver analysis revealed strong agreement between the model and surgeons (Cohen's kappa = 0.73), and overall consistency among experts and the model (Fleiss' kappa = 0.62).

Conclusions: AI-assisted CE interpretation achieved higher specificity than the clinicians, with comparable sensitivity. Its consistent performance and substantial agreement with experts support its potential role in improving CE assessment in HD.
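The reader-comparison metrics named in the abstract (majority-vote "super-reader", sensitivity, specificity, and Cohen's kappa) can be illustrated with a minimal Python sketch. Everything below is an assumption for illustration only: the function names are hypothetical, the labels are made-up toy values rather than study data, and the abstract does not state how the authors implemented these computations.

```python
# Illustrative sketch only: toy labels, not data from the study.
# 1 = image read as suggestive of HD, 0 = not suggestive.

def majority_vote(ratings):
    """Combine per-image binary reads from an odd number of readers."""
    n_readers = len(ratings)
    return [1 if 2 * sum(column) > n_readers else 0 for column in zip(*ratings)]

def sensitivity_specificity(truth, pred):
    """Return (sensitivity, specificity) for binary labels."""
    pairs = list(zip(truth, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def cohen_kappa(a, b):
    """Chance-corrected agreement between two binary raters."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    return (observed - expected) / (1 - expected)

# Hypothetical reference diagnosis and reads for eight images.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
readers = [
    [1, 1, 0, 0, 0, 1, 1, 0],  # hypothetical reader 1
    [1, 0, 1, 0, 0, 0, 1, 0],  # hypothetical reader 2
    [1, 1, 0, 0, 1, 0, 0, 0],  # hypothetical reader 3
]
model = [1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical thresholded model output

super_reader = majority_vote(readers)
sens, spec = sensitivity_specificity(truth, super_reader)
kappa = cohen_kappa(model, super_reader)
print(super_reader, sens, spec, round(kappa, 3))
```

Majority voting needs an odd number of readers to avoid ties. Fleiss' kappa, which the abstract uses for agreement across more than two raters, generalizes the same observed-versus-expected idea and is not shown here.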

Topics

Journal Article
