Deep learning-based automated contrast enema analysis to improve the assessment of Hirschsprung disease.
Authors
Affiliations (6)
- Department of Pediatric Surgery, Miguel Servet University Hospital, Zaragoza, Spain.
- Instituto de Investigación Sanitaria Aragón (IIS Aragón), Zaragoza, Spain.
- Slovak Academy of Sciences, Institute of Experimental Physics, Košice, Slovakia.
- Department of Radiology, Miguel Servet University Hospital, Zaragoza, Spain.
- Department of Pediatric Surgery, Miguel Servet University Hospital, Zaragoza, Spain.
- Instituto de Investigación Sanitaria Aragón (IIS Aragón), Zaragoza, Spain.
Abstract
This study aimed to compare the radiologic assessment of Hirschsprung disease (HD) based on contrast enema with automated image analysis using a deep neural network (DNN) for image recognition.
A retrospective observational single-centre study was conducted at a tertiary care hospital, including paediatric patients who underwent contrast enema between January 2011 and December 2023, either for suspected HD or for other clinical indications. A classifier based on a pretrained DNN (DenseNet121) was developed to detect HD in contrast enema images. DNN performance was assessed using balanced accuracy, sensitivity, the area under the receiver-operating characteristic curve (AUC-ROC), and the area under the precision-recall curve (AUC-PR). Rectal biopsy was the reference standard, with clinical follow-up in cases where a biopsy was not performed. The DNN classification performance was compared to historical expert radiologic assessment.
A total of 278 contrast enemas were performed in 221 patients (64.8% male, 35.2% female), with a mean age of 4.14 years and a median age of 2.65 years. DenseNet121 achieved 75.3% balanced accuracy, 58.5% sensitivity, and 92.1% specificity per individual image, improving to 82.8%, 72.7%, and 93.0%, respectively, at the contrast enema level. The model achieved an AUC-ROC similar to that of expert radiologists in their original reports (0.830 vs 0.804), and the interobserver agreement was moderate (Cohen's kappa = 0.475).
The DNN model demonstrated higher specificity than radiologists in the interpretation of contrast enemas in patients with suspected HD. Moderate interobserver agreement underscores the model's potential value as a tool for diagnostic support and standardisation, particularly in borderline cases or in settings where access to experienced specialists may be limited.
Question Contrast enema is commonly used to evaluate suspected HD, but its diagnostic accuracy is variable and dependent on the radiologist's expertise.
Findings A deep learning model showed higher specificity than radiologists (93.0% vs 79.1%), although the difference was not statistically significant, and the interobserver agreement was moderate (Cohen's kappa = 0.475).
Clinical relevance A DNN trained for automated analysis of contrast enema can identify patterns suggestive of HD with performance comparable to conventional radiological assessment, underscoring its value as a tool for diagnostic support in borderline cases or when access to experienced specialists may be limited.
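As a rough illustration of the metrics quoted in the abstract, the sketch below shows how balanced accuracy and Cohen's kappa are conventionally computed, and one possible way to move from image-level to enema-level predictions. It is not the authors' code. The function names, the toy inputs, and in particular the averaging rule in `aggregate_by_study` are assumptions made for illustration; the abstract does not state how per-image outputs were combined. The per-image figures are consistent with the usual definition of balanced accuracy as the mean of sensitivity and specificity: (58.5 + 92.1) / 2 = 75.3.

```python
from collections import defaultdict
from statistics import mean


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Balanced accuracy: the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


def cohen_kappa(rater_a: list[int], rater_b: list[int]) -> float:
    """Cohen's kappa for two raters assigning binary labels (0 or 1).

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when both raters give the same single label to every case.
    """
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's rate of positive labels.
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


def aggregate_by_study(image_probs, threshold: float = 0.5) -> dict:
    """Combine per-image probabilities into one label per enema study.

    ASSUMPTION: mean probability followed by a fixed threshold. The
    abstract does not describe the aggregation rule actually used.
    """
    grouped = defaultdict(list)
    for study_id, prob in image_probs:
        grouped[study_id].append(prob)
    return {sid: int(mean(probs) >= threshold) for sid, probs in grouped.items()}


if __name__ == "__main__":
    # Per-image sensitivity and specificity reported in the abstract.
    print(round(balanced_accuracy(58.5, 92.1), 1))
    # Toy labels (hypothetical, not study data): model vs radiologist.
    print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
    # Toy image-level probabilities keyed by a hypothetical study ID.
    print(aggregate_by_study([("e1", 0.9), ("e1", 0.4), ("e2", 0.2)]))
```

Averaging is only one option. A maximum over images or a majority vote would trade sensitivity against specificity differently, so the enema-level figures in the abstract should not be assumed to follow from this particular rule.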