Radiologist-AI Collaboration for Ischemia Diagnosis in Small-Bowel Obstruction: Multicentric Development and External Validation of a Multimodal Deep Learning Model.
Authors
Affiliations (6)
- Department of Radiology, AP-HP.Sorbonne, Saint Antoine Hospital, 184 Rue du Faubourg Saint-Antoine, 75012, Paris, France.
- LIB, UMR 7371, Université Sorbonne, CNRS, Inserm U1146, 15 rue de l'École de Médecine, 75006, Paris, France.
- Department of Radiology, AP-HP.Sorbonne, Saint Antoine Hospital, 184 Rue du Faubourg Saint-Antoine, 75012, Paris, France.
- CVN, CentraleSupélec, INRIA Paris Saclay, Université Paris-Saclay, Gif-Sur-Yvette, France.
- Department of Medical Imaging, Saint Joseph Hospital, 185 rue Raymond Losserand, 75014, Paris, France.
- LIB, UMR 7371, Université Sorbonne, CNRS, Inserm U1146, 15 rue de l'École de Médecine, 75006, Paris, France.
Abstract
This study aimed to develop and externally validate a multimodal AI model for detecting ischemia complicating small-bowel obstruction (SBO). The model combined 3D CT data with routine laboratory markers (C-reactive protein, neutrophil count) and, optionally, the indication/history text of the radiology report. From two centers, 1350 CT examinations were curated; 771 scans with confirmed SBO were used for model development with patient-level splits. Ischemia labels were defined by surgical confirmation within 24 h of imaging. Three architectures (MViT, ResNet-101, DaViT) were trained as unimodal and multimodal variants. External testing used 66 independent cases from a third center. Four radiologists (two residents and two experts) read the test set with and without AI assistance. Performance was assessed using AUC, sensitivity, and specificity, each with 95% bootstrap confidence intervals; predictions included a confidence score. The image-plus-laboratory model performed best on external testing (AUC 0.69 [0.59-0.79], sensitivity 0.89 [0.76-1.00], and specificity 0.44 [0.35-0.54]). Adding report text improved internal validation but did not generalize externally; the image + text and full multimodal variants did not exceed image + laboratory performance. Across readers, baseline AUC ranged from 0.496 [0.361-0.640] to 0.745 [0.589-0.875] and increased with reader experience. With AI assistance, AUC ranged from 0.565 [0.419-0.717] to 0.845 [0.714-0.952], and from 0.519 [0.373-0.669] to 0.845 [0.708-0.954] when confidence scores were displayed; these changes were consistent but not statistically significant, regardless of experience level. A multimodal model combining CT and laboratory data surpassed unimodal approaches for 24-h ischemia detection; as a triage-support tool, it showed a consistent but non-significant improvement in radiologist performance.
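The abstract reports AUC, sensitivity, and specificity with 95% bootstrap confidence intervals but does not say which bootstrap variant was used. The following is a minimal sketch of one common choice, the case-level percentile bootstrap, written with the Python standard library only. The function names, the 0.5 decision threshold, the number of resamples, and the toy labels and scores are all assumptions made for illustration; none of them come from the study.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when a score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def bootstrap_ci(labels, scores, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and recompute the metric."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Hypothetical toy data (1 = ischemia confirmed, 0 = no ischemia); not study data.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.55, 0.90, 0.20, 0.60]

point_auc = auc(toy_labels, toy_scores)
auc_low, auc_high = bootstrap_ci(toy_labels, toy_scores, auc)
sens_low, sens_high = bootstrap_ci(
    toy_labels, toy_scores,
    lambda y, s: sensitivity_specificity(y, s)[0],
)
print(f"AUC {point_auc:.2f} [{auc_low:.2f}-{auc_high:.2f}]")
print(f"Sensitivity 95% CI [{sens_low:.2f}-{sens_high:.2f}]")
```

With only 66 external cases, intervals computed this way are expected to be wide, which is consistent with the ranges quoted in the abstract. Resampling is done per case here; a study with several scans per patient would more plausibly resample per patient, in line with the patient-level splits mentioned above.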