Clinical Grading of Artificial Intelligence-Based 3D Fetal Brain Segmentations: A Cross-Vendor Evaluation of Deep Learning in Fetal Neuroimaging.
Authors
Affiliations (2)
- Department of Obstetrics and Fetal Medicine, Leiden University Medical Center, Leiden, the Netherlands.
- Department of Computer Science, University of Oxford, Oxford, UK.
Abstract
To evaluate the performance of automated (sub)cortical fetal brain segmentation methods on a novel 3D ultrasound dataset acquired from a different vendor, and to introduce a clinician-focused visual evaluation framework complementary to the widely used Dice Similarity Coefficient (DSC). This cohort study included 270 volumes (141 fetuses, 19-26 + 6 weeks gestation). Deep learning models were applied to segment the cavum septum pellucidum et vergae (CSPV), lateral posterior ventricle horn (LPVH), choroid plexus (ChP), cerebellum (CBM) and cortical plate (CoP) on a new dataset acquired by a different ultrasound vendor. Segmentations were visually graded (1-4 = high to poor quality) based on predefined criteria. Grades were analyzed as "adequate" (1 + 2) or "inadequate" (3 + 4). CSPV, ChP and CBM showed the best segmentation grades (> 83.1% grade 1, > 90.5% adequate) and were robust across gestation. LPVH showed the lowest performance (73.9% adequate). Overall segmentation quality across all structures was high (87.2% adequate). Intra- and interobserver agreement was 90.1% and 82.1%-92.7%, respectively. These deep-learning methods can reliably segment (sub)cortical structures when applied to a novel dataset acquired with a different ultrasound vendor, demonstrating robustness. Incorporating visual assessment alongside quantitative metrics provides insight into anatomical accuracy and clinical usability.
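As an illustration of the quantities named in the abstract, the following minimal Python sketch shows how a Dice Similarity Coefficient, the adequate/inadequate dichotomy (grades 1 + 2 versus 3 + 4), and a simple percent agreement between two graders could be computed. It is not the authors' code. The function names, the flattened binary-mask representation, the convention of returning 1.0 when both masks are empty, and the choice to measure agreement on the dichotomized grades are all assumptions made for this example, and the grade lists are hypothetical toy values, not study data.

```python
from typing import Sequence


def dice(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def is_adequate(grade: int) -> bool:
    """Grades 1 and 2 are 'adequate'; grades 3 and 4 are 'inadequate'."""
    if grade not in (1, 2, 3, 4):
        raise ValueError(f"grade must be 1-4, got {grade}")
    return grade <= 2


def percent_adequate(grades: Sequence[int]) -> float:
    """Percentage of segmentations graded 1 or 2."""
    return 100.0 * sum(is_adequate(g) for g in grades) / len(grades)


def percent_agreement(grades_a: Sequence[int], grades_b: Sequence[int]) -> float:
    """Percentage of cases where two gradings fall in the same category.

    Assumption: agreement is measured on the adequate/inadequate dichotomy;
    the abstract does not state which level was used.
    """
    if len(grades_a) != len(grades_b):
        raise ValueError("both graders must rate the same cases")
    same = sum(is_adequate(a) == is_adequate(b) for a, b in zip(grades_a, grades_b))
    return 100.0 * same / len(grades_a)


# Hypothetical toy grades for five segmentations from two observers.
observer_1 = [1, 2, 3, 1, 4]
observer_2 = [2, 2, 2, 1, 3]

print(percent_adequate(observer_1))
print(percent_agreement(observer_1, observer_2))
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

The sketch also shows why the two views are complementary: DSC scores voxel overlap against a reference mask, whereas the visual grade records a clinician's judgment of anatomical plausibility and needs no reference segmentation.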