Comparison of publicly available artificial intelligence models for pancreatic segmentation on T1-weighted Dixon images.

Authors

Sonoda Y, Fujisawa S, Kurokawa M, Gonoi W, Hanaoka S, Yoshikawa T, Abe O

Affiliations (3)

  • Department of Radiology, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan. [email protected].
  • Department of Radiology, Graduate School of Medicine, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan.
  • Department of Computational Diagnostic Radiology and Preventive Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan.

Abstract

This study aimed to compare three publicly available deep learning models (TotalSegmentator, TotalVibeSegmentator, and PanSegNet) for automated pancreatic segmentation on magnetic resonance images and to evaluate their performance against human annotations in terms of segmentation accuracy, volumetric measurement, and intrapancreatic fat fraction (IPFF) assessment. Twenty upper abdominal T1-weighted magnetic resonance series acquired using the two-point Dixon method were randomly selected. Three radiologists manually segmented the pancreas, and a ground-truth mask was constructed through a majority vote per voxel. Pancreatic segmentation was also performed using the three artificial intelligence models. Performance was evaluated using the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance, average symmetric surface distance, positive predictive value, sensitivity, Bland-Altman plots, and concordance correlation coefficient (CCC) for pancreatic volume and IPFF. PanSegNet achieved the highest DSC (mean ± standard deviation, 0.883 ± 0.095) and showed no statistically significant difference from the human interobserver DSC (0.896 ± 0.068; p = 0.24). In contrast, TotalVibeSegmentator (0.731 ± 0.105) and TotalSegmentator (0.707 ± 0.142) had significantly lower DSC values compared with the human interobserver average (p < 0.001). For pancreatic volume and IPFF, PanSegNet demonstrated the best agreement with the ground truth (CCC values of 0.958 and 0.993, respectively), followed by TotalSegmentator (0.834 and 0.980) and TotalVibeSegmentator (0.720 and 0.672). PanSegNet demonstrated the highest segmentation accuracy and the best agreement with human measurements for both pancreatic volume and IPFF on T1-weighted Dixon images. This model appears to be the most suitable for large-scale studies requiring automated pancreatic segmentation and intrapancreatic fat evaluation.
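The abstract names three computations that a short sketch can make concrete: the per-voxel majority vote used to build the ground-truth mask, the Dice similarity coefficient, and the concordance correlation coefficient. The Python below is only an illustration, not the authors' code. The function names (`majority_vote`, `dice`, `concordance_cc`) are hypothetical, masks are assumed to be flattened lists of 0/1 voxels, and the CCC uses population (1/n) variances as in Lin's original definition, which the abstract does not confirm as the study's convention.

```python
def majority_vote(masks):
    """Voxel-wise majority vote over an odd number of binary masks.

    Each mask is a flat list of 0/1 values of equal length
    (an assumption made to keep the sketch dependency-free).
    """
    n = len(masks)
    return [1 if 2 * sum(voxels) > n else 0 for voxels in zip(*masks)]


def dice(a, b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    intersection = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty masks are treated as a perfect match by convention here.
    return 2 * intersection / total if total else 1.0


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient with 1/n variances."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined when both inputs are one identical constant")
    return 2 * cov_xy / denominator


# Toy data only; these are not values from the study.
reader_1 = [1, 1, 0, 0, 1, 0]
reader_2 = [1, 0, 0, 0, 1, 1]
reader_3 = [1, 1, 1, 0, 0, 0]
ground_truth = majority_vote([reader_1, reader_2, reader_3])

model_mask = [1, 1, 0, 0, 0, 0]
print("ground truth:", ground_truth)
print("DSC:", dice(model_mask, ground_truth))
print("CCC:", concordance_cc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

Unlike Pearson correlation, the CCC penalizes a systematic offset between two sets of measurements: in the toy call above the two series are perfectly correlated but shifted by a constant, so the CCC falls below 1. That property makes it a natural agreement measure for comparing model-derived volume or fat fraction against a reference.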

Topics

Journal Article
