Evaluation of 2D and 3D nnU-Net models with two-label and three-label strategies for automatic segmentation and total metabolic tumor volume estimation of metastatic differentiated thyroid carcinoma on FDG-PET/CT.

May 20, 2026 (PubMed)

Authors

Li Y, Endo H, Hirata K, Takenaka J, Tang M, Watanabe S, Kimura R, Kudo K

Affiliations (8)

  • Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 14, Nishi 5, Kita-ku, Sapporo, 060-8648, Japan.
  • Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 14, Nishi 5, Kita-ku, Sapporo, 060-8648, Japan.
  • Department of Nuclear Medicine, Hokkaido University Hospital, Sapporo, Japan.
  • Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, Sapporo, Japan.
  • Department of Nuclear Medicine, Hokkaido University Hospital, Sapporo, Japan.
  • Department of Nuclear Medicine and Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany.
  • Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, Sapporo, Japan.
  • Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, Sapporo, Japan.

Abstract

This study evaluated the segmentation performance and total metabolic tumor volume (TMTV) prediction accuracy of 2D and 3D nnU-Net models under two-label and three-label strategies for metastatic differentiated thyroid carcinoma (DTC) on FDG PET/CT images. A total of 194 patients with FDG-avid metastatic DTC who underwent PET/CT prior to iodine-131 treatment (2009-2022) were retrospectively analyzed. The dataset was divided into Cohort 1 (n = 160) for five-fold cross-validation and Cohort 2 (n = 34) for independent testing. Both 2D and 3D nnU-Net architectures were trained under two-label and three-label schemes. Segmentation performance was assessed using the Dice similarity coefficient (DSC). TMTV prediction was evaluated using the coefficient of determination (R²) and error analyses. Under the two-label scheme, mean DSC values were 0.63 ± 0.28 (2D) and 0.60 ± 0.34 (3D) in Cohort 1, and 0.60 ± 0.31 (2D) and 0.50 ± 0.32 (3D) in Cohort 2. Under the three-label scheme, mean DSC values were 0.66 ± 0.28 (2D) and 0.70 ± 0.30 (3D) in Cohort 1, and 0.61 ± 0.33 (2D) and 0.61 ± 0.35 (3D) in Cohort 2. For TMTV prediction, R² values in Cohort 1 were 0.33 (2D) and 0.06 (3D) under the two-label scheme, and 0.20 (2D) and 0.12 (3D) under the three-label scheme. In Cohort 2, R² values were 0.87 (2D) and 0.80 (3D) under the two-label scheme, and 0.86 (2D) and 0.84 (3D) under the three-label scheme. Error analyses demonstrated systematic underestimation of TMTV across architectures and labeling strategies. The 2D and 3D nnU-Net models demonstrated comparable performance for segmentation and TMTV prediction under both two-label and three-label strategies. While labeling strategy influenced segmentation metrics, systematic underestimation of TMTV was observed across architectures.
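The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the flattened 0/1 mask representation, the 4 mm voxel spacing, and the choice of R² as 1 − SS_res/SS_tot are all assumptions made for illustration (the abstract does not state which R² definition or error measures were used). A negative mean signed error corresponds to the kind of TMTV underestimation the abstract reports.

```python
from math import prod


def dice_coefficient(pred, truth):
    """DSC = 2|P ∩ T| / (|P| + |T|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def tumor_volume_ml(mask, voxel_spacing_mm):
    """Volume of the foreground voxels in millilitres (1 mL = 1000 mm^3)."""
    voxel_ml = prod(voxel_spacing_mm) / 1000.0
    return sum(1 for v in mask if v) * voxel_ml


def r_squared(reference, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def mean_signed_error(reference, predicted):
    """Mean of (predicted - reference); negative means underestimation."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)


# Toy data (hypothetical): a 64-voxel volume with an 8-voxel lesion,
# of which the predicted mask recovers only 4 voxels.
truth_mask = [1] * 8 + [0] * 56
pred_mask = [1] * 4 + [0] * 60
spacing_mm = (4.0, 4.0, 4.0)  # assumed isotropic voxel spacing

print("DSC:", dice_coefficient(pred_mask, truth_mask))
print("Reference volume (mL):", tumor_volume_ml(truth_mask, spacing_mm))
print("Predicted volume (mL):", tumor_volume_ml(pred_mask, spacing_mm))
```

In this toy case the prediction is a strict subset of the reference lesion, so the predicted volume is smaller than the reference volume even though the overlap is partial. In practice the masks would come from 3D image arrays and the voxel spacing from the image header.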

Topics

Journal Article
