Evaluation of 2D and 3D nnU-Net models with two-label and three-label strategies for automatic segmentation and total metabolic tumor volume estimation of metastatic differentiated thyroid carcinoma on FDG-PET/CT.

May 20, 2026


DOI: 10.1007/s11604-026-02006-5 PMID: 42159908

Authors

Li Y, Endo H, Hirata K, Takenaka J, Tang M, Watanabe S, Kimura R, Kudo K

Affiliations (8)

Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 14, Nishi 5, Kita-ku, Sapporo, 060-8648, Japan.
Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 14, Nishi 5, Kita-ku, Sapporo, 060-8648, Japan.
Department of Nuclear Medicine, Hokkaido University Hospital, Sapporo, Japan.
Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, Sapporo, Japan.
Department of Nuclear Medicine, Hokkaido University Hospital, Sapporo, Japan.
Department of Nuclear Medicine and Comprehensive Heart Failure Center, University Hospital Würzburg, Würzburg, Germany.
Global Center for Biomedical Science and Engineering, Faculty of Medicine, Hokkaido University, Sapporo, Japan.
Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, Sapporo, Japan.

Abstract

To evaluate the segmentation performance and total metabolic tumor volume (TMTV) prediction accuracy of 2D and 3D nnU-Net models under two-label and three-label strategies for metastatic differentiated thyroid carcinoma (DTC) on FDG PET/CT images. A total of 194 patients with FDG-avid metastatic DTC who underwent PET/CT prior to iodine-131 treatment (2009-2022) were retrospectively analyzed. The dataset was divided into Cohort 1 (n = 160) for five-fold cross-validation and Cohort 2 (n = 34) for independent testing. Both 2D and 3D nnU-Net architectures were trained under two-label and three-label schemes. Segmentation performance was assessed using the Dice similarity coefficient (DSC). TMTV prediction was evaluated using the coefficient of determination (R²) and error analyses. Under the two-label scheme, mean DSC values in Cohort 1 were 0.63 ± 0.28 (2D) and 0.60 ± 0.34 (3D), and in Cohort 2 were 0.60 ± 0.31 (2D) and 0.50 ± 0.32 (3D). Under the three-label scheme, mean DSC values in Cohort 1 were 0.66 ± 0.28 (2D) and 0.70 ± 0.30 (3D), and in Cohort 2 were 0.61 ± 0.33 (2D) and 0.61 ± 0.35 (3D). For TMTV prediction, R² values in Cohort 1 were 0.33 (2D) and 0.06 (3D) under the two-label scheme, and 0.20 (2D) and 0.12 (3D) under the three-label scheme. In Cohort 2, R² values were 0.87 (2D) and 0.80 (3D) for the two-label scheme, and 0.86 (2D) and 0.84 (3D) for the three-label scheme. Error analyses demonstrated systematic underestimation of TMTV across architectures and labeling strategies. The 2D and 3D nnU-Net models demonstrated comparable performance for segmentation and TMTV prediction under both two-label and three-label strategies. While labeling strategy influenced segmentation metrics, systematic underestimation of TMTV was observed across architectures.
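For readers unfamiliar with the three quantities reported in the abstract, the following is a minimal sketch of how DSC, TMTV, and R² are commonly computed from binary lesion masks. It is not the authors' code. The function names, the convention that two empty masks give a DSC of 1, the use of voxel spacing in millimetres, and the definition of R² as 1 minus the ratio of residual to total sum of squares are all assumptions made for illustration; the study may have derived R² from a regression fit instead.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def tmtv_ml(mask, voxel_spacing_mm):
    """Total metabolic tumor volume in millilitres.

    Counts foreground voxels and multiplies by the volume of one voxel
    (1 mL = 1000 mm^3). voxel_spacing_mm is a hypothetical (x, y, z) tuple.
    """
    voxel_volume_ml = float(np.prod(voxel_spacing_mm)) / 1000.0
    return float(np.asarray(mask, dtype=bool).sum()) * voxel_volume_ml


def r_squared(y_true, y_pred):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy example: the prediction covers 8 of the 12 reference voxels.
truth = np.zeros((4, 4, 4), dtype=bool)
pred = np.zeros((4, 4, 4), dtype=bool)
truth[0:2, 0:2, 0:3] = True
pred[0:2, 0:2, 0:2] = True

dsc = dice_coefficient(pred, truth)
vol_truth = tmtv_ml(truth, (4.0, 4.0, 4.0))
vol_pred = tmtv_ml(pred, (4.0, 4.0, 4.0))
```

In this toy case the predicted volume is smaller than the reference volume, which is the direction of error (underestimation) that the abstract reports for TMTV.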


Topics

Journal Article
