Dosimetric assessment of deep learning based organ-at-risk segmentation: insights from the HaN-Seg challenge.
Authors
Affiliations (4)
- University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, Ljubljana 1000, Slovenia. Electronic address: [email protected].
- University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, Ljubljana 1000, Slovenia; University of Copenhagen, Department of Computer Science, Universitetsparken 1, Copenhagen 2100, Denmark.
- Institute of Oncology Ljubljana, Zaloška cesta 2, Ljubljana 1000, Slovenia.
- University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, Ljubljana 1000, Slovenia.
Abstract
This study extends the previously reported geometric analysis of HaN-Seg: The Head and Neck Organ-at-Risk CT and MR Segmentation Challenge by integrating a dosimetric evaluation, thereby offering a comprehensive assessment of the challenge results with practical insights into their clinical applicability. Participating teams of the HaN-Seg challenge were tasked with auto-segmenting 30 organs-at-risk (OARs) in the head and neck region using paired contrast-enhanced computed tomography and T1-weighted magnetic resonance images. The teams were ranked according to their geometric performance, measured by the Dice similarity coefficient (DSC) and the 95th-percentile Hausdorff distance (HD95). Here, we extend this evaluation with a forward dosimetric analysis, also known as dosimetric impact approximation, which includes the verification of OAR dosimetric restriction compliance, assessment of OAR priority ratings, evaluation of segmentation performance relative to tumor proximity, and correlation analysis between geometric and dosimetric metrics. All six teams from the previous geometric analysis were assessed for dosimetric performance on the original 14 test cases. The dosimetric analysis revealed only minor performance differences among teams, with the best- and worst-performing teams achieving dosimetric compliance in 70.7% and 67.7% of OAR auto-segmentations, respectively. Most teams met the priority 1 dosimetric restrictions, including those for the spinal cord, brainstem, optic chiasm, and optic nerves, in 11 out of 14 test cases. The lowest compliance rates were observed for the oral cavity and submandibular glands. Correlation analysis revealed no clear relationship between geometric and dosimetric metrics. The high dosimetric compliance highlights the practical utility of deep learning OAR auto-segmentation methods.
The lower compliance for the oral cavity and submandibular glands most likely stems either from their proximity to tumors and the corresponding steep dose gradients, where certain dosimetric constraints are inherently difficult to meet in clinical practice, or from the limitations of the forward dosimetric analysis. These findings underscore the need for both geometric and dosimetric evaluations of OAR auto-segmentation tools to ensure robust validation. Such a comprehensive assessment will be essential as commercial deep learning tools become increasingly integrated into the radiotherapy planning workflow.
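To make the distinction between geometric and dosimetric evaluation concrete, the following is a minimal sketch, not the challenge's evaluation code. It assumes that a forward dosimetric check amounts to sampling a fixed dose grid inside an OAR mask and comparing a maximum or mean dose against a limit. The function names (`dice`, `dose_metric`, `is_compliant`, `compliance_rate`), the toy dose grid, and the 25 Gy limit are all hypothetical and chosen only for illustration.

```python
import numpy as np


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / total)


def dose_metric(dose, mask, kind):
    """Maximum or mean dose (Gy) over the voxels selected by the mask."""
    values = np.asarray(dose, dtype=float)[np.asarray(mask, dtype=bool)]
    if values.size == 0:
        raise ValueError("mask selects no voxels")
    if kind == "max":
        return float(values.max())
    if kind == "mean":
        return float(values.mean())
    raise ValueError("kind must be 'max' or 'mean'")


def is_compliant(dose, mask, kind, limit_gy):
    """True if the dose metric inside the mask does not exceed the limit."""
    return dose_metric(dose, mask, kind) <= limit_gy


def compliance_rate(flags):
    """Percentage of segmentations that met their restriction."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags)


# Toy example: a dose grid with a gradient along the columns (hypothetical values).
dose = np.tile(np.array([10.0, 20.0, 30.0, 40.0]), (3, 1))

reference = np.zeros((3, 4), dtype=bool)
reference[:, 0:2] = True   # reference contour covers columns 0-1

auto = np.zeros((3, 4), dtype=bool)
auto[:, 1:3] = True        # auto-segmentation is shifted by one column

print("DSC:", dice(reference, auto))
print("Dmax reference:", dose_metric(dose, reference, "max"))
print("Dmax auto:", dose_metric(dose, auto, "max"))
print("auto compliant (hypothetical 25 Gy limit):",
      is_compliant(dose, auto, "max", 25.0))
```

In this toy case a one-column shift changes the maximum dose inside the contour from 20 Gy to 30 Gy, so the same geometric overlap can either meet or violate a restriction depending on where the contour sits in the dose gradient. This is one intuition for why geometric and dosimetric metrics need not correlate.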