RT-SAM: Visual-Prompt Fusion and Uncertainty Enhancement for Nasopharyngeal Carcinoma Radiotherapy Target Delineation.

March 3, 2026

DOI: 10.1109/JBHI.2026.3669979 PMID: 41774629

Authors

Khor HG, Yang X, Sun Y, Huang S, Wang Y, Wang J, Wang S, Bai L, Ma L, Liao H

Abstract

Precise delineation of the clinical target volume (CTV) and nodal CTV (CTV$_{nd}$) is crucial for effective radiotherapy planning in nasopharyngeal carcinoma (NPC). Manual contouring is labor-intensive and subject to substantial inter-observer variability, particularly in regions with complex anatomy and indistinct boundaries. This study presents RT-SAM, a novel framework that adapts the Medical Segment Anything Model 2 (MedSAM-2) for automated CTV (i.e., primary CTV and CTV$_{nd}$) contouring in NPC computed tomography (CT) images. The framework integrates a generalist foundation model (MedSAM-2) with a domain-specific specialist network (2D U-Net) through three principal contributions: (1) automated generation of multi-modal prompts (mask, bounding box, and point representations) derived from specialist network predictions to guide the generalist model; (2) a Visual-Prompt Fusion Attention (ViPFA) mechanism that optimizes feature-prompt interactions through bidirectional cross-modal attention; and (3) an Uncertainty-Enhanced Prediction Adjustment (UEPA) mechanism that enhances model robustness via confidence-based refinement and selective domain adaptation. Comprehensive evaluation on a multi-center cohort of 256 clinical NPC cases from Sun Yat-sen University Cancer Center and 212 public NPC cases from the SegRap2025 lymph node CTV dataset, using 5-fold cross-validation, demonstrates that RT-SAM achieves a mean Dice coefficient of 0.796 $\pm$ 0.033 (mean $\pm$ standard deviation), significantly outperforming current state-of-the-art methods. Clinical validation by eight radiation oncologists demonstrates that RT-SAM contours are clinically indistinguishable from expert delineations in blinded Turing assessments, achieve superior quality ratings in 75% of comparisons (mean scores of 2.73 for RT-SAM versus 2.66 for manual expert contours), and attain clinically acceptable ratings in over 97% of cases.
These results demonstrate that RT-SAM is a clinically feasible solution for automated CTV contouring, with strong potential to standardize treatment planning and mitigate inter-observer variability in NPC radiotherapy.
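
The abstract does not say how the mask, bounding box, and point prompts are computed from the specialist network's output, nor how the Dice coefficient is implemented. As an illustration only, the sketch below shows one common way to derive such prompts from a binary 2D mask and to score a prediction with the standard Dice formula, 2|A ∩ B| / (|A| + |B|). The function names `prompts_from_mask` and `dice`, the (x_min, y_min, x_max, y_max) box convention, and the choice of the foreground pixel nearest the centroid as the point prompt are all assumptions made for this example, not details taken from RT-SAM.

```python
import numpy as np


def prompts_from_mask(mask):
    """Derive box and point prompts from a binary 2D mask.

    Hypothetical helper for illustration; not the RT-SAM implementation.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return None  # nothing predicted on this slice, so no prompts

    rows, cols = np.nonzero(mask)

    # Bounding box as (x_min, y_min, x_max, y_max), inclusive pixel indices.
    box = (int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max()))

    # Point prompt: the foreground pixel closest to the centroid, so the
    # point lies inside the mask even for concave or ring-like shapes.
    cy, cx = rows.mean(), cols.mean()
    nearest = int(np.argmin((rows - cy) ** 2 + (cols - cx) ** 2))
    point = (int(cols[nearest]), int(rows[nearest]))

    return {"mask": mask, "box": box, "point": point}


def dice(pred, target):
    """Standard Dice coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / total)


# Toy example: a 3x2 foreground block in a 5x5 slice.
specialist_mask = np.zeros((5, 5), dtype=bool)
specialist_mask[1:4, 2:4] = True
prompts = prompts_from_mask(specialist_mask)

# A reference contour covering only the left column of that block.
reference = np.zeros((5, 5), dtype=bool)
reference[1:4, 2] = True
score = dice(specialist_mask, reference)
```

In a pipeline of the kind the abstract describes, prompts like these would be passed to the generalist model alongside the image; how RT-SAM encodes and fuses them is the role of the ViPFA mechanism and is not reproduced here.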


Topics

Journal Article
