Transformer-based cardiac substructure segmentation from contrast and non-contrast computed tomography for radiotherapy planning.

June 2, 2026

DOI: 10.1016/j.phro.2026.101011 PMID: 42293110

Authors

Rangnekar A, Mankuzhy N, Willmann J, Seo Choi CM, Wu A, Thor M, Rimner A, Veeraraghavan H

Affiliations (3)

Department of Medical Physics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA.
Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA.
Department of Radiation Oncology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, German Cancer Consortium (DKTK), Partner Site DKTK-Freiburg, Freiburg, Germany.

Abstract

Accurate segmentation of cardiac substructures on computed tomography (CT) scans is essential for radiotherapy planning. This study evaluated whether pretrained transformers enabled data-efficient training using a fixed architecture with balanced curriculum learning while achieving robust generalization to imaging and patient variations. A hybrid pretrained transformer-convolutional network, self-distilled masked image transformer (SMIT), was fine-tuned using lung cancer patient scans (Cohort I, training N = 180) and tested on held-out Cohort I lung cancer scans (testing N = 60) and breast cancer scans (Cohort II, N = 65). Two configurations were evaluated: SMIT-Balanced (32 contrast-enhanced CTs, 32 non-contrast CTs) and SMIT-Oracle (180 CTs). Performance was compared with nnU-Net and TotalSegmentator. Segmentation accuracy was assessed primarily using the 95th percentile Hausdorff distance (HD95), along with radiation dose and overlap-based metrics as secondary endpoints. SMIT-Balanced approached SMIT-Oracle performance despite using 64% fewer training scans, with mean HD95 of 6.6 versus 5.4 mm in Cohort I and 10.0 versus 9.4 mm in Cohort II. On the Cohort I held-out test set, SMIT-Balanced mean HD95 was within 1.0 mm of nnU-Net. Cross-cohort testing showed larger accuracy degradation with nnU-Net than SMIT-Balanced (62% versus 50%, absolute change 4.5 mm versus 3.4 mm). Dose metrics derived from SMIT-Balanced were equivalent to manual delineations. Balanced curriculum training reduced labeled data requirements within the SMIT architecture. SMIT-Balanced was comparable to nnU-Net on Cohort I held-out data and showed smaller cross-cohort HD95 degradation.
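
The primary endpoint above, HD95, can be illustrated with a minimal sketch. This is not the authors' evaluation code: it assumes each contour is available as a list of surface points in millimetres, uses a linearly interpolated percentile, and takes the larger of the two directed 95th-percentile distances, which is one common convention among several. The function names `percentile`, `directed_distances`, and `hd95` are hypothetical.

```python
import math


def percentile(values, q):
    """Linearly interpolated q-th percentile (0-100) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (q / 100.0) * (len(ordered) - 1)
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def directed_distances(source, target):
    """Distance from each source point to its nearest target point."""
    return [min(math.dist(p, q) for q in target) for p in source]


def hd95(surface_a, surface_b):
    """Symmetric HD95: the larger of the two directed 95th percentiles.

    Assumption: points are (x, y, z) coordinates in mm, so the result is in mm.
    """
    if not surface_a or not surface_b:
        raise ValueError("both surfaces need at least one point")
    a_to_b = percentile(directed_distances(surface_a, surface_b), 95)
    b_to_a = percentile(directed_distances(surface_b, surface_a), 95)
    return max(a_to_b, b_to_a)


# Toy example: a predicted contour shifted 2 mm from the manual one.
manual = [(float(i), 0.0, 0.0) for i in range(21)]
predicted = [(float(i), 2.0, 0.0) for i in range(21)]
shift_mm = hd95(manual, predicted)
```

The brute-force nearest-point search is quadratic in the number of surface points, so a real evaluation would more likely use distance transforms or a spatial index. Taking the 95th percentile rather than the maximum makes the metric less sensitive to a few outlying surface points.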

Topics

Journal Article
