LEARNABLE HIERARCHICAL VISUAL CONTEXTS FOR TUMOR SEGMENTATION IN COMPUTED TOMOGRAPHY IMAGES.

May 20, 2026

DOI: 10.1109/isbi61048.2026.11515689 PMID: 42367197

Authors

Jiang J, Veeraraghavan H

Affiliations (1)

Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, 10065 NY, USA.

Abstract

Despite advances in deep learning (DL), automated tumor segmentation on computed tomography (CT) scans remains challenging for radiotherapy applications due to large variability in tumor shape and appearance and to diffuse tumor boundaries. We present LeVal, learnable visual query contexts that refine attention toward tumor-relevant regions for improved segmentation. LeVal combines task-agnostic learnable tokens, called semantic contexts, with task-specific query tokens. Semantic contexts cross-attend to multi-scale features of a 3D Swin transformer encoder, and the two are jointly subjected to two-stage pretraining: (a) self-supervised learning (SSL) using 14,000 unlabeled CTs and (b) supervised pretraining for multi-organ segmentation using pseudo-contours generated by bespoke methods. Task queries are refined through cross-attention with the semantic contexts and then modulate the decoder output to generate the segmentation. LeVal was evaluated on four public datasets involving pancreas, colon, adrenal, and head-and-neck cancers, where it consistently outperformed existing methods. LeVal also demonstrated stronger embedding separation between tumor and surrounding healthy tissues, indicating better discriminability. Code and model checkpoints will be made available through GitHub upon manuscript acceptance.
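
The token flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which had not been released at the time of the abstract. The single-head attention without learned projections, the residual update, the token counts, the embedding width, the random stand-in features, and the dot-product form of "modulating" the decoder output are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    """Single-head scaled dot-product cross-attention with a residual update.

    queries: (n_q, d), keys_values: (n_kv, d). Returns an array of shape (n_q, d).
    Learned query/key/value projections are omitted (assumption, for brevity).
    """
    d = queries.shape[-1]
    weights = softmax(queries @ keys_values.T / np.sqrt(d), axis=-1)
    return queries + weights @ keys_values

rng = np.random.default_rng(0)
d = 16  # shared embedding width (assumed)

# Hypothetical multi-scale encoder features, each flattened to (num_voxels, d).
# In the paper these come from a 3D Swin transformer encoder.
scales = [rng.standard_normal((n, d)) for n in (512, 64, 8)]

semantic_contexts = rng.standard_normal((4, d))  # task-agnostic tokens (count assumed)
task_queries = rng.standard_normal((2, d))       # task-specific tokens (count assumed)

# Step 1: semantic contexts cross-attend to the encoder features at each scale.
for feats in scales:
    semantic_contexts = cross_attend(semantic_contexts, feats)

# Step 2: task queries are refined by cross-attending to the semantic contexts.
task_queries = cross_attend(task_queries, semantic_contexts)

# Step 3 (assumed form of modulation): per-voxel decoder features are scored
# against each refined task query, giving one logit per query per voxel.
decoder_features = rng.standard_normal((512, d))
logits = decoder_features @ task_queries.T
probs = softmax(logits, axis=-1)
```

Here `probs` holds, for each of the 512 stand-in voxels, a distribution over the two task queries. A trained model would use learned projections and multi-head attention in place of the raw dot products used in this sketch.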

Topics

Journal Article
