Text-guided few-shot liver and tumor segmentation.

May 29, 2026

DOI: 10.3389/fdgth.2026.1823870 PMID: 42294051

Authors

Chen H, Xu A, Zhang L, Ouyang M, Zhu Z, Wang J

Affiliations (3)

Nantong Tumor Hospital/Nantong University Affiliated Tumor Hospital, Nantong, Jiangsu, China.
School of Electronic Information, Wuhan University of Science and Technology, Wuhan, China.
Manteia Technology, Co., Ltd., Xiamen, Fujian, China.

Abstract

High-precision liver and tumor segmentation is a cornerstone of digital oncology, yet its clinical deployment remains constrained by two persistent challenges: the scarcity of pixel-level annotations and severe performance degradation under cross-center domain shift. Although few-shot learning offers a promising direction for data-efficient modeling, existing approaches that rely solely on visual similarity often fail to generalize across heterogeneous clinical environments. In this study, we propose a text-guided few-shot segmentation framework that integrates clinical semantic information as an explicit inductive bias to bridge the gap between low-level pixels and high-level diagnostic reasoning. The framework consists of three synergistic modules: an Automated Semantic Generator that employs a large-scale vision-language model to encode clinically structured semantic descriptions into linguistic priors; a Text-Guided Gating (TGG) mechanism that adaptively modulates visual representations to filter scanner-dependent artifacts; and a Decoupled Prototype Learner that constructs unbiased class prototypes via per-image averaging and gradient detachment to address extreme class imbalance. Together, these components transform segmentation from a purely visual matching task into a knowledge-guided reasoning process, improving both data efficiency and cross-dataset robustness. Experiments on LiTS and 3DIRCADb show that the method consistently outperforms state-of-the-art supervised, foundation-model-based, and few-shot baselines. On 3DIRCADb, compared to PANet, the strongest few-shot baseline, our method improves external liver Dice by 8.7 percentage points and external tumor Dice by 26.3 percentage points, effectively mitigating the performance degradation seen in conventional supervised models. These results demonstrate that cross-modal semantic guidance improves the robustness of medical image segmentation under domain shift.
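
The abstract names the Text-Guided Gating and per-image prototype averaging steps but gives no formulas, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes the gate is a per-channel sigmoid of a linear projection of the text embedding and that prototypes come from masked average pooling. All function names, `weight`, and `bias` are hypothetical. Gradient detachment is not shown because NumPy has no autograd; in an autograd framework the prototype would be wrapped in a stop-gradient operation.

```python
import numpy as np


def text_guided_gate(feats, text_emb, weight, bias):
    """Scale each visual channel by a text-derived gate in (0, 1).

    feats: (C, H, W) visual features; text_emb: (D,) text embedding;
    weight: (C, D) and bias: (C,) are hypothetical projection parameters.
    """
    gate = 1.0 / (1.0 + np.exp(-(weight @ text_emb + bias)))  # shape (C,)
    return feats * gate[:, None, None]


def masked_average(feats, mask):
    """Mean feature vector over the pixels where mask is True."""
    m = mask.astype(feats.dtype)
    return (feats * m).sum(axis=(1, 2)) / m.sum()


def per_image_prototype(feat_list, mask_list):
    """Average one prototype per support image, so every image counts
    equally no matter how many foreground pixels it contains."""
    protos = [masked_average(f, m)
              for f, m in zip(feat_list, mask_list) if m.any()]
    if not protos:
        raise ValueError("no support image contains the class")
    return np.mean(protos, axis=0)


def pooled_prototype(feat_list, mask_list):
    """Contrast case: pool all foreground pixels across images, which
    lets images with large foreground regions dominate."""
    num = sum((f * m.astype(f.dtype)).sum(axis=(1, 2))
              for f, m in zip(feat_list, mask_list))
    den = sum(m.sum() for m in mask_list)
    return num / den


# Toy support set: image A has 9 foreground pixels with feature value 1,
# image B has a single foreground pixel with feature value 3.
feats_a = np.full((1, 3, 3), 1.0)
mask_a = np.ones((3, 3), dtype=bool)
feats_b = np.full((1, 3, 3), 3.0)
mask_b = np.zeros((3, 3), dtype=bool)
mask_b[1, 1] = True

balanced = per_image_prototype([feats_a, feats_b], [mask_a, mask_b])
pooled = pooled_prototype([feats_a, feats_b], [mask_a, mask_b])
print(balanced, pooled)
```

In this toy case the per-image average is 2.0, while pooling all pixels gives 1.2, pulled toward the image with the larger foreground. This illustrates why per-image averaging can help when tumor sizes vary widely across support images.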

Topics

Journal Article
