Text-guided few-shot liver and tumor segmentation.

May 29, 2026 | PubMed

Authors

Chen H, Xu A, Zhang L, Ouyang M, Zhu Z, Wang J

Affiliations (3)

  • Nantong Tumor Hospital/Nantong University Affiliated Tumor Hospital, Nantong, Jiangsu, China.
  • School of Electronic Information, Wuhan University of Science and Technology, Wuhan, China.
  • Manteia Technology, Co., Ltd., Xiamen, Fujian, China.

Abstract

High-precision liver and tumor segmentation is a cornerstone of digital oncology, yet its clinical deployment remains constrained by two persistent challenges: the scarcity of pixel-level annotations and severe performance degradation under cross-center domain shift. Although few-shot learning offers a promising direction for data-efficient modeling, existing approaches that rely solely on visual similarity often fail to generalize across heterogeneous clinical environments. In this study, we propose a text-guided few-shot segmentation framework that integrates clinical semantic information as an explicit inductive bias to bridge the gap between low-level pixels and high-level diagnostic reasoning. The framework consists of three synergistic modules: an Automated Semantic Generator that employs a large-scale vision-language model to encode clinically structured semantic descriptions into linguistic priors; a Text-Guided Gating (TGG) mechanism that adaptively modulates visual representations to filter scanner-dependent artifacts; and a Decoupled Prototype Learner that constructs unbiased class prototypes via per-image averaging and gradient detachment to address extreme class imbalance. Together, these components transform segmentation from a pure visual matching task into a knowledge-guided reasoning process, improving both data efficiency and cross-dataset robustness. Experiments on LiTS and 3DIRCADb show that the method consistently outperforms state-of-the-art supervised, foundation-model-based, and few-shot baselines. On 3DIRCADb, compared with PANet, the strongest few-shot baseline, our method improves external liver Dice by 8.7 percentage points and external tumor Dice by 26.3 percentage points, effectively mitigating the performance degradation seen in conventional supervised models. These results demonstrate that cross-modal semantic guidance improves the robustness of medical image segmentation under domain shift.
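
The abstract gives no formulas for the gating or prototype modules, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes that the gating is a channel-wise sigmoid gate driven by a text embedding, and that "per-image averaging" means computing one masked-average prototype per support image before averaging across images. The function names and the projection matrix `w` are hypothetical. Gradient detachment is omitted because NumPy has no autodiff; in a framework such as PyTorch, the prototype tensor would be detached with `.detach()`.

```python
import numpy as np

def text_guided_gate(visual, text_emb, w):
    """Scale each visual channel by a gate derived from the text embedding.

    visual: (C, H, W) feature map; text_emb: (D,) text embedding;
    w: (C, D) projection matrix (hypothetical, would be learned in practice).
    """
    gate = 1.0 / (1.0 + np.exp(-(w @ text_emb)))  # (C,), each value in (0, 1)
    return visual * gate[:, None, None]

def masked_average(feat, mask):
    """Mean feature vector over the foreground pixels of one image."""
    return (feat * mask[None]).sum(axis=(1, 2)) / mask.sum()

def per_image_prototype(feats, masks):
    """Assumed reading of per-image averaging: one prototype per support
    image, then an unweighted mean, so a tiny lesion counts as much as a
    large one. Images without foreground are skipped."""
    protos = [masked_average(f, m) for f, m in zip(feats, masks) if m.sum() > 0]
    return np.mean(protos, axis=0)

def pooled_prototype(feats, masks):
    """Comparison baseline: pool all foreground pixels across images, so
    images with large foreground regions dominate the prototype."""
    total = sum((f * m[None]).sum(axis=(1, 2)) for f, m in zip(feats, masks))
    return total / sum(m.sum() for m in masks)

# Toy support set: image A has a 100-pixel lesion with feature value 1.0,
# image B has a 1-pixel lesion with feature value 3.0.
feat_a, mask_a = np.ones((1, 10, 10)), np.ones((10, 10))
feat_b, mask_b = 3.0 * np.ones((1, 10, 10)), np.zeros((10, 10))
mask_b[0, 0] = 1.0

balanced = per_image_prototype([feat_a, feat_b], [mask_a, mask_b])
pooled = pooled_prototype([feat_a, feat_b], [mask_a, mask_b])
print(balanced, pooled)
```

In this toy case the per-image prototype is 2.0, midway between the two lesions, while the pooled prototype is 103/101 (about 1.02) and is almost entirely determined by the large lesion. This is one plausible way per-image averaging could counter class imbalance between large and small targets.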

Topics

Journal Article
