Prompt-based Dynamic Token Pruning to Guide Transformer Attention in Efficient Segmentation

June 19, 2025

Authors

Pallabi Dutta, Anubhab Maity, Sushmita Mitra

Abstract

The high computational demands of Vision Transformers (ViTs) in processing a large number of tokens often constrain their practical application in medical image analysis. This research proposes an adaptive prompt-guided pruning method to selectively reduce the processing of irrelevant tokens in the segmentation pipeline. The prompt-based spatial prior helps rank the tokens according to their relevance. Tokens with low relevance scores are down-weighted, ensuring that only the relevant ones are propagated to subsequent stages. This data-driven pruning strategy facilitates end-to-end training, maintains gradient flow, and improves segmentation accuracy by focusing computational resources on essential regions. The proposed framework is integrated with several state-of-the-art models to eliminate irrelevant tokens, thereby enhancing computational efficiency while preserving segmentation accuracy. The experimental results show a reduction of approximately 35-55% in tokens, thus reducing computational costs relative to the baselines. Cost-effective medical image processing with our framework facilitates real-time diagnosis and expands its applicability in resource-constrained environments.
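The ranking and down-weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how relevance is computed, so everything below is an assumption made for illustration: the Gaussian distance-to-prompt score, the sigmoid gate, and the names `relevance_scores`, `soft_mask`, `down_weight`, and `kept_indices` are hypothetical and are not taken from the paper.

```python
import math


def relevance_scores(grid_h, grid_w, prompt_rc, sigma=2.0):
    """Hypothetical spatial prior: score each token of a grid_h x grid_w
    patch grid by a Gaussian falloff with distance from a prompt point."""
    pr, pc = prompt_rc
    scores = []
    for r in range(grid_h):
        for c in range(grid_w):
            d2 = (r - pr) ** 2 + (c - pc) ** 2
            scores.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    return scores


def soft_mask(scores, threshold=0.5, temperature=0.05):
    """Assumed gating: a sigmoid around the threshold gives each token a
    weight in (0, 1). A smooth gate, unlike a hard cut, stays
    differentiable, which is one way to keep gradients flowing."""
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temperature))
            for s in scores]


def down_weight(tokens, mask):
    """Scale each token's feature vector by its gate value."""
    return [[m * x for x in tok] for tok, m in zip(tokens, mask)]


def kept_indices(mask, cutoff=0.5):
    """Indices of tokens a later stage would still process."""
    return [i for i, m in enumerate(mask) if m > cutoff]


# Usage: an 8x8 token grid with a single point prompt at row 3, column 3.
scores = relevance_scores(8, 8, (3, 3))
mask = soft_mask(scores)
tokens = [[1.0, 2.0] for _ in range(64)]  # toy 2-dimensional features
weighted = down_weight(tokens, mask)
kept = kept_indices(mask)
print(f"kept {len(kept)} of {len(mask)} tokens")
```

In this toy setting the fraction of tokens kept depends entirely on the assumed `sigma` and `threshold` values, so it should not be compared with the reduction reported in the abstract.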


Topics

cs.CV
