Pro-Tuning: Prototype Tuning of Foundation Models for Volumetric Medical Image Segmentation
Authors
Abstract
Accurate volumetric medical image segmentation plays a crucial role in identifying and analyzing human organs, tissues, and lesion areas. It serves as a critical foundation for clinical diagnosis and treatment planning, guides surgical procedures, and facilitates early disease intervention. Recently, foundation models have been widely applied to volumetric medical image segmentation, achieving remarkable results. Nevertheless, neither the direct application of foundation models nor fine-tuning them with point or box prompts for specific medical segmentation tasks has yielded satisfactory results. In this paper, we propose a simple yet efficient method named Pro-Tuning for tuning medical foundation models. Using a pre-trained Prototype Insight Network, prototype features are extracted at the semantic level of the target organ without introducing additional prompts. Furthermore, to overcome the observed limitations in the applicability of the Prototype Insight Network and adapt it to tasks in regions densely populated with multiple target organs, we introduce a Prototype Projection Network that uses target position-encoded image embeddings to predict two projection parameters that tailor the prototype features. Without additional prompts, our method greatly improves the tuning performance of medical foundation models on specific volumetric segmentation tasks. We validate our framework on 13 medical datasets covering the brain, neck, chest, and abdomen. Our method surpasses other foundation-model fine-tuning methods on ten major organs across these datasets, achieving an average Dice score of 83.26%.
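The abstract describes a Prototype Projection Network that predicts two projection parameters from position-encoded image embeddings to tailor a prototype feature. The paper does not specify how the parameters are applied; the following is a minimal NumPy sketch under the assumption of a FiLM-style scale-and-shift modulation, with all shapes, weights, and names (`W_scale`, `W_shift`, `pos_encoded_embed`) hypothetical placeholders rather than the authors' architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: N voxel tokens with D-dimensional embeddings.
N, D = 64, 32

prototype = rng.standard_normal(D)               # semantic-level prototype feature
pos_encoded_embed = rng.standard_normal((N, D))  # target position-encoded image embeddings

# A toy "projection network": pooled embedding -> two projection parameters.
# (Assumed linear heads; the actual network in the paper is not specified here.)
W_scale = rng.standard_normal((D, D)) * 0.01
W_shift = rng.standard_normal((D, D)) * 0.01
pooled = pos_encoded_embed.mean(axis=0)

scale = 1.0 + pooled @ W_scale   # multiplicative projection parameter
shift = pooled @ W_shift         # additive projection parameter

# Tailor the prototype to the target region (assumed scale-and-shift modulation).
tailored_prototype = scale * prototype + shift

# Per-token logits from similarity with the tailored prototype, thresholded to a mask.
logits = pos_encoded_embed @ tailored_prototype
mask = logits > 0
```

The design choice illustrated here is that the prototype itself stays fixed while a lightweight head conditions it on the image, which is one common way to disambiguate densely packed target organs without extra prompts.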