A Single Reference-Guided Adaptation of Foundation Model Predictions for High-Performance Image Segmentation
Authors
Abstract
Foundation models (FMs) have attracted significant attention and shown substantial potential across a broad range of applications. However, applying them to specific domains such as biomedical imaging often requires further fine-tuning to reach acceptable prediction accuracy. Fine-tuning typically demands large labeled datasets and substantial computational resources, which poses a major barrier to widespread adoption. Here, we introduce a fundamentally different strategy, called reference-guided adaptation (RGA), which enables ultra-data-efficient and interpretable adaptation of FM predictions for a specific testing (i.e., inference) sample using only a single training (i.e., reference) example. RGA aligns the reference and inference samples by leveraging their semantic relationship and trains a lightweight refinement model to improve the FM prediction while leaving the FM itself unmodified. We demonstrate the potential of the RGA framework through a series of medical image segmentation studies at different anatomical sites using three popular FMs: SAM, MedSAM, and SAM2. Our results indicate that, by effectively leveraging task-specific knowledge from a single reference, RGA helps narrow the performance gap between generic FM predictions and task-specific segmentation requirements in limited-data settings. This addresses the "last-mile" challenge in FM deployment and paves the way for ultra-data-efficient and explainable AI modeling.