FM-Adapt: Foundation model adaptation with photoacoustic-supervised learning for interventional ultrasound.

March 21, 2026 | PubMed

Authors

Hasan J, Rajendran P, Cai Y, Pramanik M

Affiliations (3)

  • Department of Computer Science, Iowa State University, Ames, IA, USA.
  • Department of Radiation Oncology, Winship Cancer Institute, Emory University, Atlanta, GA, USA.
  • Department of Electrical and Computer Engineering, Iowa State University, Ames, IA, USA.

Abstract

Foundation models (FMs), such as the Segment Anything Model (SAM), show remarkable capabilities for general-purpose segmentation through large-scale pre-training. However, a substantial domain shift limits their effectiveness in complex medical imaging. Here we introduce FM-Adapt, the first parameter-efficient adaptation of an FM (a SAM-based vision transformer) into a resolution-agnostic architecture with photoacoustic (PA)-supervised learning for dual-target interventional ultrasound (US) segmentation. We demonstrate FM-Adapt in the context of PA-supervised interventions, specifically US-guided needle tracking and simultaneous target identification (breast tumor segmentation). We train once with this unified adaptation framework to produce two specialized sets of model weights: USPA-SAM for real-time needle tracking and BT-SAM for breast tumor segmentation. The framework keeps the pre-trained encoder components frozen and fine-tunes only the mask decoder, allowing the model to process native (256 × 256) clinical images without spatial degradation while achieving state-of-the-art performance with high computational efficiency. USPA-SAM achieves a mean modified Hausdorff distance (MHD) of 0.34 mm, a targeting error (TE) of 0.83 mm, and a 100% needle localization success rate (NLSR), outperforming baselines by 3-17× in spatial precision. On tumor segmentation, BT-SAM achieves Dice scores of 93.6% and 96.3%, along with IoU scores of 89.2% and 94.0%, demonstrating strong generalization to unseen data. Our models achieve a 27× improvement in computational efficiency, processing native clinical images at 34 FPS on a single GPU and enabling real-time clinical adaptation.
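For readers unfamiliar with the segmentation metrics quoted in the abstract, the following is a minimal sketch of how Dice, IoU, and a modified Hausdorff distance can be computed. It is not the authors' evaluation code: the function names `dice_and_iou` and `modified_hausdorff` are hypothetical, the MHD uses the common Dubuisson-Jain form (the maximum of the two directed mean nearest-neighbour distances), which is an assumption because the abstract does not state the exact definition, and distances are in pixel units unless scaled by a pixel spacing.

```python
import math


def dice_and_iou(pred, truth):
    """Dice and IoU for two binary masks given as equally sized nested lists of 0/1."""
    inter = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            truth_sum += 1 if t else 0
    union = pred_sum + truth_sum - inter
    if union == 0:
        # Both masks empty: treated as perfect agreement (a convention, not from the paper).
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + truth_sum)
    iou = inter / union
    return dice, iou


def modified_hausdorff(a, b):
    """Modified Hausdorff distance between two non-empty 2-D point sets.

    Assumed form: max of the two directed mean nearest-neighbour distances.
    """
    def mean_min(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return max(mean_min(a, b), mean_min(b, a))


# Toy data, purely illustrative (not taken from the paper).
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dice, iou = dice_and_iou(pred_mask, true_mask)

pred_needle = [(0, 0), (1, 0)]
true_needle = [(0, 1), (1, 1)]
mhd_pixels = modified_hausdorff(pred_needle, true_needle)
```

To report the MHD in millimetres, as the abstract does, the pixel-unit value would be multiplied by the image's pixel spacing, which is not given here.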

Topics

Journal Article
