General Methods Make Great Domain-specific Foundation Models: A Case-study on Fetal Ultrasound

June 24, 2025

arXiv: 2506.19552v1

Authors

Jakob Ambsdorf, Asbjørn Munk, Sebastian Llambias, Anders Nymark Christensen, Kamil Mikolaj, Randall Balestriero, Martin Tolsgaard, Aasa Feragen, Mads Nielsen

Abstract

With access to large-scale, unlabeled medical datasets, researchers are confronted with two questions: Should they attempt to pretrain a custom foundation model on this medical data, or use transfer learning from an existing generalist model? And, if a custom model is pretrained, are novel methods required? In this paper we explore these questions by conducting a case-study, in which we train a foundation model on a large regional fetal ultrasound dataset of 2M images. By selecting the well-established DINOv2 method for pretraining, we achieve state-of-the-art results on three fetal ultrasound datasets, covering data from different countries, classification, segmentation, and few-shot tasks. We compare against a series of models pretrained on natural images, ultrasound images, and supervised baselines. Our results demonstrate two key insights: (i) Pretraining on custom data is worth it, even if smaller models are trained on less data, as scaling in natural image pretraining does not translate to ultrasound performance. (ii) Well-tuned methods from computer vision are making it feasible to train custom foundation models for a given medical domain, requiring no hyperparameter tuning and little methodological adaptation. Given these findings, we argue that a bias towards methodological innovation should be avoided when developing domain-specific foundation models under common computational resource constraints.

Topics

cs.CV
