MedSapiens: Taking a pose to rethink medical imaging landmark detection.

March 7, 2026


DOI: 10.1016/j.media.2026.104015 PMID: 41830658

Authors

Elbatel M, Wang A, Liu K, Mouheb K, Almar-Munoz E, Lin L, Yang Y, Lekadir K, Li X

Affiliations (9)

The Hong Kong University of Science and Technology, Hong Kong, China. Electronic address: [email protected].
The Hong Kong University of Science and Technology, Hong Kong, China. Electronic address: [email protected].
The University of Hong Kong, Hong Kong, China. Electronic address: [email protected].
Erasmus MC, Rotterdam, Netherlands. Electronic address: [email protected].
Medical University of Innsbruck, Innsbruck, Austria. Electronic address: [email protected].
The University of Hong Kong, Hong Kong, China. Electronic address: [email protected].
The University of Hong Kong, Hong Kong, China.
Artificial Intelligence in Medicine Lab, Universitat de Barcelona, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain. Electronic address: [email protected].
The Hong Kong University of Science and Technology, Hong Kong, China. Electronic address: [email protected].

Abstract

Accurate anatomical landmark detection is crucial for medical image analysis, yet progress in the medical domain is constrained by the scarcity of large, diverse datasets and by methods that rely heavily on domain-specific priors. Notably, human landmark detection models trained on large and diverse datasets offer spatial localization abilities that conceptually align with medical landmark detection. In this study, we investigate the adaptation of Sapiens, a human-centric foundation model designed for pose estimation, to medical imaging through a multi-dataset pretraining strategy, establishing new state-of-the-art performance across multiple benchmarks. Our proposed model, MedSapiens, demonstrates that human-centric foundation models, originally optimized for spatial pose localization, provide strong and transferable priors for anatomical landmark detection. We evaluate MedSapiens across six tasks spanning three imaging modalities. On the internal landmark detection benchmarks, MedSapiens achieves up to 5.26% improvement over generalist foundation models and up to 21.81% improvement over specialist methods. To assess cross-domain generalization, we further evaluate MedSapiens on two novel external downstream tasks: a dental CBCT landmark detection task and an echocardiography video measurement estimation task. MedSapiens achieves a 2.69% relative gain in success detection rate on the dental CBCT task and up to 43% reduction in measurement error compared with state-of-the-art methods. Code and model weights are available at https://github.com/xmed-lab/MedSapiens.
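The abstract reports results as a success detection rate and as relative gains, but does not define either. The sketch below shows one common way such figures are computed. It is not the authors' evaluation code. It assumes landmarks are given as pixel coordinates with isotropic spacing, that a landmark counts as detected when its radial error falls within a chosen threshold, and that relative gain is measured against the baseline value. The function names, the threshold, and the sample coordinates are all hypothetical.

```python
import math


def radial_errors(pred, target, spacing_mm=1.0):
    """Euclidean distance per landmark, scaled by an assumed isotropic spacing."""
    return [math.dist(p, t) * spacing_mm for p, t in zip(pred, target)]


def success_detection_rate(pred, target, threshold_mm, spacing_mm=1.0):
    """Fraction of landmarks whose radial error is within threshold_mm (assumed definition)."""
    errors = radial_errors(pred, target, spacing_mm)
    return sum(e <= threshold_mm for e in errors) / len(errors)


def relative_gain(new, baseline):
    """Relative change of `new` over `baseline`, in percent (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Hypothetical predictions and ground truth for four landmarks (pixel coordinates).
pred = [(10.0, 10.0), (20.0, 21.5), (30.0, 35.0), (40.0, 40.5)]
target = [(10.0, 11.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]

sdr = success_detection_rate(pred, target, threshold_mm=2.0)
gain = relative_gain(sdr, baseline=0.60)  # 0.60 is an illustrative baseline, not a reported value
print(f"SDR@2mm = {sdr:.2f}, relative gain = {gain:.1f}%")
```

For an error metric, where lower is better, the same relative formula gives a negative value, so a reduction would be reported as its magnitude.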


Topics

Anatomic Landmarks; Image Processing, Computer-Assisted; Journal Article
