Foundation model-enhanced unsupervised 3D deformable medical image registration.

January 8, 2026

Authors

Jiang Z, Zhang Z, Xing L, Ren L, Dai X

Affiliations (5)

  • Radiation Oncology, Stanford University, 875 Blake Wilbur Drive, Stanford, California, 94305-5101, UNITED STATES.
  • Duke University, Durham, 27708-0187, UNITED STATES.
  • Department of Radiation Oncology, Stanford University Medical Center, 875 Blake Wilbur Drive, Stanford, California, 94305-5847, UNITED STATES.
  • Department of Radiation Oncology, University of Maryland School of Medicine, Baltimore, Maryland, 21201, UNITED STATES.
  • Department of Radiation Oncology, Stanford University, Stanford, 94305-2004, UNITED STATES.

Abstract

Unsupervised deep learning has shown great promise in deformable image registration (DIR). These methods update model weights to optimize image similarity without requiring ground truth deformation vector fields (DVFs). However, they are inherently ill-conditioned because of structural ambiguities. This study aims to address this issue by integrating the implicit anatomical understanding of vision foundation models into a multi-scale unsupervised framework for accurate and robust DIR. Our method takes moving and fixed images as inputs and uses a pre-trained encoder from a vision foundation model to extract latent features. These features are merged with those extracted by convolutional adaptors to incorporate inductive bias. Correlation-aware multi-layer perceptrons decode the features into DVFs. A pyramid architecture captures multi-range dependencies, further enhancing the robustness and accuracy of DIR. We evaluated our method using a multi-modality, cross-institutional database consisting of 150 cardiac cine MR and 40 liver CT. Our model generates realistic and accurate DVFs. Moving images deformed by our method showed excellent similarity to fixed images, achieving a registration Dice score of 0.869 ± 0.093 for cardiac MRI and an average landmark error of 1.60 ± 1.44 mm for liver CT, substantially outperforming state-of-the-art methods. Ablation studies further verified that integrating foundation features improves DIR accuracy (p < 0.05). Our approach demonstrates significant advances in DIR for multi-modality images with complex structures and low contrast, making it a powerful tool for a wide range of applications in medical image analysis.
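
The two evaluation metrics reported in the abstract, the Dice score and the average landmark error, can be illustrated with a minimal sketch. The abstract does not state how the authors computed them, so the function names `dice_score` and `mean_landmark_error`, the binary-mask input, and the `spacing_mm` voxel-spacing parameter are all assumptions made for illustration; this is not the authors' evaluation code.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks (hypothetical helper).

    Dice = 2 * |A and B| / (|A| + |B|); 1.0 means perfect overlap.
    """
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def mean_landmark_error(warped_pts, fixed_pts, spacing_mm=(1.0, 1.0, 1.0)):
    """Mean Euclidean distance in mm between paired landmarks (hypothetical helper).

    Points are (N, 3) arrays in voxel coordinates; spacing_mm converts
    each axis to millimetres before the distance is taken.
    """
    diff = np.asarray(warped_pts, dtype=float) - np.asarray(fixed_pts, dtype=float)
    diff_mm = diff * np.asarray(spacing_mm, dtype=float)
    return float(np.linalg.norm(diff_mm, axis=1).mean())
```

In this sketch, a segmentation of the moving image warped by the predicted DVF would be compared with the fixed-image segmentation via `dice_score`, and landmarks mapped through the DVF would be compared with their fixed-image counterparts via `mean_landmark_error`.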

Topics

Journal Article
