DDVMM: A dual-branch pyramid model for mono-modal medical image registration.
Authors
Affiliations (3)
- School of Artificial Intelligence (CUIT Shuangliu Industrial College), Chengdu University of Information Technology, Chengdu, 610225, Sichuan, China; Advanced Cryptography and System Security Key Laboratory of Sichuan Province, Chengdu University of Information Technology, Chengdu, 610225, Sichuan, China. Electronic address: [email protected].
- School of Artificial Intelligence (CUIT Shuangliu Industrial College), Chengdu University of Information Technology, Chengdu, 610225, Sichuan, China.
- Ophthalmology Department, West China Hospital of Sichuan University, Chengdu, 610041, Sichuan, China.
Abstract
Medical image registration is pivotal in clinical diagnosis and image-guided intervention. However, convolutional neural network (CNN)- and Transformer-based approaches still struggle to achieve high registration efficiency and accuracy simultaneously, particularly when dealing with large deformations and significant anatomical differences. Therefore, a more robust and efficient registration framework is highly desirable for medical image analysis. We propose DDVMM, a dual-branch pyramidal registration model integrating Transformer and State Space Models (SSMs), to enhance multi-scale feature extraction and global structural perception. The encoder incorporates a Directional Depthwise Convolution (DDC) module to enhance local boundary and directional feature extraction, and a Deformable Multi-Head Self-Attention (D-MHSA) module to model long-range spatial dependencies. The decoder employs a State Space Model-based guided feature interaction module (SSM Decoder) that captures large-scale spatial context through hidden state propagation and fuses multi-scale features to improve structural alignment and detail refinement. The network adopts a context-guided coarse-to-fine deformation estimation strategy, progressively refining deformation predictions to improve overall registration accuracy. Extensive experiments are conducted on two public datasets: the ACDC cardiac dataset and the OASIS-1 brain dataset. The results demonstrate that the proposed method achieves higher average Dice similarity scores and lower average boundary errors than the competing methods. This work presents a novel dual-branch pyramidal registration framework that effectively combines Transformer and State Space Models (SSMs) to enhance both global structural perception and local detail alignment. 
The proposed DDVMM achieves better registration performance in handling large deformations and anatomical differences while using fewer parameters, resulting in a better performance-parameter trade-off. Code is available at https://github.com/wyl32123/DDVMM.
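The abstract reports results as average Dice similarity scores. As a minimal sketch of how such a score is commonly computed between the segmentation of the fixed image and the warped segmentation of the moving image, the following pure-Python example may help. The function names `dice_score` and `mean_dice`, the flattened-list representation, and the toy label maps are illustrative assumptions and are not taken from the DDVMM code.

```python
def dice_score(fixed_labels, warped_labels, label):
    """Dice overlap for one anatomical label between two flattened label maps.

    Dice = 2 * |A ∩ B| / (|A| + |B|), where A and B are the sets of voxels
    carrying `label` in the fixed and warped segmentations.
    Returns None when the label is absent from both maps.
    """
    if len(fixed_labels) != len(warped_labels):
        raise ValueError("label maps must have the same number of voxels")
    in_fixed = [v == label for v in fixed_labels]
    in_warped = [v == label for v in warped_labels]
    intersection = sum(1 for a, b in zip(in_fixed, in_warped) if a and b)
    total = sum(in_fixed) + sum(in_warped)
    if total == 0:
        return None
    return 2.0 * intersection / total


def mean_dice(fixed_labels, warped_labels, labels):
    """Average Dice over the given labels, skipping labels absent from both maps."""
    scores = [dice_score(fixed_labels, warped_labels, lab) for lab in labels]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


# Toy 1-D "volumes" (hypothetical data): 0 is background, 1-3 are structures.
fixed = [0, 1, 1, 2, 2, 2, 0, 3]
warped = [0, 1, 2, 2, 2, 0, 0, 3]

per_label = {lab: dice_score(fixed, warped, lab) for lab in (1, 2, 3)}
average = mean_dice(fixed, warped, [1, 2, 3])
```

A score of 1 means the warped and fixed structures overlap exactly, and 0 means they do not overlap at all. Background is excluded from the average in this sketch, which is a common convention; the convention actually used in the paper is not stated in the abstract.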