RMT-match: an unsupervised 3D medical image registration network based on RMT and wavelet convolution.

June 10, 2026

Authors

Shen J, Wei G, Tian Y

Affiliations (2)

  • Business School, University of Shanghai for Science and Technology, Shanghai 200093, People's Republic of China.
  • School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, People's Republic of China.

Abstract

Deformable image registration plays a crucial role in medical image analysis. Medical image registration (MIR) models based on the vision transformer (ViT) can establish long-range dependencies between patches, but their core component, self-attention, lacks important spatial priors. In addition, the U-Net-style down-sampling used in MIR networks loses spatial information, and conventional convolutional down-sampling is more sensitive to high-frequency information. High-frequency information such as image edges and details matters most for 3D MIR tasks, but low-frequency components such as the overall image contour are also crucial during multi-scale sampling. To address these issues, this paper proposes a novel deformable MIR framework called RMT-Match. The framework extends the Retentive Networks Meet Vision Transformers (RMT) structure to 3D. It refines the spatial attenuation matrix in RMT, which is based on the Manhattan distance, to enhance the self-attention mechanism and thereby introduce 3D spatial prior information. It also adopts RMT's decomposed form of attention to reduce the cost of global modeling. In addition, the paper proposes a 3D wavelet convolutional down-sampling module that produces multi-frequency responses to the input data, compensating for the deficiencies of ordinary convolutional down-sampling. In experiments on the IXI and OASIS datasets, RMT-Match proved to be a promising approach to MIR. Compared with the conventional CNN-based VoxelMorph method, RMT-Match improved performance by 5.3% on the IXI dataset and 2.7% on the OASIS dataset. Compared with the state-of-the-art transformer-based method TransMatch, it achieved gains of 0.1% and 0.7% on the two datasets while reducing the parameter count by 40%, balancing model performance with computational efficiency. These results support the advantages and potential of the method for MIR tasks.
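
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract gives no formulas, so everything below is an assumption: the function names, the decay rate gamma, the form gamma ** (Manhattan distance) for the spatial attenuation matrix (a direct 3D reading of the distance-based decay the abstract describes, without the paper's refinement), and the use of a fixed one-level Haar wavelet in place of the paper's learned wavelet convolution module.

```python
import numpy as np


def manhattan_decay_mask(shape, gamma=0.9):
    """Return an (N, N) matrix gamma ** d(n, m) for the N tokens of a 3D grid.

    d(n, m) is the Manhattan distance |dz| + |dy| + |dx| between tokens n and m.
    Assumption: gamma is a fixed scalar in (0, 1); the paper's refined matrix
    may differ.
    """
    axes = [np.arange(s) for s in shape]
    # Coordinates of every token, flattened to shape (N, 3).
    coords = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    # Pairwise Manhattan distances, shape (N, N).
    dist = np.abs(coords[:, None, :] - coords[None, :, :]).sum(axis=-1)
    return np.power(gamma, dist)


def haar_split(x, axis):
    """One-level orthonormal Haar transform along one axis.

    Returns (low, high), each half as long as x along that axis.
    The axis length must be even.
    """
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)


def haar_downsample_3d(volume):
    """Split a (D, H, W) volume with even sides into 8 half-resolution sub-bands.

    Keys are three letters, one per axis: "L" for low-pass, "H" for high-pass,
    so "LLL" holds the coarse content and "HHH" the finest detail.
    """
    bands = {"": volume}
    for axis in range(3):
        split = {}
        for name, band in bands.items():
            low, high = haar_split(band, axis)
            split[name + "L"] = low
            split[name + "H"] = high
        bands = split
    return bands


if __name__ == "__main__":
    mask = manhattan_decay_mask((2, 2, 2), gamma=0.5)
    print(mask.shape)  # one row and one column per token
    bands = haar_downsample_3d(np.random.rand(4, 4, 4))
    print(sorted(bands))  # names of the eight sub-bands
```

In a retention-style attention layer, a mask like this would multiply the attention weights element-wise, so that tokens far apart in the volume contribute less. The sub-bands show why a wavelet-based down-sampler can keep both low-frequency structure ("LLL") and high-frequency detail (the other seven bands) at half resolution, whereas a single strided convolution has to fold everything into one response.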

Topics

Journal Article
