Structure-Aware DeformConv and Hierarchical Multi-scale Transformer for Medical Image Registration.

February 2, 2026 · PubMed

Authors

Wang X, Qin Y, Pan Z, Qiu H, Zhang Z, Zheng Y, Xing L, Yin X, Zhao S

Affiliations (9)

  • College of Electronic and Information Engineering, Hebei University, Baoding, China. [email protected].
  • Key Laboratory of Digital Medical Engineering of Hebei Province, Baoding, China. [email protected].
  • College of Electronic and Information Engineering, Hebei University, Baoding, China.
  • Key Laboratory of Digital Medical Engineering of Hebei Province, Baoding, China.
  • Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China.
  • Affiliated Hospital of Hebei University, Baoding, China.
  • Hebei Key Laboratory of Precise Imaging of Inflammation Related Tumors, Baoding, China.
  • Affiliated Hospital of Hebei University, Baoding, China. [email protected].
  • Hebei Key Laboratory of Precise Imaging of Inflammation Related Tumors, Baoding, China. [email protected].

Abstract

Multimodal medical image registration is not only an indispensable step in medical image analysis but also plays a crucial role in disease diagnosis and treatment planning. However, the complex and unknown spatial deformation relationships between different organs and modalities pose significant challenges. To address this problem, this study proposes SA-HMT, an unsupervised, discriminator-free multimodal registration method based on a dual loss function. To tackle the challenge of cross-modal feature matching, the proposed multi-scale skip Transformer module employs a hierarchical architecture to capture multi-scale deformation features. In the shallow layers, a multi-scale skip pyramid module extracts modality-independent local structural features through parallel multi-branch convolutions, effectively overcoming the differing expression of edges and textures across modalities. In the deep layers, a Transformer module establishes long-range dependencies via a self-attention mechanism, enabling adaptive fusion of local deformation features with global semantics and alleviating the difficulty of matching cross-modal structural features. In addition, this study proposes a structure-aware deformable convolution module whose two-stage "feature perception, then offset generation" mechanism improves feature-matching accuracy through progressive collaboration between the two stages. The effectiveness of SA-HMT has been verified on five public datasets (covering chest and abdominal CT-MR, lung CT, brain CT-MR, and cardiac MRI) as well as clinical abdominal data. Compared with the advanced method R2Net, our model improves core metrics such as DSC, and its registration accuracy is generally comparable or better.
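The "feature perception, then offset generation" idea behind deformable convolution can be illustrated with a minimal single-channel NumPy sketch, assuming per-tap (dy, dx) offsets and bilinear sampling; the function names and the zero-offset toy input are illustrative, not the paper's actual implementation (in practice the offsets would be predicted by a learned, structure-aware branch):

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Sample feat (H, W) at fractional coordinates (y, x) with bilinear interpolation."""
    H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1
    wy, wx = y - y0, x - x0
    def px(yy, xx):  # zero padding outside the map
        return feat[yy, xx] if 0 <= yy < H and 0 <= xx < W else 0.0
    return ((1 - wy) * (1 - wx) * px(y0, x0) + (1 - wy) * wx * px(y0, x1)
            + wy * (1 - wx) * px(y1, x0) + wy * wx * px(y1, x1))

def deformable_conv3x3(feat, weights, offsets):
    """
    Minimal single-channel 3x3 deformable convolution.
    offsets has shape (H, W, 9, 2): a (dy, dx) shift per kernel tap,
    standing in for a learned "offset generation" stage.
    """
    H, W = feat.shape
    out = np.zeros_like(feat)
    taps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for k, (dy, dx) in enumerate(taps):
                oy, ox = offsets[i, j, k]
                acc += weights[k] * bilinear_sample(feat, i + dy + oy, j + dx + ox)
            out[i, j] = acc
    return out

feat = np.arange(36, dtype=float).reshape(6, 6)   # toy feature map
weights = np.full(9, 1.0 / 9.0)                   # 3x3 averaging kernel
offsets = np.zeros((6, 6, 9, 2))                  # zero offsets -> plain 3x3 convolution
out = deformable_conv3x3(feat, weights, offsets)
```

With zero offsets this reduces to an ordinary 3x3 convolution; nonzero offsets let each tap sample off-grid, which is what allows the kernel to follow anatomical structure instead of a rigid square neighborhood.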

Topics

Journal Article
