Structure-Aware DeformConv and Hierarchical Multi-scale Transformer for Medical Image Registration.
Authors
Affiliations (9)
- College of Electronic and Information Engineering, Hebei University, Baoding, China. [email protected].
- Key Laboratory of Digital Medical Engineering of Hebei Province, Baoding, China. [email protected].
- College of Electronic and Information Engineering, Hebei University, Baoding, China.
- Key Laboratory of Digital Medical Engineering of Hebei Province, Baoding, China.
- Department of Liver Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing, China.
- Affiliated Hospital of Hebei University, Baoding, China.
- Hebei Key Laboratory of Precise Imaging of Inflammation Related Tumors, Baoding, China.
- Affiliated Hospital of Hebei University, Baoding, China. [email protected].
- Hebei Key Laboratory of Precise Imaging of Inflammation Related Tumors, Baoding, China. [email protected].
Abstract
Multimodal medical image registration is not only an indispensable processing step in medical image analysis but also plays a crucial role in disease diagnosis and treatment planning. However, the complex and unknown spatial deformation relationships between different organs and modalities pose significant challenges to multimodal image registration. To address this problem, this study proposes SA-HMT, an unsupervised, discriminator-free multimodal registration method based on a dual loss function. Specifically, to address the challenge of cross-modal feature matching, the proposed multi-scale skip Transformer module employs a hierarchical architecture to capture multi-scale deformation features. In the shallow network, a multi-scale skip pyramid module extracts modality-independent local structural features through parallel multi-branch convolution, effectively overcoming the differing expression of edges and textures across modalities. In the deep network, a Transformer module establishes long-range dependencies via a self-attention mechanism, enabling adaptive fusion of local deformation features with global semantics and effectively alleviating the difficulty of matching cross-modal structural features. In addition, this study proposes a structure-aware deformable convolution module whose two-stage "feature perception-offset generation" mechanism improves feature-matching accuracy through the progressive collaboration of the two stages. The effectiveness of SA-HMT has been fully verified on five public datasets (covering chest and abdominal CT-MR, lung CT, brain CT-MR, and cardiac MRI) and clinical abdominal data. Compared with the state-of-the-art method R2Net, our model achieves improvements in core metrics such as DSC, and its registration accuracy is generally comparable or better.
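To make the offset-guided sampling behind deformable convolution concrete, the following minimal NumPy sketch (a hypothetical illustration, not the authors' implementation) shows the core operation: each tap of a 3x3 kernel is displaced by a predicted (dy, dx) offset, and the feature map is sampled bilinearly at the shifted, non-integer positions. In SA-HMT these offsets would come from the feature-perception stage; here they are simply passed in as an array.

```python
import numpy as np

def bilinear_sample(feat, y, x):
    """Bilinearly sample a single-channel map feat (H, W) at continuous
    coordinates (y, x); positions outside the map contribute zero."""
    H, W = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < H and 0 <= xx < W:
                # Weight each corner by its distance to the query point.
                val += (1 - abs(y - yy)) * (1 - abs(x - xx)) * feat[yy, xx]
    return val

def deform_conv2d_naive(feat, weight, offsets):
    """Naive 3x3 deformable convolution on a single-channel map.

    feat:    (H, W) input feature map
    weight:  (3, 3) kernel
    offsets: (H, W, 9, 2) learned (dy, dx) added to each kernel tap
    """
    H, W = feat.shape
    out = np.zeros((H, W))
    taps = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for k, (ky, kx) in enumerate(taps):
                dy, dx = offsets[i, j, k]
                # Sample at the regular grid position plus the learned offset.
                acc += weight[ky + 1, kx + 1] * bilinear_sample(
                    feat, i + ky + dy, j + kx + dx)
            out[i, j] = acc
    return out
```

With all offsets set to zero this reduces to an ordinary 3x3 convolution, which is a useful sanity check; in practice the offsets let the kernel's sampling grid bend to follow anatomical structures rather than a fixed square neighborhood.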