Patch2Space: a registration-free segmentation method for misaligned multimodal medical images.
Authors
Affiliations (4)
- Beihang University, No. 37, Xueyuan Road, Haidian District, Beijing, 100191, CHINA.
- School of Computer, Beihang University, No. 37, Xueyuan Road, Haidian District, Beijing, 100191, CHINA.
- The Musculoskeletal Tumor Center, Peking University, No. 11 Xizhimen South Street, Xicheng District, Beijing, 100871, CHINA.
- Chinese PLA General Hospital, No. 28 Fuxing Road, Haidian District, Beijing, 100853, CHINA.
Abstract
Multimodal images contain complementary information that is valuable for deep learning (DL)-based image segmentation. To enable effective multimodal feature learning and fusion for accurate segmentation, multimodal images usually need to be registered to achieve anatomical alignment. However, in clinical settings, multimodal image registration is often challenging. For instance, to reduce radiation exposure, CT scans usually have a smaller field of view (FoV) than MR scans; the resulting inconsistent anatomical content between the CT and MR images hinders accurate registration. With such misaligned multimodal images, segmentation performance can be significantly degraded. This study aims to develop a DL-based multimodal image segmentation method that learns high-quality, strongly related image features from misaligned multimodal images without registration and produces segmentation results comparable to those obtained with well-aligned multimodal images. In our method, a unified body space (UBS) module is presented, in which image patches cropped from misaligned modalities are encoded by position and projected into a unified body space, thereby largely mitigating the misalignment among multimodal images. Built upon the UBS module, a new spatial-attention mechanism is proposed and integrated into a multilevel feature fusion (MFF) module, in which features learned from misaligned multimodal images are effectively fused at the internal, spatial, and modal levels, bringing the segmentation of misaligned multimodal images to a high accuracy level. We validate our method on public and in-house multimodal image datasets containing 1472 patients. Experimental results demonstrate that our method outperforms state-of-the-art (SOTA) methods. The ablation study further confirms that the UBS module accurately projects image patches from different modalities into the unified body space.
Moreover, the internal-, spatial-, and modal-level feature fusion in the MFF module substantially enhances segmentation accuracy for misaligned multimodal images. Code is available at https://github.com/BH-MICom/Patch2Space.
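The core UBS idea, encoding each patch's position and projecting it into a space shared by all modalities, can be illustrated with a minimal geometric sketch. This is purely a hypothetical example of the concept, not the paper's actual UBS module: the function name, the use of scan origin/voxel spacing, and the bounding-box normalization are all assumptions for illustration. It shows how a CT patch with a small, offset FoV and an MR patch with a full FoV can map to the same coordinate when they cover the same anatomy.

```python
def patch_to_body_space(patch_index, patch_size_vox, spacing_mm,
                        origin_mm, body_origin_mm, body_extent_mm):
    """Map a patch grid index to a normalized coordinate in a shared body space.

    Hypothetical sketch: the names and the normalization scheme are
    assumptions, not Patch2Space's actual implementation.
    """
    coord = []
    for i, p, s, o, bo, be in zip(patch_index, patch_size_vox, spacing_mm,
                                  origin_mm, body_origin_mm, body_extent_mm):
        center_mm = o + (i + 0.5) * p * s   # patch center in scanner frame (mm)
        coord.append((center_mm - bo) / be)  # normalize by a body bounding box
    return tuple(coord)

# MR covers the full body; CT has a smaller FoV whose origin is offset by
# (64, 64, 128) mm, so its patch grid starts deeper inside the body.
mr = patch_to_body_space((2, 2, 4), (32, 32, 32), (1.0, 1.0, 1.0),
                         (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (400.0, 400.0, 800.0))
ct = patch_to_body_space((0, 0, 0), (32, 32, 32), (1.0, 1.0, 1.0),
                         (64.0, 64.0, 128.0), (0.0, 0.0, 0.0), (400.0, 400.0, 800.0))
# Both patches cover the same anatomy, so mr == ct == (0.2, 0.2, 0.18)
```

In the paper's setting the projection is learned rather than computed from scanner metadata, but the effect is the same: patches from misaligned modalities that depict the same anatomy land at nearby positions in the unified body space, so fusion no longer requires prior registration.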