Memory efficient training for 3D brain image registration networks using PatchMorph.
Authors
Affiliations (6)
- Department of Informatics, Faculty of Informatics, Matsuyama University, Matsuyama, Ehime, Japan. [email protected].
- Brain Image Analysis Unit, RIKEN Center for Brain Science, Wako, Saitama, Japan. [email protected].
- Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland.
- Laboratory for Molecular Mechanisms of Brain Development, RIKEN Center for Brain Science, Wako, Saitama, Japan.
- Department of Biological Functions and Regulation, CIEM (Central Institute of Experimental Medicine and Life Science), Kawasaki, Japan.
- Department of Radiology, Medical Physics, and Department of Stereotactic and Functional Neurosurgery, Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
Abstract
We present PatchMorph, a novel stochastic framework designed to significantly reduce memory demand during training and inference of existing unsupervised convolutional and transformer-based registration networks for 3D brain image registration. PatchMorph extends existing methods to handle global registration tasks that are impractical at large image sizes and with differing array dimensions, and it is also designed to handle varying voxel resolutions, which are common in real-world datasets. It supports the integration of VoxelMorph-like architectures, whether CNN- or transformer-based, and, in experiments, reduced the training memory requirement of a transformer-based network on [Formula: see text] voxel images from around 40 GB to under 10 GB while maintaining performance. PatchMorph decouples spatial logic from network architecture. It operates by matching patches of constant size across multiple scales, from coarse to fine, combining global transformations with local deformations. Unlike conventional multiscale cascade networks, PatchMorph performs geometrically linked, patchwise cascading in world coordinate space. Patch placement and resolution are managed independently of the registration network, enabling the reuse of compact networks with a minimal memory footprint. At finer scales, each patch zooms in on a region of a coarser-level patch, ensuring spatial hierarchy and continuity. This patch-based, coarse-to-fine refinement in world coordinates is technically challenging, as patches at different scales capture varying image content and must propagate deformation information coherently. PatchMorph addresses these challenges through a modular architecture that preserves spatial consistency across scales and orientations. Experiments on human T1 MRI and marmoset brain images from serial two-photon tomography demonstrate that PatchMorph achieves state-of-the-art registration performance while maintaining a significantly reduced memory footprint.
By enabling the use of sophisticated architectures on large, high-resolution, or heterogeneously sampled data, PatchMorph removes a major bottleneck in the development and deployment of deep learning-based image registration networks. While this patch-based approach incurs a higher inference time than single-shot full-volume networks, it remains significantly faster than classical iterative methods.
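The coarse-to-fine patch geometry described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the patch size, zoom factor, and world extent below are assumed values chosen only to show how constant-size patches at finer scales cover progressively smaller, nested world-space regions at progressively finer voxel spacing.

```python
# Illustrative sketch (assumptions, not PatchMorph's actual code) of
# nested, constant-size patches in world coordinates along one axis.

PATCH_SIZE = 32  # assumed constant patch edge length in voxels at every scale


def patch_spacing(world_extent_mm, scale, zoom=2.0):
    """Voxel spacing (mm) of a patch at a given scale.

    At scale 0 a single patch covers the whole extent; each finer
    scale covers 1/zoom of the previous extent (zoom is assumed).
    """
    covered = world_extent_mm / (zoom ** scale)
    return covered / PATCH_SIZE


def child_patch(parent_origin_mm, parent_extent_mm, center_frac, zoom=2.0):
    """World-space origin and extent of a child patch zooming into its parent.

    center_frac in [0, 1] picks where inside the parent the child is
    centred; the child covers 1/zoom of the parent's extent, so it
    stays geometrically nested inside the coarser-level patch.
    """
    child_extent = parent_extent_mm / zoom
    center = parent_origin_mm + center_frac * parent_extent_mm
    return center - child_extent / 2.0, child_extent


if __name__ == "__main__":
    world = 192.0  # mm, assumed whole-brain extent along one axis
    origin, extent = 0.0, world
    for scale in range(3):
        print(f"scale {scale}: origin={origin:.1f} mm, "
              f"extent={extent:.1f} mm, "
              f"spacing={extent / PATCH_SIZE:.2f} mm/voxel")
        origin, extent = child_patch(origin, extent, 0.5)
```

Because patch placement is handled outside the network in world coordinates, the same compact registration network can be reused at every scale; only the sampled image content and voxel spacing change.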