Enhancing CNN regressors with contour encoding and self-supervision for improved 3D/2D X-ray to CT registration in spinal surgery navigation.
Authors
Affiliations (4)
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, No. 99, Xuefu Road, Huqiu District, Suzhou, Jiangsu Province, 215009, China.
- School of Biological Science and Medical Engineering, Southeast University, Medical Engineering Building, Longshan Campus, No. 2 Southeast University Road, Jiangning District, Nanjing, Jiangsu Province, 211189, China.
- School of Artificial Intelligence and Computer Science, Jiangnan University, 1800 Lihu Avenue, Wuxi, Jiangsu Province, 214122, China.
- Department of Orthopaedics, General Hospital of Central Theater Command, PLA, Wuhan, Hubei, 430070, China.
Abstract
With advances in deep learning, regression-based methods have shown promising results in 3D/2D medical image registration. However, strict intraoperative radiation dose constraints produce low-dose X-ray images with severe blur and reduced contrast, which significantly degrade registration accuracy and limit precise image-guided spinal interventions. We propose the Contour Feature Encoding Regressor (CER), a novel end-to-end CNN framework that extracts highly discriminative features directly from binary contour masks of intraoperative X-rays, without any restrictions on contour length, shape, or morphology. These contour features are encoded by a dedicated module and fused into the regression pipeline to improve robustness against image degradation. To further enhance pose estimation, CER employs a dual-branch architecture that explicitly decouples rotational and translational parameters, thereby reducing mutual interference and improving overall accuracy. In addition, a self-supervised fine-tuning strategy with a tailored multi-component loss function is introduced to adapt the model to blurred low-dose conditions and minimize residual errors. On low-dose X-ray images, CER achieves a mean target registration error (mTRE) of 1.39 mm, a clinically acceptable level, while outperforming state-of-the-art methods in accuracy and running in real time (0.03 to 0.06 s per frame on clinically accessible GPUs). These improvements meet the stringent precision and speed requirements of intraoperative navigation, offering strong potential to enhance surgical safety and outcomes in minimally invasive spinal procedures.
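For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean target registration error is commonly computed: the mean Euclidean distance between target points mapped by the estimated pose and by the ground-truth pose. The abstract does not state the exact definition or target points used for CER, so the function names (`rotation_z`, `apply_pose`, `mean_tre`), the pose representation, and the sample points below are illustrative assumptions, not the authors' implementation.

```python
import math


def rotation_z(angle_deg):
    """Return a 3x3 rotation matrix about the z-axis (illustrative helper)."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_pose(rotation, translation, point):
    """Map a 3D point through a rigid pose: R @ p + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def mean_tre(targets, pose_est, pose_gt):
    """Mean distance between targets under the estimated and true poses.

    Each pose is a (rotation, translation) pair; units follow the inputs
    (millimetres here, by assumption).
    """
    errors = [
        math.dist(apply_pose(*pose_est, p), apply_pose(*pose_gt, p))
        for p in targets
    ]
    return sum(errors) / len(errors)


# Hypothetical target points in mm (not taken from the paper).
targets = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]

pose_gt = (rotation_z(0.0), (0.0, 0.0, 0.0))
pose_est = (rotation_z(0.0), (1.0, 0.0, 0.0))  # estimate is off by 1 mm along x

mtre = mean_tre(targets, pose_est, pose_gt)
print(f"mTRE = {mtre:.2f} mm")
```

In this toy case the estimate differs from the ground truth by a pure 1 mm translation, so every target is displaced by the same amount. With a rotational error, points farther from the rotation axis contribute larger distances, which is why mTRE is sensitive to where the targets are placed.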