CeLR: A Transformer-based Regression Network for Accurate Cephalometric Landmark Detection in High-Resolution X-ray Imaging
Authors
Abstract
Reliable localization of cephalometric landmarks is essential for automated orthodontic analysis. Existing approaches are limited either by high computational cost or by complex multi-stage pipelines that hinder end-to-end optimization. In this work, we propose an end-to-end Transformer-based Cephalometric Landmark Regression network (CeLR) for high-resolution X-ray images. First, a feature extractor captures both low-level anatomical details and high-level semantic information. Then, a reference encoder models global dependencies and generates coarse landmark estimates. Next, a finetune decoder employs cross-attention mechanisms to refine these predictions. Additionally, a denoising module is introduced to enhance the robustness of the model. Experiments on multiple public cephalometric datasets demonstrate that CeLR achieves state-of-the-art performance. Specifically, on the ISBI 2015 Challenge Test1 dataset, CeLR achieves a Mean Radial Error (MRE) of 0.98 mm and a 2 mm Success Detection Rate (SDR) of 89.82%. Moreover, it requires only 91.3 GFLOPs, demonstrating a satisfactory balance between accuracy and efficiency. These results highlight the effectiveness and clinical potential of the proposed method. Code is available at: https://github.com/huang229/D-CeLR.
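For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MRE and SDR are commonly computed from predicted and ground-truth landmark coordinates. It is not taken from the CeLR code base: the function names, the `mm_per_pixel` parameter, and the use of `<=` at the threshold are assumptions made for illustration, and the toy coordinates are invented rather than drawn from any dataset.

```python
import numpy as np


def radial_errors_mm(pred_px, gt_px, mm_per_pixel):
    """Per-landmark Euclidean distance between prediction and ground truth, in mm.

    pred_px, gt_px: arrays of shape (num_images, num_landmarks, 2), in pixels.
    mm_per_pixel: physical pixel spacing of the image (assumed parameter name).
    """
    pred = np.asarray(pred_px, dtype=float)
    gt = np.asarray(gt_px, dtype=float)
    return np.linalg.norm(pred - gt, axis=-1) * mm_per_pixel


def mean_radial_error(errors_mm):
    """MRE: mean of the radial errors over all images and landmarks."""
    return float(np.mean(errors_mm))


def success_detection_rate(errors_mm, threshold_mm=2.0):
    """SDR: percentage of landmarks whose radial error is within the threshold.

    Whether the comparison is strict (<) or inclusive (<=) varies between
    implementations; inclusive is assumed here.
    """
    return float(np.mean(np.asarray(errors_mm) <= threshold_mm) * 100.0)


# Toy example: one image, two landmarks, 0.1 mm per pixel.
pred = [[[0.0, 0.0], [30.0, 40.0]]]
gt = [[[0.0, 0.0], [0.0, 0.0]]]
errors = radial_errors_mm(pred, gt, mm_per_pixel=0.1)  # errors of 0 mm and 5 mm
mre = mean_radial_error(errors)
sdr_2mm = success_detection_rate(errors, threshold_mm=2.0)
print(f"MRE = {mre:.2f} mm, SDR@2mm = {sdr_2mm:.1f}%")
```

In this toy case one landmark is exact and the other is off by 5 mm, so the MRE is 2.5 mm and half of the landmarks fall within the 2 mm threshold.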