Self-supervised learning enhances periapical films segmentation with limited labeled data.

Authors

Hu M, Zhang Q, Wei Z, Jia P, Yuan M, Yu H, Yin XC, Peng J

Affiliations (3)

  • School of Computer and Communication Engineering, University of Science and Technology Beijing, No. 30 Academy Road, Haidian District, Beijing 100083, China.
  • Department of the Fourth Clinical Division, Peking University School and Hospital of Stomatology & National Center of Stomatology & National Clinical Research Center for Oral Disease & National Engineering Laboratory for Digital and Material Technology of Stomatology, No. 41, Jiatai International Building, Dongsihuanzhong Road, Chaoyang District, Beijing 100081, China.
  • School of Computer and Communication Engineering, University of Science and Technology Beijing, No. 30 Academy Road, Haidian District, Beijing 100083, China. Electronic address: [email protected].

Abstract

Accurate periapical film segmentation typically relies on large-scale, costly labeled datasets and suffers from annotation variability. This study develops a self-supervised learning framework that requires only limited labeled data, enhancing practical applicability while reducing extensive manual annotation effort.

The research proposes a two-stage framework: 1) Self-supervised pre-training: a Vision Transformer (ViT), initialized with weights from the DINOv2 model pre-trained on 142M natural images (LVD-142M), undergoes further self-supervised pre-training on our dataset of 74,292 unlabeled periapical films using student-teacher contrastive learning. 2) Fine-tuning: the domain-adapted ViT is fine-tuned with a Mask2Former head on only 229 labeled films to segment seven critical dental structures (tooth, pulp, crown, fillings, root canal fillings, caries, periapical lesions).

The domain-adapted self-supervised method significantly outperformed traditional fully supervised models such as U-Net and DeepLabV3+ (average Dice coefficient: 74.77% vs. 33.53%-41.55%, an 80%-123% relative improvement). Cross-validated comparison with cutting-edge SSL methods demonstrated the superiority of our DINOv2-based approach (74.77 ± 1.87%) over MAE (72.53 ± 1.90%), MoCov3 (65.92 ± 1.68%), and BEiTv3 (65.17 ± 1.77%). The method also surpassed its supervised Mask2Former counterpart with statistical significance (p < 0.01).

This two-stage, domain-specific self-supervised framework effectively learns robust anatomical features and enables accurate, reliable periapical film segmentation from very limited annotations, addressing the scarcity of labeled data in medical imaging. It provides a feasible pathway for developing AI-assisted diagnostic tools that can improve diagnostic accuracy through consistent segmentation and enhance workflow efficiency by reducing manual analysis time, especially in resource-constrained dental practices.
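The Dice coefficient reported above is the standard overlap metric for segmentation: Dice = 2|A∩B| / (|A| + |B|) for a predicted mask A and ground-truth mask B. A minimal sketch of how per-class scores and their macro-average could be computed is shown below; the exact averaging scheme (per class, then across classes) is an assumption, since the abstract does not specify it.

```python
def dice(pred, target):
    """Dice coefficient for two flat binary masks (sequences of 0/1).

    Dice = 2 * |A intersect B| / (|A| + |B|); returns 1.0 when both
    masks are empty (a common convention for absent structures).
    """
    inter = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * inter / total


def mean_dice(per_class_scores):
    """Macro-average Dice over classes (e.g. the seven dental structures)."""
    return sum(per_class_scores) / len(per_class_scores)


# Toy example: half the predicted foreground overlaps the ground truth.
print(dice([1, 1, 0, 0], [1, 0, 1, 0]))  # -> 0.5
```

In practice these masks would be per-class channels of the Mask2Former output compared against the annotations, and the reported 74.77% corresponds to this macro-average expressed as a percentage.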

Topics

Journal Article
