Multiteacher Knowledge Distillation for Canine Scoring Using Dental Panoramic Radiographs to Support Primary Care.

July 3, 2026 · PubMed

Authors

C H R,Mayya V,Sivakumar V,Patil V,Singhal A

Affiliations (4)

  • Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India.
  • Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India. Electronic address: [email protected].
  • Department of Oral Medicine and Radiology, Manipal College of Dental Sciences, Manipal Academy of Higher Education, Manipal, India.
  • College of Dentistry, University of Florida, Gainesville, Florida, USA.

Abstract

Automated classification of radiographic findings remains challenging due to limited annotated datasets and interobserver variability. Artificial intelligence-assisted diagnostic technologies can improve access to health care, especially in areas with limited radiologic expertise, while contributing to universal health care. This study examined whether a student model trained with multiteacher knowledge distillation (KD) outperforms individual teacher models in a 4-class radiographic grading task for impacted maxillary canines on panoramic radiographs. Three ResNet18 teacher models were trained independently on different preprocessing variants of the same dental panoramic radiograph (DPR) dataset: jaw region-of-interest crops, sharpness-enhanced crops, and original full DPR images. A student ResNet18 was then trained using KD on the averaged soft logits of all 3 frozen teachers. All models used ImageNet-pretrained weights and were evaluated on a held-out test set (n = 92). The teacher models obtained test accuracies of 63.04%, 57.61%, and 59.78% (area under the curve [AUC], 0.73, 0.72, and 0.80). The student model surpassed all teachers, with 79.35% accuracy, weighted F1 = 0.77, macro F1 = 0.71, Cohen's κ = 0.58, Matthews correlation coefficient = 0.60, and AUC = 0.88 under equal teacher weighting; it further improved to 80.43% accuracy and AUC = 0.89 with differential teacher weighting. Multiteacher KD with complementary DPR preprocessing variants resulted in a student model that outperformed each teacher, demonstrating its suitability for automated radiographic scoring with limited annotated data. Such approaches may support access to primary care by enabling reliable automated diagnostics in resource-constrained clinical environments.
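As a rough illustration of the distillation step described in the abstract, the sketch below combines the logits of several frozen teachers (equal weighting by default, optional per-teacher weights) and scores a student against the resulting soft targets. It is not the authors' implementation. The abstract does not report the temperature, the loss formulation, or the teacher weights, so `temperature`, `alpha`, the added cross-entropy term, and all function names here are assumptions that follow the commonly used soft-target KD loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one sample's class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits, temperature=1.0):
    """Numerically stable log of the temperature-scaled softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return [z - log_sum for z in scaled]


def combine_teacher_logits(teacher_logits, weights=None):
    """Weighted average of per-teacher logits; equal weights if none given."""
    if weights is None:
        weights = [1.0] * len(teacher_logits)
    total = sum(weights)
    num_classes = len(teacher_logits[0])
    return [
        sum(w * t[c] for w, t in zip(weights, teacher_logits)) / total
        for c in range(num_classes)
    ]


def distillation_loss(student_logits, teacher_logits, label,
                      weights=None, temperature=4.0, alpha=0.5):
    """Soft-target KD loss for one sample.

    temperature and alpha are assumed hyperparameters, not values from
    the paper. The loss mixes KL(teacher || student) on softened
    distributions with cross-entropy on the ground-truth label.
    """
    soft_target = softmax(
        combine_teacher_logits(teacher_logits, weights), temperature
    )
    student_log_soft = log_softmax(student_logits, temperature)
    kl = sum(
        p * (math.log(p) - log_q)
        for p, log_q in zip(soft_target, student_log_soft)
        if p > 0
    )
    cross_entropy = -log_softmax(student_logits)[label]
    # T^2 keeps the soft-target gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy


# Hypothetical logits for one image from three teachers (4 classes).
teachers = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [2.0, 2.0, 2.0, 2.0]]
student = [0.5, 1.5, 2.5, 0.0]
equal_loss = distillation_loss(student, teachers, label=2)
weighted_loss = distillation_loss(student, teachers, label=2,
                                  weights=[0.2, 0.3, 0.5])
```

Passing `weights` corresponds to the differential teacher weighting mentioned in the abstract; the values shown are placeholders, since the actual weights are not given there.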

Topics

Journal Article
