Accuracy of deep learning-based AI models for early caries lesion detection: the influence of annotation quality and reference choice.

December 4, 2025

Authors

Gonzalez-Valenzuela RE, Mettes P, Loos BG, Marquering H, Berkhout E

Affiliations (7)

  • Department of Oral Radiology, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Gustav Mahlerlaan 3004 (Office 4N-73), Amsterdam, Noord-Holland, 1081 LA, the Netherlands.
  • Department of Biomedical Engineering and Physics, Amsterdam University Medical Center (AUMC), University of Amsterdam and Vrije Universiteit Amsterdam, Meibergdreef 15, Amsterdam, Noord-Holland, 1105 AZ, The Netherlands.
  • VISlab, Informatics Institute, University of Amsterdam (UvA), Science Park 904, Amsterdam, Noord-Holland, 1098 XH, Netherlands.
  • Department of Periodontology, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Gustav Mahlerlaan 3004, Amsterdam, Noord-Holland, 1081 LA, the Netherlands.
  • Department of Biomedical Engineering and Physics, Amsterdam University Medical Center (AUMC), University of Amsterdam and Vrije Universiteit Amsterdam, Meibergdreef 15, Amsterdam, Noord-Holland, 1105 AZ, The Netherlands.
  • Department of Radiology and Nuclear Medicine, Amsterdam University Medical Center (AUMC), University of Amsterdam and Vrije Universiteit Amsterdam, Meibergdreef 15, Amsterdam, Noord-Holland, 1105 AZ, The Netherlands.
  • Department of Oral Radiology, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and Vrije Universiteit Amsterdam, Gustav Mahlerlaan 3004 (Office 4N-73), Amsterdam, Noord-Holland, 1081 LA, the Netherlands.

Abstract

The objective of this study was to assess how the annotation method used during AI model training affects the accuracy of early caries lesion detection, and how the choice of evaluation reference standard leads to significant differences in the assessment of AI model outcomes.

Multiple AI caries lesion segmentation models were trained on the ACTA-DIRECT dataset using annotations from (1) single dentists, (2) aggregated strategies (majority vote, consensus meetings, STAPLE), and (3) micro-CT-based methods. Model accuracy was evaluated using two approaches: (1) comparison against micro-CT-based annotations and (2) comparison against the training-matched annotations. Statistical significance of differences in model diagnostic accuracy across annotation strategies was assessed using the McNemar test.

There was no statistically significant difference in diagnostic accuracy among the AI models when they were evaluated against micro-CT-based annotations. However, diagnostic accuracy was statistically significantly higher when the same models were evaluated against their training-matched annotations.

Our findings indicate a strong influence of reference standards on AI model evaluation. While annotation strategies during training did not significantly affect AI accuracy in caries lesion segmentation, evaluation was subject to bias when models were tested against different reference standards.

CLINICAL RELEVANCE: AI-based tools for caries detection are becoming common in dentistry. This study shows that how these models are evaluated can significantly impact perceived performance. Clinicians and developers should ensure that evaluation standards are independent and clinically relevant to avoid overestimating AI's diagnostic abilities and to build trust for real-world use and regulatory approval.
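
For readers unfamiliar with two of the techniques named in the abstract, the following is a minimal Python sketch of pixel-wise majority voting over several annotators and of a McNemar test on paired model outcomes. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say whether the exact or the chi-square form of the test was used, nor what the unit of analysis was. The function names (`majority_vote`, `discordant_counts`, `mcnemar_exact`) and the toy data are hypothetical, and STAPLE is not shown.

```python
from math import comb


def majority_vote(masks):
    """Pixel-wise majority vote over binary masks.

    masks: list of equal-length lists of 0/1, one per annotator (assumed format).
    A pixel is labelled 1 only if strictly more than half the annotators marked it.
    """
    n = len(masks)
    return [1 if 2 * sum(px) > n else 0 for px in zip(*masks)]


def discordant_counts(reference, pred_a, pred_b):
    """Count the paired cases on which exactly one of two models is correct.

    Returns (b, c): b = A correct and B wrong, c = A wrong and B correct.
    """
    b = sum(1 for r, a, p in zip(reference, pred_a, pred_b) if a == r and p != r)
    c = sum(1 for r, a, p in zip(reference, pred_a, pred_b) if a != r and p == r)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact (binomial) McNemar p-value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data (invented for illustration only).
annotators = [[1, 0, 1], [1, 1, 0], [0, 0, 1]]
fused = majority_vote(annotators)

reference = [1, 1, 0, 0]
model_a = [1, 1, 0, 1]
model_b = [1, 0, 1, 1]
b, c = discordant_counts(reference, model_a, model_b)
p_value = mcnemar_exact(b, c)
```

Running the same comparison twice, once with a micro-CT-based reference and once with each model's training-matched annotations as `reference`, mirrors the two evaluation approaches described above; only the discordant pairs contribute to the test, so a change of reference can change the outcome even when the predictions are identical.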

Topics

Dental Caries, Deep Learning, Artificial Intelligence, Journal Article
