Comparison of the diagnostic accuracy of dentists and ChatGPT in jawbone lesions.

May 18, 2026

Authors

Satan NS, Pamukçu U, Türk BE, Akarslan Z, Kemaloğlu SA, Peker I

Affiliations (5)

  • Department of Dentomaxillofacial Radiology, Gazi University Faculty of Dentistry, Bishkek St. (8th St.) 1st St. No:8, Emek, Ankara, 06490, Turkey.
  • Department of Dentomaxillofacial Radiology, Gazi University Faculty of Dentistry, Bishkek St. (8th St.) 1st St. No:8, Emek, Ankara, 06490, Turkey.
  • Department of Dentomaxillofacial Surgery, Gazi University Faculty of Dentistry, Ankara, Turkey.
  • Department of Statistics, Ankara University Faculty of Science, Ankara, Turkey.
  • Department of Dentomaxillofacial Radiology, Autism and Developmental Disorders Research Center, Gazi University Faculty of Dentistry, Ankara, Turkey.

Abstract

Artificial intelligence (AI) is leading to a significant paradigm shift in medical imaging and diagnostic sciences. In particular, Chat Generative Pre-trained Transformer (ChatGPT) is finding increasing application in diagnostic processes due to its ability to generate clinical outcomes. This study aims to evaluate the diagnostic accuracy of ChatGPT for jawbone lesions and to compare it with that of Oral and Maxillofacial Radiologists (OMFR), Oral and Maxillofacial Surgeons (OMFS), and general dentists. Thirty cases of jawbone lesions for which clinical information, panoramic radiographs, and histopathological diagnoses were available were selected. A questionnaire covering the participants' (OMFR, OMFS, and general dentists) demographic information and the cases' clinical findings and panoramic radiographs was prepared and distributed via electronic communication channels. The same cases were loaded into ChatGPT-4, which was asked to generate a preliminary diagnosis for each. The data were statistically analyzed using the Wilcoxon signed-rank, Mann-Whitney U, and Kruskal-Wallis tests at a significance level of p < 0.05. Overall, ChatGPT's diagnostic accuracy was limited to 46.67%, while the OMFR (67.71%) and OMFS (58.96%) groups had statistically significantly higher success rates than ChatGPT (p < 0.05). General dentists (37.85%) had lower or similar diagnostic accuracy compared to ChatGPT in most subgroups (gender, age, workplace, professional experience). ChatGPT demonstrated moderate diagnostic accuracy. While OMFR and OMFS participants had significantly higher accuracy rates than ChatGPT, ChatGPT generally outperformed general dentists. These results indicate that such AI systems cannot replace specialist clinicians but can provide valuable contributions as supportive tools that enhance diagnosis.
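As a reading aid for the reported percentages, the sketch below shows how a per-rater accuracy figure over the 30 cases could be computed. It is an illustration only, not the authors' analysis code: the function name `accuracy_percent` is hypothetical, and the count of 14 correct diagnoses is an assumption inferred from the arithmetic (14 of 30 is 46.67%), presuming a single ChatGPT response per case. The group figures for OMFR, OMFS, and general dentists are presumably averages over many participants and cannot be reconstructed this way.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return diagnostic accuracy as a percentage, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return round(100 * correct / total, 2)


# Assumption: one response per case, so 46.67% over 30 cases
# would correspond to 14 correct preliminary diagnoses.
TOTAL_CASES = 30
ASSUMED_CORRECT = 14

chatgpt_accuracy = accuracy_percent(ASSUMED_CORRECT, TOTAL_CASES)
print(f"{chatgpt_accuracy:.2f}%")
```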

Topics

Journal Article
