Artificial Intelligence-Assisted Periapical Radiographic Assessment: Lesion Detection, Endodontic Complication Analysis, and Review of Clinical Treatment Recommendations.

April 20, 2026

Authors

Akyüz İE, Kivircik BE, Aslan T

Affiliations (2)

  • Department of Endodontics, Faculty of Dentistry, Erciyes University, Kayseri, Türkiye.
  • Department of Endodontics, Faculty of Dentistry, Erciyes University, Kayseri, Türkiye. Electronic address: [email protected].

Abstract

Artificial intelligence (AI) systems are increasingly used in dental radiology to support endodontic diagnosis, but their diagnostic reliability across different clinical categories remains unclear. This study compared three vision-language AI models (ChatGPT-5 Plus, Gemini 2.5 Pro, and Copilot Pro) with expert endodontists by assessing sensitivity, specificity, overall diagnostic agreement, and Youden's Index across multiple endodontic conditions. This retrospective diagnostic accuracy study evaluated the relationship between periapical radiographs and treatment decisions, procedural complications, and lesion detection, with expert endodontists serving as the reference standard. Diagnostic categories included primary treatment selection, non-surgical retreatment, final treatment decisions, perforation, underfilling, overfilling, broken file, calcification, and periapical lesion detection. Inter-rater agreement between the endodontists was almost perfect (κ = 0.95). Gemini 2.5 Pro demonstrated the highest diagnostic accuracy, particularly in periapical lesion detection (sensitivity 100%, specificity 88%), while ChatGPT-5 Plus showed similarly strong performance in treatment selection. Copilot Pro exhibited markedly low sensitivity for complications such as perforation and instrument fracture. Kappa values for preoperative and postoperative treatment decisions were high for Gemini 2.5 Pro and ChatGPT-5 Plus, but low for Copilot Pro. The Friedman test confirmed significant differences among the groups (p < 0.001). Overall, the AI systems demonstrated promising diagnostic accuracy in treatment selection and lesion detection but performed less reliably in identifying complex procedural complications. Gemini 2.5 Pro showed the most balanced performance, whereas Copilot Pro displayed the highest variability across diagnostic categories.
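The metrics reported above (sensitivity, specificity, and Youden's Index) all derive from a standard 2×2 confusion matrix. As a minimal sketch of how they relate, the snippet below computes them from hypothetical counts; the specific true/false positive and negative numbers are invented for illustration and are not taken from the study.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Compute sensitivity, specificity, and Youden's Index (J = Se + Sp - 1)
    from 2x2 confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    youden_j = sensitivity + specificity - 1  # ranges from -1 to 1
    return sensitivity, specificity, youden_j

# Hypothetical counts chosen to mirror the reported lesion-detection
# figures (sensitivity 100%, specificity 88%); not the study's actual data.
se, sp, j = diagnostic_metrics(tp=25, fn=0, tn=22, fp=3)
print(f"Sensitivity={se:.2f}, Specificity={sp:.2f}, Youden's J={j:.2f}")
```

A Youden's Index near 1 indicates a test that is both highly sensitive and highly specific, while a value near 0 indicates performance no better than chance.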

Topics

Journal Article
