Artificial intelligence-assisted decision-making in third molar assessment using ChatGPT: is it really a valid tool?

Authors

Grinberg N, Ianculovici C, Whitefield S, Kleinman S, Feldman S, Peleg O

Affiliations (5)

  • Department of Otolaryngology Head and Neck Surgery and Maxillofacial Surgery, Tel-Aviv Sourasky Medical Center, Grey School of Medicine, Tel Aviv University, 64239, Tel Aviv, Israel. [email protected].
  • Hashaked 20, 9074017, Mevasseret Zion, Israel. [email protected].
  • Department of Otolaryngology Head and Neck Surgery and Maxillofacial Surgery, Tel-Aviv Sourasky Medical Center, Grey School of Medicine, Tel Aviv University, 64239, Tel Aviv, Israel.
  • Department of Otolaryngology Head and Neck Surgery and Maxillofacial Surgery, Oral Medicine Unit, Tel-Aviv Sourasky Medical Center, Tel Aviv, Israel.
  • The Maurice and Gabriela Goldschleger School of Dental Medicine, Tel Aviv University, 6997801, Tel Aviv, Israel.

Abstract

Artificial intelligence (AI) is becoming increasingly popular in medicine. The current study aims to investigate whether an AI-based chatbot, such as ChatGPT, can be a valid tool for assisting decision-making when assessing mandibular third molars before extraction. Panoramic radiographs were collected from a publicly available library, and each mandibular third molar was assessed by position and depth. Two specialists evaluated each case regarding the need for CBCT referral; all cases were then presented to ChatGPT under a uniform script to decide whether further CBCT imaging was needed. The process was performed in three rounds: first without any guidelines, second after introducing the guidelines presented by Rood et al. (1990), and third with additional test cases. ChatGPT's and the specialist's decisions were compared and analyzed using Cohen's kappa test and the Cochran-Mantel-Haenszel test to account for the effect of different tooth positions; all analyses were performed at a 95% confidence level. The study evaluated 184 molars. Without any guidelines, ChatGPT agreed with the specialist in 49% of cases, with no statistically significant agreement (kappa < 0.1); agreement rose to 70% and 91%, with moderate (kappa = 0.39) and near-perfect (kappa = 0.81) agreement, respectively, after the second and third rounds (p < 0.05). The high correlation between the specialist and the chatbot was preserved when analyzed by tooth location and position (p < 0.01). ChatGPT demonstrated the ability to analyze third molars prior to surgical intervention using accepted guidelines, with substantial correlation to specialists.
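The abstract's analysis pairs Cohen's kappa (chance-corrected agreement between the specialist and ChatGPT) with a Cochran-Mantel-Haenszel (CMH) test stratified by tooth position. As a rough illustration of how such an analysis could be set up, here is a minimal Python sketch using scikit-learn and statsmodels; the decision vectors and position strata below are invented placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.contingency_tables import StratifiedTable

# Hypothetical per-molar decisions: 1 = "refer to CBCT", 0 = "no CBCT needed".
# These values are illustrative only, not taken from the study.
specialist = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1])
chatgpt    = np.array([1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1])
# Stratum label for each molar, e.g. an index into tooth-position categories.
position   = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

# Cohen's kappa: agreement between the two raters beyond chance.
kappa = cohen_kappa_score(specialist, chatgpt)
print(f"Cohen's kappa: {kappa:.2f}")

# CMH test: build one 2x2 table (specialist decision x ChatGPT decision)
# per tooth-position stratum, then test for association across strata.
tables = []
for p in np.unique(position):
    mask = position == p
    table = np.zeros((2, 2))
    for s, c in zip(specialist[mask], chatgpt[mask]):
        table[s, c] += 1
    tables.append(table)

cmh = StratifiedTable(tables).test_null_odds(correction=True)
print(f"CMH statistic: {cmh.statistic:.2f}, p-value: {cmh.pvalue:.3f}")
```

Stratifying by position before testing, as sketched here, checks that the specialist-chatbot association is not an artifact of one tooth location, which parallels the abstract's claim that agreement was preserved across locations and positions.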

Topics

Journal Article
