Comparison of GPT-4 omni and physicians for accurate evaluation of traumatic extremity x-rays: A feasibility study.

March 27, 2026

DOI: 10.1097/MD.0000000000048176 PMID: 41894291

Authors

Ağaçkiran İ, Ağaçkiran M

Affiliations (2)

Department of Emergency Medicine, School of Medicine, Hitit University, Corum, Turkey.
Department of Emergency Medicine, Corum Erol Olcok Training and Research Hospital, Corum, Turkey.

Abstract

GPT-4 omni (GPT-4o) is the latest version of ChatGPT, developed by OpenAI, and provides image interpretation capabilities. In trauma patients, x-rays are the simplest imaging method, and research is ongoing on their interpretation using artificial intelligence. No published studies have evaluated the accuracy of GPT-4o in x-ray interpretation; hence, the aim of this study was to determine the diagnostic accuracy of GPT-4o in x-ray interpretation and compare its performance with that of emergency medicine specialists, emergency medicine residents, and orthopedists. Ten cases were randomly selected from the Northwestern University Feinberg School of Medicine Orthopedic Teaching page and presented to the groups, and the results were recorded. Emergency medicine residents, emergency medicine specialists, orthopedic specialists, and GPT-4o were each asked about these 10 cases separately, in Turkey, between June 1, 2024 and July 15, 2024. Ten individuals from each group, together with the answers GPT-4o gave when asked the questions on different days, were included in the study. There was no significant difference between GPT-4o and emergency medicine residents, whereas a significant difference was observed between these 2 groups and the emergency medicine specialists and orthopedists (P = .017). Orthopedists had the highest accuracy rate (86%), whereas emergency medicine specialists, emergency medicine residents, and GPT-4o had accuracy rates of 84%, 76%, and 74%, respectively. Overall, GPT-4o as a general diagnostic tool is less successful in x-ray interpretation than physicians. Its accuracy increases when dislocations are removed from the data pool. These results suggest that GPT-4o can assist physicians only in cases of suspected fractures.

Topics

Radiography, Extremities, Journal Article, Comparative Study
