Comparison of GPT-4 omni and physicians for accurate evaluation of traumatic extremity x-rays: A feasibility study.
Authors
Affiliations (2)
- Department of Emergency Medicine, School of Medicine, Hitit University, Corum, Turkey.
- Department of Emergency Medicine, Corum Erol Olcok Training and Research Hospital, Corum, Turkey.
Abstract
GPT-4 omni (GPT-4o) is the latest version of ChatGPT, developed by OpenAI, and provides image interpretation capabilities. In trauma patients, x-rays are the simplest imaging method, and research on their interpretation using artificial intelligence is ongoing. No published studies have evaluated the accuracy of GPT-4o in x-ray interpretation; hence, the aim of this study was to determine the diagnostic accuracy of GPT-4o in x-ray interpretation and compare its performance with that of emergency medicine specialists, emergency medicine residents, and orthopedists. Ten cases were randomly selected from the Northwestern University Feinberg School of Medicine Orthopedic Teaching page and presented to the groups, and the results were recorded. Emergency medicine residents, emergency medicine specialists, orthopedic specialists, and GPT-4o were each asked about these ten cases separately, in Turkey, between June 1, 2024 and July 15, 2024. Ten individuals from each group, together with the answers GPT-4o gave when asked on different days, were included in the study. There was no significant difference between GPT-4o and emergency medicine residents, whereas a significant difference was observed between these 2 groups and the emergency medicine specialists and orthopedists (P = .017). Orthopedists had the highest accuracy rate (86%), whereas emergency medicine specialists, emergency medicine residents, and GPT-4o had accuracy rates of 84%, 76%, and 74%, respectively. As a general diagnostic tool, GPT-4o is less successful in x-ray interpretation than physicians. Its accuracy increases when dislocations are removed from the data pool. These results suggest that GPT-4o can assist physicians only in cases of suspected fractures.