Artificial intelligence in radiology: diagnostic sensitivity of ChatGPT for detecting hemorrhages in cranial computed tomography scans.

Authors

Bayar-Kapıcı O, Altunışık E, Musabeyoğlu F, Dev Ş, Kaya Ö

Affiliations (3)

  • Seyhan State Hospital, Clinic of Radiology, Adana, Türkiye.
  • University of Health Sciences Türkiye, Gaziantep City Hospital, Clinic of Neurology, Gaziantep, Türkiye.
  • Çukurova University Faculty of Medicine, Department of Radiology, Adana, Türkiye.

Abstract

Chat Generative Pre-trained Transformer (ChatGPT)-4V, a large language model developed by OpenAI, has been explored for its potential application in radiology. This study assesses ChatGPT-4V's diagnostic performance in identifying various types of intracranial hemorrhage in non-contrast cranial computed tomography (CT) images. Each case was presented to ChatGPT as the clearest available 2D imaging slice. The first question, "Q1: Which imaging technique is used in this image?", was asked to determine the imaging modality. ChatGPT was then prompted with the second question, "Q2: What do you see in this image and what is the final diagnosis?", to assess whether it judged the CT scan to be normal or pathological. For CT scans containing a hemorrhage that ChatGPT did not interpret correctly, a follow-up question ("Q3: There is bleeding in this image. Which type of bleeding do you see?") was used to evaluate whether this guidance influenced its response. ChatGPT accurately identified the imaging technique (Q1) in all cases but had difficulty diagnosing epidural hematoma (EDH), subdural hematoma (SDH), and subarachnoid hemorrhage (SAH) when no clues were provided (Q2). When a hemorrhage clue was introduced (Q3), ChatGPT correctly identified EDH in 16.7% of cases, SDH in 60%, and SAH in 15.6%, and achieved 100% diagnostic accuracy for hemorrhagic cerebrovascular disease. Its sensitivity, specificity, and accuracy for Q2 were 23.6%, 92.5%, and 57.4%, respectively. These values improved substantially with the clue in Q3, with sensitivity rising to 50.9% and accuracy to 71.3%. ChatGPT also demonstrated higher diagnostic accuracy for larger hemorrhages in EDH and SDH images. Although the model reliably recognizes the imaging modality, its accuracy in diagnosing hemorrhage is limited unless it is guided by additional contextual information.
These findings suggest that ChatGPT's diagnostic performance improves with guided prompts, highlighting its potential as a supportive tool in clinical radiology.
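
The sensitivity, specificity, and accuracy figures above follow the standard confusion-matrix definitions, which can be sketched as below. The abstract does not report raw case counts, so the counts in this example (55 hemorrhage scans and 53 normal scans) are hypothetical, chosen only because they reproduce the reported percentages after rounding. The example also assumes that Q3 was asked only for hemorrhage scans, so the number of false positives is unchanged between Q2 and Q3. The function name `diagnostic_metrics` is illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and accuracy as percentages (1 decimal)."""
    sensitivity = 100 * tp / (tp + fn)               # hemorrhage scans correctly flagged
    specificity = 100 * tn / (tn + fp)               # normal scans correctly cleared
    accuracy = 100 * (tp + tn) / (tp + fn + tn + fp)  # all correct calls
    return {
        "sensitivity": round(sensitivity, 1),
        "specificity": round(specificity, 1),
        "accuracy": round(accuracy, 1),
    }


# HYPOTHETICAL counts, not taken from the study: 55 hemorrhage scans, 53 normal scans.
# Q2 (no clue): assume 13 hemorrhages detected and 49 normal scans correctly cleared.
q2 = diagnostic_metrics(tp=13, fn=42, tn=49, fp=4)

# Q3 (with clue): assume 28 hemorrhages detected; normal-scan results unchanged.
q3 = diagnostic_metrics(tp=28, fn=27, tn=49, fp=4)

print(q2)  # {'sensitivity': 23.6, 'specificity': 92.5, 'accuracy': 57.4}
print(q3)  # {'sensitivity': 50.9, 'specificity': 92.5, 'accuracy': 71.3}
```

Under these assumed counts, the clue in Q3 moves 15 hemorrhage scans from false negative to true positive, which is what raises sensitivity and accuracy while specificity stays the same.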

Topics

Journal Article
