ChatGPT Demonstrates High Diagnostic Agreement on FDG-PET/CT Reports

AuntMinnie | Industry

ChatGPT-4o and ChatGPT-5 matched or surpassed nuclear medicine experts in diagnosing neurodegenerative diseases using textual FDG-PET/CT scan descriptions.

Key Details

  • University of Cologne team tested ChatGPT-4o and ChatGPT-5 on 100 F-18 FDG-PET/CT brain scan reports.
  • Models achieved median diagnostic agreement scores of 1.00 against expert interpretations.
  • ChatGPT-4o identified the correct main diagnosis in 86% of cases, ChatGPT-5 in 89%.
  • Performance was highest in typical cases (e.g., Alzheimer's disease), lower in complex or atypical presentations.
  • No imaging data or specific fine-tuning was used; models relied on general training.
  • Reproducibility from run to run was 75% for ChatGPT-4o and 55% for ChatGPT-5 in a subset.
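
For readers who want to see how figures like these are typically derived, here is a minimal sketch in Python. It assumes that "correct main diagnosis" means an exact label match with the expert read and that "reproducibility" means two independent runs returning the same main diagnosis for the same report. The study's actual definitions and scoring scale are not given in this summary, and the function name, labels, and scores below are hypothetical toy values, not study data.

```python
from statistics import median


def fraction_matching(labels_a, labels_b):
    """Return the share of positions where two label lists agree exactly."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical toy data: four reports, not taken from the study.
expert = ["AD", "FTD", "DLB", "AD"]   # assumed expert main diagnoses
run_1 = ["AD", "FTD", "AD", "AD"]     # assumed model output, first run
run_2 = ["AD", "PSP", "AD", "AD"]     # assumed model output, second run

# Share of cases where the model's main diagnosis matches the expert's.
accuracy = fraction_matching(run_1, expert)

# Share of cases where two runs give the same main diagnosis.
reproducibility = fraction_matching(run_1, run_2)

# Median of per-case agreement scores (assumed scale, 1.0 = full agreement).
agreement_scores = [1.0, 1.0, 0.5, 1.0]
median_agreement = median(agreement_scores)

print(accuracy, reproducibility, median_agreement)
```

Note that a median of 1.00 only says that at least half of the cases reached full agreement, which is why it can sit alongside main-diagnosis rates below 100%.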

Why It Matters

This study suggests that large language models can align with expert nuclear medicine interpretation based solely on textual imaging descriptions, potentially augmenting workflow efficiency and consistency in neuroimaging. It highlights opportunities and limitations for LLMs in aiding diagnostic processes, especially when combined with automated image analysis.
