Evaluating the interpretation of neck CT/MRI scans by ChatGPT-4V for detecting primary oropharyngeal squamous cell carcinoma: An exploratory study.
Authors
Affiliations (7)
- Technical University Munich, Department of Otolaryngology Head and Neck Surgery, Munich, Germany. Electronic address: [email protected].
- Technical University Munich, Department of Diagnostic and Interventional Radiology, Munich, Germany.
- Technical University Munich, Department of Otolaryngology Head and Neck Surgery, Munich, Germany.
- Medical University Graz, Department of Otolaryngology Head and Neck Surgery, Graz, Austria.
- Technical University Munich, Department of Radiooncology, Munich, Germany.
- Technical University Munich, Institute of Pathology, Munich, Germany.
- RWTH Aachen University, Department of Otolaryngology Head and Neck Surgery, Aachen, Germany.
Abstract
Early and accurate detection of head and neck squamous cell carcinoma, including the subset of oropharyngeal squamous cell carcinoma (OPSCC), is essential for patient therapy and prognosis. Computed tomography (CT) is the primary imaging modality and is currently evaluated manually by radiologists and head and neck oncologists. Image recognition as a form of artificial intelligence (AI) was recently introduced with the large language model (LLM) ChatGPT-4V. This exploratory study is the first to evaluate image recognition by ChatGPT-4V in interpreting neck CT and MRI scans for OPSCC detection, using scans with OPSCC and corresponding scans without any oropharyngeal lesion. ChatGPT-4V gave the most likely diagnosis for 100 CT cases (50 OPSCC, 50 without a lesion) and for the 62 cases with available corresponding MRI scans (31 OPSCC, 31 without an oropharyngeal lesion). Two independent reviewers rated these diagnoses, and overall performance was evaluated in terms of accuracy, sensitivity, and specificity. ChatGPT-4V reached a sensitivity of 72% and a specificity of 78% in identifying OPSCC from CT images. For MRI scans, sensitivity was 80.6% and specificity was 83.9%. Human papillomavirus-positive and more advanced lesions were detected more reliably. In this exploratory study of CT and MRI neck scans of the oropharynx, ChatGPT-4V demonstrated mediocre performance in detecting OPSCC. Continued research and advancements in AI are essential to improve the reliability and clinical utility of LLMs for the interpretation of neck scans.
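The reported metrics can be related to the group sizes with a short calculation. The sketch below is illustrative only: the abstract does not state the confusion-matrix counts, so the values used here (for example 36 of 50 OPSCC cases detected on CT) are back-calculated from the reported percentages and group sizes and should be read as assumptions. The class name `ConfusionCounts` is hypothetical, and the accuracy values it yields are derived, not reported in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary task (OPSCC vs. no oropharyngeal lesion)."""
    tp: int  # OPSCC cases rated as correctly identified
    fn: int  # OPSCC cases missed
    tn: int  # lesion-free cases rated as correctly identified
    fp: int  # lesion-free cases labeled as OPSCC

    @property
    def sensitivity(self) -> float:
        # Share of OPSCC cases that were detected.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of lesion-free cases that were recognized as such.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all cases that were classified correctly.
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Assumed counts, inferred from the reported percentages and group sizes.
ct = ConfusionCounts(tp=36, fn=14, tn=39, fp=11)   # 50 OPSCC, 50 without lesion
mri = ConfusionCounts(tp=25, fn=6, tn=26, fp=5)    # 31 OPSCC, 31 without lesion

for name, counts in (("CT", ct), ("MRI", mri)):
    print(
        f"{name}: sensitivity={counts.sensitivity:.1%}, "
        f"specificity={counts.specificity:.1%}, "
        f"accuracy={counts.accuracy:.1%}"
    )
```

Under these assumed counts, the sensitivity and specificity match the reported figures after rounding to one decimal place.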