Leveraging ChatGPT for Report Error Audit: An Accuracy-Driven and Cost-Efficient Solution for Ophthalmic Imaging Reports.
Authors
Affiliations (12)
- Eye Center of Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China.
- Zhejiang Provincial Key Laboratory of Ophthalmology, Zhejiang Provincial Clinical Research Center for Eye Diseases, Zhejiang Provincial Engineering Institute on Eye Diseases, Hangzhou, China.
- Department of Ophthalmology, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China.
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong.
- Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong.
- Centre for Innovation and Precision Eye Health, National University of Singapore, Singapore, Singapore.
- Department of Ophthalmology, National University of Singapore, Singapore, Singapore.
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore.
- Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, Poznan, Poland.
- Department of Ophthalmology, University of Warmia and Mazury, Olsztyn, Poland.
- Eye Center of Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China. [email protected].
- Zhejiang Provincial Key Laboratory of Ophthalmology, Zhejiang Provincial Clinical Research Center for Eye Diseases, Zhejiang Provincial Engineering Institute on Eye Diseases, Hangzhou, China. [email protected].
Abstract
Accurate ophthalmic imaging reports, including those for fundus fluorescein angiography (FFA) and ocular B-scan ultrasound, are essential for effective clinical decision-making. The current process, in which residents draft reports that are then reviewed by ophthalmic technicians and ophthalmologists, is time-consuming and error-prone. This study evaluates the effectiveness of GPT-4o (ChatGPT-4o) in auditing errors in FFA and ocular B-scan reports and assesses its potential to reduce time and costs within the reporting workflow. In a preliminary evaluation, 100 FFA and 80 ocular B-scan reports drafted by residents were analyzed with GPT-4o to identify laterality errors (left versus right eye) and incorrect anatomical descriptions. The accuracy of GPT-4o was compared with that of retinal specialists, general ophthalmologists, and ophthalmic technicians. A cost-effectiveness analysis was conducted to estimate the time and cost savings from integrating GPT-4o into the reporting process, and a pilot real-world validation on 20 erroneous reports compared GPT-4o with human reviewers. GPT-4o demonstrated a detection rate of 79.0% (158 of 200; 95% CI 73.0-85.0) across all examinations, comparable to the average detection performance of general ophthalmologists (78.0% [155 of 200; 95% CI 72.0-83.0]; P ≥ 0.09). Integration of GPT-4o reduced the average report review time by 86%, completing 180 ophthalmic reports in approximately 0.27 h compared with 2.17-3.19 h for human ophthalmologists. It also lowered the review cost from $0.21 to $0.03 per report (a saving of $0.18). In the real-world evaluation, GPT-4o detected 18 of 20 errors (90%) with no false positives, compared with 95-100% for human reviewers. GPT-4o effectively enhances the accuracy of ophthalmic imaging reports by identifying and correcting common errors. Its implementation can potentially alleviate the workload of ophthalmologists, streamline the reporting process, and reduce associated costs, thereby improving overall clinical workflow and patient outcomes.
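The headline figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: the function names are hypothetical, and the Wald (normal-approximation) interval is an assumption, since the abstract does not state how its confidence intervals were computed. The interval produced here may therefore differ slightly from the reported 73.0-85.0.

```python
import math


def detection_rate(detected: int, total: int) -> float:
    """Fraction of errors detected."""
    return detected / total


def wald_interval(detected: int, total: int, z: float = 1.959964) -> tuple[float, float]:
    """Normal-approximation 95% CI for a proportion.

    Assumption: the abstract does not name its CI method, so this is
    only an illustrative choice and need not match the reported bounds.
    """
    p = detected / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width


def seconds_per_report(hours: float, n_reports: int) -> float:
    """Average review time per report, in seconds."""
    return hours * 3600 / n_reports


# Figures taken from the abstract.
gpt_rate = detection_rate(158, 200)           # GPT-4o, all examinations
pilot_rate = detection_rate(18, 20)           # pilot real-world validation
gpt_ci = wald_interval(158, 200)              # illustrative interval only
gpt_seconds = seconds_per_report(0.27, 180)   # GPT-4o review time per report
saving_per_report = round(0.21 - 0.03, 2)     # USD saved per report

print(f"GPT-4o detection rate: {gpt_rate:.1%}")
print(f"Pilot detection rate:  {pilot_rate:.1%}")
print(f"Wald 95% CI (assumed method): {gpt_ci[0]:.1%} to {gpt_ci[1]:.1%}")
print(f"GPT-4o time per report: {gpt_seconds:.1f} s")
print(f"Saving per report: ${saving_per_report:.2f}")
```

Under these assumptions, 0.27 h for 180 reports corresponds to roughly 5.4 s per report, and the per-report saving matches the $0.18 stated in the abstract.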