Evaluating the Accuracy and Efficiency of AI-Generated Radiology Reports Based on Positive Findings: A Qualitative Assessment of AI in Radiology.
Affiliations (4)
- Department of Orthopaedics, Royal Orthopaedic Hospital, Birmingham, UK (R.F.R.).
- Department of Radiology, AIG Hospitals, Hyderabad, India (S.C., M.A.S.).
- Department of Radiology, Royal Lancaster Infirmary, Lancaster, UK (P.W.).
- Department of Musculoskeletal Radiology, Royal Orthopaedic Hospital, Birmingham, UK (R.B.). Electronic address: [email protected].
Abstract
With increasing imaging demands, radiologists face growing workload pressures, often resulting in delays and reduced diagnostic efficiency. Recent advances in artificial intelligence (AI) have introduced tools for automated report generation, particularly in simpler imaging modalities such as X-rays. However, limited research has assessed AI performance in complex studies such as MRI and CT scans, where report accuracy and clinical interpretation are critical. This study aimed to evaluate the performance of a semi-automated, AI-based reporting platform in generating radiology reports for complex imaging studies, and to compare its accuracy, efficiency, and user confidence with the traditional dictation method.

The study involved 100 imaging cases: MRI knee (n=21), MRI lumbar spine (n=30), CT head (n=23), and CT abdomen and pelvis (n=26). Consultant musculoskeletal radiologists reported each case using both traditional dictation and the AI platform. For the latter, the radiologist first identified and entered the key positive findings, from which the AI system generated a full draft report. Reporting time was recorded, and both methods were rated on accuracy, user confidence, and overall reporting experience (scale of 1-5). Statistical analysis was conducted using two-tailed t-tests and 95% confidence intervals.

AI-generated reports demonstrated significantly improved performance across all parameters. Mean reporting time decreased from 6.1 to 3.43 minutes with AI-assisted report generation (p<0.0001). Accuracy ratings improved from 3.81 to 4.65 (p<0.0001), confidence ratings increased from 3.91 to 4.67 (p<0.0001), and overall reporting experience favored the AI platform (mean 4.7 vs. 3.69, p<0.0001). Minor formatting errors and occasional anatomical misinterpretations were observed in AI-generated reports, but these were easily corrected by the radiologist during review.
The AI-assisted reporting platform significantly improved efficiency and radiologist confidence without compromising accuracy. Although the tool performs well when provided with key clinical findings, it still requires expert oversight, especially in anatomically complex cases. These findings support the use of AI as a supportive tool in radiology practice, with a focus on data integrity, consistency, and human validation.
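The paired comparison described above (the same cases reported by both methods, analyzed with two-tailed t-tests and 95% confidence intervals) can be sketched as follows. The reporting times below are illustrative values only, not the study's data, and the t critical value for df = 7 is hardcoded so the sketch needs only the Python standard library; a real analysis would use a statistics package to compute it.

```python
from statistics import mean, stdev

# Hypothetical paired reporting times (minutes) for the same cases
# under dictation vs. AI-assisted reporting -- illustrative only.
dictation = [6.5, 5.8, 7.0, 6.2, 5.9, 6.4, 6.1, 5.7]
ai_assisted = [3.6, 3.2, 3.9, 3.5, 3.1, 3.7, 3.4, 3.0]

# Paired design: analyze the per-case differences.
diffs = [d - a for d, a in zip(dictation, ai_assisted)]
n = len(diffs)
mean_diff = mean(diffs)
se = stdev(diffs) / n ** 0.5      # standard error of the mean difference
t_stat = mean_diff / se           # paired t statistic, df = n - 1

# 95% CI for the mean difference; two-tailed t critical value for
# df = 7 (hardcoded assumption to stay stdlib-only).
t_crit = 2.365
ci_low = mean_diff - t_crit * se
ci_high = mean_diff + t_crit * se

print(f"t({n - 1}) = {t_stat:.2f}, mean difference = {mean_diff:.2f} min")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

A confidence interval for the mean time difference that excludes zero corresponds to a significant paired t-test at the 5% level, which is the form of evidence the abstract reports (p<0.0001 for the time reduction).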