Prospective Diagnostic Accuracy and Technical Feasibility of Artificial Intelligence-Assisted Rib Fracture Detection on Chest Radiographs: Observational Study.
Authors
Affiliations (5)
- Department of Emergency Medicine, Mackay Memorial Hospital, Taipei, Taiwan.
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, 9F, Education & Research Building, Shuang-Ho Campus, No. 301, Yuantong Rd., Zhonghe Dist., New Taipei City, 235603, Taiwan, 886 266202589 ext 10929.
- College of Medicine, Mackay Medical University, New Taipei City, Taiwan.
- Division of Plastic Surgery, Mackay Memorial Hospital, Taipei, Taiwan.
- Clinical Big Data Research, Taipei Medical University Hospital, Taipei City, Taiwan.
Abstract
Rib fractures are present in 10%-15% of thoracic trauma cases but are often missed on chest radiographs, delaying diagnosis and treatment. Artificial intelligence (AI) may improve detection and triage in emergency settings. This study aimed to evaluate the diagnostic accuracy, processing speed, and technical feasibility of an AI-assisted rib fracture detection system using prospectively collected data within a real-world, high-volume emergency department workflow.
We conducted an observational feasibility study with prospective data collection. An AI model based on a faster region-based convolutional neural network was deployed in the emergency department to analyze 23,251 real-world chest radiographs (22,946 anteroposterior; 305 oblique) from April 1 to July 2, 2023. This study was approved by the Institutional Review Board of MacKay Memorial Hospital (IRB No. 20MMHIS483e). The AI system operated passively, without influencing clinical decision-making. The reference standard was the final report issued by board-certified radiologists. A subset of discordant cases underwent post hoc computed tomography review for exploratory analysis.
The AI system achieved 74.5% sensitivity (95% CI 70.8%-78.0%), 93.3% specificity (95% CI 93.0%-93.7%), 24.2% positive predictive value, and 99.2% negative predictive value. Median inference time was 10.6 seconds versus 3.3 hours for radiologist reports (paired Wilcoxon signed-rank test W=112,987.5, P<.001). Imaging demand peaked between 08:00 and 16:00 and on Thursday-Saturday evenings. A 14-day graphics processing unit outage underscored the importance of infrastructure resilience.
The AI system demonstrated strong technical feasibility for real-time rib fracture detection in a high-volume emergency department setting, with rapid inference and stable performance during prospective deployment.
Although the system showed high negative predictive value, the observed false-positive and false-negative rates indicate that it should be considered a supportive screening tool rather than a stand-alone diagnostic solution or a replacement for clinical judgment. Interpretation is limited by reliance on radiology reports as the reference standard and by the system's passive, noninterventional deployment. These findings support further clinician-in-the-loop studies to evaluate clinical feasibility, workflow integration, and impact on diagnostic decision-making.
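The gap between the reported positive and negative predictive values follows from how predictive values depend on prevalence. The Python sketch below is illustrative only and is not part of the study's analysis: it takes the rounded sensitivity, specificity, and positive predictive value quoted in the abstract, back-calculates the fracture prevalence they imply via Bayes' rule, and then recomputes both predictive values. The function names are hypothetical, and the implied prevalence is a derived estimate, not a figure reported by the authors.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


def implied_prevalence(sensitivity, specificity, ppv):
    """Solve the PPV formula for prevalence (assumes 0 < ppv < 1)."""
    odds = ppv * (1 - specificity) / (sensitivity * (1 - ppv))
    return odds / (1 + odds)


# Rounded values quoted in the abstract.
SENS, SPEC, PPV = 0.745, 0.933, 0.242

prev = implied_prevalence(SENS, SPEC, PPV)
ppv, npv = predictive_values(SENS, SPEC, prev)
print(f"implied prevalence: {prev:.3f}")
print(f"PPV: {ppv:.3f}, NPV: {npv:.3f}")
```

Under these rounded inputs the implied prevalence is close to 3%, which is consistent with a low positive predictive value alongside a negative predictive value above 99%: when fractures are uncommon among imaged patients, even a 6.7% false-positive rate produces more false alarms than true detections.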