Hip fracture detection on radiographs using an artificial intelligence-based support tool: a diagnostic accuracy study.
Authors
Affiliations (7)
- Department of Orthopaedic Surgery and Traumatology, Copenhagen University Hospital, Bispebjerg, København NV, Denmark.
- Department of Orthopaedic Surgery and Traumatology, Odense University Hospital, Odense C, Denmark.
- Department of Radiology, Copenhagen University Hospital, Herlev/Gentofte, Herlev, Denmark.
- Department of Radiology, Copenhagen University Hospital, Amager/Hvidovre, Hvidovre, Denmark.
- Department of Radiology and Radiological AI Testcenter (RAIT) Denmark, Copenhagen University Hospital, København NV, Denmark.
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, København K, Denmark.
- Radiobotics, ApS, Copenhagen, Denmark.
Abstract
Background: Assessing subtle hip fractures on radiographs can be difficult, especially for less experienced emergency physicians, which may delay diagnosis and ultimately prolong time to surgery. Clinical artificial intelligence (AI) decision support tools have shown great potential in assisting fracture detection on radiographs.
Purpose: To investigate how a CE-marked AI fracture detection tool affects junior doctors' diagnostic accuracy in detecting hip fractures on radiographs.
Material and Methods: Eight junior doctors affiliated with the Accident and Emergency (A&E) department read 246 hip radiographic examinations with and without AI support. The reference standard was determined by two musculoskeletal radiologists and was used to measure sensitivity and specificity for readers without and with support from the AI tool, as well as the AI tool's standalone performance.
Results: Mean sensitivity in detecting hip fractures increased significantly from 0.89 (95% confidence interval [CI] = 0.85-0.93) without AI support to 0.94 (95% CI = 0.92-0.97) with AI support (χ² = 9.27; P = 0.002), and the number of false-negative cases was thereby reduced by 49%. There was no significant change in mean specificity, which went from 0.90 (95% CI = 0.86-0.93) to 0.91 (95% CI = 0.88-0.94) (χ² = 0.34; P = 0.56). The AI tool's standalone sensitivity and specificity were 0.99 (95% CI = 0.99-1.00) and 0.73 (95% CI = 0.67-0.80), respectively.
Conclusion: Seven of the eight junior doctors detected more fractures with AI assistance than without. The performance gain observed for readers highlights the value of the product.
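To make the reported metrics concrete, the following is a minimal Python sketch of how sensitivity, specificity, and a relative reduction in false negatives are conventionally computed. The function names and the example counts are hypothetical and are not taken from the study. Note that plugging in the rounded mean sensitivities (0.89 and 0.94) gives a reduction of roughly 45%; the 49% stated in the abstract presumably derives from the unrounded case counts, which are not given here.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly fractured cases that were called fractured."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly non-fractured cases that were called non-fractured."""
    return tn / (tn + fp)


def relative_fn_reduction(sens_before: float, sens_after: float) -> float:
    """Relative drop in the miss rate (1 - sensitivity) between two conditions."""
    miss_before = 1.0 - sens_before
    miss_after = 1.0 - sens_after
    return (miss_before - miss_after) / miss_before


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data).
    print(f"sensitivity: {sensitivity(tp=90, fn=10):.2f}")
    print(f"specificity: {specificity(tn=80, fp=20):.2f}")

    # Rounded mean sensitivities as reported in the abstract.
    reduction = relative_fn_reduction(0.89, 0.94)
    print(f"relative reduction in misses from rounded values: {reduction:.1%}")
```

Working from the miss rate rather than from sensitivity itself is what makes a five-point sensitivity gain translate into a reduction in missed fractures of nearly half.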