External validation of an AI-Based fracture detection tool for hip and pelvic radiographs in a multicenter retrospective cohort.

March 7, 2026

Authors

Bruun FJ, Omaraee Y, Nybing JU, Gosvig KK, Boesen MP, Hansen P, Müller FC, Brejnebøl MW

Affiliations (4)

  • Department of Radiology, Bispebjerg hospital, Bispebjerg Bakke 23, 2400 København, Denmark. Electronic address: [email protected].
  • Department of diagnostic imaging, North Zealand hospital, Dyrehavevej 29, 3400 Hillerød, Denmark.
  • Department of Radiology, Bispebjerg hospital, Bispebjerg Bakke 23, 2400 København, Denmark.
  • Department of Radiology, Herlev hospital, Borgmester Ib Juuls Vej 1, 2730 Herlev, Denmark.

Abstract

Artificial intelligence tools show promise in fracture detection but may be impaired by hidden stratification. We aimed to evaluate the diagnostic accuracy of a commercially available AI tool for detecting hip and pelvic fractures, with emphasis on clinically relevant subgroups and AO/OTA fracture classifications. This retrospective multicenter diagnostic test accuracy study included consecutive trauma patients who underwent hip or pelvic radiography. The reference standard was based on post-conference clinical radiology reports, incorporating MRI, CT, and radiography in hierarchical order. Fractures were classified according to the AO/OTA system. The subgroups studied were surgical metal, degenerative disease, old fractures, radiographically occult fractures, and fracture classification. Sensitivity and specificity were calculated with 95% confidence intervals. Among 642 patients (median age 82 years), 262 (42%) had fractures. Overall sensitivity was 87% [83-91%] and specificity was 86% [82-89%]. Specificity was reduced in cases with old fractures (29% [13-51%]). Sensitivity was high for femoral neck (95% [88-98%]) and trochanteric fractures (92% [82-97%]), moderate for pelvic fractures (82% [70-91%]), and lowest for acetabular fractures (58% [28-85%]). Within each segment, the classifications with the most missed fractures were unilateral anterior pelvic arch (61A2.2), subcapital femoral neck (31B1), and simple trochanteric (31A1) fractures. In summary, although specificity was notably reduced in cases with old fractures and sensitivity was lower for pelvic and acetabular fractures, the AI tool demonstrated high diagnostic accuracy for hip-region trauma radiographs, with most missed cases occurring among the subtle AO/OTA fracture groups.
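
As a rough illustration of how sensitivity and specificity with 95% confidence intervals can be derived from a 2x2 table, the sketch below uses the Wilson score interval. This is an assumption: the abstract does not state which interval method the authors used, and the counts in the example are hypothetical placeholders, not data from the study. The function names `wilson_interval` and `sensitivity_specificity` are likewise illustrative.

```python
import math


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (estimate, lower, upper) for sensitivity and specificity."""
    sens = tp / (tp + fn)  # fractures correctly flagged / all fractures
    spec = tn / (tn + fp)  # non-fractures correctly cleared / all non-fractures
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for illustration only; NOT taken from the study.
result = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
for name, (est, lo, hi) in result.items():
    print(f"{name}: {est:.0%} [{lo:.0%}-{hi:.0%}]")
```

Small subgroups produce wide intervals under any such method, which is consistent with the broad ranges the abstract reports for old fractures and acetabular fractures.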

Topics

Hip Fractures; Artificial Intelligence; Pelvic Bones; Radiographic Image Interpretation, Computer-Assisted; Fractures, Bone; Journal Article; Multicenter Study; Validation Study
