Real-world clinical impact of three commercial AI algorithms on musculoskeletal radiography interpretation: A prospective crossover reader study.

Authors

Prucker P, Lemke T, Mertens CJ, Ziegelmayer S, Graf MM, Weller D, Kim SH, Gassert FT, Kader A, Dorfner FJ, Meddeb A, Makowski MR, Lammert J, Huber T, Lohöfer F, Bressem KK, Adams LC, Luiken I, Busch F

Affiliations (7)

  • Institute for Diagnostic and Interventional Radiology, TUM School of Medicine and Health, TUM University Hospital Rechts der Isar, Munich, Germany.
  • Institute for Diagnostic and Interventional Radiology, TUM School of Medicine and Health, TUM University Hospital Rechts der Isar, Munich, Germany; Institute of Diagnostic and Interventional Neuroradiology, TUM School of Medicine and Health, TUM University Hospital Rechts der Isar, Munich, Germany.
  • Department of Neuroradiology, Charité - Universitätsmedizin Berlin, Corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany.
  • Department of Neuroradiology, Charité - Universitätsmedizin Berlin, Corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany.
  • Department of Gynecology and Center for Hereditary Breast and Ovarian Cancer, Technical University of Munich (TUM), School of Medicine and Health, Klinikum Rechts der Isar, TUM University Hospital, Munich, Germany.
  • Institute for Diagnostic and Interventional Radiology, TUM School of Medicine and Health, TUM University Hospital Rechts der Isar, Munich, Germany; Institute for Cardiovascular Radiology and Nuclear Medicine, TUM School of Medicine and Health, German Heart Center Munich, Munich, Germany.
  • Institute for Diagnostic and Interventional Radiology, TUM School of Medicine and Health, TUM University Hospital Rechts der Isar, Munich, Germany.

Abstract

To prospectively assess the diagnostic performance, workflow efficiency, and clinical impact of three commercial deep-learning tools (BoneView, Rayvolve, RBfracture) for routine musculoskeletal radiograph interpretation.

From January to March 2025, two radiologists (4 and 5 years' experience) independently interpreted 1,037 adult musculoskeletal studies (2,926 radiographs) first unaided and, after 14-day washouts, with each AI tool in a randomized crossover design. Ground truth was established by confirmatory CT when available. Outcomes included sensitivity, specificity, accuracy, area under the receiver operating characteristic curve (AUC), interpretation time, diagnostic confidence (5-point Likert), and rates of additional CT recommendations and senior consultations. DeLong tests compared AUCs; Mann-Whitney U and χ² tests assessed secondary endpoints.

AI assistance did not significantly change performance for fractures, dislocations, or effusions. For fractures, AUCs were comparable to baseline (Reader 1: 96.50 % vs. 96.30-96.50 %; Reader 2: 95.35 % vs. 95.97 %; all p > 0.11). For dislocations, baseline AUCs (Reader 1: 92.66 %; Reader 2: 90.68 %) were unchanged with AI (92.76-93.95 % and 92.00 %; p ≥ 0.280). For effusions, baseline AUCs (Reader 1: 92.52 %; Reader 2: 96.75 %) were similar with AI (93.12 % and 96.99 %; p ≥ 0.157).

Median interpretation times decreased with AI (Reader 1: 34 s to 21-25 s; Reader 2: 30 s to 21-26 s; all p < 0.001). Confidence improved across tools: BoneView increased combined "very good/excellent" ratings versus unaided reads (Reader 1: 509 vs. 449, p < 0.001; Reader 2: 483 vs. 439, p < 0.001); Rayvolve (Reader 1: 456 vs. 449, p = 0.029; Reader 2: 449 vs. 439, p < 0.001) and RBfracture (Reader 1: 457 vs. 449, p = 0.017; Reader 2: 448 vs. 439, p = 0.001) yielded smaller but significant gains. Reader 1 recommended fewer CT scans with AI assistance (33 vs. 22-23, p = 0.007).

In a real-world clinical setting, AI-assisted interpretation of musculoskeletal radiographs reduced reading time and increased diagnostic confidence without materially affecting diagnostic performance. These findings support AI assistance as a lever for workflow efficiency and potential cost-effectiveness at scale.
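
For readers less familiar with the per-reader performance outcomes named in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, and accuracy can be computed from binary ground-truth labels and reader calls. It is not the authors' analysis code. The function name `reader_metrics`, the `ReaderMetrics` type, and the eight example cases are hypothetical and chosen only for illustration; none of the values come from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReaderMetrics:
    sensitivity: float
    specificity: float
    accuracy: float


def reader_metrics(truth: Sequence[int], calls: Sequence[int]) -> ReaderMetrics:
    """Compute sensitivity, specificity, and accuracy for one reader.

    truth: 1 if the finding is present per the reference standard, else 0.
    calls: 1 if the reader reported the finding, else 0.
    """
    if len(truth) != len(calls):
        raise ValueError("truth and calls must have the same length")

    # Tally the four cells of the 2x2 confusion matrix.
    tp = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 1)
    fn = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 0)
    tn = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 0)
    fp = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    return ReaderMetrics(
        sensitivity=tp / (tp + fn),  # share of true findings detected
        specificity=tn / (tn + fp),  # share of normal studies called normal
        accuracy=(tp + tn) / (tp + fn + tn + fp),
    )


# Hypothetical toy data (NOT from the study): 4 positive and 4 negative cases.
example_truth = [1, 1, 1, 0, 0, 0, 0, 1]
example_calls = [1, 1, 0, 0, 0, 1, 0, 1]
example_result = reader_metrics(example_truth, example_calls)
print(example_result)
```

In a crossover design like the one described, the same function would be applied once per reader and reading condition (unaided and each AI tool) so that the resulting metrics can be compared side by side.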

Topics

Journal Article
