Performance of a complete AI radiographic suite across 258,373 X-rays from 26 countries: A worldwide evaluation.
Authors
Affiliations (17)
- University Hospitals of Brussels, Brussels, Belgium.
- AZmed, Paris, France. Electronic address: [email protected].
- Sisters of Mercy University Hospital Center School of Medicine - Catholic University of Croatia, Zagreb, Croatia.
- Department of Radiodiagnostics and Imaging Techniques, P.J. Safarik University and L. Pasteur University Hospital, Košice, Slovakia.
- Centre D'imagerie Du Languedoc, Narbonne, France.
- Emergency Department and Prehospital Care, General Hospital of Niort, Niort, France.
- Diagnostikum Linz, Linz, Austria.
- Izola General Hospital, Izola, Slovenia.
- Imagerie Médicale Sainte-Marie, Osny, France.
- Medical University Plovdiv, Plovdiv, Bulgaria; Research and Innovation Program for the Development of MU - PLOVDIV- (SRIPD-MUP), Creation of a Network of Research Higher Schools, National Plan for Recovery and Sustainability, European Union - NextGenerationEU, Bulgaria.
- General Hospital Dr. Franc Derganc Nova Gorica, Department of Emergency Medicine, Šempeter Pri Gorici, Slovenia.
- Ibermutua, Madrid, Spain.
- Imagine Barcelona, Barcelona, Spain.
- Area Sanitaria Pontevedra Salnés, Spain.
- Radiologie Paris Ouest, Paris, France.
- Radiologie Paris Ouest, Paris, France; Hôpital Franco-Britannique, Levallois-Perret, France.
- SimonMed Imaging, United States of America.
Abstract
Radiography is the primary imaging tool for assessing abnormalities, with two main objectives: detection and characterization. This study reports a large-scale, international, multi-center, retrospective evaluation of a complete radiographic AI suite across various clinical settings. Radiographs acquired between January 2022 and April 2025 were collected from multiple centers in 26 countries spanning 5 continents. All images were processed by the Rayvolve AI suite developed by AZmed (Paris, France). Two readers annotated each exam, and agreement between them was taken as the ground truth. In cases of discordance, a third senior reader made the final decision. Key performance metrics were the area under the ROC curve (AUC), sensitivity, and specificity for AZtrauma and AZchest; mean absolute error (MAE) and bias for AZmeasure; and MAE and r² for AZboneage. Subgroup analyses were performed by patient age, sex, and country of acquisition. A total of 258,373 radiographs were analyzed. AZtrauma achieved an AUC of 98.3 %, with an overall sensitivity of 97.4 % and specificity of 96.4 %. AZchest achieved an AUC of 97.8 %, with a sensitivity of 96.7 % and a specificity of 87.9 %. Automated measurements by AZmeasure showed excellent agreement with radiologists (MAE = 1.8° for angular measurements and 1.1 mm for linear measurements). AZboneage predictions correlated strongly with the ground truth (MAE = 0.5 years). Performance remained high across all subgroups, with no significant drop. The AI suite demonstrated robust, generalizable performance across diverse clinical environments. Successful validation of this system could support wider clinical adoption of AI in radiology, in line with ongoing global efforts to enhance workflow efficiency and diagnostic consistency.
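
For readers less familiar with the metrics named in the abstract, the following is a minimal Python sketch of how the reader adjudication rule and the reported quantities could be computed. It is an illustration under stated assumptions, not the study's actual analysis code: all function names are hypothetical, bias is assumed to mean the average signed error (AI minus reference), r² is assumed to be the coefficient of determination (the study may instead report a squared correlation coefficient), and AUC is computed by simple pairwise comparison of scores.

```python
from statistics import mean


def adjudicate(reader_a, reader_b, senior=None):
    """Ground truth per exam: agreement wins, otherwise the senior reader decides."""
    if reader_a == reader_b:
        return reader_a
    if senior is None:
        raise ValueError("discordant reads need a senior reader's decision")
    return senior


def sensitivity_specificity(truth, predicted):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


def auc(truth, scores):
    """Probability that a positive exam scores higher than a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mae_and_bias(reference, predicted):
    """MAE is the mean absolute error; bias is assumed to be the mean signed error."""
    errors = [p - r for r, p in zip(reference, predicted)]
    return mean(abs(e) for e in errors), mean(errors)


def r_squared(reference, predicted):
    """Coefficient of determination (an assumption about what r² denotes here)."""
    ref_mean = mean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - ref_mean) ** 2 for r in reference)
    return 1 - ss_res / ss_tot
```

In this sketch, `adjudicate` would be applied per exam to build the ground-truth labels before any metric is computed, and each metric would then be recomputed within every age, sex, and country subgroup to check for performance drops.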