An automated hip fracture detection, classification system on pelvic radiographs and comparison with 35 clinicians.
Authors
Affiliations (5)
- Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Imperial College London, London, SW7 2AZ, UK.
- Department of Orthopedics and Traumatology, Manisa Alasehir State Hospital, 45600, Manisa, Turkey.
- Mechatronics Engineering Department, Yildiz Technical University, 34349, Istanbul, Turkey.
- Department of Orthopedics and Traumatology, Manisa City Hospital, 45040, Manisa, Turkey.
- Department of Orthopedics and Traumatology, Manisa Celal Bayar University Hafsa Sultan Hospital, 45030, Manisa, Turkey.
Abstract
Accurate diagnosis of orthopedic injuries, especially pelvic and hip fractures, is vital in trauma management. Although pelvic radiographs (PXRs) are widely used, misdiagnosis is common. This study proposes an automated system that uses convolutional neural networks (CNNs) to detect potential fracture areas and predict fracture conditions, aiming to outperform traditional object detection-based systems. We developed two deep learning models for hip fracture detection and prediction, trained on PXRs from three hospitals. The first model performed automated hip area detection, cropping, and classification of the resulting patches. The images were preprocessed using the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm. The YOLOv5 architecture was used for the object detection model, while three pre-trained deep neural network (DNN) architectures were used for classification via transfer learning. Model performance was evaluated on a test dataset and compared with that of 35 clinicians. YOLOv5 achieved an accuracy of 92.66% on regular images and 88.89% on CLAHE-enhanced images. The classifier models (MobileNetV2, Xception, and InceptionResNetV2) achieved accuracies between 94.66% and 97.67%. In contrast, the clinicians had a mean accuracy of 84.53% and took longer to make predictions. The DNN models showed significantly better accuracy and speed than the human evaluators (p < 0.0005, p < 0.01). With their high accuracy and speed, these DNN models show promising utility in trauma diagnosis. Integrating such systems into clinical practice may enhance the diagnostic efficiency of PXRs.
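The detect, crop, and classify flow described for the first model can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the box format, the class labels ("fracture" and "normal"), and the toy detector and classifier are all assumptions made for illustration. In the study, the detector role is played by YOLOv5 and the classifier role by a pre-trained network such as MobileNetV2, Xception, or InceptionResNetV2.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel intensities
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max edges exclusive


def crop_patch(image: Image, box: Box) -> Image:
    """Return the sub-image inside `box`, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(width, x1), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


def two_stage_predict(
    image: Image,
    detect_hips: Callable[[Image], List[Box]],
    classify_patch: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 proposes hip regions; stage 2 labels each cropped patch."""
    results = []
    for box in detect_hips(image):
        patch = crop_patch(image, box)
        results.append((box, classify_patch(patch)))
    return results


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical stand-ins so the sketch runs without a trained model.
def toy_detector(image: Image) -> List[Box]:
    """Pretend detector: always proposes the left and right image halves."""
    height, width = len(image), len(image[0])
    half = width // 2
    return [(0, 0, half, height), (half, 0, width, height)]


def toy_classifier(patch: Image) -> str:
    """Pretend classifier: flags a patch that contains a very bright pixel."""
    brightest = max(max(row) for row in patch)
    return "fracture" if brightest > 200 else "normal"


if __name__ == "__main__":
    radiograph = [[0] * 6 for _ in range(4)]
    radiograph[1][4] = 255  # synthetic bright spot in the right half
    for box, label in two_stage_predict(radiograph, toy_detector, toy_classifier):
        print(box, label)
```

Passing the detector and classifier in as callables keeps the two stages independent, so either one could be swapped (for example, a different classification backbone) without touching the cropping logic. The `accuracy` helper mirrors the metric reported in the abstract for both the models and the clinicians.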