Assessing deep learning model performance in osteoporosis screening with lumbar spine radiographs.
Authors
Affiliations (3)
- Department of Orthopedics, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand.
- Department of Radiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand.
- Department of Radiology, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand.
Abstract
To diagnose osteoporosis, assess the risk of fragility fracture, and determine the need for treatment, bone mineral density (BMD) is most commonly measured by dual-energy X-ray absorptiometry (DXA), the gold standard. Because access to DXA is limited, we developed deep learning models to screen for osteoporosis from lumbar spine radiographs and measured their accuracy in detecting it. The models were developed on a training data set of 2244 anteroposterior (AP) and 2368 lateral lumbar spine radiographs. Patients were categorized into two groups based on DXA BMD T-score: non-osteoporosis (T > -2.5) and osteoporosis (T ≤ -2.5). Two-class models were trained to distinguish non-osteoporosis from osteoporosis. Model performance was evaluated on a test data set (963 AP and 1018 lateral images). For AP images, the ResNet-18 model achieved an area under the curve (AUC) of 0.79 (95% confidence interval [CI] 0.76-0.82), with a sensitivity of 79.7% (95% CI 74.4-85.0%) and a specificity of 66.5% (95% CI 63.1-69.9%). For lateral images, the DarkNet-19 model yielded the highest AUC, 0.82 (95% CI 0.80-0.85), with the highest sensitivity for the lateral data set, 87.5% (95% CI 83.1-91.9%), and a specificity of 79.4% (95% CI 76.6-82.2%). Deep learning models may be effective for osteoporosis screening based on lumbar spine radiographs and could serve as a readily available tool for assessing risk and determining treatment.
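The labeling rule and the reported metrics can be illustrated with a minimal sketch. It assumes a normal-approximation (Wald) interval for the 95% CI; the abstract does not state which interval method the authors used, and the function names and example values below are hypothetical, not taken from the study.

```python
import math


def label_from_t_score(t_score):
    """Binary label from a DXA T-score: 1 = osteoporosis (T <= -2.5), 0 = non-osteoporosis."""
    return 1 if t_score <= -2.5 else 0


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wald interval (an assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity, each as (estimate, lower, upper), from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = proportion_with_wald_ci(tp, tp + fn)
    specificity = proportion_with_wald_ci(tn, tn + fp)
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical T-scores and model predictions, for illustration only.
    t_scores = [-3.1, -2.5, -2.4, -1.0, 0.3, -2.8]
    predictions = [1, 1, 0, 0, 1, 0]
    labels = [label_from_t_score(t) for t in t_scores]
    sens, spec = sensitivity_specificity(labels, predictions)
    print("sensitivity (estimate, lower, upper):", sens)
    print("specificity (estimate, lower, upper):", spec)
```

With small samples or proportions near 0 or 1, a Wilson or exact interval is usually preferable to the Wald interval used in this sketch.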