Deep Learning-Assisted Automated Diagnosis of Osteoporosis Based on Computed Tomography Scans: Systematic Review and Meta-Analysis.
Authors
Affiliations (1)
- Department of Orthopedics, Beijing Chaoyang Hospital, Capital Medical University, 5 JingYuan Road, Shijingshan District, Beijing, 100043, China, 86 51718268.
Abstract
Osteoporosis is a prevalent skeletal disorder characterized by decreased bone mass and increased fracture risk; however, it frequently remains underdiagnosed due to limited health care resources and its asymptomatic progression. Deep learning (DL) provides a promising solution for automated screening using computed tomography (CT) scans, enabling earlier detection and improved management. This systematic review and meta-analysis aimed to evaluate the performance of DL models in diagnosing osteoporosis based on CT scans. This study was conducted in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines using articles retrieved from PubMed, Scopus, Web of Science (Core), and Embase (Ovid). Studies involving adult participants who underwent CT and in which DL was applied for osteoporosis diagnosis were included. The QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies-2) tool was used to assess the risk of bias in each study. The confusion matrices from the included studies were extracted to summarize the diagnostic performance of DL models for osteoporosis. Within a bivariate random-effects framework, sensitivity and specificity were jointly synthesized to yield the summary estimates. Heterogeneity was quantified with the Higgins I² statistic. Subgroup analyses were performed to explore potential sources of heterogeneity among the included studies. This review included 24 studies, encompassing CT images from 29,808 participants. All studies used conventional CT scans and DL-based architectures. Fifteen, 6, and 3 studies were assessed as having a low, uncertain, and high risk of bias, respectively. The meta-analysis included 20 studies.
The pooled sensitivity and specificity were 0.88 (95% CI 0.85-0.91; I²=83.69%) and 0.94 (95% CI 0.91-0.96; I²=95.07%) for osteoporosis diagnosis; 0.81 (95% CI 0.76-0.85; I²=82.38%) and 0.92 (95% CI 0.90-0.94; I²=79.05%) for osteopenia identification; and 0.95 (95% CI 0.92-0.97; I²=98.28%) and 0.93 (95% CI 0.91-0.95; I²=94.93%) for normal case identification. The area under the curve of the DL models for identifying osteoporosis, osteopenia, and normal cases was 0.96 (95% CI 0.93-0.97), 0.94 (95% CI 0.92-0.96), and 0.98 (95% CI 0.96-0.99), respectively. Subgroup analyses revealed that models based on DenseNet variants (P<.01), multislice input (P<.01), 3D architecture (P<.01), and CT as the reference standard (P<.01) demonstrated superior diagnostic performance. This study indicated that CT-based DL models achieve promising diagnostic performance for osteoporosis. However, substantial heterogeneity among the included studies, limited external validation, and incomplete end-to-end pipelines constrain the generalizability of the proposed models. Further research is warranted to support their clinical translation and standardized application.
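For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how sensitivity and specificity are derived from a study's confusion matrix and how a Higgins I² value can be computed from Cochran's Q. It is a simplified, univariate illustration and is not the bivariate random-effects model used in this review. The function names and the per-study counts are hypothetical and do not come from the included studies.

```python
import math


def sens_spec(tp, fn, fp, tn):
    """Sensitivity and specificity from one study's 2x2 confusion matrix."""
    return tp / (tp + fn), tn / (tn + fp)


def logit_with_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is applied so that proportions of
    exactly 0 or 1 do not produce infinite logits.
    """
    a = events + 0.5
    b = (total - events) + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def higgins_i2(estimates, variances):
    """Higgins I^2 (in percent) from inverse-variance weights and Cochran's Q."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical (tp, fn, fp, tn) counts, for illustration only.
studies = [(90, 10, 5, 95), (80, 20, 10, 90), (45, 5, 2, 48)]

# Logit-transformed sensitivities and their variances, one per study.
logits, variances = zip(
    *(logit_with_variance(tp, tp + fn) for tp, fn, fp, tn in studies)
)
print(f"I^2 for sensitivity: {higgins_i2(logits, variances):.1f}%")
```

The same calculation applies to specificity by passing `tn` and `tn + fp` to `logit_with_variance`. A bivariate model additionally estimates the correlation between sensitivity and specificity across studies, which this sketch omits.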