Accuracy of machine learning in predicting recurrence risk of lumbar disc herniation after percutaneous endoscopic lumbar discectomy: a systematic review and meta-analysis.
Authors
Affiliations (5)
- The Third School of Clinical Medicine, Zhejiang Chinese Medical University, Hangzhou, 310053, Zhejiang, China.
- Department of Spinal Surgery, The First Affiliated Hospital of Guilin Medical University, Guilin, 541001, Guangxi, China.
- Department of Orthopedics, The Third Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, 310005, Zhejiang, China.
- Department of Orthopedics, Zhejiang Rehabilitation Medical Center, Hangzhou, 310051, Zhejiang, China.
- Department of Orthopedics, The Third Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, 310005, Zhejiang, China.
Abstract
Lumbar disc herniation (LDH) is a major cause of low back pain and radicular leg pain. Percutaneous endoscopic lumbar discectomy (PELD) is a widely used minimally invasive procedure for the treatment of LDH; however, recurrent lumbar disc herniation (RLDH) after surgery remains a significant clinical challenge. Recent studies have explored the application of machine learning (ML) models for predicting recurrence risk after PELD, but variations in variable selection and algorithm choice have led to inconsistent predictive performance. This systematic review and meta-analysis evaluated the feasibility and predictive accuracy of ML models for recurrence risk after PELD. This review was conducted in accordance with the PRISMA guidelines and registered in PROSPERO (CRD420261286308). PubMed, Web of Science, Medline, and Embase were systematically searched for studies related to PELD up to December 30, 2025. The PROBAST tool was used to assess the risk of bias of the included studies. The c-index was defined as the primary outcome measure, and a random-effects meta-analysis was performed. The initial search identified 1,769 publications. After screening, 11 studies involving 33 ML models were included in the analysis. To avoid statistical non-independence, one model per study was analyzed. In the training set, the pooled c-index of models integrating clinical characteristics with radiomic features was 0.900 (95% CI: 0.873-0.926). One MRI-based radiomics-only model achieved a c-index of 0.913 (95% CI: 0.872-0.955). In the validation set, the pooled c-index of the combined-feature models was 0.851 (95% CI: 0.769-0.932), whereas the available MRI-based radiomics model achieved a c-index of 0.859 (95% CI: 0.778-0.940). ML models show promise for predicting the risk of RLDH after PELD.
Based on the currently available evidence, most prediction models were developed by integrating clinical characteristics with radiomic features, and these models generally exhibited high discriminative performance in the pooled analysis. Among these models, those incorporating magnetic resonance imaging and X-ray features together with clinical variables demonstrated particularly favorable predictive performance. Future studies should focus on establishing robust, high-quality databases to further improve the accuracy and reliability of predictive models.
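As an illustration of how a pooled c-index of this kind can be obtained, the following is a minimal Python sketch of random-effects pooling. It rests on several assumptions not stated in the abstract: the DerSimonian-Laird estimator for between-study variance, pooling on the raw c-index scale rather than a logit scale, and standard errors back-calculated from reported 95% confidence intervals. The function names `se_from_ci` and `pool_random_effects` are hypothetical, and the sketch is not the authors' analysis code.

```python
import math

# Two-sided 95% normal quantile (assumption: CIs are Wald-type intervals).
Z_95 = 1.959963984540054


def se_from_ci(lower, upper, z=Z_95):
    """Approximate a standard error from a symmetric confidence interval."""
    return (upper - lower) / (2.0 * z)


def pool_random_effects(estimates, std_errors, z=Z_95):
    """Pool study-level estimates with a DerSimonian-Laird random-effects model.

    Returns (pooled, ci_lower, ci_upper, tau2), where tau2 is the estimated
    between-study variance.
    """
    k = len(estimates)
    if k == 0 or k != len(std_errors):
        raise ValueError("estimates and std_errors must be non-empty and equal in length")

    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q and the DerSimonian-Laird estimate of tau^2 (truncated at 0).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)

    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2


if __name__ == "__main__":
    # Standard error implied by the reported radiomics-only training result,
    # c-index 0.913 (95% CI: 0.872-0.955).
    print(se_from_ci(0.872, 0.955))

    # Placeholder inputs for illustration only; these are NOT study data.
    print(pool_random_effects([0.80, 0.90], [0.05, 0.05]))
```

With a single study the between-study variance is set to zero and the pooled value equals that study's estimate, which matches how the radiomics-only model is reported individually rather than pooled.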