Multimodal machine learning for surgical decision support in epilepsy: Current evidence and translational gaps.
Authors
Affiliations (3)
- Neurology, Epilepsy, and Movement Disorders Unit, Bambino Gesù Children's Hospital, IRCCS (full member of European Reference Network EpiCARE), Rome, Italy.
- Department of Physiology, Behavioral Neuroscience PhD Program, Sapienza University, Rome, Italy.
- University Hospital KU Leuven, Leuven, Belgium.
Abstract
This systematic review synthesizes evidence on multimodal machine learning (ML) decision support systems for epilepsy surgery, focusing on postsurgical outcome prediction, with emphasis on methodological quality and implications for clinical practice. Following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, we searched PubMed, Scopus, and Web of Science using predefined keywords. Two reviewers independently screened records. Seventy records were screened; 10 studies met the eligibility criteria, reporting ML-based prediction of surgical outcomes in drug-resistant epilepsy (DRE) and using ≥2 data modalities. Extracted items included study design, population, data sources, algorithms, validation strategy, performance metrics, and outcome definitions. Nine studies were retrospective and one prospective; seven were single-center and two multicenter. Most integrated neuroimaging (9/10), electroencephalography (8/10), and clinical variables (7/10); two included neuropsychology, and one added ablation parameters for magnetic resonance-guided laser interstitial thermal therapy. Sample sizes ranged from 15 to 11 067. Performance varied; the best results (area under the curve [AUC] ≈ .95) were reported with multimodal gradient boosting, whereas ablation-based models achieved lower discrimination (AUC ≈ .67). The oldest neural-network study reported 98% accuracy on a small, nonstandard dataset. Cross-validation predominated; only two studies performed prospective validation. Outcome definitions were heterogeneous, and time points were inconsistently specified. Despite this variability, several clinically relevant findings emerged: multimodal ML improved prediction of seizure freedom, although not universally; supported epileptogenic-zone localization; and, in one large multicenter study, enabled earlier identification of surgical candidates compared with routine referral pathways.
ML shows promise for outcome prediction and presurgical decision support in DRE, particularly when integrating multimodal data. Translation is currently constrained by limited external and prospective validation, inconsistent outcome frameworks, and insufficient interpretability. Future research should prioritize harmonized endpoints, multicenter external validation (including federated approaches), and explainable models capable of informing both patient selection and surgical strategy.