Machine Learning-Based Risk Assessment of Myasthenia Gravis Onset in Thymoma Patients and Analysis of Their Correlations and Causal Relationships.
Liu W, Wang W, Zhang H, Guo M
The study aims to use interpretable machine learning models to predict the risk of myasthenia gravis (MG) onset in thymoma patients and to investigate the correlations and causal relationships between the two conditions. A retrospective analysis was conducted on 172 thymoma patients diagnosed at two medical centers between 2018 and 2024. The cohort was split into a training set (n = 134) and a test set (n = 38) to develop and validate the risk prediction models. Radiomic and deep features were extracted from tumor regions across three CT phases: non-enhanced, arterial, and venous. Feature selection using Spearman's rank correlation coefficient and LASSO (Least Absolute Shrinkage and Selection Operator) regularization identified 12 optimal imaging features. These were integrated with 11 clinical parameters and one pathological subtype variable to form a multi-dimensional feature matrix. Six machine learning algorithms were then implemented for model construction and comparative analysis. We used SHAP (SHapley Additive exPlanations) to interpret the model and employed a doubly robust learner to perform a potential causal analysis between thymoma and MG. All six models demonstrated satisfactory predictive capabilities, with the support vector machine (SVM) model performing best on the test cohort. It achieved an area under the curve (AUC) of 0.904 (95% confidence interval [CI] 0.798-1.000), outperforming the other models, including logistic regression and multilayer perceptron (MLP). The model's predictive performance supports a strong correlation between thymoma and MG. Additionally, our analysis indicated a significant causal relationship between them: high-risk tumors raised the risk of MG, with an average treatment effect (ATE) of 9.2%.
This implies that thymoma patients with types B2 and B3 face a considerably higher risk of developing MG than those with types A, AB, and B1. The model provides a novel and effective tool for evaluating the risk of MG development in patients with thymoma. Furthermore, the correlation and causal analyses have revealed pathways that connect the tumor to the risk of MG, with a notably higher incidence of MG observed in high-risk pathological subtypes. These insights contribute to a deeper understanding of MG and support a shift in medical practice from passive treatment to proactive intervention.
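
To illustrate how an average treatment effect of the kind reported above can be estimated, the following is a minimal sketch of the augmented inverse probability weighting (AIPW) formula, the estimator that doubly robust methods are built on. It is not the authors' implementation: the abstract does not say which software or nuisance models were used. Here the outcome model and the propensity score are simple per-stratum means over one discrete confounder, and the function name `aipw_ate`, the confounder, and the toy data are all hypothetical.

```python
from collections import defaultdict


def aipw_ate(strata, treated, outcome):
    """Estimate the average treatment effect with the AIPW (doubly robust) formula.

    strata  : discrete confounder value per patient (hypothetical, e.g. an age band)
    treated : 1 for a high-risk subtype, 0 otherwise
    outcome : 1 if the patient developed MG, 0 otherwise
    """
    # Group outcomes by (stratum, treatment arm) to build the nuisance estimates.
    by_arm = defaultdict(list)
    for s, t, y in zip(strata, treated, outcome):
        by_arm[(s, t)].append(y)

    scores = []
    for s, t, y in zip(strata, treated, outcome):
        y1, y0 = by_arm[(s, 1)], by_arm[(s, 0)]
        if not y1 or not y0:
            raise ValueError(f"stratum {s!r} lacks one arm (positivity violated)")
        mu1 = sum(y1) / len(y1)            # outcome model E[Y | T=1, X=s]
        mu0 = sum(y0) / len(y0)            # outcome model E[Y | T=0, X=s]
        e = len(y1) / (len(y1) + len(y0))  # propensity score P(T=1 | X=s)
        # Outcome-model contrast plus inverse-probability-weighted residual corrections.
        score = mu1 - mu0 + t * (y - mu1) / e - (1 - t) * (y - mu0) / (1 - e)
        scores.append(score)
    return sum(scores) / len(scores)


# Toy data, fabricated for illustration only (not from the study).
strata = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
outcome = [1, 0, 0, 0, 1, 1, 1, 1]

ate = aipw_ate(strata, treated, outcome)

# Unadjusted comparison of MG rates between the two arms, for contrast.
treated_rate = sum(y for t, y in zip(treated, outcome) if t) / sum(treated)
control_rate = sum(y for t, y in zip(treated, outcome) if not t) / (len(treated) - sum(treated))
naive = treated_rate - control_rate

print(f"AIPW ATE = {ate:.3f}, naive difference = {naive:.3f}")
```

On this toy data the confounder is unevenly distributed across the two arms, so the unadjusted difference (0.75) overstates the adjusted estimate (0.5). In practice the stratum means would be replaced by fitted outcome and propensity models, typically with cross-fitting.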