Interpretable Machine Learning Model for Differentiating Uterine Sarcoma From Atypical Leiomyoma Based on Conventional MRI Features and Radiomics.
Authors
Affiliations (3)
- Department of Graduate, Bengbu Medical University, Bengbu, Anhui 233030, China (Z.Y., W.S., Y.J., K.G., C.W.).
- Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230031, China (T.L., Y.C., C.W., B.S., M.F., C.W.); Department of Radiology, Anhui Provincial Cancer Hospital, Hefei, Anhui 230031, China (T.L., Y.C., C.W., B.S., M.F., C.W.).
- Department of Graduate, Bengbu Medical University, Bengbu, Anhui 233030, China (Z.Y., W.S., Y.J., K.G., C.W.); Department of Radiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230031, China (T.L., Y.C., C.W., B.S., M.F., C.W.); Department of Radiology, Anhui Provincial Cancer Hospital, Hefei, Anhui 230031, China (T.L., Y.C., C.W., B.S., M.F., C.W.). Electronic address: [email protected].
Abstract
This study aimed to develop interpretable machine learning (ML) models that integrate conventional magnetic resonance imaging (MRI) features and radiomics to preoperatively differentiate uterine sarcoma (US) from atypical leiomyoma (ALM). In this retrospective study, 160 patients (47 US, 113 ALM) were randomly assigned to training (n=112) and test (n=48) cohorts. Two blinded radiologists assessed 10 MRI features from pelvic MRI examinations, including tumor border morphology, T2-weighted image (T2WI) signal heterogeneity, uterine endometrial cavity, apparent diffusion coefficient (ADC) value, and other features. Significant MRI features were identified through univariable and multivariable logistic regression analyses. Radiomics features were extracted from axial T2WI and diffusion-weighted imaging (DWI) sequences, and least absolute shrinkage and selection operator (LASSO) regression identified four discriminative features for radiomic score (radscore) calculation. Five ML models, namely logistic regression (LR), random forest (RF), eXtreme gradient boosting (XGBoost), support vector machine (SVM), and Gaussian Naive Bayes (GNB), were trained using the significant MRI predictors and the radscore. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis (DCA). The SHapley Additive exPlanation (SHAP) framework provided interpretable visualizations of feature contributions. Multivariable analysis identified four MRI discriminators: heterogeneous hyperintensity on T2WI (odds ratio [OR]=43.767, P=0.021), ill-defined tumor border (OR=4.887, P=0.038), interrupted uterine cavity (OR=15.947, P=0.003), and low ADC values (OR=0.026, P=0.009). The XGBoost model achieved the highest performance, with AUCs of 0.991 (95% confidence interval [CI]: 0.978-1.000) and 0.909 (95% CI: 0.822-0.995) in the training and test cohorts, respectively. 
SHAP analysis highlighted ADC value as the most influential predictor, followed by tumor border, signal intensity on T2WI, radscore, and uterine endometrial cavity. DCA confirmed clinical utility across probability thresholds, and calibration curves demonstrated strong agreement between predicted and observed outcomes. Interpretable ML models integrating MRI biomarkers and radiomics provide a transparent and clinically actionable tool for preoperative differentiation of US and ALM. By quantifying feature contributions through SHAP values, this framework helps bridge the "black-box" gap in ML, fostering clinicians' trust and empowering them to formulate precise interventions, such as appropriate surgical planning to avoid the morcellation of suspected US.
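
The two discrimination and utility metrics named in the abstract, AUC and the net benefit plotted in DCA, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`roc_auc`, `net_benefit`, `net_benefit_treat_all`) and the toy labels and probabilities are hypothetical and chosen only for illustration. It assumes the standard definitions, namely AUC as the probability that a randomly chosen positive case (here, US) receives a higher predicted probability than a randomly chosen negative case (ALM), and net benefit at threshold probability pt as TP/n - (FP/n) * pt / (1 - pt).

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Net benefit of acting on cases whose predicted probability is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy in DCA: treat every patient as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = sarcoma, 0 = atypical leiomyoma.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.90, 0.80, 0.35, 0.60, 0.30, 0.20, 0.10, 0.05]
    print("AUC:", roc_auc(labels, probs))
    for pt in (0.1, 0.3, 0.5):
        print(pt, net_benefit(labels, probs, pt), net_benefit_treat_all(labels, pt))
```

In a decision curve, a model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference (net benefit of zero).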