Multiparametric MRI-based Interpretable Machine Learning Radiomics Model for Distinguishing Between Luminal and Non-luminal Tumors in Breast Cancer: A Multicenter Study.
Zhou Y, Lin G, Chen W, Chen Y, Shi C, Peng Z, Chen L, Cai S, Pan Y, Chen M, Lu C, Ji J, Chen S
Jul 1 2025
To construct and validate an interpretable machine learning (ML) radiomics model derived from multiparametric magnetic resonance imaging (MRI) to differentiate between luminal and non-luminal breast cancer (BC) subtypes.

This study enrolled 1098 BC participants from four medical centers, divided into a training cohort (n = 580) and validation cohorts 1-3 (n = 252, 89, and 177, respectively). Radiomics features were extracted from multiparametric MRI, comprising T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC) maps, and dynamic contrast-enhanced (DCE) imaging. Five ML algorithms were applied to develop candidate radiomics models, from which the best-performing model was identified. An ML-based combined model incorporating the optimal radiomics features and clinical predictors was then constructed, and its performance was assessed through receiver operating characteristic (ROC) analysis. The Shapley additive explanation (SHAP) method was used to assess model interpretability.

Tumor size and MR-reported lymph node status were selected as significant clinical predictors. Thirteen radiomics features were identified from the multiparametric MRI images. The extreme gradient boosting (XGBoost) radiomics model performed best, achieving areas under the curve (AUCs) of 0.941, 0.903, 0.862, and 0.894 in the training cohort and validation cohorts 1-3, respectively. The XGBoost combined model showed favorable discriminative power, with AUCs of 0.956, 0.912, 0.894, and 0.906 in the training cohort and validation cohorts 1-3, respectively. The SHAP visualization facilitated global interpretation, identifying "ADC_wavelet-HLH_glszm_ZoneEntropy" and "DCE_wavelet-HLL_gldm_DependenceVariance" as the most significant features for the model's predictions.

The XGBoost combined model derived from multiparametric MRI may effectively differentiate between luminal and non-luminal BC and aid in treatment decision-making.
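For readers less familiar with the AUC figures reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from predicted scores, using its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc`, the label coding (1 = luminal, 0 = non-luminal), and the toy data are assumptions made for illustration only; they are not taken from the study and do not reproduce its results.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data (hypothetical coding: 1 = luminal, 0 = non-luminal).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sketch; a sort-based rank computation gives the same value more efficiently on large cohorts.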
An interpretable machine learning radiomics model can preoperatively predict luminal and non-luminal subtypes in breast cancer, thereby aiding therapeutic decision-making.
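The interpretability claim rests on Shapley values, which attribute a single prediction to individual features. As a rough illustration of the idea, the sketch below computes exact Shapley values for a tiny hand-written model by enumerating feature subsets, with absent features replaced by a fixed baseline. Everything here (`shapley_values`, `toy_model`, the baseline, and the input values) is hypothetical; the study applied the SHAP method to an XGBoost model, which relies on more efficient tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition take their baseline value. The cost is
    exponential in the number of features, so this suits toy cases only.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(f):
    # Hypothetical model: one additive term plus one interaction term.
    return 2.0 * f[0] + f[1] * f[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

By construction, the attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features that produce it.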