A predictive model for pathological complete response in neoadjuvant-treated breast cancer: integrating pretreatment clinicopathologic and contrast-enhanced ultrasound characteristics via interpretable machine learning.

July 3, 2026

DOI: 10.1186/s12885-026-16452-x PMID: 42399808

Authors

Huang D, Chen Y, Chen W, Wang Y, Lin W, Tang L, Liu X

Affiliations (3)

Department of Ultrasound, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian, China.
Department of Ultrasound, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian, China.
Department of Radiology, Clinical Oncology School of Fujian Medical University, Fujian Cancer Hospital, Fuzhou, Fujian, China.

Abstract

Predicting pathological complete response (pCR) before neoadjuvant therapy (NAT) begins would help tailor therapeutic regimens for patients with breast cancer. This study aimed to develop and internally validate an interpretable machine-learning model for pretreatment pCR prediction using routinely available clinicopathologic variables, hematologic indices, and ultrasound (US)/contrast-enhanced ultrasound (CEUS) descriptors.
This retrospective analysis included 309 sequential breast cancer cases managed with NAT followed by surgery. Patients were randomly divided into training and testing cohorts at a ratio of 8:2, stratified by pCR status. Feature selection was performed exclusively in the training cohort using least absolute shrinkage and selection operator (LASSO) regression. Five machine-learning classifiers (CatBoost, LightGBM, XGBoost, SVM, and CART) were developed and evaluated. Model performance was assessed using the area under the receiver operating characteristic curve (AUC, with 95% CI), accuracy, specificity, sensitivity (recall), precision, F1-score, G-mean, Brier score, decision curve analysis (DCA), and calibration analysis. Shapley Additive Explanations (SHAP) were used for model interpretation.
pCR was achieved in 29.8% (92/309) of patients. LASSO regression selected eight predictors: CEA, clinical T stage, ER, PR, HER2, Ki-67 index, hyperechoic halo on US, and range expansion on CEUS. In the testing cohort, CatBoost achieved an AUC of 0.8295 (95% CI: 0.7166-0.9231) and, among the five classifiers, the highest accuracy (0.8065), specificity (0.8182), and precision (0.6364). CatBoost also demonstrated acceptable calibration, with a Brier score of 0.1595, and favorable net benefit across clinically relevant threshold probabilities. SHAP analysis identified HER2 status, CEA, clinical T stage, and hyperechoic halo as the leading contributors to the final model.
An interpretable pretreatment CatBoost model showed good discrimination, acceptable calibration, and favorable clinical net benefit for predicting pCR after NAT in breast cancer. This approach may help support pretreatment risk stratification and inform multidisciplinary discussions.
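
For readers less familiar with the evaluation metrics listed in the abstract, the following is a minimal sketch of how they are conventionally defined, written in plain Python with no external libraries. It is not the authors' code. The function names (confusion_metrics, brier_score, auc) are illustrative, and the confusion-matrix counts are an assumption: the abstract does not report them, and they were chosen only because they reproduce the three reported CatBoost values (accuracy 0.8065, specificity 0.8182, precision 0.6364). Any other value the code derives from those counts, such as sensitivity, is not a result of the study. The small probability vectors are toy data.

```python
from math import sqrt


def confusion_metrics(tp, fp, tn, fn):
    """Threshold-based metrics computed from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    g_mean = sqrt(sensitivity * specificity)  # geometric mean of the two rates
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
    }


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the chance a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a sketch but not for large cohorts.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Reported in the abstract: 92 of 309 patients achieved pCR.
pcr_rate = 92 / 309
print(f"pCR rate: {pcr_rate:.1%}")

# ASSUMPTION: hypothetical counts for a 62-patient test cohort, chosen only to
# match the reported accuracy, specificity, and precision. Not from the paper.
metrics = confusion_metrics(tp=14, fp=8, tn=36, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

# Toy data (not from the study) to exercise the probability-based metrics.
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.4, 0.6, 0.2, 0.1]
print(f"toy AUC: {auc(toy_labels, toy_probs):.4f}")
print(f"toy Brier score: {brier_score(toy_labels, toy_probs):.4f}")
```

A lower Brier score indicates better-calibrated probabilities, while AUC measures ranking quality independently of any decision threshold. The confidence interval reported for the AUC would require a resampling or analytic method that the abstract does not specify, so none is sketched here.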


Topics

Journal Article
