Development and validation of an interpretable machine learning model for diagnosing pathologic complete response in breast cancer.

Authors

Zhou Q, Peng F, Pang Z, He R, Zhang H, Jiang X, Song J, Li J

Affiliations (7)

  • Department of Breast Surgery, Tangshan People's Hospital (Hebei Key Laboratory of Molecular Oncology, Affiliated Tangshan People's Hospital of North China University of Science and Technology), Tangshan, Hebei, China.
  • Department of Radiology, Tangshan People's Hospital, Tangshan, Hebei, China.
  • Department of Breast Surgery, Cancer Hospital of China Medical University, Liaoning Cancer Hospital and Institute, Shenyang, Liaoning, China.
  • Department of Breast Diagnosis and Treatment Center, Tangshan People's Hospital (Hebei Key Laboratory of Molecular Oncology, Affiliated Tangshan People's Hospital of North China University of Science and Technology), Tangshan, Hebei, China.
  • Department of Breast Surgery, The Second Hospital of Jilin University, Changchun, Jilin, China.
  • Department of Breast Surgery, The First Hospital of China Medical University, Shenyang, Liaoning, China.
  • Department of Breast Surgery, Tangshan People's Hospital (Hebei Key Laboratory of Molecular Oncology, Affiliated Tangshan People's Hospital of North China University of Science and Technology), Tangshan, Hebei, China.

Abstract

Pathologic complete response (pCR) following neoadjuvant chemotherapy (NACT) is a critical prognostic marker for patients with breast cancer and may allow surgery to be omitted. However, noninvasive and accurate pCR diagnosis remains a significant challenge because of the limitations of current imaging techniques, particularly when tumors completely disappear after NACT. We developed a novel framework incorporating Dimensional Accumulation for Layered Images (DALI) and an Attention-Box annotation tool to address the unique challenge of analyzing imaging data in which target lesions are absent. These methods transform three-dimensional magnetic resonance imaging into two-dimensional representations and ensure consistent target tracking across time points. Preprocessing techniques, including tissue-region normalization and subtraction imaging, were used to enhance model performance. Imaging features were extracted using radiomics and pretrained deep-learning models, and machine-learning algorithms were integrated into a stacked ensemble model. The approach was developed using the I-SPY 2 dataset and validated with an independent Tangshan People's Hospital cohort. The stacked ensemble model outperformed the individual models, achieving an area under the receiver operating characteristic curve of 0.831 (95% confidence interval, 0.769-0.887) on the test set. Tissue-region normalization and subtraction imaging significantly enhanced diagnostic accuracy. SHapley Additive exPlanations (SHAP) analysis identified the variables that contributed to the model predictions, supporting model interpretability. This framework addresses the challenges of noninvasive pCR diagnosis. Integrating advanced preprocessing techniques improves feature quality and model performance, supporting clinicians in identifying patients who can safely omit surgery, which may reduce unnecessary treatment and improve quality of life for patients with breast cancer.
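
The headline result above is an area under the receiver operating characteristic curve (AUC) reported with a 95% confidence interval. The abstract does not say how that interval was computed, so the following is only a minimal sketch of one common approach, a percentile bootstrap, and should not be read as the authors' method. The function names (roc_auc, bootstrap_auc_ci), the parameters, and the toy labels and scores are all hypothetical and are not study data.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    chosen for illustration only)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has one class only
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return aucs[lo_idx], aucs[hi_idx]


# Toy example: 1 = pCR, 0 = residual disease (hypothetical values).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]

auc = roc_auc(labels, scores)  # 14 of 16 pairs ranked correctly -> 0.875
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

With only eight cases the interval is very wide; on a test set of realistic size the same procedure yields a much narrower range, comparable in form to the one reported above.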

Topics

  • Breast Neoplasms
  • Machine Learning
  • Journal Article
  • Validation Study
