Dual-input spatio-temporal transformer model: Predicting the efficacy of NACT in breast cancer based on DCE-MRI images.
Authors
Affiliations (3)
- College of Information Science and Technology, Zhejiang Shuren University, Hangzhou, China.
- Department of Decision & System Sciences, Saint Joseph's University, Philadelphia, PA, USA.
- Department of Breast Surgery, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.
Abstract
Breast cancer remains a leading malignancy among women worldwide. Neoadjuvant chemotherapy (NACT) can shrink tumors and increase the possibility of breast-conserving surgery, yet many patients respond poorly, causing treatment delays and unnecessary toxicity. Accurate early prediction of NACT response is critical for tailoring therapy. To address this challenge, we developed a dual-input spatiotemporal transformer (DIST) model that analyzes dynamic contrast-enhanced MRI scans obtained before and after the first chemotherapy cycle. By jointly learning spatial and temporal tumor changes, DIST provides early, interpretable imaging-based predictions of treatment response. In validation, the model achieved high accuracy and strong generalizability across institutional and external datasets. These findings demonstrate that DIST enables reliable early assessment of chemotherapy efficacy, offering clinicians a valuable tool to optimize treatment strategies, minimize ineffective therapy, and improve patient outcomes.