Explainable Breast Cancer Detection Using Hierarchical Multi-Scale and Edge-Aware Transformer Networks.
Authors
Affiliations (6)
- Department of Computer Science and Information, College of Science Zulfi, Majmaah University, Al-Majmaah 11952, Saudi Arabia.
- Department of Software Engineering, Faculty of Computing, Blekinge Institute of Technology, 371 79 Karlskrona, Sweden.
- Department of Computer Science, University of Wah, Wah Cantt 47040, Pakistan.
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
- Center for Scientific Research and Entrepreneurship, Northern Border University, Arar 73213, Saudi Arabia.
- Department of Computer Science and Engineering, Soonchunhyang University, Asan 31538, Republic of Korea.
Abstract
Breast cancer remains the leading cause of cancer-related deaths among women globally. Early detection through mammography is vital for improving survival rates; however, the large volume of medical images and subtle variations in lesion characteristics make manual interpretation difficult. Recent automated diagnostic models based on deep learning have shown strong potential for breast cancer classification, but overfitting, high computational complexity, limited generalization, and insufficient interpretability remain unresolved. This paper proposes a computationally efficient and context-aware deep learning framework for breast cancer classification using transformer-based multi-scale attention mechanisms and explainable artificial intelligence (XAI). The proposed architecture integrates Hierarchical Multi-Scale Transformer (HMT) and Edge-Aware Local Transformer (ELT) modules to jointly capture global contextual dependencies and boundary-sensitive local representations from mammographic images. HMT models global semantic interactions across multiple feature scales, while ELT improves feature refinement in high-entropy regions. In addition, an Adaptive Contextual Refinement (ACR) module is introduced to preserve semantically consistent feature representations across spatial resolutions. A Meta-Ensemble Classification (MEC) framework then combines weighted support vector machine (SVM) and K-Nearest Neighbors (KNN) classifiers using validation-guided class-adaptive weighting. The framework is evaluated on four benchmark mammography datasets: CBIS-DDSM, DDSM, INBreast, and MIAS. It achieved accuracy above 99% on all four datasets. It also surpassed transformer-based baselines, including Swin-T and ViT, achieving approximately 7% higher accuracy on the CBIS-DDSM dataset while maintaining lower parameter complexity.
The proposed framework also demonstrated strong cross-dataset generalization and consistently achieved high precision, recall, and F1-score values across all benchmark datasets. To improve model interpretability, Grad-CAM, SHAP, Occlusion Sensitivity Analysis (OSA), and the proposed TIxAI consistency analysis framework are incorporated to provide a preliminary explainability assessment for mammographic classification. The explainability analysis showed spatially consistent saliency behavior across the benchmark datasets; however, the current evaluation is based on internal saliency consistency rather than external clinical validation against expert lesion annotations. Overall, the proposed framework provides an effective and computationally efficient approach to automated breast cancer classification while improving model interpretability.
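The abstract does not specify how the validation-guided class-adaptive weighting in MEC is computed. The following is a minimal sketch of one plausible reading, not the authors' implementation: each base classifier (for example the SVM and the KNN) receives a per-class weight proportional to its recall for that class on a validation set, and the weighted class probabilities are summed before taking the argmax. The function names, the choice of recall as the weighting criterion, and the toy probabilities are all assumptions made for illustration.

```python
from typing import List, Sequence

# One row of class probabilities per sample.
Probs = Sequence[Sequence[float]]


def per_class_recall(probs: Probs, labels: Sequence[int], n_classes: int) -> List[float]:
    """Recall of the argmax predictions for each class on a validation set."""
    hits = [0] * n_classes
    totals = [0] * n_classes
    for row, y in zip(probs, labels):
        pred = max(range(n_classes), key=lambda c: row[c])
        totals[y] += 1
        if pred == y:
            hits[y] += 1
    return [hits[c] / totals[c] if totals[c] else 0.0 for c in range(n_classes)]


def class_adaptive_weights(val_probs_by_model: Sequence[Probs],
                           val_labels: Sequence[int],
                           n_classes: int) -> List[List[float]]:
    """Hypothetical weighting rule: weights[m][c] is model m's share of the
    total validation recall for class c, so weights sum to 1 per class."""
    recalls = [per_class_recall(p, val_labels, n_classes) for p in val_probs_by_model]
    n_models = len(recalls)
    weights = []
    for m in range(n_models):
        row = []
        for c in range(n_classes):
            denom = sum(r[c] for r in recalls)
            # Fall back to uniform weights if no model recovers class c.
            row.append(recalls[m][c] / denom if denom > 0 else 1.0 / n_models)
        weights.append(row)
    return weights


def fuse_predict(test_probs_by_model: Sequence[Probs],
                 weights: Sequence[Sequence[float]]) -> List[int]:
    """Sum the class-weighted probabilities across models and take the argmax."""
    n_models = len(weights)
    n_classes = len(weights[0])
    preds = []
    for i in range(len(test_probs_by_model[0])):
        scores = [
            sum(weights[m][c] * test_probs_by_model[m][i][c] for m in range(n_models))
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=lambda c: scores[c]))
    return preds


# Toy validation outputs for two base classifiers (illustrative numbers only).
val_labels = [0, 0, 1, 1]
svm_val = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]
knn_val = [[0.4, 0.6], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9]]

weights = class_adaptive_weights([svm_val, knn_val], val_labels, n_classes=2)
preds = fuse_predict([[[0.55, 0.45]], [[0.30, 0.70]]], weights)
```

In this toy setup the first classifier is more reliable on class 0 and the second on class 1, so each dominates the fused score for the class it handles better. Other criteria, such as per-class F1-score or precision, could be substituted in `per_class_recall` without changing the fusion step.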