A deep learning framework for breast cancer diagnosis using Swin Transformer and Dual-Attention Multi-scale Fusion Network.

March 12, 2026

Authors

Aldawsari MA, Aldosari SJ, Ismail A, Emam MM

Affiliations (3)

  • Department of Mathematics, College of Sciences and Humanities, Prince Sattam bin Abdulaziz University, 11942, Al-Kharj, Saudi Arabia.
  • Physics Department, Al-Azhar University, Asyut, 71524, Egypt.
  • Faculty of Computers and Information, Minia University, Minia, Egypt. [email protected].

Abstract

Breast cancer is among the most prevalent cancers affecting women worldwide, and early detection through mammography is critical to reducing mortality rates. Convolutional neural networks (CNNs) have demonstrated notable effectiveness in classifying mammograms but are constrained in their ability to capture long-range contextual dependencies. Transformer-based models, by contrast, excel at modeling global relationships, yet they often require large datasets and substantial computing power, limiting their direct application in medical imaging. To overcome these limitations, we propose Swin-DAMFN, a novel dual-branch hybrid architecture for breast cancer classification from mammograms. The first branch utilizes a Swin Transformer to model global dependencies through shifted window self-attention, while the second branch employs a CNN-based Dual-Attention Multi-scale Fusion Network (DAMFN) to capture fine-grained local features, such as microcalcifications and structural distortions. The CNN branch incorporates two custom modules, Multi Separable Attention (MSA) and Tri-Shuffle Convolution Attention (TSCA), for multi-scale discriminative feature extraction. An attention-guided fusion mechanism integrates global and local features into a unified representation. To address dataset limitations, we adopt an advanced augmentation strategy that combines Generative Adversarial Network (GAN)-based synthesis of realistic mammograms with photometric augmentation that introduces appearance variability, thereby mitigating class imbalance and enhancing model generalization. A lightweight classification head based on global average pooling and fully connected layers ensures both efficiency and diagnostic accuracy. Extensive experiments on the MIAS and CBIS-DDSM datasets demonstrate that Swin-DAMFN achieves superior results, reaching 99.30% accuracy, 99.14% sensitivity, and 99.15% F1-score on CBIS-DDSM, while maintaining 98.75% accuracy, 98.37% sensitivity, and 98.42% F1-score on MIAS.
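The abstract does not include implementation details, but the dual-branch design it describes can be illustrated with a minimal PyTorch sketch. Everything below is an assumption made for illustration: the timm Swin-Tiny backbone, the feature dimensions, the simplified separable-convolution branch with channel attention (a rough stand-in for the unpublished MSA and TSCA modules), and the gated fusion are placeholders, not the authors' implementation.

```python
# Minimal sketch of the dual-branch idea from the abstract.
# NOT the authors' code: module designs and sizes are illustrative assumptions.
import torch
import torch.nn as nn
import timm  # assumed available; provides Swin Transformer backbones


class LocalCNNBranch(nn.Module):
    """Stand-in for the DAMFN branch: separable convolutions plus a
    squeeze-and-excitation style channel-attention gate (a rough proxy
    for the MSA/TSCA modules, whose designs are not in the abstract)."""
    def __init__(self, out_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            # depthwise + pointwise = separable convolution
            nn.Conv2d(32, 32, 3, padding=1, groups=32),
            nn.Conv2d(32, 64, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, out_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(out_dim, out_dim, 1), nn.Sigmoid()
        )
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling

    def forward(self, x):
        f = self.features(x)
        f = f * self.att(f)             # attention-weighted local features
        return self.pool(f).flatten(1)  # (B, out_dim)


class SwinDAMFNSketch(nn.Module):
    def __init__(self, num_classes=2, fuse_dim=256):
        super().__init__()
        # Global branch: Swin backbone with shifted-window self-attention,
        # returning pooled features (num_classes=0 removes the classifier).
        self.swin = timm.create_model(
            "swin_tiny_patch4_window7_224", pretrained=False, num_classes=0
        )
        self.local = LocalCNNBranch(out_dim=fuse_dim)
        self.proj = nn.Linear(self.swin.num_features, fuse_dim)
        # Attention-guided fusion: learned sigmoid gates over both branches.
        self.gate = nn.Sequential(nn.Linear(2 * fuse_dim, 2 * fuse_dim), nn.Sigmoid())
        # Lightweight head: pooling happened per branch, then fully connected.
        self.head = nn.Linear(2 * fuse_dim, num_classes)

    def forward(self, x):
        g = self.proj(self.swin(x))       # global Swin features
        l = self.local(x)                 # local CNN features
        fused = torch.cat([g, l], dim=1)
        fused = fused * self.gate(fused)  # attention-guided fusion
        return self.head(fused)


model = SwinDAMFNSketch()
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 2])
```

A faithful reproduction would replace LocalCNNBranch with the paper's DAMFN (its MSA and TSCA modules) and train with the GAN-based and photometric augmentation pipeline the abstract describes.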

Topics

Journal Article
