Accuracy of Deep Learning for Detecting Axillary Lymph Node Metastasis in Breast Cancer: Systematic Review and Meta-Analysis.
Authors
Affiliations (3)
- Department of Breast Surgery, Northern Jiangsu People's Hospital Affiliated to Yangzhou University, No.98 Nantong West Road, Guangling District, Yangzhou, 225001, China, 86 18051060677.
- Department of Ultrasonography, Northern Jiangsu People's Hospital Affiliated to Yangzhou University, Yangzhou, China.
- Institute of Translational Medicine, Jiangsu Key Laboratory of Integrated Traditional Chinese and Western Medicine for Prevention and Treatment of Senile Diseases, Medical College, Yangzhou University, Yangzhou, China.
Abstract
Axillary lymph node metastasis (ALNM) is an important factor in the staging and prognosis of breast cancer (BC). However, the noninvasive diagnosis of ALNM remains challenging. While some deep learning (DL) models have been developed for preoperative ALNM assessment, their performance has not been systematically evaluated. This study aims to evaluate the effectiveness of DL in detecting ALNM, providing evidence to support clinical diagnostic tools.

Embase, Web of Science, PubMed, and Cochrane Library were searched from their inception through January 26, 2026. The Quality Assessment of Diagnostic Accuracy Studies tool was used to assess the risk of bias in the included studies. A bivariate mixed effects model was applied for analysis, and subgroup analyses were conducted by imaging modality.

This meta-analysis included 28 independent studies and pooled data from 20,811 patients with BC, of whom 7123 had confirmed ALNM. The pooled diagnostic performance of DL models for detecting ALNM in BC (bivariate mixed effects model) was as follows: sensitivity 0.80 (95% CI 0.76-0.84), specificity 0.85 (95% CI 0.80-0.88), diagnostic odds ratio (DOR) 22 (95% CI 16-30), and area under the summary receiver operating characteristic curve (AUC) 0.89 (95% CI 0.86-0.92). The positive likelihood ratio (LR+) was 5.2 (95% CI 4.1-6.5), and the negative likelihood ratio (LR-) was 0.24 (95% CI 0.19-0.29). For ultrasound-based DL models, the pooled sensitivity and specificity were 0.79 (95% CI 0.72-0.84) and 0.86 (95% CI 0.79-0.91), respectively, with an LR+ of 5.5 (95% CI 3.8-8.1), an LR- of 0.25 (95% CI 0.19-0.32), a DOR of 22 (95% CI 15-33), and an AUC of 0.89 (95% CI 0.86-0.91). For magnetic resonance imaging (MRI)-based DL models, the pooled sensitivity was 0.78 (95% CI 0.71-0.83) and the pooled specificity was 0.82 (95% CI 0.76-0.87), with an LR+ of 4.4 (95% CI 3.3-5.9), an LR- of 0.27 (95% CI 0.21-0.35), a DOR of 16 (95% CI 11-25), and an AUC of 0.87 (95% CI 0.84-0.90). For computed tomography (CT)-based models, the sensitivity was 0.90 (95% CI 0.78-0.96), the specificity was 0.88 (95% CI 0.84-0.92), and the AUC was 0.91 (95% CI 0.89-0.94).

Current DL methods for detecting ALNM in BC primarily use ultrasound, MRI, and CT. DL models based on all 3 modalities demonstrated good diagnostic performance. CT-based models had the highest sensitivity and AUC, while their specificity was comparable to that of ultrasound-based models. These findings provide supportive evidence for the development or optimization of clinical diagnostic models.
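As a reading aid, the likelihood ratios and DOR reported above can be approximately related to the pooled sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity), LR- = (1 - sensitivity) / specificity, and DOR = LR+ / LR-. The sketch below applies these textbook formulas to the point estimates quoted in the abstract. The function name `derived_metrics` and the `pooled` dictionary are illustrative assumptions and are not taken from the study, which estimated its summary measures with a bivariate mixed effects model.

```python
def derived_metrics(sensitivity: float, specificity: float) -> tuple[float, float, float]:
    """Return (LR+, LR-, DOR) computed from a sensitivity/specificity pair.

    Uses the standard definitions only; this is not the bivariate model
    that the meta-analysis itself fitted.
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    dor = lr_pos / lr_neg
    return lr_pos, lr_neg, dor


# Pooled (sensitivity, specificity) point estimates quoted in the abstract.
# Only modalities with reported LR+, LR-, and DOR are listed for comparison.
pooled = {
    "overall": (0.80, 0.85),
    "ultrasound": (0.79, 0.86),
    "MRI": (0.78, 0.82),
}

for name, (sens, spec) in pooled.items():
    lr_pos, lr_neg, dor = derived_metrics(sens, spec)
    print(f"{name}: LR+ ~ {lr_pos:.2f}, LR- ~ {lr_neg:.2f}, DOR ~ {dor:.1f}")
```

Values computed this way fall close to, but not exactly on, the reported figures (for example, about 5.3 versus the reported overall LR+ of 5.2), which is expected when summary measures come from a jointly fitted model and rounded inputs rather than from this direct arithmetic.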