Current state of mammography-based artificial intelligence for future breast cancer risk prediction: a systematic review.
Authors
Affiliations (8)
- Department of Radiology, University of Washington School of Medicine, WA, USA.
- Northwest Screening and Cancer Outcomes Research Enterprise, University of Washington, Seattle, WA, USA.
- Lunit, Seoul, South Korea.
- Department of Surgery, Medical University of South Carolina, Charleston, SC, USA.
- Department of Radiology, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI, USA.
- Computational Precision Health, University of California Berkeley and University of California San Francisco, CA, USA.
- Departments of Medicine and Epidemiology and Biostatistics, University of California, San Francisco, CA, USA.
- Division of Epidemiology, Department of Quantitative Sciences, Mayo Clinic, Rochester, MN, USA.
Abstract
There is growing interest in artificial intelligence (AI) models for predicting future breast cancer (BC). We performed a systematic review of studies of mammography-based AI models for future BC risk prediction to summarize current evidence, identify knowledge gaps, and inform future research directions. We searched six databases for studies from January 1, 2012 to February 28, 2025 that evaluated mammography-based AI models for future BC risk prediction. We extracted study design, participants' race and ethnicity, geographic origin, mammogram type, vendor, prediction time frame, BC type predicted, external validation, and exclusion of cancers diagnosed on the index screening mammogram. Areas under the receiver operating characteristic curve (AUCs) were summarized overall and by study characteristics. Forty-one studies met our inclusion criteria. All studies were retrospective. Most used 2D mammograms (n = 37) and images acquired on Hologic equipment (n = 25), and the United States was the most common study location (n = 17); White, non-Hispanic women were the most represented group. Nearly all studies (n = 40) assessed discrimination performance, with a median AUC of 0.71 for ≤2-year risk prediction, 0.72 for 3- to 4-year prediction, and 0.71 for ≥5-year prediction. Median AUC was 0.75 for studies that included index cancers versus 0.68 for studies that excluded them. Six studies reported model calibration, with performance ranging from good calibration to overestimation of risk. Future studies should evaluate models using digital breast tomosynthesis, examine performance for aggressive or advanced BC, include diverse populations, and evaluate both discrimination and calibration. Before implementation, prospective evaluations are needed to determine the clinical utility of mammography-based AI models for personalized, risk-based BC screening.