Performance of clinical breast cancer risk prediction models versus a mammography-based artificial intelligence risk model.
Authors
Affiliations (8)
- Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA.
- Division of Biology and Biomedical Sciences, Washington University in St. Louis, St. Louis, MO, USA.
- Department of Radiology, Mayo Clinic, Phoenix, AZ, USA.
- Division of Experimental Pathology and Laboratory Medicine, Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA.
- Division of General Internal Medicine, Department of Internal Medicine, Mayo Clinic, Rochester, MN, USA.
- Department of Radiology, Columbia University Irving Medical Center, New York City, NY, USA.
- Departments of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
- Departments of Medicine and Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA.
Abstract
Artificial intelligence (AI)-based mammography breast cancer (BC) risk prediction models show improved discriminatory accuracy relative to clinical risk models. However, data on their calibration are limited. This study compared the performance of three clinical BC risk models (Gail, Tyrer-Cuzick [TC] v8, and Breast Cancer Surveillance Consortium [BCSC] v3) with that of the MIRAI AI risk model. Digital mammograms were ascertained from a screening mammography cohort of 12,308 women within the Mayo Clinic Biobank, among whom 250 incident BCs (176 invasive) occurred within five years. We predicted five-year BC risk, estimated discriminatory accuracy (concordance [C]-index) and calibration (observed-to-expected ratio [O/E]) for both overall and invasive BC, and compared estimates using bootstrapping approaches. MIRAI demonstrated similar or improved discriminatory accuracy for overall BC (C-index = 0.71, 95% confidence interval [CI] = 0.68-0.74) and invasive BC (C-index = 0.71, 95% CI = 0.67-0.75) compared with clinical models (overall BC: C-index = 0.59-0.68; invasive BC: C-index = 0.60-0.68). MIRAI's calibration for risk of overall BC (O/E = 0.96, 95% CI = 0.85-1.08) was improved compared with Gail (O/E = 1.22, 95% CI = 1.07-1.38) and BCSC (O/E = 1.38, 95% CI = 1.22-1.56) but similar to TC with volumetric percent density and polygenic risk score (O/E = 0.99, 95% CI = 0.87-1.13). However, for low-risk women (approximately 50%), MIRAI overestimated risk of overall BC. MIRAI also overestimated risk of invasive BC across the risk spectrum (O/E = 0.68, 95% CI = 0.58-0.78), while clinical models had good calibration (O/E = 0.86-0.99). MIRAI demonstrated stronger discriminatory accuracy than clinical models for five-year overall and invasive BC risk prediction but overestimated risk for both BC endpoints. Discriminatory accuracy and calibration for invasive cancer should both be evaluated before AI-based risk models are implemented.
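The two performance measures named in the abstract can be illustrated with a minimal Python sketch. This is not the study's analysis code. It assumes a simplified setting in which each woman has a binary five-year outcome and a single predicted five-year risk, so censoring and follow-up time are ignored (the study's C-index would normally account for these). The function names `observed_expected_ratio`, `concordance`, and `bootstrap_ci` are hypothetical, and the percentile bootstrap shown here is one common choice, not necessarily the approach the authors used.

```python
import random


def observed_expected_ratio(events, predicted_risks):
    """O/E: observed number of events divided by the sum of predicted risks."""
    expected = sum(predicted_risks)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(events) / expected


def concordance(events, predicted_risks):
    """Share of case/non-case pairs in which the case has the higher
    predicted risk; ties count as 0.5. Binary-outcome simplification
    of the C-index (assumption: no censoring)."""
    cases = [r for e, r in zip(events, predicted_risks) if e == 1]
    non_cases = [r for e, r in zip(events, predicted_risks) if e == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for case_risk in cases:
        for non_case_risk in non_cases:
            if case_risk > non_case_risk:
                score += 1.0
            elif case_risk == non_case_risk:
                score += 0.5
    return score / (len(cases) * len(non_cases))


def bootstrap_ci(events, predicted_risks, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample women with replacement,
    recompute the statistic, and take the empirical quantiles."""
    rng = random.Random(seed)
    n = len(events)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_events = [events[i] for i in idx]
        resampled_risks = [predicted_risks[i] for i in idx]
        try:
            estimates.append(statistic(resampled_events, resampled_risks))
        except ValueError:
            continue  # skip degenerate resamples (e.g. no cases drawn)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lower, upper
```

Under this definition, an O/E below 1 means the model predicted more cancers than were observed (overestimation, as with MIRAI's 0.68 for invasive BC), and an O/E above 1 means it predicted fewer (underestimation).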