Longitudinal Analysis of Changes in Deep Learning Image-based Breast Cancer Risk Scores over Time.
Authors
Affiliations (2)
- Department of Radiology, Massachusetts General Hospital, Harvard Medical School, 55 Fruit St. WAC 240, Boston, MA 02114.
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY.
Abstract
Background Artificial intelligence (AI)-based image-only deep learning (DL) models provide individualized 5-year breast cancer risk estimates directly from screening mammograms. Although these models are validated for static risk prediction, their behavior over time has not been established.
Purpose To evaluate whether image-based DL risk scores change over time and whether trajectories differ between women who develop breast cancer and those who remain cancer-free.
Materials and Methods In this multisite retrospective cohort study, women who underwent screening mammography between January 2009 and December 2019 were included. Women diagnosed with invasive cancer or ductal carcinoma in situ within 1 year of the index examination (n = 817) were compared with cancer-free controls. A validated image-only DL model generated continuous 5-year risk scores. Linear mixed-effects models with random intercepts and slopes evaluated score trajectories over time, and group comparisons were performed using the Wilcoxon rank sum and χ² tests. Subgroup analyses were stratified by age and breast density.
Results The final study cohort included 158 807 screening mammograms from 54 014 women (median age, 61 years [IQR, 52-70 years]), including 817 patients with cancer and 53 197 cancer-free controls. Among women who developed cancer, the median risk score increased from 2.1 six years before diagnosis to 6.6 at the index examination. In contrast, women who remained cancer-free had stable scores (range, 1.8-2.2). In longitudinal models, scores increased over time in the cancer group (slope, 1.13 per year [95% CI: 1.07, 1.18]; P < .001), whereas scores changed minimally in the cancer-free control group (slope, 0.09 per year [95% CI: 0.08, 0.10]; P < .001) (slope difference, Δ = 1.04 [95% CI: 0.99, 1.09]; P < .001). The findings were consistent across subgroups.
Conclusion AI-based risk scores from screening mammograms evolved over time and diverged between women who did and did not develop breast cancer, supporting their potential as dynamic biomarkers for risk-adaptive screening and prevention strategies.
© RSNA, 2026. See also the editorial by Mann and Wang in this issue.
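The slope comparison reported in the Results can be illustrated with a minimal sketch. It is not the authors' analysis: it swaps the linear mixed-effects model for a simpler two-stage summary (a per-woman least-squares slope, then a group mean), and it runs on synthetic trajectories. The examination schedule, group sizes, variance settings, and all function names are assumptions chosen for illustration. Only the two target slopes (1.13 and 0.09 per year) come from the abstract.

```python
import random
from statistics import mean


def ols_slope(times, scores):
    """Least-squares slope of scores on times for a single woman."""
    t_bar = mean(times)
    s_bar = mean(scores)
    num = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, scores))
    den = sum((t - t_bar) ** 2 for t in times)
    return num / den


def simulate_woman(rng, mean_slope, times):
    """Synthetic trajectory: random intercept + random slope + noise.

    The intercept, slope spread, and noise level are hypothetical values,
    not estimates from the study.
    """
    intercept = rng.gauss(2.0, 0.5)
    slope = rng.gauss(mean_slope, 0.2)
    return [intercept + slope * t + rng.gauss(0.0, 0.3) for t in times]


def mean_group_slope(rng, n_women, mean_slope, times):
    """Stage 1: fit one slope per woman. Stage 2: average within the group."""
    slopes = [
        ols_slope(times, simulate_woman(rng, mean_slope, times))
        for _ in range(n_women)
    ]
    return mean(slopes)


rng = random.Random(0)           # fixed seed so the sketch is reproducible
times = [0, 1, 2, 3, 4, 5, 6]    # years since first exam (assumed schedule)

# Group sizes are illustrative; the target slopes are the reported estimates.
cancer_slope = mean_group_slope(rng, 200, 1.13, times)
control_slope = mean_group_slope(rng, 2000, 0.09, times)
slope_difference = cancer_slope - control_slope

print(f"cancer group slope:  {cancer_slope:.2f} per year")
print(f"control group slope: {control_slope:.2f} per year")
print(f"slope difference:    {slope_difference:.2f} per year")
```

The two-stage summary is used here only because it needs no external libraries. A mixed-effects fit with random intercepts and slopes, as described in Materials and Methods, also handles unequal numbers of examinations per woman and yields confidence intervals, which this sketch does not attempt.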