Performance of breast cancer risk prediction algorithms across mammography systems in the UK screening programme.
Authors
Affiliations (7)
- University of Cambridge, Department of Radiology, Cambridge, UK.
- Cambridge University Hospitals NHS Foundation Trust, Department of Radiology, Cambridge, UK.
- University of Cambridge, EPSRC Cambridge Mathematics of Information in Healthcare Hub, Cambridge, UK.
- Barts Health NHS Trust, Department of Radiology, London, UK.
- Norfolk and Norwich University Hospital, Department of Radiology, Norwich, UK.
- University of Cambridge, Department of Radiology, Cambridge, UK.
- Cambridge University Hospitals NHS Foundation Trust, Department of Radiology, Cambridge, UK.
Abstract
Thirty percent of interval breast cancers, diagnosed between routine screening mammograms, have a poorer prognosis than screen-detected cancers. Deep learning algorithms can estimate short-term risk from negative mammograms to guide supplemental imaging or screening intervals, but comparative validation on complete national screening data is lacking. We retrospectively evaluated four risk algorithms (Mirai, iCAD, Transpara, and Google) using 112,621 negative mammograms from two UK NHS Breast Screening Programme sites with different mammography systems (Philips, GE) over one screening round (2014-2017) with five-year follow-up, including 1225 future cancers. The algorithms showed a distinct ranking in discriminative ability: overall AUCs ranged from 0.65 to 0.72, and only one algorithm differed significantly between systems. For interval cancers, AUCs ranged from 0.67 to 0.77. Within the highest 4.0% of risk scores, the top algorithms identified ~20% of future cancers, including ~27% of interval cancers, with these proportions doubling at the 14.0% threshold. These differences highlight the need for multi-algorithm prospective trials and potential fine-tuning to improve generalisation across unseen systems.
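As an illustration of the two metrics reported above (AUC, and the proportion of future cancers falling within the highest-risk fraction of scores), the following is a minimal sketch and not the study's own analysis code. The function names `auc` and `capture_rate`, the handling of tied scores, the rounding of the threshold to a whole number of examinations, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a future-cancer case outscores a
    cancer-free case (Mann-Whitney formulation; ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def capture_rate(scores: Sequence[float], labels: Sequence[int],
                 top_fraction: float) -> float:
    """Share of all future cancers found among the examinations whose
    risk score lies in the highest `top_fraction` of scores."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive cases")
    # Assumption: flag a whole number of examinations, at least one.
    n_flagged = max(1, round(top_fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    flagged_pos = sum(y for _, y in ranked[:n_flagged])
    return flagged_pos / total_pos


# Toy data, invented for illustration only (1 = cancer during follow-up).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

print(f"AUC: {auc(toy_scores, toy_labels):.3f}")
print(f"Captured in top 20%: {capture_rate(toy_scores, toy_labels, 0.20):.3f}")
```

On real screening data, `top_fraction` would be set to 0.04 or 0.14 to mirror the thresholds quoted in the abstract, and the pairwise AUC loop would normally be replaced by a rank-based routine from a statistics library, since it scales quadratically with the number of examinations.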