High risk score of breast cancer by artificial intelligence (AI) on screening mammograms: a review of negative and cancer cases.
Authors
Affiliations (6)
- Department of Radiology and Nuclear medicine, Østfold Hospital Trust, Kalnes, Norway.
- University of Oslo, Institute of Clinical Medicine, Oslo, Norway.
- Department of Breast Cancer Screening, Cancer Registry, Norwegian Institute of Public Health, Oslo, Norway.
- Norwegian Computing Center, Oslo, Norway.
- Department of Breast Cancer Screening, Cancer Registry, Norwegian Institute of Public Health, Oslo, Norway.
- Department of Health and Care Sciences, UiT, The Arctic University of Norway, Tromsø, Norway.
Abstract
To investigate mammographic features associated with high artificial intelligence (AI) risk scores as provided by two AI models applied to screening mammograms.
This retrospective study included 130,031 screening mammograms from 42,371 women attending BreastScreen Norway, 2008-2018. Two AI models (A and B) developed for cancer detection on screening mammograms were applied. An informed radiological review was conducted for mammograms within the highest 5% of AI risk scores by both models in two study samples: (1) High AI risk score, but no breast cancer detected within 6 years (n = 120), and (2) High AI risk score in mammograms with screen-detected cancers (n = 120). Mammographic density (BI-RADS a-d), features (mass, spiculated mass, asymmetry, architectural distortion, calcification alone, and density with calcification), and radiologists' interpretation scores (1-5) were analyzed descriptively.
Mammographic density was higher in sample 1 compared to sample 2 (BI-RADS d: 11% vs 3%, respectively). In sample 1, calcifications alone were the most frequent AI-marked feature (model A: 72%; model B: 68%), predominantly with amorphous morphology and a cluster distribution, and 76% were interpreted as benign by the radiologists (interpretation score 1). In sample 2, a spiculated mass was the most frequent mammographic feature among the screen-detected cancers (29%).
Mammograms assigned high AI risk scores exhibit distinct features depending on screening outcome. Systematic characterization of these features may help refine AI thresholds, improve specificity, reduce AI false-positive findings, and decrease the recall rate in breast cancer screening.
Question: Knowledge about mammographic features associated with high AI risk scores is essential for distinguishing cancer from non-cancer cases.
Findings: Calcifications were the dominant feature in non-cancers in screening mammograms with high AI risk score, whereas spiculated mass was the most frequent feature among cancers.
Clinical relevance: Calcifications in non-cancer screening mammograms with a high AI risk score were frequently interpreted as benign or probably benign by radiologists. This knowledge may help refine AI thresholds and thereby improve specificity and reduce false-positive results in mammographic screening.
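The selection step described in the abstract (mammograms within the highest 5% of AI risk scores by both models) could be sketched roughly as follows. This is an illustration only, not the authors' code: the function names, the exam identifiers, the dictionary layout, and the handling of ties are all assumptions, and the abstract does not state how the 5% cut-off was computed in practice.

```python
from typing import Dict, Set


def top_fraction_ids(scores: Dict[str, float], fraction: float = 0.05) -> Set[str]:
    """Return the IDs whose score ranks within the highest `fraction` of all scores.

    Hypothetical helper: ties at the cut-off are broken by input order,
    which is an assumption, not something stated in the abstract.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])


def high_risk_by_both(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
    fraction: float = 0.05,
) -> Set[str]:
    """Return exams in the top `fraction` of risk scores for model A and model B."""
    # Only exams scored by both models can be compared.
    shared = scores_a.keys() & scores_b.keys()
    top_a = top_fraction_ids({k: scores_a[k] for k in shared}, fraction)
    top_b = top_fraction_ids({k: scores_b[k] for k in shared}, fraction)
    return top_a & top_b
```

Given two dictionaries mapping exam IDs to risk scores, `high_risk_by_both(scores_a, scores_b)` returns the exams that both models place in their top 5%. The study samples (n = 120 each) would then be drawn from such a set according to screening outcome, a step not shown here.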