Comparing percent breast density assessments of an AI-based method with expert reader estimates: inter-observer variability.

November 1, 2025

DOI: 10.1117/1.JMI.12.S2.S22011 PMID: 40520917

Authors

Romanov S, Howell S, Harkness E, Gareth Evans D, Astley S, Fergie M

Affiliations (2)

University of Manchester, Manchester, United Kingdom.
The Christie NHS Foundation Trust, Manchester, United Kingdom.

Abstract

Breast density estimation is an important part of breast cancer risk assessment, as mammographic density is associated with risk. However, density assessed by multiple experts can be subject to high inter-observer variability, so automated methods are increasingly used. We investigate the inter-reader variability and risk prediction for expert assessors and a deep learning approach. Screening data from a cohort of 1328 women, case-control matched, was used to compare between two expert readers and between a single reader and a deep learning model, Manchester artificial intelligence - visual analog scale (MAI-VAS). Bland-Altman analysis was used to assess the variability and matched concordance index to assess risk. Although the mean differences for the two experiments were alike, the limits of agreement between MAI-VAS and a single reader are substantially lower at +SD (standard deviation) 21 (95% CI: 19.65, 21.69) -SD 22 (95% CI: -22.71, -20.68) than between two expert readers +SD 31 (95% CI: 32.08, 29.23) -SD 29 (95% CI: -29.94, -27.09). In addition, breast cancer risk discrimination for the deep learning method and density readings from a single expert was similar, with a matched concordance of 0.628 (95% CI: 0.598, 0.658) and 0.624 (95% CI: 0.595, 0.654), respectively. The automatic method had a similar inter-view agreement to experts and maintained consistency across density quartiles. The artificial intelligence breast density assessment tool MAI-VAS has a better inter-observer agreement with a randomly selected expert reader than that between two expert readers.
Deep learning-based density methods provide consistent density scores without compromising on breast cancer risk discrimination.
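
For readers unfamiliar with the two statistics named in the abstract, the following is a minimal Python sketch of how Bland-Altman limits of agreement and a matched concordance index are commonly computed. It is not the authors' code. The function names, the 1.96 multiplier for the limits, the half-credit treatment of ties, and the input layout (one case score with a list of matched control scores) are all assumptions made for illustration; the abstract does not specify these details.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired density scores.

    Assumption: limits of agreement are bias +/- z * SD of the paired
    differences, with the sample (n - 1) standard deviation.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


def matched_concordance(matched_sets):
    """Fraction of case-control comparisons, made only within matched sets,
    in which the case has the higher score.

    matched_sets: iterable of (case_score, [control_scores]).
    Assumption: a tie between case and control counts as half concordant.
    """
    concordant = 0.0
    total = 0
    for case_score, control_scores in matched_sets:
        for control_score in control_scores:
            total += 1
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5
    if total == 0:
        raise ValueError("no case-control comparisons available")
    return concordant / total
```

In this sketch, narrower limits of agreement indicate that two sets of readings differ less from one case to the next, and a matched concordance of 0.5 corresponds to no discrimination between cases and their matched controls.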

Topics

Journal Article
