AI-based image quality assessment of positioning in mammography: considerations and challenges.
Authors
Affiliations (9)
- Medical University of Innsbruck, Innsbruck, Austria.
- Hera-MI, Saint-Herblain, France.
- Evidia, Bergen, Norway.
- Way to Women Sàrl, Le Vaud, Switzerland.
- Team Radiologie Plus, Chur, Switzerland.
- MSS Medical Software Solutions GmbH, Erlenbach, Switzerland.
- Medical University of Innsbruck, Department of Radiology, Innsbruck, Austria.
- Cancer Registry of Norway, Norwegian Institute of Public Health, Oslo, Norway.
- Medical University of Innsbruck, Department of Radiology, Innsbruck, Austria.
Abstract
Artificial intelligence (AI) could facilitate and objectify quality assessment in the daily routine. The purpose was to explore the extent to which an AI prototype algorithm is able to replicate the perfect-good-moderate-inadequate (PGMI) system. From a multicentre case collection, 200 standard mammograms (800 images) were selected. A deep learning-based prototype software was used to rate the images in analogy to the PGMI system. The AI results were compared with a reference standard obtained through consensus reading by three expert radiographers and one expert radiologist, using quadratically weighted Cohen's kappa with confidence intervals (CI) and context-based interpretation. Frequency and reasons for disagreement were evaluated for challenging cases with a discrepancy of two or more grades and a discrepancy in assigning an inadequate grade. For overall PGMI per image, slight agreement between human consensus and AI was observed for CC views (κ = 0.14) and fair agreement for MLO views (κ = 0.25). The highest agreement was observed for the CC category "M. Pectoralis visibility" (substantial, κ = 0.75). The best-performing MLO category was "Pectoralis angle" (moderate, κ = 0.49). For the other categories, fair, slight or poor agreement was observed. The work-up of disagreements gave insight into misinterpretations of anatomical landmarks and causality issues in the categorization. Transforming the PGMI system into a fully automated AI algorithm is challenging, and results may differ substantially between subcategories. Further research in computer science and quality assessment methodology is needed to pave the way for AI-based objective quality management in mammography. Thorough evaluation of AI algorithms and of their ability to replicate human interpretation, scoring, and classification is the basis and scientific framework for AI-based objective quality management in mammography.
AI has huge potential for automated assessment of diagnostic image quality. When AI ratings are compared with human consensus reading, substantial disagreement may be found. Direct transformation of perfect-good-moderate-inadequate scoring into an AI algorithm is challenging.
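The agreement statistic named in the abstract, quadratically weighted Cohen's kappa, can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names, the single-letter grade coding and the ordering P < G < M < I are assumptions made for illustration, and confidence intervals are omitted. The verbal bands follow the commonly used Landis and Koch scale, which is consistent with the labels reported above (for example κ = 0.14 as slight and κ = 0.75 as substantial), although the abstract does not state which scale was applied.

```python
import math

# Assumed ordinal coding of the PGMI grades (illustrative, not from the study).
GRADES = ["P", "G", "M", "I"]  # perfect, good, moderate, inadequate


def quadratic_weighted_kappa(reference, predicted, categories=GRADES):
    """Cohen's kappa with quadratic disagreement weights for two raters."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {grade: i for i, grade in enumerate(categories)}
    n = len(reference)

    # Confusion matrix: rows = reference grade, columns = predicted grade.
    observed = [[0] * k for _ in range(k)]
    for ref, pred in zip(reference, predicted):
        observed[index[ref]][index[pred]] += 1

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic weight: 0 on the diagonal, 1 for the largest gap.
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independent raters with the same marginals.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        # Both raters used one and the same grade throughout: kappa is undefined.
        return math.nan
    return 1.0 - weighted_observed / weighted_expected


def interpret_kappa(kappa):
    """Verbal label on the Landis and Koch scale (assumed here)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Because the weights grow with the squared distance between grades, a perfect-versus-inadequate mismatch is penalised nine times as heavily as a mismatch between adjacent grades. This matches the abstract's separate attention to discrepancies of two or more grades.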