PGMI assessment in mammography: AI software versus human readers.

Authors

Santner T, Ruppert C, Gianolini S, Stalheim JG, Frei S, Hondl M, Fröhlich V, Hofvind S, Widmann G

Affiliations (9)

  • Medical University of Innsbruck, Fritz-Pregl-Strasse 3, 6020, Innsbruck, Austria. Electronic address: [email protected].
  • b-rayZ AG, Wagistrasse 21, 8952, Schlieren, Switzerland. Electronic address: [email protected].
  • MSS Medical Software Solutions GmbH, Holzwiesstrasse 48, 8703, Erlenbach, Switzerland. Electronic address: [email protected].
  • Evidia, Kaigaten 5, 5015, Bergen, Norway. Electronic address: [email protected].
  • Way to Women Sàrl, Chemin du Pré de l'Epine 7, 1261, Le Vaud, Switzerland. Electronic address: [email protected].
  • Klinik Ottakring, Montleartstraße 37, 1160, Wien, Austria. Electronic address: [email protected].
  • University of Applied Sciences Wiener Neustadt, Johannes Gutenberg Strasse 3, 2700, Wiener Neustadt, Austria. Electronic address: [email protected].
  • Cancer Registry of Norway, Norwegian Institute of Public Health, Ullernchausseen 64, 0379, Oslo, Norway. Electronic address: [email protected].
  • Medical University of Innsbruck, Department of Radiology, Anichstrasse 35, 6020, Innsbruck, Austria. Electronic address: [email protected].

Abstract

The aim of this study was to evaluate human inter-reader agreement on the parameters included in the PGMI (perfect-good-moderate-inadequate) classification of screening mammograms and to explore the role of artificial intelligence (AI) as an alternative reader. Five radiographers from three European countries independently performed PGMI assessment of 520 anonymized mammography screening examinations, randomly selected from representative subsets from 13 imaging centres in two European countries. A dedicated AI software served as a sixth reader. Accuracy, Cohen's kappa, and confusion matrices were calculated to compare the predictions of the software against the individual assessments of the readers, as well as potential discrepancies between them. A questionnaire and a personality test were used to better understand the decision-making processes of the human readers. Significant inter-reader variability was observed among the human readers, with poor to moderate agreement (κ = -0.018 to κ = 0.41); some showed more homogeneous interpretations of single features and overall quality than others. In comparison, the software surpassed human inter-reader agreement in detecting glandular tissue cuts, mammilla deviation, pectoral muscle detection, and pectoral angle measurement, while for the remaining features and overall image quality it performed comparably to human assessment. Notably, human inter-reader disagreement in PGMI assessment of mammography is considerable. AI software may already reliably categorize image quality. Its potential for standardization and immediate feedback to achieve and monitor high quality in screening programmes deserves further attention and should be included in future approaches. AI has promising potential for automated assessment of diagnostic image quality: faster, more representative, and more objective feedback may support radiographers in their quality management processes. Direct transformation of common PGMI workflows into an AI algorithm could, however, be challenging.
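The agreement statistic used throughout the study is Cohen's kappa, which corrects raw rater agreement for the agreement expected by chance. As a minimal sketch (not the authors' code; the rater labels below are hypothetical), pairwise kappa over PGMI category labels can be computed as:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa between two raters labelling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each rater's
    marginal label frequencies.
    """
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_expected = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical PGMI ratings (P/G/M/I) from two readers
reader_1 = ["P", "G", "G", "M", "I", "G", "P", "M"]
reader_2 = ["P", "G", "M", "M", "I", "P", "G", "M"]
print(round(cohens_kappa(reader_1, reader_2), 3))  # → 0.489
```

This unweighted form treats all disagreements equally; an ordinal scale like PGMI could also be analysed with a weighted kappa that penalises P-vs-I disagreements more than P-vs-G ones.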

Topics

Journal Article
