Shortcut learning leads to sex bias in deep learning models for photoacoustic tomography.

May 9, 2025

Authors

Knopp M,Bender CJ,Holzwarth N,Li Y,Kempf J,Caranovic M,Knieling F,Lang W,Rother U,Seitel A,Maier-Hein L,Dreher KK

Affiliations (9)

  • Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany.
  • Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany.
  • Division of Intelligent Medical Systems (IMSY), German Cancer Research Center (DKFZ), Heidelberg, Germany.
  • Medical Faculty, Heidelberg University, Heidelberg, Germany.
  • Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany.
  • Department of Vascular Surgery, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany.
  • Department of Pediatrics and Adolescent Medicine, University Hospital Erlangen, FAU, Erlangen, Germany.
  • National Center for Tumor Diseases (NCT), NCT Heidelberg, a Partnership Between DKFZ and University Hospital Heidelberg, Heidelberg, Germany.
  • Faculty of Physics and Astronomy, Heidelberg University, Heidelberg, Germany.

Abstract

Shortcut learning has been identified as a source of algorithmic unfairness in medical imaging artificial intelligence (AI), but its impact on photoacoustic tomography (PAT), particularly concerning sex bias, remains underexplored. This study investigates the issue using peripheral artery disease (PAD) diagnosis as a specific clinical application. To examine the potential for sex bias due to shortcut learning in convolutional neural networks (CNNs) and to assess how such biases might affect diagnostic predictions, we created training and test datasets with varying PAD prevalence between sexes. Using these datasets, we explored (1) whether CNNs can classify sex from imaging data, (2) how sex-specific prevalence shifts impact PAD diagnosis performance and underdiagnosis disparity between sexes, and (3) how similarly CNNs encode sex and PAD features. Our study with 147 individuals demonstrates that CNNs can classify sex from calf muscle PAT images, achieving an AUROC of 0.75. For PAD diagnosis, models trained on data with imbalanced sex-specific disease prevalence experienced significant performance drops (up to 0.21 AUROC) when applied to balanced test sets. Additionally, greater imbalances in sex-specific prevalence within the training data exacerbated underdiagnosis disparities between sexes. Finally, we identify evidence of shortcut learning by demonstrating the effective reuse of learned feature representations between the PAD diagnosis and sex classification tasks. CNN-based models trained on PAT data may engage in shortcut learning by leveraging sex-related features, leading to biased and unreliable diagnostic predictions. Addressing demographic-specific prevalence imbalances and preventing shortcut learning are critical for developing models in the medical field that are both accurate and equitable across diverse patient populations.
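The dataset construction and the disparity measure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The record layout (`sex`, `pad` keys), the function names, and the toy cohort are hypothetical. The abstract does not define its underdiagnosis metric, so the sketch assumes the gap in false negative rate between sexes.

```python
import random

def subsample_by_prevalence(records, prevalence_by_sex, n_per_sex, seed=0):
    """Draw n_per_sex records per sex so the PAD fraction matches prevalence_by_sex."""
    rng = random.Random(seed)
    subset = []
    for sex, prevalence in prevalence_by_sex.items():
        n_pos = round(n_per_sex * prevalence)
        n_neg = n_per_sex - n_pos
        pos = [r for r in records if r["sex"] == sex and r["pad"] == 1]
        neg = [r for r in records if r["sex"] == sex and r["pad"] == 0]
        if len(pos) < n_pos or len(neg) < n_neg:
            raise ValueError(f"not enough records for sex={sex!r}")
        subset += rng.sample(pos, n_pos) + rng.sample(neg, n_neg)
    return subset

def false_negative_rate(y_true, y_pred):
    """Fraction of true PAD cases predicted as healthy (missed diagnoses)."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_positives:
        return float("nan")
    missed = sum(1 for p in preds_for_positives if p == 0)
    return missed / len(preds_for_positives)

def underdiagnosis_disparity(records, y_pred, group_a="F", group_b="M"):
    """FNR(group_a) - FNR(group_b); positive means group_a is missed more often.

    Assumption: the FNR gap stands in for the paper's (unstated) metric.
    """
    def fnr(sex):
        pairs = [(r["pad"], p) for r, p in zip(records, y_pred) if r["sex"] == sex]
        return false_negative_rate([t for t, _ in pairs], [p for _, p in pairs])
    return fnr(group_a) - fnr(group_b)

# Synthetic toy cohort: 40 records per (sex, label) cell; not study data.
cohort = [{"sex": s, "pad": y} for s in ("F", "M") for y in (0, 1) for _ in range(40)]

# Training split with imbalanced sex-specific PAD prevalence (75% vs. 25%).
train = subsample_by_prevalence(cohort, {"F": 0.75, "M": 0.25}, n_per_sex=40)
```

A balanced test split would use the same call with equal prevalences, for example `{"F": 0.5, "M": 0.5}`. Passing that split and a model's binary predictions to `underdiagnosis_disparity` gives a signed gap, which shows which sex is missed more often.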

Topics

Journal Article