Deceptive Bias Measurement in Deep Learning: Assessing Shortcut Reliance in TCGA Cancer Models

December 15, 2025 · medRxiv preprint

Authors

Kheiri, F., Rahnamayan, S., Makrehchi, M.

Affiliations (1)

  • Ontario Tech University (University of Ontario Institute of Technology)

Abstract

Bias in machine learning is a persistent challenge because it can create unfair outcomes, limit generalization, and reduce trust in real-world applications. A key source of this problem is shortcut learning, where models exploit signals linked to sensitive attributes, such as data source or collection site, instead of relying on task-relevant features. To tackle this, we propose the Deceptive Signal metric, a novel quantitative measure designed to assess the extent of a model's reliance on hidden shortcuts during the learning process. This metric is derived via the Deceptive Bias Detection pipeline, which isolates shortcut dependence by contrasting model behavior under two controlled conditions: (1) Full Exclusion, where a sensitive subgroup is completely removed from training; and (2) Partial Exclusion, where the model has limited access to specific classes within the subgroup. By calculating the behavioral shift between these settings, the Deceptive Signal metric provides a concrete value representing the model's proneness to learning task-irrelevant patterns. In experiments with the TCGA histopathology dataset, our metric successfully quantified strong dependencies on center-specific artifacts in models trained for cancer classification.

Author summary

Deep learning models are becoming powerful tools in healthcare, but they often suffer from a critical vulnerability: they can get the right answer for the wrong reason. In medical imaging, an AI might correctly identify a tumor not by analyzing the tissue, but by recognizing irrelevant digital markers unique to the specific hospital or scanner that produced the image. This phenomenon, known as shortcut learning, makes AI systems appear accurate at first glance while remaining unreliable for real-world patient care. To solve this, our research moves beyond simple accuracy checks and introduces a specific quantitative metric for shortcut learning. We developed a testing framework that forces the model into controlled training scenarios, deliberately withholding specific "shortcut" information to see how the model reacts. By mathematically comparing the model's behavior across these scenarios, we calculate a precise score that indicates the magnitude of the model's dependence on irrelevant patterns. This metric allows us to put a concrete number on a model's trustworthiness and helps ensure that medical decisions are driven by biology, not background noise.
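To make the Full Exclusion / Partial Exclusion contrast concrete, the sketch below illustrates one plausible way such a behavioral shift could be scored. The abstract does not give the exact formula for the Deceptive Signal metric, so this is a minimal illustration, assuming the shift is measured as the difference in accuracy on a held-out sensitive subgroup between the two training regimes; all variable names and values are hypothetical.

```python
import numpy as np

# Hypothetical predictions on slides from one held-out center, produced by
# two models trained under the paper's contrasting regimes (illustrative data):
labels        = np.array([0, 1, 1, 0, 1, 0])  # true cancer-type labels
preds_full    = np.array([0, 1, 0, 0, 1, 1])  # model trained with Full Exclusion
preds_partial = np.array([0, 1, 1, 0, 1, 0])  # model trained with Partial Exclusion

acc_full    = np.mean(preds_full == labels)     # subgroup accuracy, Full Exclusion
acc_partial = np.mean(preds_partial == labels)  # subgroup accuracy, Partial Exclusion

# One plausible scoring of the behavioral shift between regimes; the paper's
# actual Deceptive Signal formula may differ. A large shift suggests the model
# exploits subgroup-specific shortcuts (e.g., center artifacts) rather than
# task-relevant tissue features.
deceptive_signal = abs(acc_partial - acc_full)
print(f"Deceptive Signal (illustrative): {deceptive_signal:.2f}")
```

Under this reading, a score near zero means the model behaves the same whether or not it ever saw the subgroup's classes, while a large score indicates its predictions depend on subgroup-linked cues.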

Topics

pathology
