Counterfactual Reasoning for Mammogram Classification via Semantic Texture Masking.

April 28, 2026


DOI: 10.1007/s10278-026-01969-1 PMID: 42050080

Authors

Arora R, Lee J

Affiliations (4)

Department of Computer Science and Engineering, Manipal University Jaipur, Jaipur, Rajasthan, 303007, India.
Department of Radiology, University of Pittsburgh, Pittsburgh, PA, 15213, USA.
Department of Radiology, University of Pittsburgh, Pittsburgh, PA, 15213, USA.
Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, PA, 15213, USA.

Abstract

Artificial intelligence-based computer-aided diagnosis (CADx) systems have seen growing adoption in mammography, yet the limited interpretability of their decision-making processes remains a barrier to clinical trust. The present study investigated whether deep learning classifiers rely primarily on the characteristics of lesions or on the surrounding breast tissue, using counterfactual reasoning in the form of semantic masking of mammogram texture. We modified mammograms by selectively removing texture information from lesion (foreground, FG) or non-lesion (background, BG) regions and replacing it with the mean image intensity, resulting in four scenarios involving benign and malignant foreground or background alterations. MobileNet, ResNet50, and ResNet50v2 were trained and evaluated on the CBIS-DDSM dataset; the area under the ROC curve (AUC) was used to assess classification performance. All models performed similarly (AUCs = 0.74, 0.72, and 0.78; pairwise p-value > 0.05) on the original unaltered test set. Performance differed dramatically under the four masking scenarios: ResNet50 fell well below chance level (AUC = 0.20, p-value < 0.0001) when malignant background information was removed, suggesting strong dependence on background context and difficulty focusing on subtle lesion features, whereas ResNet50v2 was more robust to the same change, although its performance was still severely degraded (AUC = 0.53, p-value < 0.0001), suggesting better preservation of lesion-level information. MobileNet was relatively stable across all masking scenarios, indicating robustness to region-specific changes. Understanding such region-specific dependencies can enhance model interpretability and support the development of more robust and reliable CADx systems for clinical use.
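
The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `mask_texture`, the `region` argument, and the toy 4x4 image are hypothetical. The sketch also assumes that "mean image intensity" means the mean over the whole image, which the abstract does not specify.

```python
import numpy as np


def mask_texture(image, lesion_mask, region="FG"):
    """Return a copy of `image` with one region flattened to a constant.

    region="FG" removes texture inside the lesion mask;
    region="BG" removes texture everywhere outside it.
    Assumption: the fill value is the mean intensity of the whole image.
    """
    image = np.asarray(image, dtype=np.float64)
    lesion_mask = np.asarray(lesion_mask, dtype=bool)
    if image.shape != lesion_mask.shape:
        raise ValueError("image and lesion_mask must have the same shape")

    if region == "FG":
        target = lesion_mask
    elif region == "BG":
        target = ~lesion_mask
    else:
        raise ValueError("region must be 'FG' or 'BG'")

    masked = image.copy()
    masked[target] = image.mean()  # replace texture with a flat intensity
    return masked


# Toy example: a 4x4 "mammogram" with a 2x2 lesion in the centre.
image = np.arange(16, dtype=np.float64).reshape(4, 4)
lesion = np.zeros((4, 4), dtype=bool)
lesion[1:3, 1:3] = True

fg_masked = mask_texture(image, lesion, region="FG")  # lesion texture removed
bg_masked = mask_texture(image, lesion, region="BG")  # surrounding texture removed
```

Applying both variants separately to the benign and the malignant test cases would give the four scenarios mentioned above. A classifier's AUC can then be compared between each masked set and the unaltered test set.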


Topics

Journal Article
