Organ masks applied in feature space improve weakly supervised scan-level CT classification.
Authors
Affiliations (3)
- Diagnostic Image Analysis Group, Radboud University Medical Center, Nijmegen, 6525, The Netherlands. [email protected].
- Digital Technology and Innovation, Siemens Healthineers, Princeton, NJ, 08540, USA.
- Diagnostic Image Analysis Group, Radboud University Medical Center, Nijmegen, 6525, The Netherlands.
Abstract
Automated analysis of computed tomography (CT) scans is an active area of medical imaging research. Classification tasks often rely on scan-level labels, which provide no spatial information. When frozen encoders from pretrained models, such as foundation models, are used, enforcing anatomical focus by cropping in input space can shift the input distribution and degrade performance. We investigated whether organ masks can provide scalable anatomical guidance for weakly supervised classification with a frozen self-supervised Swin Transformer. Two strategies were evaluated: input-space organ centering, and feature-space cropping, which applies organ masks to intermediate feature maps before pooling. Across three datasets and seven binary tasks, feature-space cropping matched or improved performance relative to full-image baselines, whereas input-space centering showed task-dependent effects. Feature-space cropping achieved a pooled improvement of 0.018 (95% confidence interval: 0.009-0.028) in area under the receiver operating characteristic curve, with the largest gains for liver lesion and pericardial effusion classification. Feature-space cropping also reduced embedding dimensionality without loss of performance while preserving the input distribution. In an ablation experiment on input-space centering, tighter crops reduced performance, highlighting the importance of peri-organ context. These findings demonstrate that feature-level anatomical guidance offers an efficient strategy to improve weakly supervised CT classification without retraining the encoder.
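The idea of applying an organ mask to intermediate feature maps before pooling can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the (C, d, h, w) feature layout, the rule that a feature cell is kept if any voxel inside it belongs to the organ, and the use of masked average pooling are all assumptions made for illustration. The abstract does not specify how the mask is resampled or which pooling operator is used.

```python
import numpy as np


def downsample_mask(mask, grid_shape):
    """Reduce a boolean voxel mask (D, H, W) to the feature grid (d, h, w).

    Assumption: a feature cell is kept if ANY voxel inside it is organ.
    Requires each image dimension to be a multiple of the grid dimension.
    """
    d, h, w = grid_shape
    D, H, W = mask.shape
    if D % d or H % h or W % w:
        raise ValueError("mask shape must be a multiple of the feature grid shape")
    blocks = mask.reshape(d, D // d, h, H // h, w, W // w)
    return blocks.any(axis=(1, 3, 5))


def masked_average_pool(features, mask):
    """Average a feature map (C, d, h, w) over organ cells only.

    The input image is never cropped, so the frozen encoder sees the
    full scan; only the pooling step is restricted to the organ.
    """
    grid_mask = downsample_mask(np.asarray(mask, dtype=bool), features.shape[1:])
    if not grid_mask.any():
        # Hypothetical fallback: empty mask, pool over the whole feature map.
        return features.mean(axis=(1, 2, 3))
    # Boolean indexing over the three spatial axes yields shape (C, n_cells).
    return features[:, grid_mask].mean(axis=1)


# Toy example: 2 channels on a 2x2x2 feature grid, 4x4x4 voxel mask.
features = np.zeros((2, 2, 2, 2))
features[:, 0, 0, 0] = [1.0, 3.0]
organ_mask = np.zeros((4, 4, 4), dtype=bool)
organ_mask[0, 0, 0] = True  # organ occupies a single voxel in the first cell

organ_embedding = masked_average_pool(features, organ_mask)
full_embedding = features.mean(axis=(1, 2, 3))
```

In this toy case the masked embedding keeps the signal of the single organ cell, whereas full-image average pooling dilutes it across all eight cells. A bounding-box crop of the feature map followed by flattening would be an alternative reading of "cropping" and is equally an assumption here.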