InfoOOD: information bottleneck optimization for post hoc medical image out-of-distribution detection.
Authors
Affiliations (4)
- Department of Medical Physics, University of Wisconsin-Madison, 1111 Highland Ave, Madison, Wisconsin, 53705, UNITED STATES.
- Faculty of Mathematics and Physics, University of Ljubljana, Jadranska ulica 19, Ljubljana, Ljubljana, 1000, SLOVENIA.
- Department of Medical Physics, University of Wisconsin-Madison, 1111 Highland Ave, Madison, Wisconsin, 53706-1314, UNITED STATES.
- Medical Physics, University of Wisconsin-Madison, 1111 Highland Ave, Madison, Wisconsin, 53705, UNITED STATES.
Abstract
Objective. Deep learning models are prone to failure when applied to out-of-distribution (OOD) data, i.e., data whose features fundamentally differ from those in the training set. Existing OOD measures often lack sensitivity to the subtle image variations encountered in clinical settings. In this work, we investigate a post hoc, information-based approach to OOD detection, termed InfoOOD, which iteratively quantifies the amount of embedded feature information that can be shared between the training data and test data without degrading the model output.
Approach. Abdominal CT images from patients with metastatic liver lesions were used. A 3D U-Net was trained to segment the liver and lesions using N=157 images. Physics-based artifacts (low dose, sparse view angles, and rings) were simulated on a separate set of N=40 test images at three intensity magnitudes. Segmentation performance and the ability of the InfoOOD measure to detect the artifact-induced OOD data were evaluated. An additional N=131 test images were used to assess the correlation between the InfoOOD measure and segmentation model performance metrics. In all evaluations, InfoOOD was compared with established embedded feature-based and reconstruction-based OOD detection methods.
Results. Artifact simulation significantly degraded segmentation model performance across all artifact types and magnitudes (p<0.001), with model performance worsening as artifact magnitude increased. The InfoOOD measure consistently outperformed the embedded feature-based measures in detecting OOD data (e.g., AUC=0.93 vs. AUC=0.57 for the strong rings artifact) and surpassed the reconstruction-based measure for weak-magnitude artifacts (e.g., AUC=0.75 vs. AUC=0.61 for the weak sparse view artifact). The InfoOOD measure also achieved stronger negative correlations with segmentation performance metrics (e.g., ρ=-0.52 vs. ρ≥-0.11 for the lesion sensitivity metric). In both assessments, InfoOOD measure performance increased considerably with the number of information bottleneck optimization iterations.
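The two evaluation quantities reported above, a detection AUC and a correlation ρ, can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that a higher OOD score means "more out-of-distribution" and that ρ is a rank (Spearman) correlation, which the abstract does not state. The function names `auroc`, `average_ranks`, and `spearman`, and all numeric values, are hypothetical.

```python
import math


def auroc(in_dist_scores, ood_scores):
    """AUC as the probability that an OOD image scores higher than an
    in-distribution image (ties count as half a win)."""
    wins = 0.0
    for o in ood_scores:
        for i in in_dist_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(in_dist_scores) * len(ood_scores))


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values, not data from the study.
clean_scores = [0.10, 0.15, 0.22, 0.30]      # OOD score on unaltered images
artifact_scores = [0.25, 0.40, 0.55, 0.70]   # OOD score on artifact images
lesion_sensitivity = [0.9, 0.8, 0.85, 0.6]   # per-image segmentation metric
ood_scores = [0.1, 0.3, 0.2, 0.7]            # OOD score for the same images

print("AUC:", auroc(clean_scores, artifact_scores))
print("rho:", spearman(ood_scores, lesion_sensitivity))
```

Under these assumptions, an AUC near 0.5 indicates a measure that cannot separate artifact images from clean ones, and a strongly negative ρ indicates that higher OOD scores track lower segmentation performance.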
Significance. This work introduces and validates a novel, highly sensitive, and clinically relevant information-theoretic approach for medical image OOD detection, supporting the safe deployment of deep learning models in clinical settings.