Robust Uncertainty-Informed Glaucoma Classification Under Data Shift.
Authors
Affiliations (4)
Affiliations (4)
- Department of Biomedical Engineering, University of Illinois Chicago, Chicago, IL, USA.
- Artificial Intelligence in Ophthalmology (Ai-O) Center, University of Illinois Chicago, Chicago, IL, USA.
- Illinois Eye and Ear Infirmary, Department of Ophthalmology and Visual Sciences, University of Illinois Chicago, Chicago, IL, USA.
- Duke Eye Center, Department of Ophthalmology, Duke University, Durham, NC, USA.
Abstract
Standard deep learning (DL) models often suffer significant performance degradation on out-of-distribution (OOD) data, where test data differs from training data, a common challenge in medical imaging due to real-world variations. We propose a unified self-censorship framework as an alternative to the standard DL models for glaucoma classification using deep evidential uncertainty quantification. Our approach detects OOD samples at both the dataset and image levels. Dataset-level self-censorship enables users to accept or reject predictions for an entire new dataset based on model uncertainty, whereas image-level self-censorship refrains from making predictions on individual OOD images rather than risking incorrect classifications. We validated our approach across diverse datasets. Our dataset-level self-censorship method outperforms the standard DL model in OOD detection, achieving an average 11.93% higher area under the curve (AUC) across 14 OOD datasets. Similarly, our image-level self-censorship model improves glaucoma classification accuracy by an average of 17.22% across 4 external glaucoma datasets against baselines while censoring 28.25% more data. Our approach addresses the challenge of generalization in standard DL models for glaucoma classification across diverse datasets by selectively withholding predictions when the model is uncertain. This method reduces misclassification errors compared to state-of-the-art baselines, particularly for OOD cases. This study introduces a tunable framework that explores the trade-off between prediction accuracy and data retention in glaucoma prediction. By managing uncertainty in model outputs, the approach lays a foundation for future decision support tools aimed at improving the reliability of automated glaucoma diagnosis.