Glaucoma Detection Using Deep Learning and Prompt-Based Explainable Report Generation
Authors
Affiliations (1)
- Lakehead University
Abstract
Glaucoma is a leading cause of irreversible blindness, and early detection is required to prevent vision loss. This study proposes a novel framework for automated glaucoma detection from fundus images that integrates deep learning with explainable artificial intelligence (XAI). We unify five public datasets (RIM-ONE, ACRIMA, DRISHTI-GS, REFUGE, and EyePACS) into a single diverse dataset to improve model generalizability. An ensemble of five deep learning models, comprising three convolutional neural networks (ResNet50, EfficientNet-B0, DenseNet121) and two transformer-based models (Vision Transformer, Swin Transformer), is trained for robust classification. Grad-CAM and attention rollout visualizations provide insight into model decision making by highlighting critical regions such as the optic disc and cup. These visualizations, together with the ensemble predictions, are passed to Google Gemini 1.5 Flash to generate clinician-style diagnostic reports. The ensemble achieves a test accuracy of 95.38% and an AUC of 0.99, outperforming each individual model. The framework improves diagnostic accuracy and interpretability, narrowing the gap between AI predictions and clinical utility, with potential for future integration into real-world ophthalmic workflows.
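The abstract does not state how the five model outputs are fused, so the following is only a minimal sketch of one common choice, soft voting, in which each model's predicted glaucoma probability is averaged and the mean is thresholded. The function name `soft_vote`, the short model keys, the 0.5 threshold, and the example probabilities are all illustrative assumptions, not values or code from the study.

```python
from statistics import mean

# Keys for the five ensemble members named in the abstract.
# The short names "ViT" and "Swin" are illustrative abbreviations.
MODEL_NAMES = ["ResNet50", "EfficientNet-B0", "DenseNet121", "ViT", "Swin"]


def soft_vote(probabilities, threshold=0.5):
    """Average per-model glaucoma probabilities and threshold the mean.

    `probabilities` maps each model name to its predicted probability
    that the fundus image shows glaucoma. The 0.5 threshold is an
    assumption; the study may use a different fusion rule or cutoff.
    """
    missing = [name for name in MODEL_NAMES if name not in probabilities]
    if missing:
        raise ValueError(f"missing model outputs: {missing}")
    score = mean(probabilities[name] for name in MODEL_NAMES)
    label = "glaucoma" if score >= threshold else "normal"
    return score, label


# Hypothetical per-model probabilities for a single image.
example = {
    "ResNet50": 0.9,
    "EfficientNet-B0": 0.8,
    "DenseNet121": 0.7,
    "ViT": 0.6,
    "Swin": 0.5,
}
score, label = soft_vote(example)
print(f"ensemble score = {score:.2f}, prediction = {label}")
```

Averaging probabilities rather than counting hard votes keeps each model's confidence in the final score, which is one plausible reason an ensemble can outperform its individual members; weighted averaging or majority voting would be equally reasonable alternatives under the same description.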