Multi-scale self-supervised learning for deep knowledge transfer in diabetic retinopathy grading.
Authors
Affiliations (5)
- Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia. [email protected].
- Department of Computer Science, Imam Abdulrahman Bin Faisal University, Dammam, 31441, Saudi Arabia. [email protected].
- Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia.
- SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia.
- Interdisciplinary Research Center of Intelligent Secure Systems (IRC-ISS), King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia.
Abstract
Diabetic retinopathy (DR) is a leading cause of vision loss, making early, accurate detection essential. Automated deep learning models show promise but struggle with the complexity of retinal images and limited labeled data. Due to domain differences, traditional transfer learning from datasets such as ImageNet often fails in medical imaging. Self-supervised learning (SSL) offers a solution by enabling models to learn directly from medical data, but its success depends on the backbone architecture: Convolutional Neural Networks (CNNs) focus on local features, which can be limiting. To address this, we propose the Multi-scale Self-Supervised Learning (MsSSL) model, which combines Vision Transformers (ViTs) for global context with CNNs equipped with a Feature Pyramid Network (FPN) for multi-scale feature extraction. These features are refined through a Deep Learner module, improving spatial resolution and capturing both high-level and fine-grained information. The MsSSL model significantly improves DR grading, outperforming traditional methods, and underscores the value of domain-specific pretraining and advanced model integration in medical imaging.
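The abstract does not specify how the Deep Learner module fuses the ViT global representation with the CNN-FPN pyramid, so the following is only an illustrative NumPy sketch of the general idea: upsample each pyramid level to a common resolution, stack them channel-wise, and broadcast a global embedding over every spatial location. All shapes, names, and the nearest-neighbor upsampling are assumptions for illustration, not the paper's actual method.

```python
import numpy as np

def upsample_nearest(fmap, factor):
    # fmap: (C, H, W) -> (C, H*factor, W*factor), nearest-neighbor
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_multiscale(vit_global, fpn_maps, target_hw=28):
    """Toy stand-in for a multi-scale fusion step (hypothetical):
    bring every FPN level to a common resolution, concatenate along
    channels, and tile the ViT global embedding over all locations."""
    ups = []
    for fmap in fpn_maps:
        factor = target_hw // fmap.shape[1]
        ups.append(upsample_nearest(fmap, factor))
    local = np.concatenate(ups, axis=0)  # (sum of C_i, H, W)
    # broadcast the 1-D global embedding to a (D, H, W) map
    global_map = np.broadcast_to(
        vit_global[:, None, None],
        (vit_global.shape[0], target_hw, target_hw))
    return np.concatenate([local, global_map], axis=0)

# toy inputs: three 8-channel FPN levels (7x7, 14x14, 28x28)
# and a 16-dim global embedding
fpn = [np.ones((8, 7, 7)), np.ones((8, 14, 14)), np.ones((8, 28, 28))]
fused = fuse_multiscale(np.ones(16), fpn)
print(fused.shape)  # (40, 28, 28): 8+8+8 local channels + 16 global
```

In a real model the fusion would be learned (e.g. convolutions over the concatenated map) rather than a plain concatenation, but the shape bookkeeping above captures why multi-scale pyramids and a global token can be combined into one dense feature map.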