Deep learning based eye disease classification using Optical Coherence Tomography (OCT) images.
Authors
Affiliations (5)
- School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan.
- School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan; Kohsar University Murree, Murree, Pakistan. Electronic address: [email protected].
- Department of Physiology, Ziauddin Medical College, Ziauddin University, Karachi, Pakistan.
- Computer and Communication Engineering for Capacity Building Research Center, School of Applied Digital Technology, Mae Fah Luang University, Chiang Rai, Thailand.
- Pakistan Institute of Ophthalmology, Rawalpindi, Punjab, Pakistan.
Abstract
Optical Coherence Tomography (OCT) imaging plays a crucial role in the early diagnosis of ocular diseases. This study aims to develop and evaluate deep learning models for multi-class classification of retinal diseases using OCT images. Four Convolutional Neural Network (CNN) architectures (VGG16, VGG19, ResNet50, and InceptionV3) were evaluated under three experimental settings: (i) without preprocessing or data augmentation, (ii) with preprocessing and data augmentation, and (iii) with class imbalance handling using latent feature space SMOTE. The dataset comprised 7314 OCT images collected from a single clinical center (Shifa Eye Foundation Hospital, Haripur, Pakistan), categorized into four classes: Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), Drusen, and Normal. Model performance was assessed using accuracy, precision, recall, F1 score, Cohen's Kappa, Matthews Correlation Coefficient (MCC), and ROC AUC. External validation was also performed using an independent dataset to evaluate generalizability. Among the evaluated models, the VGG-based architectures demonstrated the most consistent performance. On the internal dataset, VGG19 achieved an accuracy of 97.27% without SMOTE, which improved slightly to 97.54% after applying latent feature space SMOTE, with corresponding precision, recall, and F1 scores above 97%. External validation showed lower but still competitive performance, with VGG19 achieving an accuracy of 96.07%, indicating somewhat reduced generalizability across different data sources. In contrast, the deeper architectures, ResNet50 and InceptionV3, showed comparatively weaker and less stable performance, particularly on external data. The findings demonstrate the potential of deep learning for automated retinal disease classification using OCT images, particularly when combined with appropriate preprocessing and class imbalance handling.
However, the high internal performance should be interpreted in the context of a curated single-center dataset. The observed reduction in performance on external validation underscores the importance of dataset diversity and the need for further multi-center evaluation before real-world clinical deployment.
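The third experimental setting applies SMOTE to latent features rather than to raw images. The abstract does not say how this was implemented, so the following is only a minimal sketch of the standard SMOTE interpolation step. It assumes each image has already been mapped to an embedding vector (for example, from a CNN's penultimate layer). The function name `smote_latent`, the parameter `k`, and the toy two-dimensional "embeddings" are hypothetical and are not taken from the study.

```python
import math
import random
from collections import Counter


def smote_latent(features, labels, k=5, seed=0):
    """Oversample minority classes by interpolating between latent vectors.

    features: list of equal-length float lists (assumed CNN embeddings).
    labels:   list of class labels, one per feature vector.
    Returns (features, labels) in which every class with at least two
    samples has been grown to the size of the largest class.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x = [list(v) for v in features]
    out_y = list(labels)

    for cls, n in counts.items():
        members = [v for v, y in zip(features, labels) if y == cls]
        if n == target or len(members) < 2:
            continue  # already balanced, or no neighbour to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the base point (excluding itself)
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            mate = rng.choice(neighbours)
            lam = rng.random()  # interpolation weight in [0, 1)
            # Synthetic point on the segment between base and its neighbour
            out_x.append([b + lam * (m - b) for b, m in zip(base, mate)])
            out_y.append(cls)
    return out_x, out_y


# Toy 2-D "embeddings" (illustrative values only): 6 Normal vs 2 Drusen
feats = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05],
         [0.02, 0.08], [1.0, 1.0], [1.2, 1.4]]
labs = ["Normal"] * 6 + ["Drusen"] * 2

bal_x, bal_y = smote_latent(feats, labs, k=5)
print(Counter(bal_y))
```

In a pipeline of this kind, a classifier head would typically be trained on the balanced embeddings, with oversampling applied to the training split only so that synthetic points never leak into validation or test data. A maintained implementation such as `SMOTE` from the imbalanced-learn package could be used in place of this hand-written version.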