Contextual anatomy-guided deep learning for accurate fovea segmentation in diabetic retinopathy fundus images.
Authors
Affiliations (4)
- College of Digital Science, Prince of Songkla University, Songkhla, 90110, Thailand.
- Division of Computational Science, Faculty of Science, Prince of Songkla University, Songkhla, 90110, Thailand.
- Department of Ophthalmology, Faculty of Medicine, Prince of Songkla University, Songkhla, 90110, Thailand.
- Division of Computational Science, Faculty of Science, Prince of Songkla University, Songkhla, 90110, Thailand. [email protected].
Abstract
Accurate fovea segmentation in fundus images is a critical step in diabetic retinopathy screening, yet it remains challenging because the boundaries of the fovea are indistinct. Beyond simple localization, precise segmentation offers essential clinical value for Diabetic Macular Edema (DME) management: treatment decisions, specifically the choice between intravitreal anti-VEGF injection for center-involved DME and laser therapy for extrafoveal edema, depend on accurate delineation of the foveal region. While existing methods often rely on increasing model architectural complexity, the potential of anatomical context within the training process remains under-explored. This paper presents a data-centric approach that leverages contextual information to robustly identify the fovea. We demonstrate that progressively incorporating key anatomical landmarks (the optic disc, retina, and blood vessels) into training labels significantly enhances fovea detection. To facilitate this, we developed IDRiD-RETA-FV, a meticulously annotated dataset comprising 81 images (54 training, 27 testing) with complete anatomical structures (inter-observer F1 = 0.98), and we introduce MNv4Fovea, a framework designed to explicitly exploit these anatomical interdependencies through a multi-class constraint mechanism. Evaluation on the held-out test set with verified ground truth demonstrates excellent segmentation performance (fovea IoU = 0.812, F1 = 0.894, Average Euclidean Distance (AED) = 4.06 pixels). Our GEV-based augmentation technique further demonstrates the efficacy of the synthesis strategy, achieving a detection rate of 98.4% compared with 59.0% for baseline geometric augmentation (paired t-test: t = 8.536, p < 0.001, Cohen's d = 1.093).
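The abstract reports three standard segmentation metrics: Intersection-over-Union (IoU), F1 (Dice) score, and Average Euclidean Distance (AED) between predicted and ground-truth fovea centers. The sketch below shows how these metrics are conventionally computed; it is a minimal illustration of the definitions, not the authors' evaluation code, and the mask/coordinate representations are assumptions.

```python
import math

def iou(pred, gt):
    """Intersection-over-Union of two binary masks (flat sequences of 0/1)."""
    inter = sum(p & g for p, g in zip(pred, gt))
    union = sum(p | g for p, g in zip(pred, gt))
    return inter / union if union else 0.0

def f1(pred, gt):
    """F1 (Dice) score: 2|A∩B| / (|A| + |B|) for binary masks."""
    inter = sum(p & g for p, g in zip(pred, gt))
    total = sum(pred) + sum(gt)
    return 2 * inter / total if total else 0.0

def aed(pred_centers, gt_centers):
    """Average Euclidean Distance (pixels) between paired (x, y) centers."""
    dists = [math.hypot(px - gx, py - gy)
             for (px, py), (gx, gy) in zip(pred_centers, gt_centers)]
    return sum(dists) / len(dists)
```

For example, masks `[1, 1, 0, 0]` and `[1, 0, 1, 0]` overlap in one pixel out of three in their union, giving IoU = 1/3 and F1 = 0.5.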
Cross-dataset evaluation on REFUGE, MESSIDOR, and ARIA demonstrates competitive localization performance, achieving state-of-the-art Average Euclidean Distance on REFUGE (22.46 ± 18.73 pixels) and MESSIDOR (6.52 ± 5.89 pixels) with robust generalization across diverse imaging protocols. These results establish that explicit anatomical context, rather than mere model complexity, is key to accurate fovea segmentation, offering a robust paradigm for medical image analysis.
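The augmentation comparison above is supported by a paired t-test and Cohen's d. For paired samples, both statistics are computed from the per-pair differences: t = mean(diff) / (sd(diff) / sqrt(n)) and d = mean(diff) / sd(diff). The following sketch illustrates those formulas; it is a generic textbook implementation, not the authors' analysis script.

```python
import math

def paired_t_and_cohens_d(a, b):
    """Paired t statistic and Cohen's d for two matched samples.

    Both are based on the per-pair differences; d uses the standard
    deviation of the differences as the standardizer.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((x - mean) ** 2 for x in diffs) / (n - 1)  # sample variance
    sd = math.sqrt(var)
    t = mean / (sd / math.sqrt(n))
    d = mean / sd
    return t, d
```

With matched scores a = [2, 4, 6] and b = [1, 2, 3], the differences are [1, 2, 3], so d = 2.0 and t = 2·sqrt(3).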