CMVFT: A Multi-Scale Attention Guided Framework for Enhanced Keratoconus Suspect Classification in Multi-View Corneal Topography.
Authors
Affiliations (5)
Affiliations (5)
- School of Electrical Engineering, Shanghai Dianji University, Shanghai, China. Electronic address: [email protected].
- School of Electrical Engineering, Shanghai Dianji University, Shanghai, China. Electronic address: [email protected].
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, China. Electronic address: [email protected].
- Jincheng People's Hospital, Jincheng, China. Electronic address: [email protected].
- School of Electrical Engineering, Shanghai Dianji University, Shanghai, China. Electronic address: [email protected].
Abstract
Retrospective cross-sectional study. To develop a multi-view fusion framework that effectively identifies suspect keratoconus cases and facilitates the possibility of early clinical intervention. A total of 573 corneal topography maps representing eyes classified as normal, suspect, or keratoconus. We designed the Corneal Multi-View Fusion Transformer (CMVFT), which integrates features from seven standard corneal topography maps. A pretrained ResNet-50 extracts single-view representations that are further refined by a custom-designed Multi-Scale Attention Module (MSAM). This integrated design specifically compensates for the representation gap commonly encountered when applying Transformers to small-sample corneal topography datasets by dynamically bridging local convolution-based feature extraction with global self-attention mechanisms. A subsequent fusion Transformer then models long-range dependencies across views for comprehensive multi-view feature integration. The primary measure was the framework's ability to differentiate suspect cases from normal and keratoconus cases, thereby creating a pathway for early clinical intervention. Experimental evaluation demonstrated that CMVFT effectively distinguishes suspect cases within a feature space characterized by overlapping attributes. Ablation studies confirmed that both the MSAM and the fusion Transformer are essential for robust multi-view feature integration, successfully compensating for potential representation shortcomings in small datasets. This study is the first to apply a Transformer-driven multi-view fusion approach in corneal topography analysis. By compensating for the representation gap inherent in small-sample settings, CMVFT shows promise in enabling the identification of suspect keratoconus cases and supporting early intervention strategies, with prospective implications for early clinical intervention.