Efficient fusion transformer model for accurate classification of eye diseases.
Authors
Affiliations (1)
- Beijing Jiaotong University Weihai Campus, 264401, Weihai, China.
Abstract
Deep learning models for automatic medical image diagnosis can improve diagnostic efficiency and reduce diagnostic cost. However, few artificial intelligence models have been designed specifically around the characteristics of fundus disease images. Because fundus diseases exhibit both local and global features, this paper proposes a novel deep learning model, the Local-Global Scale Fusion Network (LGSF-Net). Its novelty lies in a dual-stream design that processes global context (Transformer) and local details (CNN) in parallel and combines them through residual fusion. On a public fundus dataset, LGSF-Net achieves 96% accuracy with only 18.7K parameters and 0.93 GFLOPs, outperforming state-of-the-art general-purpose models such as ResNet50 and ViT. Its accuracy and lightweight design make LGSF-Net well suited to clinical diagnosis. An ablation study indicates that the multi-scale fusion design works as intended. This work contributes to the development of smart medicine and offers a new approach to the design of deep learning models.
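The dual-stream idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no layer definitions, so the 1D signal, the fixed smoothing kernel, the scalar self-attention, and the function names `local_branch`, `global_branch`, and `fuse` are all assumptions chosen only to show a local (convolution-like) stream and a global (attention-like) stream running in parallel and being combined by a residual sum.

```python
import math

def local_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Local stream: 1D convolution with zero padding (stand-in for a CNN layer)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(x):  # positions outside the signal count as zero
                acc += w * x[j]
        out.append(acc)
    return out

def global_branch(x):
    """Global stream: scalar self-attention over all positions (stand-in for a Transformer layer)."""
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out

def fuse(x):
    """Residual fusion (assumed form): input plus the outputs of both parallel streams."""
    loc = local_branch(x)
    glo = global_branch(x)
    return [xi + li + gi for xi, li, gi in zip(x, loc, glo)]

signal = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0]
fused = fuse(signal)
```

In this sketch each output position of the local stream depends only on its immediate neighbours, while each output position of the global stream depends on every input position. The element-wise sum is one plausible reading of "residual fusion"; the actual model may combine the streams differently.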