Privacy-aware diabetic retinopathy grading and visual lesion-focused interpretability through mixture-of-experts federated deep learning with explainable AI.

June 23, 2026

Authors

Tashrif MTA,Kundu D,Bithee MMA,Rahman A,Farid FA,Karim HA,Miah ASM

Affiliations (8)

  • Department of CSE, National Institute of Textile Engineering and Research (NITER), Constituent Institute of the University of Dhaka, Savar, Dhaka, 1350, Bangladesh.
  • Department of CSE, National Institute of Textile Engineering and Research (NITER), Constituent Institute of the University of Dhaka, Savar, Dhaka, 1350, Bangladesh. [email protected].
  • School of Computing, Georgia Southern University, Statesboro, GA, 30458, USA. [email protected].
  • Faculty of Computer Science and Informatics, Berlin School of Business and Innovation, Karl-Marx-Straße 97-99, 12043, Berlin, Germany. [email protected].
  • Centre for Image and Vision Computing (CIVC), Multimedia University, 63100, Cyberjaya, Malaysia. [email protected].
  • Centre for Image and Vision Computing (CIVC), Multimedia University, 63100, Cyberjaya, Malaysia. [email protected].
  • Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, 6205, Bangladesh.
  • Universiti Kuala Lumpur, Malaysian Institute of Information Technology (UniKL MIIT), 1016, Jalan Sultan Ismail, 50250, Kuala Lumpur, Malaysia.

Abstract

Diabetic Retinopathy (DR) remains a major cause of preventable vision loss, which creates a need for automated screening systems that can operate across institutions without centralizing sensitive medical data. Although Federated Learning (FL) enables collaborative model training while preserving data locality, non-IID data distributions, communication overhead, and unstable convergence frequently limit its effectiveness in medical imaging. To address these challenges, this paper proposes a federated Mixture-of-Experts (FL-MoE) framework for DR classification that combines interpretable deep learning with expert specialization. Using the EyePACS and APTOS-2019 retinal fundus datasets, the paper evaluates multiple backbone architectures within the FL-MoE framework: a Convolutional Neural Network (CNN), a hybrid CNN-LSTM model, and a Vision Transformer (ViT). FL-MoE improves performance under heterogeneous client distributions for several backbone architectures, particularly CNN-LSTM, though performance varies across models. The CNN-LSTM backbone achieves 76.2% accuracy and 89.5% AUC on EyePACS while reducing communication cost by an order of magnitude compared to transformer-based models. Furthermore, CNN-LSTM exhibits more stable convergence and stronger robustness to client-level data heterogeneity. Grad-CAM-based explainability analysis qualitatively shows attention maps highlighting retinal regions commonly associated with DR. To quantify localisation quality, we computed Intersection-over-Union (IoU) with IDRiD lesion masks; mean IoU values were below 0.03 for all lesion types, confirming the coarse, exploratory nature of the visualisations.
Overall, the proposed FL-MoE framework with a CNN-LSTM backbone offers an effective and practical solution for scalable, privacy-aware Diabetic Retinopathy screening in federated clinical environments, outperforming both standard federated baselines and a representative personalized FL method (FedBN) under heterogeneous data conditions.
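
The IoU check mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the 0.5 binarisation threshold, the function names `binarize` and `iou`, the toy 3x3 arrays, and the convention of returning 0.0 when both masks are empty are all assumptions made here for illustration. The abstract does not say how the Grad-CAM maps were thresholded. Plain Python lists stand in for image arrays so the example has no dependencies.

```python
from typing import Sequence


def binarize(heatmap: Sequence[Sequence[float]], threshold: float = 0.5) -> list[list[bool]]:
    """Turn a heatmap scaled to [0, 1] into a binary mask.

    The 0.5 threshold is an illustrative assumption, not a value from the paper.
    """
    return [[value >= threshold for value in row] for row in heatmap]


def iou(pred: Sequence[Sequence[object]], target: Sequence[Sequence[object]]) -> float:
    """Intersection-over-Union of two equally sized binary masks."""
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")
    # Flatten both masks into aligned (predicted, ground-truth) pixel pairs.
    pairs = [(bool(p), bool(t)) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    intersection = sum(p and t for p, t in pairs)
    union = sum(p or t for p, t in pairs)
    # Assumed convention: two empty masks give an IoU of 0.0.
    return intersection / union if union else 0.0


# Toy example: a coarse heatmap covering three pixels, one of which is lesion.
heatmap = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.0],
    [0.1, 0.0, 0.0],
]
lesion_mask = [
    [1, 0, 0],
    [0, 0, 0],
    [0, 0, 0],
]
score = iou(binarize(heatmap), lesion_mask)  # 1 shared pixel / 3 pixels in the union
```

In this toy case the heatmap fully covers the lesion and still scores only one third, because every highlighted pixel outside the lesion enlarges the union. This is one reason a coarse attention map that spills beyond small lesions scores low on IoU.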

Topics

Journal Article
