Multimodal Deep Learning for Longitudinal Prediction of Glaucoma Progression Using Sequential RNFL, Visual Field, and Clinical Data

November 4, 2025 · medRxiv preprint

Authors

Moradi, M., Cao-Xue, J., Eslami, M., Wang, M., Elze, T., Zebardast, N.

Affiliations (1)

  • Harvard Medical School

Abstract

Forecasting glaucoma progression remains a major challenge in preventing irreversible vision loss. We developed and validated a multimodal, longitudinal deep learning framework to predict future progression using a large retrospective cohort of 10,864 patients from Mass Eye and Ear. The model integrates sequential structural (OCT RNFL scans), functional (visual-field maps), and clinical data from a two-year observation window to forecast progression over the subsequent two- to four-year horizon. Four backbone architectures (ConvNeXt-V2, ViT, MobileNet-V2, EfficientNet-B0) were coupled with a bidirectional LSTM to capture temporal dynamics. The ConvNeXt-V2-based model achieved 0.97 AUC and 0.94-0.96 accuracy, outperforming the other backbones, with robust performance across sex and race subgroups and only modest attenuation in patients older than 70 years. Saliency maps localized to clinically relevant arcuate bundles, supporting biological plausibility. By effectively fusing multimodal data over time, this framework enables accurate, interpretable, and equitable long-horizon risk stratification, advancing personalized glaucoma management.
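The pipeline described above (per-visit fusion of structural, functional, and clinical features, followed by a bidirectional recurrent pass over the observation window and a binary progression output) can be sketched in miniature. This is a hypothetical illustration, not the authors' code: the image backbones are stubbed with random feature vectors, the BiLSTM is replaced by a simple bidirectional tanh RNN, and all dimensions (`T`, `d_oct`, `d_vf`, `d_clin`, `H`) are made-up placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 4                            # visits in the 2-year observation window (illustrative)
d_oct, d_vf, d_clin = 8, 6, 4    # per-modality feature sizes (illustrative stand-ins
                                 # for backbone outputs and clinical variables)
H = 16                           # hidden size of the recurrent layer

D = d_oct + d_vf + d_clin        # fused per-visit feature size

def encode_visit(oct_feat, vf_feat, clin_feat):
    """Fuse one visit's modalities by concatenation (one common fusion strategy)."""
    return np.concatenate([oct_feat, vf_feat, clin_feat])

# Untrained random weights: one tanh RNN cell per direction stands in for the BiLSTM.
Wf, Uf = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))
Wb, Ub = rng.normal(0, 0.1, (H, D)), rng.normal(0, 0.1, (H, H))
w_out = rng.normal(0, 0.1, 2 * H)

def rnn_pass(xs, W, U):
    """Run a simple recurrent pass over the visit sequence, return final hidden state."""
    h = np.zeros(H)
    for x in xs:
        h = np.tanh(W @ x + U @ h)
    return h

def predict_progression(visits):
    """Map a sequence of multimodal visits to a progression probability."""
    xs = [encode_visit(*v) for v in visits]
    h_fwd = rnn_pass(xs, Wf, Uf)          # forward in time
    h_bwd = rnn_pass(xs[::-1], Wb, Ub)    # backward in time
    logit = w_out @ np.concatenate([h_fwd, h_bwd])
    return 1.0 / (1.0 + np.exp(-logit))   # sigmoid -> probability of progression

# Synthetic patient: T visits, each with stubbed OCT, visual-field, clinical features.
visits = [(rng.normal(size=d_oct), rng.normal(size=d_vf), rng.normal(size=d_clin))
          for _ in range(T)]
p = predict_progression(visits)
print(f"predicted progression probability: {p:.3f}")
```

In the paper's actual framework, `encode_visit` would be a learned CNN/ViT backbone for the imaging modalities, and the recurrent layer a trained bidirectional LSTM; this sketch only shows how the pieces connect.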

Topics

health informatics