
Multimodal Deep Learning for Longitudinal Prediction of Glaucoma Progression Using Sequential RNFL, Visual Field, and Clinical Data

November 4, 2025 · medRxiv preprint

Authors

Moradi, M., Cao-Xue, J., Eslami, M., Wang, M., Elze, T., Zebardast, N.

Affiliations (1)

  • Harvard Medical School

Abstract

Forecasting glaucoma progression remains a major challenge in preventing irreversible vision loss. We developed and validated a multimodal, longitudinal deep learning framework to predict future progression using a large retrospective cohort of 10,864 patients from Mass Eye and Ear. The model integrates sequential structural (OCT RNFL scans), functional (visual-field maps), and clinical data from a two-year observation window to forecast progression over the subsequent two- to four-year horizon. Four backbone architectures (ConvNeXt-V2, ViT, MobileNet-V2, EfficientNet-B0) were coupled with a bidirectional LSTM to capture temporal dynamics. The ConvNeXt-V2-based model achieved 0.97 AUC and 0.94-0.96 accuracy, outperforming the other backbones, with robust performance across sex and race subgroups and only modest attenuation in patients older than 70 years. Saliency maps localized to clinically relevant arcuate bundles, supporting biological plausibility. By effectively fusing multimodal data over time, this framework enables accurate, interpretable, and equitable long-horizon risk stratification, advancing personalized glaucoma management.
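The fusion design described above (per-visit encoders for each modality, with a bidirectional LSTM over the visit sequence) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the MLP encoders stand in for the real image backbones (ConvNeXt-V2, ViT, etc.), and all input dimensions and hyperparameters are assumptions.

```python
# Hedged sketch of a multimodal longitudinal model: encode RNFL, visual-field,
# and clinical features per visit, fuse by concatenation, and run a
# bidirectional LSTM over the visit sequence to produce a progression-risk logit.
import torch
import torch.nn as nn

class MultimodalProgressionModel(nn.Module):
    def __init__(self, rnfl_dim=256, vf_dim=54, clin_dim=8, emb_dim=64, hidden=128):
        super().__init__()
        # Small MLPs stand in for the paper's CNN/ViT image backbones.
        self.rnfl_enc = nn.Sequential(nn.Linear(rnfl_dim, emb_dim), nn.ReLU())
        self.vf_enc = nn.Sequential(nn.Linear(vf_dim, emb_dim), nn.ReLU())
        self.clin_enc = nn.Sequential(nn.Linear(clin_dim, emb_dim), nn.ReLU())
        # Bidirectional LSTM captures temporal dynamics across visits.
        self.lstm = nn.LSTM(3 * emb_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # single progression-risk logit

    def forward(self, rnfl, vf, clin):
        # Each input has shape (batch, visits, features).
        fused = torch.cat(
            [self.rnfl_enc(rnfl), self.vf_enc(vf), self.clin_enc(clin)], dim=-1
        )
        out, _ = self.lstm(fused)          # (batch, visits, 2 * hidden)
        return self.head(out[:, -1])       # predict from the final visit's state

model = MultimodalProgressionModel()
logit = model(torch.randn(2, 4, 256), torch.randn(2, 4, 54), torch.randn(2, 4, 8))
print(logit.shape)  # torch.Size([2, 1])
```

In practice the concatenation-then-LSTM pattern is one common late-fusion choice; attention-based fusion is a frequent alternative when modalities are missing at some visits.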

Topics

health informatics
