A multi-label transformer-based deep learning approach to predict focal visual field progression

Ling Chen, Vincent S. Tseng, Ta Hsin Tsung, Da Wen Lu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Purpose: Tracking functional changes in visual fields (VFs) through standard automated perimetry remains a clinical standard for glaucoma diagnosis. This study aims to develop and evaluate a deep learning (DL) model to predict regional VF progression, which has not been explored in prior studies.

Methods: The study included 2430 eyes of 1283 patients with four or more consecutive VF examinations from the baseline. A multi-label transformer-based network (MTN) using longitudinal VF data was developed to predict progression in six VF regions mapped to the optic disc. Progression was defined using the mean deviation (MD) slope and calculated for all six VF regions, referred to as clusters. Separate MTN models, trained for focal progression detection and forecasting on various numbers of VFs as model input, were tested on a held-out test set.

Results: The MTNs overall demonstrated excellent macro-average AUCs above 0.884 in detecting focal VF progression given five or more VFs. With a minimum of 6 VFs, the model demonstrated superior and more stable overall and per-cluster performance, compared to 5 VFs. The MTN given 6 VFs achieved a macro-average AUC of 0.848 for forecasting progression across 8 VF tests. The MTN also achieved excellent performance (AUCs ≥ 0.86, 1.0 sensitivity, and specificity ≥ 0.70) in four out of six clusters for the eyes already with severe VF loss (baseline MD ≤ −12 dB).

Conclusion: The high prediction accuracy suggested that multi-label DL networks trained with longitudinal VF results may assist in identifying and forecasting progression in VF regions.
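The per-cluster progression labels and the macro-average metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the slope cutoff or say whether a significance test is applied, so the threshold of -1.0 dB/year, the function names, and the toy data below are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical cutoff in dB/year; the abstract does not state the value used.
SLOPE_THRESHOLD_DB_PER_YEAR = -1.0


def cluster_slopes(times_years, cluster_md):
    """Least-squares slope of mean deviation over time, one per cluster.

    times_years: shape (n_visits,), years since baseline
    cluster_md:  shape (n_visits, n_clusters), per-cluster mean deviation in dB
    """
    t = np.asarray(times_years, dtype=float)
    y = np.asarray(cluster_md, dtype=float)
    # np.polyfit fits each column of y independently; row 0 holds the slopes.
    return np.polyfit(t, y, deg=1)[0]


def progression_labels(times_years, cluster_md,
                       threshold=SLOPE_THRESHOLD_DB_PER_YEAR):
    """Multi-label target: 1 where a cluster's slope is at or below the cutoff."""
    slopes = cluster_slopes(times_years, cluster_md)
    return (slopes <= threshold).astype(int)


def macro_average(per_cluster_scores):
    """Unweighted mean of a per-cluster metric such as AUC."""
    return float(np.mean(per_cluster_scores))


# Toy example: five yearly visits, six clusters (synthetic values).
t = np.arange(5, dtype=float)
md = np.zeros((5, 6))
md[:, 0] = -1.5 * t   # cluster 0 declines by 1.5 dB/year
md[:, 3] = -0.2 * t   # cluster 3 declines slowly, above the cutoff
labels = progression_labels(t, md)
```

Each eye therefore gets a six-element binary vector, which is what makes the task multi-label: one eye can be progressing in some clusters and stable in others. A macro-average weights every cluster equally, regardless of how many progressing eyes it contains.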


  • Artificial intelligence
  • Focal progression
  • Glaucoma
  • Multi-label deep learning
  • Visual field

