A Spatiotemporal Transformer Framework on 4D-CT for Early Prediction of Progressive Pulmonary Fibrosis.
Authors
Affiliations (6)
- Department of Computer Science, Fairleigh Dickinson University, Vancouver Campus, Canada.
- Department of Electrical Engineering, Faculty of Technology and Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran.
- Department of Computer Engineering, Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran.
- Department of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran.
- Shien-Ming Wu School of Intelligent Engineering, South China University of Technology, Guangzhou, People's Republic of China.
- School of Electrical Engineering, Iran University of Science and Technology (IUST), Tehran, Iran.
Abstract
Progressive pulmonary fibrosis (PPF) remains difficult to predict because static imaging may not fully capture regional respiratory motion, ventilation heterogeneity, and mechanical deformation. We propose a spatiotemporal transformer (ST-Former) framework that jointly models dynamic lung deformation, ventilation, and regional strain from four-dimensional computed tomography (4D-CT). The model integrates CT intensity, deformable registration-derived displacement magnitude, Jacobian-based ventilation maps, and Green-Lagrange strain magnitude maps into a unified four-channel spatiotemporal representation across ten respiratory phases. A total of 210 subjects were included in the final analytic cohort, comprising 130 patients with progressive pulmonary fibrosis and 80 patients with non-progressive interstitial lung disease. Progression was defined using a multidisciplinary reference standard based on ≥ 10% relative decline in forced vital capacity within approximately 12 months and/or radiological worsening. Across patient-level fivefold cross-validation, ST-Former achieved an AUC of 0.947 ± 0.016, outperforming ConvLSTM-3D (0.902 ± 0.021) and 3D ResNet-50 (0.861 ± 0.027). The model also demonstrated improved calibration, with a Brier score of 0.067 and an expected calibration error of 0.028. Center-stratified and leave-one-center-out analyses were performed to assess robustness to institutional domain shift. These results suggest that motion-informed spatiotemporal learning may provide a promising imaging biomarker for predicting pulmonary fibrosis progression, although prospective external validation is required before clinical translation.
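To make the motion-derived input channels concrete, the following is a minimal NumPy sketch of how displacement magnitude, a Jacobian-based ventilation proxy, and Green-Lagrange strain magnitude could be computed from a single displacement field produced by deformable registration. It is an illustration under stated assumptions, not the authors' pipeline: the function name `deformation_maps`, the array layout `(3, D, H, W)`, the use of `J - 1` as the ventilation proxy, and the Frobenius norm as the strain magnitude are all choices made here, since the abstract does not specify them.

```python
import numpy as np

def deformation_maps(u, spacing=(1.0, 1.0, 1.0)):
    """Derive motion maps from one displacement field (hypothetical helper).

    u       : array of shape (3, D, H, W), displacement components along the
              three array axes, in the same physical units as `spacing`.
    spacing : voxel spacing along each array axis.
    Returns (displacement magnitude, Jacobian determinant, strain magnitude),
    each of shape (D, H, W).
    """
    # grad[i, j] = d u_i / d x_j, estimated with finite differences.
    grad = np.stack(
        [np.stack(np.gradient(u[i], *spacing), axis=0) for i in range(3)],
        axis=0,
    )
    # Move the two tensor axes last so each voxel holds a 3x3 matrix.
    grad = np.moveaxis(grad, (0, 1), (-2, -1))

    # Deformation gradient F = I + grad(u).
    F = grad + np.eye(3)

    # Jacobian determinant: local volume ratio (J > 1 means expansion).
    jac = np.linalg.det(F)

    # Green-Lagrange strain tensor E = 0.5 * (F^T F - I).
    E = 0.5 * (np.swapaxes(F, -1, -2) @ F - np.eye(3))

    # Assumption: summarize E per voxel by its Frobenius norm.
    strain_mag = np.sqrt((E ** 2).sum(axis=(-2, -1)))

    # Euclidean length of the displacement vector at each voxel.
    disp_mag = np.sqrt((u ** 2).sum(axis=0))
    return disp_mag, jac, strain_mag

def four_channel_volume(ct, u, spacing=(1.0, 1.0, 1.0)):
    """Stack CT intensity with the three motion maps into shape (4, D, H, W)."""
    disp_mag, jac, strain_mag = deformation_maps(u, spacing)
    ventilation = jac - 1.0  # assumption: fractional local volume change
    return np.stack([ct, disp_mag, ventilation, strain_mag], axis=0)
```

Applying `four_channel_volume` to each of the ten respiratory phases and stacking the results along a new leading axis would give a tensor of shape `(10, 4, D, H, W)`, which matches the four-channel, ten-phase representation described in the abstract. Intensity normalization, lung masking, and the choice of reference phase are left out of this sketch.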