4DfCF: 4D fMRI CrossFormer Vision Transformer
Authors
Abstract
Investigating the spatiotemporal dynamics of the human brain is challenging because of the intricate nature of brain networks and the limitations of current analytical methods. Herein, we introduce the 4D functional Magnetic Resonance Imaging (fMRI) CrossFormer (4DfCF), a novel vision transformer architecture designed to process high-dimensional 4D fMRI data. The model integrates the temporal and spatial dimensions to learn and predict cognitive and clinical outcomes. We evaluated the 4DfCF on three benchmark datasets: Attention Deficit Hyperactivity Disorder-200 (ADHD-200), Alzheimer's Disease Neuroimaging Initiative (ADNI), and Autism Brain Imaging Data Exchange (ABIDE). Our model consistently outperformed state-of-the-art baseline models, improving accuracy by 5-10%, precision by 4-8%, recall by 6-9%, and F1-score by 7-11%. The 4D fMRI CrossFormer-Tiny variant was also more efficient than existing methods, using 20% fewer computational resources and training 30% faster. Pre-training experiments further showed that models pre-trained on one dataset and fine-tuned on another converged faster and reached higher accuracy, with the ABIDE pre-trained models performing best. Finally, we employed an explainable AI method to identify the brain regions associated with disease diagnosis. Overall, our findings highlight the potential of the 4DfCF to advance precision neuroscience through efficient and scalable analysis of complex fMRI data.