Attention-based Multimodal Spatiotemporal Enhanced Interaction Network for Major Depressive Disorder Detection
Authors
Abstract
Although deep learning models have shown promising results in detecting major depressive disorder (MDD), two main limitations remain: insufficient exploitation of interactive information across multimodal brain networks and a lack of adaptive mechanisms for capturing crucial spatiotemporal dependencies among brain regions. To address these challenges, we propose the Attention-based Multimodal Spatiotemporal Enhanced Interaction Network (AM-SEIN) for MDD detection. Specifically, to tackle the first challenge, we integrate structural information from 3D structural magnetic resonance imaging (sMRI) with functional temporal data from functional magnetic resonance imaging (fMRI). Additionally, we design the Cross-Modal Interaction Network (CMIN) and a fusion layer to enhance mutual information aggregation and facilitate interactive guidance between the two modalities. For the second challenge, we develop an attention-based adaptive spatiotemporal feature-extraction architecture for both modalities, incorporating the fMRI-based Adaptive Spatiotemporal Fusion (fASF) and the sMRI-based Regional-Level Content-Dependent (sRLCD) modules. This approach enables effective encoding of inter-regional interactions relevant to MDD detection. Finally, the proposed AM-SEIN is evaluated on the Rest-meta-MDD (RMM) and Rest-meta-MDD-V2 (RMM-V2) datasets, achieving state-of-the-art performance.
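The abstract does not specify how the CMIN's cross-modal guidance is computed; a minimal NumPy sketch of generic cross-attention, where features of one modality (e.g., fMRI regions) attend to features of the other (e.g., sMRI regions), illustrates the general mechanism. All shapes, weight matrices, and names here are hypothetical, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, d_k=16, seed=0):
    """Generic cross-attention: one modality's features query another's.

    query_feats:   (n_regions, d) array, e.g. fMRI region features (toy data)
    context_feats: (m_regions, d) array, e.g. sMRI region features (toy data)
    Returns an (n_regions, d_k) array of context-guided query features.
    """
    rng = np.random.default_rng(seed)
    d = query_feats.shape[1]
    # Random projection weights stand in for learned parameters.
    Wq = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wk = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Wv = rng.standard_normal((d, d_k)) / np.sqrt(d)
    Q = query_feats @ Wq
    K = context_feats @ Wk
    V = context_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (n_regions, m_regions)
    return attn @ V                          # (n_regions, d_k)

# Toy example: 90 brain regions with 32-dimensional features per modality.
fmri = np.random.default_rng(1).standard_normal((90, 32))
smri = np.random.default_rng(2).standard_normal((90, 32))
out = cross_attention(fmri, smri)
print(out.shape)  # (90, 16)
```

In a full model the projection matrices would be learned, and an analogous pass in the opposite direction (sMRI queries attending to fMRI context) would provide the mutual guidance the abstract describes.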