ADHD prediction from individual-space T1 images using a Vision Transformer with a gross-region grid framework.
Authors
Affiliations (4)
- Graduate Degree Program of Applied Data Sciences, Sophia University, 7-1 Kioi-cho, Chiyoda-ku, 102-8554, Tokyo, Japan. Electronic address: [email protected].
- Medical Institute of Developmental Disabilities Research, Showa Medical University, 6-11-11 Kita-karasuyama, Setagaya-ku, 157-8577, Tokyo, Japan. Electronic address: [email protected].
- Medical Institute of Developmental Disabilities Research, Showa Medical University, 6-11-11 Kita-karasuyama, Setagaya-ku, 157-8577, Tokyo, Japan. Electronic address: [email protected].
- Graduate Degree Program of Applied Data Sciences, Sophia University, 7-1 Kioi-cho, Chiyoda-ku, 102-8554, Tokyo, Japan. Electronic address: [email protected].
Abstract
Predicting attention-deficit/hyperactivity disorder (ADHD) from neuroimaging remains challenging due to heterogeneous brain morphology. In this study, we proposed an end-to-end framework using Vision Transformer (ViT) models to directly learn discriminative features from individual-space T1-weighted MRI. We evaluated two anatomical coverage patterns to assess the impact of data reduction and spatial granularity: (1) whole-brain (WB) axial slices and (2) 11 representative slices (R11). Our results demonstrated that the ViT achieved the highest numerical AUC, significantly outperforming the baseline CNN and the conventional ROI-based approach, while performing comparably to ResNet. Notably, the transition from WB to R11 (AUC 0.75) showed no statistically significant degradation in performance (p=0.19), indicating that high diagnostic integrity can be maintained even under substantial anatomical data reduction. Interpretability analysis via SHAP, applied to the R11 configuration, identified consistent high-impact spatial clusters across anatomical axes. Specifically, the precentral gyrus and occipital regions emerged as robust neuroanatomical substrates for ADHD classification. These findings suggest that transformer-based self-attention effectively integrates distributed morphological variations across sensorimotor and visual processing networks, providing an anatomically coherent approach to ADHD diagnosis.
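The R11 reduction described above can be illustrated with a minimal sketch. This is a hypothetical implementation, assuming the 11 representative slices are drawn at evenly spaced depths along the axial axis of the volume; the paper's actual slice-selection criteria are not specified in this abstract, and the function name is illustrative.

```python
import numpy as np

def select_representative_slices(volume: np.ndarray, n_slices: int = 11):
    """Pick n_slices axial slices at evenly spaced depths from a 3D T1 volume.

    Hypothetical sketch: assumes axis 0 is the axial (depth) axis and that
    "representative" means uniformly spaced, which may differ from the
    study's actual R11 selection protocol.
    """
    depth = volume.shape[0]
    # Evenly spaced depth indices from the first to the last slice.
    idx = np.linspace(0, depth - 1, n_slices).round().astype(int)
    return volume[idx], idx

# Usage with a dummy volume in place of a real individual-space T1 image.
volume = np.zeros((90, 128, 128), dtype=np.float32)
slices, idx = select_representative_slices(volume)
```

Each selected slice would then be fed to the 2D ViT as an ordinary image input, so the same classifier architecture serves both the WB and R11 coverage patterns.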