Cross-Scanner Reliability of Brain MRI Foundation Model Embeddings: A Travelling-Heads Study

March 25, 2026

preprint

DOI: 10.64898/2026.03.23.26348808

Authors

Navarro-Gonzalez, R., Aja-Fernandez, S., Planchuelo-Gomez, A., de Luis-Garcia, R.

Affiliations (1)

Laboratorio de Procesado de Imagen, Universidad de Valladolid

Abstract

Foundation models (FMs) for brain magnetic resonance imaging (MRI) are increasingly adopted as pretrained backbones for clinical tasks such as brain age prediction, disease classification, and anomaly detection. However, if FM embeddings (internal representations) shift systematically across MRI scanners, downstream analyses built on them may reflect acquisition hardware rather than biology. No study has yet quantified this cross-scanner reproducibility. Here, we assess the cross-scanner reliability of brain MRI FM embeddings and investigate which design factors (pretraining strategy, network architecture, embedding dimensionality, and pretraining dataset scale) best explain the observed differences. Using the ON-Harmony travelling-heads dataset (20 participants, eight scanners, three vendors), we evaluate the embeddings of five architecturally diverse FMs and a FreeSurfer morphometric baseline via within- and between-scanner intraclass correlation coefficient (ICC), variance decomposition, and scanner fingerprinting. Reliability spanned the full spectrum: biology-guided models achieved good-to-excellent cross-scanner ICC (AnatCL: 0.97 [95% confidence interval (CI): 0.94, 0.98]; y-Aware: 0.81 [0.63, 0.88]), matching or surpassing FreeSurfer (0.93 [0.83, 0.96]), whereas purely self-supervised models fell in the poor-reliability range (BrainIAC: 0.45, BrainSegFounder: 0.31, 3D-Neuro-SimCLR: 0.25), with 23-58% of embedding variance attributable to scanner identity. The strongest correlate of cross-scanner reliability among the models evaluated was pretraining strategy: incorporating biological metadata (cortical morphometrics, age) into the contrastive objective produced scanner-robust embeddings, whereas architecture, dimensionality, and dataset scale did not predict reliability.
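
As a rough illustration of the reliability metrics named in the abstract, the sketch below computes an ICC and a scanner variance share for a single embedding dimension arranged as a participants-by-scanners matrix. The abstract does not state which ICC form or variance model the authors used, so the two-way random-effects, absolute-agreement, single-measurement form (often written ICC(2,1)) is an assumption here, and the function name `icc2_1_with_scanner_share` and the toy data are hypothetical.

```python
import numpy as np


def icc2_1_with_scanner_share(x):
    """Return (ICC(2,1), scanner variance share) for a subjects-by-scanners matrix.

    Assumption: two-way random-effects ANOVA without replication; the
    preprint's abstract does not specify its exact ICC form.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape  # n subjects (rows), k scanners (columns)
    grand = x.mean()
    row_means = x.mean(axis=1)  # per-subject means
    col_means = x.mean(axis=0)  # per-scanner means

    # Mean squares for subjects, scanners, and residual error.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))

    icc = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

    # Method-of-moments variance components, clipped at zero.
    var_subject = max((msr - mse) / k, 0.0)
    var_scanner = max((msc - mse) / n, 0.0)
    total = var_subject + var_scanner + mse
    scanner_share = var_scanner / total if total > 0 else 0.0
    return icc, scanner_share


# Toy data: 4 subjects, 3 scanners, each scanner adding a constant offset.
subject_effect = np.array([1.0, 2.0, 3.0, 4.0])
scanner_offset = np.array([0.0, 1.0, 2.0])
toy = subject_effect[:, None] + scanner_offset[None, :]
icc_value, share_value = icc2_1_with_scanner_share(toy)
print(f"ICC(2,1) = {icc_value:.3f}, scanner share = {share_value:.3f}")
```

For a multi-dimensional embedding, one plausible use would be to call the function once per dimension on the 20-by-8 matrix of that dimension's values and then summarise across dimensions; how the preprint aggregates across dimensions is not stated in the abstract.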


Topics

radiology and imaging
