NeuroConText: Contrastive Learning for Neuroscience Meta-Analysis with Rich Text Representation
Authors
Affiliations (1)
- Inria Centre de Recherche Saclay Île-de-France: Inria Centre de Recherche Saclay-Ile-de-France
Abstract
Brain meta-analysis is the common way to gather information about human brain function across the existing literature in order to formulate hypotheses and contextualize new findings. However, automated meta-analysis tools still rely on bag-of-words approaches and therefore struggle with inconsistent terminology, long texts, and capturing semantic meaning; furthermore, sparse coordinate reporting in articles distorts the activation distribution because the data are incomplete. This paper introduces NeuroConText, a predictive text-to-brain modeling framework designed to support brain meta-analysis by bridging neuroscience text, brain location coordinates, and brain images within a shared latent space. The framework follows the predictive brain meta-analysis paradigm: it learns a regression from text descriptions to whole-brain activation maps and also enables the retrieval of relevant studies through contrastive learning, optimizing a multi-objective loss that combines retrieval and reconstruction objectives. Furthermore, NeuroConText supports second-level statistical synthesis by providing the activations associated with the top-K retrieved studies, which can serve as input to coordinate-based meta-analysis (CBMA) methods. NeuroConText also leverages large language models (LLMs) to capture neuroscientific information from full-text articles, together with an LLM-based text augmentation strategy to handle short-text inputs. Quantitative and qualitative analyses demonstrate NeuroConText's ability to enhance text-to-brain retrieval performance and to reconstruct brain maps from neuroscience texts. We also show that predictive brain meta-analysis tools can infer brain activations in regions that are discussed in articles but absent from the reported coordinates, potentially addressing the challenge of sparse coordinate reporting.
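To make the idea of a multi-objective loss over a shared latent space more concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not specify the form of either objective, so the sketch assumes a symmetric InfoNCE-style contrastive term for retrieval and a mean-squared-error term for reconstruction. The function names, the `temperature` and `weight` parameters, and the cosine-similarity top-K lookup are all hypothetical choices made for illustration.

```python
import numpy as np


def log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def contrastive_loss(text_emb, brain_emb, temperature=0.1):
    # Assumed retrieval objective: symmetric InfoNCE.
    # Row i of text_emb and row i of brain_emb come from the same study.
    logits = l2_normalize(text_emb) @ l2_normalize(brain_emb).T / temperature
    idx = np.arange(len(logits))
    text_to_brain = -log_softmax(logits, axis=1)[idx, idx].mean()
    brain_to_text = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (text_to_brain + brain_to_text)


def reconstruction_loss(pred_maps, true_maps):
    # Assumed reconstruction objective: mean squared error between maps.
    return np.mean((pred_maps - true_maps) ** 2)


def combined_loss(text_emb, brain_emb, pred_maps, true_maps,
                  weight=1.0, temperature=0.1):
    # Hypothetical weighting: retrieval term plus weighted reconstruction term.
    return (contrastive_loss(text_emb, brain_emb, temperature)
            + weight * reconstruction_loss(pred_maps, true_maps))


def top_k_retrieval(query_emb, brain_emb, k):
    # Indices of the k studies whose brain embeddings are most similar
    # (by cosine similarity) to a single text query embedding.
    q = query_emb / np.linalg.norm(query_emb)
    sims = l2_normalize(brain_emb) @ q
    return np.argsort(-sims)[:k]
```

Under these assumptions, the indices returned by `top_k_retrieval` would select the studies whose activations are passed on to a CBMA method for second-level synthesis, while `combined_loss` would be the quantity minimized during training.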