ViTMARE - A Vision Transformer Pipeline for Anomaly Detection in 3D Brain MRI.
Authors
Affiliations (4)
- Department of Computer, Electrical and Biomedical Engineering, University of Pavia, Pavia, Italy.
- Research Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Rome, Italy.
- Clinical Department, National Center for Oncological Hadrontherapy (CNAO), Pavia, Italy.
- Radiology Department, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy.
Abstract
AI models for medical imaging often fail under dataset shifts and on underrepresented patient subgroups. Detecting out-of-distribution scans, whether they arise from rare pathologies, atypical anatomy, or acquisition artifacts, is therefore essential for robust deployment. We introduce ViTMARE (Vision Transformer Masked Autoencoder Reconstruction Error), a volumetric anomaly-detection pipeline for 3D brain MRI that leverages Vision Transformer Masked AutoEncoders (ViTMAEs) adapted to volumetric data by treating axial slices as input channels. The model is fine-tuned on normal brain volumes and evaluated using a synthetic-lesion generator that produces anatomically plausible abnormalities. During inference, ViTMARE performs multiple reconstructions (N=100) and aggregates binary anomaly masks via majority voting, followed by morphological closing and opening to suppress spurious noise. On a test set of real images with added synthetic anomalies, ViTMARE achieves a median Dice score of 0.793, a median precision of 0.912, and a median recall of 0.748. We present a reproducible pipeline and demonstrate that combining voting-based fusion with morphological postprocessing yields robust voxel-level anomaly detection.
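The fusion and postprocessing steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-run binary masks are simulated rather than derived from ViTMAE reconstruction errors, the 3x3x3 box structuring element is an assumption (the abstract does not specify one), and all function names (`majority_vote`, `close_then_open`, `scores`) are hypothetical. Only NumPy is used, so dilation and erosion are written out by hand.

```python
import numpy as np

def majority_vote(masks):
    """Keep voxels flagged as anomalous in more than half of the N masks."""
    masks = np.asarray(masks, dtype=bool)
    return masks.sum(axis=0) > masks.shape[0] / 2

def dilate(mask):
    """Binary dilation with a 3x3x3 box (assumed structuring element)."""
    out = np.pad(mask, 1, constant_values=False)
    for axis in range(mask.ndim):
        out = out | np.roll(out, 1, axis=axis) | np.roll(out, -1, axis=axis)
    return out[(slice(1, -1),) * mask.ndim]

def erode(mask):
    """Binary erosion by duality; voxels outside the volume count as foreground."""
    return ~dilate(~mask)

def close_then_open(mask):
    """Closing fills small holes, then opening removes small isolated blobs."""
    closed = erode(dilate(mask))
    return dilate(erode(closed))

def scores(pred, truth):
    """Voxel-level Dice, precision, and recall (assumes non-empty pred and truth)."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    return 2 * tp / (2 * tp + fp + fn), tp / (tp + fp), tp / (tp + fn)

# Simulated ground truth: a 6x6x6 synthetic "lesion" in a 16x16x16 volume.
lesion = np.zeros((16, 16, 16), dtype=bool)
lesion[5:11, 5:11, 5:11] = True

# Simulated per-run masks (N=100, as in the abstract): lesion voxels are
# flagged in about 80% of runs, background voxels in about 10%.
rng = np.random.default_rng(0)
n_runs = 100
flag_prob = np.where(lesion, 0.8, 0.1)
masks = rng.random((n_runs,) + lesion.shape) < flag_prob
masks[:, 7, 7, 7] = False   # a systematic one-voxel hole inside the lesion
masks[:, 2, 2, 13] = True   # a systematic isolated false positive

voted = majority_vote(masks)
cleaned = close_then_open(voted)

print("after voting:        ", scores(voted, lesion))
print("after postprocessing:", scores(cleaned, lesion))
```

In this toy setup, voting removes run-to-run noise but cannot remove errors that recur in every run; the closing and opening steps handle the one-voxel hole and the isolated false positive. On real data the choice of structuring element trades noise suppression against loss of small or thin lesions.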