SibBMS: Siberian Brain Multiple Sclerosis Dataset with lesion segmentation and patient meta information
Authors
Affiliations (1)
- International Tomography Center SB RAS
Abstract
Multiple sclerosis (MS) is a chronic inflammatory neurodegenerative disorder of the central nervous system (CNS) and the leading cause of non-traumatic disability among young adults. Magnetic resonance imaging (MRI) has transformed both the clinical management and the scientific understanding of MS, serving as an indispensable paraclinical tool. Its high sensitivity and diagnostic accuracy enable early detection and timely therapeutic intervention, which can significantly improve patient outcomes. Recent technological advances have made it possible to apply artificial intelligence (AI) algorithms to automated lesion identification, segmentation, and longitudinal monitoring. The ongoing refinement of deep learning (DL) and machine learning (ML) techniques, together with their incorporation into clinical workflows, holds great promise for improving the accessibility and quality of healthcare in MS management. Despite the encouraging performance of DL models in MS lesion segmentation and disease progression tracking, their effectiveness is frequently constrained by the scarcity of large, diverse, and publicly available datasets. Open-source initiatives such as MSLesSeg, MS-Baghdad, MS-Shift, and MSSEG-2 have made valuable contributions to the research community. Building on these foundations, we present the SibBMS dataset, a carefully curated, open-source resource designed to support data-driven MS research using structural brain MRI. The dataset comprises imaging data from 93 patients diagnosed with MS or radiologically isolated syndrome (RIS), alongside 100 healthy controls. All lesion annotations were manually delineated and rigorously reviewed by a three-tier panel of experienced neuroradiologists to ensure clinical relevance and segmentation accuracy.
Additionally, the dataset includes comprehensive demographic metadata, such as age, sex, and disease duration, enabling robust stratified analyses and facilitating the development of more generalizable predictive models. The dataset is available via a request-access form at https://forms.gle/VqTenJ4n8S8qvtxQA.