Diagnosis of Multiple Sclerosis Using Multimodal Deep Learning Integrating Lesion and Normal-Appearing White Matter: A Retrospective Study with International Multicentre External Validation
Authors
Affiliations (1)
- NYU Grossman School of Medicine
Abstract
Background: Current diagnostic criteria for multiple sclerosis (MS) rely on white matter lesions (WMLs), which are not specific to MS and often occur in other disorders. Microstructural abnormalities in normal-appearing white matter (NAWM) may provide complementary information beyond focal lesions. However, the diagnostic use of NAWM in MS remains limited because a reproducible, diagnostically specific NAWM signature has not been established, and detecting NAWM abnormalities typically requires quantitative MRI methods beyond routine clinical MRI protocols.

Methods: In this retrospective study, we propose DeepMS, a deep learning model trained with both quantitative diffusion MRI (dMRI) and structural MRI (sMRI) to diagnose MS by integrating WML and NAWM features captured from routine MRI alone. Development used 8,450 scans from 7,703 patients (NYU Langone/ADNI). Evaluation included an internal test set (n=837) and two independent external cohorts: the Krakow cohort (Poland, n=293) and a public multi-site cohort curated from 15 datasets (n=1,756). We compared DeepMS against 2024 McDonald criteria biomarkers (Dissemination in Time [DIT], Dissemination in Space [DIS], Central Vein Sign [CVS], and Paramagnetic Rim Lesion [PRL]) in a multireader study (n=308). To validate the model's use of NAWM, we performed lesion-masking experiments (n=550), comparing performance before and after removal of focal lesions.

Findings: DeepMS achieved robust AUCs in the internal (0·968 [95% CI 0·946-0·987]), Krakow (0·940 [0·898-0·974]), and public external (0·974 [0·966-0·982]) cohorts. In the multireader study, DeepMS outperformed established biomarkers: at matched sensitivity (92·9%), DeepMS achieved higher specificity than DIS (89·0% vs 78·5%; p=0·0061); at matched specificity (92·8%), DeepMS achieved higher sensitivity than CVS (88·2% vs 52·0%; p<0·0001). Furthermore, DeepMS retained more of its diagnostic capability after WML masking (AUC decreased from 0·959 to 0·881) than the model trained with sMRI alone (0·895 to 0·764).

Interpretation: Our findings suggest that it is feasible for deep learning models to leverage NAWM-related information directly from routine sMRI. Integrating these features could aid MS diagnosis in patients with ambiguous white matter abnormalities.

Funding: National Institute of Neurological Disorders and Stroke, the National Institute of Biomedical Imaging and Bioengineering, and the Irma T. Hirschl Trust.
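The matched-sensitivity comparison reported in the findings can be illustrated with a small sketch. The abstract does not say how the operating point was matched, so the procedure below is an assumption: it picks the highest score threshold at which a continuous classifier reaches a target sensitivity (for example, that of a binary biomarker such as DIS) and reports the specificity at that threshold. The function name `specificity_at_sensitivity` and the toy scores are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def specificity_at_sensitivity(
    scores: Sequence[float],
    labels: Sequence[int],
    target_sensitivity: float,
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) at the highest threshold
    whose sensitivity is at least `target_sensitivity`.

    A case is called positive when its score is >= threshold.
    Labels are 1 for MS and 0 for non-MS (an illustrative convention).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    if not 0.0 <= target_sensitivity <= 1.0:
        raise ValueError("target_sensitivity must lie in [0, 1]")

    # Scan candidate thresholds from strictest to most lenient; sensitivity
    # can only grow as the threshold drops, so the first hit is the highest
    # threshold that meets the target.
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(
            1 for s, y in zip(scores, labels) if y == 1 and s >= threshold
        )
        sensitivity = true_pos / positives
        if sensitivity >= target_sensitivity:
            true_neg = sum(
                1 for s, y in zip(scores, labels) if y == 0 and s < threshold
            )
            return threshold, sensitivity, true_neg / negatives

    # Not reachable: the lowest threshold always gives sensitivity 1.0.
    raise RuntimeError("no threshold reached the target sensitivity")


# Toy data (hypothetical): model scores and ground-truth labels.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]

threshold, sens, spec = specificity_at_sensitivity(toy_scores, toy_labels, 1.0)
print(f"threshold={threshold:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The matched-specificity comparison against CVS would follow the same pattern with the roles of the two metrics swapped. A statistical comparison such as the reported p-values would need a paired test on top of this, which the sketch does not attempt.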