Reliability of a convolutional neural network in segmenting multiple sclerosis lesions from MRI: Impact of data augmentation, image modality and tolerance with U-Net architecture.
Authors
Affiliations (7)
- School of Engineering, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
- School of Computer Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
- Institute of Inflammation and Ageing, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
- University Hospitals Birmingham NHS Foundation Trust, Edgbaston, Birmingham, United Kingdom.
- Royal North Shore Hospital, St Leonards, Sydney, New South Wales, Australia.
- School of Neurology, Dudley Group NHS Foundation Trust, Russells Hall Hospital, Birmingham, United Kingdom.
- School of Life and Health Sciences, Aston University, Birmingham, United Kingdom.
Abstract
Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system that typically presents with radiologically identifiable lesions in the brain and spinal cord; these lesions are key to both diagnosis and clinical disease monitoring. Manually identifying and segmenting lesions is difficult and time-consuming, so approaches that reliably automate segmentation are highly beneficial. The aim of this study was to assess the impact of data augmentation and manipulation on the accuracy of automated lesion segmentation using MRI scans from MS patients. The factors examined included MRI modalities, both in isolation and in combination, image augmentation, lesion size, and the size of the testing set relative to the training set. The MICCAI 2016 MS dataset was used, with U-Net as the segmentation architecture. Each factor was optimised individually and the factors were then combined to maximise segmentation accuracy; the Dice score was the primary metric for assessing the efficacy of each permutation of the setup. Statistical significance was assessed using the Mann-Whitney U-test, with each permutation repeated five to ten times to ensure robustness. The best Dice score achieved using the training and testing split outlined in the MICCAI 2016 challenge rubric was 0.59, an improvement of approximately 2% over controls. Under this split, as defined in the dataset publication, the optimal imaging sequence was proton density. The augmentation conditions were an additional rotation of the dataset (doubling its size) and the exclusion of lesions < 36.43 mm³ in volume. The impact of data manipulation and augmentation on lesion segmentation was statistically significant against controls (Mann-Whitney U-test).
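For readers unfamiliar with the Dice metric named above, the following is a minimal sketch of how it is commonly computed on binary segmentation masks. It is not the study's own code: the function name `dice_score`, the toy 4×4×4 masks, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Hypothetical helper for illustration; not taken from the study.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred, truth).sum()
    return 2.0 * overlap / denom


# Toy example (made-up masks, not study data).
truth = np.zeros((4, 4, 4), dtype=bool)
truth[1:3, 1:3, 1:3] = True          # 8 "lesion" voxels
pred = np.zeros_like(truth)
pred[1:3, 1:3, 1:4] = True           # 12 predicted voxels, 8 of them overlapping

score = dice_score(pred, truth)      # 2 * 8 / (8 + 12) = 0.8
print(f"Dice = {score:.2f}")
```

A score of 1 indicates perfect overlap and 0 indicates none, so the reported 0.59 sits between these extremes.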
An ancillary finding of this study was that there was no statistically significant difference between using one MRI modality for training and another for testing.
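The significance testing described in the abstract can be illustrated with a small, self-contained sketch of an exact two-sided Mann-Whitney U-test, which suits the small sample sizes implied by five to ten repeats per permutation. This is an assumption-laden illustration rather than the authors' procedure: the function names are hypothetical, the exact p-value is obtained by enumerating every relabelling of the pooled values, and the Dice scores below are invented placeholders, not results from the study.

```python
from itertools import combinations


def u_statistic(x, y):
    """U for sample x: number of pairs (xi, yj) with xi > yj; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def mann_whitney_exact(x, y):
    """Return (U, exact two-sided p-value) by enumerating all relabellings.

    Hypothetical helper; feasible only for small samples.
    """
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    mean_u = n * m / 2.0                      # expected U under the null
    u_obs = u_statistic(x, y)
    observed = abs(u_obs - mean_u)
    extreme = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(n + m) if i not in chosen]
        # Count relabellings at least as far from the null mean as observed.
        if abs(u_statistic(gx, gy) - mean_u) >= observed - 1e-12:
            extreme += 1
        total += 1
    return u_obs, extreme / total


# Invented Dice scores from five repeats each (placeholders, NOT study data).
control = [0.55, 0.56, 0.57, 0.565, 0.575]
augmented = [0.58, 0.59, 0.60, 0.585, 0.595]

u, p = mann_whitney_exact(augmented, control)
print(f"U = {u}, p = {p:.4f}")
```

With five repeats per group, the smallest attainable two-sided p-value is 2/252, which is one reason the number of repeats matters when a rank-based test is used.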