
Reliability of a convolutional neural network in segmenting multiple sclerosis lesions from MRI: Impact of data augmentation, image modality and tolerance with U-Net architecture.

April 1, 2026 | PubMed

Authors

Szekely-Kohn AC, Castellani M, Baronti L, Ahmed Z, Manifold WGK, Douglas M, Espino DM

Affiliations (7)

  • School of Engineering, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
  • School of Computer Science, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
  • Institute of Inflammation and Ageing, University of Birmingham, Edgbaston, Birmingham, United Kingdom.
  • University Hospitals Birmingham NHS Foundation Trust, Edgbaston, Birmingham, United Kingdom.
  • Royal North Shore Hospital, St Leonards, Sydney, New South Wales, Australia.
  • School of Neurology, Dudley Group NHS Foundation Trust, Russells Hall Hospital, Birmingham, United Kingdom.
  • School of Life and Health Sciences, Aston University, Birmingham, United Kingdom.

Abstract

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system that typically produces radiologically identifiable lesions within the brain and spinal cord; these lesions are key to both diagnosis and clinical disease monitoring. Manually identifying and segmenting lesions is difficult and time-consuming, so approaches that reliably automate segmentation are highly beneficial. The aim of this study was to assess the impact of data augmentation and manipulation on the accuracy of automated lesion segmentation using MRI scans from MS patients. The factors examined were MRI modalities (both in isolation and in combination), image augmentation, lesion size, and the size of the testing set relative to the training set. The MICCAI 2016 MS dataset was used, with U-Net as the segmentation architecture. Each factor was optimised, and the factors were then combined to maximise segmentation accuracy; the Dice score was the primary metric for assessing the efficacy of any given permutation of the setup. Statistical significance was assessed using the Mann-Whitney U-test, with each permutation repeated five to ten times to ensure robustness. The best Dice score achieved using the training and testing datasets outlined in the MICCAI 2016 challenge rubric was 0.59, approximately a 2% improvement against controls. Adhering to the training and testing distribution defined in the dataset publication, the optimal imaging sequence for achieving this result was proton density. The augmentation conditions were an additional rotation of the dataset (doubling it in size) and the exclusion of lesions < 36.43 mm³ in volume. The impact of data manipulation and augmentation on lesion segmentation was statistically significant against controls (Mann-Whitney U-test).
An ancillary finding of this study was that there was no statistically significant difference between using one MRI modality for training and another for testing.
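
For readers unfamiliar with the Dice metric reported above, the following is a minimal sketch of how it is commonly computed for a pair of binary lesion masks. It is not taken from the study: the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks (nonzero = lesion voxel).

    Dice = 2 * |pred AND truth| / (|pred| + |truth|)
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / total)


# Hypothetical toy masks: 4 lesion voxels each, 3 of them shared.
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 0]])
pred = np.array([[0, 1, 1, 0],
                 [0, 0, 1, 1],
                 [0, 0, 0, 0]])

score = dice_score(pred, truth)
print(f"Dice = {score:.2f}")  # 2 * 3 / (4 + 4) = 0.75
```

A score of 1 indicates perfect overlap and 0 indicates none, so the reported 0.59 reflects partial agreement between predicted and reference lesion masks.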

Topics

Journal Article
