A Minimal Annotation Pipeline for Deep Learning Segmentation of Skeletal Muscles.
Authors
Affiliations (3)
- Institute of Myology, Neuromuscular Investigation Center, NMR Laboratory, Paris, France.
- Support Center for Advanced Neuroimaging (SCAN), Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland.
- Department of Neurology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland.
Abstract
Translating quantitative skeletal muscle MRI biomarkers into clinical practice requires efficient automatic segmentation methods. The purpose of this work is to investigate a simple yet effective iterative methodology for building a high-quality automatic segmentation model while minimizing the manual annotation effort. We used a retrospective database of quantitative MRI examinations (n = 70) of healthy and pathological thighs to train an nnU-Net segmentation model. The database included healthy volunteers and patients with various neuromuscular diseases (NMDs), broadly categorized as dystrophic, inflammatory, neurogenic, and unlabeled NMDs. We designed an iterative procedure that progressively added cases to the training set and used a simple visual five-level rating scale to judge the validity of the generated segmentations for clinical use. On an independent test set (n = 20), we assessed the quality of the segmentation in 13 individual thigh muscles using standard segmentation metrics, namely the Dice coefficient (DICE) and the 95% Hausdorff distance (HD95), and quantitative biomarkers, namely cross-sectional area (CSA), fat fraction (FF), and water-T1/T2. We obtained high-quality segmentations (DICE = 0.88 ± 0.15/0.86 ± 0.14, HD95 = 6.35 ± 12.33/6.74 ± 11.57 mm), comparable to recent works despite a smaller training set (n = 30). Inter-rater agreement on the five-level scale was fair to moderate, but the ratings showed progressive improvement of the segmentation model over the iterations. We observed limited differences from manually delineated segmentations on the quantitative outcomes (MAD: CSA = 65.2 mm², FF = 1%, water-T1 = 8.4 ms, water-T2 = 0.35 ms), with variability comparable to that of manual delineations.
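As an illustration of two of the quantities named in the abstract, the following minimal sketch computes a Dice coefficient between two binary masks and a cross-sectional area from a single-slice mask. It is not the authors' evaluation code: the function names, the toy masks, the 1.5 mm pixel spacing, and the convention that two empty masks score 1.0 are all assumptions made for the example.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice overlap between two binary masks of identical shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must have the same shape")
    total = pred.sum() + ref.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / total


def cross_sectional_area(mask_2d, pixel_spacing_mm):
    """Area in mm^2 covered by a binary mask on one slice."""
    dy, dx = pixel_spacing_mm
    return float(np.asarray(mask_2d, dtype=bool).sum() * dy * dx)


# Toy masks standing in for one muscle on one slice (hypothetical data).
ref = np.zeros((10, 10), dtype=bool)
ref[2:6, 2:6] = True    # manual delineation: 16 pixels
pred = np.zeros((10, 10), dtype=bool)
pred[3:7, 2:7] = True   # automatic segmentation: 20 pixels, 12 shared with ref

spacing = (1.5, 1.5)    # assumed in-plane pixel spacing in mm
dice = dice_coefficient(pred, ref)
csa_abs_diff = abs(
    cross_sectional_area(pred, spacing) - cross_sectional_area(ref, spacing)
)
print(f"DICE = {dice:.3f}, |CSA difference| = {csa_abs_diff:.1f} mm^2")
```

In practice such values would be computed per muscle and per subject and then summarized across the test set. HD95 additionally requires surface distances between the two masks and is omitted here.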