AI in motion: the impact of data augmentation strategies on mitigating MRI motion artifacts.
Authors
Affiliations (2)
- Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany. [email protected].
- Department for Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany.
Abstract
Artifacts in clinical MRI can compromise the performance of AI models. This study evaluates how different data augmentation strategies affect an AI model's segmentation performance under variable artifact severity.

We used an AI model based on the nnU-Net architecture to automatically quantify lower limb alignment using axial T2-weighted MR images. Three versions of the AI model were trained with different augmentation strategies: (1) no augmentation ("baseline"), (2) standard nnU-Net augmentations ("default"), and (3) "default" plus augmentations that emulate MR artifacts ("MRI-specific"). Model performance was tested on 600 MR image stacks (right and left; hip, knee, and ankle) from 20 healthy participants (mean age, 23 ± 3 years; 17 men), each imaged five times under standardized motion to induce artifacts. Two radiologists graded each stack's artifact severity as none, mild, moderate, or severe and manually measured torsional angles. Segmentation quality was assessed using the Dice similarity coefficient (DSC), while torsional angles were compared between manual and automatic measurements using mean absolute deviation (MAD), intraclass correlation coefficient (ICC), and Pearson's correlation coefficient (r). Statistical analysis included parametric tests and a linear mixed-effects model.

MRI-specific augmentation resulted in slightly (yet not significantly) better performance than the default strategy. Segmentation quality decreased with increasing artifact severity, which was partially mitigated by default and MRI-specific augmentations (e.g., severe artifacts, proximal femur: DSC<sub>baseline</sub> = 0.58 ± 0.22; DSC<sub>default</sub> = 0.72 ± 0.22; DSC<sub>MRI-specific</sub> = 0.79 ± 0.14 [p < 0.001]). These augmentations also maintained precise torsional angle measurements (e.g., severe artifacts, femoral torsion: MAD<sub>baseline</sub> = 20.6 ± 23.5°; MAD<sub>default</sub> = 7.0 ± 13.0°; MAD<sub>MRI-specific</sub> = 5.7 ± 9.5° [p < 0.001]; ICC<sub>baseline</sub> = -0.10 [p = 0.63; 95% CI: -0.61 to 0.47]; ICC<sub>default</sub> = 0.38 [p = 0.08; -0.17 to 0.76]; ICC<sub>MRI-specific</sub> = 0.86 [p < 0.001; 0.62 to 0.95]; r<sub>baseline</sub> = 0.58 [p < 0.001; 0.44 to 0.69]; r<sub>default</sub> = 0.68 [p < 0.001; 0.56 to 0.77]; r<sub>MRI-specific</sub> = 0.86 [p < 0.001; 0.81 to 0.9]).

Motion artifacts negatively impact AI models, but general-purpose augmentations enhance robustness effectively. MRI-specific augmentations offer minimal additional benefit.

Question: Motion artifacts negatively impact the performance of diagnostic AI models for MRI, but mitigation methods remain largely unexplored.
Findings: Domain-specific augmentation during training can improve the robustness and performance of a model for quantifying lower limb alignment in the presence of severe artifacts.
Clinical relevance: Excellent robustness and accuracy are crucial for deploying diagnostic AI models in clinical practice. Including domain knowledge in model training can benefit clinical adoption.
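
For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how a Dice similarity coefficient and a mean absolute deviation between manual and automatic angle measurements could be computed. It is not the study's code: the function names `dice_coefficient` and `mean_absolute_deviation`, the toy masks and angles, the convention of returning 1.0 for two empty masks, and the use of the sample standard deviation (`ddof=1`) are all assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice similarity coefficient between two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    overlap = np.logical_and(pred, ref).sum()
    return 2.0 * overlap / denom


def mean_absolute_deviation(manual, automatic):
    """Mean and sample SD of |manual - automatic| in degrees (hypothetical helper)."""
    manual = np.asarray(manual, dtype=float)
    automatic = np.asarray(automatic, dtype=float)
    abs_diff = np.abs(manual - automatic)
    # Assumption: the "±" term is the sample standard deviation (ddof=1).
    return abs_diff.mean(), abs_diff.std(ddof=1)


# Toy data, invented for illustration only.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
ref_mask = [[1, 0, 0],
            [0, 1, 1]]
dsc = dice_coefficient(pred_mask, ref_mask)

manual_angles = [10.0, 20.0, 30.0]     # degrees, e.g. radiologist readings
automatic_angles = [12.0, 17.0, 30.0]  # degrees, e.g. model-derived values
mad_mean, mad_sd = mean_absolute_deviation(manual_angles, automatic_angles)

print(f"DSC = {dsc:.2f}")
print(f"MAD = {mad_mean:.1f} ± {mad_sd:.1f}°")
```

Both metrics are reported in the abstract as mean ± spread, so the sketch returns the spread alongside the mean for MAD; a per-stack DSC would be averaged across stacks in the same way.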