The impact of U-Net architecture choices and skip connections on the robustness of segmentation across texture variations.
Kamath A, Willmann J, Andratschke N, Reyes M
Sep 12 2025
Since its introduction in 2015, the U-Net architecture has become popular for medical image segmentation. U-Net is known for its "skip connections," which transfer image details directly to its decoder branch at various levels. However, it is unclear how these skip connections affect the model's performance when the texture of input images varies. To explore this, we tested six U-Net-like architectures in three groups: Standard (U-Net and V-Net), No-Skip (U-Net and V-Net without skip connections), and Enhanced (AGU-Net and UNet++, which have extra skip connections). Because convolutional neural networks (CNNs) are known to be sensitive to texture, we defined a novel texture disparity (TD) metric and ran experiments with synthetic images while varying this measure. We then applied these findings to four real medical imaging datasets, covering different anatomies (breast, colon, heart, and spleen) and imaging types (ultrasound, histology, MRI, and CT). The goal was to understand how the choice of architecture impacts the model's ability to handle varying TD between foreground and background. For each dataset, we tested the models with five categories of TD, measuring their performance using the Dice Score Coefficient (DSC), Hausdorff distance, surface distance, and surface DSC. Our results on synthetic data with varying textures show differences between the performance of architectures with and without skip connections, especially when the models are trained under hard textural conditions. On medical data, the results indicate that training datasets with a narrow texture range negatively impact the robustness of architectures that include more skip connections. The robustness gap between architectures narrows when models are trained on a larger TD range. In the harder TD categories, models from the No-Skip group performed the best in 5/8 cases (based on DSC) and 7/8 cases (based on Hausdorff distances).
When measuring robustness using the coefficient of variation of the DSC, the No-Skip group performed the best in 7 out of 16 cases, ahead of the Enhanced (6/16) and Standard (3/16) groups. These findings suggest that skip connections offer performance benefits, usually at the expense of robustness, depending on the degree of texture disparity between the foreground and background and on the range of texture variations present in the training set. This calls for careful evaluation of their use in robustness-critical tasks like medical image segmentation. Combinations of texture-aware architectures must be investigated to achieve better performance-robustness characteristics.
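As an illustration of the robustness measure mentioned above, the following is a minimal sketch of how a DSC and its coefficient of variation across TD categories could be computed. It assumes binary masks and the population standard deviation; the abstract does not state which estimator was used. The function names and the example scores are hypothetical and are not results from the study.

```python
import numpy as np


def dice(pred, target):
    """Dice coefficient between two binary masks (standard definition)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (an assumption).
        return 1.0
    return float(2.0 * intersection / denom)


def coefficient_of_variation(scores):
    """Standard deviation divided by mean; lower means more stable scores.

    Uses the population standard deviation (ddof=0), which is an assumption.
    """
    scores = np.asarray(scores, dtype=float)
    return float(scores.std() / scores.mean())


# Hypothetical per-category DSC values for one model over five TD categories.
example_dsc = [0.90, 0.88, 0.85, 0.80, 0.70]
cv = coefficient_of_variation(example_dsc)
```

Under this reading, a model whose DSC changes little across TD categories has a lower coefficient of variation and would be ranked as more robust, independently of its mean DSC.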