Multi-encoder U-Net benchmarking for LiTS17 Liver-Tumor segmentation: accuracy-efficiency trade-offs across training durations.
Authors
Affiliations (4)
- Faculty of Mathematics and Computer Science, Quanzhou Normal University, Quanzhou, China.
- Fujian Provincial Key Laboratory of Data Intensive Computing, Quanzhou, China.
- Key Laboratory of Intelligent Computing and Information Processing, Fujian Province University, Quanzhou, China.
- Department of Diagnostic Radiology, Huaqiao University Affiliated Strait Hospital, Quanzhou, Fujian, China.
Abstract
Accurate liver and tumor segmentation from CT is fundamental for diagnosis, treatment planning, and longitudinal monitoring of liver cancer. Although U-Net variants with popular encoder backbones are widely used, the coupled effects of encoder selection, training duration, and computational cost have not been evaluated under a standardized protocol, nor have these models been compared consistently against volumetric architectures such as V-Net. We propose a unified benchmarking framework that evaluates a family of multi-encoder 2D U-Net models together with an optional 3D V-Net baseline under the same preprocessing, input construction, and 3-fold cross-validation protocol on LiTS17. Six backbones (VGG16, VGG19, ResNet34, ResNet50, ResNet101, and MobileNetV2) are assessed under 15-, 50-, and 100-epoch schedules. Performance is reported using overlap and detection metrics (Dice, IoU, precision, recall) alongside efficiency indicators (training time and, when available, model complexity) to characterize the accuracy-efficiency trade-off. The results show that liver segmentation rapidly reaches near-ceiling performance across models, whereas tumor segmentation benefits markedly from longer training and stronger encoders, especially for small or low-contrast lesions. Overall, the study provides a reproducible protocol and practical guidance for selecting segmentation models that balance accuracy, robustness, and deployment cost.
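For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how Dice, IoU, precision, and recall are conventionally computed from a predicted and a reference binary mask. It is not the authors' evaluation code: the function name `overlap_metrics`, the flattened 0/1 input format, and the `eps` smoothing term (which makes two empty masks score 1.0) are assumptions made for illustration.

```python
from typing import Dict, Sequence


def overlap_metrics(
    pred: Sequence[int], target: Sequence[int], eps: float = 1e-7
) -> Dict[str, float]:
    """Compute Dice, IoU, precision, and recall for two flattened binary masks.

    `pred` and `target` hold one 0/1 value per pixel (or voxel).
    `eps` is a hypothetical smoothing constant that avoids division by zero;
    with it, two all-background masks score 1.0 on every metric.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of elements")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)

    return {
        "dice": (2 * tp + eps) / (2 * tp + fp + fn + eps),
        "iou": (tp + eps) / (tp + fp + fn + eps),
        "precision": (tp + eps) / (tp + fp + eps),
        "recall": (tp + eps) / (tp + fn + eps),
    }


if __name__ == "__main__":
    # Toy example: 2 true positives, 1 false positive, 1 false negative.
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    for name, value in overlap_metrics(predicted, reference).items():
        print(f"{name}: {value:.3f}")
```

In the toy example, Dice is 4/6 and IoU is 2/4, which illustrates why Dice is never lower than IoU for the same pair of masks. In a benchmark of this kind the function would typically be applied separately to the liver and tumor labels.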