Volume Fusion-based Self-Supervised Pretraining for 3D Medical Image Segmentation.

Authors

Wang G, Fu J, Wu J, Luo X, Zhou Y, Liu X, Li K, Lin J, Shen B, Zhang S

Abstract

The performance of deep learning models for medical image segmentation is often limited in scenarios where training data or annotations are scarce. Self-Supervised Learning (SSL) is an appealing solution to this dilemma because it can learn features from large amounts of unannotated images. Existing SSL methods have focused on pretraining either an encoder for global feature representation or an encoder-decoder structure for image restoration, where the gap between the pretext and downstream tasks limits the usefulness of pretrained decoders in downstream segmentation. In this work, we propose a novel SSL strategy named Volume Fusion (VolF) for pretraining 3D segmentation models. It minimizes the gap between pretext and downstream tasks by introducing a pseudo-segmentation pretext task: two sub-volumes are fused according to a discretized block-wise fusion coefficient map, and the model takes the fused result as input and predicts the fusion-coefficient category of each voxel. This can be trained with standard supervised segmentation loss functions without manual annotations. Experiments with an abdominal CT dataset for pretraining and both in-domain and out-of-domain downstream datasets showed that VolF yielded a large performance gain over training from scratch, with faster convergence, and outperformed several state-of-the-art SSL methods. In addition, it is applicable to different network structures, and the learned features generalize well to different body parts and modalities.
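The pseudo-segmentation pretext task described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the block size, the number of fusion-coefficient categories, and the uniform-random sampling of block categories are illustrative choices, and the coefficient for category k is assumed to be k/(K-1) so that the extreme categories reproduce each sub-volume exactly.

```python
import numpy as np

def volume_fusion(v0, v1, block_size=8, num_classes=5, rng=None):
    """Fuse two equally-shaped sub-volumes with a discretized block-wise
    coefficient map, returning the fused volume and a per-voxel category
    map usable as a pseudo-segmentation target.

    Note: block_size and num_classes are illustrative, not the paper's
    exact settings.
    """
    rng = np.random.default_rng() if rng is None else rng
    assert v0.shape == v1.shape, "sub-volumes must have the same shape"
    # Sample one discrete fusion category per block.
    blocks = [int(np.ceil(s / block_size)) for s in v0.shape]
    cat = rng.integers(0, num_classes, size=blocks)
    # Upsample block categories to voxel resolution by repetition, then crop.
    label = cat.repeat(block_size, 0).repeat(block_size, 1).repeat(block_size, 2)
    label = label[: v0.shape[0], : v0.shape[1], : v0.shape[2]]
    # Map category k to fusion coefficient alpha = k / (num_classes - 1),
    # so category 0 keeps v0 and the top category keeps v1.
    alpha = label / (num_classes - 1)
    fused = (1.0 - alpha) * v0 + alpha * v1
    return fused, label
```

A segmentation network would then be trained to predict `label` from `fused` with an ordinary supervised loss (e.g. cross-entropy plus Dice), requiring no manual annotations.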

Topics

Journal Article
