Slim UNETR++: A lightweight 3D medical image segmentation network for medical image analysis.
Affiliations (4)
- School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin, 300384, China.
- School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin, 300384, China. [email protected].
- Department of Neurosurgery, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua Medicine, Tsinghua University, Beijing, 102218, China. [email protected].
- Department of Electrical and Smart Systems Engineering, University of South Africa, Florida, South Africa.
Abstract
Convolutional neural network (CNN) models such as U-Net, V-Net, and DeepLab have achieved remarkable results across various medical imaging modalities, including ultrasound. Hybrid Transformer-based segmentation methods have likewise shown great potential in medical image analysis. Despite their breakthroughs in feature extraction through self-attention mechanisms, these methods are computationally intensive, especially for three-dimensional medical imaging, posing significant challenges to graphics processing unit (GPU) hardware. Consequently, the demand for lightweight models is increasing. To address this issue, we designed a high-accuracy yet lightweight model that combines the strengths of CNNs and Transformers. We introduce Slim UNEt TRansformers++ (Slim UNETR++), which builds upon Slim UNETR by incorporating Medical ConvNeXt (MedNeXt), Spatial-Channel Attention (SCA), and Efficient Paired-Attention (EPA) modules. This integration leverages the advantages of both CNN and Transformer architectures to enhance model accuracy. The core component of Slim UNETR++ is the Slim UNETR++ block, which facilitates efficient information exchange through a sparse self-attention mechanism and low-cost representation aggregation. We also introduce throughput as a performance metric to quantify data processing speed. Experimental results demonstrate that Slim UNETR++ outperforms other models in terms of accuracy and model size. On the BraTS2021 dataset, Slim UNETR++ achieved a Dice score of 93.12% and a 95% Hausdorff distance (HD95) of 4.23 mm, significantly surpassing mainstream methods such as Swin UNETR.
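Because the abstract introduces throughput as an evaluation metric, the sketch below illustrates one common way such a metric can be measured for a 3D segmentation network: volumes processed per second under repeated inference. It is a hedged illustration under stated assumptions (PyTorch, a trivial stand-in model, and a BraTS-like input shape of 4 modalities at 128³ voxels), not the paper's benchmarking code.

```python
# Minimal throughput-measurement sketch (volumes processed per second).
# Assumptions, not the authors' code: PyTorch, a placeholder 3D model, and a
# BraTS-like input shape of 4 modalities at 128x128x128 voxels.
import time
import torch
import torch.nn as nn

def measure_throughput(model: nn.Module, batch_size: int = 1,
                       in_shape=(4, 128, 128, 128), iters: int = 50) -> float:
    """Return the number of 3D volumes the model processes per second."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = torch.randn(batch_size, *in_shape, device=device)
    with torch.no_grad():
        for _ in range(10):                      # warm-up passes
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()             # wait for queued GPU work
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed

if __name__ == "__main__":
    # Trivial stand-in network; the paper's model would be Slim UNETR++.
    dummy = nn.Conv3d(in_channels=4, out_channels=3, kernel_size=3, padding=1)
    print(f"Throughput: {measure_throughput(dummy):.2f} volumes/s")
```

The warm-up passes and explicit CUDA synchronization are included so the reported number reflects steady-state inference speed rather than kernel-launch or initialization overhead.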