A novel recursive transformer-based U-Net architecture for enhanced multi-scale medical image segmentation.

Authors

Li S, Liu X, Fu M, Khelifi F

Affiliations (4)

  • College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao, 266061, China.
  • College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao, 266061, China. Electronic address: [email protected].
  • College of Electronic Engineering/Sanya Oceanographic Institution, Ocean University of China, Qingdao, Sanya, 266100, China.
  • Computer Information Sciences Department, Northumbria University, Newcastle, UK.

Abstract

Automatic medical image segmentation techniques are vital for assisting clinicians in making accurate diagnoses and treatment plans. Although the U-shaped network (U-Net) has been widely adopted in medical image analysis, it still struggles to capture long-range dependencies, particularly in complex, textured medical images where anatomical structures often blend into the surrounding background. To address these limitations, this work proposes the recursive transformer-based U-Net (ReT-UNet), a novel network architecture that integrates recursive feature learning with transformer technology. One of the key innovations of ReT-UNet is the multi-scale global feature fusion (Multi-GF) module, inspired by transformer models and multi-scale pooling mechanisms. This module captures long-range dependencies, enhancing the abstraction and contextual understanding of multi-level features. Additionally, a recursive feature accumulation block iteratively updates features across layers, improving the network's ability to model spatial correlations and represent deep features in medical images. To improve sensitivity to local details, a lightweight atrous spatial pyramid pooling (ASPP) module is appended after the Multi-GF module. Furthermore, the segmentation head is redesigned to emphasize feature aggregation and fusion. During the encoding phase, a hybrid pooling layer is employed to ensure comprehensive feature sampling, enabling a broader range of feature representation and improving the learning of detailed information.

Results: The proposed method was evaluated through ablation experiments, which showed generally consistent performance across multiple trials. When applied to cardiac, pulmonary nodule, and polyp segmentation datasets, the method reduced the number of mis-segmented regions. The experimental results suggest that the approach can improve segmentation accuracy and stability compared to competing state-of-the-art methods.
Experimental findings highlight the superiority of the proposed ReT-UNet over related methods and demonstrate its potential for applications in medical image segmentation.
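
Two of the building blocks named in the abstract, hybrid pooling and atrous (dilated) convolution, can be illustrated with a minimal pure-Python sketch. The abstract does not define either layer precisely, so everything here is an assumption made for illustration: hybrid pooling is taken to be a weighted blend of max pooling and average pooling, and the atrous operation is shown in one dimension only. The names hybrid_pool2d, atrous_conv1d, alpha, and rate are hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]


def hybrid_pool2d(x: Grid, k: int = 2, alpha: float = 0.5) -> Grid:
    """Blend max and average pooling over non-overlapping k x k windows.

    ASSUMPTION: this weighted blend is one plausible reading of "hybrid
    pooling"; the paper's actual layer may differ.
    alpha = 1.0 gives pure max pooling, alpha = 0.0 pure average pooling.
    """
    rows, cols = len(x), len(x[0])
    out: Grid = []
    for r in range(0, rows - k + 1, k):
        out_row = []
        for c in range(0, cols - k + 1, k):
            window = [x[r + i][c + j] for i in range(k) for j in range(k)]
            avg = sum(window) / len(window)
            out_row.append(alpha * max(window) + (1 - alpha) * avg)
        out.append(out_row)
    return out


def atrous_conv1d(signal: List[float], kernel: List[float], rate: int = 1) -> List[float]:
    """Valid 1-D dilated cross-correlation.

    Kernel taps are spaced `rate` samples apart, so a larger rate widens
    the receptive field without adding weights. An ASPP module runs
    several such rates in parallel; that parallel structure is not shown.
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    return [
        sum(w * signal[i + j * rate] for j, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = hybrid_pool2d(feature_map)  # 4 x 4 input reduced to 2 x 2
dilated = atrous_conv1d([1, 2, 3, 4, 5, 6, 7], [1, 0, -1], rate=2)
```

With rate=2 the three-tap kernel covers five input samples instead of three, which is the property that lets atrous convolutions enlarge context at no extra parameter cost.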

Topics

Journal Article
