DPCrossU-Net: a dual-branch parallel CNN-Transformer network for lung nodule segmentation.
Authors
Affiliations (3)
- School of Mathematics and Statistics, Hainan Normal University, Haikou, China.
- School of Mathematics and Systems Science, Guangdong Polytechnic Normal University, Guangzhou, China.
- Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada.
Abstract
Accurate segmentation of lung nodules in CT images is essential for early lung cancer screening and computer-aided diagnosis, yet it remains challenging because nodules are small, their boundaries are complex, and existing convolutional or Transformer-based architectures struggle to balance local detail against global context modeling. We propose DPCrossU-Net, a dual-branch parallel encoder-decoder network that integrates convolutional and Vision Transformer (ViT) representations. The encoder runs CNN and ViT branches in parallel and uses a Cross-Attentive Fusion (CAF) module to adaptively combine local texture features with global semantic features. Multi-scale atrous convolutions at the bottleneck enhance sensitivity to small nodules, while a dual-branch Detail Context Fusion (DCF) block in the decoder improves boundary reconstruction. Experiments on the public LIDC-IDRI dataset show that DPCrossU-Net achieves a Dice score of 85.89%, outperforming the baseline U-Net, particularly on small nodules and against complex backgrounds. These results indicate that combining parallel CNN-Transformer feature extraction with adaptive cross-branch fusion effectively enhances lung nodule segmentation. DPCrossU-Net provides a robust and clinically applicable solution, offering improved accuracy for early lung cancer analysis and potential support for future intelligent diagnostic systems.
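For readers unfamiliar with the reported metric, the Dice score between a predicted mask P and a ground-truth mask G is conventionally defined as 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch of that standard definition in plain Python, not the authors' evaluation code. The function name `dice_score`, the nested-list mask format, and the convention that two empty masks score 1.0 are illustrative assumptions; the abstract does not say how predictions are thresholded or how scores are averaged across nodules or scans.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Flatten the 2D masks into 1D sequences of pixel labels.
    p = [int(v) for row in pred for v in row]
    t = [int(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels labelled 1 in both masks form the intersection.
    intersection = sum(a * b for a, b in zip(p, t))
    total = sum(p) + sum(t)

    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 3x3 example: each mask has 3 foreground pixels, 2 of them shared.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
print(round(dice_score(pred, target), 4))  # 2*2 / (3+3) = 0.6667
```

Because the denominator is the summed size of the two masks, a fixed number of misclassified boundary pixels lowers the score far more for a small nodule than for a large one, which is one reason small targets are hard to score well on this metric.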