BEGA-UNet: Boundary-Explicit Guided Attention U-Net with Multi-Scale Feature Aggregation for Colonoscopic Polyp Segmentation
Authors
Affiliations (1)
- FAU
Abstract
Accurate polyp segmentation from colonoscopy images is critical for colorectal cancer prevention, yet the generalization of deep learning models under domain shift remains insufficiently explored. We propose Boundary-Explicit Guided Attention U-Net (BEGA-UNet), a boundary-aware segmentation architecture that introduces explicit edge modeling as a structural inductive bias to improve both segmentation accuracy and cross-domain robustness. The framework integrates three components: an Edge-Guided Module (EGM) with learnable Sobel-initialized operators to capture boundary cues, a Dual-Path Attention (DPA) module that applies channel and spatial attention in parallel, and a Multi-Scale Feature Aggregation (MSFA) module that encodes contextual information across multiple receptive fields. Evaluated on the combined Kvasir-SEG and CVC-ClinicDB benchmarks, BEGA-UNet achieves 88.53% Dice and 82.51% IoU, outperforming representative convolutional and transformer-based baselines. More importantly, cross-dataset evaluation demonstrates strong robustness under domain shift: BEGA-UNet retains 83.2% of its in-distribution performance, substantially more than U-Net (64.5%), Attention U-Net (47.5%), and TransUNet (53.1%). In a zero-shot setting on an entirely unseen dataset, the model still retains 72.6% of its in-distribution performance. Comprehensive ablation studies indicate that explicit boundary modeling plays a central role in improving generalization, while multi-scale context aggregation further stabilizes performance across domains. Feature distribution analyses support this observation, showing that edge-oriented representations exhibit markedly lower cross-domain variability than appearance-driven features. Overall, BEGA-UNet provides an effective and interpretable solution for robust polyp segmentation, and the results suggest that explicit boundary modeling is a critical inductive bias for reliability under clinical domain shift.
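The metrics quoted above can be illustrated with a minimal sketch. It computes Dice and IoU for a pair of binary masks and a retention percentage. The abstract does not define retention, so the ratio of the cross-domain score to the in-distribution score is an assumption here. The function names `dice_and_iou` and `retention` and the toy masks are hypothetical and are not taken from the paper.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two binary masks given as equal-shaped nested lists of 0/1."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    inter = sum(a & b for a, b in zip(p, t))  # pixels set in both masks
    total = sum(p) + sum(t)                   # |pred| + |target|
    union = total - inter                     # pixels set in either mask
    # Assumed convention: two empty masks count as a perfect match.
    dice = 2 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


def retention(cross_domain_score, in_distribution_score):
    """Assumed definition: percentage of the in-distribution score kept under shift."""
    return 100.0 * cross_domain_score / in_distribution_score


# Toy 2x3 masks, purely illustrative (not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(f"Dice={dice:.3f} IoU={iou:.3f}")
print(f"retention={retention(0.6, 0.8):.1f}%")
```

For a single pair of masks, Dice and IoU are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Averages over a dataset need not satisfy this identity exactly.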