Multi-scale fusion semantic enhancement network for medical image segmentation.

Authors

Zhang Z,Xu C,Li Z,Chen Y,Nie C

Affiliations (4)

  • School of Integrated Circuits, Anhui University, HeFei, 230601, China.
  • Anhui Engineering Laboratory of Agro-Ecological Big Data, HeFei, 230601, China.
  • School of Integrated Circuits, Anhui University, HeFei, 230601, China. [email protected].
  • Anhui Engineering Laboratory of Agro-Ecological Big Data, HeFei, 230601, China. [email protected].

Abstract

Applying advanced computer vision techniques to medical image segmentation (MIS) plays a vital role in clinical diagnosis and treatment. Although Transformer-based models capture global context effectively, they often handle local feature dependencies poorly. To address this problem, we design a Multi-scale Fusion and Semantic Enhancement Network (MFSE-Net) for endoscopic image segmentation that captures global information while enhancing fine detail. MFSE-Net uses a dual-encoder architecture, with PVTv2 as the primary encoder to capture global features and a CNN-based secondary encoder to capture local details. The primary encoder includes the LGDA (Large-kernel Grouped Deformable Attention) module, which filters noise and strengthens semantic extraction from the four hierarchical features. The secondary encoder uses the MLCF (Multi-Layered Cross-attention Fusion) module to integrate high-level semantic information from the deep CNN layers with fine spatial details from the shallow layers, improving boundary precision and localization. On the decoder side, we introduce the PSE (Parallel Semantic Enhancement) module, which embeds the boundary and position information from the secondary encoder into the output features of the backbone network. In the multi-scale decoding process, we also add a SAM (Scale Aware Module) to recover global semantic information and compensate for the loss of boundary details. Extensive experiments show that MFSE-Net outperforms state-of-the-art (SOTA) methods on the renal tumor and polyp datasets.
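
The "four hierarchical features" of the primary encoder can be made concrete with a little shape bookkeeping. The sketch below is illustrative only and is not the authors' code: it assumes the strides (4, 8, 16, 32) and channel widths (64, 128, 320, 512) of the PVTv2-B2 configuration, whereas the abstract does not say which PVTv2 variant is used. The names `FeatureMap` and `primary_encoder_shapes` and the 352 x 352 input size are hypothetical.

```python
from dataclasses import dataclass

# Assumption: PVTv2-B2 stage strides and channel widths. The abstract names
# PVTv2 but not the variant, so these numbers are illustrative.
STRIDES = (4, 8, 16, 32)
CHANNELS = (64, 128, 320, 512)


@dataclass(frozen=True)
class FeatureMap:
    """Shape of one encoder stage output (hypothetical helper type)."""
    name: str
    channels: int
    height: int
    width: int


def primary_encoder_shapes(height: int, width: int) -> list[FeatureMap]:
    """Return the shapes of the four hierarchical features for an input image.

    Each stage downsamples the input by its stride, so the pyramid runs from
    high-resolution, low-channel maps to low-resolution, high-channel maps.
    """
    coarsest = STRIDES[-1]
    if height % coarsest or width % coarsest:
        raise ValueError(f"input size must be divisible by {coarsest}")
    return [
        FeatureMap(f"stage{i + 1}", c, height // s, width // s)
        for i, (s, c) in enumerate(zip(STRIDES, CHANNELS))
    ]


if __name__ == "__main__":
    # 352 x 352 is an assumed input size, not one stated in the abstract.
    for fm in primary_encoder_shapes(352, 352):
        print(fm.name, fm.channels, fm.height, fm.width)
```

Under these assumptions, the four maps are the multi-scale inputs that modules such as LGDA would refine and that the decoder would fuse with the boundary and position cues from the secondary encoder.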

Topics

  • Image Processing, Computer-Assisted
  • Neural Networks, Computer
  • Journal Article
