
AttCo: Attention-based co-Learning fusion of deep feature representation for medical image segmentation using multimodality.

January 8, 2026

Authors

Dao DP, Yang HJ, Kim SH, Kang SR

Affiliations (4)

  • Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju, South Korea.
  • Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju, South Korea.
  • Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju, South Korea.
  • Department of Nuclear Medicine, Chonnam National University Hwasun Hospital, Gwangju, South Korea.

Abstract

Accurate tissue segmentation is crucial for advancing healthcare, particularly in disease prediction and treatment planning. Precisely identifying abnormal tissue locations is a critical step in clinical analysis. While medical image segmentation increasingly uses multimodal and three-dimensional (3D) information to capture spatial relationships, current methods often struggle to learn complementary information from multiple inputs, especially with complex 3D structures. In this study, we introduce AttCo, a novel multimodal semantic segmentation network built upon an attention-based co-learning fusion of deep feature representations. AttCo first employs multiple encoder branches to extract unimodal 3D representations from each imaging modality. These unimodal representations are then processed by a co-learning fusion module, which integrates both intra-modality (using SEAT) and inter-modality (using OSCAT) feature learning components. This dual approach captures interactions both within each modality and across different modalities. Finally, the fused features pass through an up-sampling module to generate 3D tumor segmentation maps. Our end-to-end deep network addresses two key aspects: (i) extracting robust unimodal 3D representations and (ii) exploiting comprehensive inter- and intra-modality feature interactions. Experimental results demonstrate that AttCo significantly outperforms competing methods in terms of Dice score across various datasets. The source code is available at https://github.com/duyphuongcri/AttCo.
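
The abstract reports results in terms of Dice score. For readers unfamiliar with the metric, the following is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), computed on two flattened binary masks. It is not taken from the AttCo repository; the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made here for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flattened binary masks.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Both masks are empty; treating this as perfect agreement is a
        # convention assumed here, and other conventions exist.
        return 1.0
    return 2.0 * intersection / total


# Toy example: a 2x2x2 volume flattened to 8 voxels.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
score = dice_score(pred, target)
print(round(score, 3))  # prints 0.667
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground voxels. In practice the metric is usually computed per class and per volume, then averaged.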

Topics

  • Deep Learning
  • Attention
  • Neural Networks, Computer
  • Image Processing, Computer-Assisted
  • Multimodal Imaging
  • Journal Article
