DYNAFormer: Enhancing transformer segmentation with dynamic anchor mask for medical imaging.
Authors
Affiliations (8)
- University of Science, VNU-HCM, Ho Chi Minh City, Viet Nam; University of Social Sciences and Humanities, VNU-HCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam.
- PrimeLabs LLC, Covington, KY, United States; University of Cincinnati, Cincinnati, OH, United States.
- University of Science, VNU-HCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; Thong Nhat Hospital, Ho Chi Minh City, Viet Nam.
- University of Science, VNU-HCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam.
- University of Science, VNU-HCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNU-HCM, Ho Chi Minh City, Viet Nam.
- Thong Nhat Hospital, Ho Chi Minh City, Viet Nam.
- University of Dayton, Dayton, OH, United States.
- University of Science, VNU-HCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNU-HCM, Ho Chi Minh City, Viet Nam. Electronic address: [email protected].
Abstract
Polyp shape is critical for diagnosing colorectal polyps and assessing cancer risk, yet little data exists for segmenting pedunculated and sessile polyps. This paper introduces PolypDB_INS, a dataset of 4403 images containing 4918 annotated polyps, built specifically for sessile and pedunculated polyps. In addition, we propose DYNAFormer, a novel transformer-based model that uses an anchor mask-guided mechanism incorporating cross-attention, dynamic query updates, and query denoising for improved object segmentation. Each positional query is treated as an anchor mask that is dynamically updated through the decoder layers, which enriches the positional information available about the object and allows more precise segmentation of complex structures such as polyps. Extensive experiments on the PolypDB_INS dataset, using standard evaluation metrics for both instance and semantic segmentation, show that DYNAFormer significantly outperforms state-of-the-art methods. Ablation studies confirm the effectiveness of the proposed techniques, highlighting the model's robustness for diagnosing colorectal cancer. The source code and dataset are available at https://github.com/ntcongvn/DYNAFormer.
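The anchor-mask idea described in the abstract can be illustrated with a small sketch. The code below is not taken from the DYNAFormer repository; it is a generic, pure-Python illustration of mask-restricted cross-attention in which a query attends only to pixels inside its current anchor mask, and the mask is re-predicted after every decoder layer. All function names, the residual update, the dot-product mask head, and the toy features are assumptions made for illustration, and query denoising is not covered.

```python
import math

def softmax(scores):
    """Numerically stable softmax; -inf scores receive zero weight."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_cross_attention(query, keys, values, anchor_mask):
    """Attend only to positions where anchor_mask is True (hypothetical helper)."""
    # An empty mask would make every score -inf, so fall back to full attention.
    if not any(anchor_mask):
        anchor_mask = [True] * len(keys)
    scale = math.sqrt(len(query))
    scores = [
        sum(q * k for q, k in zip(query, key)) / scale if keep else float("-inf")
        for key, keep in zip(keys, anchor_mask)
    ]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def predict_mask(query, pixel_features):
    """Re-predict the anchor mask from the updated query.

    Assumed mask head: dot product with each pixel feature, kept when
    the logit is positive (equivalent to sigmoid(logit) > 0.5).
    """
    logits = [sum(q * f for q, f in zip(query, feat)) for feat in pixel_features]
    return [logit > 0.0 for logit in logits]

def decode(query, pixel_features, anchor_mask, num_layers=3):
    """Run a few decoder layers, updating the query and its anchor mask each time."""
    for _ in range(num_layers):
        context = masked_cross_attention(
            query, pixel_features, pixel_features, anchor_mask
        )
        query = [q + c for q, c in zip(query, context)]  # residual query update
        anchor_mask = predict_mask(query, pixel_features)
    return query, anchor_mask

# Toy example: four "pixels" with 2-D features; the first two belong to the object.
pixels = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
initial_mask = [True, True, True, False]  # coarse anchor that includes one wrong pixel
final_query, final_mask = decode([1.0, 0.0], pixels, initial_mask)
print(final_mask)  # [True, True, False, False]
```

In this toy run the coarse initial anchor includes one background pixel, and the mask predicted after the first layer already drops it. That is the intuition behind updating the anchor through the decoder layers, although the actual model may compute and refine its masks quite differently.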