SiamAT: transformer-based target-aware siamese tracking network for target tracking in ultrasound image sequences.
Authors
Affiliations (8)
- Beijing Institute of Technology, No 5 Zhongguancun South Street, Haidian District, Beijing, Beijing, Beijing, 100081, CHINA.
- Beijing Institute of Technology, No 5 Zhongguancun South Street, Haidian District, Beijing, Beijing, 100081, CHINA.
- School of Computer Science and Technology, Beijing Institute of Technology, NO.5 Zhongguancun South Street, Haidian District, Beijing, China, Beijing, Beijing, 100081, CHINA.
- School of Optics and Electronics, Beijing Institute of Technology, NO.5 Zhongguancun South Street, Haidian District, Beijing, China, Beijing, Beijing, 100081, CHINA.
- Beijing Institute of Technology School of Computer Science and Technology, No 5 Zhongguancun South Street, Haidian District, Beijing, Beijing, Beijing, 100081, CHINA.
- School of Optics and Photonics, Beijing Institute of Technology, No 5 Zhongguancun South Street, Haidian District, Beijing, Beijing, Beijing, 100081, CHINA.
- Beijing Institute of Technology, No. 5 Zhongguancun South Street, Beijing, Beijing, 100081, CHINA.
- School of Optics and Photonics, Beijing Institute of Technology, NO.5 Zhongguancun south street, Beijing, 100081, CHINA.
Abstract
Objective. Accurate target localization is essential for effective radiation therapy, yet respiratory motion introduces considerable uncertainty. Ultrasound-based motion tracking offers a non-invasive solution, but similar-looking anatomical structures and significant target deformation make robust, accurate tracking difficult. Robust tracking is needed to ensure treatment precision and improve outcomes.
Approach. We propose SiamAT, a tracking framework comprising a Siamese-like feature extraction network with a Swin Transformer backbone, an attention-based target-aware module, and multi-task prediction heads. The target-aware module adaptively integrates template and search features to generate target-aware representations for precise localization.
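The abstract does not specify how the target-aware module combines the two feature sets. As a minimal sketch of one common way to do attention-based fusion, the example below applies scaled dot-product cross-attention in which search-region tokens act as queries and template tokens act as keys and values. The function names, the residual connection, and the absence of learned projection matrices are all assumptions made for illustration, not the authors' design.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(search, template):
    """Fuse template information into each search token (hypothetical helper).

    search:   N_s x d list of search-region feature vectors (queries)
    template: N_t x d list of template feature vectors (keys and values)
    Returns (fused, weights): fused is N_s x d, weights is N_s x N_t.
    """
    d = len(search[0])
    scores = matmul(search, transpose(template))
    # Scale by sqrt(d) before the softmax, as in standard dot-product attention.
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, template)
    # Assumed residual connection: keep the original search features.
    fused = [[s + a for s, a in zip(s_row, a_row)]
             for s_row, a_row in zip(search, attended)]
    return fused, weights

# Toy features: two template tokens and three search tokens of dimension 2.
template = [[1.0, 0.0], [0.0, 1.0]]
search = [[4.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
fused, weights = cross_attention(search, template)
```

In this toy case each search token attends most strongly to the template token it resembles, while the third search token, which is equally similar to both, splits its attention evenly. A real module would add learned query, key, and value projections and typically several attention heads.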
Main results. The proposed SiamAT is evaluated on a public dataset, the MICCAI 2015 Challenge on Liver Ultrasound Tracking (CLUST), and on our clinical dataset provided by the Chinese People's Liberation Army General Hospital. Experimental results demonstrate that the method achieves accurate and robust tracking (tracking errors of 0.60±0.30 mm and 0.47±0.31 mm on the two datasets, respectively) and outperforms existing methods. The proposed method runs at approximately 36 fps on a GPU.
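Tracking errors of this kind are usually reported as the mean ± standard deviation of the per-frame Euclidean distance between predicted and annotated target positions. The sketch below shows that computation under stated assumptions: the function names, the single isotropic `mm_per_pixel` spacing, the toy coordinates, and the use of the population standard deviation are illustrative choices, not details taken from the paper.

```python
import math
import statistics

def tracking_errors_mm(predicted, annotated, mm_per_pixel):
    """Per-frame Euclidean distance, in mm, between predicted and annotated points.

    predicted, annotated: lists of (x, y) positions in pixels, one per frame.
    mm_per_pixel: assumed isotropic pixel spacing of the ultrasound image.
    """
    return [math.hypot(px - ax, py - ay) * mm_per_pixel
            for (px, py), (ax, ay) in zip(predicted, annotated)]

def summarize(errors):
    """Return (mean, population standard deviation) of the per-frame errors."""
    return statistics.mean(errors), statistics.pstdev(errors)

# Toy data (not from the paper): three annotated frames.
predicted = [(10.0, 10.0), (13.0, 14.0), (20.0, 20.0)]
annotated = [(10.0, 10.0), (10.0, 10.0), (20.0, 22.0)]
errors = tracking_errors_mm(predicted, annotated, mm_per_pixel=0.5)
mean_err, std_err = summarize(errors)
print(f"{mean_err:.2f} ± {std_err:.2f} mm")
```

Whether a given benchmark uses the population or the sample standard deviation, and how it aggregates across targets and sequences, should be checked against that benchmark's own evaluation protocol.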
Significance. The proposed SiamAT provides clinicians with an effective tool for analyzing 2D motion in ultrasound images in clinical settings, assisting in the planning of optimal treatment strategies while reducing diagnostic and therapeutic risks.