A high-frequency feature guided diffusion model for musculoskeletal ultrasound image segmentation.
Authors
Affiliations (2)
- School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin, 300401, China.
- Hebei University of Technology, School of Artificial Intelligence and Data Science, Tianjin, 300401, China.
Abstract
Musculoskeletal ultrasound (MSKUS) image segmentation remains challenging because severe speckle noise causes significant boundary ambiguity. Under such conditions, existing methods often fail to delineate anatomical structures accurately enough for clinical application. A high-frequency feature guided diffusion model, HFGSegDiff, is proposed for robust MSKUS segmentation. Built upon a conditional diffusion framework, HFGSegDiff employs a dual-branch parallel feature encoding strategy to jointly model noisy masks and their corresponding ultrasound images. The architecture contains two key modules: a High-Frequency Cross Attention Module (HFCAM) and a Multi-Scale Feature Enhancement Module (MSFEM). The HFCAM integrates wavelet-extracted high-frequency features with spatial information from the conditional image to refine boundary reconstruction. The MSFEM applies depthwise separable convolutions to capture structural boundary information at multiple scales, improving adaptability to anatomical variations. Experiments on two public datasets show that HFGSegDiff outperforms state-of-the-art methods across multiple metrics. Specifically, compared with U-KAN, the best competing method, HFGSegDiff improves mIoU by 6.00%, 1.03%, and 1.58% on the BB, TA, and GM subsets of the MUST dataset, respectively, while reducing HD95 by 3.0, 0.31, and 0.58. On the DeepACSA dataset, HFGSegDiff improves mIoU and Dice by 4.01% and 2.26%, respectively, and reduces HD95 by 0.69. In addition, noise robustness experiments show that HFGSegDiff exhibits minimal performance degradation under strong speckle noise (σ = 0.3), significantly outperforming the discriminative methods it was compared against. The proposed model enables accurate extraction of structural boundaries in noisy ultrasound environments, offering a promising solution for robust and precise ultrasound-assisted clinical diagnosis.
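To make "wavelet-extracted high-frequency features" concrete, the following is a minimal sketch of a single-level 2D Haar decomposition in plain Python. It is an illustration only, not the authors' implementation: the abstract does not state which wavelet family or how many decomposition levels HFCAM uses, so the Haar wavelet, the single level, and the names `haar_dwt2` and `high_freq_energy` are all assumptions made for this example. The three detail sub-bands respond to intensity changes between neighbouring pixels, which is why they tend to highlight boundaries.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform (orthonormal scaling).

    `img` is a list of rows with even height and width.
    Returns (approx, row_diff, col_diff, diag), each half the input size.
    Hypothetical helper for illustration; Haar is an assumed wavelet choice.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")

    approx, row_diff, col_diff, diag = [], [], [], []
    for i in range(0, h, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for j in range(0, w, 2):
            # The 2x2 block:  a b
            #                 c d
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            a_row.append((a + b + c + d) / 2)  # low-frequency average
            r_row.append((a + b - c - d) / 2)  # change between the two rows
            c_row.append((a - b + c - d) / 2)  # change between the two columns
            d_row.append((a - b - c + d) / 2)  # diagonal change
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag.append(d_row)
    return approx, row_diff, col_diff, diag


def high_freq_energy(img):
    """Sum of squared coefficients over the three detail sub-bands."""
    _, row_diff, col_diff, diag = haar_dwt2(img)
    return sum(v * v for band in (row_diff, col_diff, diag)
               for row in band for v in row)


if __name__ == "__main__":
    flat = [[5.0] * 4 for _ in range(4)]               # no boundary
    edge = [[0.0, 0.0, 0.0, 1.0] for _ in range(4)]    # vertical boundary
    print(high_freq_energy(flat), high_freq_energy(edge))
```

A uniform region produces no detail coefficients, while a region containing a boundary does, so the detail sub-bands can act as a boundary cue for a downstream attention module.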