A Hybrid Model Combining U-Net and Transformers for Joint Segmentation and Beamforming of Plane-wave Ultrasound Images.
Authors
Affiliations (2)
- Electrical and Computer Engineering Department, University of Rochester, Rochester, NY, USA.
- Electrical and Computer Engineering Department, University of Rochester, Rochester, NY, USA. Electronic address: [email protected].
Abstract
Recent advancements in deep learning have shown significant potential in ultrasound imaging. However, most approaches focus solely on image enhancement or segmentation, without integrating these tasks into a unified framework. To address this gap, we developed a novel architecture that combines the U-Net and Transformer models to simultaneously segment and beamform plane-wave images acquired from a single insonification. The hybrid model was evaluated using computer simulations, physical phantoms containing hypoechoic inclusions (5- to 10-mm radius) and ultrasound images acquired from the carotid arteries of 50 healthy volunteers. Performance was assessed using six metrics covering segmentation accuracy (Dice similarity coefficient), image fidelity (mean square error; structural similarity index metric) and image clarity and contrast (Laplacian variance; generalized contrast-to-noise ratio; signal-to-noise ratio). The hybrid model achieved excellent segmentation performance (Dice similarity coefficient = 0.98), low error (mean square error = 0.017) and high structural similarity (structural similarity index metric = 0.765). Its beamformed images were comparable with compound plane-wave imaging, with signal-to-noise ratio and generalized contrast-to-noise ratio values of 2.15 and 0.8, respectively, vs 2.4 and 0.9 for compound plane-wave imaging. The hybrid model showed reduced accuracy for inclusions ≤7 mm and occasionally produced spurious inclusions. These results demonstrate that hybrid U-Net/Transformer models can reduce reliance on compounding while maintaining image quality, although improvements are needed for small target detection and artifact suppression. With further optimization, this approach could enable real-time segmentation and enhance plane-wave imaging in clinical settings.
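For readers unfamiliar with the evaluation metrics named in the abstract, three of them can be sketched in a few lines of Python. This is a minimal illustration using common textbook definitions, not the authors' implementation: the function names, the nested-list image representation, the 4-neighbour Laplacian kernel and the convention for two empty masks are all assumptions made here.

```python
from statistics import pvariance


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks (nested lists of 0/1)."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_square_error(image, reference):
    """Mean of squared pixel-wise differences between two equally sized images."""
    a = [v for row in image for v in row]
    b = [v for row in reference for v in row]
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def laplacian_variance(image):
    """Variance of the 4-neighbour discrete Laplacian over interior pixels.

    Higher values indicate more edge content, a common proxy for sharpness.
    The image must be at least 3 x 3 so that interior pixels exist.
    """
    rows, cols = len(image), len(image[0])
    responses = []
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            lap = (image[i - 1][j] + image[i + 1][j]
                   + image[i][j - 1] + image[i][j + 1]
                   - 4 * image[i][j])
            responses.append(lap)
    return pvariance(responses)
```

A Dice value of 1 means the predicted and reference masks overlap exactly, so the reported 0.98 corresponds to near-complete overlap. The generalized contrast-to-noise ratio and structural similarity metrics involve histogram overlap and local statistics, respectively, and are omitted from this sketch.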