Efficient temporal feature utilization in ultrasound videos: a multi-channel deep learning framework for enhanced breast lesion differentiation.
Authors
Affiliations (6)
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, 110004, China.
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, 110169, China.
- Department of Breast Imaging, The Affiliated Hospital of Qingdao University, Qingdao, 266100, China.
- Department of Medical Imaging, Cancer Hospital of China Medical University, Cancer Hospital of Dalian University of Technology, Liaoning Cancer Hospital & Institute, Shenyang, 110042, China.
- Department of Medical Imaging, Cancer Hospital of China Medical University, Cancer Hospital of Dalian University of Technology, Liaoning Cancer Hospital & Institute, Shenyang, 110042, China. [email protected].
- Department of Medical Imaging, Cancer Hospital of China Medical University, Cancer Hospital of Dalian University of Technology, Liaoning Cancer Hospital & Institute, Shenyang, 110042, China. [email protected].
Abstract
Automated breast lesion differentiation in ultrasound (US) imaging has advanced through deep learning (DL) techniques. However, existing 2D approaches predominantly rely on individual static images and overlook the temporal information between frames, which limits their performance. While 3D models can exploit this temporal information, their high computational requirements make them impractical in resource-constrained settings. This study introduces a framework that combines spatial and temporal lesion features by incorporating consecutive frames from US videos. Unlike existing 2D models, the framework uses a multi-channel input strategy to learn lesion characteristics across frames while avoiding the computational burden of 3D models, which makes it a candidate for resource-limited settings and real-time environments. The framework was validated on multicenter data from two different regions. Experiments demonstrate that the proposed multi-channel input technique significantly outperforms single-image approaches. Across five distinct DL backbone models, the framework consistently achieved higher precision, recall, and AUC values, indicating that temporal information improves classification performance. Specifically, the multi-channel approach yielded improvements of up to 8.6% in AUC, 9.86% in precision, and 23.68% in recall compared to single-image inputs. These results highlight the potential of the multi-channel framework as an effective and practical solution for breast lesion differentiation. By using temporal information without additional computational burden, the approach serves as a computationally efficient alternative to 3D models for DL-based breast US analysis, and it shows promise for delivering more accurate and accessible diagnostic solutions in clinical practice.
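The multi-channel input idea can be illustrated with a minimal sketch in which consecutive grayscale frames are stacked along the channel axis, so that an ordinary 2D backbone receives several frames at once. The abstract does not specify the number of frames per input, the frame spacing, or any preprocessing, so the function name `stack_consecutive_frames`, the default of three channels, the `stride` parameter, and the synthetic video below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def stack_consecutive_frames(video, num_channels=3, stride=1):
    """Group consecutive frames of a grayscale video into multi-channel inputs.

    video: array of shape (T, H, W), one grayscale frame per time step.
    Returns an array of shape (N, num_channels, H, W), where each of the N
    items holds num_channels consecutive frames as its channels.
    num_channels and stride are illustrative choices, not values from the paper.
    """
    video = np.asarray(video)
    if video.ndim != 3:
        raise ValueError("expected a (T, H, W) grayscale video")
    num_frames = video.shape[0]
    if num_frames < num_channels:
        raise ValueError("video has fewer frames than requested channels")
    # Start index of every sliding window that fits inside the video.
    starts = range(0, num_frames - num_channels + 1, stride)
    return np.stack([video[s:s + num_channels] for s in starts])


# Synthetic stand-in for an ultrasound clip: 10 frames of 64 x 64 pixels.
video = np.random.default_rng(0).random((10, 64, 64), dtype=np.float32)
clips = stack_consecutive_frames(video, num_channels=3)
print(clips.shape)  # (8, 3, 64, 64)
```

Under these assumptions each item has the same shape as a single three-channel image, so the per-input cost for a 2D backbone stays essentially the same as for a static image, whereas a 3D model would add a temporal dimension to its convolutions.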