A novel cross-attentive network for classifying cervical metastatic lymph nodes on B- and D-mode ultrasound images in oral squamous cell carcinoma.
Authors
Affiliations (3)
- Department of Oral and Maxillofacial Radiology, Dental Research Institute, School of Dentistry, Seoul National University, 101 Daehak-ro, Jongno-gu, 03080, Seoul, Korea.
- Interdisciplinary Program in Bioengineering, Graduate School of Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, 08826, Seoul, Korea.
- Department of Oral and Maxillofacial Surgery, Houston Methodist Research Institute, Houston, 77030, Texas, USA.
Abstract
This study proposes a deep convolutional neural network that integrates B-mode and D-mode ultrasound images to classify metastatic cervical lymph nodes in patients with oral squamous cell carcinoma. A shared backbone network with a cross-attention mechanism was used to enhance feature-level interactions between the two input images. Six convolutional neural network architectures (VGG16, SqueezeNet, ResNet50, EfficientNet B3, ConvNeXt, and DenseNet121) were implemented within this shared-backbone framework to identify the best-performing configuration. For each network, diagnostic performance was compared between dual-input and single-input ultrasound, and model performance was also evaluated against human observers with different levels of experience. The model using DenseNet121 as the shared backbone with an integrated cross-attention layer (LNM-Net) achieved the highest classification accuracy (85.3%) with dual-input images, surpassing the diagnostic performance of residents. The cross-attention module improved feature fusion and reduced false positives by suppressing modality-specific noise. LNM-Net demonstrates strong potential as a clinical decision-support tool for preoperative assessment of lymph node metastasis in oral squamous cell carcinoma. Despite current limitations such as dataset size and cross-institutional variability, the model offers a promising supplementary aid, particularly in settings with limited radiological expertise.
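The abstract does not specify how the cross-attention layer is wired, so the following is only a minimal sketch of the general idea of cross-attention between two feature sets, not the authors' LNM-Net. It assumes each image has already been reduced by a shared backbone to a short list of feature vectors ("tokens"), uses plain scaled dot-product attention without learned projection weights, and fuses the two streams with a residual sum, mean pooling, and concatenation. All function names, the bidirectional fusion scheme, and the toy inputs are hypothetical choices made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where queries come from one modality
    and keys/values come from the other. Learned projections are omitted
    (an assumption made to keep the sketch short)."""
    scale = math.sqrt(len(keys[0]))
    weights = [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys])
        for q in queries
    ]
    dim = len(values[0])
    attended = [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]
    return attended, weights


def mean_pool(tokens):
    """Average a list of feature vectors into a single vector."""
    count = len(tokens)
    return [sum(t[d] for t in tokens) / count for d in range(len(tokens[0]))]


def fuse(b_tokens, d_tokens):
    """Hypothetical fusion: each modality attends to the other, the attended
    features are added back (residual), and the pooled vectors are
    concatenated for a downstream classifier."""
    b_att, _ = cross_attention(b_tokens, d_tokens, d_tokens)
    d_att, _ = cross_attention(d_tokens, b_tokens, b_tokens)
    b_fused = [[x + y for x, y in zip(t, a)] for t, a in zip(b_tokens, b_att)]
    d_fused = [[x + y for x, y in zip(t, a)] for t, a in zip(d_tokens, d_att)]
    return mean_pool(b_fused) + mean_pool(d_fused)


# Toy stand-ins for backbone features (not real ultrasound data).
b_mode_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.0, 0.0]]
d_mode_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

fused_vector = fuse(b_mode_tokens, d_mode_tokens)
print(len(fused_vector))
```

In a real model the tokens would come from the spatial feature map of the shared backbone, the queries, keys, and values would pass through learned linear layers, and the fused vector would feed a classification head. Those parts are deliberately left out here.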