[Transfer learning-based endoscopic image recognition of nasopharyngeal carcinoma: investigating pre-trained large models in small sample settings].

December 11, 2025

Authors

Li Z, Chen X, Liu XL, Shi S, Gong WT, Qiu MY, Shi YX, Yu HM

Affiliations (1)

  • ENT Institute and Department of Otorhinolaryngology, Eye and ENT Hospital, Fudan University, Research Units of New Technologies of Endoscopic Surgery in Skull Base Tumor, Chinese Academy of Medical Sciences, Shanghai 200031, China.

Abstract

<b>Objective:</b> To develop a nasopharyngeal carcinoma (NPC) diagnostic model based on foundation model transfer learning, aiming to address the limited generalization and diagnostic performance of existing models under small-sample conditions. <b>Methods:</b> A retrospective study was conducted using 27 362 nasopharyngeal endoscopic images from eight regional NPC centers. The images were classified into three groups: NPC, benign hyperplasia (BH), and normal nasopharynx (NOR). The data were randomly split into a training/validation set (85%) and a hold-out test set (15%). To evaluate generalization under small-sample conditions, models were trained on both the full dataset (100%) and a small subset (1%), then tested on the same test set. The model was based on BiomedCLIP, pre-trained on large medical image-text datasets and fine-tuned for classification. The performance of our fine-tuned BiomedCLIP model was systematically compared against several benchmark models, including ResNet50, ViT-Base, and the original CLIP. Performance was assessed using accuracy, the area under the receiver operating characteristic curve (AUC), specificity, and sensitivity, with attention maps used to visualize how the model made its decisions. <b>Results:</b> With the full training data, the BiomedCLIP model demonstrated robust performance. It achieved 95.46% (95%<i>CI</i>: 94.87%-96.08%) accuracy and an AUC of 0.98 (95%<i>CI</i>: 0.98-0.99) for distinguishing normal vs abnormal cases (NAN), and 89.92% (95%<i>CI</i>: 89.04%-90.78%) accuracy and an AUC of 0.90 (95%<i>CI</i>: 0.89-0.90) for distinguishing malignant vs non-malignant cases (MNM), significantly outperforming all comparator models. Even when trained with only 1% of the data, BiomedCLIP still maintained strong performance, with AUCs of 0.89 (95%<i>CI</i>: 0.88-0.90) for NAN and 0.81 (95%<i>CI</i>: 0.79-0.82) for MNM, demonstrating effective generalization in data-scarce scenarios. 
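The Methods describe the core transfer-learning recipe: take a large pre-trained encoder (BiomedCLIP) and fine-tune it for a three-class endoscopic task (NPC / BH / NOR). A minimal sketch of the simplest variant of that idea, a frozen encoder feeding a small softmax classification head, is shown below. All data here are synthetic stand-ins for encoder embeddings; the dimensions, learning rate, and class structure are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Linear-probe sketch: a frozen encoder (simulated here by clustered
# random features) feeds a softmax head trained for 3-way
# classification (NPC / BH / NOR). Everything below is synthetic.
rng = np.random.default_rng(0)

NUM_CLASSES = 3   # NPC, benign hyperplasia, normal nasopharynx
EMBED_DIM = 16    # stand-in for the encoder's embedding width
N_PER_CLASS = 100

# Synthetic "embeddings": each class clusters around its own centroid,
# mimicking the separable features a pre-trained encoder produces.
centroids = rng.normal(size=(NUM_CLASSES, EMBED_DIM))
X = np.vstack([centroids[c] + 0.3 * rng.normal(size=(N_PER_CLASS, EMBED_DIM))
               for c in range(NUM_CLASSES)])
y = np.repeat(np.arange(NUM_CLASSES), N_PER_CLASS)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train only the head with plain gradient descent on cross-entropy;
# the "encoder" (the features X) stays fixed, which is what makes
# the approach data-efficient in small-sample settings.
W = np.zeros((EMBED_DIM, NUM_CLASSES))
b = np.zeros(NUM_CLASSES)
onehot = np.eye(NUM_CLASSES)[y]
for _ in range(300):
    p = softmax(X @ W + b)
    grad = (p - onehot) / len(X)
    W -= X.T @ grad
    b -= grad.sum(axis=0)

preds = np.argmax(X @ W + b, axis=1)
accuracy = (preds == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Full fine-tuning (updating encoder weights as well) follows the same pattern but backpropagates through the encoder; the paper's reported 1%-data results are consistent with the intuition above, that most of the representational work is already done by pre-training.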
<b>Conclusions:</b> The endoscopic image-based auxiliary diagnostic model presented in this study accurately differentiates NPC, BH, and NOR under small-sample conditions. The model exhibits high diagnostic accuracy and robust generalization despite limited training data, highlighting its promise for clinical deployment as a screening and decision-support tool.
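The abstract reports each metric with a 95% confidence interval (e.g. AUC 0.98, 95% CI 0.98-0.99). A common way to obtain such intervals is the percentile bootstrap over resampled test cases; the sketch below computes AUC in its exact pairwise form and bootstraps a CI on synthetic scores. The data, resample count, and score distributions are assumptions for illustration, not the study's procedure.

```python
import numpy as np

# AUC as the probability that a random positive scores above a random
# negative (exact pairwise form), plus a percentile-bootstrap 95% CI
# like the intervals reported in the abstract. Data are synthetic.
rng = np.random.default_rng(1)

def auc(labels, scores):
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    wins = (pos[:, None] > neg[None, :]).sum()   # positive beats negative
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count half
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Synthetic binary task (e.g. malignant vs non-malignant): positives
# score higher on average.
labels = np.concatenate([np.ones(200), np.zeros(200)]).astype(int)
scores = np.concatenate([rng.normal(1.0, 1.0, 200),
                         rng.normal(0.0, 1.0, 200)])

point = auc(labels, scores)

# Percentile bootstrap: resample cases with replacement, recompute AUC.
boots = []
for _ in range(500):
    idx = rng.integers(0, len(labels), len(labels))
    if labels[idx].min() == labels[idx].max():
        continue  # skip degenerate resamples containing one class only
    boots.append(auc(labels[idx], scores[idx]))
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"AUC {point:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Accuracy intervals can be bootstrapped the same way, resampling cases and recomputing the proportion correct on each resample.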

Topics

English Abstract, Journal Article
