
An interpretable multimodal machine learning model for predicting malignancy of thyroid nodules in low-resource scenarios.

Authors

Ma F, Yu F, Gu X, Zhang L, Lu Z, Zhang L, Mao H, Xiang N

Affiliations (10)

  • Department of Integrated Traditional and Western Medicine, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, 24 Jinghua Road, Luoyang, Henan, 471003, PR China.
  • Hubei University of Chinese Medicine, Wuhan, Hubei, 430065, PR China.
  • School of Information Management, Wuhan University, Wuhan, Hubei, 430072, PR China.
  • Henan Key Laboratory of Cancer Epigenetics, Cancer Institute, The First Affiliated Hospital, and College of Clinical Medicine of Medical College of Henan University of Science and Technology, Luoyang, 471003, PR China.
  • Huanggang Hospital of Traditional Chinese Medicine, Hubei University of Chinese Medicine, Huanggang, Hubei, 438000, PR China.
  • Department of Integrated Traditional and Western Medicine, The First Affiliated Hospital, and College of Clinical Medicine of Henan University of Science and Technology, 24 Jinghua Road, Luoyang, Henan, 471003, PR China. [email protected].
  • Hubei University of Chinese Medicine, Wuhan, Hubei, 430065, PR China. [email protected].
  • Hubei Shizhen Laboratory, Wuhan, Hubei, 430061, PR China. [email protected].
  • Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan, Hubei, 430061, PR China. [email protected].
  • Hubei Shizhen Laboratory, Wuhan, Hubei, 430061, PR China. [email protected].

Abstract

Thyroid nodules (TNs) are a prevalent clinical issue in endocrinology. The diagnosis of malignant TNs typically proceeds in three stages: thyroid function testing, color ultrasound (CU) examination, and biopsy. Early identification is crucial for effective management of malignant TNs. This study developed a multimodal network for classifying CU images together with thyroid function (TF) test data. Specifically, the PubMedCLIP model was employed to extract visual features from CU images, generating a 512-dimensional feature vector. This vector was then concatenated with five TF test indicators, as well as gender and age, to construct a comprehensive representation. The combined representation was fed into a downstream machine learning (ML) classifier; seven models were evaluated, including AdaBoost, Random Forest, and Logistic Regression. Among the seven ML models evaluated, the AdaBoost classifier demonstrated the highest overall performance, surpassing the other classifiers in area under the curve (AUC), F1 score, accuracy, and CA metrics. The incorporation of visual features extracted from CU images using PubMedCLIP further enhanced the model’s performance. Feature importance analysis revealed that the laboratory indicators free thyroxine (FT4) and free triiodothyronine (FT3), together with the image-derived feature clip_feature_184, were the most influential variables. Moreover, integrating PubMedCLIP significantly improved the model’s capacity to classify cases accurately by leveraging both clinical and imaging information. The proposed PubMedCLIP-based multimodal framework, which jointly utilizes ultrasound imaging features and clinical laboratory data, demonstrated superior diagnostic performance in differentiating benign from malignant TNs. This approach offers a promising tool for individualized risk assessment and clinical decision support, potentially enabling more precise and personalized protocols for patients with TNs.
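The fusion step described above, concatenating a 512-dimensional PubMedCLIP image embedding with five TF indicators plus gender and age before training a downstream AdaBoost classifier, can be sketched as follows. This is a minimal illustration, not the authors' implementation: synthetic random arrays stand in for the PubMedCLIP embeddings and patient records, and all sample sizes and hyperparameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200  # placeholder cohort size

# Stand-ins for the real inputs:
img_feats = rng.normal(size=(n, 512))   # 512-dim PubMedCLIP image embeddings
tf_feats = rng.normal(size=(n, 5))      # five thyroid-function indicators
demo = np.column_stack([
    rng.integers(0, 2, n),              # gender (encoded 0/1)
    rng.integers(20, 80, n),            # age in years
])

# Concatenate all modalities into one 519-dim representation per patient.
X = np.hstack([img_feats, tf_feats, demo])
y = rng.integers(0, 2, n)               # synthetic benign/malignant labels

# Train a downstream AdaBoost classifier on the fused features.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
malignancy_proba = clf.predict_proba(X_te)[:, 1]  # per-nodule risk score
```

In practice the `img_feats` rows would come from PubMedCLIP's image encoder, and the same fused matrix could be passed to any of the seven evaluated classifiers for comparison.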

Topics

Journal Article
