Lung cancer multimodal auxiliary diagnosis based on entropy weight decision fusion.
Authors
Affiliations (4)
- The Second People's Hospital of Hefei, Hefei Hospital Affiliated to Anhui Medical University, Hefei, 230012, China.
- Hefei University of Technology, Hefei, 230601, China.
- Hefei University of Technology, Hefei, 230601, China. [email protected].
- The Second People's Hospital of Hefei, Hefei Hospital Affiliated to Anhui Medical University, Hefei, 230012, China. [email protected].
Abstract
Lung cancer ranks among the most lethal malignancies worldwide, and its traditional diagnosis is hampered by strong subjectivity, high misdiagnosis rates, and unevenly distributed medical resources. To overcome the poor feature alignment caused by simply concatenating CT image features and clinical text features, this paper proposes a multimodal auxiliary diagnosis model for lung cancer based on entropy weight decision fusion. This retrospective cohort study enrolled 5847 participants from 2020 to 2025, comprising 1823 lung cancer cases, 2253 normal controls, and 1771 pulmonary nodule controls. All CT images and their corresponding reports were analyzed, and three datasets were established by random sampling from the original dataset. First, a Vision Transformer (ViT) and Bidirectional Encoder Representations from Transformers (BERT) were used as the image and text feature extractors, respectively, to extract high-dimensional semantic features from lung CT images and CT imaging reports. Second, independent classifiers based on a Multi-Layer Perceptron (MLP) were built to convert the embedding vectors of each modality into predicted probability distributions (logits). Finally, the entropy weight method was used to adaptively fuse the image and text decision results. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, and F1-score. The proposed method fully leverages the complementary information in CT images and imaging report text. On the clinical dataset, it achieved an accuracy of 0.9375, a precision of 0.9324, a recall of 0.9322, and an F1-score of 0.9322, significantly improving diagnostic performance. This study shows that, on a real-world lung cancer dataset, multimodal decision fusion outperforms unimodal models and common fusion methods in diagnostic accuracy, precision, and recall. 
It provides a potential reference for the early auxiliary diagnosis of pulmonary nodules and lung cancer, and lays a foundation for subsequent clinical applications.
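The abstract does not give the fusion formula, so the following is only a minimal sketch of one common way to apply entropy-based weighting at the decision level, not the authors' implementation. It assumes that each modality's logits are turned into probabilities with a softmax, that a modality's weight is proportional to one minus its normalized Shannon entropy (a more confident prediction receives a larger weight), and that the fused decision is the weighted sum of the two probability vectors. The function names, the `eps` parameter, and the three-class ordering (lung cancer, normal, pulmonary nodule) are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the result lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def entropy_weight_fusion(image_logits, text_logits, eps=1e-12):
    """Fuse two unimodal predictions for one sample (assumed scheme).

    Returns (fused_probs, w_image, w_text). The weights sum to 1 and
    fall back to 0.5 each when both modalities are equally uncertain.
    """
    p_img = softmax(image_logits)
    p_txt = softmax(text_logits)

    # Low entropy means a confident prediction, so it gets a larger weight.
    c_img = max(0.0, 1.0 - normalized_entropy(p_img))
    c_txt = max(0.0, 1.0 - normalized_entropy(p_txt))

    # eps avoids division by zero when both predictions are uniform.
    denom = c_img + c_txt + 2.0 * eps
    w_img = (c_img + eps) / denom
    w_txt = (c_txt + eps) / denom

    fused = [w_img * a + w_txt * b for a, b in zip(p_img, p_txt)]
    return fused, w_img, w_txt


if __name__ == "__main__":
    # Hypothetical class order: [lung cancer, normal, pulmonary nodule].
    image_logits = [4.0, 0.0, 0.0]   # confident image classifier
    text_logits = [0.1, 0.0, 0.2]    # nearly uninformative text classifier
    fused, w_img, w_txt = entropy_weight_fusion(image_logits, text_logits)
    print("weights:", w_img, w_txt)
    print("fused probabilities:", fused)
```

In this sketch the weights are computed per sample, so a confident image prediction can dominate for one patient while the report text dominates for another. A dataset-level variant of the entropy weight method, in which one weight per modality is computed from the entropy of its outputs across all samples, would be an equally plausible reading of the abstract.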