Multimodal deep learning model integrating electronic medical records and CT images for gallbladder cancer diagnosis: a retrospective multicenter study in China.
Affiliations (7)
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China.
- Department of General Surgery, Xinhua Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200092, China.
- Department of Anesthesiology, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200127, China.
- Department of Biliary and Pancreatic Surgery, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200127, China. [email protected].
- Department of Biliary and Pancreatic Surgery, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, 200127, China. [email protected].
- Shanghai Key Laboratory of Systems Regulation and Clinical Translation for Cancer, Shanghai, 200127, China. [email protected].
- State Key Laboratory of Systems Medicine for Cancer, Shanghai Cancer Institute, Shanghai, 200127, China. [email protected].
Abstract
Gallbladder cancer (GBC) is a rare gastrointestinal malignancy with a global 5-year survival rate of less than 5%. Early diagnosis is challenging owing to the lack of specific clinical symptoms, and the high heterogeneity of gallbladder tumors limits the clinical utility of unimodal deep-learning methods for GBC diagnosis. This study aimed to develop a novel multimodal deep-learning model to facilitate the preoperative diagnosis of GBC in more patients.

We conducted a retrospective multicenter study using contrast-enhanced arterial-phase computed tomography (CT) images and laboratory examination data from 300 patients (150 GBC cases and 150 non-GBC cases), extracted from the electronic medical records of two tertiary Grade A hospitals in Shanghai between 2018 and 2020. A novel two-stage multimodal diagnostic model (GBC-DiagNet) was developed. The first stage performs coarse segmentation of the gallbladder region using a position-constrained 3D Attention U-Net, improved with a combined sampling strategy to avoid over-segmentation. The second stage performs GBC detection via an adaptive feature-fusion strategy that optimizes the weighted integration of handcrafted radiomic, deep radiomic, and laboratory examination features to enhance diagnostic performance.

On the independent test set, the model achieved an accuracy of 0.933 (95% confidence interval [CI]: 0.927-0.940), specificity of 0.912 (95% CI: 0.904-0.922), sensitivity of 0.962 (95% CI: 0.937-0.986), precision of 0.893 (95% CI: 0.875-0.911), an F1-score of 0.926 (95% CI: 0.919-0.932), and an area under the curve (AUC) of 0.9706 (95% CI: 0.961-0.981). Compared with the best unimodal model, our model improved accuracy, sensitivity, and F1-score by 14.28%, 16.76%, and 16.85%, respectively. Compared with state-of-the-art deep-learning architectures (ResNet, DenseNet, MobileNet, ConvNeXt, ViT), it achieved absolute improvements of 7.68% in accuracy, 8.03% in F1-score, and 0.0059 in AUC.
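The adaptive feature-fusion idea described above can be sketched as learnable weights that blend the three modality feature vectors. The sketch below is an illustrative assumption, not GBC-DiagNet's actual implementation: the function and parameter names (`fuse_features`, `weights`), the z-score normalization, and the softmax weighting are all placeholders standing in for whatever learned projection and weighting the authors used.

```python
import numpy as np

def softmax(w):
    """Numerically stable softmax over a 1-D weight vector."""
    e = np.exp(w - np.max(w))
    return e / e.sum()

def fuse_features(handcrafted, deep, lab, weights):
    """Hypothetical weighted fusion of handcrafted radiomic, deep
    radiomic, and laboratory feature vectors (illustrative only)."""
    feats = [np.asarray(f, dtype=float) for f in (handcrafted, deep, lab)]
    # Z-score each modality so no single feature scale dominates the fusion
    feats = [(f - f.mean()) / (f.std() + 1e-8) for f in feats]
    # Truncate to a common length; a real model would use a learned projection
    d = min(len(f) for f in feats)
    feats = [f[:d] for f in feats]
    # Softmax keeps the modality weights positive and summing to one
    alpha = softmax(np.asarray(weights, dtype=float))
    fused = sum(a * f for a, f in zip(alpha, feats))
    return fused, alpha
```

In training, `weights` would be learned jointly with the downstream classifier so that more informative modalities receive larger fusion coefficients.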
The proposed multimodal model integrating contrast-enhanced CT and laboratory data achieves stable and clinically meaningful diagnostic performance for gallbladder cancer, supporting its utility as an artificial intelligence-assisted tool for preoperative noninvasive diagnosis.
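The reported accuracy, sensitivity, specificity, precision, and F1-score all follow from the test-set confusion matrix by standard definitions; a minimal reference computation is shown below (the function name and argument order are illustrative, not from the paper).

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts:
    tp/fp/tn/fn = true/false positives and negatives."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall: fraction of cancers detected
    specificity = tn / (tn + fp)   # fraction of non-cancers correctly ruled out
    precision = tp / (tp + fp)     # fraction of positive calls that are correct
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1
```

For a diagnostic model, high sensitivity (here 0.962) matters most, since a missed GBC case is costlier than a false alarm that further workup can resolve.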