CMT-FFNet: A CMT-based feature-fusion network for predicting TACE treatment response in hepatocellular carcinoma.
Authors
Affiliations (9)
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
- Department of Radiology, The First Affiliated Hospital, Dalian Medical University, China.
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
- Department of Radiology, The First Affiliated Hospital, Dalian Medical University, China.
- Department of Radiology, The First Affiliated Hospital, Dalian Medical University, China.
- College of Medical Imaging, Dalian Medical University, China.
- College of Medical Imaging, Dalian Medical University, China.
- Department of Radiology, The First Affiliated Hospital, Dalian Medical University, China; College of Medical Imaging, Dalian Medical University, China.
- Chengdu Institute of Computer Application, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
Abstract
Accurate preoperative prediction of tumor response to transarterial chemoembolization (TACE) is crucial for individualized treatment decision-making in hepatocellular carcinoma (HCC). In this study, we propose a novel feature-fusion network based on the Convolutional Neural Networks Meet Vision Transformers (CMT) architecture, termed CMT-FFNet, to predict TACE efficacy from preoperative multiphase magnetic resonance imaging (MRI) scans. CMT-FFNet combines local feature extraction with global dependency modeling through attention mechanisms, enabling the extraction of complementary information from multiphase MRI data. Additionally, we introduce an orthogonality loss to optimize the fusion of imaging and clinical features, further enhancing the complementarity of cross-modal features. Visualization techniques are employed to highlight the key regions contributing to model decisions. Extensive experiments were conducted to evaluate the effectiveness of the proposed modules and network architecture. The results demonstrate that our model effectively captures latent correlations among features extracted from multiphase MRI data and multimodal inputs, significantly improving the prediction of TACE treatment response in HCC patients.
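The abstract does not give the form of the orthogonality loss, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes the loss is the squared Frobenius norm of the cross-correlation between row-normalized imaging and clinical feature matrices, a penalty that is zero when no imaging feature dimension correlates with any clinical feature dimension across the batch. The function name `orthogonality_loss`, the argument names, and the feature shapes are hypothetical.

```python
import numpy as np


def orthogonality_loss(img_feats, clin_feats, eps=1e-8):
    """Squared Frobenius norm of the imaging/clinical cross-correlation.

    Assumed formulation (not taken from the paper).
    img_feats:  array of shape (batch, d_img), imaging features
    clin_feats: array of shape (batch, d_clin), clinical features
    """
    # L2-normalize each sample's feature vector so the penalty does not
    # depend on the overall scale of either modality.
    img = img_feats / (np.linalg.norm(img_feats, axis=1, keepdims=True) + eps)
    clin = clin_feats / (np.linalg.norm(clin_feats, axis=1, keepdims=True) + eps)

    # Cross-correlation matrix of shape (d_img, d_clin), summed over the batch.
    cross = img.T @ clin

    # Penalize every non-zero entry; the minimum of 0 is reached when the
    # two feature sets are mutually uncorrelated across the batch.
    return float(np.sum(cross ** 2))


# Hypothetical usage with random stand-in features.
rng = np.random.default_rng(0)
imaging = rng.normal(size=(8, 16))   # e.g. pooled multiphase MRI features
clinical = rng.normal(size=(8, 4))   # e.g. encoded clinical variables
penalty = orthogonality_loss(imaging, clinical)
```

In training, a term like this would typically be added to the classification loss with a weighting coefficient, which pushes the two branches toward encoding non-redundant information; the weighting scheme is likewise an assumption here.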