
Multimodal text guided network for chest CT pneumonia classification.

Authors

Feng Y, Huang G, Ju F, Cui H

Affiliations (2)

  • College of Computer Science, Beijing University of Technology, Beijing, 100124, China.
  • College of Computer Science, Beijing University of Technology, Beijing, 100124, China. [email protected].

Abstract

Pneumonia is a prevalent and serious respiratory disease, responsible for a significant number of cases globally. With advancements in deep learning, the automatic diagnosis of pneumonia has attracted significant research attention in medical image classification. However, current methods still face several challenges. First, since lesions are often visible in only a few slices, slice-based classification algorithms may overlook critical spatial contextual information in CT sequences, and slice-level annotations are labor-intensive. Moreover, chest CT sequence-based pneumonia classification algorithms that rely solely on sequence-level coarse-grained labels remain limited, especially in integrating multi-modal information. To address these challenges, we propose a Multi-modal Text-Guided Network (MTGNet) for pneumonia classification using chest CT sequences. In this model, we design a sequential graph pooling network to encode the CT sequences by gradually selecting important slice features to obtain a sequence-level representation. Additionally, a CT description encoder is developed to learn representations from textual reports. To simulate the clinical diagnostic process, we employ multi-modal training and single-modal testing. A modal transfer module is proposed to generate simulated textual features from CT sequences. Cross-modal attention is then employed to fuse the sequence-level and simulated textual representations, thereby enhancing feature learning within the CT sequences by incorporating semantic information from textual descriptions. Furthermore, contrastive learning is applied to learn discriminative features by maximizing the similarity of positive sample pairs and minimizing the similarity of negative sample pairs. Extensive experiments on a self-constructed pneumonia CT sequences dataset demonstrate that the proposed model significantly improves classification performance.
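The fusion and contrastive steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the attention formulation, the loss, or how positive pairs are defined. The sketch assumes scaled dot-product attention with CT slice features as queries and simulated text features as keys and values, a residual fusion, and an InfoNCE-style loss in which a CT embedding and the text embedding at the same row index form the positive pair. The function names `cross_modal_attention` and `contrastive_loss`, the temperature value, and all shapes are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(ct_feats, text_feats):
    """Fuse CT features with text features (assumed formulation).

    ct_feats:   (n_slices, d) array used as queries.
    text_feats: (n_tokens, d) array used as keys and values.
    Returns an (n_slices, d) array: CT features plus attended text features.
    """
    d = ct_feats.shape[-1]
    scores = ct_feats @ text_feats.T / np.sqrt(d)  # (n_slices, n_tokens)
    weights = softmax(scores, axis=-1)             # each row sums to 1
    return ct_feats + weights @ text_feats         # residual fusion (assumption)


def contrastive_loss(ct_emb, text_emb, temperature=0.1):
    """InfoNCE-style loss (assumed); row i of each input is a positive pair."""
    ct = ct_emb / np.linalg.norm(ct_emb, axis=1, keepdims=True)
    tx = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = ct @ tx.T / temperature               # cosine similarity / temperature
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))     # positives lie on the diagonal


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ct = rng.normal(size=(4, 8))     # 4 hypothetical sequence-level CT embeddings
    text = rng.normal(size=(4, 8))   # 4 hypothetical text embeddings
    fused = cross_modal_attention(ct, text)
    print(fused.shape, contrastive_loss(ct, text))
```

Lowering the loss pulls each CT embedding toward its paired text embedding and pushes it away from the other embeddings in the batch, which corresponds to the "maximize positive similarity, minimize negative similarity" objective stated above.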

Topics

  • Tomography, X-Ray Computed
  • Pneumonia
  • Journal Article
