Hierarchical reasoning for lung cancer detection: from multi-scale perception to hypergraph inference with CR-YOLO.
Authors
Affiliations (11)
- Department of Thoracic Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- Key Laboratory of Surgery Critical Care and Life Support (Xi'an Jiaotong University), Ministry of Education, Xi'an, Shaanxi, China.
- Department of Thoracic Radiotherapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, Zhejiang, China.
- Department of General Surgery, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- Department of Respiratory and Critical Care Medicine, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- Department of Thoracic Surgery, First Hospital of Yulin City, Yulin, Shaanxi, China.
- Zhejiang Key Laboratory of Blood-Stasis-Toxin Syndrome, Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China.
- Department of Oncology, The Second Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China.
- Department of Urology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Department of Thoracic Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China.
- Key Laboratory of Surgery Critical Care and Life Support (Xi'an Jiaotong University), Ministry of Education, Xi'an, Shaanxi, China.
Abstract
Accurate detection of lung cancer on computed tomography (CT) scans is vital for improving patient survival, yet it remains challenging for deep learning models, which struggle with the wide variation in pulmonary nodule size and with the complex reasoning that diagnosis requires. We propose CR-YOLO, a framework built around a Cognitive Reasoning C2f (CR-C2f) module that emulates a radiologist's hierarchical workflow. CR-YOLO combines three components: a Multi-scale Convolution (MSC) module for robust feature perception, Global-Local Attention (GLA) Bottlenecks that integrate local morphology with contextual dependencies, and a Hypergraph Convolution (HGC) Refiner for high-order relational inference. In experiments, CR-YOLO achieves a mean Average Precision (mAP) of 92.5%, 4.1 percentage points above the YOLOv8n baseline. Beyond the gain in accuracy, Grad-CAM analysis makes the model's predictions easier to interpret, highlighting its potential as a reliable and transparent tool for early lung cancer diagnosis.
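To make "high-order relational inference" concrete, the following is a minimal sketch of one hypergraph convolution step in plain Python. It uses the widely used normalized propagation rule X' = Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) X and omits the learnable projection. This is an assumption for illustration only: the abstract does not specify how the HGC Refiner builds its hyperedges or which normalization it applies, and the function name `hypergraph_conv` and its arguments are hypothetical.

```python
from math import sqrt


def hypergraph_conv(X, H, edge_weights=None):
    """One parameter-free hypergraph propagation step (illustrative sketch).

    X: node features, a list of n_nodes rows with n_feats floats each.
    H: incidence matrix, H[v][e] == 1 if node v belongs to hyperedge e.
    edge_weights: optional per-hyperedge weights (defaults to 1.0 each).
    """
    n_nodes, n_edges, n_feats = len(H), len(H[0]), len(X[0])
    w = edge_weights if edge_weights is not None else [1.0] * n_edges

    # Node degree: weighted number of hyperedges that contain the node.
    dv = [sum(w[e] * H[v][e] for e in range(n_edges)) for v in range(n_nodes)]
    # Hyperedge degree: number of nodes inside the hyperedge.
    de = [sum(H[v][e] for v in range(n_nodes)) for e in range(n_edges)]
    # Dv^(-1/2); isolated nodes get a zero scale so they stay at zero.
    inv_sqrt_dv = [1.0 / sqrt(d) if d > 0 else 0.0 for d in dv]

    # Step 1 (gather): each hyperedge takes the weighted average of its
    # degree-normalized member nodes.
    edge_feats = []
    for e in range(n_edges):
        row = []
        for f in range(n_feats):
            total = sum(H[v][e] * inv_sqrt_dv[v] * X[v][f] for v in range(n_nodes))
            row.append(w[e] * total / de[e] if de[e] > 0 else 0.0)
        edge_feats.append(row)

    # Step 2 (scatter): each node sums the features of its hyperedges,
    # then applies Dv^(-1/2) again.
    out = []
    for v in range(n_nodes):
        row = []
        for f in range(n_feats):
            total = sum(H[v][e] * edge_feats[e][f] for e in range(n_edges))
            row.append(inv_sqrt_dv[v] * total)
        out.append(row)
    return out
```

Because a single hyperedge can connect any number of nodes, one step mixes information across a whole group rather than across pairs. For example, with a single unit-weight hyperedge covering every node, each node's output is the mean of all input features. In a detector, the nodes would typically be flattened feature-map locations and the update would be followed by a learned linear layer; both of these are assumptions here.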