PD-YOLO: a high-precision detection network for fine-grained lung cancer and multi-scale pulmonary nodules in CT images.
Authors
Affiliations (5)
- Department of Intensive Care Unit, Jinjiang Municipal Hospital, Jinjiang, 362200, Fujian, China.
- Department of General Surgery, the Second Affiliated Hospital of Fujian Medical University, QuanZhou, 362000, Fujian, China.
- Department of General Practice, Jinling Hospital, Medical School of Nanjing University, Nanjing, China.
- Quanzhou Hospital of Traditional Chinese medicine, Quanzhou, China.
- Department of Anesthesiology, Affiliated Hospital of Medical School, Jinling Hospital, Nanjing University, Nanjing, China.
Abstract
Timely and accurate computed tomography (CT) screening is crucial for the early clinical treatment of lung cancer and for preventing the progression of malignant pulmonary nodules. However, because lesions vary widely in scale and often have blurred boundaries, existing deep learning models struggle to balance computational cost, receptive field size, and the precise reconstruction of fine details. To address these challenges, this paper integrates three complementary components (A2C2f_DFFN, SPPF_LSKA, and DySample) into the YOLOv12n framework to construct PD-YOLO, a high-precision, fine-grained lung cancer detection model. The main contribution lies in the synergistic optimization of these modules for CT-based pulmonary lesion detection, rather than in the theoretical novelty of any single component. Specifically, the A2C2f_DFFN module uses a Dynamic Feed-Forward Network to enhance the non-linear feature representation of low-contrast early lesions. The SPPF_LSKA module combines Spatial Pyramid Pooling with Large Separable Kernel Attention, expanding the effective receptive field at low computational overhead to handle the large tumor consolidations typical of advanced lung cancer. In addition, an ultra-lightweight dynamic upsampling module (DySample) adaptively reconstructs high-resolution features, mitigating the loss of edge textures and minor infiltrates during feature propagation. Extensive experiments on a chest CT dataset containing 4200 images demonstrate that PD-YOLO achieves a mean Average Precision (mAP@0.5) of 97.3%, outperforming state-of-the-art detectors including YOLOv12s and YOLOv12n.
The proposed model balances missed detections against false alarms and shows robust overall detection performance in complex clinical applications, particularly the automated screening and localization of malignant pulmonary lesions.
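For readers unfamiliar with the reported metric, the following is a minimal sketch of how mAP@0.5 is commonly computed. It is not the authors' evaluation code: the abstract does not state the evaluation protocol, so the sketch assumes greedy matching of score-sorted predictions to ground-truth boxes at an IoU threshold of 0.5, all-point interpolation of the precision-recall curve, and boxes pooled as if from a single image. The names `iou`, `average_precision`, and `mean_ap` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format
Pred = Tuple[float, Box]                 # (confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Pred], gts: List[Box], iou_thr: float = 0.5) -> float:
    """AP for one class: area under the interpolated precision-recall curve."""
    if not gts:
        return 0.0
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    matched = [False] * len(gts)
    hits: List[int] = []
    for _, box in ranked:
        # Find the best still-unmatched ground-truth box for this prediction.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if matched[j]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True
            hits.append(1)  # true positive
        else:
            hits.append(0)  # false positive
    # Cumulative precision and recall down the ranked list.
    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision monotonically non-increasing (the PR "envelope").
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_ap(per_class: Dict[str, Tuple[List[Pred], List[Box]]], iou_thr: float = 0.5) -> float:
    """Mean of per-class AP values; per_class maps class name to (preds, gts)."""
    if not per_class:
        return 0.0
    aps = [average_precision(p, g, iou_thr) for p, g in per_class.values()]
    return sum(aps) / len(aps)
```

A production evaluator matches predictions to ground truth image by image across the whole test set and may use a different interpolation scheme, so absolute values from this sketch are not comparable to the 97.3% reported above.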