Comparative evaluation of machine learning models for predicting PD-L1 high expression in resectable NSCLC: a dual-center study integrating [<sup>18</sup>F]FDG PET/CT and clinicopathological features.
Authors
Affiliations (5)
- Department of Thoracic Surgery, The First Affiliated Hospital of Shantou University Medical College, Shantou, China.
- Shantou University Medical College, Shantou, China.
- Department of Gastrointestinal Surgery, The First Affiliated Hospital of Shantou University Medical College, Shantou, China.
- Department of Thoracic Surgery, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China.
- Joint Cardiac Surgery Center, The First Affiliated Hospital of Shantou University Medical College, Shantou, China.
Abstract
Accurate prediction of programmed death-ligand 1 (PD-L1) high expression (tumor proportion score [TPS] ≥50%) is important for identifying patients with resectable non-small cell lung cancer (NSCLC) who may benefit from neoadjuvant chemoimmunotherapy (nCIT). This study aimed to evaluate eight machine learning (ML) algorithms and develop a non-invasive, [<sup>18</sup>F]FDG PET/CT-based predictive model.

A retrospective, dual-center cohort of 269 patients with stage IB-IIIB resectable NSCLC who underwent [<sup>18</sup>F]FDG PET/CT for initial staging was enrolled (training set: n=216; independent external validation set: n=53). The reference standard for PD-L1 status was immunohistochemistry (IHC) using the 22C3 assay, with TPS ≥50% defined as high expression. Baseline clinicopathological features and the maximum standardized uptake value (SUVmax) of the primary tumor were extracted. Feature selection was performed using LASSO regression. Eight ML algorithms were trained and evaluated using internal validation with 1,000 bootstrap resamples and external validation. Pairwise comparisons of the area under the receiver operating characteristic curve (AUC) between models were conducted using DeLong's test with Bonferroni correction. A two-tailed adjusted P < 0.05 was considered statistically significant. A quantitative nomogram was subsequently constructed based on the optimal parsimonious model.

Of the 269 enrolled patients, 79.9% were male and 32.0% were older than 65 years. LASSO regression identified five core predictors: smoking status, histological type, T stage, histological grade, and SUVmax. In the independent external validation set, Support Vector Machine (SVM) (AUC = 0.858), Random Forest (RF) (AUC = 0.849), and Logistic Regression (LR) (AUC = 0.833) demonstrated good discriminative performance. However, DeLong's test indicated no statistically significant advantage of the complex models over the traditional LR model (all adjusted P = 1.000). Prioritizing model transparency and interpretability, an LR-based nomogram was established, which exhibited favorable calibration and provided clinical net benefit across a wide range of threshold probabilities in both cohorts.

We developed and validated an interpretable, [<sup>18</sup>F]FDG PET/CT-based nomogram integrating SUVmax and clinicopathological features to non-invasively predict PD-L1 high expression in resectable NSCLC. This study suggests that traditional LR offers predictive accuracy comparable to that of complex ML algorithms, while providing enhanced clinical transparency and a potential non-invasive adjunct for personalized nCIT decision-making.
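
The reported "all adjusted P = 1.000" follows naturally from how Bonferroni correction works: each raw p-value is multiplied by the number of comparisons and capped at 1. The sketch below illustrates this under stated assumptions. It assumes all 28 pairwise comparisons among the eight models were corrected together, which the abstract does not specify. The model labels other than LR, SVM, and RF are placeholders, and the raw p-values are invented for illustration; they are not results from the study.

```python
from itertools import combinations


def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of tests, capping at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Eight model labels. Only LR, SVM and RF are named in the abstract;
# M4..M8 are placeholders for the five unnamed algorithms.
models = ["LR", "SVM", "RF", "M4", "M5", "M6", "M7", "M8"]

# ASSUMPTION: every pair of models is compared, giving C(8, 2) = 28 tests.
pairs = list(combinations(models, 2))

# Hypothetical raw DeLong p-values (NOT study data), one per pair.
raw_p = [0.04 + 0.03 * i for i in range(len(pairs))]

adjusted = bonferroni_adjust(raw_p)

for (a, b), p, p_adj in zip(pairs[:3], raw_p, adjusted):
    print(f"{a} vs {b}: raw P = {p:.3f}, adjusted P = {p_adj:.3f}")
```

With 28 comparisons, any raw p-value above 1/28 (about 0.036) is capped at an adjusted value of 1.000, so a uniform adjusted P of 1.000 indicates only that no raw p-value fell below that level, not that the models performed identically.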