Clinicopathological-CT model for predicting PD-L1 expression in resectable early-stage non-small cell lung cancer
Original Article

Yaoyao Zhuo1,2,3#, Qingle Wang1,2,3#, Yi Zhan4, Shuyi Yang1,2,3, Haoling Zhang1,2,3, Shan Yang1,2,3, Zhiyong Zhang1,2,3, Fei Shan2,4,5

1Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China; 2Shanghai Institute of Medical Imaging, Shanghai, China; 3Department of Cancer Center, Zhongshan Hospital, Fudan University, Shanghai, China; 4Department of Radiology, Shanghai Public Health Clinical Center, Fudan University, Shanghai, China; 5Research Institute of Big Data, Fudan University, Shanghai, China

Contributions: (I) Conception and design: Y Zhuo, Q Wang, F Shan; (II) Administrative support: Y Zhuo, Q Wang; (III) Provision of study materials or patients: Y Zhuo, Q Wang, Y Zhan; (IV) Collection and assembly of data: Y Zhan, Shuyi Yang; (V) Data analysis and interpretation: Y Zhuo, Q Wang, F Shan; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work as co-first authors.

Correspondence to: Fei Shan, MD, PhD. Shanghai Institute of Medical Imaging, Shanghai, China; Department of Radiology, Shanghai Public Health Clinical Center, Fudan University, No. 2901 Caolang Road, Jinshan District, Shanghai 201508, China; Research Institute of Big Data, Fudan University, Shanghai 200032, China. Email: shanfei_2901@163.com.

Background: The expression of programmed death-ligand 1 (PD-L1) has an impact on survival outcomes in non-small cell lung cancer (NSCLC) patients, but preoperative diagnosis is challenging. This study aimed to construct and validate a non-invasive model for predicting PD-L1 expression in early-stage resected NSCLC based on computed tomography (CT) features and clinicopathological characteristics.

Methods: In this retrospective study, the clinical, pathological, and CT data were obtained from consecutive NSCLC patients who had undergone resection from January 2016 to March 2018. The clinicopathologic, CT, and clinicopathologic-CT models were constructed after univariate and multivariate logistic regression analyses. The Kaplan-Meier analysis and log-rank test were used for survival analysis.

Results: A total of 679 consecutive patients with 695 early-stage NSCLC nodules were included: 243 nodules {median age 57 [interquartile range (IQR), 48–63] years; 152 females} in the positive PD-L1 group and 452 [median age 58 (IQR, 50–65) years; 315 females] in the negative PD-L1 group. Smoking history, spread through air spaces (STAS), average CT value, lobulation, and cytokeratin 7 (CK7) were independent predictors of positive PD-L1 NSCLC. In the validation set, the area under the curve (AUC) values of the clinicopathologic, CT, and clinicopathologic-CT models were 0.630 [95% confidence interval (CI): 0.621–0.702; sensitivity =0.701; specificity =0.542], 0.629 (95% CI: 0.574–0.638; sensitivity =0.624; specificity =0.615), and 0.819 (95% CI: 0.740–0.837; sensitivity =0.763; specificity =0.760), respectively. The clinicopathologic-CT model had higher predictive performance than the other two models by DeLong test in both the training and validation sets.

Conclusions: Smoking history, STAS, average CT value, lobulation, and CK7 might be helpful in the diagnosis of PD-L1 expression in patients with early-stage NSCLC. The clinicopathologic-CT model had higher predictive performance than the clinicopathologic and CT models.

Keywords: Non-small cell lung cancer (NSCLC); programmed death-ligand 1 (PD-L1); computed tomography (CT); logistic regression


Submitted Jul 21, 2025. Accepted for publication Nov 03, 2025. Published online Dec 29, 2025.

doi: 10.21037/jtd-2025-1439


Highlight box

Key findings

• Smoking history, average computed tomography (CT) value, and lobulation were independent predictors of positive programmed death-ligand 1 (PD-L1) non-small cell lung cancer.

What is known and what is new?

• This study investigates predictors of PD-L1 expression based on CT and clinicopathological features.

• The clinicopathologic-CT model achieved an area under the curve of 0.819 in the validation set, higher than the clinicopathologic and CT models.

What is the implication, and what should change now?

• Current PD-L1 testing in lung cancer has limitations, requiring more objective detection methods.


Introduction

Lung cancer remains the leading cause of cancer-related deaths worldwide, with non-small cell lung cancer (NSCLC) representing the most common pathological type, accounting for approximately 80% of cases (1,2). Surgical resection is the primary curative treatment for stage I and II NSCLC, often followed by adjuvant chemotherapy for stage II patients (3,4). Immunotherapy targeting programmed death-ligand 1 (PD-L1), also known as cluster of differentiation 274 (CD274), and its receptor programmed cell death protein 1 (PD-1) has significantly improved survival outcomes in NSCLC patients (5-7).

PD-1 protein, a key T-cell coinhibitory receptor, is activated by its ligands, primarily PD-L1, leading to the recruitment of SHP-2 phosphatase and suppression of T-cell receptor and CD28 signaling (8). Studies have demonstrated that antibody-mediated blockade of PD-L1 can induce durable tumor regression and prolonged disease stabilization in patients with advanced tumors (9,10). According to the National Comprehensive Cancer Network (NCCN) guidelines (version 3.2022), advanced NSCLC patients with PD-L1 expression may benefit from immune checkpoint inhibitors (ICIs) as monotherapy or in combination with chemotherapy, with immunotherapy recommended as a first-line treatment (3). Although ICIs are not yet approved for neoadjuvant therapy in current guidelines, numerous clinical trials have shown that ICIs achieve high major pathological response (MPR) rates with manageable toxicity in resectable NSCLC (11,12).

However, the predictive role of PD-L1 expression varies across studies due to differences in assays, antibodies, platforms, and thresholds (13). Additionally, testing results can be influenced by factors such as insufficient tumor quantity or quality and sample heterogeneity, limiting its widespread clinical application (14). Therefore, it is particularly important to explore the correlation between PD-L1 expression level and the clinical and imaging characteristics of patients.

Shao et al. proposed a multi-label, multi-task deep learning system for non-invasively predicting actionable NSCLC mutations and PD-L1 expression based on computed tomography (CT) images, but their approach relied on radiomics rather than CT features and clinical information (15). Yoon et al. studied 153 patients with advanced lung adenocarcinoma and found no significant differences in CT features or clinical characteristics between PD-L1-positive and PD-L1-negative groups (P>0.05 for all) (16). To date, most studies exploring the correlation between PD-L1 expression and clinical/imaging characteristics have focused on advanced lung cancer, utilizing deep learning or radiomics models (17,18). Consequently, further research is needed to non-invasively predict PD-L1 expression in NSCLC based on CT images and clinical data.

This study aimed to retrospectively analyze the CT features and clinical characteristics of NSCLC patients who underwent PD-L1 testing. We constructed and compared three models (clinicopathological, CT, and clinicopathological-CT) to non-invasively predict PD-L1 expression in early-stage NSCLC patients, providing a reference for clinical immunotherapy decision-making. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1439/rc).


Methods

Study population

The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Biomedical Research Ethics Committee of Zhongshan Hospital (No. B2021-429R). Informed consent was waived in this retrospective study. Clinical, pathological, and CT data were collected from consecutive NSCLC patients who underwent curative resection at Zhongshan Hospital from January 2016 to March 2018. A total of 2,523 patients with pathologically confirmed NSCLC were initially included. Exclusion criteria were as follows: (I) absence of consecutive thin-section CT images (1 mm) within 1 month before surgery; (II) incomplete clinicopathological data; (III) receiving any treatment (radiotherapy, chemotherapy, or chemoradiotherapy) before CT scanning; and (IV) presence of other malignant tumors. The tumor-node-metastasis (TNM) categories of pulmonary nodules were assigned according to the 9th edition of the International Association for the Study of Lung Cancer (IASLC) TNM classification of lung cancer (19). Ultimately, 679 patients with 695 NSCLC nodules were included. The details of inclusion and exclusion criteria are shown in Figure 1.

Figure 1 Flow diagrams show the pathway of patient inclusion and exclusion, and workflow of data analysis. Red box, data inclusion and screening; green box, model construction; blue box, model validation; yellow box, results. CT, computed tomography; NSCLC, non-small cell lung cancer; PD-L1, programmed death-ligand 1; ROC, receiver operating characteristic.

CT scanning and image acquisition

All CT scans were performed in the supine position with arms raised after deep inspiration. CT data were acquired using three scanners: Somatom Force (Siemens, Erlangen, Germany), Aquilion One/320 (Toshiba, Ōtawara, Japan), and uCT128 (United Imaging, Shanghai, China). Scan parameters included: collimation (160 mm × 0.75 mm, 160 mm × 0.5 mm, and 64 mm × 0.625 mm), tube voltage (120–130 kVp), tube current (100–150 mAs), rotation time (0.5–0.75 s), pitch (0.828–1.2), matrix (512×512), lung window settings (width/level: 1,000–1,500/−600 HU), and mediastinal window settings (width/level: 350/40 HU). Lung algorithm reconstruction was used to generate 1-mm-thick CT images.

Two radiologists (Y.Z. and Q.W., with 6 and 10 years of thoracic CT diagnostic experience, respectively) independently analyzed CT features, blinded to clinical and pathological information. Disagreements were resolved through discussion with a senior radiologist (F.S., 25 years of experience). CT features assessed included density type, location, size, spiculation, lobulation, boundary, vacuole, cavity, air bronchogram, pleural indentation, and pulmonary vascular abnormality (definitions provided in Table 1). Nodule volume and average CT attenuation were automatically extracted using the uAI Research Portal (20).

Table 1

Definitions for CT features

Variables Definition
Spiculation Radially nonbranched linear strands around the edge of the nodule
Cavity A gaseous density with maximum diameter more than 5 mm
Vacuole A gaseous density with maximum diameter less than 5 mm
Air bronchogram The tubular gas-density bronchus reaches the edge of the nodule, entering or not entering the nodule
Pleural indentation The pleura is pulled to form a triangular structure filled with fluid and connected to the lung lesion by a linear structure
Pulmonary vascular abnormality Including vessel convergence and expansion

CT, computed tomography.

Histopathological evaluation and clinical characteristics

Clinicopathological data were obtained from Zhongshan Hospital’s Hospital Information System. PD-L1 expression was evaluated by immunohistochemistry (clones 28-8, SP142, or 22C3 pharmDx) using the tumor proportion score (TPS) (21,22), defined as the proportion of positively stained tumor cells among all viable tumor cells (22,23). Only viable tumor cells with partial or complete cell membrane staining were counted as positive; cytoplasmic staining and stained immune cells were excluded. PD-L1 positivity was defined as TPS ≥1%, a threshold at which a higher frequency of response to anti-PD-1 therapy has been described in patients with PD-L1-positive vs. PD-L1-negative tumors. The 695 lung nodules were accordingly grouped into the positive PD-L1 group (≥1%) and the negative PD-L1 group (<1%).
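The TPS definition and the 1% grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the cell counts in the usage note are hypothetical and are not taken from the study's scoring workflow.

```python
def tumor_proportion_score(stained_tumor_cells: int, viable_tumor_cells: int) -> float:
    """Return TPS as a percentage: membrane-stained tumor cells / viable tumor cells.

    Both counts are assumed to already exclude immune cells and
    cytoplasm-only staining, as the text requires.
    """
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    if not 0 <= stained_tumor_cells <= viable_tumor_cells:
        raise ValueError("stained_tumor_cells must lie between 0 and viable_tumor_cells")
    return 100.0 * stained_tumor_cells / viable_tumor_cells


def pd_l1_group(tps_percent: float, threshold: float = 1.0) -> str:
    """Label a nodule as 'positive' (TPS >= threshold) or 'negative' (TPS < threshold)."""
    return "positive" if tps_percent >= threshold else "negative"
```

For example, a hypothetical nodule with 5 stained cells among 1,000 viable tumor cells has a TPS of 0.5% and falls in the negative group, whereas 10 stained cells among 1,000 reaches the 1% threshold and is labeled positive.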

Spread through air spaces (STAS) was diagnosed according to the 2015 World Health Organization classification of lung tumors (7). NSCLC with micropapillary and/or solid patterns was classified as high-grade, while other patterns were classified as low-grade (24). Complications referred to NSCLC patients with one or more of the following diseases: hypertension, diabetes, and coronary heart disease.

Follow-up and endpoints

Follow-up data were obtained from the Hospital Information System and telephone interviews. Follow-up was conducted in all included patients every 3 months for the first 2 years after surgery, every 6 months for the next 3 years, and annually thereafter. At each outpatient visit, chest CT scans, abdominal CT scans, brain magnetic resonance images or brain CT scans, bone scans, and ultrasounds of the supraclavicular region were routinely performed to detect any evidence of recurrence or metastasis. Telephone follow-up was also performed as a complement. A positron emission tomography (PET)/CT scan or biopsy was recommended when recurrence or metastasis was suspected. The primary endpoints of this study were recurrence-free survival (RFS) and overall survival (OS). RFS was defined as the time from initial surgery to the earliest evidence of recurrence, death, or the last follow-up. OS was defined as the period from the date of surgery to the time of death or the date of the last follow-up.

Statistical analysis

Statistical analyses were performed using SPSS (version 20.0) and R software (version 4.2.0; http://www.r-project.org). Intraclass correlation coefficient (ICC) and κ index were used to assess the reproducibility of continuous and categorical variables, respectively. The normality of continuous data was tested by the Shapiro-Wilk method. The continuous variables that did not conform to normal distribution were represented by median [interquartile range (IQR)] and compared by the Mann-Whitney U test. The categorical variables were represented by numbers with percentages in parentheses and compared by χ² test or Fisher’s exact test.

All lung nodules were randomly assigned to the training set (n=486) and the validation set (n=209) at a 7:3 ratio. The clinical, pathological, and CT data were analyzed by univariate and multivariate logistic regression to select the independent predictors distinguishing the positive and negative PD-L1 groups. The clinicopathologic, CT, and clinicopathologic-CT models were then constructed. The receiver operating characteristic (ROC) curve was used to evaluate the performance of each logistic regression model, and the area under the curve (AUC) was obtained. DeLong’s test was used to compare the ROC curves, and the Hosmer-Lemeshow test was used to evaluate the goodness of fit of the nomogram. To evaluate the clinical usefulness of the nomogram, a clinical decision curve was constructed using standardized net benefit and high-risk threshold (25). Survival outcomes were compared with Kaplan-Meier analysis, and the log-rank test was used to assess the difference between the curves.
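As a rough illustration of two steps in this paragraph, the sketch below performs a seeded 7:3 random split and computes an AUC with the rank-based (Mann-Whitney) formulation. It is not the authors' R workflow (which used packages such as pROC); the function names, the seed, and the toy scores in the usage note are assumptions made for the example.

```python
import random


def split_7_3(items, seed=0):
    """Shuffle items with a fixed seed and split them 7:3 (training, validation)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 7 // 10  # integer arithmetic: 695 nodules -> 486
    return shuffled[:n_train], shuffled[n_train:]


def auc_rank(scores, labels):
    """AUC as the probability that a positive case scores higher than a negative one.

    Ties count as one half. labels: 1 = PD-L1 positive, 0 = PD-L1 negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

Applied to 695 nodule identifiers, `split_7_3` returns 486 training and 209 validation items, matching the set sizes reported above. On the toy input `auc_rank([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])`, three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75.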

A two-sided P value <0.05 was considered statistically significant. The R packages and functions used were as follows: “glm”, “forestmodel”, “pROC”, “ROCR”, “rms”, “rmda”, and “survival”.


Results

Patient characteristics

The study included 679 patients with 695 early-stage NSCLC nodules [median age 57 (IQR, 48–64) years; 467 females]. Among these, 243 nodules [median age 57 (IQR, 48–63) years; 152 females] were PD-L1-positive, and 452 nodules [median age 58 (IQR, 50–65) years; 315 females] were PD-L1-negative. The continuous variables (including size, volume, and CT attenuation) with ICC ≥0.9, and the other CT features with a κ index ≥0.9 were regarded as highly repeatable.

Significant differences in clinicopathological characteristics between PD-L1-positive and PD-L1-negative groups were observed for age, smoking history, histologic subtype, pathological grade, STAS, and cytokeratin 7 (CK7) expression (P=0.03, <0.001, 0.004, 0.02, <0.001, and 0.02, respectively), whereas epidermal growth factor receptor (EGFR) mutation status did not differ significantly (P=0.31). Significant differences in CT features included density type, average CT value, volume, maximum diameter, minimum diameter, spiculation, cavity, vacuole, lobulation, air bronchogram, and pulmonary vascular abnormality (P<0.001, <0.001, 0.002, 0.005, 0.004, 0.001, 0.01, 0.01, <0.001, 0.04, and 0.001, respectively). Detailed patient characteristics are provided in Tables 2,3.

Table 2

Clinical and pathological characteristics of included patients with NSCLC

Clinical characteristics All patients (n=695) Positive PD-L1 (n=243) Negative PD-L1 (n=452) P
Age (years) 57 [48–64] 57 [48–63] 58 [50–65] 0.03
Sex 0.07
   Female 467 (67.2) 152 (62.6) 315 (69.7)
   Male 228 (32.8) 91 (37.4) 137 (30.3)
Smoking history <0.001
   Non-smoker 454 (65.3) 188 (77.4) 266 (58.8)
   Smoker 241 (34.7) 55 (22.6) 186 (41.2)
Complication 0.17
   Yes 213 (30.6) 83 (34.2) 130 (28.8)
   No 482 (69.4) 160 (65.8) 322 (71.2)
TNM categories 0.13
   I 679 (97.7) 234 (96.3) 445 (98.5)
   II 15 (2.2) 8 (3.3) 7 (1.5)
   III 1 (0.1) 1 (0.4) 0 (0)
Histologic subtype 0.004
   Adenocarcinoma 584 (84.0) 193 (79.4) 391 (86.5)
   Squamous cell carcinoma 74 (10.6) 28 (11.5) 46 (10.2)
   Others 37 (5.32) 22 (9.05) 15 (3.32)
Pathological grade 0.02
   High grade 445 (64.0) 170 (70.0) 275 (60.8)
   Low grade 250 (36.0) 73 (30.0) 177 (39.2)
STAS <0.001
   Present 43 (6.19) 31 (12.8) 12 (2.65)
   Absent 652 (93.8) 212 (87.2) 440 (97.3)
EGFR mutation 0.31
   Present 383 (55.1) 127 (52.3) 256 (56.6)
   Absent 312 (44.9) 116 (47.7) 196 (43.4)
CK7 0.02
   Positive 588 (99.0) 202 (97.6) 386 (99.7)
   Negative 6 (1.01) 5 (2.42) 1 (0.26)
   NA 101 36 65
Surgical methods 0.17
   Lobectomy 445 (64.0) 164 (67.5) 281 (62.2)
   Wedge resection 124 (17.8) 44 (18.1) 80 (17.7)
   Segmentectomy 126 (18.1) 35 (14.4) 91 (20.1)

Data are presented as median [IQR], n (%), or n. CK7, cytokeratin 7; EGFR, epidermal growth factor receptor; IQR, interquartile range; NA, not available; NSCLC, non-small cell lung cancer; PD-L1, programmed death-ligand 1; STAS, spread through air spaces; TNM, tumor-node-metastasis.

Table 3

CT features of included patients with NSCLC

Characteristics All patients (n=695) Positive PD-L1 (n=243) Negative PD-L1 (n=452) P
Density type <0.001
   Non-solid 260 (37.4) 71 (29.2) 189 (41.8)
   Partially solid 319 (45.9) 104 (42.8) 215 (47.6)
   Solid 116 (16.7) 68 (28.0) 48 (10.6)
Average CT value (HU) −536.32 [−640.19 to −376.48] −481.33 [−619.06 to −236.95] −555.50 [−655.10 to −416.20] <0.001
Tumor volume (mm³) 936 [396–2,291] 1,288 [415–2,860] 813 [385–1,949] 0.002
Maximum diameter (mm) 13.5 [10.2–23.1] 14.8 [10.5–24.5] 12.6 [10.1–19.3] 0.005
Minimum diameter (mm) 10.4 [8.52–15.7] 11.3 [8.60–17.7] 10.0 [8.46–15.9] 0.004
Location 0.11
   RUL 238 (34.24) 75 (30.89) 163 (36.06)
   RML 63 (9.06) 30 (12.35) 33 (7.30)
   RLL 119 (17.12) 36 (14.81) 83 (18.36)
   LUL 178 (25.61) 68 (27.97) 110 (24.34)
   LLL 97 (13.97) 34 (13.98) 63 (13.94)
Spiculation 0.001
   Present 180 (25.9) 81 (33.3) 99 (21.9)
   Absent 515 (74.1) 162 (66.7) 353 (78.1)
Cavity 0.01
   Present 25 (3.60) 15 (6.17) 10 (2.21)
   Absent 670 (96.4) 228 (93.8) 442 (97.8)
Vacuole 0.010
   Present 31 (4.46) 18 (7.41) 13 (2.88)
   Absent 664 (95.5) 225 (92.6) 439 (97.1)
Boundary 0.26
   Clear 489 (70.4) 178 (73.3) 311 (68.8)
   Unclear 206 (29.6) 65 (26.7) 141 (31.2)
Lobulation <0.001
   Present 300 (43.2) 138 (56.8) 162 (35.8)
   Absent 395 (56.8) 105 (43.2) 290 (64.2)
Air bronchogram 0.04
   Present 303 (43.6) 119 (49.0) 184 (40.7)
   Absent 392 (56.4) 124 (51.0) 268 (59.3)
Pleural indentation 0.10
   Present 364 (52.4) 138 (56.8) 226 (50.0)
   Absent 331 (47.6) 105 (43.2) 226 (50.0)
Pulmonary vascular abnormality 0.001
   Present 152 (21.9) 71 (29.2) 81 (17.9)
   Absent 543 (78.1) 172 (70.8) 371 (82.1)

Data are presented as n (%) or median [IQR]. CT, computed tomography; HU, Hounsfield unit; IQR, interquartile range; LLL, left lower lobe; LUL, left upper lobe; NSCLC, non-small cell lung cancer; PD-L1, programmed death-ligand 1; RLL, right lower lobe; RML, right middle lobe; RUL, right upper lobe.

The median follow-up times for RFS and OS were 70.17 (range, 0.47–89.80) and 71.47 (range, 0.47–89.80) months, respectively. In the survival analyses (Kaplan-Meier method and log-rank test), the RFS rate of all patients was 78.85% and the OS rate was 84.17%. The 5-year RFS rate was significantly higher in the PD-L1-negative group than in the PD-L1-positive group (79.20% vs. 68.31%, P<0.001), as was the 5-year OS rate (79.20% vs. 72.43%, P<0.001). Results of the survival analysis are shown in Figure 2.
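For readers less familiar with how such survival rates are estimated, the following is a minimal sketch of the Kaplan-Meier product-limit estimator. It is an illustration only: the study used the R "survival" package, and the function name and the four-patient toy data in the usage note are hypothetical, not study data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time per patient (e.g., months from surgery)
    events: 1 if the event (recurrence or death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        leaving = 0   # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Product-limit step: multiply by the conditional survival at t
            survival *= 1.0 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve
```

With toy follow-up times of 1, 2, 3, and 4 months, where the second patient is censored, `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])` steps from 0.75 at month 1 to 0.375 at month 3 and 0.0 at month 4. The censored patient lowers the number at risk without producing a drop in the curve.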

Figure 2 Kaplan-Meier survival curves. (A) OS curves for the clinicopathologic-CT prediction model show significantly longer survival for NSCLC patients in the negative PD-L1 group than for those in the positive PD-L1 group. (B) RFS curves for the clinicopathologic-CT prediction model show significantly longer survival for NSCLC patients in the negative PD-L1 group than for those in the positive PD-L1 group. CT, computed tomography; NSCLC, non-small cell lung cancer; OS, overall survival; PD-L1, programmed death-ligand 1; RFS, recurrence-free survival.

Univariate and multivariate logistic regression analysis

The results of univariate logistic regression analysis are shown in Table 4. For clinical and pathological characteristics, age, smoking history, histologic subtype, pathological grade, STAS, and CK7 were associated with positive PD-L1 NSCLC (P=0.02, 0.008, 0.01, 0.01, <0.001, and 0.05, respectively); CK7 positivity was associated with a lower risk (OR =0.11), and the other characteristics with a higher risk. For CT features, density type, average CT value, volume, maximum diameter, minimum diameter, lobulation, and pulmonary vascular abnormality were associated with a higher risk of positive PD-L1 NSCLC (P<0.001, <0.001, 0.005, 0.01, 0.01, <0.001, and 0.002, respectively).

Table 4

Univariate logistic regression analysis results

Characteristics B SE OR (95% CI) Z P
Age 0.02 0.00847 1.02 (1.004–1.037) 2.361 0.02
Sex 0.162 0.19891 1.176 (0.795–1.734) 0.814 0.42
Smoking history 0.539 0.20295 1.715 (1.151–2.553) 2.657 0.008
Complication 0.046 0.20672 1.047 (0.696–1.566) 0.221 0.83
TNM categories 0.205 0.14591 1.227 (0.923–1.637) 1.404 0.16
Histologic subtype 0.41 0.16648 1.506 (1.087–2.094) 2.461 0.01
Pathological grade 0.505 0.20461 1.656 (1.114–2.487) 2.466 0.01
STAS 1.378 0.37971 3.967 (1.926–8.659) 3.629 <0.001
EGFR −0.2 0.1887 0.819 (0.566–1.186) −1.058 0.29
CK7 −2.209 1.10031 0.11 (0.006–0.69) −2.007 0.05
Surgical methods −0.12 0.12264 0.887 (0.695–1.126) −0.975 0.33
Density type 0.591 0.13699 1.805 (1.384–2.369) 4.312 <0.001
Average CT value 0.002 0.0005 1.002 (1.001–1.003) 4.586 <0.001
Volume 0 0.00004 1 (1–1) 2.834 0.005
Maximum diameter 0.042 0.01605 1.042 (1.01–1.076) 2.589 0.01
Minimum diameter 0.057 0.02258 1.058 (1.013–1.107) 2.511 0.01
Location 0.025 0.06258 1.025 (0.907–1.16) 0.401 0.69
Spiculation 0.4 0.20472 1.492 (0.997–2.228) 1.954 0.05
Cavity 0.819 0.51301 2.269 (0.831–6.452) 1.597 0.11
Vacuole 0.665 0.44747 1.945 (0.804–4.76) 1.486 0.14
Boundary 0.218 0.20758 1.244 (0.831–1.878) 1.052 0.29
Lobulation 0.743 0.19113 2.103 (1.448–3.065) 3.889 <0.001
Air bronchogram 0.379 0.19001 1.461 (1.007–2.122) 1.995 0.05
Pleural indentation 0.249 0.18927 1.282 (0.886–1.861) 1.313 0.19
Pulmonary vascular abnormality 0.67 0.21736 1.955 (1.276–2.996) 3.083 0.002

B, beta; CI, confidence interval; CK7, cytokeratin 7; CT, computed tomography; EGFR, epidermal growth factor receptor; OR, odds ratio; SE, standard error; STAS, spread through air spaces; TNM, tumor-node-metastasis.

Multivariate logistic regression analysis (backward method) was performed on the clinicopathological characteristics and CT features that were statistically significant in univariate analysis. Smoking history [P=0.01; odds ratio (OR) =1.808; 95% confidence interval (CI): 1.155–2.830], STAS (P=0.10; OR =2.119; 95% CI: 0.877–5.417), average CT value (P=0.04; OR =1.001; 95% CI: 1.000–1.003), lobulation (P=0.04; OR =1.685; 95% CI: 1.038–2.738), and CK7 (P=0.12; OR =0.169; 95% CI: 0.008–1.161) were retained in the final model as independent predictors of positive PD-L1 NSCLC, although STAS and CK7 did not reach statistical significance (P>0.05). The results are presented in a forest plot (Figure 3), and representative CT images are shown in Figure 4. The cut-off value of average CT value between the positive and negative PD-L1 groups was −268.58 HU. The probability model was: logit = 0.592 × smoking history (1= smoker, 0= non-smoker) + 0.751 × STAS (1= present, 0= absent) + 0.001 × average CT value (HU) + 0.521 × lobulation (1= present, 0= absent) − 1.777 × CK7 (1= positive, 0= negative), where the CK7 coefficient is negative because its OR is below 1 [ln(0.169) ≈ −1.78].
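In a logistic regression model, each coefficient is the natural logarithm of the corresponding odds ratio, so the coefficients of the probability model can be checked against the reported ORs. The sketch below does this and converts a linear predictor into a probability. The dictionary keys and function names are illustrative, and because the model intercept is not reported in the text, `intercept` is left as a hypothetical parameter that the caller must supply.

```python
import math

# Odds ratios as reported in the multivariate analysis above.
REPORTED_OR = {
    "smoking_history": 1.808,      # 1 = smoker, 0 = non-smoker
    "stas": 2.119,                 # 1 = present, 0 = absent
    "average_ct_value_hu": 1.001,  # per 1 HU
    "lobulation": 1.685,           # 1 = present, 0 = absent
    "ck7": 0.169,                  # 1 = positive, 0 = negative
}


def coefficients_from_odds_ratios(odds_ratios):
    """Logistic coefficients are ln(OR); an OR below 1 yields a negative coefficient."""
    return {name: math.log(value) for name, value in odds_ratios.items()}


def predicted_probability(features, coefficients, intercept):
    """Convert a linear predictor (logit) to a probability.

    `intercept` is NOT reported in the article; any value passed here is an assumption.
    """
    logit = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-logit))
```

Taking logarithms of the reported ORs gives approximately 0.592 for smoking history and 0.751 for STAS, in line with the stated formula, and approximately −1.78 for CK7, since its OR is below 1. Without the intercept, only relative changes in the logit can be interpreted, not absolute probabilities.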

Figure 3 Forest plot of the multivariate logistic regression analysis for predicting PD-L1 expression. CI, confidence interval; CK7, cytokeratin 7; CT, computed tomography; OR, odds ratio; PD-L1, programmed death-ligand 1; STAS, spread through air spaces.
Figure 4 Representative CT morphological characteristics of pulmonary nodules with positive and negative PD-L1. (A,B) A 70-year-old female with a lobulated pulmonary nodule; the average CT value was 30.282 HU. (C,D) A 67-year-old male with a pulmonary nodule without lobulation; the average CT value was −743.826 HU. CT, computed tomography; HU, Hounsfield unit; PD-L1, programmed death-ligand 1.

Model construction and comparison

The clinicopathologic, CT, and clinicopathologic-CT models were established based on the results of univariate and multivariate logistic regression analyses. The calculation formula of the clinicopathologic-CT model was: logit = 0.592 × smoking history (1= smoker, 0= non-smoker) + 0.751 × STAS (1= present, 0= absent) + 0.001 × average CT value (HU) + 0.521 × lobulation (1= present, 0= absent) − 1.777 × CK7 (1= positive, 0= negative); the calculation formula of the CT model was: logit = 0.001 × average CT value (HU) + 0.521 × lobulation (1= present, 0= absent); and the calculation formula of the clinicopathologic model was: logit = 0.592 × smoking history (1= smoker, 0= non-smoker) + 0.751 × STAS (1= present, 0= absent) − 1.777 × CK7 (1= positive, 0= negative). The CK7 term is negative because the OR for CK7 is below 1 (OR =0.169). In the training set, the AUCs of the clinicopathologic, CT, and clinicopathologic-CT models were 0.682 (95% CI: 0.643–0.741; sensitivity =0.724; specificity =0.519), 0.642 (95% CI: 0.589–0.647; sensitivity =0.566; specificity =0.688), and 0.804 (95% CI: 0.738–0.835; sensitivity =0.797; specificity =0.731), respectively (Figure 5A). In the validation set, the corresponding AUCs were 0.630 (95% CI: 0.621–0.702; sensitivity =0.701; specificity =0.542), 0.629 (95% CI: 0.574–0.638; sensitivity =0.624; specificity =0.615), and 0.819 (95% CI: 0.740–0.837; sensitivity =0.763; specificity =0.760), respectively (Figure 5B).

Figure 5 ROC curves analysis results. (A) ROC curves of the clinicopathologic, CT, and clinicopathologic-CT models for predicting PD-L1 in the training set. (B) ROC curves of the clinicopathologic, CT, and clinicopathologic-CT models for predicting PD-L1 in the validation set. (C) The Hosmer-Lemeshow test of clinicopathologic-CT model in the training set. (D) The Hosmer-Lemeshow test of clinicopathologic-CT model in the validation set. AUC, area under the curve; CT, computed tomography; PD-L1, programmed death-ligand 1; ROC, receiver operating characteristic.

Both in training and validation sets, the clinicopathologic-CT model had higher predictive performance than the other two models by DeLong test (P<0.001, 95% CI: −0.206 to −0.015; P<0.001, 95% CI: −0.2235 to −0.124). There were no statistically significant differences between the clinicopathologic and CT models, both in the training and validation sets (P=0.42, 95% CI: −0.057 to 0.137; P=0.96, 95% CI: −0.056 to 0.059).

The bootstrap method with 1,000 resamples was used to generate the calibration curve and ensure the stability of the results. The Hosmer-Lemeshow test showed that the clinicopathologic-CT model was well calibrated in both the training and validation sets (P=0.98 and 0.10, respectively) (Figure 5C,5D).

Nomogram construction and performance

The nomogram of the clinicopathologic-CT model is shown in Figure 6A. The clinical decision curve was constructed using standardized net benefit and high-risk threshold to evaluate the clinical usefulness of the nomogram. The clinical decision curve showed that the nomogram provided a net benefit compared with the treat-all and treat-none strategies (Figure 6B).
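The quantity plotted on a decision curve can be illustrated with the standard net benefit formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability; the standardized version divides this by the prevalence. The sketch below is an assumption-labeled illustration rather than the "rmda" implementation used in the study, and the function names and toy inputs in the usage note are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the treat-all strategy; treat-none is always 0."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def standardized_net_benefit(y_true, y_prob, threshold):
    """Net benefit divided by prevalence, so a perfect model scores 1."""
    prevalence = sum(y_true) / len(y_true)
    return net_benefit(y_true, y_prob, threshold) / prevalence
```

On a toy set with outcomes `[1, 1, 0, 0]` and predicted probabilities `[0.9, 0.6, 0.7, 0.1]`, a threshold of 0.5 gives two true positives and one false positive, so the model's net benefit is 0.25, compared with 0.0 for treat-all and 0 for treat-none. A model is considered useful at a given threshold when its curve lies above both reference strategies.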

Figure 6 The nomogram and decision curve of the clinicopathologic-CT model. (A) The nomogram predicts the risk of positive PD-L1 in NSCLC patients by combining all the variables that were statistically significant in univariate logistic regression analysis. Smoking history: 1= smoker, 0= non-smoker; histologic subtype: 3= adenocarcinoma, 2= squamous cell carcinoma, 1= others; pathological grade: 2= high grade, 1= low grade; STAS: 1= present, 0= absent; density type: 3= non-solid, 2= partially solid, 1= solid; lobulation: 1= present, 0= absent; pulmonary vascular abnormality: 1= present, 0= absent; CK7: 1= positive, 0= negative. (B) The decision curve of the clinicopathologic-CT model. The X-axis represents the threshold probability and the Y-axis represents the net benefit. The decision curve shows a net benefit over the threshold probability range of 0.2 to 0.8. CK7, cytokeratin 7; CT, computed tomography; HU, Hounsfield unit; Nomo, nomogram; NSCLC, non-small cell lung cancer; PD-L1, programmed death-ligand 1; STAS, spread through air spaces.

Discussion

This study demonstrated that CT features and clinicopathological characteristics—including smoking history, STAS, average CT value, lobulation, and CK7—may aid in the differential diagnosis of PD-L1-positive NSCLC patients. While some studies have utilized CT images to construct predictive models for PD-L1 expression in lung cancer, few have explored the differences in CT features and clinicopathological characteristics between PD-L1-positive and PD-L1-negative groups (15,17,18).

In this study, smoking history (P=0.01; OR =1.808), STAS (P=0.10; OR =2.119), average CT value (P=0.04; OR =1.001), lobulation (P=0.04; OR =1.685), and CK7 (P=0.12; OR =0.169) were identified as independent predictors of PD-L1-positive NSCLC. In the validation set, the AUC values for the clinicopathological model, CT model, and clinicopathological-CT model were 0.630 (95% CI: 0.621–0.702; sensitivity =0.701; specificity =0.542), 0.629 (95% CI: 0.574–0.638; sensitivity =0.624; specificity =0.615), and 0.819 (95% CI: 0.740–0.837; sensitivity =0.763; specificity =0.760), respectively. The clinicopathological-CT model demonstrated significantly higher predictive performance than the other two models, as confirmed by DeLong’s test in both the training and validation sets (P<0.001, 95% CI: −0.206 to −0.015; P<0.001, 95% CI: −0.2235 to −0.124). Additionally, decision curve analysis indicated that the nomogram provided net benefits compared to the treat-all and treat-none strategies.

Kaplan-Meier analysis and log-rank tests revealed that both the 5-year RFS and OS rates were significantly higher in the PD-L1-negative group compared to the PD-L1-positive group. These findings align with previous studies suggesting that PD-L1 expression is a biomarker of poor prognosis in advanced NSCLC (26,27). A large meta-analysis involving 55 studies and 11,383 patients further supported the role of PD-L1 as a poor prognostic biomarker in lung cancer (28). Miyazawa et al. demonstrated that PD-L1 expression increases with tumor progression, correlating with worse outcomes in NSCLC patients (29). However, most prior studies focused on advanced-stage NSCLC, highlighting the need for research in early-stage disease.

The prognostic impact of PD-L1 expression in early-stage NSCLC is equally important. Another meta-analysis of 15 studies involving 3,790 patients indicated that PD-L1 expression predicts unfavorable outcomes in surgically resected early-stage NSCLC, consistent with our results (30). Abnormal activation of multiple oncogenes and signaling pathways in tumor cells contributes to the regulation of PD-L1. PD-L1 expression is higher in tumors with greater lymphatic invasion and a higher risk of tumor spread, and the prognosis of these patients is correspondingly worse (8,9,31).

Among clinicopathological characteristics, a history of smoking was significantly associated with PD-L1 positivity, as confirmed by multivariate logistic regression analysis. Smoking is a well-established risk factor for lung cancer and has been shown to influence prognosis (32,33). In a study of 189 NSCLC patients, Ng et al. found that smoking status (P=0.008) remained a significant predictor of PD-L1 expression (≥1%) and was the most accessible predictor of response to PD-1/PD-L1 inhibitors (34).

Regarding CT features, our study identified average CT value and lobulation as independent predictors of PD-L1 positivity in NSCLC. Few studies have explored the relationship between CT features and PD-L1 expression in lung cancer. Yoon et al. (16) reported no significant differences in clinical characteristics (e.g., age, sex, smoking) or CT features (e.g., solid and subsolid nodules) between PD-L1-positive and PD-L1-negative groups in 153 advanced lung adenocarcinoma patients (P>0.05 for all), which contrasts with our findings. That study, however, enrolled patients with advanced lung adenocarcinoma, whereas ours focused on consecutive patients with early-stage NSCLC. According to the 9th edition of the IASLC TNM classification of lung cancer (19), lobulation and CT density (cut-off value: −268.584 HU) are indicative features of malignant lung tumors, supporting our results.

In this study, we developed three models—clinicopathological, CT, and clinicopathological-CT—and found that the clinicopathological-CT model achieved the highest diagnostic efficiency, with AUC values of 0.804 (95% CI: 0.738–0.835; sensitivity =0.797; specificity =0.731) in the training set and 0.819 (95% CI: 0.740–0.837; sensitivity =0.763; specificity =0.760) in the validation set. Tian et al. (18) constructed a deep learning model using a convolutional neural network based on CT images, reporting AUC values of 0.78 (95% CI: 0.75–0.80), 0.71 (95% CI: 0.59–0.81), and 0.76 (95% CI: 0.66–0.85) in the training, validation, and test sets, respectively, which were lower than those of our model. Shao et al. (15) proposed a multi-label, multi-task deep learning system for NSCLC, achieving AUC values of 0.856 (95% CI: 0.663–0.948) for identifying a 10-molecular status panel (including PD-L1) and 0.868 (95% CI: 0.641–0.972) for classifying EGFR/PD-L1 subtypes. Compared to these studies, our approach leverages routinely available clinical and CT data, offering a practical and non-invasive method for predicting PD-L1 expression in NSCLC.
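Because the comparison above rests on AUC, sensitivity, and specificity, the following minimal Python sketch shows how these quantities can be computed from predicted scores. The labels, scores, and cutoff are invented toy values, not outputs of the models in this study, and the function names are hypothetical.

```python
def auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    # Each positive-negative pair scores 1 for a win and 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_score, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (assumption): 1 = PD-L1 positive, 0 = PD-L1 negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.2, 0.1]

model_auc = auc(y_true, y_score)
sens, spec = sensitivity_specificity(y_true, y_score, cutoff=0.5)
print(f"AUC = {model_auc:.3f}, sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

The AUC summarizes discrimination across all cutoffs, whereas sensitivity and specificity depend on the single cutoff chosen, so the two should be read together when comparing models.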

This study had several limitations. First, although the sample size was substantial, the retrospective, single-center design may introduce bias. Second, we used a 1% threshold to categorize PD-L1 expression as positive or negative, rather than employing a more granular classification [e.g., negative (<1%), low (1–50%), and high (>50%)] (22,23). An early report used a 5% threshold for positivity (31), and the choice of cutoff may influence the results. Further studies with detailed PD-L1 expression stratification are needed. Third, interobserver variability in CT feature assessment could affect model stability, although we selected only features with high reproducibility (ICC and κ index >0.9) for model construction.


Conclusions

In conclusion, CT features and clinicopathological characteristics, including smoking history, STAS, average CT value, lobulation, and CK7, may assist in diagnosing PD-L1 expression in early-stage NSCLC. The clinicopathological-CT model demonstrated superior predictive performance compared to the clinicopathological and CT models in both training and validation sets. Decision curve analysis confirmed the clinical utility of the nomogram, showing net benefits over treat-all and treat-none strategies.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1439/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1439/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1439/prf

Funding: This work was supported by the Zhongshan Hospital Funding (No. 2024XKPT22-RC1) and the National Natural Science Foundation of China (General Program) (No. 82172030).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1439/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Biomedical Research Ethics Committee of Zhongshan Hospital (No. B2021-429R). Informed consent was waived in this retrospective study.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Nicholson AG, Tsao MS, Beasley MB, et al. The 2021 WHO Classification of Lung Tumors: Impact of Advances Since 2015. J Thorac Oncol 2022;17:362-87.
  2. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
  3. Ettinger DS, Wood DE, Aisner DL, et al. Non-Small Cell Lung Cancer, Version 3.2022, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw 2022;20:497-530.
  4. Pignon JP, Tribodet H, Scagliotti GV, et al. Lung adjuvant cisplatin evaluation: a pooled analysis by the LACE Collaborative Group. J Clin Oncol 2008;26:3552-9.
  5. Reck M, Remon J, Hellmann MD. First-Line Immunotherapy for Non-Small-Cell Lung Cancer. J Clin Oncol 2022;40:586-97.
  6. Reda M, Ngamcherdtrakul W, Nelson MA, et al. Development of a nanoparticle-based immunotherapy targeting PD-L1 and PLK1 for lung cancer treatment. Nat Commun 2022;13:4261.
  7. Travis WD, Brambilla E, Burke AP, et al. Introduction to The 2015 World Health Organization Classification of Tumors of the Lung, Pleura, Thymus, and Heart. J Thorac Oncol 2015;10:1240-2.
  8. Ai L, Xu A, Xu J. Roles of PD-1/PD-L1 Pathway: Signaling, Cancer, and Beyond. Adv Exp Med Biol 2020;1248:33-59.
  9. Brahmer JR, Tykodi SS, Chow LQ, et al. Safety and activity of anti-PD-L1 antibody in patients with advanced cancer. N Engl J Med 2012;366:2455-65.
  10. Yi M, Zheng X, Niu M, et al. Combination strategies with PD-1/PD-L1 blockade: current advances and future directions. Mol Cancer 2022;21:28.
  11. Reck M, Rodríguez-Abreu D, Robinson AG, et al. Updated Analysis of KEYNOTE-024: Pembrolizumab Versus Platinum-Based Chemotherapy for Advanced Non-Small-Cell Lung Cancer With PD-L1 Tumor Proportion Score of 50% or Greater. J Clin Oncol 2019;37:537-46.
  12. Uprety D, Mandrekar SJ, Wigle D, et al. Neoadjuvant Immunotherapy for NSCLC: Current Concepts and Future Approaches. J Thorac Oncol 2020;15:1281-97.
  13. Miller PG, Li G, Singal G. PD-L1 Status and Survival in Patients With Lung Cancer-Reply. JAMA 2019;322:783-4.
  14. Yang SR, Schultheis AM, Yu H, et al. Precision medicine in non-small cell lung cancer: Current applications and future directions. Semin Cancer Biol 2022;84:184-98.
  15. Shao J, Ma J, Zhang S, et al. Radiogenomic System for Non-Invasive Identification of Multiple Actionable Mutations and PD-L1 Expression in Non-Small Cell Lung Cancer Based on CT Images. Cancers (Basel) 2022;14:4823.
  16. Yoon J, Suh YJ, Han K, et al. Utility of CT radiomics for prediction of PD-L1 expression in advanced lung adenocarcinomas. Thorac Cancer 2020;11:993-1004.
  17. Lim CH, Koh YW, Hyun SH, et al. A Machine Learning Approach Using PET/CT-based Radiomics for Prediction of PD-L1 Expression in Non-small Cell Lung Cancer. Anticancer Res 2022;42:5875-84.
  18. Tian P, He B, Mu W, et al. Assessing PD-L1 expression in non-small cell lung cancer and predicting responses to immune checkpoint inhibitors using deep learning on computed tomography images. Theranostics 2021;11:2098-107.
  19. Asamura H, Nishimura KK, Giroux DJ, et al. IASLC Lung Cancer Staging Project: The New Database to Inform Revisions in the Ninth Edition of the TNM Classification of Lung Cancer. J Thorac Oncol 2023;18:564-75.
  20. Wu J, Xia Y, Wang X, et al. uRP: An integrated research platform for one-stop analysis of medical images. Front Radiol 2023;3:1153784.
  21. Herbst RS, Baas P, Kim DW, et al. Pembrolizumab versus docetaxel for previously treated, PD-L1-positive, advanced non-small-cell lung cancer (KEYNOTE-010): a randomised controlled trial. Lancet 2016;387:1540-50.
  22. Mok TSK, Wu YL, Kudaba I, et al. Pembrolizumab versus chemotherapy for previously untreated, PD-L1-expressing, locally advanced or metastatic non-small-cell lung cancer (KEYNOTE-042): a randomised, open-label, controlled, phase 3 trial. Lancet 2019;393:1819-30.
  23. Reck M, Rodríguez-Abreu D, Robinson AG, et al. Pembrolizumab versus Chemotherapy for PD-L1-Positive Non-Small-Cell Lung Cancer. N Engl J Med 2016;375:1823-33.
  24. Moreira AL, Ocampo PSS, Xia Y, et al. A Grading System for Invasive Pulmonary Adenocarcinoma: A Proposal From the International Association for the Study of Lung Cancer Pathology Committee. J Thorac Oncol 2020;15:1599-610.
  25. Baker SG, Kramer BS. Evaluating a new marker for risk prediction: decision analysis to the rescue. Discov Med 2012;14:181-8.
  26. Okuma Y, Hosomi Y, Nakahara Y, et al. High plasma levels of soluble programmed cell death ligand 1 are prognostic for reduced survival in advanced lung cancer. Lung Cancer 2017;104:1-6.
  27. Teramoto K, Igarashi T, Kataoka Y, et al. Prognostic impact of soluble PD-L1 derived from tumor-associated macrophages in non-small-cell lung cancer. Cancer Immunol Immunother 2023;72:3755-64.
  28. Li H, Xu Y, Wan B, et al. The clinicopathological and prognostic significance of PD-L1 expression assessed by immunohistochemistry in lung cancer: a meta-analysis of 50 studies with 11,383 patients. Transl Lung Cancer Res 2019;8:429-49.
  29. Miyazawa T, Marushima H, Saji H, et al. PD-L1 Expression in Non-Small-Cell Lung Cancer Including Various Adenocarcinoma Subtypes. Ann Thorac Cardiovasc Surg 2019;25:1-9.
  30. Shi T, Zhu S, Guo H, et al. The Impact of Programmed Death-Ligand 1 Expression on the Prognosis of Early Stage Resected Non-Small Cell Lung Cancer: A Meta-Analysis of Literatures. Front Oncol 2021;11:567978.
  31. Topalian SL, Hodi FS, Brahmer JR, et al. Safety, activity, and immune correlates of anti-PD-1 antibody in cancer. N Engl J Med 2012;366:2443-54.
  32. Gemine RE, Davies GR, Lanyon K, et al. Quitting smoking improves two-year survival after a diagnosis of non-small cell lung cancer. Lung Cancer 2023;186:107388.
  33. Steliga MA, Dresler CM. Epidemiology of lung cancer: smoking, secondhand smoke, and genetics. Surg Oncol Clin N Am 2011;20:605-18.
  34. Ng TL, Liu Y, Dimou A, et al. Predictive value of oncogenic driver subtype, programmed death-1 ligand (PD-L1) score, and smoking status on the efficacy of PD-1/PD-L1 inhibitors in patients with oncogene-driven non-small cell lung cancer. Cancer 2019;125:1038-49.
Cite this article as: Zhuo Y, Wang Q, Zhan Y, Yang S, Zhang H, Yang S, Zhang Z, Shan F. Clinicopathological-CT model for predicting PD-L1 expression in resectable early-stage non-small cell lung cancer. J Thorac Dis 2025;17(12):10954-10968. doi: 10.21037/jtd-2025-1439
