A novel nomogram for predicting mediastinal lymph node metastasis in non-small cell lung cancer: a retrospective analysis
Original Article

Jialin Mei1, Bing Zhang1, Yongyue Zhu1, Lei Chen1, Dingping Yang1, Tao Lin1, Fei Gao1, Defu Yin1, Gaofeng Li2

1Department of Cardiothoracic Surgery, The Fifth Affiliated Hospital of Dali University (Baoshan People’s Hospital), Baoshan, China; 2Department of Thoracic Surgery, The Third Affiliated Hospital of Kunming Medical University, Yunnan Hospital of Peking University Cancer Hospital (Yunnan Tumor Hospital), Kunming, China

Contributions: (I) Conception and design: J Mei, B Zhang; (II) Administrative support: Y Zhu, L Chen; (III) Provision of study materials or patients: D Yang, T Lin; (IV) Collection and assembly of data: F Gao, D Yin; (V) Data analysis and interpretation: G Li, J Mei; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Bing Zhang, MMed. Department of Cardiothoracic Surgery, The Fifth Affiliated Hospital of Dali University (Baoshan People’s Hospital), Dongcheng New Campus (at the intersection of Qingyang Road and Longquan Road, Longyang District), Baoshan 678000, China. Email: doctorbingzhang@126.com.

Background: Accurate assessment of lymph node metastasis (LNM) is crucial for preoperative staging and treatment planning in patients with lung cancer. While previous research has explored LNM risk in non-small cell lung cancer (NSCLC), clinical validation of multifactorial predictive models is lacking. This study aimed to develop and validate a dynamic nomogram for predicting LNM in NSCLC patients.

Methods: We retrospectively analysed 619 NSCLC patients and divided them into training (70%) and validation (30%) groups. Univariate and multivariate ordinal logistic regression analyses identified predictive factors for LNM. Variables were selected via least absolute shrinkage and selection operator (LASSO) regression. A dynamic nomogram was developed on the basis of logistic regression results, and its performance was evaluated through receiver operating characteristic (ROC) curve analysis, calibration plots, and decision curve analysis (DCA). The model was further validated with 1,000 bootstrap resamples.

Results: Independent predictors of LNM were ferritin, carbohydrate antigen 125 (CA125), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA199), EGFR exon 19 deletion, tumor size, and tumor location. The nomogram exhibited excellent discriminative ability, with an area under the ROC curve (AUC) of 0.846 in the training group and 0.828 in the validation group. DCA indicated a net benefit across threshold probabilities of 2–96% in the training group and 4–85% in the validation group.

Conclusions: This study presents a dynamic nomogram that integrates EGFR exon 19 deletion and serum ferritin levels, enhancing preoperative staging and aiding treatment decisions for NSCLC patients.

Keywords: Non-small cell lung cancer (NSCLC); lymph node metastasis (LNM); serum tumor markers; EGFR exon 19 deletion


Submitted Apr 04, 2025. Accepted for publication May 23, 2025. Published online Aug 23, 2025.

doi: 10.21037/jtd-2025-701


Highlight box

Key findings

• A dynamic nomogram integrating EGFR exon 19 deletion and serum ferritin was developed to predict mediastinal lymph node metastasis (LNM) in non-small cell lung cancer (NSCLC). Key predictors included ferritin, carbohydrate antigen 125 (CA125), carcinoembryonic antigen (CEA), carbohydrate antigen 19-9 (CA199), and EGFR exon 19 deletion. The model achieved high accuracy [training area under the receiver operating characteristic curve (AUC) 0.846; validation AUC 0.828], with decision curve analysis confirming clinical utility across risk thresholds.

What is known and what is new?

• Traditional LNM prediction relies on imaging and markers (CEA/CA125); EGFR exon 19 deletion correlates with prognosis but was unused in metastasis prediction.

• This study represents the first integration of EGFR exon 19 deletion and ferritin into a predictive tool, offering precise preoperative risk stratification.

What is the implication, and what should change now?

• The nomogram offers a non-invasive tool to identify high-risk patients, reducing unnecessary invasive staging. Incorporating EGFR testing and ferritin measurement into preoperative workflows is critical. Clinicians should consider adopting this model to guide personalized treatment and refine staging protocols. Prospective validation in diverse cohorts and inclusion of additional biomarkers (e.g., PD-L1) are recommended to broaden applicability. Guidelines should prioritize integrating molecular and biochemical profiling for NSCLC metastasis risk assessment.


Introduction

Lung cancer remains a leading global malignancy with persistently high incidence and mortality rates, emphasizing the imperative for early detection and accurate staging (1). Preoperative lymph node staging serves as a cornerstone of clinical decision-making, directly informing therapeutic strategies and prognostic evaluations (2). Mediastinal lymph node metastasis (LNM), which is of particular clinical significance, typically signifies locally advanced disease, highlighting the necessity for precise metastatic status assessment to guide personalized management (3). Recent advances in multimodal diagnostic approaches, including high-resolution imaging and liquid biopsy biomarkers, have intensified efforts to improve predictive accuracy through integrative data analysis (4).

As a well-established tumor biomarker, serum carcinoembryonic antigen (CEA) demonstrates diagnostic utility in lung cancer through its high specificity for epithelial malignancies and correlation with tumor burden (5,6). Notably, elevated CEA levels exhibit prognostic relevance in LNM across multiple cancers, including colorectal malignancies (7,8). Carbohydrate antigen 125 (CA125), traditionally associated with ovarian cancer diagnostics (9), has gained recognition in thoracic oncology, with elevated preoperative levels correlating with lymph node involvement in advanced non-small cell lung cancer (NSCLC) (10,11). Similarly, carbohydrate antigen 19-9 (CA199), primarily utilized in pancreaticobiliary malignancies (12), shows emerging potential in lung cancer progression monitoring, particularly in metastatic dissemination (13,14). Therapeutically, EGFR exon 19 deletion mutations (EGFR exon 19del) constitute a critical determinant of tyrosine kinase inhibitor responsiveness and survival outcomes (15,16), though their predictive value for nodal metastasis remains underexplored. Ferritin, a multifaceted acute-phase reactant, may reflect systemic inflammatory responses and tumor microenvironment (TME) dynamics in NSCLC (17), with preliminary evidence suggesting lymph node metastatic associations (18).

Existing literature has extensively characterized clinicopathological predictors of mediastinal metastasis, including primary tumor dimensions, histologic subtypes, and tumor, node, metastasis (TNM) staging parameters (19,20). However, current predictive models suffer from critical limitations: (I) overreliance on single-modality data (e.g., imaging or biomarkers), neglecting synergistic diagnostic potential; (II) inadequate validation of multifactorial algorithms in diverse clinical settings; (III) insufficient incorporation of serum proteomic profiles with radiomic features. To address these gaps, this study sought to develop and validate a novel composite model integrating clinicoradiological parameters [tumor size, computed tomography (CT) attenuation] with serological biomarkers (CEA, CA125, CA199, ferritin) and molecular profiling (EGFR exon 19del) to optimize preoperative mediastinal lymph node staging accuracy. By leveraging machine learning methodologies, our approach aimed to surpass conventional logistic regression models, providing clinicians with a robust decision-support tool for therapeutic stratification. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-701/rc).


Methods

Study population and grouping

The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. This study was approved by the Ethics Committee of The Fifth Affiliated Hospital of Dali University (Baoshan People’s Hospital, Approval No. L1-2023-KYKY-08). Owing to the retrospective nature of this observational study, written informed consent was waived. All patient data were anonymized to ensure privacy protection.

A retrospective analysis was performed on 619 patients with surgically resected, pathologically confirmed NSCLC at Baoshan People’s Hospital between January 2018 and December 2023. The cohort included 260 patients with metastatic mediastinal lymph nodes and 359 without metastasis. Patients were randomly allocated to a training group (n=433) and a validation group (n=186) in a 7:3 ratio. Inclusion criteria were: (I) single pulmonary lesion; (II) lobectomy or partial resection with mediastinal lymph node dissection/sampling; (III) complete pathological data; (IV) no neoadjuvant chemotherapy/radiotherapy; (V) histologically confirmed NSCLC. Exclusion criteria were: (I) major organ dysfunction affecting surgical tolerance; (II) concurrent malignancies; (III) receipt of neoadjuvant therapy; (IV) incomplete pathological data. The sample size was determined based on the 10-event-per-variable (10EPV) principle to ensure ≥10 events per predictor, enhancing model robustness. Because the proportion of missing data was low (approximately 2.5%), we handled missing values by listwise deletion.
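The events-per-variable rule mentioned above can be checked with simple arithmetic. The sketch below is illustrative only: it uses the event counts reported in this article (260 LNM-positive patients overall, 187 in the training group) and the 13 candidate predictors listed in the Methods. The function names are hypothetical.

```python
# Events-per-variable (EPV) check using counts reported in the article.
# "Events" are the less frequent outcome class, here LNM-positive patients.

def max_predictors(n_events: int, epv: int = 10) -> int:
    """Largest number of candidate predictors permitted under an EPV rule."""
    return n_events // epv


def events_per_variable(n_events: int, n_predictors: int) -> float:
    """Observed number of events available per candidate predictor."""
    return n_events / n_predictors


TOTAL_EVENTS = 260     # LNM-positive patients in the full cohort
TRAINING_EVENTS = 187  # LNM-positive patients in the training group
N_CANDIDATES = 13      # candidate predictors listed in the Methods

print(max_predictors(TOTAL_EVENTS))                        # predictors allowed at 10 EPV
print(round(events_per_variable(TRAINING_EVENTS, N_CANDIDATES), 1))
```

Under these counts the training group alone supplies more than 10 events for each of the 13 candidates, so the stated criterion is met before any variable selection.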

Clinicopathological characteristics

The study workflow is illustrated in Figure 1. Preoperative clinical parameters included: sex, age, smoking status, tumor diameter, lymph node size, tumor location, radiographic tumor density, lymph node CT attenuation (Hounsfield units, HU), serum CEA, CA125, CA199, EGFR exon 19del mutations, and ferritin levels. CT scans were performed within 30 days preoperatively to assess maximum tumor dimensions, density, and HU values. Imaging interpretations were independently conducted by two thoracic surgeons and one radiologist; discrepancies were resolved through consensus.

Figure 1 The flowchart of this study. LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic.

Variable definitions

Optimal cutoffs for continuous variables were determined via receiver operating characteristic (ROC) curves: CEA: 3.96 ng/mL [area under the ROC curve (AUC) =0.867, 95% confidence interval (CI): 0.839–0.894]; CA125: 17.23 U/mL (AUC =0.850, 95% CI: 0.821–0.879); CA199: 14.37 U/mL (AUC =0.874, 95% CI: 0.847–0.901); tumor size: 2.75 cm (AUC =0.867, 95% CI: 0.839–0.894); lymph node size: 1.15 cm (AUC =0.848, 95% CI: 0.818–0.878); lymph node HU: 41.5 (AUC =0.825, 95% CI: 0.794–0.857); ferritin: 286 ng/mL (AUC =0.849, 95% CI: 0.816–0.882). Age was dichotomized as <55 vs. ≥55 years. Tumor density was classified as solid or ground-glass opacity. T-stage followed the 8th Edition TNM classification (revised to reflect current guidelines). EGFR exon 19del mutations were detected via high-sensitivity polymerase chain reaction (PCR).
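The article does not state which criterion defined the "optimal" ROC cutoff. A common choice is the Youden index (sensitivity + specificity − 1), and the sketch below assumes that criterion. The marker values are hypothetical and are not study data.

```python
# ROC-based cutoff selection, ASSUMING the Youden index as the criterion
# (the article does not specify the criterion used).

def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is classified positive when its value is >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 0)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical marker values; 1 = LNM-positive, 0 = LNM-negative.
values = [0.9, 1.2, 2.0, 2.8, 3.1, 3.5, 4.2, 4.8, 5.5, 6.3, 7.9]
labels = [0,   0,   0,   0,   0,   1,   0,   1,   1,   1,   1]

cutoff, j = youden_cutoff(values, labels)
print(cutoff, round(j, 3))
```

Dichotomizing at a data-derived cutoff simplifies the nomogram but discards information, and the cutoff itself may not transfer to other cohorts.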

Surgical operation

For tumors <8 mm, pneumonectomy was considered if malignancy risk was high during follow-up or upon patient request. Resection modalities included lobectomy and sublobar resection (wedge/segmentectomy). Mediastinal lymph node evaluation comprised systematic dissection or sampling: right-sided tumors: stations 2R, 4R, 3A, 3P, 7, 8, 9, 10; left-sided tumors: stations 4L, 5, 6, 7, 8, 9, 10.

Statistical analysis

Analyses were performed using R (v4.2.1). Categorical variables are expressed as frequencies (%). Baseline comparisons used the compareGroups package. Least absolute shrinkage and selection operator (LASSO) regression was performed with glmnet, and multivariate logistic regression used stepwise backward elimination based on the Akaike Information Criterion (AIC), with α=0.2 for inclusion. The XGBoost model was implemented with the xgboost package. Discrimination was assessed using ROC curves (ggROC), calibration using rms::val.prob and the Hosmer-Lemeshow test (ResourceSelection), and clinical utility using decision curve analysis (DCA) with dcurves. Nomograms were generated using rms. Statistical significance was defined as two-tailed P<0.05.


Results

Patient characteristics

A total of 619 patients were included in the study, with 284 males and 335 females. Histopathological examination revealed that 260 patients (42.0%) had positive lymph nodes (pN+), while 359 patients (58.0%) had negative lymph nodes (pN0). The patients were randomly assigned to the training group (433 patients) and the validation group (186 patients). The clinical and demographic characteristics of both groups are shown in Table 1. In the training group, 187 patients (43.19%) had positive lymph nodes, and in the validation group, 73 patients (39.25%) had positive lymph nodes. There were no significant differences between the groups regarding gender, age, smoking history, tumor diameter, lymph node size, tumor location, tumor imaging density, lymph node CT value (HU), CEA, CA125, CA199, EGFR exon 19 deletion mutation, ferritin, and postoperative pathology (P>0.05), indicating that the baseline characteristics were balanced between the groups.

Table 1

Demographic and clinicopathologic characteristics of the training and validation groups

Characteristics Total (N=619) Training group (N=433) Validation group (N=186) P value
LNM 0.41
   Positive 260 (42.00) 187 (43.19) 73 (39.25)
   Negative 359 (58.00) 246 (56.81) 113 (60.75)
Smoking history 0.43
   No smoking 326 (52.67) 233 (53.81) 93 (50.00)
   Smoking 293 (47.33) 200 (46.19) 93 (50.00)
Sex >0.99
   Female 335 (54.12) 234 (54.04) 101 (54.30)
   Male 284 (45.88) 199 (45.96) 85 (45.70)
Age (years) 0.14
   <55 306 (49.43) 223 (51.50) 83 (44.62)
   ≥55 313 (50.57) 210 (48.50) 103 (55.38)
Tumor size (cm) 0.91
   <2.75 340 (54.93) 239 (55.20) 101 (54.30)
   ≥2.75 279 (45.07) 194 (44.80) 85 (45.70)
LN size (cm) 0.95
   <1.15 339 (54.77) 238 (54.97) 101 (54.30)
   ≥1.15 280 (45.23) 195 (45.03) 85 (45.70)
Tumor location 0.39
   Peripheral 364 (58.80) 260 (60.05) 104 (55.91)
   Central 255 (41.20) 173 (39.95) 82 (44.09)
EGFR exon 19del 0.38
   NMG 331 (53.47) 237 (54.73) 94 (50.54)
   GM 288 (46.53) 196 (45.27) 92 (49.46)
Tumor imaging density 0.19
   Nonsolid 336 (54.28) 243 (56.12) 93 (50.00)
   Solid 283 (45.72) 190 (43.88) 93 (50.00)
CEA (ng/mL) 0.94
   <3.96 333 (53.80) 232 (53.58) 101 (54.30)
   ≥3.96 286 (46.20) 201 (46.42) 85 (45.70)
CA125 (U/mL) 0.10
   <17.23 277 (44.75) 184 (42.49) 93 (50.00)
   ≥17.23 342 (55.25) 249 (57.51) 93 (50.00)
CA199 (U/mL) 0.18
   <14.37 331 (53.47) 242 (55.89) 89 (47.85)
   ≥14.37 288 (46.53) 191 (44.11) 97 (52.15)
Ferritin (ng/mL) >0.99
   <286 292 (47.17) 204 (47.11) 88 (47.31)
   ≥286 327 (52.83) 229 (52.89) 98 (52.69)
LN CT value (HU) 0.24
   <41.5 350 (56.54) 252 (58.20) 98 (52.69)
   ≥41.5 269 (43.46) 181 (41.80) 88 (47.31)

Data are presented as n (%). CA125, carbohydrate antigen 125; CA199, carbohydrate antigen 19-9; CEA, carcinoembryonic antigen; CT, computed tomography; GM, gene mutation; HU, Hounsfield units; LN, lymph node; LNM, lymph node metastasis; NMG, nonmutated gene.
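The article does not name the test behind the P values in Table 1. As a plausibility check, the sketch below assumes a Pearson chi-square test with Yates continuity correction and applies it to the LNM row (training: 187 positive, 246 negative; validation: 73 positive, 113 negative). The function name is hypothetical.

```python
import math

# Baseline balance check for one 2x2 row of Table 1, ASSUMING a Pearson
# chi-square test with Yates continuity correction (not stated in the article).

def chi2_yates_2x2(a, b, c, d):
    """Chi-square statistic and P value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    num = n * max(abs(a * d - b * c) - n / 2, 0.0) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    stat = num / den
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


# Rows: training, validation. Columns: LNM-positive, LNM-negative.
stat, p = chi2_yates_2x2(187, 246, 73, 113)
print(round(stat, 3), round(p, 2))
```

Under this assumption the computed P value rounds to the 0.41 reported for the LNM row.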

Variable selection for model construction

A two-step method was used for clinical feature selection. First, LASSO regression was employed to reduce 13 candidate variables to 8, as shown in Figure 2. These variables were then subjected to logistic regression analysis to identify risk factors associated with LNM. The results of univariate and multivariate logistic regression analyses of the selected variables are presented in Table 2. In the univariate analysis, variables with P values less than 0.2 were considered potential predictors, and then stepwise regression using AIC was applied to finalize the predictive factors. The seven factors selected were: ferritin [odds ratio (OR): 2.847, 95% CI: 1.755–4.685, P<0.001], CA125 (OR: 3.127, 95% CI: 1.91–5.193, P<0.001), CEA (OR: 1.743, 95% CI: 1.07–2.838, P=0.03), CA199 (OR: 2.188, 95% CI: 1.336–3.591, P=0.002), EGFR exon 19 deletion (OR: 1.687, 95% CI: 1.039–2.757, P=0.04), tumor location (OR: 1.941, 95% CI: 1.192–3.171, P=0.008), and tumor size (OR: 5.570, 95% CI: 3.432–9.168, P<0.001). These variables were identified through LASSO regression.

Figure 2 LASSO regression analysis for variable selection and parameter optimization. (A) Variable reduction and selection were performed via LASSO regression, where an increase in the log lambda value resulted in a decrease in the number of independent variables. (B) Further refinement of variable selection was achieved via 10-fold cross-validation with LASSO regression, with the optimal parameters determined by the lambda value corresponding to 1SE. Abbreviations: LASSO, least absolute shrinkage and selection operator; 1SE, 1 standard error.
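The 1SE rule in panel B can be stated compactly: among all penalty values whose mean cross-validated error lies within one standard error of the minimum, choose the largest, which gives the sparsest model. The sketch below illustrates the rule on hypothetical cross-validation results. It does not reproduce the glmnet fit used in the study.

```python
# The "1SE" rule for choosing the LASSO penalty, on HYPOTHETICAL CV results.

def lambda_1se(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se).

    lambda_min minimizes the mean CV error; lambda_1se is the largest lambda
    whose mean CV error is within one standard error of that minimum.
    """
    i_min = min(range(len(cv_mean)), key=cv_mean.__getitem__)
    threshold = cv_mean[i_min] + cv_se[i_min]
    eligible = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return lambdas[i_min], max(eligible)


# Hypothetical 10-fold CV summary (penalty, mean deviance, standard error).
lambdas = [0.100, 0.050, 0.020, 0.010, 0.005]
cv_mean = [1.30, 1.12, 1.05, 1.03, 1.04]
cv_se = [0.05, 0.04, 0.04, 0.04, 0.04]

lam_min, lam_1se = lambda_1se(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

A larger penalty shrinks more coefficients to zero, which is why the 1SE choice retains fewer variables than the error-minimizing penalty.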

Table 2

Logistic regression analysis of the risk factors for lymph node metastasis

Factors Univariate OR (95% CI) Univariate P value Multivariate OR (95% CI) Multivariate P value
LN CT value 1.16 (0.789–1.705) 0.45
Ferritin 2.538 (1.718–3.777) <0.001 2.847 (1.755–4.685) <0.001
CA125 3.425 (2.282–5.201) <0.001 3.127 (1.91–5.193) <0.001
CEA 3.095 (2.091–4.613) <0.001 1.743 (1.07–2.838) 0.03
CA199 3.725 (2.504–5.588) <0.001 2.188 (1.336–3.591) 0.002
Imaging density 2.241 (1.522–3.315) <0.001 1.382 (0.836–2.279) 0.20
EGFR exon 19del 1.6 (1.091–2.352) 0.02 1.687 (1.039–2.757) 0.04
Tumor location 2.511 (1.695–3.738) <0.001 1.941 (1.192–3.171) 0.008
LN size 1.628 (1.11–2.393) 0.01 1.067 (0.646–1.752) 0.80
Tumor size 8.228 (5.371–12.79) <0.001 5.57 (3.432–9.168) <0.001
Age 0.777 (0.53–1.137) 0.19
Sex 0.894 (0.61–1.31) 0.56
Smoking history 1.555 (1.061–2.283) 0.02 1.162 (0.712–1.890) 0.55

CA125, carbohydrate antigen 125; CA199, carbohydrate antigen 19-9; CEA, carcinoembryonic antigen; CI, confidence interval; CT, computed tomography; LN, lymph node; OR, odds ratio.

Nomogram for predicting LNM

A nomogram for predicting the probability of LNM in NSCLC was constructed based on the seven risk factors selected from the training group, as shown in Figure 3. The nomogram combines multiple clinical factors, and the score for each risk factor can be determined by its position on the vertical axis. The total score is obtained by summing the scores for each factor, and the probability of LNM can be estimated from the total score.

Figure 3 Nomogram for predicting the probability of mediastinal lymph node metastasis in lung cancer patients on the basis of all independent risk factors. CA125, carbohydrate antigen 125; CA199, carbohydrate antigen 19-9; CEA, carcinoembryonic antigen; NMG, nonmutated gene.
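The point scale of a nomogram built from binary predictors can be approximated from the reported odds ratios. The sketch below assumes the usual convention that the predictor with the largest regression coefficient spans 0 to 100 points and the others are scaled proportionally. The resulting values are approximations and are not read from Figure 3. Because the model intercept is not reported, total points cannot be converted into a predicted probability here.

```python
import math

# Approximate nomogram points from the multivariate odds ratios in Table 2,
# ASSUMING points proportional to log(OR) with the largest effect set to 100.

odds_ratios = {
    "Tumor size": 5.570,
    "CA125": 3.127,
    "Ferritin": 2.847,
    "CA199": 2.188,
    "Tumor location": 1.941,
    "CEA": 1.743,
    "EGFR exon 19del": 1.687,
}

# Logistic regression coefficients are the natural logs of the odds ratios.
betas = {name: math.log(value) for name, value in odds_ratios.items()}
max_beta = max(betas.values())
points = {name: 100 * (beta / max_beta) for name, beta in betas.items()}

for name, pts in sorted(points.items(), key=lambda item: -item[1]):
    print(f"{name:16s} {pts:5.1f}")

# Total for a hypothetical patient positive for every risk factor.
print(round(sum(points.values()), 1))
```

Read this way, tumor size ≥2.75 cm contributes the most points, followed by CA125 and ferritin, which mirrors the ordering of the odds ratios.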

Predictive model validation

Discrimination

The AUC for the training group was 0.846 (95% CI: 0.812–0.883), as shown in Figure 4A. After 1,000 bootstrap resamplings, the AUC for the training group was 0.840 (95% CI: 0.803–0.877) (Figure 4B). The AUC for the validation group was 0.828 (95% CI: 0.769–0.886), as shown in Figure 4C. After 1,000 bootstrap resamplings, the AUC for the validation group was 0.859 (95% CI: 0.808–0.912) (Figure 4D). These results indicate that the model has good discrimination ability.

Figure 4 ROC curve analysis of LNM prediction in the training and validation groups. (A) ROC curve analysis of LNM in the prediction training group. (B) Validation via bootstrap resampling (1,000 iterations) and ROC curve analysis of LNM in the prediction training group. (C) ROC curve analysis of LNM in the prediction validation group. (D) Validation via bootstrap resampling (1,000 iterations) and ROC curve analysis of LNM in the prediction validation group. The blue shaded areas in (A,C) depict the 95 % confidence intervals (95 % CI) around the corresponding curves. AUC, area under the curve; CI, confidence interval; LNM, lymph node metastasis; ROC, receiver operating characteristic.
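For readers less familiar with these quantities, the AUC equals the probability that a randomly chosen LNM-positive patient receives a higher predicted risk than a randomly chosen LNM-negative patient. The sketch below computes it that way and adds a percentile bootstrap interval. The predicted risks are hypothetical, and the percentile method is an assumption, since the article does not describe how its bootstrap intervals were formed.

```python
import random

# AUC as a rank statistic plus a percentile bootstrap CI, on HYPOTHETICAL data.

def auc(scores, labels):
    """Probability that a positive outranks a negative (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # a resample must contain both classes
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted risks; 1 = LNM-positive, 0 = LNM-negative.
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.65, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

estimate = auc(scores, labels)
lower, upper = bootstrap_auc_ci(scores, labels)
print(round(estimate, 2), lower <= estimate <= upper)
```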

Calibration of the predictive model

The calibration results of the model are presented in Figure 5. The calibration curve of the training group, shown in Figure 5A, was close to the reference line (y = x), indicating good agreement between predicted probabilities and observed outcomes. The c-statistic was 0.85 (95% CI: 0.81–0.88), and the Hosmer-Lemeshow test showed that χ2=7.402, df=8, P=0.49. Figure 5B illustrates the calibration curve of the training group after 1,000 bootstrap resamplings, which continued to demonstrate good calibration. For the validation group, the calibration curve depicted in Figure 5C was also close to the reference line (y = x). The c-statistic was 0.83 (95% CI: 0.76–0.88), and the Hosmer-Lemeshow test showed that χ2=7.112, df=8, P=0.52. Finally, Figure 5D confirms the model’s calibration performance by showing the calibration curve of the validation group after 1,000 bootstrap resamplings.

Figure 5 Calibration analysis of the nomogram model for LNM prediction in the training and validation groups. (A) Calibration curves for the nomogram model in the training group. (B) Validation via bootstrap resampling (1,000 iterations) and a calibration curve of the training group nomogram model. (C) Calibration curves for the nomogram model in the validation group. (D) Validation via bootstrap resampling (1,000 iterations) and calibration curve of the validation group nomogram model. B, number of bootstrap repetitions; boot, bootstrap; CI, confidence interval; LNM, lymph node metastasis.
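The Hosmer-Lemeshow P values follow directly from the reported statistics, because the chi-square survival function has a closed form when the degrees of freedom are even. The sketch below recomputes both P values from the χ2 values and df=8 given above. The function name is hypothetical.

```python
import math

# Recompute Hosmer-Lemeshow P values from the reported chi-square statistics.

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive even df.

    For df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


p_training = chi2_sf_even_df(7.402, 8)    # training group statistic
p_validation = chi2_sf_even_df(7.112, 8)  # validation group statistic
print(round(p_training, 2), round(p_validation, 2))
```

Both recomputed values agree with the reported P=0.49 and P=0.52. A nonsignificant result indicates no detected lack of fit rather than proof of good calibration.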

Clinical decision-making and rationality analysis

The DCA showed that the nomogram provided significant net benefit for predicting LNM in patients within a threshold risk range of 2–96% (Figure 6A). In the validation group, significant net benefit was also observed within the 4–85% risk threshold range (Figure 6B). In both the training and validation groups, the nomogram performed better than other clinical indicators, providing superior discrimination (Figure 7).

Figure 6 Decision curve analysis for evaluating the clinical utility of the nomogram in predicting LNM. (A) Decision curve analysis of the training cohort via the nomogram for detecting LNM. (B) Decision curve analysis of the validation group via the nomogram for detecting LNM. LNM, lymph node metastasis; Nomo model, nomogram model.

Figure 7 Comparative analysis of the nomogram model against other clinical indicators in predicting LNM. Comparison of ROC curves for the nomogram model with other clinical indicators in the training group (A) and validation group (C). Comparison of clinical decision curve analysis results for the nomogram model and other clinical indicators in the training group (B) and validation group (D). CA125, carbohydrate antigen 125; CA199, carbohydrate antigen 19-9; LNM, lymph node metastasis; ROC, receiver operating characteristic.
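Decision curve analysis rests on the net benefit, defined at a threshold probability pt as TP/n − (FP/n) × pt/(1 − pt), and compares the model against the "treat all" and "treat none" strategies. The sketch below implements that standard definition on hypothetical predictions. Only the 42% prevalence used for the treat-all reference comes from this study.

```python
# Net benefit for decision curve analysis, on HYPOTHETICAL predictions.

def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of treating everyone, given the outcome prevalence."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks; 1 = LNM-positive, 0 = LNM-negative.
probs = [0.05, 0.10, 0.15, 0.30, 0.45, 0.55, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

for pt in (0.10, 0.20, 0.50):
    model_nb = net_benefit(probs, labels, pt)
    all_nb = net_benefit_treat_all(0.5, pt)  # prevalence of the toy data
    print(pt, round(model_nb, 3), round(all_nb, 3))

# Treat-all reference at a 10% threshold with the study prevalence of 42%.
print(round(net_benefit_treat_all(0.42, 0.10), 4))
```

The model is useful at a given threshold only where its net benefit exceeds both the treat-all value and zero (treat none).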

Discussion

In this study, we developed and validated a novel predictive model for mediastinal LNM in patients with NSCLC, integrating molecular genetic markers and serum biomarkers to enhance preoperative assessment accuracy. Using LASSO regression followed by multivariable logistic regression, we identified EGFR exon 19 deletion mutations and elevated serum ferritin levels as independent predictors of LNM. These predictors were incorporated into a clinically practical nomogram, representing the first predictive framework to synergize molecular alterations with iron metabolism indicators for LNM risk stratification in NSCLC.

The prognostic significance of LNM is underscored by its association with tumor aggressiveness, necessitating refined risk assessment tools. While a prior study utilized wild-type EGFR and TP53 status for LNM prediction in early-stage lung adenocarcinoma (AUC: 0.819 training cohort, 0.780 validation cohort) (18), our model specifically demonstrated the predictive superiority of EGFR exon 19 deletion mutations (AUC: 0.846 training, 0.828 validation). Furthermore, although serum tumor markers such as CEA, CA125, and CA199 are established prognostic indicators, our work innovatively highlights serum ferritin—a biomarker of dysregulated iron metabolism—as a critical predictor. This aligns with Li et al.’s findings on CEA’s predictive utility (21), while further confirming in clinical settings ferritin’s role in promoting metastasis by modulating tumor microenvironments (22). Emerging evidence underscores ferritin’s central role in lung cancer metastasis through iron-TME axis dysregulation. As the primary iron storage protein, ferritin disrupts iron homeostasis to activate reactive oxygen species (ROS)-dependent metastasis pathways (23), with clinical correlations showing its elevation exacerbates KRAS-mutant tumor progression via iron-catalyzed lipid peroxidation (24). Metabolic reprogramming studies reveal ferritin’s dual action in enhancing glutamate flux and GPX4-mediated antioxidant defenses to empower metastatic spread. Microenvironmentally, ferritin activates MSR+ macrophages/neutrophils, driving CD163+ M2-TAM polarization (SOCS3-mediated immunosuppression) (25,26) and neutrophil extracellular traps (NETs)-induced inflammatory priming for lung/brain colonization (27). While these findings position ferritin as a multifunctional metastasis regulator, its stage-specific mechanisms and metabolic-immunologic synergies demand systematic exploration to enable therapeutic translation.

Contrary to existing reports linking tumor density to LNM risk (28,29), our multivariate analysis revealed no significant independent association between solid/nonsolid nodule morphology and metastatic propensity (P=0.20; Table 2), potentially attributable to limited sample size. However, we corroborated diameter-dependent risk stratification, identifying nodules <2.75 cm as lower-risk, consistent with Wang et al.’s ≤2 cm threshold for occult LNM (30). The nomogram demonstrated robust discriminative performance (C-index: 0.85 training, 0.83 validation) and good calibration (Hosmer-Lemeshow P=0.49 training, P=0.52 validation). The DCA showed a net benefit across threshold ranges of 2–96% (training set) and 4–85% (validation set), wider than those of prior models with narrower applicability. Yet this wide range also points to certain limitations. On one hand, while the model captures information on patients at different risk levels, the breadth of the range may make it difficult for clinicians to identify the patient group that benefits most in practice: low thresholds may produce false positives, and high thresholds may overlook intermediate-risk patients. On the other hand, heterogeneity arising from the sample size and data diversity may affect the model’s stability at extreme thresholds, and too many variables and complex modeling techniques can increase the model’s complexity and make it harder to interpret. Clinical experience suggests that the threshold risk for mediastinoscopy is usually within 7–20%, with 8–10% being a critical value supported by multiple studies. Invasive procedures such as mediastinoscopy or endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA) are typically considered within this range. We have therefore narrowed the model’s applicable scope to this clinically relevant interval to enhance its clinical applicability and operability.

Study limitations include potential selection bias from retrospective single-center data and variability in external validation performance. Future work requires prospective multicenter trials to assess real-world generalizability, alongside integration of radiomic features and emerging molecular markers (e.g., PD-L1, KRAS) to refine predictive precision. Another limitation is the omission of tumor histological subtypes and differentiation grades from the predictive model. Future prospective studies should combine histological and pathological features with serum biomarkers to improve predictive accuracy and comprehensiveness. Further investigations should evaluate the model’s utility in predicting survival outcomes and treatment responses. Collectively, this work advances LNM risk assessment in NSCLC through biomarker synthesis, providing a foundation for personalized therapeutic strategies while highlighting pathways for translational optimization.


Conclusions

We established the first clinicomolecular nomogram combining EGFR exon 19del mutations and serum ferritin for preoperative LNM prediction in NSCLC. The model achieved AUC >0.82 across cohorts, providing biologically informed risk stratification superior to conventional methods. With external validation, this tool could optimize surgical planning and neoadjuvant therapy selection, particularly in settings lacking advanced imaging.


Acknowledgments

We appreciate the help from other teammates.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-701/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-701/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-701/prf

Funding: This work was supported by the Baoshan Science and Technology Plan Medical Research Joint Special Project (2023bskjylms001 to J.M., 2023bskjylms007 to T.L.); and the Key Research Project Fund of Baoshan People’s Hospital (Bsy2025-ky002 to B.Z.).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-701/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of The Fifth Affiliated Hospital of Dali University (Baoshan People’s Hospital, Approval No. L1-2023-KYKY-08). Owing to the retrospective nature of this observational study, written informed consent was waived.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Luo G, Zhang Y, Rumgay H, et al. Estimated worldwide variation and trends in incidence of lung cancer by histological subtype in 2022 and over time: a population-based study. Lancet Respir Med 2025;13:348-63.
  2. Hendriks LEL, Remon J, Faivre-Finn C, et al. Non-small-cell lung cancer. Nat Rev Dis Primers 2024;10:71.
  3. Naxerova K. Evolutionary paths towards metastasis. Nat Rev Cancer 2025;25:545-60.
  4. Akaike T, Thakuria M, Silk AW, et al. Circulating Tumor DNA Assay Detects Merkel Cell Carcinoma Recurrence, Disease Progression, and Minimal Residual Disease: Surveillance and Prognostic Implications. J Clin Oncol 2024;42:3151-61.
  5. Tang F, Huang CW, Tang ZH, et al. Prognostic role of serum carcinoembryonic antigen in patients receiving liver resection for colorectal cancer liver metastasis: A meta-analysis. World J Gastrointest Surg 2023;15:2890-906.
  6. Tamura M, Furukawa N, Sakai T, et al. Prognostic significance of CEA reduction rate in patients with abnormally high preoperative CEA levels who underwent surgery for lung cancer. J Cardiothorac Surg 2024;19:662.
  7. Qiao G, Li X, Mohamed M, et al. Efficacy of Neoadjuvant Therapy and the Prognostic Significance of Serum Carcinoembryonic Antigen Level in Patients with Localized Pancreatic Adenocarcinoma with Non-elevated Carbohydrate Antigen 19-9 Levels. Ann Surg 2025; Epub ahead of print.
  8. Lin J, Wu Y, Lin Z, et al. Mid-level data fusion strategy based on urinary nucleosides SERS spectra and blood CEA levels for enhanced preoperative detection of lymph node metastasis in colorectal cancer. Anal Chim Acta 2024;1332:343360.
  9. Jin W, Chen R, Wu L, et al. An "on-off" electrochemical immunosensor for the detection of the glycan antigen CA125 by amplification signals using electropositive COFs. Talanta 2025;286:127593.
  10. Yang M, Wang L, Xie C, et al. A disposable ultrasensitive immunosensor based on MXene/NH(2)-CNT modified screen-printed electrode for the detection of ovarian cancer antigen CA125. Talanta 2025;281:126893.
  11. Yao F, Xu M, Zhu T, et al. A rare case of non-small cell lung cancer with progressive elevation of CA199 as its first manifestation. Asian J Surg 2024;47:3278-9.
  12. Lopes M, Figueiredo V, Mendes A, et al. Lung Adenocarcinoma Presenting With Elevated Serum Carbohydrate Antigen 19-9 Levels: A Case Report. Cureus 2025;17:e77786.
  13. Garnier J, Marchetti A, Campbell B, et al. Carbohydrate Antigen 19-9 Delta Function for Survival Prediction in Borderline Pancreatic Cancer. A PANC-PALS Consortium International Multicenter Derivation and Validation Study. Ann Surg 2025; Epub ahead of print.
  14. Xie F, Xu L, Mu Y, et al. Diagnostic Value of Seven Autoantibodies Combined with CEA and CA199 in Non-Small Cell Lung Cancer. Clin Lab 2023;
  15. Kim J, Park S, Ku BM, et al. Updates on the treatment of epidermal growth factor receptor-mutant non-small cell lung cancer. Cancer 2025;131:e35778.
  16. Tavernari D, Borgeaud M, Liu X, et al. Decoding the Clinical and Molecular Signatures of EGFR Common, Compound, and Uncommon Mutations in NSCLC: A Brief Report. J Thorac Oncol 2025;20:500-6.
  17. Guo D, Cai S, Deng L, et al. Ferroptosis in Pulmonary Disease and Lung Cancer: Molecular Mechanisms, Crosstalk Regulation, and Therapeutic Strategies. MedComm (2020) 2025;6:e70116.
  18. Guo W, Lu T, Song Y, et al. Lymph node metastasis in early invasive lung adenocarcinoma: Prediction model establishment and validation based on genomic profiling and clinicopathologic characteristics. Cancer Med 2024;13:e70039. [Crossref] [PubMed]
  19. Li Y, Liu F, Cai Q, et al. Invasion and metastasis in cancer: molecular insights and therapeutic targets. Signal Transduct Target Ther 2025;10:57. [Crossref] [PubMed]
  20. Shi X, Wang X, Yao W, et al. Mechanism insights and therapeutic intervention of tumor metastasis: latest developments and perspectives. Signal Transduct Target Ther 2024;9:192. [Crossref] [PubMed]
  21. Li F, Zhai S, Fu L, et al. Nomograms for intraoperative prediction of lymph node metastasis in clinical stage IA lung adenocarcinoma. Cancer Med 2023;12:14360-74. [Crossref] [PubMed]
  22. Lin H, Tison K, Du Y, et al. Itaconate transporter SLC13A3 impairs tumor immunity via endowing ferroptosis resistance. Cancer Cell 2024;42:2032-2044.e6. [Crossref] [PubMed]
  23. Gao G, Zhang X. Broadening horizons: research on ferroptosis in lung cancer and its potential therapeutic targets. Front Immunol 2025;16:1542844. [Crossref] [PubMed]
  24. Li Q, Song Q, Pei H, et al. Emerging mechanisms of ferroptosis and its implications in lung cancer. Chin Med J (Engl) 2024;137:818-29. [Crossref] [PubMed]
  25. Hu Z, Sui Q, Jin X, et al. IL6-STAT3-C/EBPbeta-IL6 positive feedback loop in tumor-associated macrophages promotes the EMT and metastasis of lung adenocarcinoma. J Exp Clin Cancer Res 2024;43:63. [Crossref] [PubMed]
  26. Li X, Yang Z, Chen B, et al. SOCS3 as a potential driver of lung metastasis in colon cancer patients. Front Immunol 2023;14:1088542. [Crossref] [PubMed]
  27. Zhang H, Wu D, Wang Y, et al. Ferritin-mediated neutrophil extracellular traps formation and cytokine storm via macrophage scavenger receptor in sepsis-associated lung injury. Cell Commun Signal 2024;22:97. [Crossref] [PubMed]
  28. Nia HT, Munn LL, Jain RK. Probing the physical hallmarks of cancer. Nat Methods 2025; Epub ahead of print. [Crossref]
  29. Linke JA, Munn LL, Jain RK. Compressive stresses in cancer: characterization and implications for tumour progression and treatment. Nat Rev Cancer 2024;24:768-91. [Crossref] [PubMed]
  30. Wang Z, Wu Y, Wang L, et al. Predicting occult lymph node metastasis by nomogram in patients with lung adenocarcinoma ≤2 cm. Future Oncol 2021;17:2005-13. [Crossref] [PubMed]
Cite this article as: Mei J, Zhang B, Zhu Y, Chen L, Yang D, Lin T, Gao F, Yin D, Li G. A novel nomogram for predicting mediastinal lymph node metastasis in non-small cell lung cancer: a retrospective analysis. J Thorac Dis 2025;17(8):5803-5815. doi: 10.21037/jtd-2025-701
