Development of machine learning-based prognostic models for small cell lung cancer with brain metastases: an analysis of SEER and Chinese populations
Original Article


Mingyuan Guo1,2#, Jun Zhu3#, Xiaoman Duan4#, Pengyue Ren2, Haitao Wang1, Yu Zhang1, Hailing Lu2, Yanbin Zhao1

1Department of Internal Medical Oncology, Harbin Medical University Cancer Hospital, Harbin, China; 2Department of Oncology, The First Affiliated Hospital of Harbin Medical University, Harbin, China; 3Department of Hepatobiliary and Pancreatic Oncology, Nanjing Tianyinshan Hospital, Nanjing, China; 4Department of Radiation, Bayannur Hospital, Bayannur, China

Contributions: (I) Conception and design: M Guo, J Zhu, Y Zhao, H Lu; (II) Administrative support: Y Zhang, H Wang; (III) Provision of study materials or patients: Y Zhao, J Zhu; (IV) Collection and assembly of data: M Guo, X Duan, J Zhu, H Wang, Y Zhang; (V) Data analysis and interpretation: M Guo, J Zhu, H Lu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Yanbin Zhao, MD, PhD. Department of Internal Medical Oncology, Harbin Medical University Cancer Hospital, 150 Haping Road, Harbin 150081, China. Email: zhaoyanbin@hrbmu.edu.cn; Hailing Lu, MD, PhD. Department of Oncology, The First Affiliated Hospital of Harbin Medical University, 23 Youzheng Street, Harbin 150001, China. Email: luhailing2004@163.com; Yu Zhang, MS, MD Candidate. Department of Internal Medical Oncology, Harbin Medical University Cancer Hospital, 150 Haping Road, Harbin 150081, China. Email: zhangyu1539@163.com.

Background: Accurate survival prediction is critical for optimizing clinical management in small cell lung cancer (SCLC) patients with initial brain metastasis (BM). This study aims to develop a machine learning (ML)-based prognostic model to improve survival prediction, thereby enhancing clinical decision-making and treatment outcomes.

Methods: Data on SCLC patients with initial BM were extracted from the Surveillance, Epidemiology, and End Results (SEER) database for the period from 2010 to 2020 and split into training (70%) and testing (30%) sets. We used the COX proportional hazards regression model to select modeling features in the training set. Four ML methods, including least absolute shrinkage and selection operator (LASSO), random forest (RF), eXtreme Gradient Boosting (XGBoost), and gradient boosting machine (GBM), were used to establish prognostic models, which were validated using both the SEER testing set and patients from Harbin Medical University Cancer Hospital. Model performance was evaluated using area under the curve (AUC) values, accuracy, and F1 scores. Furthermore, we ranked feature importance using SHapley Additive exPlanations (SHAP) at various time points.

Results: A total of 4,227 patients were enrolled from the SEER database, with 2,958 cases allocated to the training set and 1,269 to the testing set. Based on the results of univariate and multivariate COX regression analyses, we identified age, sex, marital status, primary tumor size, N stage, bone metastasis, liver metastasis, lung metastasis, months from diagnosis to therapy, surgery, radiotherapy, and chemotherapy as model features. The LASSO model outperformed the other models, with AUCs of 0.771, 0.724, 0.753, and 0.718 at 6 months, 1 year, 2 years, and 3 years in the testing set, and 0.801, 0.763, 0.838, and 0.900 in the external validation set. Additionally, feature importance analysis consistently identified liver metastasis (highest rank), surgery, radiotherapy, lung metastasis, and chemotherapy as key predictors across all time points.

Conclusions: The LASSO model demonstrated high accuracy in predicting survival for SCLC patients with initial BM, particularly in external validation. This model may provide valuable prognostic insights for personalized treatment strategies.

Keywords: Small cell lung cancer (SCLC); brain metastases (BMs); machine learning (ML); Surveillance, Epidemiology, and End Results database (SEER database); least absolute shrinkage and selection operator (LASSO)


Submitted May 12, 2025. Accepted for publication Aug 15, 2025. Published online Nov 21, 2025.

doi: 10.21037/jtd-2025-961


Highlight box

Key findings

• This study developed and validated machine learning-based prognostic models to predict the survival of small cell lung cancer (SCLC) patients with initial brain metastasis.

• Our feature importance analysis identified liver metastasis as the highest-ranked predictor of survival.

What is known and what is new?

• Existing studies have shown that various factors, including clinicopathological features and treatment modalities, influence the survival of SCLC patients with initial brain metastasis. The prognosis becomes challenging to assess, especially for young physicians.

• The least absolute shrinkage and selection operator model demonstrated high accuracy in predicting survival for SCLC patients with initial brain metastasis.

What is the implication, and what should change now?

• By identifying key survival predictors, the model can help clinicians prioritize interventions and treatment plans through more personalized and precise predictions, potentially improving patient outcomes.

• Future studies could further optimize the model by incorporating imaging data and other patient information.


Introduction

Lung cancer is the leading cause of cancer-related mortality worldwide (1,2). Small cell lung cancer (SCLC) is a type of pulmonary neuroendocrine carcinoma and is among the most aggressive pathological types of lung cancer, accounting for approximately 10% to 15% of all new lung cancer cases annually (3-5). Its prognosis is extremely poor, with a 5-year survival rate of approximately 12% and median overall survival (OS) of 12.3–13.0 months for extensive-stage SCLC (6). SCLC is highly invasive, with about 70% of patients presenting with distant metastasis at initial diagnosis, and about 10% of patients developing brain metastasis (BM) at their first visit (7,8). Accurate survival prediction for this population is crucial, not only to inform treatment decisions but also to benefit patients in terms of cost management. However, the prediction of individual survival time is influenced by multiple factors, including age, staging, treatment, and underlying diseases. Moreover, the prognosis becomes challenging to assess, especially for young physicians, due to the combined effects and adverse reactions of various treatment modalities such as chemotherapy, radiotherapy, and immunotherapy. Previous studies have shown that physicians often overestimate or inaccurately predict patient survival time, particularly for patients with a survival time of less than 12 months (9-11). Therefore, it is essential to explore more precise prognostic models for SCLC patients with initial BM.

Machine learning (ML), a branch of artificial intelligence, provides a principled approach focused on using mathematical algorithms to identify patterns in data for prediction, specifically for analyzing high-dimensional and multimodal biomedical data (12). Common ML models include least absolute shrinkage and selection operator (LASSO), random forest (RF), eXtreme Gradient Boosting (XGBoost), gradient boosting machine (GBM), and support vector machine (SVM), among others. These models currently hold significant clinical value in the screening, diagnosis, and prognosis of cancer. Studies have applied them to the diagnosis and prognosis prediction of various diseases, including breast cancer, non-small cell lung cancer, and thyroid cancer (13-16). However, there have been no studies using ML methods for the prognostic assessment of patients with BM from SCLC.

The application of ML relies on large-scale data for training and validation, which is difficult to obtain from single-center data. The Surveillance, Epidemiology, and End Results (SEER) database, managed by the National Cancer Institute of the National Institutes of Health, is the largest publicly available cancer dataset. This database collects data from 22 population-based cancer registries across the United States, covering approximately 48% of the U.S. population (17). It provides information on cancer patients’ general conditions, clinical characteristics, treatments, and survival, offering robust data support for the training and validation of ML models. Several studies have already conducted analyses on survival and diagnosis based on data from the SEER database (18-20). Therefore, the establishment and testing of our study model are based on data from the SEER database. Additionally, to assess the model’s effectiveness in the local region, we collected local data for external validation.

This study, based on data from the SEER database and our hospital, has established and validated ML models using LASSO, RF, XGBoost, and GBM. The model with the best performance was selected as the prognostic model for this study, aiming to provide more accurate prognostic information and treatment decision recommendations for SCLC patients with BM at initial diagnosis. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-961/rc).


Methods

Study design

The workflow of our study design and analysis is shown in Figure 1.

Figure 1 Diagram of the study procedure. AUC, area under the curve; GBM, gradient boosting machine; LASSO, least absolute shrinkage and selection operator; RF, random forest; ROC, receiver operating characteristic; SEER, Surveillance, Epidemiology, and End Results; XGBoost, eXtreme Gradient Boosting.

Data source and study population

Because information on distant metastasis sites has only been recorded since 2010, our study used the SEER*Stat software to retrieve the study population from the publicly accessible SEER database [SEER 17 Regs Research Data (2010–2020 changes); Version 8.4.2] (URL: https://seer.cancer.gov/). The final study population comprised 4,227 SCLC patients with BM from the SEER database. The inclusion criteria for patients in the SEER database were as follows: (I) SCLC diagnosis based on the International Classification of Diseases for Oncology, Third Edition (ICD-O-3) histological codes 8041-3, 8042-3, 8043-3, 8044-3, and 8045-3; (II) all SCLC patients had BM at the time of initial diagnosis. The exclusion criteria for the SEER database were as follows: (I) no positive histological diagnosis; (II) patients with two or more primary cancers; (III) incomplete follow-up information; (IV) unknown or abnormal tumor size.

Ninety-one patients diagnosed with SCLC and BM at Harbin Medical University Cancer Hospital from September 1, 2019 to August 31, 2021, were included for external validation. The follow-up of these patients was conducted until September 2023. The inclusion criteria for external validation patients were as follows: (I) histopathologically confirmed SCLC; (II) presence of BM in all SCLC patients at initial diagnosis. The exclusion criteria for external validation patients were as follows: (I) missing or unknown clinical and survival information; (II) presence of non-SCLC components; (III) patients with two or more primary cancers. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the ethics committee of Harbin Medical University Cancer Hospital (No. KY2024-08) and individual consent for this retrospective analysis was waived.

Feature selection and importance analysis

Based on the SEER database, this study extracted clinical and pathological features for each patient. The collected data included patient age, sex, race, marital status, income, primary site, laterality, grade, primary tumor size, N stage, bone metastasis, liver metastasis, lung metastasis, distant lymph node metastasis, other distant metastases, history of lung cancer surgery, radiotherapy, chemotherapy, and months from diagnosis to therapy. The population included in the SEER database was stratified by age into the following categories: 18–44, 45–59, 60–74, and 75 years and older. Patient income was stratified into the following categories: less than $35,000, $35,000–$54,999, $55,000–$74,999, and $75,000 and above. For patients in the external validation set, we collected their clinicopathological information, including gender, age, marital status, primary tumor size, N stage, bone metastasis, liver metastasis, lung metastasis, history of surgery, radiotherapy, and chemotherapy. The data collection personnel remained blinded to the patients’ OS and cancer-specific survival (CSS) outcomes. Patients with incomplete data were excluded during the screening process.

The outcomes examined in this study included both OS and CSS. OS was defined as the interval between diagnosis and either the occurrence of death or the date of the last follow-up. CSS was defined as the interval between diagnosis and the occurrence of death caused by SCLC. In the external validation set, the OS and CSS were collected by a staff member who was unaware of the patients’ clinical information.
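The binary survival status predicted at each time point can be derived from a patient's follow-up time and death indicator. The following is a minimal Python sketch for illustration only (the study itself used R). The function name `status_at_horizon` is hypothetical, and the handling of patients censored before the horizon (returned as undetermined) is an assumption, since the text does not state how such patients were treated.

```python
from typing import Optional


def status_at_horizon(time_months: float, died: bool,
                      horizon_months: float) -> Optional[int]:
    """Return 1 if death occurred by the horizon, 0 if the patient was
    still under follow-up at the horizon, None if status is undetermined."""
    if died and time_months <= horizon_months:
        return 1  # death observed on or before the horizon
    if time_months >= horizon_months:
        return 0  # followed at least up to the horizon without dying by then
    return None  # censored before the horizon (assumed to be excluded)


# Toy follow-up records (hypothetical): (months of follow-up, died?)
patients = [(3, True), (8, True), (5, False), (14, False)]
labels_6m = [status_at_horizon(t, d, 6) for t, d in patients]
```

Under this assumption, the third toy patient (censored at 5 months) contributes no label at the 6-month horizon.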

This study employed the COX proportional hazards regression model for feature selection. Initially, a univariate COX analysis was conducted on the aforementioned variables, and factors with P<0.05 were included in a multivariate COX analysis. Ultimately, independent prognostic factors with P<0.05 were selected as features for training the ML model. The survival package in R was used for this process. In the final optimal prognostic model, we used SHapley Additive exPlanations (SHAP) to rank the importance of the included model features at 6-month, 1-year, 2-year, and 3-year time points.
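For reference, the standard form of the proportional hazards model underlying this selection step can be written as follows. The notation is introduced here for illustration and does not appear in the original text: $h_0(t)$ is the baseline hazard, $x_1,\dots,x_p$ are the covariates, and $\hat\beta_j$ is the estimated coefficient of covariate $j$ with standard error $\mathrm{SE}(\hat\beta_j)$.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \cdots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\hat\beta_j),
\qquad
95\%\ \mathrm{CI}_j = \exp\!\left(\hat\beta_j \pm 1.96\,\mathrm{SE}(\hat\beta_j)\right)
```

A hazard ratio above 1 indicates a higher hazard of death than in the reference category, and a value below 1 indicates a lower hazard.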

Model development and comparison

The models in this study were established using R (version 4.1.3). We used the slice_sample function from the dplyr package in R to randomly divide the SEER cohort into a training set and a testing set at a ratio of 7:3. The external validation set in this study consisted of patient data from the Harbin Medical University Cancer Hospital. In the development and validation of the prognostic models, the features selected using the aforementioned methods were incorporated into the following ML models: LASSO, XGBoost, RF, and GBM. These models were designed to predict the OS of SCLC patients with BM at 6 months, 1 year, 2 years, and 3 years, and were subsequently validated using the external validation set. The R packages glmnet, xgboost, randomForest, and gbm were used to construct the four ML models.
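The 7:3 random split can be illustrated with a short Python sketch. This is not the authors' R code: the function name `split_cohort`, the seed, and the use of integer patient indices are assumptions. The training size is truncated to a whole number, which is an assumption chosen because it reproduces the reported set sizes.

```python
import random


def split_cohort(ids, train_fraction=0.7, seed=0):
    """Shuffle the identifiers and cut them into training and testing parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))  # truncate fractional size
    return shuffled[:n_train], shuffled[n_train:]


# 4,227 placeholder identifiers standing in for the SEER cohort
train_ids, test_ids = split_cohort(range(4227))
```

With 4,227 records, this yields 2,958 training and 1,269 testing identifiers, matching the sizes reported in the study.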

Receiver operating characteristic (ROC) curves were plotted and the area under the curve (AUC) was calculated at 6 months, 1 year, 2 years, and 3 years to compare the four established ML models in both the testing set and the external validation set. The model with the best performance was selected as the prognostic prediction model for this study. The model’s performance was further evaluated using a confusion matrix. Accuracy was calculated from the confusion matrix using the formula: (true positives + true negatives)/(total predictions). The F1 score was calculated using the formula: 2 × (precision × recall)/(precision + recall).
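The two confusion-matrix formulas above can be sketched in Python as follows. The counts in the example are hypothetical and are not taken from the study's confusion matrices. The zero-division guards are an added convention.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """(true positives + true negatives) / total predictions."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """2 * (precision * recall) / (precision + recall)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a cohort of 91 patients (illustration only)
tp, tn, fp, fn = 40, 30, 10, 11
acc = accuracy(tp, tn, fp, fn)
f1 = f1_score(tp, fp, fn)
```

Because F1 ignores true negatives, it can diverge from accuracy when the two outcome classes are imbalanced, which is common at short and long prediction horizons.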

Statistical analysis

R software version 4.1.3 (available at http://www.R-project.org) was used for data analysis and graphical presentation. Independent Chi-squared tests were employed to compare variables between groups. The Kaplan-Meier method and the log-rank test were used to estimate OS, CSS, and the median follow-up time for patients. A P value of less than 0.05 was considered statistically significant in all statistical analyses.
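As an illustration of the Kaplan-Meier estimate mentioned above, the following pure-Python sketch computes the product-limit survival curve and the median survival time. It is a teaching sketch with hypothetical toy data, not the R routine used in the study; the function names are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times: follow-up times; events: 1 for death, 0 for censoring.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all records that share the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk  # product-limit step
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored both leave the risk set
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Hypothetical toy cohort: months of follow-up and death indicators
toy_times = [2, 3, 3, 5, 6, 8, 9, 12]
toy_events = [1, 1, 0, 1, 1, 0, 1, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

A median that is "not reached" corresponds to the `None` return value, the same situation that produces an undefined upper confidence bound in reported follow-up times.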


Results

Clinical characteristics of patients in SEER database

The study cohort comprised 4,227 eligible SCLC patients with BM from the SEER database, randomly divided into training (n=2,958, 70%) and testing (n=1,269, 30%) sets. Among the 4,227 included patients, the median OS was 6 months [95% confidence interval (CI): 6–7], and the median CSS was 7 months (95% CI: 6–7) (Figure 2). The baseline characteristics of the training set and testing set are summarized in Table 1. The majority of patients were elderly, with 56.9% aged 60–74 years and 15.8% ≥75 years. Males slightly predominated (52.5%) over females (47.5%). Over half of the patients (50.3%) were married, while 16.9% reported single status. Income distribution showed 41.0% of patients in the $55,000–$74,999 bracket, with 31.7% earning ≥$75,000 annually, reflecting middle-to-high socioeconomic predominance. The upper lobe of the lung was the most frequent primary site (53.8%), followed by the lower lobe (22.6%) and the main bronchus (10.0%). As shown in Table 1, independent Chi-squared tests showed no significant differences between the training and testing sets for any clinicopathological feature (all P≥0.50).

Figure 2 Kaplan-Meier survival analysis. (A) OS and (B) CSS curves for the study cohort. The numbers at risk at each time point are provided below the graph. CI, confidence interval; CSS, cancer-specific survival; OS, overall survival.

Table 1

Clinical characteristics of 4,227 SCLC patients with BM from the SEER database

Clinical characteristics Train set (n=2,958), n (%) Test set (n=1,269), n (%) All (n=4,227), n (%) P
Age (years) >0.99
   18–44 39 (1.3) 18 (1.4) 57 (1.3)
   45–59 768 (26.0) 331 (26.1) 1,099 (26.0)
   60–74 1,677 (56.7) 727 (57.3) 2,404 (56.9)
   ≥75 474 (16.0) 193 (15.2) 667 (15.8)
Sex 0.73
   Female 1,394 (47.1) 615 (48.5) 2,009 (47.5)
   Male 1,564 (52.9) 654 (51.5) 2,218 (52.5)
Race 0.96
   Hispanic (all races) 153 (5.2) 61 (4.8) 214 (5.1)
   Non-Hispanic Black 285 (9.6) 136 (10.7) 421 (10.0)
   Non-Hispanic White 2,371 (80.2) 1,005 (79.2) 3,376 (79.9)
   Other/unknown 149 (5.0) 67 (5.3) 216 (5.1)
Marital 0.54
   Married 1,461 (49.4) 664 (52.3) 2,125 (50.3)
   Single 506 (17.1) 207 (16.3) 713 (16.9)
   Other/unknown 991 (33.5) 398 (31.4) 1,389 (32.9)
Income 0.80
   <$35,000 74 (2.5) 34 (2.7) 108 (2.6)
   $35,000–$54,999 713 (24.1) 330 (26.0) 1,043 (24.7)
   $55,000–$74,999 1,238 (41.9) 497 (39.2) 1,735 (41.0)
   ≥$75,000 933 (31.5) 408 (32.2) 1,341 (31.7)
Primary site 0.99
   Main bronchus 301 (10.2) 122 (9.6) 423 (10.0)
   Upper lobe, lung 1,585 (53.6) 691 (54.5) 2,276 (53.8)
   Middle lobe, lung 102 (3.4) 47 (3.7) 149 (3.5)
   Lower lobe, lung 660 (22.3) 294 (23.2) 954 (22.6)
   Overlapping lesion of lung 41 (1.4) 14 (1.1) 55 (1.3)
   Lung, NOS 269 (9.1) 101 (8.0) 370 (8.8)
Laterality 0.69
   Left-origin of primary 1,250 (42.3) 568 (44.8) 1,818 (43.0)
   Right-origin of primary 1,622 (54.8) 666 (52.5) 2,288 (54.1)
   Other/unknown 86 (2.9) 35 (2.8) 121 (2.9)
Grade >0.99
   Well differentiated 4 (0.1) 1 (0.1) 5 (0.1)
   Moderately differentiated 10 (0.3) 3 (0.2) 13 (0.3)
   Poorly differentiated 219 (7.4) 93 (7.3) 312 (7.4)
   Undifferentiated 290 (9.8) 116 (9.1) 406 (9.6)
   Unknown 2,435 (82.3) 1,056 (83.2) 3,491 (82.6)
Primary tumor size (cm) 0.93
   0< T ≤3 780 (26.4) 350 (27.6) 1,130 (26.7)
   3< T ≤5 765 (25.9) 333 (26.2) 1,098 (26.0)
   5< T ≤7 624 (21.1) 270 (21.3) 894 (21.1)
   7< T ≤10 586 (19.8) 223 (17.6) 809 (19.1)
   T >10 203 (6.9) 93 (7.3) 296 (7.0)
N stage 0.82
   N0 468 (15.8) 205 (16.2) 673 (15.9)
   N1 237 (8.0) 113 (8.9) 350 (8.3)
   N2 1,475 (49.9) 637 (50.2) 2,112 (50.0)
   N3 648 (21.9) 274 (21.6) 922 (21.8)
   NX 130 (4.4) 40 (3.2) 170 (4.0)
Surgery 0.65
   No/unknown 2,566 (86.7) 1,114 (87.8) 3,680 (87.1)
   Yes 392 (13.3) 155 (12.2) 547 (12.9)
Radiotherapy 0.94
   No/unknown 702 (23.7) 295 (23.2) 997 (23.6)
   Yes 2,256 (76.3) 974 (76.8) 3,230 (76.4)
Chemotherapy 0.50
   No/unknown 648 (21.9) 299 (23.6) 947 (22.4)
   Yes 2,310 (78.1) 970 (76.4) 3,280 (77.6)
Months from diagnosis to therapy 0.60
   <1 month 1,504 (50.8) 640 (50.4) 2,144 (50.7)
   ≥1 month 1,258 (42.5) 527 (41.5) 1,785 (42.2)
   Unknown 196 (6.6) 102 (8.0) 298 (7.1)
Bone metastasis 0.81
   No/unknown 2,134 (72.1) 903 (71.2) 3,037 (71.8)
   Yes 824 (27.9) 366 (28.8) 1,190 (28.2)
Liver metastasis 0.56
   No/unknown 2,049 (69.3) 900 (70.9) 2,949 (69.8)
   Yes 909 (30.7) 369 (29.1) 1,278 (30.2)
Lung metastasis 0.89
   No/unknown 2,472 (83.6) 1,068 (84.2) 3,540 (83.7)
   Yes 486 (16.4) 201 (15.8) 687 (16.3)
Distant lymph node metastasis >0.99
   No/unknown 2,692 (91.0) 1,154 (90.9) 3,846 (91.0)
   Yes 266 (9.0) 115 (9.1) 381 (9.0)
Other distant metastasis 0.92
   No/unknown 2,467 (83.4) 1,052 (82.9) 3,519 (83.3)
   Yes 491 (16.6) 217 (17.1) 708 (16.7)

BM, brain metastasis; N, node; NOS, not otherwise specified; SCLC, small cell lung cancer; SEER, Surveillance, Epidemiology, and End Results; T, tumor.

Univariable and multivariable COX regression analysis

The results of the univariate COX regression analysis for OS based on SEER database data are shown in Table 2. Factors with P<0.05 in the univariate COX regression analysis were included in the multivariate COX analysis. In the multivariate COX regression analysis of OS, independent adverse prognostic factors included age ≥75 years [hazard ratio (HR) =1.54], male sex (HR =1.14), single or other/unknown marital status (HR =1.10–1.15), tumor diameter >5 cm (5< T ≤7 cm: HR =1.13; 7< T ≤10 cm: HR =1.13; T >10 cm: HR =1.17), N2/N3/NX stage (HR =1.24–1.34), as well as bone metastasis (HR =1.13), liver metastasis (HR =1.41) and lung metastasis (HR =1.16). Of note, relative to the no/unknown reference group, patients who received surgery (HR =0.77), chemotherapy (HR =0.29), or radiotherapy (HR =0.83) had a significantly lower risk of death, and an interval of ≥1 month from diagnosis to therapy was also associated with a lower risk (HR =0.90).
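The treatment hazard ratios in Table 2 are reported for receipt of treatment against a no/unknown reference group. Taking reciprocals expresses the same estimates as the relative hazard of not receiving treatment. The short Python sketch below shows the conversion using the multivariate OS values from Table 2; the function name `invert_hr` is hypothetical, and because the inputs are rounded, the results are approximate.

```python
# Multivariate OS estimates from Table 2: (HR, lower 95% CI, upper 95% CI)
treatment_hr = {
    "surgery": (0.77, 0.69, 0.85),
    "radiotherapy": (0.83, 0.77, 0.91),
    "chemotherapy": (0.29, 0.27, 0.32),
}


def invert_hr(hr, lower, upper):
    """Swap the reference group: the reciprocal also swaps the CI bounds."""
    return 1 / hr, 1 / upper, 1 / lower


# Approximate hazard of NOT receiving each treatment versus receiving it
no_treatment_hr = {name: invert_hr(*est) for name, est in treatment_hr.items()}
```

For chemotherapy, for example, the reciprocal of 0.29 is roughly 3.4, meaning the hazard of death without chemotherapy was about three to four times that with chemotherapy in this observational cohort.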

Table 2

Univariate and multivariate COX regression analysis of OS in 4,227 SCLC patients with BM

Characteristics; Univariate analysis (HR, 95% CI, P); Multivariate analysis (HR, 95% CI, P)
Age (years)
   18–44 Reference Reference
   45–59 1.11 0.84–1.46 0.47 1.02 0.77–1.35 0.87
   60–74 1.29 0.98–1.69 0.07 1.15 0.88–1.52 0.31
   ≥75 1.94 1.46–2.56 <0.001 1.54 1.16–2.05 0.003
Sex
   Female Reference Reference
   Male 1.11 1.04–1.18 0.002 1.14 1.07–1.22 <0.001
Race
   Hispanic (all races) Reference
   Non-Hispanic Black 1.04 0.88–1.24 0.64
   Non-Hispanic White 1.06 0.92–1.23 0.41
   Other/unknown 0.83 0.68–1.01 0.07
Marital
   Married Reference Reference
   Single 1.06 0.97–1.16 0.18 1.10 1.00–1.21 0.04
   Other/unknown 1.18 1.10–1.27 <0.001 1.15 1.07–1.23 <0.001
Income
   <$35,000 Reference
   $35,000–$54,999 1.07 0.87–1.32 0.51
   $55,000–$74,999 1.00 0.82–1.22 0.98
   ≥$75,000 0.90 0.74–1.11 0.34
Primary site
   Main bronchus Reference
   Upper lobe, lung 1.00 0.90–1.11 0.98
   Middle lobe, lung 0.96 0.79–1.17 0.68
   Lower lobe, lung 1.12 1.00–1.27 0.055
   Overlapping lesion of lung 0.82 0.61–1.12 0.21
   Lung, NOS 1.03 0.89–1.20 0.65
Laterality
   Left-origin of primary Reference
   Right-origin of primary 1.03 0.96–1.10 0.40
   Other/unknown 0.93 0.77–1.13 0.47
Grade
   Well differentiated Reference
   Moderately differentiated 2.02 0.71–5.73 0.19
   Poorly differentiated 1.93 0.80–4.68 0.14
   Undifferentiated 1.95 0.81–4.72 0.14
   Unknown 2.00 0.83–4.81 0.12
Primary tumor size (cm)
   0< T ≤3 Reference Reference
   3< T ≤5 1.12 1.02–1.22 0.01 1.09 1.00–1.19 0.059
   5< T ≤7 1.14 1.04–1.25 0.005 1.13 1.03–1.24 0.008
   7< T ≤10 1.14 1.04–1.25 0.006 1.13 1.02–1.24 0.02
   T >10 1.17 1.03–1.34 0.02 1.17 1.02–1.34 0.03
N stage
   N0 Reference Reference
   N1 1.07 0.94–1.23 0.30 1.02 0.89–1.17 0.78
   N2 1.23 1.13–1.35 <0.001 1.27 1.16–1.40 <0.001
   N3 1.27 1.15–1.41 <0.001 1.34 1.20–1.49 <0.001
   NX 1.17 0.98–1.40 0.08 1.24 1.04–1.49 0.02
Surgery
   No/unknown Reference Reference
   Yes 0.72 0.65–0.79 <0.001 0.77 0.69–0.85 <0.001
Radiotherapy
   No/unknown Reference Reference
   Yes 0.66 0.61–0.71 <0.001 0.83 0.77–0.91 <0.001
Chemotherapy
   No/unknown Reference Reference
   Yes 0.31 0.29–0.34 <0.001 0.29 0.27–0.32 <0.001
Months from diagnosis to therapy
   <1 month Reference Reference
   ≥1 month 0.97 0.91–1.03 0.35 0.90 0.84–0.96 0.003
   Unknown 2.76 2.43–3.12 <0.001 0.86 0.73–1.01 0.07
Bone metastasis
   No/unknown Reference Reference
   Yes 1.30 1.21–1.39 <0.001 1.13 1.05–1.22 0.001
Liver metastasis
   No/unknown Reference Reference
   Yes 1.47 1.37–1.57 <0.001 1.41 1.31–1.52 <0.001
Lung metastasis
   No/unknown Reference Reference
   Yes 1.24 1.14–1.35 <0.001 1.16 1.06–1.26 0.001
Distant lymph node metastasis
   No/unknown Reference
   Yes 1.09 0.97–1.22 0.15
Other distant metastasis
   No/unknown Reference Reference
   Yes 1.11 1.02–1.21 0.02 1.08 0.99–1.18 0.10

BM, brain metastasis; CI, confidence interval; HR, hazard ratio; N, node; NOS, not otherwise specified; OS, overall survival; SCLC, small cell lung cancer; T, tumor.
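The two-stage screening (univariate P<0.05, then multivariate P<0.05) can be traced with the P values in Table 2. The Python sketch below is illustrative only. It assumes a variable passes a stage when at least one of its levels has P<0.05, which is a rule inferred from the table rather than stated in the text, and it uses 0.0009 as a stand-in for entries reported as "<0.001". The dictionary keys are hypothetical names.

```python
ALPHA = 0.05
LT_001 = 0.0009  # placeholder for table entries reported as "<0.001"

# Smallest univariate P value across the levels of each variable (Table 2)
univariate_min_p = {
    "age": LT_001, "sex": 0.002, "race": 0.07, "marital_status": LT_001,
    "income": 0.34, "primary_site": 0.055, "laterality": 0.40,
    "grade": 0.12, "tumor_size": 0.005, "n_stage": LT_001,
    "surgery": LT_001, "radiotherapy": LT_001, "chemotherapy": LT_001,
    "months_to_therapy": LT_001, "bone_metastasis": LT_001,
    "liver_metastasis": LT_001, "lung_metastasis": LT_001,
    "distant_lymph_node_metastasis": 0.15, "other_distant_metastasis": 0.02,
}

# Smallest multivariate P value across levels, for variables that entered
multivariate_min_p = {
    "age": 0.003, "sex": LT_001, "marital_status": LT_001,
    "tumor_size": 0.008, "n_stage": LT_001, "surgery": LT_001,
    "radiotherapy": LT_001, "chemotherapy": LT_001,
    "months_to_therapy": 0.003, "bone_metastasis": 0.001,
    "liver_metastasis": LT_001, "lung_metastasis": 0.001,
    "other_distant_metastasis": 0.10,
}

# Stage 1: variables carried forward from the univariate analysis
stage1 = [v for v, p in univariate_min_p.items() if p < ALPHA]

# Stage 2: variables that remain significant in the multivariate analysis
selected = [v for v in stage1 if multivariate_min_p[v] < ALPHA]
```

Under these assumptions, 13 variables pass the univariate stage and 12 remain after the multivariate stage, with other distant metastasis dropping out (P=0.10).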

Table 3 presents the results of the univariate and multivariate COX regression analyses for CSS using data from the SEER database. In the multivariate analysis, tumor diameter >3 cm showed prognostic significance (3< T ≤5 cm: HR =1.10; 5< T ≤7 cm: HR =1.14; 7< T ≤10 cm: HR =1.15; T >10 cm: HR =1.17). An interval of ≥1 month from diagnosis to therapy remained associated with a lower risk (HR =0.89), and patients who received surgery (HR =0.75), chemotherapy (HR =0.28), or radiotherapy (HR =0.85) had a significantly lower risk of cancer-specific death than the no/unknown reference group. The impact of metastatic patterns on CSS was similar to that on OS, with the highest risk for liver metastases (HR =1.42), followed by lung metastases (HR =1.16) and bone metastases (HR =1.14).

Table 3

Univariate and multivariate COX regression analysis of CSS in 4,227 SCLC patients with BM

Characteristics; Univariate analysis (HR, 95% CI, P); Multivariate analysis (HR, 95% CI, P)
Age (years)
   18–44 Reference Reference
   45–59 1.11 0.84–1.47 0.48 1.02 0.77–1.35 0.91
   60–74 1.29 0.98–1.71 0.07 1.15 0.87–1.52 0.32
   ≥75 1.91 1.44–2.55 <0.001 1.51 1.13–2.01 0.006
Sex
   Female Reference Reference
   Male 1.10 1.03–1.17 0.003 1.14 1.07–1.22 <0.001
Race
   Hispanic (all races) Reference
   Non-Hispanic Black 1.02 0.86–1.22 0.80
   Non-Hispanic White 1.07 0.92–1.24 0.37
   Other/unknown 0.83 0.67–1.02 0.07
Marital
   Married Reference Reference
   Single 1.05 0.96–1.15 0.25 1.09 0.99–1.20 0.07
   Other/unknown 1.19 1.11–1.28 <0.001 1.15 1.07–1.24 <0.001
Income
   <$35,000 Reference
   $35,000–$54,999 1.07 0.87–1.32 0.52
   $55,000–$74,999 1.00 0.81–1.23 0.98
   ≥$75,000 0.92 0.74–1.13 0.40
Primary site
   Main bronchus Reference
   Upper lobe, lung 0.99 0.89–1.11 0.88
   Middle lobe, lung 0.98 0.80–1.19 0.83
   Lower lobe, lung 1.11 0.99–1.26 0.09
   Overlapping lesion of lung 0.83 0.61–1.13 0.24
   Lung, NOS 1.03 0.89–1.19 0.72
Laterality
   Left-origin of primary Reference
   Right-origin of primary 1.03 0.96–1.10 0.38
   Other/unknown 0.89 0.73–1.09 0.27
Grade
   Well differentiated Reference
   Moderately differentiated 2.51 0.81–7.78 0.11
   Poorly differentiated 2.33 0.87–6.24 0.09
   Undifferentiated 2.36 0.88–6.33 0.09
   Unknown 2.38 0.89–6.35 0.08
Primary tumor size (cm)
   0< T ≤3 Reference Reference
   3< T ≤5 1.13 1.04–1.24 0.006 1.10 1.01–1.21 0.03
   5< T ≤7 1.15 1.05–1.26 0.003 1.14 1.04–1.26 0.006
   7< T ≤10 1.17 1.06–1.29 0.001 1.15 1.05–1.27 0.004
   T >10 1.18 1.03–1.35 0.02 1.17 1.02–1.35 0.03
N stage
   N0 Reference Reference
   N1 1.07 0.93–1.23 0.33 1.01 0.88–1.16 0.84
   N2 1.24 1.13–1.36 <0.001 1.28 1.16–1.41 <0.001
   N3 1.27 1.14–1.41 <0.001 1.33 1.19–1.48 <0.001
   NX 1.20 1.00–1.43 0.052 1.27 1.06–1.52 0.01
Surgery
   No/unknown Reference Reference
   Yes 0.70 0.64–0.77 <0.001 0.75 0.68–0.83 <0.001
Radiotherapy
   No/unknown Reference Reference
   Yes 0.66 0.62–0.72 <0.001 0.85 0.78–0.92 <0.001
Chemotherapy
   No/unknown Reference Reference
   Yes 0.30 0.28–0.33 <0.001 0.28 0.26–0.31 <0.001
Months from diagnosis to therapy
   <1 month Reference Reference
   ≥1 month 0.97 0.90–1.03 0.31 0.89 0.84–0.96 0.002
   Unknown 2.79 2.46–3.17 <0.001 0.86 0.73–1.01 0.07
Bone metastasis
   No/unknown Reference Reference
   Yes 1.31 1.22–1.41 <0.001 1.14 1.06–1.23 <0.001
Liver metastasis
   No/unknown Reference Reference
   Yes 1.48 1.38–1.59 <0.001 1.42 1.32–1.53 <0.001
Lung metastasis
   No/unknown Reference Reference
   Yes 1.25 1.15–1.37 <0.001 1.16 1.07–1.27 <0.001
Distant lymph node metastasis
   No/unknown Reference
   Yes 1.09 0.97–1.22 0.16
Other distant metastasis
   No/unknown Reference Reference
   Yes 1.12 1.02–1.22 0.01 1.08 0.98–1.18 0.10

CSS, cancer-specific survival; CI, confidence interval; HR, hazard ratio; N, node; NOS, not otherwise specified; SCLC, small cell lung cancer; T, tumor.

Establishment and evaluation of prognostic prediction models

In this study, based on the results of multivariate COX regression analysis for OS and CSS, age, sex, marital status, primary tumor size, N stage, bone metastasis, liver metastasis, lung metastasis, months from diagnosis to therapy, surgical history, chemotherapy, and radiotherapy were used as model features. The ML methods of LASSO, XGBoost, RF, and GBM were used to train the models in 2,958 patients and internally validate them in 1,269 patients, predicting the survival status of SCLC patients with BM at 6 months, 1 year, 2 years, and 3 years. Overall, the four ML models demonstrated comparable efficacy in the test set, with AUC values of 0.70 or higher (Table 4).

Table 4

AUC for each predictive model in the test set

Model 6-month AUC 1-year AUC 2-year AUC 3-year AUC
LASSO 0.771 0.724 0.753 0.718
XGBoost 0.771 0.725 0.737 0.710
RF 0.729 0.712 0.736 0.729
GBM 0.771 0.723 0.728 0.702

AUC, area under the curve; LASSO, least absolute shrinkage and selection operator; XGBoost, eXtreme Gradient Boosting; RF, random forest; GBM, gradient boosting machine.
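An AUC value can be read as the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it. The Python sketch below computes the AUC directly from that pairwise definition. The labels and scores are hypothetical toy values, and the function name `pairwise_auc` is an assumption; it is not the routine used to produce the table.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties in the predicted score count as half a correctly ranked pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute an AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = died by the horizon) and predicted risks
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = pairwise_auc(toy_labels, toy_scores)
```

In the toy data, 8 of the 9 event and non-event pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ranking.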

Ninety-one patients who met the enrollment criteria at the Harbin Medical University Cancer Hospital were included as an independent external validation set for this study, with their basic characteristics shown in Table 5. The median follow-up time of this cohort was 30.5 months [95% CI: 29.7–not applicable (NA)], and the median OS was 10.9 months (95% CI: 9.14–13.3).

Table 5

Clinical characteristics of 91 SCLC patients with BM from the external validation set

Clinical characteristics External validation set (n=91), n (%)
Age (years)
   18–44 3 (3.3)
   45–59 38 (41.8)
   60–74 49 (53.8)
   ≥75 1 (1.1)
Sex
   Female 27 (29.7)
   Male 64 (70.3)
Marital
   Married 86 (94.5)
   Single 1 (1.1)
   Divorced 4 (4.4)
Primary tumor size (cm)
   0< T ≤3 16 (17.6)
   3< T ≤5 24 (26.4)
   5< T ≤7 29 (31.9)
   7< T ≤10 16 (17.6)
   T >10 6 (6.6)
N stage
   N0 14 (15.4)
   N1 4 (4.4)
   N2 22 (24.2)
   N3 51 (56.0)
Surgery
   No 85 (93.4)
   Yes 6 (6.6)
Radiotherapy
   No 35 (38.5)
   Yes 56 (61.5)
Chemotherapy
   No 7 (7.7)
   Yes 84 (92.3)
Months from diagnosis to therapy
   <1 month 91 (100)
   ≥1 month 0 (0.0)
Bone metastasis
   No 81 (89.0)
   Yes 10 (11.0)
Liver metastasis
   No 73 (80.2)
   Yes 18 (19.8)
Lung metastasis
   No 82 (90.1)
   Yes 9 (9.9)

BM, brain metastasis; N, node; SCLC, small cell lung cancer; T, tumor.

In external validation, the LASSO model demonstrated excellent predictive performance and robustness, with AUC values of 0.801, 0.763, 0.838, and 0.900 for the prediction at 6-month, 1-year, 2-year, and 3-year survival, respectively (Table 6). Combining the stable performance of the model in the train set (AUC: 0.741–0.789) and the test set (AUC: 0.718–0.771), we concluded that the LASSO model had the best predictive efficacy and clinical applicability among the four ML methods, and its ROC curves are shown in Figure 3.

Table 6

AUC for each predictive model in the external validation set

Model 6-month AUC 1-year AUC 2-year AUC 3-year AUC
LASSO 0.801 0.763 0.838 0.900
XGBoost 0.788 0.758 0.835 0.857
RF 0.675 0.633 0.737 0.775
GBM 0.784 0.747 0.801 0.769

AUC, area under the curve; GBM, gradient boosting machine; LASSO, least absolute shrinkage and selection operator; RF, random forest; XGBoost, eXtreme Gradient Boosting.

Figure 3 ROC curves of the LASSO model in the training set, test set, and external validation set at 6 months, 1 year, 2 years, and 3 years. (A) Training set; (B) test set; (C) external validation set. AUC, area under the curve; LASSO, least absolute shrinkage and selection operator; ROC, receiver operating characteristic.
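For readers less familiar with the metric, an AUC at a single time point can be read as the probability that a randomly chosen patient who died by that time was assigned a higher risk than a randomly chosen patient who survived. The sketch below computes it in that rank-based form. It assumes every patient's status at the horizon is known and ignores censoring, which time-dependent ROC methods handle explicitly; this section does not state which estimator the authors used, and the names and data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative case.

    scores: predicted risk of death by the horizon (higher means higher risk)
    labels: 1 if the patient died by the horizon, 0 otherwise
    Ties between a positive and a negative case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six patients at one horizon.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
print(round(auc(toy_scores, toy_labels), 3))
```

In the toy data, eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9.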

To further assess the clinical utility of the LASSO model, we calculated its accuracy and F1 score at each time point from confusion matrices (Figure 4). The accuracies of the 6-month, 1-year, 2-year, and 3-year survival models were 0.77, 0.69, 0.75, and 0.86, and the F1 scores were 0.60, 0.68, 0.83, and 0.92, respectively. Both metrics were highest at 3 years, indicating that the LASSO model has high clinical application value in long-term survival prediction.

Figure 4 Performance evaluation of LASSO survival prediction models using confusion matrices in the external validation set. (A) 6-month survival prediction; (B) 1-year survival prediction; (C) 2-year survival prediction; (D) 3-year survival prediction. LASSO, least absolute shrinkage and selection operator; TN, true negative; TP, true positive.
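Accuracy and F1 score are both simple functions of the four confusion-matrix cells: accuracy is (TP + TN) / (TP + FP + TN + FN), and F1 is 2TP / (2TP + FP + FN). A minimal sketch follows; the counts are hypothetical and are not taken from Figure 4.

```python
def accuracy_and_f1(tp, fp, tn, fn):
    """Return (accuracy, F1) from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and recall; it ignores TN.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical counts for illustration only.
acc, f1 = accuracy_and_f1(tp=40, fp=10, tn=30, fn=20)
print(round(acc, 3), round(f1, 3))
```

Because F1 does not use true negatives, it can diverge from accuracy when the two outcome classes are unbalanced, which may be one reason the two metrics differ most at the 6-month time point.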

Importance of clinical characteristics

The feature importance analysis based on SHAP values revealed the contribution of each prognostic factor in the LASSO prognostic model (Figure 5). The feature ranking was the same at all time points, and the top five factors were liver metastasis, surgical history, radiotherapy, lung metastasis, and chemotherapy. Liver metastasis (0.0454–0.0463) ranked first and lung metastasis (0.0333–0.0338) ranked fourth, indicating that distant metastasis contributed strongly to the model. Surgical history (0.0431–0.0439), radiotherapy (0.0364–0.0372), and chemotherapy (0.0324–0.0334) ranked second, third, and fifth, respectively, indicating the critical role of therapeutic interventions in prognosis.

Figure 5 SHAP-based feature importance ranking of the LASSO prognostic model across different timepoints. (A) 6-month; (B) 1-year; (C) 2-year; (D) 3-year. LASSO, least absolute shrinkage and selection operator; N, node; SHAP, SHapley Additive exPlanations.
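For a linear model, SHAP values have a simple closed form when features are treated as independent: the contribution of a feature for one patient is its coefficient multiplied by the feature's deviation from its mean, and a global importance score is the mean absolute contribution across patients. The sketch below illustrates this under those assumptions. This section does not describe how the SHAP values in Figure 5 were computed, so the function, coefficients, and data here are hypothetical.

```python
def linear_shap_importance(weights, rows):
    """Mean absolute SHAP value per feature for a linear model.

    weights: one coefficient per feature
    rows:    list of feature vectors, one per patient
    Assumes independent features, so that the SHAP value of feature j
    for a patient x is weights[j] * (x[j] - mean of feature j).
    """
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(weights))]
    importance = []
    for j, w in enumerate(weights):
        contribs = [abs(w * (r[j] - means[j])) for r in rows]
        importance.append(sum(contribs) / n)
    return importance


# Hypothetical coefficients and binary features for four patients.
toy_weights = [0.8, -0.5, 0.0]
toy_rows = [[1, 0, 1], [0, 1, 0], [0, 1, 1], [1, 1, 0]]
print(linear_shap_importance(toy_weights, toy_rows))
```

A coefficient shrunk exactly to zero, as an L1 penalty can do, gives an importance of zero, and a rarely varying feature receives a smaller score than a balanced one with the same coefficient.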

Discussion

Currently, with the personalization of cancer treatment plans and the growing health awareness of patients and their families, accurate prognostic assessment has received widespread attention. With the wide application of ML, there is a pressing need in clinical work for models that can accurately predict the prognosis of patients with different types of cancer, helping doctors better assess patients' expected survival and risk of disease progression. In this study, we developed prognostic models for SCLC patients with BM at initial diagnosis using four ML methods (LASSO, XGBoost, RF, and GBM) based on the SEER database. The four models performed comparably for survival prediction in the test set, whereas LASSO achieved the highest AUC at every time point in the external validation set. Overall, LASSO was the best prognostic model for SCLC patients with BM at initial diagnosis in this study. A previous SCLC prognostic study established a nomogram for patients receiving chemotherapy; its AUCs for predicting OS at 1, 2, and 3 years in extensive-stage patients were 0.71, 0.66, and 0.66, respectively (21). The AUCs of the LASSO ML model in the present study exceeded 0.70 in the training, test, and external validation sets, and its prognostic performance was superior to that of the traditional nomogram model. In addition, a breast cancer BM prognostic study (16) found that the AUCs of the XGBoost prognostic model in external validation at 6 months, 1 year, 2 years, and 3 years were 0.820, 0.732, 0.795, and 0.936, respectively. In comparison, the external validation AUCs of the LASSO prognostic model in this study at the same time points were 0.801, 0.763, 0.838, and 0.900, showing comparable accuracy. Therefore, we concluded that the LASSO ML model of this study performed well in predicting the prognosis of SCLC with BM at initial diagnosis.

This study identified independent risk factors for SCLC patients with BM at initial diagnosis, including male sex, age ≥75 years, larger primary tumor size, more advanced N stage, liver metastasis, lung metastasis, and bone metastasis. BM accompanied by a larger primary tumor, more advanced N stage, or distant organ metastases implies a greater systemic tumor burden and therefore a poorer prognosis, which is consistent with several previous studies (22-24). In addition, OS was significantly better in female patients than in male patients. A multicenter cohort study showed that among SCLC patients, fewer women smoked, and SCLC patients without a history of smoking had a lower tumor mutational burden, which may also contribute to longer survival in women (25). Early detection, diagnosis, and treatment are emphasized in cancer care, yet a longer interval from diagnosis to therapy was a protective factor for survival in this study. Previous prognostic models for patients with BM from lung cancer have similarly shown that months from diagnosis to therapy is an independent protective factor (23). In light of clinical practice, a plausible explanation is that patients with BM symptoms, a high systemic tumor burden, and a poor general condition are more likely to be treated as soon as possible, which lowers the average survival time of patients treated early.

In this study, surgery, radiotherapy, and chemotherapy were independent protective factors for OS in SCLC patients with BM at initial diagnosis. Chemotherapy is the mainstay of SCLC treatment, but because of the blood-brain barrier, it is difficult for chemotherapeutic agents to achieve effective control of intracerebral lesions, especially in SCLC patients with obvious BM symptoms. Therefore, local radiotherapy has been widely applied in clinical practice for the control of BMs. Because whole-brain radiation therapy (WBRT) can cause cognitive decline, stereotactic radiosurgery, which delivers precise high-dose radiation to the tumor, has gained much attention and is mostly used in BM patients with a single or a limited number of metastases (26,27). It has shown excellent efficacy in the control of brain lesions and symptoms, including in patients with recurrence after WBRT, with fewer adverse effects (28-30). Given the poor prognosis of SCLC, clinicians are also actively exploring combination strategies involving chemotherapy, immunotherapy, and anti-angiogenic agents to improve outcomes in patients with extensive-stage SCLC, including those with BMs (31). Nevertheless, there remains a pressing need to develop more effective therapeutic approaches.

Given the complexity of factors affecting cancer prognosis, both linear and nonlinear ML models have been increasingly applied in recent years to develop prognostic prediction models. In a study by Ji et al., the XGBoost prognostic model had an AUC of 0.84 in predicting OS for kidney cancer at 3 years, and AUCs of 0.83 and 0.91 in internal and external validation cohorts, which was superior to other ML models such as the support vector machine (SVM) (32). However, a study published in the BMJ constructed four models (Cox proportional hazards, competing risk regression, XGBoost, and neural network) to predict the 10-year risk of death in patients with invasive breast cancer, and notably the Cox and competing risk regression models were more accurate (33). In this study, the LASSO model likewise achieved higher AUCs than the nonlinear XGBoost, RF, and GBM models. In a study on lung cancer prognostic modeling, artificial neural networks, recurrent neural networks, and convolutional neural networks achieved higher AUCs and accuracy than RF and SVM models (15). Studies comparing deep learning architectures with other ML methods have thus reached different conclusions (15,32,33). Therefore, no single ML method consistently performs best across clinical modeling tasks; multiple modeling methods can be compared and the best-performing model then selected for practical application.
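The comparative evaluation described here can be illustrated with the external validation AUCs reported in Table 6. The sketch below ranks the four models by their mean AUC across the four time points and records the best model at each time point. The mean is used only as an illustrative summary and is not the authors' stated selection rule; the variable names are hypothetical.

```python
# External validation AUCs as reported in Table 6
# (order: 6-month, 1-year, 2-year, 3-year).
external_auc = {
    "LASSO": [0.801, 0.763, 0.838, 0.900],
    "XGBoost": [0.788, 0.758, 0.835, 0.857],
    "RF": [0.675, 0.633, 0.737, 0.775],
    "GBM": [0.784, 0.747, 0.801, 0.769],
}
horizons = ["6-month", "1-year", "2-year", "3-year"]

# Mean AUC per model across the four time points.
mean_auc = {m: sum(v) / len(v) for m, v in external_auc.items()}

# Models ordered from highest to lowest mean AUC.
ranking = sorted(mean_auc, key=mean_auc.get, reverse=True)

# Model with the highest AUC at each individual time point.
best_per_horizon = {
    h: max(external_auc, key=lambda m: external_auc[m][i])
    for i, h in enumerate(horizons)
}

for model in ranking:
    print(f"{model}: mean AUC {mean_auc[model]:.4f}")
print(best_per_horizon)
```

On these values LASSO ranks first overall and at every time point, although its margin over XGBoost is small at 6 months, 1 year, and 2 years, and no confidence intervals for the AUCs are given in this section.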

This study has some limitations. First, the SEER database lacks detailed information about BM lesions, such as their size, number, and location and the presence or absence of BM-related symptoms. As a result, these factors were not included in the analysis, which may affect the clinical applicability of the model. Second, the external validation data were derived from a single center, and the findings should be further validated with multicenter data. Third, although four models were evaluated in this study and their prognostic performance was fair, there is still room for improvement. Future research could explore novel approaches, such as ensemble learning or graph neural networks, to enhance prediction accuracy.

In summary, we constructed ML prognostic models for SCLC patients initially diagnosed with BM and confirmed the superiority of the LASSO algorithm in this scenario, providing a reference for clinical practice and future research.


Conclusions

In this study, ML prognostic models were constructed based on the SEER database for patients with SCLC and BM at initial diagnosis. The LASSO model performed best overall (AUC in external validation from 6 months to 3 years: 0.763–0.900), with key predictors including liver metastasis, lung metastasis, surgery, radiotherapy, and chemotherapy. The model provides a quantitative tool for individualized survival assessment of SCLC patients with BM at initial diagnosis and holds substantial clinical applicability.


Acknowledgments

We thank the SEER program tumor registries for their efforts in creating the SEER database.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-961/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-961/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-961/prf

Funding: None.

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-961/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the ethics committee of Harbin Medical University Cancer Hospital (No. KY2024-08) and individual consent for this retrospective analysis was waived.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Siegel RL, Kratzer TB, Giaquinto AN, et al. Cancer statistics, 2025. CA Cancer J Clin 2025;75:10-45.
  2. Han B, Zheng R, Zeng H, et al. Cancer incidence and mortality in China, 2022. J Natl Cancer Cent 2024;4:47-53.
  3. Megyesfalvi Z, Gay CM, Popper H, et al. Clinical insights into small cell lung cancer: Tumor heterogeneity, diagnosis, therapy, and future directions. CA Cancer J Clin 2023;73:620-52.
  4. Wang Q, Gümüş ZH, Colarossi C, et al. SCLC: Epidemiology, Risk Factors, Genetic Susceptibility, Molecular Pathology, Screening, and Early Detection. J Thorac Oncol 2023;18:31-46.
  5. Surveillance, Epidemiology, and End Results. Small cell carcinoma of the lung and bronchus. [Internet]. [cited 2025 July 29]. Available online: https://seer.cancer.gov/statistics
  6. Kim SY, Park HS, Chiang AC. Small Cell Lung Cancer: A Review. JAMA 2025;333:1906-17.
  7. Lukas RV, Gondi V, Kamson DO, et al. State-of-the-art considerations in small cell lung cancer brain metastases. Oncotarget 2017;8:71223-33.
  8. Rudin CM, Brambilla E, Faivre-Finn C, et al. Small-cell lung cancer. Nat Rev Dis Primers 2021;7:3.
  9. Amano K, Maeda I, Shimoyama S, et al. The Accuracy of Physicians' Clinical Predictions of Survival in Patients With Advanced Cancer. J Pain Symptom Manage 2015;50:139-46.e1.
  10. White N, Reid F, Harris A, et al. A Systematic Review of Predictions of Survival in Palliative Care: How Accurate Are Clinicians and Who Are the Experts? PLoS One 2016;11:e0161407.
  11. Lakin JR, Robinson MG, Bernacki RE, et al. Estimating 1-Year Mortality for High-Risk Primary Care Patients Using the "Surprise" Question. JAMA Intern Med 2016;176:1863-5.
  12. Tran KA, Kondrashova O, Bradley A, et al. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med 2021;13:152.
  13. Zhang Y, Zhang Z, Wei L, et al. Construction and validation of nomograms combined with novel machine learning algorithms to predict early death of patients with metastatic colorectal cancer. Front Public Health 2022;10:1008137.
  14. Xi NM, Wang L, Yang C. Improving the diagnosis of thyroid cancer by machine learning and clinical data. Sci Rep 2022;12:11143.
  15. Doppalapudi S, Qiu RG, Badr Y. Lung cancer survival period prediction and understanding: Deep learning approaches. Int J Med Inform 2021;148:104371.
  16. Li C, Liu M, Zhang Y, et al. Novel models by machine learning to predict prognosis of breast cancer brain metastases. J Transl Med 2023;21:404.
  17. Che WQ, Li YJ, Tsang CK, et al. How to use the Surveillance, Epidemiology, and End Results (SEER) data: research design and methodology. Mil Med Res 2023;10:50.
  18. Munai E, Zeng S, Yuan Z, et al. Machine learning-based prediction model for brain metastasis in patients with extensive-stage small cell lung cancer. Sci Rep 2024;14:28790.
  19. Lee C, Light A, Alaa A, et al. Application of a novel machine learning framework for predicting non-metastatic prostate cancer-specific mortality in men using the Surveillance, Epidemiology, and End Results (SEER) database. Lancet Digit Health 2021;3:e158-65.
  20. Zhong X, Lin Y, Zhang W, et al. Predicting diagnosis and survival of bone metastasis in breast cancer using machine learning. Sci Rep 2023;13:18301.
  21. Liang M, Chen M, Singh S, et al. Prognostic Nomogram for Overall Survival in Small Cell Lung Cancer Patients Treated with Chemotherapy: A SEER-Based Retrospective Cohort Study. Adv Ther 2022;39:346-59.
  22. Zhang GH, Liu YJ, De Ji M. Risk Factors, Prognosis, and a New Nomogram for Predicting Cancer-Specific Survival Among Lung Cancer Patients with Brain Metastasis: A Retrospective Study Based on SEER. Lung 2022;200:83-93.
  23. Zhao X, Li C, Liu M, et al. Machine learning-based prognostic models and factors influencing the benefit of surgery on primary lesion for patients with lung cancer brain metastases. Am J Cancer Res 2024;14:5154-77.
  24. Hao Y, Li G. Risk and prognostic factors of brain metastasis in lung cancer patients: a Surveillance, Epidemiology, and End Results population based cohort study. Eur J Cancer Prev 2023;32:498-511.
  25. Thomas A, Mian I, Tlemsani C, et al. Clinical and Genomic Characteristics of Small Cell Lung Cancer in Never Smokers: Results From a Retrospective Multicenter Cohort Study. Chest 2020;158:1723-33.
  26. Gondi V, Bauman G, Bradfield L, et al. Radiation Therapy for Brain Metastases: An ASTRO Clinical Practice Guideline. Pract Radiat Oncol 2022;12:265-82.
  27. Li H, Zhao Y, Ma T, et al. Radiotherapy for extensive-stage small-cell lung cancer in the immunotherapy era. Front Immunol 2023;14:1132482.
  28. Rusthoven CG, Yamamoto M, Bernhardt D, et al. Evaluation of First-line Radiosurgery vs Whole-Brain Radiotherapy for Small Cell Lung Cancer Brain Metastases: The FIRE-SCLC Cohort Study. JAMA Oncol 2020;6:1028-37.
  29. Viani GA, Gouveia AG, Louie AV, et al. Stereotactic radiosurgery for brain metastases from small cell lung cancer without prior whole-brain radiotherapy: A meta-analysis. Radiother Oncol 2021;162:45-51.
  30. Gaebe K, Erickson AW, Chen S, et al. Brain metastasis burden and management in patients with small cell lung cancer in Canada: a retrospective, population-based cohort study. EClinicalMedicine 2024;77:102871.
  31. Liu M, Qiu G, Guan W, et al. Induction chemotherapy followed by camrelizumab plus apatinib and chemotherapy as first-line treatment for extensive-stage small-cell lung cancer: a multicenter, single-arm trial. Signal Transduct Target Ther 2025;10:65.
  32. Ji L, Zhang W, Huang J, et al. Bone metastasis risk and prognosis assessment models for kidney cancer based on machine learning. Front Public Health 2022;10:1015952.
  33. Clift AK, Dodwell D, Lord S, et al. Development and internal-external validation of statistical and machine learning models for breast cancer prognostication: cohort study. BMJ 2023;381:e073800.
Cite this article as: Guo M, Zhu J, Duan X, Ren P, Wang H, Zhang Y, Lu H, Zhao Y. Development of machine learning-based prognostic models for small cell lung cancer with brain metastases: an analysis of SEER and Chinese populations. J Thorac Dis 2025;17(11):9622-9641. doi: 10.21037/jtd-2025-961
