A computed tomography-based deep learning radiomics for predicting the response to neoadjuvant chemotherapy combined with immunotherapy in patients with locally advanced esophageal cancer: a multicenter cohort study
Original Article

Minhua Ye1,2#, Junjie Mao2#, Jiang Jin2#, Hao Liu3, Haixie Guo4, Yunrui Xu5, Pengjie Yang6, Liang Ma1

1Department of Cardiovascular Surgery, The First Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, China; 2Department of Thoracic Surgery, Taizhou Hospital, Zhejiang University School of Medicine, Taizhou, China; 3Department of Thoracic Surgery, WenZhou Medical College Affiliated Taizhou Hospital, Taizhou, China; 4Department of Thoracic Surgery, Taizhou Hospital of Zhejiang Province, Taizhou, China; 5Donghua University, Shanghai, China; 6Thoracic Surgery Department, Peking University Cancer Hospital Inner Mongolia Hospital (Cancer Hospital Affiliated to Inner Mongolia Medical University), Hohhot, China

Contributions: (I) Conception and design: P Yang, J Mao, L Ma; (II) Administrative support: M Ye, Y Xu; (III) Provision of study materials or patients: M Ye, J Mao, J Jin, P Yang; (IV) Collection and assembly of data: H Liu, H Guo, Y Xu; (V) Data analysis and interpretation: J Mao, Y Xu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work as co-first authors.

Correspondence to: Liang Ma, MD. Department of Cardiovascular Surgery, The First Affiliated Hospital of Zhejiang University School of Medicine, 79 Qingchun Road, Hangzhou 310006, China. Email: ML1402@zju.edu.cn; Pengjie Yang, MD. Thoracic Surgery Department, Peking University Cancer Hospital Inner Mongolia Hospital (Cancer Hospital Affiliated to Inner Mongolia Medical University), No. 42, Zhaowuda Road, Saihan District, Hohhot 010110, China. Email: yangpengjie5933264@163.com.

Background: Esophageal cancer ranks sixth in global cancer mortality, and esophageal squamous cell carcinoma (ESCC) is its main subtype in China. Responses to neoadjuvant chemoimmunotherapy for locally advanced ESCC are heterogeneous, and non-invasive tools for identifying major pathologic responders before treatment are lacking, which leads to unnecessary expenditure and adverse events. This study aimed to develop a deep learning (DL)-based radiomic nomogram and to assess its value in identifying major pathologic response (MPR) before treatment in patients with locally advanced ESCC who are scheduled to receive neoadjuvant chemoimmunotherapy. Such a tool could reduce unnecessary drug costs and the risk of treatment-related adverse events, thereby supporting personalized treatment planning and prognostic evaluation.

Methods: This study comprised 60 patients with a confirmed pathological diagnosis of ESCC. These participants were divided into a training set (n=42) and a testing set (n=18). From arterial-phase computed tomography (CT) images, radiomic features were obtained, while DL features were derived using a ResNet101-based network. Several machine learning classifiers—such as support vector machine, logistic regression, k-nearest neighbors, ExtraTrees, random forest, and XGBoost—were evaluated and compared. Classification performance was examined via receiver operating characteristic (ROC) curves and quantified by the area under the curve (AUC). An integrated model was subsequently developed by combining radiomics and clinical characteristics. The model’s predictive ability was evaluated using ROC analysis, and its practical value was further investigated through decision curve analysis.

Results: A total of 1,835 radiomics features and 2,048 DL features were extracted from the CT images. Through dimensionality reduction and feature selection, 8 radiomics features and 46 DL features were selected to form the deep learning radiomics (DLR). The combined DLR feature model demonstrated high predictive efficiency and robustness, with an AUC of 0.844 in the testing cohort. The predictive efficiency of different testing models was compared, and XGBoost showed superior predictive performance, achieving an AUC of 0.844 in the testing cohort. Finally, a nomogram was constructed by integrating the selected features with clinical baseline data, which exhibited the best discriminatory ability (AUC, testing cohort: 0.870).

Conclusions: Our research successfully constructed and assessed a DLR nomogram for predicting treatment outcomes to neoadjuvant chemoimmunotherapy in individuals diagnosed with locally advanced esophageal carcinoma. This has the potential to promote personalized treatment and improve patient prognosis assessment, providing a non-invasive and effective method for clinical decision-making.

Keywords: Computed tomography (CT); deep learning (DL); esophageal cancer; nomogram; radiomics


Submitted Jun 15, 2025. Accepted for publication Oct 21, 2025. Published online Nov 26, 2025.

doi: 10.21037/jtd-2025-1208


Highlight box

Key findings

• We developed and validated a model based on enhanced computed tomography (CT) that uses deep learning (DL) and radiomics combined with clinical features. The proposed deep learning radiomics nomogram (DLRN) integrates DL features, imaging features, and clinical factors, showing good performance in predicting response.

What is known and what is new?

• For locally advanced esophageal squamous cell carcinoma (ESCC), neoadjuvant chemoradiotherapy is standard; programmed death 1 (PD-1)/programmed death ligand 1 (PD-L1) inhibitors plus chemotherapy are effective. Radiomics predicts cancer treatment response, and DL supplements traditional radiomics, but no studies integrate deep learning radiomics (DLR) to predict major pathologic response (MPR) of ESCC neoadjuvant chemoimmunotherapy.

• We constructed a DLRN based on enhanced CT, fusing DL, radiomic and clinical features. The DLRN was validated in multi-center cohorts, and it achieved an area under the curve (AUC) of 0.870 in predicting ESCC neoadjuvant chemoimmunotherapy MPR, with XGBoost as optimal classifier, providing a non-invasive tool for personalized treatment.

What is the implication, and what should change now?

• Implementation of this model offers clinically actionable guidance for tailoring therapies to ESCC patients.


Introduction

Occupying the sixth rank in global cancer mortality, esophageal carcinoma imposes a severe disease burden on worldwide health resources (1). It has two cardinal histological subtypes: esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma, with geographical and racial variability in incidence, mortality, and histopathology (2). In China, ESCC is the most prevalent pathological form, comprising 85.29% of all diagnosed cases of esophageal cancer (3).

Surgery remains the mainstay for early-stage ESCC (4). For locally advanced ESCC, neoadjuvant chemoradiotherapy followed by surgery has become the standard treatment (5). Programmed death 1/programmed death ligand 1 (PD-1/PD-L1) checkpoint inhibition has fundamentally restructured contemporary cancer therapeutics, establishing an immunotherapy-dominant paradigm (6). Definitive clinical trials KEYNOTE 590 (7) and CheckMate 649 (8) have validated significant tumor control and manageable toxicity profiles for PD-1/PD-L1 inhibitor-based therapies, administered with or without chemotherapy, in advanced ESCC populations. Furthermore, recent findings indicate that ESCC patients exhibiting a major pathologic response (MPR) following neoadjuvant chemoimmunotherapy demonstrated markedly improved overall survival, with rates of 91.4% compared to 47.7% in non-responders (9). However, patients exhibiting primary resistance to neoadjuvant immunotherapy confront substantial financial toxicity and treatment-limiting immune-related adverse events (irAEs), mandating the development of predictive frameworks for a priori response assessment to facilitate precision patient selection and prevent unnecessary therapeutic hazards.

Enhanced computed tomography (CT) offers convenience and rapidity, playing a pivotal role in disease diagnosis and efficacy evaluation. Nevertheless, the distinct mechanisms of action associated with immunotherapy often lead to unconventional and heterogeneous patterns of response on imaging. These include delayed response, pseudoprogression, hyperprogression, and other atypical manifestations. As a result, conventional response assessment based on the Response Evaluation Criteria in Solid Tumors (RECIST) may be insufficient or misleading in this context (10-12). Consequently, contrast-enhanced CT alone is a suboptimal means of evaluating immunotherapy efficacy.

Radiomics represents a rapidly evolving methodology that facilitates the high-throughput quantification of extensive feature sets from medical images. These extracted features, undetectable through visual assessment, encompass information such as texture, intensity, heterogeneity, and morphological characteristics, reflecting tumor heterogeneity at the cellular level (13-15). In cancer treatment, radiomics methods have successfully predicted treatment outcomes and early treatment responses and have informed personalized treatment plans (16-18). Radiomics has additionally demonstrated high accuracy in predicting histologic grades across multiple solid tumor types, such as pancreatic ductal adenocarcinoma, non-small cell lung carcinoma, breast cancer, and squamous cell carcinomas of the head and neck (19-22). However, conventional radiomics features are human-engineered and pre-defined, making their extraction a laborious process that may not encapsulate the full complexity of imaging data. Consequently, they might be insufficient for comprehensively characterizing tumor heterogeneity and phenotype. In contrast, deep learning (DL) techniques automatically learn hierarchical representations directly from raw image inputs, producing task-adaptive features tailored for specific analytical objectives. Owing to continuous progress in methodology, features obtained through DL now serve as a valuable supplement to conventional handcrafted radiomics characteristics in medical imaging research (23). With the integration of DL features, radiomics can capture intricate, task-specific patterns, and these features have exhibited notable efficacy in characterizing tumor phenotypes and forecasting clinical outcomes across multiple cancer types, including esophageal, gastric, colorectal, and nasopharyngeal carcinomas (24-27).

To our knowledge, no studies have used deep learning radiomics (DLR) to predict treatment response in patients with esophageal cancer undergoing neoadjuvant chemoimmunotherapy. Therefore, we aimed to develop and validate a deep learning radiomics nomogram (DLRN) for the early prediction of favorable response in a multicenter patient cohort before the initiation of neoadjuvant chemoimmunotherapy. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1208/rc).


Methods

Patient selection

This retrospective study included individuals diagnosed with ESCC who underwent radical surgical resection following neoadjuvant immunotherapy combined with chemotherapy from January 2020 to June 2024 at Taizhou Hospital of Zhejiang Province and Peking University Cancer Hospital Inner Mongolia Hospital. The study was approved by the institutional ethics board of Taizhou Hospital of Zhejiang Province (No. K20251044). The other institution was informed of and agreed to the study. Informed consent was obtained from all the patients. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Eligible participants met the following criteria: (I) histological confirmation of ESCC; (II) clinical stage I–III disease; (III) receipt of combined immunotherapy and chemotherapy as neoadjuvant treatment; and (IV) availability of contrast-enhanced CT imaging performed within one month prior to neoadjuvant treatment. Exclusion criteria consisted of: (I) patient refusal of surgery leading to unavailability of pathological response assessment; (II) administration of additional antitumor agents (e.g., targeted therapy) beyond immunochemotherapy in the neoadjuvant regimen; and (III) absence of essential clinicopathological records. The patient selection process is summarized in Figure 1.

Figure 1 Flowchart for selecting the study patients. CT, computed tomography.

Treatments and response evaluation

All enrolled participants completed standardized pretreatment evaluations comprising lesion biopsy, comprehensive disease assessment, and ancillary diagnostic investigations. Clinical staging adhered to the American Joint Committee on Cancer (AJCC) 8th edition tumor-node-metastasis (TNM) classification framework. Subsequent therapeutic intervention involved neoadjuvant chemoimmunotherapy administration, with regimen selection determined through multidisciplinary tumor board consensus incorporating patient preference. Definitive surgical resection was undertaken following neoadjuvant treatment completion. Treatment response was evaluated according to postoperative pathological findings using the following criteria: pathologic complete response (pCR) referred to the absence of viable tumor cells in both primary tumor and lymph nodes; MPR was characterized by residual tumor cells comprising ≤10% of the specimen; partial pathological response indicated residual tumor cells exceeding 10%; and no response was defined by the presence of abundant residual tumor cells. Patients who achieved either pCR or MPR were categorized as the response group, whereas the remaining cases were assigned to the non-response group.
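The grouping rule above can be sketched in code. This is a minimal illustration rather than the authors' implementation: the function names and the inputs `residual_tumor_pct` and `nodes_positive` are hypothetical, and because the text gives no numeric cut-off separating partial response from no response, both are collapsed into a single non-MPR label.

```python
def pathologic_category(residual_tumor_pct: float, nodes_positive: bool) -> str:
    """Map pathology findings to a response category (illustrative only)."""
    if not 0.0 <= residual_tumor_pct <= 100.0:
        raise ValueError("residual_tumor_pct must be between 0 and 100")
    # pCR: no viable tumor cells in the primary tumor or the lymph nodes.
    if residual_tumor_pct == 0 and not nodes_positive:
        return "pCR"
    # MPR: residual viable tumor cells make up at most 10% of the specimen.
    if residual_tumor_pct <= 10:
        return "MPR"
    # Partial response and no response are not separated here (assumption).
    return "non-MPR"


def is_responder(residual_tumor_pct: float, nodes_positive: bool) -> bool:
    """Binary modelling label: pCR or MPR counts as response."""
    return pathologic_category(residual_tumor_pct, nodes_positive) in {"pCR", "MPR"}
```

The binary output of `is_responder` corresponds to the response and non-response groups used as the prediction target.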

Imaging acquisition and radiomic segmentation

In this study, contrast-enhanced CT covering the cervical to abdominal regions was routinely performed for each participant as part of the pre-treatment assessment prior to neoadjuvant chemoimmunotherapy. All acquired CT images were in Digital Imaging and Communications in Medicine (DICOM) format, and the physical pixel spacing was normalized to 1 mm through resampling. All contrast-enhanced CT scans were manually annotated using the open-source software ITK-SNAP (http://www.itksnap.org/pmwiki/pmwiki.php) to facilitate feature extraction. Two-dimensional regions of interest (2D ROIs) were defined on the axial slice exhibiting the largest tumor diameter, while three-dimensional ROIs (3D ROIs) were meticulously contoured across the entire lesion volume on a slice-by-slice basis. All ROIs were independently delineated by two experienced thoracic surgeons. After the annotation was completed, radiomic features were extracted from each set of ROIs, and the intraclass correlation coefficient (ICC) was calculated for each feature to assess interobserver agreement.
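Interobserver agreement of this kind is usually summarized with one ICC per feature. The text does not state which ICC form was used, so the sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` is illustrative.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for one feature.

    ratings: array of shape (n_patients, k_raters), one feature value per
    patient and rater (here k = 2 annotators).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    # Sums of squares of a two-way ANOVA without replication.
    ss_patients = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()
    ss_raters = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()
    ss_error = ((x - grand_mean) ** 2).sum() - ss_patients - ss_raters
    ms_patients = ss_patients / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_patients - ms_error) / (
        ms_patients + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )
```

Because this form measures absolute agreement, a constant offset between the two annotators lowers the ICC even when their rankings of patients are identical.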

The deep transfer learning (DTL) architecture required standardized rectangular inputs encompassing entire lesion ROIs. Consequently, the maximal cross-sectional tumor slice per patient was selected for model processing. Tumor contours were used to define rectangular ROIs on CT images, followed by extraction of bounded regions. These ROI-delimited slices were exported in PNG format to facilitate downstream analytical procedures.
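The slice selection and rectangular cropping described above could look roughly like the following NumPy sketch. It makes several assumptions that the text does not confirm: the mask volume is indexed as (slice, row, column), segmented area is used as a stand-in for the largest cross-section, and the function names are hypothetical. Export to PNG is omitted.

```python
import numpy as np


def largest_slice_index(mask_volume):
    """Index of the axial slice with the largest segmented area."""
    mask = np.asarray(mask_volume) > 0
    areas = mask.reshape(mask.shape[0], -1).sum(axis=1)
    if areas.max() == 0:
        raise ValueError("mask volume contains no segmented voxels")
    return int(areas.argmax())


def crop_to_bounding_box(image_slice, mask_slice):
    """Crop a 2D image to the tightest rectangle enclosing the mask."""
    image = np.asarray(image_slice)
    mask = np.asarray(mask_slice) > 0
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        raise ValueError("mask slice contains no segmented pixels")
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

In practice the cropped patch would still need resizing and intensity scaling to match the input expected by the network.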

Feature extraction

In this study, a conventional set of 1,835 radiomic features was initially obtained from three-dimensional regions of interest (3D ROIs) utilizing PyRadiomics (v2.1.0). We subsequently developed a deep learning network (DLN) architecture founded on ResNet101, wherein model parameters were pre-trained on the entire collection of ROI images within the dataset. DL features were then derived from this pre-trained network according to the following procedure: all ROI slices were fed into the network; feature representations were obtained by averaging slice-wise predictions; and activations from the penultimate fully connected (FC) layer were employed as the final DL feature vectors. Through this transfer learning framework, 2,048 high-level features were extracted. All analyses were conducted in a Python 3.10 environment, utilizing computational resources powered by an Intel Xeon Silver 4214 CPU with 256 GB of RAM.

Feature fusion

To improve the accuracy of predicting pathological response outcomes, we integrated radiomic features and DL features. The integration scheme involved combining these different features for subsequent analysis.
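The text does not specify how the two feature sets were combined. One common reading is early fusion by column-wise concatenation, sketched below with NumPy. The function names are hypothetical, and only the feature counts (1,835 radiomic and 2,048 DL) come from the text.

```python
import numpy as np

N_RADIOMIC, N_DEEP = 1835, 2048  # feature counts reported in the text


def patient_deep_vector(slice_features):
    """Average per-slice DL feature vectors into one vector per patient."""
    return np.asarray(slice_features, dtype=float).mean(axis=0)


def fuse_features(radiomic, deep):
    """Concatenate radiomic and DL features column-wise (assumed scheme)."""
    radiomic = np.asarray(radiomic, dtype=float)
    deep = np.asarray(deep, dtype=float)
    if radiomic.shape[0] != deep.shape[0]:
        raise ValueError("both matrices need one row per patient")
    return np.hstack([radiomic, deep])
```

Under this assumption the fused matrix has 1,835 + 2,048 = 3,883 columns per patient before any feature selection.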

Feature selection and model construction

We used PyRadiomics in Python to extract comprehensive quantitative features from CT imaging data. The derived radiomic features encompassed first-order statistics representing voxel intensity values from original images, which correlate with tissue density characteristics, along with gradient-based features capturing edge information and morphological contours. Additionally, second-order textural features, such as those derived from the gray-level co-occurrence matrix (GLCM) and gray-level run-length matrix (GLRLM), along with morphological characteristics capturing the geometry of delineated tumor volumes, were computed. All extracted radiomic and DL features underwent standardized preprocessing. First, each feature was transformed via z-score normalization to a mean of 0 and a standard deviation of 1. Subsequently, we employed Spearman’s rank correlation analysis to assess inter-feature relationships, retaining only one feature from any pair demonstrating high correlation (ρ>0.9). Finally, feature dimensionality reduction was accomplished through least absolute shrinkage and selection operator (LASSO) regression, an L1-regularized method that selects the most predictive features while promoting model sparsity, thereby improving both interpretability and generalization capability.
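The first two preprocessing steps can be illustrated with a short NumPy sketch. It is not the authors' code. Two details are assumptions: the correlation filter is applied to the absolute value of ρ, and ties are ranked by order of appearance, which is adequate only for continuous features. The LASSO step is not shown.

```python
import numpy as np


def zscore(features):
    """Scale each column to mean 0 and standard deviation 1."""
    x = np.asarray(features, dtype=float)
    sd = x.std(axis=0)
    sd[sd == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return (x - x.mean(axis=0)) / sd


def rank_columns(features):
    """Ordinal ranks per column (ties broken by position, an assumption)."""
    x = np.asarray(features, dtype=float)
    return np.argsort(np.argsort(x, axis=0), axis=0).astype(float)


def drop_correlated(features, threshold=0.9):
    """Return indices of columns kept after Spearman-based filtering.

    Columns are scanned left to right; a column is dropped when its
    |rho| with any already-kept column exceeds the threshold.
    """
    rho = np.corrcoef(rank_columns(features), rowvar=False)
    kept = []
    for j in range(rho.shape[0]):
        if all(abs(rho[j, i]) <= threshold for i in kept):
            kept.append(j)
    return kept
```

Which member of a correlated pair survives depends on column order here; the text does not say how the retained feature was chosen.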

After feature selection and integration, we developed machine learning classification models using Python Scikit-learn based on the DLR feature set. The performance of different machine learning classification models, including support vector machine, logistic regression, k-nearest neighbors, ExtraTrees, random forest, and XGBoost, was compared. The discriminatory power of the models was evaluated using receiver operating characteristic (ROC) curves and the area under the curve (AUC) values. Quantitative metrics include accuracy, sensitivity, and specificity. The workflow is shown in Figure 2.
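For reference, the AUC used to compare classifiers equals the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The dependency-free sketch below computes it that way. The function name and the toy inputs are illustrative, and the O(n²) pairwise loop is acceptable only for small cohorts.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for response, 0 for non-response.
    scores: predicted probability or score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])  # 0.75
```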

Figure 2 The study design and pipeline. 3D, three dimensional; AUC, area under the curve; CI, confidence interval; DL, deep learning; DLR, deep learning radiomics; LASSO, least absolute shrinkage and selection operator; Rad, radiomics; ROI, region of interest.

Statistical analysis

Statistical analyses were conducted to compare patient characteristics using appropriate tests based on data type. Categorical variables were analyzed with either chi-square or Fisher’s exact tests, whereas continuous variables were compared using Student’s t-tests or Mann-Whitney U tests. Model performance differences were evaluated using DeLong’s test in MedCalc (v20.100). All statistical tests were two-sided, with significance set at P<0.05. Clinical variable assessments were performed using IBM SPSS Statistics (v20.0). Additional analyses including ICCs, Spearman rank correlation, z-score normalization, and LASSO regression were implemented in Python (v3.10) and R (v3.3.1) environments.


Results

Patient characteristics

Table 1 summarizes the baseline demographic and clinical information of the enrolled patients. The study comprised a training cohort of 42 patients and a validation cohort of 18 patients, all diagnosed with ESCC. The MPR rates were 38.10% (16/42) in the training cohort and 38.89% (7/18) in the validation cohort. Within the training cohort, age differed significantly between the MPR and non-MPR groups (P<0.05); this difference was not observed in the validation cohort (P=0.84). All other baseline characteristics, including gender, tumor location, and number of neoadjuvant therapy cycles, showed no significant differences between the MPR and non-MPR groups in either cohort.

Table 1 The baseline clinicopathological characteristics of the included patients with or without MPR

| Characteristics | Training group (n=42): MPR (n=16) | Training group: Non-MPR (n=26) | Training group: P value | Validation group (n=18): MPR (n=7) | Validation group: Non-MPR (n=11) | Validation group: P value |
|---|---|---|---|---|---|---|
| Age, years | 61.62±7.48 | 68.96±7.08 | <0.05 | 65.00±5.26 | 65.64±6.74 | 0.84 |
| Gender | | | 0.17 | | | >0.99 |
|    Female | 0 (0.00) | 5 (19.23) | | 0 (0.00) | 0 (0.00) | |
|    Male | 16 (100.00) | 21 (80.77) | | 7 (100.00) | 11 (100.00) | |
| Smoking history | | | 0.91 | | | 0.55 |
|    No | 10 (62.50) | 18 (69.23) | | 2 (28.57) | 6 (54.55) | |
|    Yes | 6 (37.50) | 8 (30.77) | | 5 (71.43) | 5 (45.45) | |
| Drinking history | | | 0.71 | | | 1.00 |
|    No | 10 (62.50) | 19 (73.08) | | 5 (71.43) | 8 (72.73) | |
|    Yes | 6 (37.50) | 7 (26.92) | | 2 (28.57) | 3 (27.27) | |
| Diabetes history | | | 0.82 | | | 1.00 |
|    No | 13 (81.25) | 19 (73.08) | | 7 (100.00) | 10 (90.91) | |
|    Yes | 3 (18.75) | 7 (26.92) | | 0 (0.00) | 1 (9.09) | |
| Hypertension history | | | 0.22 | | | 1.00 |
|    No | 14 (87.50) | 17 (65.38) | | 5 (71.43) | 8 (72.73) | |
|    Yes | 2 (12.50) | 9 (34.62) | | 2 (28.57) | 3 (27.27) | |
| Tumor location | | | 0.59 | | | 0.31 |
|    Upper | 2 (12.50) | 6 (23.08) | | 0 (0.00) | 1 (9.09) | |
|    Middle | 7 (43.75) | 12 (46.15) | | 2 (28.57) | 6 (54.55) | |
|    Lower | 7 (43.75) | 8 (30.77) | | 5 (71.43) | 4 (36.36) | |
| Tumor stage | | | 0.25 | | | 0.50 |
|    T1 | 3 (18.75) | 1 (3.85) | | 1 (14.29) | 1 (9.09) | |
|    T2 | 12 (75.00) | 24 (92.31) | | 5 (71.43) | 9 (81.82) | |
|    T3 | 1 (6.25) | 1 (3.85) | | 1 (14.29) | 1 (9.09) | |
| Node stage | | | 0.73 | | | 0.35 |
|    N0 | 9 (56.25) | 14 (53.85) | | 1 (14.29) | 4 (36.36) | |
|    N1 | 7 (43.75) | 11 (42.31) | | 5 (71.43) | 4 (36.36) | |
|    N2 | 0 (0.00) | 1 (3.85) | | 1 (14.29) | 3 (27.27) | |
| Clinical stage | | | 0.73 | | | 0.14 |
|    I | 7 (43.75) | 11 (42.31) | | 1 (14.29) | 4 (36.36) | |
|    II | 9 (56.25) | 14 (53.85) | | 4 (57.14) | 2 (18.18) | |
|    III | 0 (0.00) | 1 (3.85) | | 2 (28.57) | 5 (45.45) | |
| Treatment cycles | | | 0.56 | | | 1.00 |
|    2 cycles | 10 (62.50) | 18 (69.23) | | 5 (71.43) | 7 (63.64) | |
|    3 cycles | 3 (18.75) | 6 (23.08) | | 2 (28.57) | 4 (36.36) | |
|    4 cycles | 3 (18.75) | 2 (7.69) | | 0 (0.00) | 0 (0.00) | |

Data are presented as mean ± standard deviation or n (%). MPR, major pathologic response.

Results of the feature extraction and selection

A total of 1,835 radiomic features were initially extracted from three-dimensional regions of interest using conventional methods. Dimensionality reduction was then performed using LASSO regression. Following this screening process, 14 features exhibiting non-zero coefficients were retained. The optimal penalty parameter (λ=0.0450) was identified, and the corresponding feature selection trajectory, along with the variation in coefficient estimates across λ values, is provided in Figure S1.

To reduce the dimensionality of the fused feature set, LASSO regression was applied. The chosen penalty coefficient (λ=0.0012), the procedure for feature selection, and the trajectory of coefficient variations across λ values are presented in Figure 3. Following the final feature refinement, 8 radiomic and 46 DL features were retained. A radiomics-DL importance score was subsequently developed using the selected features and their corresponding regression coefficients, as illustrated in Figure 4.
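A score built from selected features and their regression coefficients is typically a weighted sum. The sketch below shows that form only. The exact formula, the coefficient values, and whether an intercept was included are not given in the text, so the feature names and numbers in the usage line are purely hypothetical.

```python
def dlr_score(feature_values, coefficients, intercept=0.0):
    """Weighted sum of the selected (standardized) features.

    feature_values: mapping of feature name to value for one patient.
    coefficients:   mapping of selected feature name to its LASSO coefficient.
    Features absent from `coefficients` are ignored, mirroring the fact that
    features with zero coefficients drop out of the score.
    """
    missing = set(coefficients) - set(feature_values)
    if missing:
        raise KeyError(f"missing feature values: {sorted(missing)}")
    return intercept + sum(
        weight * feature_values[name] for name, weight in coefficients.items()
    )


# Hypothetical feature names and weights, for illustration only.
example_score = dlr_score(
    {"rad_a": 1.0, "dl_17": -2.0, "unselected": 5.0},
    {"rad_a": 0.5, "dl_17": 0.25},
    intercept=0.1,
)
```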

Figure 3 Fusion feature selection using the LASSO and the histogram of the DLR feature importance score based on the selected features. DLR, deep learning radiomics; LASSO, least absolute shrinkage and selection operator; MSE, mean squared error.
Figure 4 DLR features with non-zero coefficients after feature selection using the LASSO regression. DL, deep learning; DLR, deep learning radiomics; LASSO, least absolute shrinkage and selection operator.

Performance comparison between various feature fusions and different machine learning classifications

We compared models built on radiomic features, DL features, and the combination of DL and radiomic features (Figure 5). In the validation cohort, the prediction model based on DL features [AUC 0.792, 95% confidence interval (CI): 0.561–1.000] outperformed both the clinical features (AUC 0.669, 95% CI: 0.360–0.978) and the radiomic features (AUC 0.766, 95% CI: 0.539–0.993). The combination of DL and radiomic features was superior to the other feature-based prediction models, with an AUC of 0.844 (95% CI: 0.661–1.000) in the validation cohort. Additionally, we used the DeLong test to compare the performance of the prediction models; Figure S2 shows the P values between models in the validation cohort.

Figure 5 The AUC of feature fusion groups in the training and validation cohorts. (A) DL features; (B) Rad features; (C) clinical features; (D) DL features + Rad features. AUC, area under the curve; CI, confidence interval; DL, deep learning; DLR, deep learning radiomics; Rad, radiomics.

Additionally, we created a nomogram incorporating DLR features and clinical features to visualize the pathological outcome assessment of ESCC (Figure 6). The DLRN outperformed all other radiomic models, achieving an AUC of 0.870 (95% CI: 0.706–1.000) in the test cohort, with an accuracy of 0.722, sensitivity of 0.857, and specificity of 0.636. In the test cohort, decision curve analysis (DCA) showed that the DLRN provided a greater net benefit than the traditional radiomics, DLR, and feature fusion models.
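As a consistency check, the reported accuracy, sensitivity, and specificity can be reproduced from whole-patient counts. The sketch assumes the test cohort is the 18-patient cohort with 7 MPR and 11 non-MPR cases described in Table 1. The confusion-matrix counts below are inferred from the reported metrics and are not stated in the text.

```python
# Cohort sizes from Table 1 (assumed to be the cohort these metrics refer to).
n_mpr, n_non_mpr = 7, 11

# Inferred counts: the only whole-number split matching the reported metrics.
true_pos, false_neg = 6, 1   # MPR patients predicted correctly / missed
true_neg, false_pos = 7, 4   # non-MPR patients predicted correctly / flagged

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
accuracy = (true_pos + true_neg) / (n_mpr + n_non_mpr)

print(f"sensitivity={sensitivity:.3f}")  # sensitivity=0.857
print(f"specificity={specificity:.3f}")  # specificity=0.636
print(f"accuracy={accuracy:.3f}")        # accuracy=0.722
```

With a cohort this small, each patient shifts sensitivity by about 14 percentage points and specificity by about 9, which is worth keeping in mind when reading the point estimates.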

Figure 6 Development and evaluation of a DLRN integrating clinical features. (A) DLRN integrating clinical features; (B) the AUC of various prediction models in the test cohort; (C) the DCA of various prediction models in the test cohort. AUC, area under the curve; CI, confidence interval; DL, deep learning; DLR, deep learning radiomics; DLRN, deep learning radiomics nomogram; Rad, radiomics.
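Decision curve analysis plots net benefit against the threshold probability at which a patient would be treated as a predicted responder. The standard net-benefit formula is sketched below. The function names and toy data are illustrative, and the thresholds evaluated in the study are not given in the text.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of acting on predictions at a given threshold probability.

    net benefit = TP/N - FP/N * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every patient as a responder."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data: 2 responders, 3 non-responders.
toy_labels = [1, 1, 0, 0, 0]
toy_probs = [0.9, 0.6, 0.7, 0.2, 0.1]
model_nb = net_benefit(toy_labels, toy_probs, 0.5)      # 2/5 - 1/5 * 1 = 0.2
treat_all_nb = net_benefit_treat_all(toy_labels, 0.5)   # 0.4 - 0.6 * 1 = -0.2
```

A model is considered useful over the range of thresholds where its curve lies above both the treat-all line and the treat-none line (net benefit of zero).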

Lastly, we compared the performance of different machine learning models under the DLR features, including support vector machine, logistic regression, k-nearest neighbors, ExtraTrees, random forest, and XGBoost (Figure 7). Among all classifiers, XGBoost demonstrated the best performance, with an AUC of 0.844 (95% CI: 0.661–1.000) in the test cohort.

Figure 7 Performance of different machine learning models under DLR features. AUC, area under the curve; CI, confidence interval; DLR, deep learning radiomics; KNN, K-nearest neighbors; LR, logistic regression; SVM, support vector machine.

Discussion

In this research, we constructed a non-invasive signature derived from DL and radiomic analysis of contrast-enhanced CT scans and assessed its predictive capacity for pathological response to neoadjuvant chemoimmunotherapy within a multi-institutional cohort. Moreover, when integrated into the DLRN, this imaging biomarker demonstrated superior performance in forecasting treatment outcomes.

For patients with locally advanced esophageal cancer, a multimodal treatment strategy including surgery, chemotherapy, and immunotherapy is the primary approach. Currently, surgery following neoadjuvant chemoradiotherapy has been shown to provide significant prognostic benefit in patients with locally advanced ESCC and has become the standard treatment modality for mid- to late-stage esophageal cancer. With the rapid advancement of immune checkpoint inhibitors, clinical trials combining neoadjuvant immunotherapy with chemotherapy have demonstrated promising efficacy in ESCC. Despite this, treatment outcomes are highly heterogeneous among patients, and some patients may be at risk of irAEs. Therefore, identifying new biomarkers to predict treatment response is both reasonable and urgent, particularly in an era dominated by immunotherapy. Furthermore, current research suggests that MPR is a good indicator of the efficacy of neoadjuvant immunotherapy combined with chemotherapy and is associated with patient prognosis (9). An accurate method for predicting the pathologic response to neoadjuvant immunotherapy combined with chemotherapy is therefore needed.

Radiomics is an emerging computational approach that quantifies medical images by converting visual information into extractable feature sets, demonstrating significant potential across various clinical applications. A key strength of this methodology lies in its capacity to detect subvisual patterns indicative of tissue microenvironment and genomic variation, thereby providing valuable insights that are highly predictive of therapeutic efficacy, particularly among oncology patients receiving innovative treatment regimens. Currently, Li et al. have studied handcrafted radiomic features from CT images before and after treatment to predict the probability of pCR in patients with ESCC following neoadjuvant chemoradiotherapy. The AUC for pCR prediction using the radiomics plus clinical model was 0.840 (28). However, there are currently limited studies on the pathologic response following neoadjuvant immunotherapy combined with chemotherapy. Zhu et al. found that a 2D radiomics model showed a good ability to predict the treatment response to neoadjuvant immunotherapy combined with chemotherapy, with an AUC of 0.818 (29).

Compared to visual assessment, convolutional neural networks (CNNs) can extract deeper and more subtle features from images (30). The ResNet framework, a variant of CNN, has demonstrated considerable efficacy in the analysis of medical images. In a study by Hu et al., a ResNet50 model was employed to forecast pCR after treatment in individuals with ESCC, attaining a promising predictive performance reflected by an AUC value of 0.805 (26). In this study, we chose ResNet101 as the CNN architecture to extract DL features. The results also indicate that the DL features extracted with the ResNet101 architecture performed better in prediction than traditional radiomics features and clinical features, with an AUC of 0.792. Although CNN models can acquire complete tumor information, we found that overfitting easily occurs during model training in DL analysis. Avoiding overfitting in transfer learning often requires a larger sample size, and the sample size in this study was clearly insufficient for CNN model training.

In addition, in our study, DL features outperformed traditional radiomics feature modeling, achieving an AUC of 0.792 in the validation cohort. When DL features were combined with traditional radiomic features, DLR features were constructed and demonstrated even better performance, with an AUC of 0.844 in the validation cohort. These findings indicate that DL-based features are superior to traditional handcrafted features in predicting the pathological response classification after neoadjuvant immunotherapy combined with chemotherapy for ESCC, and suggest that the fusion of the two has even greater potential. This is consistent with the results of neoadjuvant chemoradiotherapy in ESCC reported by Hu et al. (26).

To identify the optimal machine learning approach—a central challenge in radiomic research—we evaluated a panel of six distinct classifiers. Among these, the XGBoost algorithm consistently exhibited robust predictive performance across both internal and external validation cohorts, leading to its selection as the most appropriate method for developing the radiomic signature. The XGBoost framework, which builds upon gradient-boosted decision trees, employs a second-order Taylor expansion for loss function calculation, thereby demonstrating superior efficiency and predictive accuracy (31).
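For readers unfamiliar with the second-order expansion mentioned above, the standard form of the XGBoost objective at boosting round t is reproduced here. The logistic-loss expressions for the gradient and Hessian are given as an example for a binary outcome such as MPR; the text does not state which loss function was configured, so that part is an assumption.

```latex
% Objective at round t, expanded to second order around the previous
% prediction \hat{y}_i^{(t-1)}; f_t is the tree added at round t.
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[
    l\!\left(y_i, \hat{y}_i^{(t-1)}\right)
    + g_i\, f_t(x_i)
    + \tfrac{1}{2}\, h_i\, f_t(x_i)^2
\right] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^2 l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^2}

% Regularization for a tree with T leaves and leaf weights w:
\Omega(f_t) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^2

% Example (assumed): logistic loss with p_i = \sigma\!\left(\hat{y}_i^{(t-1)}\right)
g_i = p_i - y_i, \qquad h_i = p_i\,(1 - p_i)

% Optimal weight of leaf j, where G_j and H_j sum g_i and h_i over its samples:
w_j^{*} = -\frac{G_j}{H_j + \lambda}
```

Using both the gradient and the curvature of the loss lets each leaf weight be computed in closed form, which is the source of the efficiency referred to above.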

There are some limitations in this study. First, while the current sample size aligns with the objectives of a preliminary exploration, future expansion of the cohorts may further refine this predictive framework. Notably, the multicenter study design enhances the generalizability of observations across diverse clinical settings. Second, our conclusions are based on a retrospective design, and the performance of the combined model in prospective studies needs to be explored. Additionally, the clinical variables used in the model are relatively limited (e.g., age, gender, and tumor stage). Such basic variables fail to fully reflect the biological characteristics of tumors and their complex pathological mechanisms. In contrast, integrating tumor marker data [e.g., specific molecular indicators such as carcinoembryonic antigen (CEA) and carbohydrate antigens (CAs)] and tumor immune status data (e.g., PD-L1 expression level, and the type and density of immune cell infiltration) may supplement key biological information regarding the occurrence and development of tumors, thereby enhancing the biological validity of the model. This approach can not only more accurately capture the molecular drivers of tumor progression and the characteristics of the tumor immune microenvironment but also make the model’s prediction results align more closely with the biological nature of tumors, providing a more pathologically meaningful reference for clinical decision-making. Third, the manual 3D segmentation of tumors in this study is very time-consuming, and it may be necessary to develop a semi-automatic or automatic segmentation method for clinical application. Therefore, in future research, our goal is to obtain additional multicenter data, increase the sample size, and conduct radiomic feature extraction in a more efficient manner, enhancing the predictive power and clinical applicability of the predictive model.

Finally, integrating clinical data (such as tumor markers and genetic information) with other types of data, such as radiomic and pathomic data, will be an important direction for future research (32).


Conclusions

In summary, we developed and validated a model based on enhanced CT that uses DL and radiomics combined with clinical features to predict the pathological response of ESCC patients to neoadjuvant immunotherapy combined with chemotherapy. The proposed DLRN integrates DL features, imaging features, and clinical factors, showing good performance in predicting response and providing valuable information for personalized treatment of ESCC patients. However, future prospective studies are needed to confirm the clinical utility of our DLRN model.


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1208/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1208/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1208/prf

Funding: This work was supported by Surgical Standardized Diagnosis and Treatment Research Project (No. WKZX2023WK0116).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-1208/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was approved by the institutional ethics board of Taizhou Hospital of Zhejiang Province (No. K20251044). The other institution was informed of and agreed to the study. Informed consent was obtained from all the patients. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Siegel RL, Miller KD, Fuchs HE, et al. Cancer statistics, 2022. CA Cancer J Clin 2022;72:7-33.
  2. Abbas G, Krasna M. Overview of esophageal cancer. Ann Cardiothorac Surg 2017;6:131-6.
  3. He Y, Liang D, Du L, et al. Clinical characteristics and survival of 5283 esophageal cancer patients: A multicenter study from eighteen hospitals across six regions in China. Cancer Commun (Lond) 2020;40:531-44.
  4. Lagergren J, Smyth E, Cunningham D, et al. Oesophageal cancer. Lancet 2017;390:2383-96.
  5. van Hagen P, Hulshof MC, van Lanschot JJ, et al. Preoperative chemoradiotherapy for esophageal or junctional cancer. N Engl J Med 2012;366:2074-84.
  6. Huang TX, Fu L. The immune landscape of esophageal cancer. Cancer Commun (Lond) 2019;39:79.
  7. Sun JM, Shen L, Shah MA, et al. Pembrolizumab plus chemotherapy versus chemotherapy alone for first-line treatment of advanced oesophageal cancer (KEYNOTE-590): a randomised, placebo-controlled, phase 3 study. Lancet 2021;398:759-71.
  8. Janjigian YY, Shitara K, Moehler M, et al. First-line nivolumab plus chemotherapy versus chemotherapy alone for advanced gastric, gastro-oesophageal junction, and oesophageal adenocarcinoma (CheckMate 649): a randomised, open-label, phase 3 trial. Lancet 2021;398:27-40.
  9. Yang Y, Liu J, Liu Z, et al. Two-year outcomes of clinical N2-3 esophageal squamous cell carcinoma after neoadjuvant chemotherapy and immunotherapy from the phase 2 NICE study. J Thorac Cardiovasc Surg 2024;167:838-847.e1.
  10. Chiou VL, Burotto M. Pseudoprogression and Immune-Related Response in Solid Tumors. J Clin Oncol 2015;33:3541-3.
  11. Dercle L, Sun S, Seban RD, et al. Emerging and Evolving Concepts in Cancer Immunotherapy Imaging. Radiology 2023;306:32-46.
  12. Huang Q, Liu Z, Yu Y, et al. Prediction of response to neoadjuvant chemo-immunotherapy in patients with esophageal squamous cell carcinoma by a rapid breath test. Br J Cancer 2024;130:694-700.
  13. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016;278:563-77.
  14. Kim G, Kim J, Cha H, et al. Metabolic radiogenomics in lung cancer: associations between FDG PET image features and oncogenic signaling pathway alterations. Sci Rep 2020;10:13231.
  15. Su GH, Xiao Y, You C, et al. Radiogenomic-based multiomic analysis reveals imaging intratumor heterogeneity phenotypes and therapeutic targets. Sci Adv 2023;9:eadf0837.
  16. Wu G, Jochems A, Refaee T, et al. Structural and functional radiomics for lung cancer. Eur J Nucl Med Mol Imaging 2021;48:3961-74.
  17. McGale J, Hama J, Yeh R, et al. Artificial Intelligence and Radiomics: Clinical Applications for Patients with Advanced Melanoma Treated with Immunotherapy. Diagnostics (Basel) 2023;13:3065.
  18. Horvat N, Papanikolaou N, Koh DM. Radiomics Beyond the Hype: A Critical Evaluation Toward Oncologic Clinical Use. Radiol Artif Intell 2024;6:e230437.
  19. Cen C, Wang C, Wang S, et al. Clinical-radiomics nomogram using contrast-enhanced CT to predict histological grade and survival in pancreatic ductal adenocarcinoma. Front Oncol 2023;13:1218128.
  20. Han Y, Ma Y, Wu Z, et al. Histologic subtype classification of non-small cell lung cancer using PET/CT images. Eur J Nucl Med Mol Imaging 2021;48:350-60.
  21. Petrillo A, Fusco R, Di Bernardo E, et al. Prediction of Breast Cancer Histological Outcome by Radiomics and Artificial Intelligence Analysis in Contrast-Enhanced Mammography. Cancers (Basel) 2022;14:2132.
  22. Zheng YM, Che JY, Yuan MG, et al. A CT-Based Deep Learning Radiomics Nomogram to Predict Histological Grades of Head and Neck Squamous Cell Carcinoma. Acad Radiol 2023;30:1591-9.
  23. Truhn D, Schrading S, Haarburger C, et al. Radiomic versus Convolutional Neural Networks Analysis for Classification of Contrast-enhancing Lesions at Multiparametric Breast MRI. Radiology 2019;290:290-7.
  24. Liu X, Zhang D, Liu Z, et al. Deep learning radiomics-based prediction of distant metastasis in patients with locally advanced rectal cancer after neoadjuvant chemoradiotherapy: A multicentre study. EBioMedicine 2021;69:103442.
  25. Peng H, Dong D, Fang MJ, et al. Prognostic Value of Deep Learning PET/CT-Based Radiomics: Potential Role for Future Individual Induction Chemotherapy in Advanced Nasopharyngeal Carcinoma. Clin Cancer Res 2019;25:4271-9.
  26. Hu Y, Xie C, Yang H, et al. Computed tomography-based deep-learning prediction of neoadjuvant chemoradiotherapy treatment response in esophageal squamous cell carcinoma. Radiother Oncol 2021;154:6-13.
  27. Dong D, Fang MJ, Tang L, et al. Deep learning radiomic nomogram can predict the number of lymph node metastasis in locally advanced gastric cancer: an international multicenter study. Ann Oncol 2020;31:912-20.
  28. Li Y, Liu J, Li HX, et al. Radiomics Signature Facilitates Organ-Saving Strategy in Patients With Esophageal Squamous Cell Cancer Receiving Neoadjuvant Chemoradiotherapy. Front Oncol 2020;10:615167.
  29. Zhu Y, Yao W, Xu BC, et al. Predicting response to immunotherapy plus chemotherapy in patients with esophageal squamous cell carcinoma using non-invasive Radiomic biomarkers. BMC Cancer 2021;21:1167.
  30. Chen L, Bentley P, Mori K, et al. DRINet for Medical Image Segmentation. IEEE Trans Med Imaging 2018;37:2453-62.
  31. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016;785-94. doi: 10.1145/2939672.2939785.
  32. Kang W, Qiu X, Luo Y, et al. Application of radiomics-based multiomics combinations in the tumor microenvironment and cancer prognosis. J Transl Med 2023;21:598.
Cite this article as: Ye M, Mao J, Jin J, Liu H, Guo H, Xu Y, Yang P, Ma L. A computed tomography-based deep learning radiomics for predicting the response to neoadjuvant chemotherapy combined with immunotherapy in patients with locally advanced esophageal cancer: a multicenter cohort study. J Thorac Dis 2025;17(11):10417-10429. doi: 10.21037/jtd-2025-1208
