Prediction of EGFR mutation status and mutation sites in lung cancer based on radiomics and deep learning
Highlight box
Key findings
• A non-invasive predictive model integrating radiomics, deep learning, and clinical data was developed to identify EGFR mutation status and subtypes in NSCLC, achieving superior accuracy over radiomics-only approaches.
What is known and what is new?
• EGFR genotyping traditionally relies on invasive tissue sampling. This study introduces a combined imaging-based model that improves prediction of both mutation status and specific mutation loci.
What is the implication, and what should change now?
• The model offers a reliable tool for molecular stratification, supporting personalized treatment decisions and potentially reducing the need for repeated invasive biopsies in NSCLC management.
Introduction
Epidermal growth factor receptor (EGFR) mutation is one of the most prevalent genetic mutations in lung adenocarcinoma, with a higher incidence in women, non-smokers, and individuals of East Asian descent (1,2). It is recognized as a critical predictor of response to EGFR-tyrosine kinase inhibitors (TKIs) and is advocated for inclusion in the ninth edition of the lung cancer tumor-node-metastasis (TNM) staging system (3). Prior studies have established that patients with EGFR mutations exhibit significantly prolonged survival compared to those without mutations following EGFR-TKI therapy (4-6).
EGFR mutation sites are predominantly located within exons 18, 19, 20, and/or 21, with the most commonly observed mutation subtypes being exon 19 deletions and exon 21 L858R mutations (7,8). Hence, precise identification of EGFR mutation subtypes is becoming increasingly essential for guiding personalized treatment strategies. Notably, exon 19 deletions are associated with significantly prolonged progression-free survival (PFS) under EGFR-TKI therapy compared to exon 21 L858R mutations (9,10), underscoring the clinical necessity of distinguishing mutation subtypes. Simultaneously, certain scholars have proposed a hypothesis regarding the development of lung adenocarcinoma, suggesting that EGFR mutations play a pivotal role in driving the transformation of precancerous lesions into invasive adenocarcinoma (11). As a result, the precise identification of EGFR gene mutations and their subtypes becomes critically important in the diagnosis, treatment, and management of non-small cell lung cancer (NSCLC) (12).
Currently, the detection of EGFR mutation status primarily relies on invasive tumor specimen biopsies or genetic testing based on circulating tumor DNA (ctDNA) (13). However, the former is difficult to apply widely in clinical practice because of tissue sampling difficulties, high tumor heterogeneity, and the invasive nature of the procedure (14,15), particularly in patients with ground-glass nodules (GGNs) or simultaneous multiple primary lung adenocarcinomas. The latter method has a non-negligible false-negative rate; for instance, low levels of ctDNA in early-stage lung cancer can yield false-negative results (15,16). Therefore, there is an urgent need for a non-invasive and efficient method to predict the EGFR mutation status and subtypes of lung cancer.
Previous studies have demonstrated the utility of machine learning and deep learning in identifying and predicting the prognosis of EGFR mutation subtypes in lung cancer (17-19). Despite advances in EGFR genotyping, non-invasive prediction of specific mutation sites (e.g., exon 19 vs. 21) remains underexplored, particularly through multimodal integration of radiomics and deep learning.
In this study, we developed a radiomics model and a deep learning model using only plain computed tomography (CT) images and clinical data to predict EGFR mutation status and subtype. Furthermore, we integrated both types of features to create a combined model, which is anticipated to enhance predictive performance and assist patients in selecting appropriate treatment strategies. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-567/rc).
Methods
Patients
We retrospectively collected CT images from 1,043 patients with NSCLC treated at the Sun Yat-sen University Cancer Center between August 2014 and June 2018. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Sun Yat-sen University Cancer Center (No. SL-B2025-189-01). Given the retrospective nature of this study, the requirement for informed consent was waived.
Inclusion criteria: (I) availability of valid clinical information; (II) diagnosed with NSCLC; (III) underwent postoperative biopsy and tested for EGFR genotype and mutation sites; (IV) no evidence of other malignant tumors; (V) no radiotherapy or chemotherapy prior to CT image acquisition.
Exclusion criteria: (I) incorrect or incomplete clinical information; (II) presence of multiple pulmonary tumors; (III) absence of EGFR gene mutation information; (IV) radiotherapy, chemotherapy, or immunotherapy prior to CT scan; (V) poor CT image quality unsuitable for texture analysis or with unclear edges.
Based on the inclusion and exclusion criteria, a total of 557 lung cancer patients were included in this study, of whom 197 were EGFR wild-type and 360 were EGFR mutant. Among the mutant patients, 168 were found with 19-site mutations, 162 with 21-site mutations, and 30 with mutations at other sites. The inclusion process for lung cancer patients is illustrated in Figure 1. The detailed parameters of CT are provided in the supplementary file (Appendix 1).
Definition of imaging findings
All imaging findings were evaluated by one radiologist with more than ten years of experience in thoracic imaging. The imaging findings were defined as follows: lobulation was defined as two or more outward convexities resulting in a lobulated contour; spiculation was defined as fine linear strands radiating from the tumor margin into the surrounding lung parenchyma; pleural indentation was defined as linear retraction of the adjacent pleura toward the tumor margin; air bronchogram was defined as air-filled bronchial structures visible within the lesion on CT images. All imaging features were assessed on lung window settings [window width, 1,500 Hounsfield units (HU); window level, −600 HU].
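As a small illustration of the lung window settings quoted above, the following sketch converts a window width and level into the HU range that is actually displayed. The function names are hypothetical and the linear rescaling to [0, 1] is an assumption about how a viewer might map the window to gray levels; it is not taken from the study.

```python
def window_bounds(width_hu, level_hu):
    """Return the (lower, upper) HU limits of a display window."""
    half = width_hu / 2.0
    return level_hu - half, level_hu + half


def apply_window(hu_values, width_hu, level_hu):
    """Clip HU values to the window and rescale them linearly to [0, 1]."""
    lower, upper = window_bounds(width_hu, level_hu)
    return [(min(max(v, lower), upper) - lower) / (upper - lower)
            for v in hu_values]


# Lung window from the text: width 1,500 HU, level -600 HU.
lung_lower, lung_upper = window_bounds(1500, -600)
print(lung_lower, lung_upper)
```

Under these settings, voxels below the lower bound render as black and voxels above the upper bound render as white.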
Pathological diagnosis and EGFR testing
The tissue pathological sections of the primary lung cancer and corresponding metastatic lesions were classified by pathologists based on the World Health Organization standards. The mutation status of EGFR exons 18–21 was evaluated using either the Amplification-Refractory Mutation System-Polymerase Chain Reaction or next-generation sequencing technology (20).
Volumes of interest (VOI) segmentation
To address the issue of extensive CT slices in lung cancer datasets and the time-consuming nature of manual segmentation, a semi-automatic segmentation method was employed. Initially, the pre-trained nnU-Net model was used to automatically segment the CT images (21). To ensure segmentation accuracy, all automated segmentations were reviewed and manually refined by a board-certified thoracic radiologist with >10 years of experience. Before feature extraction, the CT images were preprocessed with voxel normalization, which rescaled the images to a grayscale intensity range of 0 to 2,048. Two months later, the radiologist revised the pre-segmented CT images of 50 randomly selected patients to calculate the intraclass correlation coefficient (ICC); an ICC >0.75 was considered to indicate adequate consistency.
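The reproducibility check described above can be sketched as follows. The text does not state which ICC form was used, so this example assumes the two-way, single-measurement, consistency form, often written ICC(3,1); the function name and the toy values are hypothetical.

```python
def icc_consistency(ratings):
    """ICC(3,1): two-way model, single measurement, consistency.

    ratings: one row per patient, one column per segmentation session,
    each entry being the value of a single radiomics feature.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Toy feature values from two segmentation sessions (made-up numbers).
stable_feature = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]    # constant offset only
unstable_feature = [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]]  # ordering changes

for name, values in [("stable", stable_feature), ("unstable", unstable_feature)]:
    icc = icc_consistency(values)
    print(name, round(icc, 3), "retained" if icc > 0.75 else "dropped")
```

A feature would be retained only if its ICC exceeds the 0.75 threshold given in the text.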
Radiomics analysis and radiomics model development
A total of 1,309 radiomics features were extracted using Pyradiomics. Figure 2 illustrates the comprehensive workflow of the radiomics methodology employed in this study. A four-step method was employed to screen the extracted radiomics features in order to prevent model overfitting: In the first step, the ICC value was calculated to assess inter-observer consistency, and features with an ICC greater than 0.75 were retained. In the second step, Spearman correlation analysis was conducted on the retained features from the previous step, and Spearman correlation coefficient (SCC) values for paired features were calculated. Feature pairs with SCC values higher than 0.9 were identified as highly correlated pairs. For each highly correlated pair, the average absolute correlation of each feature was evaluated, and the feature with the largest average absolute correlation was removed. In the third step, the minimum redundancy maximum relevance (mRMR) method was applied to select the features with the highest relevance to categorical variables (22,23). Finally, in the fourth step, the least absolute shrinkage and selection operator (LASSO) algorithm was utilized with 10-fold cross-validation to select the final feature subset (Rad).
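The second screening step (Spearman redundancy filtering) can be sketched in plain Python as below. This is an illustrative reading of the description, not the study's R code: the function names, the greedy pair-by-pair order, and the tie-breaking rule (drop the later feature when mean correlations are equal) are assumptions, and constant features are assumed to have been removed beforehand.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_redundant(features, threshold=0.9):
    """Remove one feature from each pair with |SCC| above the threshold.

    Within a pair, the feature with the larger mean absolute correlation
    to the other remaining features is removed.
    """
    names = list(features)
    corr = {(a, b): abs(spearman(features[a], features[b]))
            for a in names for b in names if a != b}
    kept = list(names)
    while True:
        pairs = [(a, b) for i, a in enumerate(kept) for b in kept[i + 1:]
                 if corr[(a, b)] > threshold]
        if not pairs:
            return kept
        a, b = pairs[0]

        def mean_abs(f):
            return sum(corr[(f, o)] for o in kept if o != f) / (len(kept) - 1)

        kept.remove(a if mean_abs(a) > mean_abs(b) else b)


# Toy feature columns (hypothetical names and values).
toy = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 11], "c": [3, 1, 4, 1, 5]}
print(drop_redundant(toy))
```

In the toy data, features `a` and `b` are perfectly rank-correlated, so only one of them survives alongside `c`.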
The features ultimately selected (Rad) were used to construct support vector machine (SVM) and logistic regression (LR) classification models. The SVM classification prediction score (Rad-SVM) and LR classification prediction score (Rad-LR) were then computed. Subsequently, in combination with independent clinical predictors (CLN), a radiomics model was developed using LR to predict the EGFR mutation status and specific mutation sites in lung cancer.
Deep learning model development
ResNet-18 (RN) and DenseNet-121 (DN) convolutional neural networks pre-trained on the ImageNet dataset were employed as substitutes for traditional image feature extractors to address the challenge posed by small sample sizes. High-throughput deep features were directly extracted from medical images through convolutional calculations, thereby bypassing the need for time-consuming manual segmentation. The deep learning features were then subjected to the following screening process: first, features with low variance (variance <0.001) were removed; second, features with low reproducibility (ICC <0.75) were excluded; and finally, redundant features exhibiting high correlations were eliminated through Spearman correlation analysis. Afterward, the important deep learning features were selected using the mRMR and LASSO methods to form the final deep feature subset. The selected deep learning features were used to train support vector machine (SVM) and logistic regression (LR) classifiers, respectively, and the classification prediction scores were calculated. The best classification prediction score was then selected and combined with CLN to construct a deep learning tumor assessment model based on LR. Figure 3 illustrates the overall process of deep learning tumor assessment in this study.
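The first deep-feature screening step (removing near-constant features) might look like the sketch below. It assumes population variance and a dictionary of feature columns; both are illustrative choices, and the feature names and values are made up. The network-based feature extraction itself is not shown.

```python
from statistics import pvariance


def drop_low_variance(features, threshold=0.001):
    """Keep only features whose variance is at least the threshold.

    features: mapping from feature name to its values across patients.
    """
    return {name: values for name, values in features.items()
            if pvariance(values) >= threshold}


# Hypothetical deep features for three patients.
deep_features = {
    "rn_feat_0": [0.5000, 0.5000, 0.5001],  # almost constant, removed
    "rn_feat_1": [0.1, 0.9, 0.5],           # informative spread, kept
}
print(list(drop_low_variance(deep_features)))
```

The remaining steps (ICC, Spearman filtering, mRMR, and LASSO) would then be applied to the surviving columns.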
Combined model development
For the prediction of EGFR mutation status and mutation sites, the most effective Rad, RN, and DN models were selected for inclusion. First, the Rad prediction score was combined separately with the prediction scores from the RN and DN models. Subsequently, all radiomics and deep learning prediction scores were integrated, giving three joint models built with an LR classifier: Rad + RN, Rad + DN, and Rad + RN + DN. Furthermore, to enhance the model’s predictive performance, CLN were introduced, resulting in a comprehensive model combining radiomics, deep learning, and clinical features (Rad + RN + DN + CLN). Figure 4 illustrates the model combination strategy employed in this study.
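The score-level fusion described above can be illustrated with a minimal logistic regression fitted by gradient descent on a few made-up rows. This is only a sketch of the idea of stacking prediction scores with a clinical variable: the study fitted its LR models in R, and the toy scores, labels, learning rate, and function names here are all assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent.

    Returns weights where w[0] is the intercept.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, x):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Columns: Rad score, RN score, DN score, gender (1 = female); toy values.
X = [
    [0.8, 0.7, 0.9, 1], [0.7, 0.9, 0.6, 1], [0.6, 0.8, 0.7, 0],
    [0.3, 0.2, 0.4, 0], [0.2, 0.4, 0.1, 1], [0.1, 0.3, 0.2, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = EGFR mutant, 0 = wild-type (hypothetical labels)

weights = fit_logistic(X, y)
combined_scores = [predict_proba(weights, x) for x in X]
print([round(s, 2) for s in combined_scores])
```

The fitted probability plays the role of the combined model's output score, which can then be evaluated like any single-model score.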
Model evaluation
The predictive capability of each model was assessed using the area under the receiver operating characteristic curve (AUC), while the model’s performance was quantified in terms of sensitivity, specificity, and accuracy. The DeLong test was employed to statistically compare the performance of the radiomics model, deep learning model, and combined model. Additionally, a nomogram was generated to visualize the predictive performance of the best radiomics model, the best deep learning model, and the combined model. To further evaluate the accuracy and clinical applicability of these models, calibration curves were drawn to assess their calibration performance, while decision curve analysis was conducted to evaluate the clinical net benefit provided by each model.
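For reference, the discrimination metrics named above can be computed as in the following sketch. The AUC is obtained from its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). The 0.5 cut-off and the toy data are assumptions, since the text does not report the operating threshold.

```python
def classification_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a fixed score threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores (made-up values).
labels = [1, 1, 0, 0]
model_scores = [0.9, 0.4, 0.6, 0.2]
print(auc(labels, model_scores), classification_metrics(labels, model_scores))
```

Unlike sensitivity, specificity, and accuracy, the AUC does not depend on the chosen threshold.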
Statistical analysis
For clinical characteristics, the P value for categorical variables was computed using the Chi-squared test or Fisher’s exact test. For continuous variables, the t-test was applied if the data followed a normal distribution; otherwise, the Mann-Whitney U test was utilized for non-normally distributed data.
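As a worked example of the categorical comparison, the sketch below applies Pearson's chi-squared test to the gender counts reported in Table 1 for EGFR mutation status (male: 125 wild-type, 166 mutant; female: 72 wild-type, 194 mutant). No continuity correction is applied, which is an assumption because the text does not say whether one was used. For one degree of freedom, the P value follows from the complementary error function.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-squared statistic and P value for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For df = 1, P(chi-squared > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: male, female. Columns: EGFR wild-type, EGFR mutant (Table 1 counts).
gender_table = [[125, 166], [72, 194]]
chi2, p_value = chi_square_2x2(gender_table)
print(round(chi2, 2), p_value < 0.001)
```

The resulting P value falls below 0.001, in line with the gender entry for mutation status in Table 1.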
Univariate LR analysis was conducted on the clinical characteristics, followed by multivariate LR analysis on variables with P<0.05. Characteristics with P<0.05 in the multivariate analysis were identified as CLN.
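For a single binary predictor, the odds ratio from univariate LR equals the cross-product ratio of the 2x2 table, and its Wald 95% confidence interval can be computed directly. The sketch below uses the gender counts from Table 1 and assumes that gender was coded with female as the reference category; under that assumption the result matches the univariate gender entry for mutation status in Table 2, namely 0.493 (0.345–0.704). The function name is hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a, b: outcome-positive and outcome-negative counts in the index group.
    c, d: outcome-positive and outcome-negative counts in the reference group.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))


# Male: 166 mutant, 125 wild-type. Female (reference): 194 mutant, 72 wild-type.
or_male, ci_low, ci_high = odds_ratio_wald_ci(166, 125, 194, 72)
print(round(or_male, 3), round(ci_low, 3), round(ci_high, 3))
```

An odds ratio below 1 here means that male patients had lower odds of carrying an EGFR mutation than female patients.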
In this study, IBM SPSS Statistics software (version 22) was employed for univariate and multivariate LR analysis, while R (version 4.2.0) was used to construct the radiomics model. The ResNet-18 and DenseNet-121 models pre-trained on ImageNet were loaded via the PyTorch 1.13 framework, and the deep learning feature extraction process was executed using Python 3.7.12.
Results
Clinical characteristics
The following clinical characteristics were included: age (whether over 60 years old), gender, lobulated, spiculation, pleural indentation, air bronchogram, tumor long diameter, and nodule type (ground glass or solid). The baseline clinical characteristics of the cohort for predicting EGFR mutation status and sites are presented in Table 1.
Table 1
| Clinical characteristics | EGFR wild-type (n=197) | EGFR mutant (n=360) | P (mutation status) | 19-site mutation (n=168) | 21-site mutation (n=162) | P (mutation sites) |
|---|---|---|---|---|---|---|
| Age (years) | | | 0.35 | | | 0.004 |
| <60 | 103 (52.3) | 203 (56.4) | | 78 (66.1) | 55 (48.7) | |
| ≥60 | 94 (47.7) | 157 (43.6) | | 40 (33.9) | 58 (51.3) | |
| Gender | | | <0.001 | | | 0.001 |
| Male | 125 (63.5) | 166 (46.1) | | 65 (55.1) | 49 (43.4) | |
| Female | 72 (36.5) | 194 (53.9) | | 53 (44.9) | 64 (56.6) | |
| Lobulated | | | 0.65 | | | 0.49 |
| No | 5 (2.54) | 7 (1.94) | | 0 (0.0) | 1 (0.6) | |
| Yes | 192 (97.5) | 353 (98.1) | | 168 (100.0) | 161 (99.4) | |
| Spiculation | | | <0.001 | | | 0.02 |
| No | 28 (14.2) | 100 (27.8) | | 18 (10.7) | 32 (19.8) | |
| Yes | 169 (85.8) | 260 (72.2) | | 150 (89.3) | 130 (80.2) | |
| Pleural indentation | | | 0.09 | | | 0.01 |
| No | 22 (11.2) | 25 (6.94) | | 34 (20.2) | 12 (7.4) | |
| Yes | 175 (88.8) | 335 (93.1) | | 134 (79.8) | 150 (92.6) | |
| Air bronchogram | | | 0.27 | | | 0.001 |
| No | 96 (48.7) | 158 (43.9) | | 47 (28.0) | 73 (45.1) | |
| Yes | 101 (51.3) | 202 (56.1) | | 121 (72.0) | 89 (54.9) | |
| Tumor long diameter (cm), mean (IQR) | 2.3 (1.1) | 2.2 (1.2) | 0.29 | 2 (0.9) | 2.3 (1.1) | 0.02 |
| Nodule type | | | <0.001 | | | 0.30 |
| Ground glass | 21 (10.7) | 91 (25.3) | | 9 (7.6) | 18 (15.9) | |
| Solid | 176 (89.3) | 269 (74.7) | | 109 (92.4) | 95 (84.1) | |
Data are presented as n (%) unless otherwise specified. EGFR, epidermal growth factor receptor.
The results of the univariate and multivariate regression analysis for the clinical characteristics of the cohort for predicting EGFR mutation status and sites are presented in Table 2. In the univariate regression analysis, gender, spiculation, air bronchogram, tumor diameter, and nodule type were significantly associated with EGFR mutation status (P<0.05). The multivariate regression analysis further revealed that gender and nodule type were independent predictors of EGFR mutation status (P<0.05).
Table 2
| Clinical characteristics | Mutation status, univariate OR | Mutation status, univariate 95% CI | Mutation status, univariate P | Mutation status, multivariate OR | Mutation status, multivariate 95% CI | Mutation status, multivariate P | Mutation sites, univariate OR | Mutation sites, univariate 95% CI | Mutation sites, univariate P | Mutation sites, multivariate OR | Mutation sites, multivariate 95% CI | Mutation sites, multivariate P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 0.847 | 0.598–1.201 | 0.35 | – | – | – | 1.884 | 1.216–2.919 | 0.004 | 1.969 | 1.228–3.157 | 0.005 |
| Gender | 0.493 | 0.345–0.704 | <0.001 | 0.534 | 0.371–0.769 | <0.001 | 0.491 | 0.316–0.761 | 0.002 | 0.523 | 0.328–0.837 | 0.007 |
| Lobulated | 1.313 | 0.411–4.194 | 0.65 | – | – | – | <0.001 | / | >0.99 | – | – | – |
| Spiculation | 0.431 | 0.272–0.684 | <0.001 | 0.791 | 0.396–1.580 | 0.51 | 0.488 | 0.261–0.909 | 0.02 | 0.687 | 0.345–1.368 | 0.29 |
| Pleural indentation | 1.685 | 0.923–3.074 | 0.09 | – | – | – | 3.172 | 1.578–6.375 | <0.001 | 2.477 | 1.198–5.121 | 0.01 |
| Air bronchogram | 1.215 | 0.858–1.722 | <0.001 | 1.239 | 0.860–1.783 | 0.25 | 0.474 | 0.300–0.748 | 0.001 | 0.388 | 0.228–0.661 | <0.001 |
| Tumor long diameter | 0.898 | 0.749–1.078 | <0.001 | 0.892 | 0.738–1.077 | 0.23 | 1.585 | 1.163–2.159 | 0.004 | 1.725 | 1.199–2.483 | 0.003 |
| Nodule type | 0.353 | 0.212–0.588 | <0.001 | 0.455 | 0.215–0.964 | 0.04 | 0.735 | 0.404–1.336 | 0.31 | – | – | – |
The 95% confidence interval of the lobulation sign is infinitely close to 0, expressed as /. CI, confidence interval; EGFR, epidermal growth factor receptor; LR, logistic regression; OR, odds ratio.
For EGFR mutation sites, the univariate regression analysis showed that age group, gender, spiculation, pleural indentation, air bronchogram, and tumor diameter were significantly associated with EGFR mutation subtype (P<0.05). The multivariate regression analysis identified age, gender, pleural indentation, air bronchogram, and tumor diameter as independent predictors of EGFR mutation subtype (P<0.05).
Performance of radiomics models
The extracted radiomics features underwent screening, resulting in the retention of 12 radiomics features for predicting EGFR mutation status and 13 radiomics features for predicting EGFR mutation sites. Subsequently, these retained radiomics features, in conjunction with CLN, were utilized to construct SVM and LR classifiers for EGFR genotyping.
The performance metrics of the radiomics models are summarized in Table 3. For the prediction of EGFR mutation status, the receiver operating characteristic (ROC) curves are shown in Figure 5. The Rad-LR model exhibited superior classification performance compared to the Rad-SVM model. Furthermore, the radiomics model that integrates the LR classification prediction score with CLN demonstrated the highest performance. Notably, the optimal radiomics model (Rad + CLN-LR) displayed a statistically significant enhancement in performance when compared to the Rad-LR model, as indicated by the DeLong test (P=0.007).
Table 3
| Models | Mutation status AUC | Mutation status ACC (%) | Mutation status SEN (%) | Mutation status SPE (%) | Mutation sites AUC | Mutation sites ACC (%) | Mutation sites SEN (%) | Mutation sites SPE (%) |
|---|---|---|---|---|---|---|---|---|
| Rad-SVM | 0.664 | 66.7 | 70.9 | 60 | 0.648 | 62.6 | 54.4 | 73.8 |
| Rad-LR | 0.685 | 67.9 | 76.7 | 53.8 | 0.644 | 68.7 | 71.9 | 64.3 |
| Rad + CLN-LR | 0.77 | 71.9 | 78.5 | 60 | 0.836 | 76.8 | 79.6 | 73.3 |
ACC, accuracy; AUC, area under the curve; CLN, clinical predictors; EGFR, epidermal growth factor receptor; LR, logistic regression; SEN, sensitivity; SPE, specificity; SVM, support vector machine.
For the prediction of EGFR mutation sites, the performance indicators of these models are summarized in Figure 6. The Rad-LR model achieved higher accuracy than the Rad-SVM model (68.7% vs. 62.6%), with a comparable AUC (0.644 vs. 0.648). The Rad + CLN-LR model demonstrated the best performance (AUC =0.836), showing significant improvement compared to both the Rad-SVM model (AUC =0.648) and the Rad-LR model (AUC =0.644) (DeLong test: P=0.008 vs. Rad-SVM; P=0.008 vs. Rad-LR).
The comprehensive formula for the optimal radiomics signature, along with the nomogram, calibration curves, and decision curves for lung cancer EGFR genotyping based on the optimal radiomics model, is included in the supplementary file (Appendix 1).
Performance of deep learning models
The selected deep learning features, along with CLN, were utilized to develop SVM and LR classifiers, and the deep learning model performance indicators are detailed in Table 4.
Table 4
| Models | Mutation status AUC | Mutation status ACC (%) | Mutation status SEN (%) | Mutation status SPE (%) | Mutation sites AUC | Mutation sites ACC (%) | Mutation sites SEN (%) | Mutation sites SPE (%) |
|---|---|---|---|---|---|---|---|---|
| RN-SVM | 0.713 | 69.1 | 68.5 | 70 | 0.672 | 64.7 | 79.1 | 53.6 |
| RN-LR | 0.697 | 68.5 | 62 | 80 | 0.678 | 69.7 | 60.5 | 76.8 |
| DN-SVM | 0.617 | 64.9 | 69.4 | 56.7 | 0.763 | 71.7 | 72 | 71.4 |
| DN-LR | 0.626 | 64.9 | 72.2 | 51.7 | 0.752 | 67.7 | 63.6 | 70.9 |
| RN + DN-LR | 0.87 | 80.2 | 78.9 | 82.8 | 0.841 | 76.8 | 73.1 | 80.9 |
| RN + DN + CLN-LR | 0.908 | 82.6 | 81.8 | 84.2 | 0.872 | 76.8 | 80.4 | 72.9 |
ACC, accuracy; AUC, area under the curve; CLN, clinical predictors; DN, DenseNet-121; EGFR, epidermal growth factor receptor; LR, logistic regression; RN, ResNet-18; SEN, sensitivity; SPE, specificity; SVM, support vector machine.
The ROC curves for each model are presented in Figure 7.
For EGFR mutation status, the SVM model constructed using ResNet-18 deep learning features outperformed the corresponding LR model, whereas the LR model constructed using DenseNet-121 deep learning features outperformed the corresponding SVM model. A combined deep learning model was therefore constructed from the RN-SVM score and the DN-LR score; its AUC, accuracy, sensitivity, and specificity were 0.87, 80.2%, 78.9%, and 82.8%, respectively. After introducing CLN, the AUC, accuracy, sensitivity, and specificity of the deep learning model improved to 0.908, 82.6%, 81.8%, and 84.2%, respectively. The best deep learning model (AUC =0.908) demonstrated significantly enhanced performance compared to both the RN-SVM model (DeLong test: P<0.001) and the deep learning tumor assessment model based on DenseNet-121 deep features (DeLong test: P<0.001). Furthermore, the best deep learning model significantly outperformed the best radiomics model (DeLong test: P=0.002).
For the classification of EGFR mutation sites, the ROC curves of each model are shown in Figure 8.
The LR model constructed using ResNet-18 deep learning features outperformed the SVM model. Conversely, the SVM model constructed with DenseNet-121 deep learning features performed better than the LR model, achieving an AUC of 0.763 and an accuracy of 71.7%. The AUC, accuracy, sensitivity, and specificity of the deep learning model, constructed using LR combined with the RN-LR score and DN-SVM score, were 0.841, 76.8%, 73.1%, and 80.9%, respectively. After further introducing CLN, the model achieved its highest AUC of 0.872. Compared with the RN-LR and RN-SVM models, the performance improvement of this best deep learning model reached statistical significance (DeLong test: P=0.004 and P=0.003, respectively).
The detailed formula for the optimal deep learning features, as well as the nomogram, calibration curves, and decision curves for lung cancer EGFR genotyping based on the optimal deep learning model, is provided in the supplementary file (Appendix 1).
Performance of combined models
The Rad-LR, RN-SVM, and DN-LR models were employed to predict EGFR mutation status, while the Rad-LR, RN-LR, and DN-SVM models were utilized to predict EGFR mutation sites. The corresponding ROC curves are presented in Figures 9 and 10, and the performance indicators are detailed in Table 5.
Table 5
| Models | Mutation status AUC | Mutation status ACC (%) | Mutation status SEN (%) | Mutation status SPE (%) | Mutation sites AUC | Mutation sites ACC (%) | Mutation sites SEN (%) | Mutation sites SPE (%) |
|---|---|---|---|---|---|---|---|---|
| Rad + RN | 0.869 | 80.8 | 82.6 | 77.6 | 0.808 | 76.8 | 68.9 | 83.3 |
| Rad + DN | 0.76 | 68.9 | 71 | 65 | 0.846 | 76.8 | 82.7 | 70.2 |
| Rad + RN + DN | 0.889 | 81.4 | 82.6 | 79.3 | 0.855 | 71.7 | 78.8 | 63.8 |
| Rad + RN + DN + CLN | 0.911 | 85.6 | 87.6 | 81.5 | 0.936 | 89.9 | 90.9 | 89.1 |
ACC, accuracy; AUC, area under the curve; CLN, clinical predictors; DN, DenseNet-121; EGFR, epidermal growth factor receptor; RN, ResNet-18; SEN, sensitivity; SPE, specificity.
For EGFR mutation status, the model combining the Rad-LR and RN-SVM scores (Rad + RN) outperformed the model combining the Rad-LR and DN-LR scores (Rad + DN). Performance improved further with the combined model incorporating the Rad-LR, RN-SVM, and DN-LR scores. When CLN were additionally included, the model achieved optimal performance, with an AUC of 0.911, accuracy of 85.6%, sensitivity of 87.6%, and specificity of 81.5%. This best combined model (AUC =0.911) demonstrated a significant improvement compared to the Rad + DN model (DeLong test: P=0.001). Moreover, the improvement was statistically significant when compared with the best radiomics model (AUC =0.77, DeLong test: P=0.002).
For EGFR mutation sites, the model combining the Rad-LR and DN-SVM scores (Rad + DN) outperformed the model combining the Rad-LR and RN-LR scores (Rad + RN). Performance improved further with the combined model integrating the Rad-LR, RN-LR, and DN-SVM scores, and the addition of CLN yielded the best model. This model achieved an AUC of 0.936, accuracy of 89.9%, sensitivity of 90.9%, and specificity of 89.1%. Compared to the Rad + DN and Rad + RN models, the performance of this best combined model was significantly superior (DeLong test: P=0.048 vs. Rad + DN; P=0.01 vs. Rad + RN). Additionally, the model performance showed significant improvement relative to the best radiomics model (AUC =0.836, DeLong test: P=0.04).
The Rad + RN + DN + CLN model was identified as the optimal combined modeling strategy. Figures 11 and 12 present the calibration and decision curves for this best combined model. For predicting EGFR mutation status, the P value of the unreliability test was 0.71, indicating no significant miscalibration, and the standardized net benefit of clinical intervention guided by this model was superior to that of both no intervention and full intervention within the risk threshold range of 0.1 to 0.9. For predicting EGFR mutation sites, the P value was 0.79, and the model’s standardized net benefit of clinical intervention exceeded that of no intervention and full intervention within the risk threshold of 0.9. These results demonstrate that the best combined model exhibits strong predictive performance for both EGFR mutation status and mutation sites and holds significant promise for enhancing clinical interventions.
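The net benefit comparison that underlies a decision curve can be sketched as follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the risk threshold. Dividing by the prevalence gives the standardized net benefit. The toy labels and scores are made up and the function names are hypothetical; this is not the study's R implementation.

```python
def net_benefit(y_true, scores, threshold):
    """Net benefit of intervening on patients with score >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening on every patient (full intervention)."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


def standardized(net_benefit_value, y_true):
    """Standardized net benefit: net benefit divided by prevalence."""
    return net_benefit_value / (sum(y_true) / len(y_true))


# Toy data (made-up values); no intervention always has net benefit 0.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.2]
nb_model = net_benefit(toy_labels, toy_scores, threshold=0.3)
nb_all = net_benefit_treat_all(toy_labels, threshold=0.3)
print(round(nb_model, 4), round(nb_all, 4), round(standardized(nb_model, toy_labels), 4))
```

A model is considered useful at a given threshold when its net benefit exceeds both the full-intervention value and zero, which is the comparison reported above.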
The detailed formula of the combined model, along with the nomogram, is provided in the supplementary file (Appendix 1).
Discussion
In this study, CT-based radiomics models, deep learning models, and radiomics-deep learning combined models were utilized to predict EGFR mutation status and mutation sites in lung cancer. All three models exhibited strong predictive performance and calibration, demonstrated favorable clinical utility, and hold considerable potential as important tools for the personalized treatment of patients.
Wu et al. integrated deep learning and radiomics models to predict EGFR mutation status in NSCLC; the AUCs for the training, validation, and external test sets were 0.886, 0.812, and 0.790, respectively (24). However, that study was limited to stage I lung cancer and did not extend its investigation to EGFR mutation sites.
Liu et al. reported an AUC of 0.65–0.76 in predicting EGFR mutation status and an AUC of 0.65–0.70 in predicting EGFR mutation sites (25). Chen et al. reported an AUC of 0.76 in predicting EGFR mutation status and an AUC of 0.55 in predicting EGFR mutation sites (26). Both studies solely utilized radiomics without incorporating deep learning techniques to differentiate EGFR mutation sites. In contrast, the fusion model employed in this study demonstrated substantial improvements in predicting EGFR mutation status and subtypes.
Our findings indicate that EGFR mutations are more prevalent among female patients compared with those harboring wild-type EGFR, which is consistent with prior studies (27,28). Furthermore, patients with exon 21 mutations tended to be older than those with exon 19 mutations, in line with previous reports (29). Notably, our study revealed a female predominance among patients with exon 21 mutations, a sex-related difference that was not observed in earlier studies (30). Soo et al. reported that the incidence of EGFR mutations is notably higher in early-stage NSCLC patients compared to those with advanced-stage disease (31). As such, stratified prediction studies focused on EGFR gene status in patients with GGNs could offer valuable insights into their diagnosis and management. This research may also pave the way for the development of a dedicated prediction model tailored specifically for NSCLC patients with GGNs in the future.
In the radiomics models developed for predicting EGFR status and EGFR mutation sites, the majority of the radiomics features that were ultimately retained were wavelet-based features. This finding suggests that CT images processed with wavelet transformation more effectively capture the internal characteristics of EGFR gene expression. This is consistent with the findings of previous studies (26).
In this study, CT images were utilized to develop a deep learning model for EGFR genotyping, where feature maps were generated by a convolutional neural network. After reducing the dimensionality of these features, they were input into SVM and LR classifiers to construct a deep learning model. This model showed significant improvement compared to the best radiomics model. This enhanced performance can be attributed to the feature extractor based on the ResNet-18 deep convolutional network, which captures fine-grained details from shallow CT images in a smaller field of view, while the DenseNet-121-based extractor captures deeper structural information from a larger field of view. Clinical features further complement the model by incorporating medical knowledge.
In predicting EGFR mutation status, the optimal combined model was constructed by integrating Rad-LR, RN-SVM, DN-LR, gender, and nodule type. For the prediction of EGFR mutation sites, the most effective combined model incorporated Rad-LR, RN-LR, DN-SVM, age group, gender, pleural indentation, air bronchogram, and tumor diameter. These combined models demonstrated a significant improvement in performance compared to the best radiomics model. However, when compared to the best deep learning model, the improvement did not reach statistical significance. This lack of statistical significance may be due to the complementary nature of shallow and deep neural networks in deep learning models, which provide both high-level semantic information and low-level detail from the images, enabling the deep learning model to maintain robust independent tumor assessment capabilities. Moreover, partial collinearity between deep learning features and radiomics features may have contributed to the performance limitations observed in the combined models.
Furthermore, consistent with the findings of prior studies, the integration of clinical features with radiomics and deep learning markedly enhances the classification performance of the model compared to models that rely solely on radiomics and deep learning features (32-34). This enhancement demonstrates the model’s potential utility in the task of EGFR genotyping.
Additionally, the calibration test for the best-performing combined model yielded P>0.5. Decision curve analysis shows that the standardized net benefit of using this model for clinical intervention is significantly better than either full intervention or no intervention across most risk threshold ranges. This further demonstrates the superior capability of the model in distinguishing between EGFR wild-type and mutant genotypes, as well as between the 19-site and 21-site mutations, underscoring its substantial potential in predicting EGFR mutation status in clinical applications.
There are several limitations in this study, which can be addressed in future work: (I) as a single-center retrospective study, the generalizability of the model could be improved by incorporating data from multiple centers; (II) the model was not applied to an independent validation cohort, and future research could prospectively collect data to further validate the findings; (III) while our fusion strategy (linear combination of radiomics and deep learning scores) improved performance, advanced techniques such as attention-based feature fusion or transformer architectures may further enhance model interpretability and generalizability; (IV) the study population consisted exclusively of an Asian cohort, which is known to have a higher prevalence of EGFR mutations compared with other populations. This ethnic homogeneity may limit the applicability of the model to more diverse patient populations, and future studies including multiethnic cohorts are warranted.
In summary, the integrated model combining radiomics, deep learning, and clinical data demonstrates superior performance in lung cancer EGFR genotyping, highlighting its substantial potential for clinical tumor assessment. This study not only developed targeted radiomics methods but also introduced an innovative deep learning tumor assessment model, incorporating a deep learning feature extractor, feature screening, and a collaborative machine learning classifier to address the challenges of small-sample medical data. By integrating radiomics, deep learning, and clinical knowledge, the model achieved non-invasive and precise identification of EGFR wild-type and mutant genotypes, as well as 19-site and 21-site mutations, based solely on CT images. This advancement offers a significant technical approach for personalized decision-making in the management of advanced lung cancer.
Conclusions
Our study elucidated the potential of radiomics and deep learning methodologies in the non-invasive prediction of EGFR mutation status and specific mutation sites in lung cancer. These models are anticipated to serve as effective tools for guiding treatment decisions and facilitating personalized patient care. Furthermore, the integrated model that combines radiomics and deep learning demonstrates superiority over the standalone radiomics model, achieving enhanced efficiency in tumor assessment.
Acknowledgments
None.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-567/rc
Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-567/dss
Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-567/prf
Funding: None.
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-567/coif). X.G. is currently an employee of Jinan Guoke Medical Engineering and Technology Development Co., Ltd., a for-profit company. The author declares that this affiliation does not constitute conflicts of interest with respect to the content of this work. The other authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Sun Yat-sen University Cancer Center (No. SL-B2025-189-01). Given the retrospective nature of this study, the requirement for informed consent was waived.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
1. Zhou F, Guo H, Xia Y, et al. The changing treatment landscape of EGFR-mutant non-small-cell lung cancer. Nat Rev Clin Oncol 2025;22:95-116.
2. Shi Y, Au JS, Thongprasert S, et al. A prospective, molecular epidemiology study of EGFR mutations in Asian patients with advanced non-small-cell lung cancer of adenocarcinoma histology (PIONEER). J Thorac Oncol 2014;9:154-62.
3. Kris MG, Natale RB, Herbst RS, et al. Efficacy of gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase, in symptomatic patients with non-small cell lung cancer: a randomized trial. JAMA 2003;290:2149-58.
4. Mok TS, Cheng Y, Zhou X, et al. Improvement in Overall Survival in a Randomized Study That Compared Dacomitinib With Gefitinib in Patients With Advanced Non-Small-Cell Lung Cancer and EGFR-Activating Mutations. J Clin Oncol 2018;36:2244-50.
5. Soria JC, Ohe Y, Vansteenkiste J, et al. Osimertinib in Untreated EGFR-Mutated Advanced Non-Small-Cell Lung Cancer. N Engl J Med 2018;378:113-25.
6. Wu YL, Cheng Y, Zhou X, et al. Dacomitinib versus gefitinib as first-line treatment for patients with EGFR-mutation-positive non-small-cell lung cancer (ARCHER 1050): a randomised, open-label, phase 3 trial. Lancet Oncol 2017;18:1454-66.
7. Locatelli-Sanchez M, Couraud S, Arpin D, et al. Routine EGFR molecular analysis in non-small-cell lung cancer patients is feasible: exons 18-21 sequencing results of 753 patients and subsequent clinical outcomes. Lung 2013;191:491-9.
8. Harrison PT, Vyse S, Huang PH. Rare epidermal growth factor receptor (EGFR) mutations in non-small cell lung cancer. Semin Cancer Biol 2020;61:167-79.
9. Batra U, Biswas B, Prabhash K, et al. Differential clinicopathological features, treatments and outcomes in patients with Exon 19 deletion and Exon 21 L858R EGFR mutation-positive adenocarcinoma non-small-cell lung cancer. BMJ Open Respir Res 2023;10:e001492.
10. Zheng Z, Jin X, Lin B, et al. Efficacy of Second-line Tyrosine Kinase Inhibitors in the Treatment of Metastatic Advanced Non-small-cell Lung Cancer Harboring Exon 19 and 21 EGFR Mutations. J Cancer 2017;8:597-605.
11. Hu X, Fujimoto J, Ying L, et al. Multi-region exome sequencing reveals genomic evolution from preneoplasia to lung adenocarcinoma. Nat Commun 2019;10:2978.
12. Yoon HY, Ryu JS, Sim YS, et al. Clinical significance of EGFR mutation types in lung adenocarcinoma: A multi-centre Korean study. PLoS One 2020;15:e0228925.
13. Malapelle U, Bellevicine C, De Luca C, et al. EGFR mutations detected on cytology samples by a centralized laboratory reliably predict response to gefitinib in non-small cell lung carcinoma patients. Cancer Cytopathol 2013;121:552-60.
14. Thompson JC, Yee SS, Troxel AB, et al. Detection of therapeutically targetable driver and resistance mutations in lung cancer patients by next-generation sequencing of cell-free circulating tumor DNA. Clin Cancer Res 2016;22:5772-82.
15. Li Y, Lv X, Wang B, et al. Differentiating EGFR from ALK mutation status using radiomics signature based on MR sequences of brain metastasis. Eur J Radiol 2022;155:110499.
16. Wan JCM, Massie C, Garcia-Corbacho J, et al. Liquid biopsies come of age: towards implementation of circulating tumour DNA. Nat Rev Cancer 2017;17:223-38.
17. Mu W, Jiang L, Zhang J, et al. Non-invasive decision support for NSCLC treatment using PET/CT radiomics. Nat Commun 2020;11:5228.
18. Le NQK, Kha QH, Nguyen VH, et al. Machine Learning-Based Radiomics Signatures for EGFR and KRAS Mutations Prediction in Non-Small-Cell Lung Cancer. Int J Mol Sci 2021;22:9254.
19. Wang S, Yu H, Gan Y, et al. Mining whole-lung information by artificial intelligence for predicting EGFR genotype and targeted therapy response in lung cancer: a multicohort study. Lancet Digit Health 2022;4:e309-19.
20. Mosele F, Remon J, Mateo J, et al. Recommendations for the use of next-generation sequencing (NGS) for patients with metastatic cancers: a report from the ESMO Precision Medicine Working Group. Ann Oncol 2020;31:1491-505.
21. Isensee F, Jaeger PF, Kohl SAA, et al. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 2021;18:203-11.
22. Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 2005;27:1226-38.
23. Sulaiman MA, Labadin J. editors. Feature selection based on mutual information. 2015 9th International Conference on IT in Asia (CITA); August 2015.
24. Wu J, Meng H, Zhou L, et al. Habitat radiomics and deep learning fusion nomogram to predict EGFR mutation status in stage I non-small cell lung cancer: a multicenter study. Sci Rep 2024;14:15877.
25. Liu G, Xu Z, Ge Y, et al. 3D radiomics predicts EGFR mutation, exon-19 deletion and exon-21 L858R mutation in lung adenocarcinoma. Transl Lung Cancer Res 2020;9:1212-24.
26. Chen Q, Li Y, Cheng Q, et al. EGFR Mutation Status and Subtypes Predicted by CT-Based 3D Radiomic Features in Lung Adenocarcinoma. Onco Targets Ther 2022;15:597-608.
27. Yang L, Xu P, Li M, et al. PET/CT Radiomic Features: A Potential Biomarker for EGFR Mutation Status and Survival Outcome Prediction in NSCLC Patients Treated With TKIs. Front Oncol 2022;12:894323.
28. Jiang M, Zhang Y, Xu J, et al. Assessing EGFR gene mutation status in non-small cell lung cancer with imaging features from PET/CT. Nucl Med Commun 2019;40:842-9.
29. Sheng M, Wang F, Zhao Y, et al. Comparison of clinical outcomes of patients with non-small-cell lung cancer harbouring epidermal growth factor receptor exon 19 or exon 21 mutations after tyrosine kinase inhibitors treatment: a meta-analysis. Eur J Clin Pharmacol 2016;72:1-11.
30. Won YW, Han JY, Lee GK, et al. Comparison of clinical outcome of patients with non-small-cell lung cancer harbouring epidermal growth factor receptor exon 19 or exon 21 mutations. J Clin Pathol 2011;64:947-52.
31. Soo RA, Reungwetwattana T, Perroud HA, et al. Prevalence of EGFR Mutations in Patients With Resected Stages I to III NSCLC: Results From the EARLY-EGFR Study. J Thorac Oncol 2024;19:1449-59.
32. Liu Y, Kim J, Qu F, et al. CT Features Associated with Epidermal Growth Factor Receptor Mutation Status in Patients with Lung Adenocarcinoma. Radiology 2016;280:271-80.
33. Zhang G, Shang L, Cao Y, et al. Prediction of epidermal growth factor receptor (EGFR) mutation status in lung adenocarcinoma patients on computed tomography (CT) images using 3-dimensional (3D) convolutional neural network. Quant Imaging Med Surg 2024;14:6048-59.
34. Huang L, Xu L, Wang X, et al. Prediction of EGFR Mutations in Lung Adenocarcinoma via CT Images: A Comparative Study of Intratumoral and Peritumoral Radiomics, Deep Learning, and Fusion Models. Acad Radiol 2025;32:4880-92.


