A predictive nomogram for EGFR mutation status in lung adenocarcinoma manifesting as ground-glass nodules
Original Article

Xiaoxia Ping, Qian Meng, Nan Jiang, Su Hu

Department of Radiology, The First Affiliated Hospital of Soochow University, Suzhou, China

Contributions: (I) Conception and design: S Hu; (II) Administrative support: S Hu; (III) Provision of study materials or patients: X Ping; (IV) Collection and assembly of data: X Ping, Q Meng; (V) Data analysis and interpretation: X Ping, N Jiang; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

Correspondence to: Su Hu, PhD. Department of Radiology, The First Affiliated Hospital of Soochow University, No. 188, Shizi Street, Suzhou 215006, China. Email: husu_soochow@163.com.

Background: The mutation status of epidermal growth factor receptor (EGFR) in lung adenocarcinoma is significantly associated with postoperative progression-free survival. Computed tomography (CT)-based radiomics analysis may have potential value in predicting EGFR mutation status. This study aims to explore the predictive capacity of radiomics analysis for EGFR mutation status in lung adenocarcinomas presenting as ground-glass nodules (GGNs).

Methods: We included 199 GGNs confirmed by histopathology from 2016 to 2020. Clinical factors and radiographic characteristics were recorded and evaluated. All GGNs were manually delineated, radiomics features were extracted, and the least absolute shrinkage and selection operator was used for feature selection. Radiographic, radiomics, and combined nomogram models were then constructed and compared with each other. Decision curve analysis (DCA) was used to assess the clinical usefulness of the models, while receiver operating characteristic curves and calibration curves were used to evaluate their predictive performance.

Results: Univariate analysis revealed five variables that were significantly different between the EGFR mutant and wild-type groups. Fifteen radiomics features were significantly associated with EGFR mutations. Among the three models, both the radiomics model [area under the curve (AUC) =0.818] and the nomogram (AUC =0.820) had good discriminatory ability in predicting EGFR mutation status and performed consistently in the validation cohort (AUC =0.805 and 0.833, respectively), with higher predictive performance than the radiographic model. The DCA showed that the nomogram and the radiomics model provided a better overall net benefit than the radiographic model in predicting EGFR mutation status.

Conclusions: For preoperatively predicting the status of EGFR mutation in lung adenocarcinomas manifesting as GGNs, the CT-based radiomics analysis will be valuable.

Keywords: Ground-glass nodule (GGN); radiomics; lung adenocarcinoma; epidermal growth factor receptor (EGFR)


Submitted Jul 22, 2024. Accepted for publication Sep 27, 2024. Published online Nov 18, 2024.

doi: 10.21037/jtd-24-1166


Highlight box

Key findings

• For preoperatively predicting the status of epidermal growth factor receptor (EGFR) mutation in lung adenocarcinomas manifesting as ground-glass nodule (GGN), the computed tomography (CT)-based radiomics analysis will be valuable.

What is known and what is new?

• Several studies have used radiomics models to predict the EGFR mutation status of lung adenocarcinoma; however, these studies have mostly focused on solid or solid-predominant nodules, and less research has addressed lung adenocarcinoma manifesting as GGNs.

• The study aims to investigate the predictive capacity of radiomic analysis for EGFR mutation status in lung adenocarcinomas presenting as GGNs.

What is the implication, and what should change now?

• A radiomics model and a nomogram have been developed to predict the EGFR mutation status in lung adenocarcinomas that manifest as GGNs, both of which perform better than the radiographic model. Radiomics is a reusable, non-invasive method of data processing that may be applied to predicting the status of EGFR mutation in lung adenocarcinoma.


Introduction

According to the World Health Organization, lung cancer is the leading cause of cancer-related deaths globally (1). Non-small cell lung cancer (NSCLC) accounts for approximately 85% of all lung cancer cases, and lung adenocarcinoma is its most prevalent histologic subtype (1,2). In recent years, survival has improved notably for some patients because of the identification of lung cancer driver mutations, particularly those in the epidermal growth factor receptor (EGFR) gene (3-5). Patients with EGFR-mutant lung adenocarcinoma respond well to tyrosine kinase inhibitors (TKIs), which also prolong progression-free survival. In contrast, patients with EGFR wild-type tumors do not respond to TKI treatment at any point during the illness (6-8). Thus, before beginning targeted therapy, it is essential to ascertain whether a patient has an EGFR mutation.

For EGFR mutation testing, tissue biopsy-based genetic testing is widely regarded as the gold standard (9). However, it has several disadvantages (10-12): (I) genetic testing may fail because the tumor or the biopsy sample is too small; (II) biopsy and surgical resection are invasive procedures that carry a risk of tumor metastasis; (III) because of tumor heterogeneity, tissue taken from one area may not be representative of the whole tumor. Liquid biopsies have recently been used to check for genetic changes in lung cancer patients (13-15). This technique detects DNA released into the circulation by necrotic or apoptotic tumor cells, which has a high sensitivity for EGFR mutation detection. However, it also has disadvantages (16-18): (I) specificity is high but sensitivity is lower in early-stage disease, with a detection rate of only about 50% in stage I lung adenocarcinoma; (II) the molecular variants detected may not be tumor-related; (III) most early lesions in individuals with multifocal lung cancer present as multiple ground-glass opacities, and liquid biopsy cannot confirm which lesion is the source of a mutation. Therefore, there is a need for a simple, non-invasive way to predict whether lung adenocarcinoma appearing as ground-glass nodules (GGNs) has an EGFR mutation.

Computed tomography (CT) is a common examination technique that has been widely used for lung cancer screening. Prior studies have examined the connection between EGFR mutation in lung cancer and CT radiographic characteristics (19-23). However, subjectivity in the assessment of CT radiographic features and inconsistency in measurement criteria may reduce confidence in the prediction, and visual CT assessment cannot reveal the microenvironment within the tumor. In contrast, radiomics extracts high-throughput quantitative features from medical images, including high-dimensional features that are not observable by the human visual system, and applies them to clinical decision-making (24,25). Several studies have used radiomics models to predict the EGFR mutation status of lung adenocarcinoma; however, these studies have mostly focused on solid or solid-predominant nodules, and less research has addressed lung adenocarcinoma manifesting as GGNs (26-28).

This study aims to investigate the predictive capacity of radiomics analysis for EGFR mutation status in lung adenocarcinomas presenting as GGNs. To this end, we developed a combined model that integrates the radiomics signature derived from preoperative CT imaging with clinical factors and radiographic characteristics, and visualized it as a nomogram. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-1166/rc).


Methods

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Committee of The First Affiliated Hospital of Soochow University (2023-No.203) and individual consent for this retrospective analysis was waived.

Patient selection

A retrospective database of patients with GGNs treated at The First Affiliated Hospital of Soochow University between 2016 and 2020 was established. All patients had undergone a CT scan before surgical resection. The exclusion criteria were as follows: (I) lack of EGFR test results; (II) poor image quality that impaired assessment; (III) an invasive procedure performed on the target lesion before the CT scan; and (IV) an interval of more than 1 month between the CT scan and the operation.

Sex, age, and history of lung cancer for each patient were taken from their medical file. Figure 1 provides a summary of the study’s process.

Figure 1 The workflow of this study. LASSO, least absolute shrinkage and selection operator; HRCT, high-resolution computed tomography; EGFR, epidermal growth factor receptor.

CT imaging acquisition and features evaluation

The following CT scanners were used for patient scans: Brilliance iCT, IQON spectral CT (Philips Healthcare, Netherlands); Somatom Sensation 64, Somatom Definition (Siemens Healthineers, Germany); GE Revolution, Discovery CT 750 HD (GE Healthcare, USA). The scanning parameters were as follows: tube voltage, 100 or 120 kV; X-ray tube current, automatic; slice thickness, 1 or 1.25 mm; data collection diameter, 400 or 500 mm; collimator width, 0.6 or 0.625 mm; and matrix (rows × columns), 512×512.

CT image interpretation was performed on the lung window (window level −600 HU; window width 1,500 HU). Two radiologists, with 9 and 15 years of experience in chest diagnosis, respectively, independently evaluated the target lesions; disagreements were settled by discussion. The following CT radiographic characteristics were assessed: (I) tumor density, described as pure GGN or mixed GGN; (II) shape, described as round or oval, or irregular; (III) margin, described as well-defined or poorly defined; (IV) presence or absence of spiculation (spiked or star-like edges of the tumor), vacuole sign (small air-filled spaces or cavities within the tumor), air bronchogram (air-filled bronchi or small airways within a region of tumor), vascular convergence (blood vessels drawn towards or converging on the tumor), pleural retraction, and pulmonary emphysema; (V) multiplicity, described as solitary nodule or multiple nodules; (VI) location of the tumor (lobe).

EGFR mutation detection

All mutation analyses were performed on post-surgical specimens. EGFR mutation analysis covered four exons of the tyrosine kinase domain: G719X in exon 18; 19-del in exon 19; T790M, S768I, and 20-ins in exon 20; and L858R and L861Q in exon 21. A tumor was classified as EGFR-mutant if a mutation was found in any of these exons; otherwise, it was classified as EGFR wild-type.

Tumor segmentation and radiomics feature extraction

For feature extraction, each image was exported in DICOM format. One radiologist used an open-source program (ITK-SNAP, version 3.6.0, http://www.itksnap.org/) to draw the volume of interest (VOI) of each of the 199 GGNs (Figure 2) and obtained the volume (mm³) and mean intensity (HU) of the VOI. Tumor mass was then calculated according to the following formula: tumor mass = tumor volume × (1,000 + tumor density)/1,000, where tumor density is the mean intensity of the VOI in HU (29).
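The tumor mass formula above can be written as a one-line function. This is a minimal sketch: the function name and the example values are illustrative assumptions and are not taken from the study data.

```python
def tumor_mass(volume_mm3: float, mean_hu: float) -> float:
    """Tumor mass = tumor volume x (1,000 + mean CT attenuation) / 1,000."""
    return volume_mm3 * (1000.0 + mean_hu) / 1000.0


# Hypothetical nodule (not patient data): 500 mm^3 with a mean intensity of -600 HU.
example_mass = tumor_mass(500.0, -600.0)
print(example_mass)  # 200.0
```

Because the mean intensity of a GGN is negative, the mass is always smaller than the volume; a nodule with the attenuation of air (−1,000 HU) would contribute zero mass.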

Figure 2 Outlining of the volume of interest. (A,B) An axial CT image of a 63-year-old female patient, showing a right middle lobe lung adenocarcinoma with EGFR mutant; (C,D) an axial CT image of a 46-year-old female patient, showing a right upper lobe lung adenocarcinoma with EGFR wild-type. CT, computed tomography; EGFR, epidermal growth factor receptor.

Radiomics features were extracted from the VOI using FeAture Explorer (FAE, version 0.4.2, http://github.com/salan668/FAE) (30). A total of 1,316 radiomics features were extracted from each nodule, including shape-based features (n=14) and texture features (n=1,302).

Intra- and inter-observer repeatability was assessed in 30 randomly chosen patients, whose nodules were segmented by two radiologists. Features with an intraclass correlation coefficient (ICC) less than 0.75 were considered unreliable and removed.
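The repeatability filter can be sketched as follows. The text does not state which form of the ICC was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the function names and the toy feature values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` has one row per nodule; each row holds the value of a single
    feature obtained from each segmentation (one column per rater).
    The choice of ICC(2,1) is an assumption, not stated in the article.
    """
    n = len(ratings)      # number of nodules
    k = len(ratings[0])   # number of raters / segmentations
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-nodule mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def reliable_features(feature_ratings, threshold=0.75):
    """Keep the names of features whose ICC is at least `threshold`."""
    return [name for name, r in feature_ratings.items()
            if icc_2_1(r) >= threshold]


# Toy data (hypothetical): feature_a agrees perfectly, feature_b does not.
toy = {
    "feature_a": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    "feature_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]],
}
print(reliable_features(toy))
```

In practice the same function would be applied once to the intra-observer pairs and once to the inter-observer pairs, retaining only features that pass both.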

Statistical analysis and prediction model construction

R software (version 4.2.2, http://www.Rproject.org) and GraphPad Prism (version 9.0, https://www.graphpad.com/) were used to conduct the statistical analysis.

A total of 199 nodules were randomly allocated to the training (n=139) and validation (n=60) cohorts at a ratio of 7:3. This commonly used approach ensures that the model is trained on a sufficient amount of data while reserving enough samples for validation (31,32). This balance helps to prevent overfitting and maintain computational efficiency. Numerical variables were analyzed using the t-test or the non-parametric Mann-Whitney U test, as appropriate, while categorical variables were assessed with the Chi-squared test. A P value of less than 0.10 was deemed statistically significant. Following univariate analysis, statistically significant variables were used to construct a radiographic model through logistic regression. The model’s performance was evaluated by calculating the area under the receiver operating characteristic (ROC) curve (AUC).
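As an illustration of the 7:3 allocation and of what the AUC measures, the sketch below splits 199 items at random and computes the AUC as the probability that a mutant nodule receives a higher score than a wild-type nodule. The study itself used R; the function names, the seed, and the example scores here are assumptions.

```python
import random


def split_7_3(items, seed=0):
    """Shuffle the items and split them into training (70%) and validation (30%)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.7 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def auc(scores_pos, scores_neg):
    """AUC as P(score of a positive > score of a negative); ties count one half."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


train_ids, valid_ids = split_7_3(range(199))
print(len(train_ids), len(valid_ids))  # 139 60

# Hypothetical model outputs for two mutant and two wild-type nodules.
print(auc([0.9, 0.3], [0.5, 0.1]))  # 0.75
```

With 199 nodules, rounding 70% gives the 139/60 split reported above.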

Before feature selection, radiomic features were normalized using the Z-score. The most valuable features were then selected using the least absolute shrinkage and selection operator (LASSO) method. Based on these selected features, the radiomics model was subsequently developed.
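A minimal sketch of the two ideas in this step is given below. Z-score normalization is shown with the mean and standard deviation estimated on the training data and reused for the validation data (an assumption; the text does not say how the validation cohort was scaled). The soft-thresholding operator is the shrinkage step that lets LASSO set coefficients exactly to zero in the simplest linear, orthonormal-design case; it is not the full model fitted in the study.

```python
from statistics import mean, pstdev


def fit_zscore(train_values):
    """Estimate the mean and (population) standard deviation on training data."""
    return mean(train_values), pstdev(train_values)


def apply_zscore(values, mu, sigma):
    """Standardize values with previously fitted parameters."""
    if sigma == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - mu) / sigma for v in values]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical feature values, not study data.
mu, sigma = fit_zscore([1.0, 2.0, 3.0])
z_train = apply_zscore([1.0, 2.0, 3.0], mu, sigma)
z_valid = apply_zscore([2.5], mu, sigma)

# A larger penalty removes weak coefficients and shrinks the strong ones.
print([soft_threshold(c, 0.2) for c in (0.5, 0.1, -0.5)])
```

Standardizing first matters because the LASSO penalty acts on coefficient size, which would otherwise depend on each feature's measurement scale.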

We constructed a combined model by employing multivariate logistic regression to integrate clinical factors, radiographic characteristics, and radiomics signature. Subsequently, we visualized this model using a nomogram. The DeLong test was employed to assess variations between the ROC curves of each model. Furthermore, the clinical utility of these models was evaluated using decision curve analysis (DCA), which quantified the net benefit at varying threshold probabilities and provided an assessment of the models’ practical applicability in clinical settings.
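The net benefit that DCA plots at each threshold probability follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The sketch below implements that definition; the function names and example predictions are hypothetical and the code is not the R routine used in the study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: classify every nodule as EGFR-mutant."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted probabilities and true labels (1 = mutant).
probs = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
for pt in (0.25, 0.5):
    print(pt, net_benefit(probs, labels, pt), net_benefit_treat_all(labels, pt))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none reference).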


Results

Clinical factors and radiographic characteristics of patients

The overall EGFR mutation rate within the enrolled population was 50.25% (100 of 199 nodules). The training and validation cohorts had EGFR mutation rates of 52.52% (73 of 139 nodules) and 45% (27 of 60 nodules), respectively. There was no significant difference in EGFR mutation rates between the two cohorts (P=0.62).

Table 1 shows the relationship between the clinical and radiographic features and EGFR mutation status in the training and validation cohorts. Univariate analysis showed that five variables differed significantly between the EGFR mutant and wild-type groups in the training cohort. We then used logistic regression, a multivariable analysis technique for binary outcomes, to construct the radiographic model from these five variables (volume, intensity mean, tumor mass, margin, and air bronchogram).

Table 1

Comparison of clinical factors and radiographic characteristics of patients according to EGFR mutation status in the training and validation cohorts

| Variables | Training cohort: EGFR (+) (n=73) | Training cohort: EGFR (−) (n=66) | Training cohort: P value | Validation cohort: EGFR (+) (n=27) | Validation cohort: EGFR (−) (n=33) | Validation cohort: P value |
| --- | --- | --- | --- | --- | --- | --- |
| Sex | | | 0.73 | | | 0.58 |
| Sex: Female | 56 (76.71) | 49 (74.24) | | 17 (62.96) | 23 (69.70) | |
| Sex: Male | 17 (23.29) | 17 (25.76) | | 10 (37.04) | 10 (30.30) | |
| Age, years | 55.34±11.25 | 53.18±11.76 | 0.27 | 52.52±11.45 | 52.21±10.89 | 0.91 |
| History of lung cancer | 10 (13.70) | 6 (9.09) | 0.39 | 3 (11.11) | 6 (18.18) | 0.68 |
| Multiplicity | | | 0.50 | | | 0.33 |
| Multiplicity: Solitary | 28 (38.36) | 29 (43.94) | | 14 (51.85) | 13 (39.39) | |
| Multiplicity: Multiple | 45 (61.64) | 37 (56.06) | | 13 (48.15) | 20 (60.61) | |
| Volume, mm³ | 837.00 (1381.4) | 301.70 (585.93) | 0.003 | 431.60 (824.75) | 293.00 (361.10) | 0.03 |
| Intensity mean, HU | −530.14 (169.30) | −579.29 (132.89) | 0.03 | −465.66 (160.32) | −603.94 (149.78) | <0.001 |
| Tumor mass | 374.32 (883.71) | 122.77 (176.68) | 0.001 | 243.91 (390.53) | 113.10 (111.85) | <0.001 |
| Tumor density | | | 0.33 | | | 0.29 |
| Tumor density: Pure GGN | 44 (60.27) | 45 (68.18) | | 18 (66.67) | 26 (78.79) | |
| Tumor density: Mixed GGN | 29 (39.73) | 21 (31.82) | | 9 (33.33) | 7 (21.21) | |
| Shape | | | 0.74 | | | 0.59 |
| Shape: Round or oval | 49 (67.12) | 46 (69.70) | | 22 (81.48) | 25 (75.76) | |
| Shape: Irregular | 24 (32.88) | 20 (30.30) | | 5 (18.52) | 8 (24.24) | |
| Spiculated | 41 (56.16) | 32 (48.48) | 0.36 | 15 (55.56) | 15 (45.45) | 0.43 |
| Margin | | | 0.09 | | | 0.054 |
| Margin: Well-defined | 42 (57.53) | 47 (71.21) | | 16 (59.26) | 27 (81.82) | |
| Margin: Poor-defined | 31 (42.47) | 19 (28.79) | | 11 (40.74) | 6 (18.18) | |
| Vacuole sign | 31 (42.47) | 22 (33.33) | 0.26 | 7 (25.93) | 10 (30.30) | 0.70 |
| Air bronchogram | 29 (39.73) | 17 (25.76) | 0.08 | 9 (33.33) | 4 (12.12) | 0.047 |
| Vascular convergence | 56 (76.71) | 43 (65.15) | 0.13 | 19 (70.37) | 19 (57.58) | 0.30 |
| Pleural retraction | 29 (39.73) | 21 (31.82) | 0.33 | 7 (25.93) | 7 (21.21) | 0.66 |
| Pulmonary emphysema | 17 (23.29) | 19 (28.79) | 0.46 | 4 (14.81) | 5 (15.15) | >0.99 |
| Lobe | | | 0.60 | | | 0.68 |
| Lobe: RUL | 29 (39.73) | 20 (30.30) | | 9 (33.33) | 9 (27.27) | |
| Lobe: RML | 6 (8.22) | 7 (10.61) | | 4 (14.81) | 4 (12.12) | |
| Lobe: RLL | 12 (16.44) | 8 (12.12) | | 4 (14.81) | 2 (6.06) | |
| Lobe: LUL | 17 (23.29) | 19 (28.79) | | 8 (29.63) | 14 (42.42) | |
| Lobe: LLL | 9 (12.33) | 12 (18.18) | | 2 (7.41) | 4 (12.12) | |

Data are presented as mean ± standard deviation, n (%), or median (interquartile range). EGFR, epidermal growth factor receptor; HU, Hounsfield unit; GGN, ground glass nodule; RUL, right upper lobe; RML, right middle lobe; RLL, right lower lobe; LUL, left upper lobe; LLL, left lower lobe; EGFR (+), EGFR mutant; EGFR (−), EGFR wild-type.

Radiomics features selection

From each VOI, a total of 1,316 radiomics features were extracted, and features with ICCs less than 0.75 for intra- or inter-observer agreement were eliminated. In feature pre-selection, highly correlated features (Pearson’s correlation coefficient higher than 0.9) were eliminated. LASSO then retained 15 features, each with a coefficient computed by LASSO (Table 2, Figure 3). A radiomics score (radscore), calculated from the coefficients of the selected features, was assigned to each patient (Figure 4). The radiomics model was built on these 15 features. Using multivariable logistic regression, a combined model was constructed by incorporating clinical, radiographic, and radiomic features. The combined model was visualized as a nomogram, which translates a complex statistical model into an intuitive graphical representation from which predicted probabilities can be read directly.
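The correlation pre-selection and the radscore can be sketched as follows. Two assumptions are made here that the text does not confirm: that the filter is greedy (a feature is dropped when its absolute Pearson correlation with an already retained feature exceeds 0.9), and that the radscore is a linear combination of feature values weighted by the LASSO coefficients plus an intercept, which is not reported and is set to 0 in this example. The feature values are hypothetical; the two coefficients are taken from Table 2.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov / sqrt(vx * vy)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold with every kept one."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def radscore(feature_values, coefficients, intercept=0.0):
    """Linear combination of (normalized) feature values and LASSO coefficients."""
    return intercept + sum(coefficients[name] * feature_values[name]
                           for name in coefficients)


# Hypothetical feature columns: "b" is almost a copy of "a" and is dropped.
columns = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8.1], "c": [4, 1, 3, 2]}
print(drop_correlated(columns))

# Two coefficients from Table 2 applied to hypothetical normalized values.
coefs = {"original_Flatness": -0.372150, "wavelet.HLH_Skewness": 1.147258}
values = {"original_Flatness": 1.0, "wavelet.HLH_Skewness": 1.0}
print(radscore(values, coefs))
```

A nomogram performs the same weighted sum graphically: each predictor is mapped to points, and the total points are converted to a predicted probability.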

Table 2

Feature selection by LASSO regression

| Group | Feature name | Coefficient |
| --- | --- | --- |
| First order | original_Minimum | 0.001183 |
| First order | wavelet.HHH_Median | 0.044605 |
| First order | wavelet.HHL_Kurtosis | −0.261803 |
| First order | wavelet.HLH_Skewness | 1.147258 |
| Shape | original_Flatness | −0.372150 |
| GLCM | wavelet.HHL_Correlation | 6.016163 |
| GLCM | wavelet.HLL_ClusterShade | −0.000860 |
| GLCM | wavelet.LLH_ClusterShade | −0.000243 |
| GLCM | logarithm_Correlation | 1.728932 |
| GLSZM | wavelet.HHL_SmallAreaEmphasis | 1.875632 |
| GLSZM | wavelet.HLH_SmallAreaLowGrayLevelEmphasis | −9.545703 |
| GLSZM | logarithm_SmallAreaEmphasis | 0.573257 |
| GLSZM | wavelet.LHL_SmallAreaLowGrayLevelEmphasis | −9.294496 |
| NGTDM | wavelet.HLH_Strength | −0.121665 |
| NGTDM | gradient_Strength | −0.033613 |

LASSO, least absolute shrinkage and selection operator; GLCM, gray-level co-occurrence matrix; GLSZM, gray-level size zone matrix; NGTDM, neighborhood gray-tone difference matrix.

Figure 3 Radiomics feature coefficients and correlation matrix. (A) Plot of coefficients for the top 10 radiomics features; (B) heatmap of correlation coefficients for selected radiomics features.
Figure 4 A violin plot was used to contrast the radscore of EGFR mutant and wild-type in both the training and validation cohorts. There was a statistically significant difference between EGFR wild-type and EGFR mutant in both cohorts. EGFR, epidermal growth factor receptor.

Performance and validation of each model

In the training cohort, the AUCs of the radiographic, radiomics, and nomogram models were 0.652, 0.818, and 0.820, respectively. In the validation cohort, the corresponding AUCs were 0.763, 0.805, and 0.833 (Table 3, Figure 5). The DeLong test indicated no significant difference between the radiomics model and the nomogram (Z=−0.603, P=0.54). However, the AUCs of both the radiomics model and the nomogram differed significantly from that of the radiographic model (all P<0.01). The calibration curves demonstrated good agreement between the nomogram’s predicted probability of EGFR mutation and the actual probability in the training and validation cohorts (Hosmer-Lemeshow test: P=0.73 and 0.66, respectively) (Figure 6).

Table 3

Predictive performance of the three models in the training and validation cohorts

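| Cohort | Model | AUC | 95% CI | Accuracy | Sensitivity | Specificity | PPV | NPV |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | Radiographic | 0.652 | 0.561–0.744 | 0.655 | 0.795 | 0.500 | 0.637 | 0.688 |
| Training | Radiomics | 0.818 | 0.748–0.888 | 0.755 | 0.890 | 0.606 | 0.714 | 0.833 |
| Training | Nomogram | 0.820 | 0.750–0.889 | 0.755 | 0.699 | 0.818 | 0.809 | 0.711 |
| Validation | Radiographic | 0.763 | 0.634–0.890 | 0.733 | 0.593 | 0.848 | 0.762 | 0.718 |
| Validation | Radiomics | 0.805 | 0.689–0.920 | 0.800 | 0.741 | 0.848 | 0.800 | 0.800 |
| Validation | Nomogram | 0.833 | 0.722–0.943 | 0.817 | 0.704 | 0.909 | 0.864 | 0.789 |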

AUC, area under the curve; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value.
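The threshold-based metrics in Table 3 all derive from a 2×2 confusion matrix. The sketch below shows the definitions. The counts are not reported in the article; they are inferred here from the validation cohort sizes (27 mutant, 33 wild-type) and the nomogram's reported sensitivity and specificity, and should be read as an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Inferred (not reported) counts for the nomogram in the validation cohort:
# 19 of 27 mutant nodules and 30 of 33 wild-type nodules classified correctly.
metrics = diagnostic_metrics(tp=19, fp=3, tn=30, fn=8)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)
```

Rounded to three decimals, these counts give 0.817, 0.704, 0.909, 0.864, and 0.789, which agree with the nomogram row for the validation cohort in Table 3.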

Figure 5 ROC curves between various models and cohorts. The top row showed the ROC curves of different models in the (A) training cohort, and (B) validation cohort, respectively. The bottom row showed the AUC values of different models in the (C) training cohort, and (D) validation cohort, respectively. ROC, receiver operating characteristic; AUC, area under the curve.
Figure 6 The calibration curve shows that the probability of predicting EGFR mutation status by the nomogram is highly consistent with the actual probability in both the training cohort (A) and the validation cohort (B). (C) Nomogram of the model combining radiomics signature (radscore) and radiographic feature (air bronchogram) for predicting EGFR mutation status. EGFR, epidermal growth factor receptor.

The DCA showed that in predicting the EGFR mutation status within a suitable threshold probability range, the radiomics and the nomogram model had a larger overall net benefit than the radiographic model (Figure 7).

Figure 7 Decision curves analysis for all models in the training cohort. A larger area under the decision curve indicates better clinical utility.

Discussion

Studies have shown that East Asian patients with advanced NSCLC have a higher rate of EGFR mutations (40% to 60%) than the Caucasian population (14% to 18%) (26,33). In this study, EGFR mutations were present in 50.25% (100/199) of nodules overall, with exon 19 and exon 21 having the highest frequency of mutations, which is consistent with earlier studies.

By radiographic analysis of the CT images, we found that the EGFR mutant nodules had a larger volume and tumor mass, a higher intensity mean, and a greater probability of a poorly defined margin and air bronchogram than EGFR wild-type nodules. However, the radiographic model based on these selected features had only modest predictive efficacy in the training cohort (AUC =0.652). Previous studies have attempted to clarify the correlation between CT radiographic features and EGFR mutation status, yet their findings have been inconsistent. Han et al. (19) suggested that non-smoking women and patients with lesions manifesting as GGN and air bronchogram were more likely to have EGFR mutations, while Sugano et al. (34) found that EGFR mutation status and the presence of GGN did not appear to be significantly correlated. One possible reason is that radiographic features are defined by radiologists and may be evaluated differently depending on subjective experience. Differences in study design and grouping methods may also contribute to the contradictory findings between these studies.

Radiomics analysis is based on medical images: image data-mining is achieved by converting images into high-dimensional features, which reduces the influence of individual radiologists’ subjective judgment on diagnosis. Furthermore, it mines and integrates imaging features, especially high-order texture features, which may reflect not only the heterogeneity of tumor histology but also the genetic differences behind it. Jiang et al. (35) found that a radiomics signature consisting of seven radiomics features demonstrated strong discriminative performance between invasive and micro-invasive adenocarcinoma, with an AUC of 0.892. Chen et al. (28) built a radiomics model that can predict not only EGFR mutation status but also mutation subtypes. In this study, 1,316 radiomics features were extracted from each nodule and reduced to 15 potential predictors using LASSO; a radscore was then calculated to indicate the potential risk of EGFR mutation for each nodule. The 15 features comprised 1 shape-based feature and 14 texture features. The shape-based feature is based on information such as the three-dimensional size and shape of the lesion itself and therefore does not involve heterogeneity. In contrast, texture features quantify tumor heterogeneity by providing information about the relative positions of different gray levels of the image through various mathematical methods. In this study, the EGFR mutant group had higher values of skewness, correlation, and small area emphasis, which may indicate that cell density, necrosis, or fibrosis is more pronounced than in the wild-type group and that EGFR mutant tumors have a more structured and organized tumor microenvironment. The constructed radiomics model exhibited superior predictive efficacy, with AUCs of 0.818 and 0.805 in the training and validation cohorts, respectively. This performance was notably higher than that of the radiographic model, which had AUCs of 0.652 and 0.763, respectively.

The nomogram visually represents the combined model, which was constructed by integrating clinical and radiographic characteristics with the radscore using multivariate logistic regression. He et al. (27) found that the combined model performed better in each of the four machine-learning classifiers that they used to predict the EGFR mutation status. In this study, although the nomogram’s AUC was greater than that of the radiomics model (training cohort, 0.820 vs. 0.818; validation cohort, 0.833 vs. 0.805), there was no statistically significant difference between them. This lack of a significant difference may be attributed to the difficulty of differentiating between the EGFR mutant and wild-type groups based solely on radiographic features, whose distributions were quite similar in the two groups. On the other hand, the radiomics model outperformed the radiographic model, which may be a result of using the most informative features in the radiomics model.

Although the results are encouraging, this study has several limitations. First, although we collected 2,660 GGNs with pathologic findings, only 199 nodules had genetic testing results. The sample size was therefore limited, and for this reason we did not stratify the analysis by mutation subtype. To mitigate this, we employed random allocation and cross-validation techniques to support the robustness of the model despite the small sample size. Second, although the feature selection procedure took repeatability into account, the reproducibility and stability of radiomics features may have been affected by the use of different CT scanners and parameters. To address this issue, we focused on reproducibility in our feature selection and used stability tests, such as ICC and cross-validation, to reduce the effects of varying imaging conditions. Third, smoking history was not taken into account because it was not recorded in the medical files.


Conclusions

In conclusion, a radiomics model and a nomogram have been developed to predict the EGFR mutation status in lung adenocarcinomas that manifest as GGNs, both of which perform better than the radiographic model. Radiomics is a reusable, non-invasive method of data processing that may be applied to predicting the status of EGFR mutation in lung adenocarcinoma.


Acknowledgments

Funding: None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-1166/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-1166/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-1166/prf

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-1166/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study was approved by the Ethics Committee of the First Affiliated Hospital of Soochow University (2023-No.203) and individual consent for this retrospective analysis was waived.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021;71:209-49.
  2. Chen W, Zheng R, Baade PD, et al. Cancer statistics in China, 2015. CA Cancer J Clin 2016;66:115-32.
  3. Pao W, Girard N. New driver mutations in non-small-cell lung cancer. Lancet Oncol 2011;12:175-80.
  4. Tan AC, Tan DSW. Targeted Therapies for Lung Cancer Patients With Oncogenic Driver Molecular Alterations. J Clin Oncol 2022;40:611-25.
  5. Oxnard GR, Binder A, Jänne PA. New targetable oncogenes in non-small-cell lung cancer. J Clin Oncol 2013;31:1097-104.
  6. Sasaki H, Endo K, Okuda K, et al. Epidermal growth factor receptor gene amplification and gefitinib sensitivity in patients with recurrent lung cancer. J Cancer Res Clin Oncol 2008;134:569-77.
  7. Hosomi Y, Morita S, Sugawara S, et al. Gefitinib Alone Versus Gefitinib Plus Chemotherapy for Non-Small-Cell Lung Cancer With Mutated Epidermal Growth Factor Receptor: NEJ009 Study. J Clin Oncol 2020;38:115-23.
  8. Maemondo M, Inoue A, Kobayashi K, et al. Gefitinib or chemotherapy for non-small-cell lung cancer with mutated EGFR. N Engl J Med 2010;362:2380-8.
  9. Lindeman NI, Cagle PT, Aisner DL, et al. Updated Molecular Testing Guideline for the Selection of Lung Cancer Patients for Treatment With Targeted Tyrosine Kinase Inhibitors: Guideline From the College of American Pathologists, the International Association for the Study of Lung Cancer, and the Association for Molecular Pathology. J Thorac Oncol 2018;13:323-58.
  10. Vanderlaan PA, Yamaguchi N, Folch E, et al. Success and failure rates of tumor genotyping techniques in routine pathological samples with non-small-cell lung cancer. Lung Cancer 2014;84:39-44.
  11. Bonanno L, Pavan A, Ferro A, et al. Clinical Impact of Plasma and Tissue Next-Generation Sequencing in Advanced Non-Small Cell Lung Cancer: A Real-World Experience. Oncologist 2020;25:e1996-2005.
  12. Uematsu T, Kasami M. The use of positive core wash cytology to estimate potential risk of needle tract seeding of breast cancer: directional vacuum-assisted biopsy versus automated core needle biopsy. Breast Cancer 2010;17:61-7.
  13. Zugazagoitia J, Gómez-Rueda A, Jantus-Lewintre E, et al. Clinical utility of plasma-based digital next-generation sequencing in oncogene-driven non-small-cell lung cancer patients with tyrosine kinase inhibitor resistance. Lung Cancer 2019;134:72-8.
  14. Bertoli E, De Carlo E, Basile D, et al. Liquid Biopsy in NSCLC: An Investigation with Multiple Clinical Implications. Int J Mol Sci 2023;24:10803.
  15. Park S, Olsen S, Ku BM, et al. High concordance of actionable genomic alterations identified between circulating tumor DNA-based and tissue-based next-generation sequencing testing in advanced non-small cell lung cancer: The Korean Lung Liquid Versus Invasive Biopsy Program. Cancer 2021;127:3019-28.
  16. Mitchell RL, Kosche C, Burgess K, et al. Misdiagnosis of Li-Fraumeni Syndrome in a Patient With Clonal Hematopoiesis and a Somatic TP53 Mutation. J Natl Compr Canc Netw 2018;16:461-6.
  17. Marquette CH, Boutros J, Benzaquen J, et al. Circulating tumour cells as a potential biomarker for lung cancer screening: a prospective cohort study. Lancet Respir Med 2020;8:709-16.
  18. Newman AM, Bratman SV, To J, et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat Med 2014;20:548-54.
  19. Han X, Fan J, Gu J, et al. CT features associated with EGFR mutations and ALK positivity in patients with multiple primary lung adenocarcinomas. Cancer Imaging 2020;20:51.
  20. Liu Y, Kim J, Qu F, et al. CT Features Associated with Epidermal Growth Factor Receptor Mutation Status in Patients with Lung Adenocarcinoma. Radiology 2016;280:271-80.
  21. Rizzo S, Petrella F, Buscarino V, et al. CT Radiogenomic Characterization of EGFR, K-RAS, and ALK Mutations in Non-Small Cell Lung Cancer. Eur Radiol 2016;26:32-42. [Crossref] [PubMed]
  22. Chen Y, Yang Y, Ma L, et al. Prediction of EGFR mutations by conventional CT-features in advanced pulmonary adenocarcinoma. Eur J Radiol 2019;112:44-51. [Crossref] [PubMed]
  23. Zou J, Lv T, Zhu S, et al. Computed tomography and clinical features associated with epidermal growth factor receptor mutation status in stage I/II lung adenocarcinoma. Thorac Cancer 2017;8:260-70. [Crossref] [PubMed]
  24. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology 2016;278:563-77. [Crossref] [PubMed]
  25. Lambin P, Rios-Velazquez E, Leijenaar R, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer 2012;48:441-6. [Crossref] [PubMed]
  26. Rossi G, Barabino E, Fedeli A, et al. Radiomic Detection of EGFR Mutations in NSCLC. Cancer Res 2021;81:724-31. [Crossref] [PubMed]
  27. He R, Yang X, Li T, et al. A Machine Learning-Based Predictive Model of Epidermal Growth Factor Mutations in Lung Adenocarcinomas. Cancers (Basel) 2022;14:4664. [Crossref] [PubMed]
  28. Chen Q, Li Y, Cheng Q, et al. EGFR Mutation Status and Subtypes Predicted by CT-Based 3D Radiomic Features in Lung Adenocarcinoma. Onco Targets Ther 2022;15:597-608. [Crossref] [PubMed]
  29. Kim H, Park CM, Woo S, et al. Pure and part-solid pulmonary ground-glass nodules: measurement variability of volume and mass in nodules with a solid portion less than or equal to 5 mm. Radiology 2013;269:585-93. [Crossref] [PubMed]
  30. Song Y, Zhang J, Zhang YD, et al. FeAture Explorer (FAE): A tool for developing and comparing radiomics models. PLoS One 2020;15:e0237587. [Crossref] [PubMed]
  31. Sun Y, Li C, Jin L, et al. Radiomics for lung adenocarcinoma manifesting as pure ground-glass nodules: invasive prediction. Eur Radiol 2020;30:3650-9. [Crossref] [PubMed]
  32. Wang T, She Y, Yang Y, et al. Radiomics for Survival Risk Stratification of Clinical and Pathologic Stage IA Pure-Solid Non-Small Cell Lung Cancer. Radiology 2022;302:425-34. [Crossref] [PubMed]
  33. Shi Y, Au JS, Thongprasert S, et al. A prospective, molecular epidemiology study of EGFR mutations in Asian patients with advanced non-small-cell lung cancer of adenocarcinoma histology (PIONEER). J Thorac Oncol 2014;9:154-62. [Crossref] [PubMed]
  34. Sugano M, Shimizu K, Nakano T, et al. Correlation between computed tomography findings and epidermal growth factor receptor and KRAS gene mutations in patients with pulmonary adenocarcinoma. Oncol Rep 2011;26:1205-11. [PubMed]
  35. Jiang Y, Che S, Ma S, et al. Radiomic signature based on CT imaging to distinguish invasive adenocarcinoma from minimally invasive adenocarcinoma in pure ground-glass nodules with pleural contact. Cancer Imaging 2021;21:1. [Crossref] [PubMed]
Cite this article as: Ping X, Meng Q, Jiang N, Hu S. A predictive nomogram for EGFR mutation status in lung adenocarcinoma manifesting as ground-glass nodules. J Thorac Dis 2024;16(11):7477-7489. doi: 10.21037/jtd-24-1166
