Computed tomography radiomics study of invasion and instability of lung adenocarcinoma manifesting as ground glass nodule

Wen-Zhao Zhang; Yao-Yun Zhang; Xin-Lin Yao; Pei-Ling Li; Xin-Yue Chen; Li-Yi He; Ji-Zhao Jiang; Jian-Qun Yu

doi:10.21037/jtd-24-27

Original Article

Wen-Zhao Zhang¹#, Yao-Yun Zhang²#, Xin-Lin Yao², Pei-Ling Li³, Xin-Yue Chen⁴, Li-Yi He¹, Ji-Zhao Jiang⁵, Jian-Qun Yu¹

¹Department of Radiology, West China Hospital, Sichuan University, Chengdu, China; ²Department of Radiology, Sichuan Tianfu New Area People’s Hospital, Chengdu, China; ³Department of Critical Care Medicine, Chengdu Shangjin Nanfu Hospital, Chengdu, China; ⁴CT Collaboration, Siemens Healthineers, Chengdu, China; ⁵Customer Application Department, Siemens Healthineers, Chengdu, China

Contributions: (I) Conception and design: WZ Zhang, YY Zhang, JQ Yu; (II) Administrative support: JQ Yu; (III) Provision of study materials or patients: WZ Zhang, YY Zhang, XY Chen, PL Li, JZ Jiang; (IV) Collection and assembly of data: WZ Zhang, YY Zhang, LY He, XL Yao; (V) Data analysis and interpretation: WZ Zhang, JQ Yu; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work as co-first authors.

Correspondence to: Jian-Qun Yu, MD. Department of Radiology, West China Hospital, Sichuan University, 37# Guoxue Xiang, Chengdu 610041, China. Email: cjr.yujianqun@vip.163.com.

Background: Ground-glass nodule (GGN) is the most common manifestation of lung adenocarcinoma on computed tomography (CT). Clinically, the success rate of preoperative diagnosis of GGN by puncture biopsy and other means is still low. The aim of this study is to investigate the clinical and radiomics characteristics of lung adenocarcinoma presenting as GGN on CT images using radiomics analysis methods, establish a radiomics model, and predict the classification of pathological tissue and instability of GGN type lung adenocarcinoma.

Methods: This study retrospectively enrolled 249 patients with 298 GGN lesions that were pathologically confirmed as lung adenocarcinoma. The images were imported into the Siemens scientific research prototype software to outline the region of interest and extract the radiomics features. Logistic model A (a radiomics model to identify the invasiveness of lung adenocarcinoma manifesting as GGNs) was established using the features retained after dimensionality reduction. The receiver operating characteristic (ROC) curves of the model on the training set and the validation set were drawn, and the area under the curve (AUC) was calculated. Second, 112 of the 298 lesions had CT images from at least two occasions, with an interval of not less than 90 days between the first CT and the preoperative CT. The mass doubling time (MDT) of these lesions was calculated. Instability was then predicted according to different MDT diagnostic thresholds. Finally, the AUCs of the resulting models were calculated and compared.

Results: There were statistically significant differences in age and lesion location distribution between the “noninvasive” lesion group and the invasive lesion group (P<0.05), but there were no statistically significant differences in sex (P>0.05). Model A had an AUC of 0.89, sensitivity of 0.75, and specificity of 0.86 in the training set and an AUC of 0.87, sensitivity of 0.63, and specificity of 0.90 in the validation set. There was no significant difference statistically in MDT between “noninvasive” lesions and invasive lesions (P>0.05). The AUCs of radiomics models B₁, B₂ and B₃ were 0.89, 0.80, and 0.81, respectively; the sensitivities were 0.71, 0.54, and 0.76, respectively; the specificities were 0.83, 0.77, and 0.60, respectively; and the accuracies were 0.78, 0.65, and 0.69, respectively.

Conclusions: There were statistically significant differences in age and location of lesions between the “noninvasive” lesion group and the invasive lesion group. The radiomics model can predict the invasiveness of lung adenocarcinoma manifesting as GGNs. There was no significant difference in MDT between “noninvasive” lesions and invasive lesions. The radiomics model can predict the instability of lung adenocarcinoma manifesting as GGN. When the threshold of MDT was set at 813 days, the model had higher specificity, accuracy, and diagnostic efficiency.

Keywords: Ground glass nodule; radiomics; lung adenocarcinoma; doubling time (DT); instability

Submitted Jan 05, 2024. Accepted for publication May 17, 2024. Published online Jun 28, 2024.

Highlight box

Key findings

• The radiomics model can predict the invasion of lung adenocarcinoma manifesting as ground glass nodules (GGN). The radiomics model can also predict the instability of GGN at first diagnosis.

What is known and what is new?

• Recent studies have focused on identifying benign and malignant nodules, determining infiltration and predicting genotype.

• This study aims to provide the mass doubling time (MDT) related information for GGNs that do not exhibit significant malignant signs.

What is the implication, and what should change now?

• The findings may help guide clinicians and imaging diagnostic physicians in formulating more reasonable follow-up computed tomography time schedule and personalized treatment plans.

Introduction

Lung cancer is the malignant tumor with the highest incidence rate and mortality in the world, and lung adenocarcinoma is the most common histological subtype of lung cancer (1). Computed tomography (CT) is the most commonly used method to detect lung cancer. Ground-glass nodule (GGN) is the most common manifestation of lung adenocarcinoma on CT. GGN refers to the nodular shadow with slightly higher cloud-like density on the CT image, with clear or blurred boundary. The bronchial and vascular edges can be seen in the focus. GGN usually does not appear on the mediastinal window or only shows the solid components of the focus (2).

Lung adenocarcinoma is divided into atypical adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA) and invasive adenocarcinoma (IAC) (3). Clinically, the success rate of preoperative diagnosis of GGN by puncture biopsy and other means is still low (4).

Radiomics is a new concept proposed in recent years. It converts traditional images to analyzable data, extracts feature data that are difficult to observe and distinguish with the naked eye, and performs quantitative and comprehensive analysis of radiological images (5). It has opened up a new perspective for the noninvasive diagnosis of tumors (6).

In recent years, radiomics has been widely applied at various clinical stages. Many studies have confirmed that radiomics has unique advantages in differentiating benign from malignant tumors, pathological classification, evaluation of curative effects after adjuvant treatment, and assessment of the risk of recurrence (7-10). In 2015, the World Health Organization (WHO) classification published by the International Agency for Research on Cancer classified AAH and AIS as noninvasive lesions and MIA and IAC as invasive lesions (11). In 2021, the 5th edition of the WHO Classification of Thoracic Tumors classified AAH and AIS as precursor glandular lesions (precancerous lesions) and lung adenocarcinoma as MIA and IAC (12). Although MIA is classified as an invasive disease, it lacks vascular invasion, tumor necrosis, and pleural invasion, and several reports have noted that the 5-year tumor-free survival rate after surgical resection of MIA, AAH and AIS is close to 100%, whereas that of IAC falls to approximately 90% and, for the most malignant IAC with papillary components, to as low as 79% (13). Therefore, GGNs with a pathological tissue type of IAC should be removed as soon as possible. Since the dynamic development of tumor histology is irreversible, the dynamic follow-up of lesions is particularly important, especially to detect the transformation of MIA to IAC (14). At present, regular CT follow-up is the main means of observing the development of GGN lesions. The follow-up interval and duration are determined largely by experience, which causes uncertainty in the timing of clinical treatment. Moreover, CT follow-up without scientific guidance leads to the continuous accumulation of radiation exposure and places a psychological and socioeconomic burden on patients. If surgical resection is selected for all GGNs, it will lead to overtreatment, waste of medical resources and unnecessary iatrogenic injury to patients. At present, the progression of lesions is determined by calculating the volume doubling time (VDT) of the GGN. However, the volume of a GGN may not change significantly over a period of time even though its density has changed. This change manifests as the appearance of solid components on CT. Solid components have been shown to be closely related to invasive components in pathological sections (15). The mass doubling time (MDT) has therefore been proposed (16). However, the follow-up of GGNs has so far been evaluated only by doubling time (DT), and further research is needed.

The objective of this study is to establish models to predict the invasiveness and instability of nodules using the clinical and radiomics features of lung adenocarcinoma presenting as GGN on CT images. The prediction results obtained by the models can be used as the basis for individual follow-up strategy decision-making. We present this article in accordance with the TRIPOD reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-27/rc).

Methods

Population

Patients with lung adenocarcinoma confirmed by surgery and pathology in our hospital from January 2014 to August 2022 were retrospectively enrolled. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study received approval from the Institutional Ethics Committee of West China Hospital of Sichuan University (approval No. 2022-564). Due to the retrospective nature of the study, the need for informed consent was waived by the ethics committee of West China Hospital of Sichuan University.

Inclusion criteria: (I) the patient had CT images within 1 week before surgery; (II) the lesion was surgically resected in our hospital, and the pathological results were all taken from the official report of the gross specimen; (III) for multiple lesions, there was a clear correspondence between the postoperative pathological results and the lesions on CT images; (IV) the CT slice thickness was 1 mm; (V) no puncture biopsy, radiotherapy, or chemotherapy was performed before surgery.

Exclusion criteria: (I) the lesion could not be delineated reliably because the basic condition of the patient’s lung field was poor, a large area of infection was present in the lung field, or the boundary between the lung field and the nodule was vague; (II) inconsistent slices within an image series due to data loss or incomplete reconstruction; (III) inadequate image quality, with artefacts or noise that impeded diagnosis; (IV) the patient had other concurrent malignant tumors.

After screening with the inclusion and exclusion criteria, a total of 249 patients with 298 GGN lesions were included in the study. Of these lesions, 81 belonged to male patients and 217 to female patients, and the mean age was 53.8±10.6 years. Among the 298 lesions, 25 were AAH, 20 were AIS, 132 were MIA and 121 were IAC. In this study, AAH, AIS and MIA were defined as non-invasive lesions, and IAC was defined as an invasive lesion. The 298 lesions were accordingly divided into two groups (Table 1).

Table 1

Summary of the pathological classification of 298 GGNs

Classification	Number of GGN (%)
Non-invasive lesion	177 (59.4)
AAH	25 (8.4)
AIS	20 (6.7)
MIA	132 (44.3)
Invasive lesion	121 (40.6)
IAC	121 (40.6)

GGN, ground glass nodule; AAH, atypical adenomatous hyperplasia; AIS, adenocarcinoma in situ; MIA, minimally invasive adenocarcinoma; IAC, invasive adenocarcinoma.

Subsequently, lesions with qualifying serial CT examinations were selected from the included GGN lesions; the criterion was an interval of ≥90 days between the first CT and the last CT before the operation. A total of 112 lesions met this criterion; 34 lesions were in male patients, 78 lesions were in female patients, and the age at first diagnosis was 52.4±11.3 years (range, 28–79 years). The MDT was used to represent the DT of tumors (16). MDT can be calculated by the following formulas:

$M (mg) = V \times (H + 1000) \times 0.001$ [1]

$MDT = \log 2 \times T / \log (M_{i} / M_{0})$ [2]

M: mass; mg: milligram (the unit of mass); V: the volume of the nodule (mm³), obtained automatically by the software; H: the average CT value of the nodule (HU), obtained automatically by the software; M_i: mass of the GGN on the last (preoperative) CT; M₀: mass of the GGN on the first CT; T: follow-up interval (days).
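Equations [1] and [2] can be sketched in a few lines of Python. This is a minimal illustration, not the prototype software's implementation: the function names are hypothetical, natural logarithms are used (the ratio of logarithms in Equation [2] does not depend on the base), and a nodule whose mass has not increased is given an infinite MDT, which is an assumption about how such nodules are handled.

```python
import math


def nodule_mass_mg(volume_mm3: float, mean_ct_hu: float) -> float:
    """Equation [1]: M (mg) = V x (H + 1000) x 0.001."""
    return volume_mm3 * (mean_ct_hu + 1000.0) * 0.001


def mass_doubling_time(m_first_mg: float, m_last_mg: float, interval_days: float) -> float:
    """Equation [2]: MDT = log 2 x T / log(M_i / M_0).

    Assumption: a mass that did not increase is reported as an infinite MDT.
    """
    if m_first_mg <= 0 or m_last_mg <= 0:
        raise ValueError("masses must be positive")
    if m_last_mg <= m_first_mg:
        return math.inf
    return math.log(2) * interval_days / math.log(m_last_mg / m_first_mg)


# Rounded masses and interval from the Figure 8 example (83 mg -> 138 mg over 326 days).
print(mass_doubling_time(83, 138, 326))
```

With the rounded masses shown for Figure 8, the result lands within a few days of the 441 days reported there; the small gap is plausibly due to rounding of the displayed masses.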

Scanning technology and inspection parameters

All patients were scanned with a second-generation dual-source CT scanner (Somatom Definition Flash, Siemens Healthcare, Germany). Patients were scanned head first in the supine position with their arms raised, and were instructed to hold their breath after deep inspiration to ensure adequate depiction of the lung field. The scanning range extended from the thoracic inlet to the bilateral costophrenic angles (including the bilateral adrenal glands). The imaging parameters, routinely used by the hospital, were as follows: tube voltage of 120 kVp with automatic tube current modulation (CARE Dose 4D), pitch of 0.85, and collimator width of 0.6 mm. Images were then reconstructed at 1 mm slice thickness with kernels B36f and D70f for soft tissue and lung tissue depiction, respectively. The window width and level were set by the radiologist according to the diagnostic task. All images were stored in DICOM format.

Image postprocessing and data analysis

All DICOM images were imported into the Siemens scientific research prototype software (Radiomics) on the Frontier platform. First, the software’s automatic segmentation algorithm (random walker-based) was used to roughly identify the nodules of interest. The software’s sketching tools (nudging and brushing) were then used to manually correct the edges of the segmented nodules, to ensure that the regions excluded adjacent non-nodule structures such as fibrous cords, blood vessels and pleura. Because PyRadiomics features are embedded in the software, voxel-wise analysis could be applied after segmentation. Extracted image features included first-order, morphological and texture features [gray level dependence matrix (GLDM), gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), neighborhood gray tone difference matrix (NGTDM)].

The final model retained only the most pertinent features. For invasiveness discrimination, feature selection first employed the least absolute shrinkage and selection operator (LASSO) method via the glmnet package in R to eliminate collinearity. Subsequently, the minimum redundancy maximum relevance (mRMR) algorithm was implemented to identify a subset of these features that maximized relevance while minimizing redundancy. The subset was limited to a maximum of 5 selected features. Further refinement involved testing the coefficients of the final regression model against the invasive labels; a feature whose coefficient had a P value exceeding 0.05 was regarded as a confounding factor and removed. For instability prediction, feature selection began with a random forest approach, ranking features by their Gini importance. The top six features were then combined using a support vector machine for predictive analysis.

Both the LASSO and Random Forest algorithms had parameter tuning within predetermined ranges. Optimal configurations were trained based on 5 repetitions of 10-fold cross-validation, aimed at maximizing the area under the curve (AUC) value, which serves as the benchmark for optimal performance. To achieve a balance between computational complexity and predictive accuracy, logistic regression (for invasiveness prediction) and support vector machines (for instability prediction) were employed, utilizing an optimal number of features selected based on their AUC values that closely approximate the optimal performance.

The logistic models underwent three types of testing: using the training data, averaging results from 50 resampled training datasets, and validation with independent data. However, due to the limited sample size for instability, its model was tested solely on the training data and through averaging results from 50 resampled training datasets to demonstrate its generalization ability.

Statistical analysis

Qualitative variables (gender and location distribution) were compared between groups by the Chi-squared test or Fisher’s exact test. The distributions of the quantitative parameters (age, follow-up interval and MDT) and radiomics features were checked; normally distributed variables were summarized as the mean ± standard deviation, and otherwise the median and range were used. The independent sample t-test or the nonparametric Mann-Whitney U test was selected for group comparison according to the distribution. The diagnostic performance of the model was assessed by the receiver operating characteristic (ROC) curve, with the AUC and its 95% confidence interval calculated by the bootstrap resampling technique. The difference in performance between two models was tested by the DeLong test. The calibration curve was drawn to show the agreement between the model output and the actual class probability. P<0.05 indicated a statistically significant difference. All statistical analyses were performed with SPSS 21.0 software.
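The AUC and its bootstrap confidence interval described above can be illustrated with a short standard-library sketch. It is not the software used in the study: the function names are hypothetical, the AUC is computed from its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one), the interval is a simple percentile bootstrap, and the labels and scores at the bottom are invented purely for demonstration.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    auc(labels, scores)  # fail early if one class is missing
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Invented toy data (1 = positive class), for illustration only.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
print(auc(toy_labels, toy_scores), bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200))
```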

Results

The relationship between radiomics characteristics and GGN infiltration in GGN-type lung adenocarcinoma

A total of 298 GGN lesions were included in this study, of which 81 (27.2%) were in male patients and 217 (72.8%) were in female patients; the mean age was 53.8±10.6 years. Of these, 177 (59.4%) were noninvasive lesions (25 AAH, 20 AIS, 132 MIA), and 121 (40.6%) were invasive lesions (IAC). Thin-slice CT images of the chest are shown in Figures 1,2.

Figure 1 A 47-year-old male with mGGN in the left upper lobe of the lung is shown on thin slice CT of the chest in cross section (A,B), sagittal plane (C), and coronal plane (D). Vascular passage shadows are visible within the lesion. Postoperative pathological diagnosis indicated that the nodule was MIA. mGGN, mixed ground-glass nodule; CT, computed tomography; MIA, minimally invasive adenocarcinoma.

Figure 2 Thin-slice CT images of noninvasive lesions and invasive lesions. (A) Female, 60 years old, with mGGN in the left upper lobe of the lung. The postoperative pathological diagnosis of this nodule was AAH. (B) Male, 45 years old, with pGGN in the right lower lobe of the lung. Postoperative pathological diagnosis showed that the nodule was AIS. (C) Female, 50 years old, with pGGN in the left upper lobe of the lung. Postoperative pathological diagnosis showed that the nodule was MIA. (D) Female, 35 years old, with mGGN in the left lower lobe of the lung. Postoperative pathological diagnosis showed that the nodule was IAC. CT, computed tomography; mGGN, mixed ground-glass nodule; AAH, atypical adenomatous hyperplasia; pGGN, pure ground-glass nodule; AIS, adenocarcinoma in situ; MIA, minimally invasive adenocarcinoma; IAC, invasive adenocarcinoma.

Clinical characteristics comparison between noninvasive lesions and invasive lesions

The between-group comparisons of gender, age, and lesion location are shown in Table 2. Age and lesion location differed significantly between the groups (P<0.05), whereas the difference in sex distribution was not significant (P>0.05).

Table 2

Comparison of basic clinical characteristics between noninvasive lesions and invasive lesions

Feature	Noninvasive lesions (n=177)	Invasive lesions (n=121)	Statistical value	P value
Gender, n (%)			0.056^†	0.81
Male	49 (27.7)	32 (26.4)
Female	128 (72.3)	89 (73.6)
Age (years), mean ± SD	51.6±10.4	56.9±10.1	−4.419^‡	<0.001
Location of the lesion, n (%)			12.707^§	0.01
Left upper lobe	50 (28.2)	31 (25.6)
Left lower lobe	23 (13.0)	17 (14.0)
Right upper lobe	61 (34.5)	44 (36.4)
Right middle lobe	13 (7.3)	0 (0.0)
Right lower lobe	30 (17.0)	29 (24.0)

^†, Chi-squared test; ^‡, independent sample t-test; ^§, Fisher’s exact test. SD, standard deviation.
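As a worked check of the sex comparison in Table 2, the Pearson Chi-squared statistic for a 2×2 table can be computed directly from the four counts. The sketch below is illustrative only (the function name is hypothetical and this is not the SPSS procedure itself); it uses the statistic without continuity correction, and the P value for one degree of freedom is obtained from the complementary error function.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson Chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: male, female; columns: noninvasive, invasive (counts from Table 2).
chi2, p_value = chi2_2x2(49, 32, 128, 89)
print(round(chi2, 3), round(p_value, 2))
```

The uncorrected statistic agrees with the values listed for gender in Table 2 (0.056, P=0.81).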

Selection of radiomic features associated with the histological subtype of GGN-type nodules

Data extraction, radiomics feature screening, and model establishment

In this study, we used a stratified random sampling method to split our data (n=298) into training (n=180) and validation (n=118) cohorts at a ratio of 6:4. The proportion of non-invasive and invasive nodules, the two classes under study, was the same in both cohorts: the training cohort contained 107 non-invasive and 73 invasive nodules, and the validation cohort contained 70 non-invasive and 48 invasive nodules. A total of 854 features were calculated after the 3D segmentation of each nodule using the Siemens scientific research prototype software. After feature selection by the LASSO regression model, 80 features were retained. The final model A for differentiating non-invasive from invasive nodules comprised four features: wavelet.HLH_glcm_MCC, wavelet.HLH_firstorder_Maximum, original_glcm_Correlation and original_shape_Compactness1. The formula for this model is expressed as:

$\begin{array}{l} R = -0.9709 - 1.3788 \times \text{wavelet.HLH\_glcm\_MCC} \\ \quad + 0.6684 \times \text{wavelet.HLH\_firstorder\_Maximum} \\ \quad + 0.74168 \times \text{original\_glcm\_Correlation} \\ \quad - 0.6415 \times \text{original\_shape\_Compactness1} \end{array}$ [3]

If R≥0.5, the lesion is invasive; otherwise, it is noninvasive. This model A for predicting infiltration could also be presented as a nomogram (Figure 3).
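The scoring step of model A can be sketched as follows, using the intercept and coefficients of Equation [3]. Several points are assumptions rather than statements from the text: the function names are hypothetical, the input features are taken to be scaled in the same way as during training (the scaling parameters are not given here), the shape feature is named Compactness1 as in the feature list, and the logistic transform of the linear score is shown alongside the score itself because the text does not state whether the 0.5 cut-off is applied to the linear score or to the resulting probability.

```python
import math

# Intercept and coefficients copied from Equation [3].
INTERCEPT = -0.9709
COEFFICIENTS = {
    "wavelet.HLH_glcm_MCC": -1.3788,
    "wavelet.HLH_firstorder_Maximum": 0.6684,
    "original_glcm_Correlation": 0.74168,
    "original_shape_Compactness1": -0.6415,
}


def linear_score(features: dict) -> float:
    """Weighted sum of the four features plus the intercept (R in Equation [3])."""
    return INTERCEPT + sum(w * features[name] for name, w in COEFFICIENTS.items())


def logistic_probability(features: dict) -> float:
    """Standard logistic transform of the linear score (an assumed reading)."""
    return 1.0 / (1.0 + math.exp(-linear_score(features)))


def is_invasive(value: float, cutoff: float = 0.5) -> bool:
    """Apply the cut-off to whichever quantity the caller chooses."""
    return value >= cutoff


# Hypothetical nodule whose scaled features are all zero.
example = {name: 0.0 for name in COEFFICIENTS}
print(linear_score(example), logistic_probability(example))
```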

Figure 3 Nomogram of radiomics model A established from the training set data.

Evaluation and effectiveness of radiomics model A in the training and validation sets

In the training set, the averaged AUC value of model A was 0.89 [95% confidence interval (CI): 0.84–0.94], and the model exhibited a sensitivity of 0.75, a specificity of 0.86, and an accuracy of 0.82 (Figure 4). In the validation set, the ROC evaluation of model A yielded an AUC value of 0.87 (95% CI: 0.81–0.93), with a sensitivity of 0.63, a specificity of 0.90, and an accuracy of 0.79 (Figure 4).

Figure 4 Analysis of ROC curves for the differentiation of noninvasive lesions from invasive lesions using radiomics model A in the training set (A) and the validation set (B). ROC, receiver operating characteristic.

The calibration curves for both the training and validation cohorts showed good agreement between the predicted probability and the actual incidence. This indicates that the output of the formula can be interpreted as the probability of the studied class (Figure 5).

Figure 5 The relationship between the predicted results of the nomograms in the training set (A) and the validation set (B) and the actual occurrence of invasive lesions. ROC, receiver operating characteristic.

The relationship between radiomics characteristics and GGN instability in GGN-type lung adenocarcinoma

The study included a total of 112 GGN lesions, with 34 lesions belonging to male patients and 78 lesions belonging to female patients. The age at initial diagnosis was 52.4±11.3 years, ranging from 28 to 79 years. Of the total lesions, 75 were non-invasive lesions and 37 invasive. The follow-up interval ranged from 90 to 2,190 days, with a median of 235 days, and the median MDT was 1,047 days (156.3, +∞ days). Thin slice CT images of the chest are shown in Figures 6-8.

Figure 6 Male, 37 years old at first diagnosis, left upper lobe mGGN, postoperative pathological diagnosis of this nodule as MIA. (A) The baseline volume measured by Radiomics software was 284 mm³, and the mass was 177 mg. (B) Radiomics software measured a volume of 306 mm³ and a mass of 200 mg after 189 days of follow-up. The MDT of this nodule was calculated to be 1,067 days. mGGN, mixed ground-glass nodule; MIA, minimally invasive adenocarcinoma; MDT, mass doubling time.

Figure 7 Male, initial diagnosis age: 45 years old, left upper lobe pGGN, postoperative pathological diagnosis: MIA. (A) The baseline volume measured by Radiomics software was 140 mm³, and the mass was 60 mg. (B) Radiomics software measured a volume of 488 mm³ and a mass of 165 mg after 654 days of follow-up. The MDT of this nodule was calculated to be 450 days. pGGN, pure ground-glass nodule; MIA, minimally invasive adenocarcinoma; MDT, mass doubling time.

Figure 8 Male, initially diagnosed at 54 years old, has pGGN in the right upper lobe of the lung. Postoperative pathological diagnosis showed that the nodule was MIA. (A,B) The baseline volume measured by Radiomics software was 262 mm³, with a mass of 83 mg. (C,D) Radiomics software measured a volume of 422 mm³ and a mass of 138 mg after 326 days of follow-up. The MDT of this nodule was calculated to be 441 days. pGGN, pure ground-glass nodule; MIA, minimally invasive adenocarcinoma; MDT, mass doubling time.

Comparison of basic clinical characteristics between noninvasive lesions and invasive lesions

The statistical results for the follow-up interval and MDT are presented in Table 3, with both measures reported as median and interquartile range. There was no statistically significant difference observed between the noninvasive lesion and invasive lesion groups (P>0.05).

Table 3

Comparison of follow-up intervals and MDT between noninvasive lesions and invasive lesions

Feature	Noninvasive lesions (n=75)	Invasive lesions (n=37)	Statistical value	P value
Follow-up interval (days)	235 (150, 433.5)	216 (125, 645)	–	0.75
MDT (days)	1,047 (156.3, +∞)	925 (172.0, +∞)	–	0.92

Data are presented as median (25th, 75th percentile). –, Mann-Whitney U test; +∞ indicates that the mass did not increase during follow-up, so the MDT is infinite. MDT, mass doubling time.

Following previous studies (17), we classified the nodules as stable or unstable using three MDT diagnostic thresholds (813, 1,026, and 1,170 days); a nodule with an MDT above the threshold was considered stable. The statistical results are shown in Table 4.

Table 4

Statistics of cases of stable and unstable nodules

Group	B₁	B₂	B₃
Diagnostic threshold MDT (days)	813	1,026	1,170
Stable nodules, n (%)	63 (56.3)	56 (50.0)	53 (47.3)
Unstable nodules, n (%)	49 (43.7)	56 (50.0)	59 (52.7)
Total, n	112	112	112

Data are presented as n or n (%). B₁ contains three first-order and three texture features; B₂ contains three first-order and three texture features; B₃ contains 2 first-order and 4 texture features. MDT, mass doubling time.
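The threshold rule behind Table 4 can be sketched as follows. This is a minimal illustration with hypothetical names; it assumes that a nodule is labelled stable when its MDT exceeds the threshold of the corresponding group and unstable otherwise, with an infinite MDT (no mass increase) therefore counted as stable.

```python
import math

# MDT diagnostic thresholds (days) of groups B1, B2 and B3, as listed in Table 4.
THRESHOLDS_DAYS = {"B1": 813, "B2": 1026, "B3": 1170}


def stability_labels(mdt_days: float) -> dict:
    """Label one nodule under each threshold: stable if MDT > threshold."""
    return {
        group: "stable" if mdt_days > threshold else "unstable"
        for group, threshold in THRESHOLDS_DAYS.items()
    }


# MDT values reported for the nodules in Figures 6 and 7, plus a non-growing nodule.
for mdt in (1067, 450, math.inf):
    print(mdt, stability_labels(mdt))
```

The nodule in Figure 6 (MDT 1,067 days) shows why the choice of threshold matters: under this rule it is stable for B₁ and B₂ but unstable for B₃.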

Extraction and selection of radiomic features and establishment of radiomics labels

Using the diagnostic thresholds for unstable nodules at MDT values of 813, 1,026, and 1,170 days, the feature importance ranking based on the Gini index of the random forest models is presented in Figure 9. The top six features were selected and are presented in Table 5. Support vector machines were then used to establish radiomics models B₁, B₂, and B₃ for predicting instability.

Figure 9 Feature importance ranking of groups B₁ (A), B₂ (B), and B₃ (C), sorted by the magnitude of the Gini index based on random forest modelling. B₁ contains three first-order and three texture features; B₂ contains three first-order and three texture features; B₃ contains 2 first-order and 4 texture features.

Table 5

Radiomics features included in radiomics models B for predicting instability

Radiomics models	The most relevant feature
Radiomics model B₁	wavelet.LHH-glcm-ClusterShade
	wavelet.LLH-glcm-ClusterShade
	wavelet.HLH-firstorder-Median
	original-gldm-DependenceVariance
	wavelet.HHH-firstorder-Skewness
	wavelet.LHL-firstorder-10Percentile
Radiomics model B₂	wavelet.HHH-firstorder-Median
	wavelet.LLH-glszm-SizeZoneNonUniformityNormalized
	wavelet.HHL-firstorder-Median
	wavelet.LHH-glcm-ClusterShade
	wavelet.HLH-glcm-ClusterShade
	wavelet.LHL-firstorder-Kurtosis
Radiomics model B₃	wavelet.LLH-glcm-ClusterShade
	wavelet.LHH-glcm-ClusterShade
	wavelet.LHH-glszm-ZoneVariance
	wavelet.HLH-firstorder-Median
	wavelet.HLH-glcm-MCC
	wavelet.LLH-firstorder-Skewness

Model B₁ contains three first-order and three texture features; model B₂ contains three first-order and three texture features; model B₃ contains 2 first-order and 4 texture features.

The ROC curve analysis of the different models (models B₁, B₂ and B₃) is shown in Table 6 and Figure 10. The highest accuracy and AUC values were observed for model B₁, in which the MDT threshold was set at 813 days. The output of model B₁ showed good agreement with the actual incidence in the data, as demonstrated in Figure 11.

Table 6

Analysis of ROC curves for different radiomics models predicting instability

Radiomics model	Sensitivity	Specificity	Accuracy	Kappa	AUC (95% CI)	Averaged AUC (95% CI)
B₁	0.71	0.83	0.78	0.54	0.89 (0.83–0.94)	0.72 (0.62–0.81)
B₂	0.54	0.77	0.65	0.30	0.80 (0.72–0.88)	0.62 (0.51–0.72)
B₃	0.76	0.60	0.69	0.37	0.81 (0.73–0.89)	0.67 (0.57–0.78)

Model B₁ contains three first-order and three texture features; model B₂ contains three first-order and three texture features; model B₃ contains 2 first-order and 4 texture features. ROC, receiver operating characteristic; AUC, area under the curve; CI, confidence interval.
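The accuracy and kappa columns of Table 6 can be cross-checked against the sensitivity, specificity, and group sizes. The sketch below is an illustrative reconstruction rather than the authors' computation: the function names are hypothetical, unstable nodules are assumed to be the positive class, the class counts are taken from Table 4, and the confusion matrix is rebuilt by rounding, so small differences from the published values are to be expected.

```python
def confusion_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Rebuild (TP, FN, TN, FP) from rounded rates and class sizes."""
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    return tp, n_pos - tp, tn, n_neg - tn


def accuracy_and_kappa(tp, fn, tn, fp):
    """Observed accuracy and Cohen's kappa for a 2x2 confusion matrix."""
    n = tp + fn + tn + fp
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predicted and actual classes.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return observed, (observed - expected) / (1 - expected)


# (sensitivity, specificity, unstable n, stable n) per model, from Tables 4 and 6.
MODELS = {
    "B1": (0.71, 0.83, 49, 63),
    "B2": (0.54, 0.77, 56, 56),
    "B3": (0.76, 0.60, 59, 53),
}

results = {
    name: accuracy_and_kappa(*confusion_from_rates(*params))
    for name, params in MODELS.items()
}
for name, (acc, kappa) in results.items():
    print(name, round(acc, 2), round(kappa, 2))
```

Under these assumptions the reconstructed accuracy and kappa stay within 0.01 of the values listed in Table 6 for all three models.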

Figure 10 ROC curves for identifying stable and unstable nodules using radiomics models B₁ (A,B), B₂ (C,D), and B₃ (E,F). For each model, the first panel shows the model tested on the training set and the second panel shows the result using 50 resamples. ROC, receiver operating characteristic.

Figure 11 The relationship between the predicted results of radiomics model B₁ and the calculated occurrence of unstable nodules. The X-axis represents the incidence of instability predicted by the radiomics model, and the Y-axis represents the actual calculated incidence of unstable nodules. The gray solid line represents the ideal prediction line of the model, and the black solid line represents the actual prediction results of the model.

Discussion

Currently, with the widespread application of low-dose multislice spiral computed tomography (LDCT) and the gradual enhancement of general public’s awareness of physical examination, an increasing number of small intrapulmonary nodules have been found. However, the traditional imaging modalities for judging benign and malignant nodules are greatly influenced by the subjective impact of the diagnostic physician, and when the nodules are too small or the imaging characteristics are not obvious, the diagnostic accuracy will significantly decrease. Several studies have shown that compared to traditional CT signs, radiomics has a higher accuracy in differentiating benign and malignant GGNs and predicting the pathological classification of GGN-type lung adenocarcinoma (7,9). In this study, a computer-based model was established to analyze GGNs to predict the invasion and instability of GGN-type lung adenocarcinoma.

A total of 249 patients with 298 pathologically confirmed GGN lesions were included in this study, comprising 177 noninvasive lesions (25 AAH, 20 AIS, 132 MIA) and 121 invasive lesions (IAC). The mean ages of patients with noninvasive and invasive lesions were 51.6±10.4 and 56.9±10.1 years, respectively, a statistically significant difference. Patients with invasive lesions were older than those with noninvasive lesions, which is consistent with previous reports (18). Although female patients outnumbered male patients overall, there was no statistically significant difference in sex between the groups, and the invasiveness of the lesions was not correlated with patient sex. In addition, 62.4% (186/298) of the lesions were located in the upper lobe, possibly because this site is more exposed to inhaled carcinogens (19).

After univariate analysis and removal of redundant information, four radiomics features were selected: “wavelet.HLH_glcm_MCC”, “wavelet.HLH_firstorder_maximum”, “original_glcm_correlation”, and “origin_shape_compactness1”. These comprise one first-order feature (the maximum histogram value of the wavelet-filtered image), one shape feature (compactness 1 of the original image), and two texture features [the maximum correlation coefficient (MCC) of the gray level co-occurrence matrix of the wavelet-filtered image and the correlation of the gray level co-occurrence matrix of the original image]. Together they constitute a logistic model for distinguishing noninvasive from invasive lesions among GGNs. Radiomics model A, composed of these features, showed excellent diagnostic performance in both the training and validation sets (AUC >0.85), with a maximum sensitivity of 0.75 and a maximum specificity of 0.90. This is consistent with the view that, compared with conventional imaging diagnosis, which relies mainly on diameter measurements to evaluate pulmonary nodules, the radiomics model performs three-dimensional measurement by counting the voxels within the region of interest (ROI) on each axial slice, yielding data that are closer to the actual nodule volume and more reliable for quantifying solid components (20).
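To illustrate how a logistic model of this kind turns feature values into a probability of invasion, a minimal Python sketch is given below. It is not the fitted model A: the function name, the coefficients, the intercept, and the feature values are hypothetical placeholders chosen for illustration, and the features are assumed to have been standardized beforehand. Only the four feature names are taken from the text.

```python
import math

def logistic_probability(features, coefficients, intercept):
    """Probability from a logistic model: sigmoid(intercept + sum(coef * value))."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    # Numerically stable sigmoid (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical coefficients and intercept (NOT the values fitted in this study).
coefficients = {
    "wavelet.HLH_glcm_MCC": 0.8,
    "wavelet.HLH_firstorder_maximum": 0.5,
    "original_glcm_correlation": 0.6,
    "origin_shape_compactness1": -0.7,
}
intercept = -0.4

# Hypothetical standardized feature values for one lesion.
lesion = {
    "wavelet.HLH_glcm_MCC": 1.2,
    "wavelet.HLH_firstorder_maximum": 0.3,
    "original_glcm_correlation": 0.9,
    "origin_shape_compactness1": -0.5,
}

p_invasive = logistic_probability(lesion, coefficients, intercept)
print(f"Predicted probability of invasion: {p_invasive:.2f}")
```

A lesion would then be classified as invasive when the probability exceeds a chosen cutoff; the cutoff determines the trade-off between sensitivity and specificity.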

With the increase in the detection rate of pulmonary nodules, the generally accepted management for GGNs with a low risk of malignancy is regular CT follow-up to assess their growth characteristics. Doubling time (DT) refers to the time required for tumor volume or cell number to double; it reflects the activity and invasiveness of tumor cells and is an important indicator of tumor growth (21). The literature reports that volume and mass are quantifiable indicators that reflect the growth of pulmonary nodules more sensitively and reproducibly (22,23). The volume doubling time (VDT) is calculated as follows:

$VDT = \log 2 \times T / \log (V_{i} / V_{0})$ [4]

$V_{i}$ and $V_{0}$ are the GGN volumes at the last (preoperative) and first CT examinations, respectively, and $T$ is the follow-up interval between them. The traditional two-dimensional method first measures the diameter of a nodule and then calculates its volume using a sphere or ellipsoid formula. However, both accuracy and repeatability are poor, and the error is greater when the nodule shape is irregular. In this study, based on the difference in CT values between nodules and the surrounding lung tissue, threshold segmentation with manual correction was used to delineate the nodules, and the voxels within each nodule were counted. Converting the size and number of voxels within a nodule into a volume gave results that were more consistent with the actual nodule volume, reduced interobserver bias, and had better clinical applicability. This was also confirmed in the study by Yankelevitz et al. (24).
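The voxel-based volume and Eq. [4] can be sketched in a few lines of Python. This is an illustrative sketch rather than the software used in this study: the function names, voxel counts, voxel spacing, and follow-up interval are hypothetical, and returning infinity for an unchanged volume is an assumed convention.

```python
import math

def nodule_volume_mm3(voxel_count, spacing_mm):
    """Volume of a segmented nodule: number of voxels times the volume of one voxel."""
    sx, sy, sz = spacing_mm
    return voxel_count * sx * sy * sz

def volume_doubling_time(v0, vi, interval_days):
    """VDT = T * log(2) / log(Vi / V0), following Eq. [4]."""
    if v0 <= 0 or vi <= 0:
        raise ValueError("volumes must be positive")
    if vi == v0:
        return math.inf  # assumed convention: no measurable change
    return interval_days * math.log(2) / math.log(vi / v0)

# Hypothetical example: the segmented voxel count doubles over one year.
spacing = (0.5, 0.5, 1.0)                 # voxel size in mm (x, y, z)
v0 = nodule_volume_mm3(1000, spacing)     # first CT examination
vi = nodule_volume_mm3(2000, spacing)     # last (preoperative) examination
vdt = volume_doubling_time(v0, vi, 365)
print(f"V0 = {v0} mm^3, Vi = {vi} mm^3, VDT = {vdt:.0f} days")
```

Note that a nodule that shrinks between examinations gives a negative value, which has to be handled separately when nodules are classified by doubling time.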

Previous research has found that the doubling time calculated during follow-up varies with the density of the pulmonary nodule at first diagnosis. Yuan et al. (17) followed 82 cases of small lung cancer screened by LDCT for 3 years and found that the VDTs of solid nodules, mixed ground-glass nodules (mGGNs), and pure ground-glass nodules (pGGNs) were 149±125, 457±260, and 813±375 days, respectively. Oda et al. (22) showed that the average VDTs for mGGN and pGGN based on 3D volume measurements were 276.9±155.9 and 628.5±404.2 days, respectively. However, for the large number of GGNs detected on screening, multiple studies have shown that the growth of nonsolid nodules is manifested not only as an increase in volume but also as a change in density (25,26). In some GGNs, the initial growth pattern is only an increase in density within the lesion, with no change in volume. Kakinuma et al. (27) found that even when the volume of an mGGN was stable, an increase in the solid portion often suggested malignancy. Moreover, the growth of some tumors may be accompanied by collapse of the alveolar spaces, so that some GGNs show a slight reduction in volume (5). Therefore, in recent years, attention has turned to MDT as a new, reliable indicator (16,28). Because MDT combines changes in the internal density of the GGN with the increase or decrease in its extent, it is more sensitive than VDT. All lesions in this study were nonsolid (pGGN and mGGN), so MDT was chosen to track changes in the lesions over time.
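Why MDT can detect growth that VDT misses is shown by the following minimal Python sketch. It assumes the commonly used approximation that nodule mass equals volume multiplied by (mean CT value + 1,000)/1,000, which may differ in detail from the calculation used in this study; the function names and all numerical values are hypothetical.

```python
import math

def nodule_mass_mg(volume_mm3, mean_hu):
    """Assumed approximation: density (mg/mm^3) is about (HU + 1000) / 1000."""
    return volume_mm3 * (mean_hu + 1000.0) / 1000.0

def doubling_time(first, last, interval_days):
    """DT = T * log(2) / log(last / first); works for volume (VDT) or mass (MDT)."""
    if first <= 0 or last <= 0:
        raise ValueError("measurements must be positive")
    if first == last:
        return math.inf  # assumed convention: no measurable change
    return interval_days * math.log(2) / math.log(last / first)

# Hypothetical GGN: volume stays at 500 mm^3 while mean density rises
# from -700 HU to -400 HU over 400 days.
v_first, v_last = 500.0, 500.0
m_first = nodule_mass_mg(v_first, -700.0)   # 150 mg
m_last = nodule_mass_mg(v_last, -400.0)     # 300 mg

vdt = doubling_time(v_first, v_last, 400)   # infinite: volume unchanged
mdt = doubling_time(m_first, m_last, 400)   # finite: mass has doubled
print(f"VDT = {vdt}, MDT = {mdt:.0f} days")
```

In this example the volume-based measure reports no growth, whereas the mass-based measure captures the increase in density.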

This study selected 112 eligible lesions from the 298 GGN lesions, with a median MDT of 539 days. Comparison of noninvasive and invasive lesions showed no significant difference in follow-up interval. Previous studies have shown that DT has some value in differentiating the pathological subtypes of invasive lung adenocarcinoma (16); however, this study found no statistically significant difference in MDT between noninvasive and invasive lesions. There are currently relatively few reports on DT across the pathological subtypes of invasive lung adenocarcinoma, and future studies with larger samples are needed to explore the correlation between these subtypes and MDT.

To explore the relationship between MDT and the prognosis of patients with lung cancer after surgery, 78 relevant features were extracted from the first diagnostic images and subjected to dimensionality reduction. For each of the MDT diagnostic thresholds for unstable nodules (813, 1,026, and 1,170 days), the six features with the highest correlation were selected, and radiomics models B₁, B₂, and B₃ were constructed using support vector machines. Models B₁ and B₂ each contain three first-order and three texture features, and model B₃ contains two first-order and four texture features; all three models contain the first-order median feature and the texture ClusterShade feature. The ROC curves of the three radiomics models were analyzed and compared. All three models showed good diagnostic efficacy in predicting the instability of nodules (AUC ≥0.80), and radiomics model B₁ had the highest diagnostic efficacy (AUC =0.89), with higher specificity (0.83), accuracy (0.78), and sensitivity (0.71). This study concluded that the identification of instability in GGNs was most effective when the diagnostic threshold for MDT was set to 813 days.
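How the instability labels relate to the reported performance measures can be illustrated with a minimal Python sketch. It does not reproduce the support vector machine used in this study: the function names, the example MDT values, the model scores, the 0.5 cutoff, and the rule that only nodules with 0 < MDT < threshold are labeled unstable are all assumptions made for illustration.

```python
def label_unstable(mdt_days, threshold_days=813):
    """Return 1 (unstable) if 0 < MDT < threshold, else 0 (assumed rule)."""
    return 1 if 0 < mdt_days < threshold_days else 0

def roc_auc(labels, scores):
    """AUC as the probability that an unstable nodule outscores a stable one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def confusion_metrics(labels, scores, cutoff=0.5):
    """Sensitivity, specificity, and accuracy at a given score cutoff."""
    preds = [1 if s >= cutoff else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes are required")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }

# Hypothetical MDT values (days) and model scores for six nodules.
mdt_days = [300, 500, 900, 1500, 700, 2000]
scores = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]

labels = [label_unstable(m) for m in mdt_days]   # threshold of 813 days
auc = roc_auc(labels, scores)
metrics = confusion_metrics(labels, scores)
print(labels, round(auc, 2), metrics)
```

Changing `threshold_days` to 1,026 or 1,170 relabels some nodules as unstable, which is why each threshold requires its own model and gives a different ROC curve.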

Limitations

This study has several limitations. First, it is a retrospective study, so some selection bias in the included data is unavoidable. Second, a semiautomatic segmentation method was used, and the subjective judgment of the operator affects the segmentation of the lesion to some extent; multiple tests were conducted to minimize this bias. Third, for some nodules located adjacent to the pleura, the surrounding pleura was not included when the ROI boundary was delineated, so characteristic information related to the pleura may not be captured by the radiomics features and requires further exploration. Finally, radiomics model A for predicting GGN invasion does not incorporate clinical or CT morphological features; these should be added in future research, and the effectiveness of the model needs further testing in multicenter studies.

Conclusions

This study retrospectively analyzed the basic clinical and CT texture characteristics of patients with GGN-type lung adenocarcinoma using radiomics analysis to establish radiomics models and predict the possibility of pathological invasion and instability of GGNs. There was a statistically significant difference in age and distribution of lesions in the lung between the noninvasive lesion and invasive lesion groups. The radiomics model can predict the invasion of GGN-type lung adenocarcinoma. There was no significant difference in MDT between noninvasive lesions and invasive lesions. The radiomics model can predict the instability of GGN-type lung adenocarcinoma. When the MDT threshold was set to 813 days, the model had higher specificity, accuracy, and diagnostic efficiency.

For pulmonary nodules initially diagnosed as GGN, and without the assistance of follow-up CT data, this study constructed radiomics model B₁ to predict the probability of nodule instability, that is, whether the MDT is less than the set threshold of 813 days. The purpose is to provide MDT-related information, through radiomics model B₁, for GGNs that do not exhibit significant malignant signs (such as lesions predicted to be noninvasive by radiomics model A), and to guide clinicians and radiologists in formulating more reasonable follow-up CT schedules and personalized treatment plans.

Acknowledgments

Funding: None.

Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-27/rc

Data Sharing Statement: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-27/dss

Peer Review File: Available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-27/prf

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://jtd.amegroups.com/article/view/10.21037/jtd-24-27/coif). X.Y.C. and J.Z.J. are employees of Siemens Healthineers company. The other authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). The study received approval from the Institutional Ethics Committee of West China Hospital of Sichuan University (approval No. 2022-564). Due to the retrospective nature of the study, the need for informed consent was waived by the ethics committee of West China Hospital of Sichuan University.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Bray F, Ferlay J, Soerjomataram I, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394-424.
2. Austin JH, Müller NL, Friedman PJ, et al. Glossary of terms for CT of the lungs: recommendations of the Nomenclature Committee of the Fleischner Society. Radiology 1996;200:327-31.
3. Sawada S, Komori E, Nogami N, et al. Evaluation of lesions corresponding to ground-glass opacities that were resected after computed tomography follow-up examination. Lung Cancer 2009;65:176-9.
4. Shimizu K, Ikeda N, Tsuboi M, et al. Percutaneous CT-guided fine needle aspiration for lung cancer smaller than 2 cm and revealed by ground-glass opacity at CT. Lung Cancer 2006;51:173-9.
5. Rizzo S, Botta F, Raimondi S, et al. Radiomics: the facts and the challenges of image analysis. Eur Radiol Exp 2018;2:36.
6. Conti A, Duggento A, Indovina I, et al. Radiomics in breast cancer classification and prediction. Semin Cancer Biol 2021;72:238-50.
7. Rozynek M, Tabor Z, Kłęk S, et al. Body composition radiomic features as a predictor of survival in patients with non-small cellular lung carcinoma: A multicenter retrospective study. Nutrition 2024;120:112336.
8. Libling WA, Korn R, Weiss GJ. Review of the use of radiomics to assess the risk of recurrence in early-stage non-small cell lung cancer. Transl Lung Cancer Res 2023;12:1575-89.
9. Fan L, Fang M, Li Z, et al. Radiomics signature: a biomarker for the preoperative discrimination of lung invasive adenocarcinoma manifesting as a ground-glass nodule. Eur Radiol 2019;29:889-97.
10. Granata V, Fusco R, Setola SV, et al. CT-Based Radiomics Analysis to Predict Histopathological Outcomes Following Liver Resection in Colorectal Liver Metastases. Cancers (Basel) 2022;14:1648.
11. Rami-Porta R, Bolejack V, Crowley J, et al. The IASLC Lung Cancer Staging Project: Proposals for the Revisions of the T Descriptors in the Forthcoming Eighth Edition of the TNM Classification for Lung Cancer. J Thorac Oncol 2015;10:990-1003.
12. Borczuk AC, Cooper WA, Dacic S, et al. WHO classification of tumours of thoracic tumours. 5th edition. Lyon: IARC Press; 2021:1-565.
13. Travis WD, Brambilla E, Noguchi M, et al. International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society: international multidisciplinary classification of lung adenocarcinoma: executive summary. Proc Am Thorac Soc 2011;8:381-5.
14. Kao TN, Hsieh MS, Chen LW, et al. CT-Based Radiomic Analysis for Preoperative Prediction of Tumor Invasiveness in Lung Adenocarcinoma Presenting as Pure Ground-Glass Nodule. Cancers (Basel) 2022;14:5888.
15. Liao JH, Amin VB, Kadoch MA, et al. Subsolid pulmonary nodules: CT-pathologic correlation using the 2011 IASLC/ATS/ERS classification. Clin Imaging 2015;39:344-51.
16. Song YS, Park CM, Park SJ, et al. Volume and mass doubling times of persistent pulmonary subsolid nodules detected in patients without known malignancy. Radiology 2014;273:276-84.
17. Yuan M, Zhao Y, Arkenau HT, et al. Signal pathways and precision therapy of small-cell lung cancer. Signal Transduct Target Ther 2022;7:187.
18. Yang L, Zhang Q, Bai L, et al. Assessment of the cancer risk factors of solitary pulmonary nodules. Oncotarget 2017;8:29318-27.
19. Winer-Muram HT, Jennings SG, Tarver RD, et al. Volumetric growth rate of stage I lung cancer prior to treatment: serial CT scanning. Radiology 2002;223:798-805.
20. Bae KT, Fuangtharnthip P, Prasad SR, et al. Adrenal masses: CT characterization with histogram analysis method. Radiology 2003;228:735-42.
21. Qu R, Ye F, Hu S, et al. Distinct cellular immune profiles in lung adenocarcinoma manifesting as pure ground glass opacity versus solid nodules. J Cancer Res Clin Oncol 2023;149:3775-88.
22. Oda S, Awai K, Murao K, et al. Volume-doubling time of pulmonary nodules with ground glass opacity at multidetector CT: Assessment with computer-aided three-dimensional volumetry. Acad Radiol 2011;18:63-9.
23. Oda S, Awai K, Murao K, et al. Computer-aided volumetry of pulmonary nodules exhibiting ground-glass opacity at MDCT. AJR Am J Roentgenol 2010;194:398-406.
24. Yankelevitz DF, Yip R, Smith JP, et al. CT Screening for Lung Cancer: Nonsolid Nodules in Baseline and Annual Repeat Rounds. Radiology 2015;277:555-64.
25. Mazzone PJ, Lam L. Evaluating the Patient With a Pulmonary Nodule: A Review. JAMA 2022;327:264-73.
26. Rampinelli C, Origgi D, Bellomi M. Low-dose CT: technique, reading methods and image interpretation. Cancer Imaging 2013;12:548-56.
27. Kakinuma R, Ohmatsu H, Kaneko M, et al. Progression of focal pure ground-glass opacity detected by low-dose helical computed tomography screening for lung cancer. J Comput Assist Tomogr 2004;28:17-23.
28. de Hoop B, Gietema H, van de Vorst S, et al. Pulmonary ground-glass nodules: increase in mass as an early indicator of growth. Radiology 2010;255:199-206.

Cite this article as: Zhang WZ, Zhang YY, Yao XL, Li PL, Chen XY, He LY, Jiang JZ, Yu JQ. Computed tomography radiomics study of invasion and instability of lung adenocarcinoma manifesting as ground glass nodule. J Thorac Dis 2024;16(6):3828-3843. doi: 10.21037/jtd-24-27
