Lung cancer staging: the value of PET depends on the clinical setting
Introduction
Positron emission tomography (PET)—more recently PET with integrated computed tomography (PET/CT)—has become a cornerstone in evaluating patients with lung cancer. This paper focuses on the impact of PET in the pretreatment evaluation of patients with suspected lung cancer. One can look at specific components of this process, such as diagnosis or identification of nodal or distant metastases, but the clinical value is determined by the overall impact of PET on the accuracy of pre-treatment patient evaluation (i.e., definition of disease extent).
Randomized controlled trials (RCTs) assessing the value of PET in patients with known or suspected lung cancer have yielded somewhat inconsistent results. The RCTs suggest that upfront PET is similarly efficient compared to traditional staging (1) and that adding PET to traditional staging does (2) or does not (3) identify more metastases, and does (1,4,5) or does not (3) reduce the number of so-called “futile” thoracotomies. PET does (1) or does not (2,4,5) reduce the rate of invasive mediastinal staging. Furthermore, PET does not seem to affect the overall survival of patients (2,5).
The inconsistent RCT results leave the exact value of PET unclear. There are significant differences in the design of these trials, the patients included, and the outcomes assessed. This paper explores the results and differences between these trials in order to achieve a better understanding of the role of PET and factors that influence this in patients with lung cancer.
Methods
We chose a realist synthesis method as most appropriate to develop an understanding of factors influencing the impact of PET (what works, for whom, in which setting, why and how?) (6-8). The realist method considers that implementation of an intervention may yield slightly different results depending on the context. This approach combines theoretical understanding and available data to reach a deeper understanding of how an intervention produces the observed outcomes, with a focus on explaining the relationship between the context in which the intervention is applied, the mechanisms by which it works and the outcomes which are produced. This review was conducted following the RAMESES publication standards for realist syntheses (Figure S1) (7).
Preliminary scoping of the literature used the recent extensive systematic review conducted for the ACCP Lung Cancer Guidelines (9). We limited the focus to RCTs because, within each trial, the comparison of PET vs. traditional staging is not confounded, while comparing across studies still allows analysis of contextual factors.
We performed a MEDLINE search [1990-2013] for English language papers assessing the utility of PET or PET/CT in the pre-treatment evaluation of patients with known or suspected lung cancer in an RCT (details available on request). We excluded studies of PET for chemotherapy response, RT treatment planning or restaging after induction therapy. We did not attempt to identify unpublished studies. There was no funding support and no involvement of other people or organizations.
We found five RCTs, involving 1,362 patients (Table S1) (1-5,10). Data were abstracted regarding study design, end-points, PET technology, scan interpretation, patient characteristics, and details of pre-enrollment and post-enrollment further studies. Slight discrepancies were noted between two papers reporting on one study; data from the later publication were chosen after communication with the authors (2,10).
The selected studies used various denominators and endpoints, including detection of mediastinal or distant metastases, early recurrence and appropriateness of resection. To enhance comparability, we abstracted raw data and calculated outcomes consistently across the studies on an intent-to-treat basis, for all enrolled eligible patients.
We calculated results according to parameters that could be affected by PET (i.e., whether PET identifies more patients with benign disease, N2,3 or M1 involvement preoperatively vs. intra-operatively or within 1 year). We excluded endpoints which PET is unlikely to impact (e.g., “unresectability” or T4) although we show these results when reported. We excluded outcomes such as unrelated death within 1 year, because it seems inappropriate to expect PET to predict unrelated events.
We defined stage-inappropriate resection as surgery for patients with benign lesions or with N2,3 or M1 involvement (although we acknowledge that exceptions exist and sometimes surgery may be considered appropriate). The results of the Viney et al. study (3) are presented as if N2 involvement was a contraindication for surgery (contrary to the study authors' policy) in order to be consistent with the general view and with the other studies. Avoidance of inappropriate surgery is important, but so is avoidance of inappropriately missed resection due to falsely interpreted preoperative staging. Because this was not explicitly reported, we estimated the risk of missed stage-appropriate resection from the incidence of positive PET results subsequently shown to be false-positives.
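The two derived measures defined above can be illustrated with a minimal sketch. The class name, field names and all counts below are hypothetical and are not taken from any of the RCTs. The denominator for the missed-resection estimate is an assumption here (all enrolled eligible patients in the arm, in keeping with the intent-to-treat approach); the text above does not state it explicitly.

```python
from dataclasses import dataclass


@dataclass
class ArmCounts:
    """Hypothetical raw counts for one randomized arm (illustrative only).

    The three 'resected_*' categories are assumed to be mutually exclusive,
    so that each patient is counted at most once.
    """
    enrolled: int                # all enrolled eligible patients (intent-to-treat denominator)
    resected_benign: int         # surgery for a lesion that proved to be benign
    resected_n23: int            # surgery with N2,3 involvement not recognized preoperatively
    resected_m1: int             # surgery with M1 disease (or recurrence within 1 year)
    pet_false_positive: int = 0  # PET findings suggesting N2,3/M1 later shown to be false


def stage_inappropriate_rate(arm: ArmCounts) -> float:
    """Fraction of enrolled patients who had a stage-inappropriate resection."""
    inappropriate = arm.resected_benign + arm.resected_n23 + arm.resected_m1
    return inappropriate / arm.enrolled


def missed_resection_risk(arm: ArmCounts) -> float:
    """Estimated risk of missed stage-appropriate resection.

    Assumption: false-positive PET findings divided by all enrolled patients.
    """
    return arm.pet_false_positive / arm.enrolled


# Made-up example arm, for illustration only
example_arm = ArmCounts(
    enrolled=100,
    resected_benign=3,
    resected_n23=5,
    resected_m1=4,
    pet_false_positive=6,
)

inappropriate_rate = stage_inappropriate_rate(example_arm)
missed_risk = missed_resection_risk(example_arm)
print(f"stage-inappropriate resection rate: {inappropriate_rate:.2f}")
print(f"estimated missed-resection risk: {missed_risk:.2f}")
```

Using the same denominator for both measures keeps them directly comparable within an arm, so that a reduction in stage-inappropriate resection can be weighed against the estimated risk of missed resection.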
We assigned a qualitative assessment in each study to patient characteristics and the extent of pre- or post-enrollment but preoperative testing for N2,3 or M1 disease in order to facilitate evaluation of how these factors influenced the study outcomes (a quantitative assessment was not possible). Because of heterogeneity in patient and study design characteristics, a formal meta-analysis or calculation of summary statistics across all studies is not appropriate.
Results
Study characteristics
End-points
The primary outcome in the PLUS and Fischer studies was “futile thoracotomies”, defined as preoperatively unrecognized benign disease, N2,3 or M1 involvement, or recurrence or any death within one year (related or not) (4,5). Viney et al. also lists the thoracotomy rate as the primary endpoint, but effectively it is the identification of distant metastases (those with suspected N2 involvement underwent thoracotomy nevertheless) (3). The Maziak study assessed the percentage correctly and incorrectly upstaged, as well as incorrect understaging (but not correct downstaging) by PET/CT vs. traditional staging (2). Herder assessed whether PET as the first test reduced the number of tests/procedures to finalize staging and define operability, with secondary endpoints of work-up duration, morbidity and costs (1).
Patient characteristics
The RCTs all enrolled patients deemed potential candidates for surgical resection (Table 1), but according to varying criteria. Some studies (1,4) included many patients with significant weight loss—generally considered a marker of distant metastasis. Some studies (4,5) included many patients with clinical evidence of mediastinal node involvement. Merging such features yields a qualitative assessment of the risk of advanced disease, which varies markedly (Table 1).
In most studies, the vast majority of patients had lung cancer—either as demonstrated by subsequent work-up or mandated by biopsy prior to study entry. Only the Herder study included many patients (47%) who did not have lung cancer (1).
Patient entry into the RCTs varied markedly: from being referred by a general practitioner (GP) on the basis of a chest radiograph (CXR) alone (1), to a prerequisite of biopsy proof, specialist evaluation and extensive traditional imaging (3). A qualitative assessment of the extent of pre-enrollment evaluation can be assigned (see the last column in Table 1).
Thoroughness of preoperative staging
The extensiveness of staging procedures to rule out mediastinal and distant metastases differs markedly between the studies (Table 2). The value of PET may be different if all or few patients undergo further testing for N2,3 or M1 disease. The extent of further testing was generally similar between the PET and traditional work-up arms, with two exceptions. In the Herder study mediastinoscopies were done less often in the PET arm (13% vs. 34%) (1). The Maziak study mandated liver/adrenal CT and bone scans only in the traditional arm (2).
Potential factors influencing the impact of PET include scanner technology, interpretation quality, and the extent of confirmation of abnormal findings. These factors also varied between studies (Table 2). The thoroughness of the PET interpretation is patterned according to a proposed scale (11). In most studies only a few institutions performed PET scans for all participating sites. This concentrated experience may affect the quality of the interpretation. Only in the Maziak study was PET performed in a more disseminated fashion. The thoroughness of the PET and of preoperative testing for N2,3/M1 disease is qualitatively summarized in the last three columns of Table 2.
Outcomes
Preoperative identification of benign disease
The RCTs demonstrate little difference for PET vs. traditional evaluation in identifying benign disease preoperatively vs. intraoperatively (Table 3), but most studies included few patients with benign disease. PET identified more benign disease preoperatively only in the Herder study, which involved patients referred by the GP for potential lung cancer resection based on only a CXR (1). However, this observation is weakened by the fact that there were also more patients overall in the PET arm with benign disease (19% vs. 13%) despite randomization (1).
These results suggest that if the diagnosis of lung cancer is fairly certain, either due to a biopsy result or evaluation by a specialist, there is little benefit to PET to identify benign lesions. However, PET can be helpful to evaluate the primary lesion when there is little specialist involvement and limited diagnostic evaluation.
Preoperative identification of mediastinal node involvement
In an RCT the total number of patients with N2,3 involvement should be similar, but PET might identify N2,3 disease more often preoperatively rather than intra- or post-operatively. These results are summarized in Table S2 and Figure 1. While PET seems beneficial in the Fischer and PLUS studies (4,5), the opposite was true in the Herder study (1). The Viney study provided no data for the traditional staging arm (3). The Maziak study observed fewer patients with intraoperative identification of N2 involvement in the PET arm, but with no corresponding increase of preoperatively identified N2 disease by PET (2,10). Therefore, the imbalance in the overall number of patients with N2 disease between the arms (despite randomization), rather than PET imaging, seems to account for the lower rate of intraoperative discovery of N2.
Comparing studies suggests that PET is beneficial in preoperatively identifying N2,3 involvement when the risk is high and there is a high rate of invasive staging. PET seems to help identify N2,3 disease even when almost all patients undergo mediastinoscopy (perhaps by directing attention to suspicious nodes). When PET is used to decrease the rate of invasive staging, it appears that the risk of intraoperative discovery of N2 involvement is increased, at least in a patient cohort with a moderate incidence of mediastinal nodal disease. When the risk of N2,3 disease is low, PET has little impact on preoperative identification of nodal involvement.
Preoperative identification of distant metastases
Does PET identify more patients with distant metastases preoperatively and reduce the number found to have M1 disease or recurrence within 1 year? This was seen in the Fischer and PLUS studies but not clearly so in the others (Table S3, Figure 2) (1-5).
An unexplained difference between the randomized arms exists in the Maziak study in the number of patients with M1 disease (identified at any time) (2). The discrepancy is opposite in direction to that for N2,3 involvement; however, these differences are not explained by postulating that preoperative identification of M1 disease would obviate the need to identify N2,3 involvement. Furthermore, the incidence of M1 disease is higher than expected from the patients' characteristics. The reason for these findings is unclear.
Thus the RCTs suggest that PET helps identify M1 disease preoperatively when the risk is at least moderate and the thoroughness of searching for M1 disease without PET is low. However, PET has little additional impact if the risk of advanced disease is low or when extensive investigation for M1 disease is already being done.
Avoidance of stage-inappropriate resection
Table S4 summarizes the rate of pre- vs. intra-/post-operative identification of N2,3/M1 involvement. Figure 3 shows the rate of stage-inappropriate resection (defined as surgery for something other than stage I, II NSCLC). PET was beneficial by this assessment in the Fischer and PLUS studies and to a lesser extent in the Maziak trial (2,4,5). In the Fischer and PLUS studies PET lowered the overall rates of surgery, but the other studies found no difference. The discrepancy in the Maziak study between a similar overall rate of surgery, yet fewer stage-inappropriate resections with PET, appears to be due to unequal overall rates of N2,3 and M1 involvement between the arms (which should be similar in a randomized study).
The potential of PET to reduce stage-inappropriate surgery is seen in patients with a high risk of advanced cancer and with relatively little investigation for this without PET. PET has little impact when the risk of M1 disease is low or the rate of traditional investigation is high.
Missed stage-appropriate resection
A balanced assessment requires that the rate of missed stage-appropriate resection is assessed. Unfortunately, none of the RCTs addressed this directly. The potential for missed resection can be estimated from the rate of PET findings suggesting N2,3/M1 disease subsequently found to be false positive. This assessment suggests this risk is not minor, although the rate varies markedly (from 1% to 42%) (Table S4).
Discussion
The literature on PET for lung cancer staging has progressed from dramatic anecdotal PET images of metastases (not necessarily otherwise undetected) to series comparing PET to historical studies and finally RCTs. Clinical guidelines recommend PET in the evaluation of most lung cancer patients (9,12-14).
However, the RCT results are not consistent with respect to the identification of metastasis, the rate of so-called “futile” thoracotomies and the need to perform invasive mediastinal staging. The RCTs also differ in terms of the patients included, the endpoints and the context of the studies. We hoped to clarify how details of the design and conduct of the RCTs affected the results. The realist method is specifically designed to explore differences in context to develop a deeper understanding of what works, for whom, in which setting, why and how (6-8).
Analyzing the reported data relative to consistent specific endpoints allows the RCTs to be compared (Table 4). This analysis suggests several conclusions. First, the benefit of PET to identify benign disease is moderate after evaluation by a generalist and limited investigation but low after a specialist’s evaluation and more extensive pre-enrollment testing.
Second, the benefit of PET in detecting N2,3 or M1 disease is low if the clinical evaluation and chest CT suggest a low risk of metastasis. PET is of little value for identification of N2,3 disease if the rate of invasive staging is low, and it may be detrimental if it is used to lower the rate of invasive mediastinal staging. However, in cohorts with at least a moderate risk of N2,3 involvement, PET appears to be of benefit even if invasive staging is done in most patients (perhaps by directing attention to suspicious nodes).
Finally, PET is of value in identifying M1 disease if the risk is at least moderate and little traditional imaging is done. If the patient undergoes extensive traditional imaging, there is little additional impact of PET in identifying M1 disease.
Taking everything together, PET is useful if there is at least a moderate risk of metastases (N2,3 or M1), the extent of traditional imaging is low, and the rate of invasive staging is high (Figure 4). In these settings PET appears to reduce the rate of stage-inappropriate resection. However, with initial evaluation by a specialist and extensive traditional imaging the impact of adding PET is low. PET does not appear to obviate the need for invasive nodal staging. Furthermore, the rate of potentially misleading false-positive PET results is substantial, suggesting a potential detriment if confirmation of positive PET findings (i.e., mediastinal or distant metastases) is not undertaken diligently.
We avoided the endpoint “futile thoracotomy” because we perceive this to be biased and not conducive to an objective scientific assessment. “Futile” has a strong negative connotation, and is one-sided, ignoring the converse risk of a missed stage-appropriate, potentially curative resection (taken to its extreme, considering only stage-inappropriate resection mandates that all surgery be avoided, because then no stage-inappropriate resection will occur). Furthermore, counting unrelated (random, unpredictable) deaths as “futile” resections seems inappropriate as an outcome to assess pre-treatment evaluation. Finally, in cutting-edge thoracic surgical units the majority of lung cancer resections are performed by thoracoscopy, not thoracotomy.
The quality of the PET imaging and interpretation in the RCTs appears to have been good. We did not find that differences in technical details had an impact (Table 2). The use of PET/CT vs. stand-alone PET does not alter the findings, suggesting that the setting in which PET is implemented may have a greater impact than the technology itself. Furthermore, PET/CT is not universally available: in the US about half of PET imaging involves stand-alone PET. In the RCTs PET imaging was relatively centralized, whereas in the US PET is performed at many smaller institutions and using mobile scanners. What effect this has on the accuracy of interpretation of the PET scans is unknown.
A difficulty in this analysis is the discrepancy in the Maziak study between the randomized arms in the overall rate of N2,3 and M1 disease. There is no discernible explanation for this. This discrepancy drives some of the face-value conclusions of the study, namely that PET identifies more patients with metastases; without this imbalance it is unclear whether this result would hold up.
Outcomes studies suggest PET has an impact in lung cancer, primarily by identifying distant metastases in cIII patients (15-17). Other studies suggest little impact in stage cI patients (15,18). PET detected N2,3 or M1 involvement in 7% of the subset of cI patients in the ACoSOG study, but at a price of falsely suggesting N2,3/M1 disease in 14% (18). This study also found that while PET could reduce the rate of biopsy for benign lesions from 21% to 11%, this would cause missed (or delayed) resection in 13% of cancers (18). Finally, many studies have consistently reported that ~25% of patients with central or cII or cIII tumors by CT harbor N2,3 involvement despite a PET that is negative in the mediastinum (10,19-24). Thus, other (non-RCT) studies corroborate a significant benefit with PET in some clinical settings but also a potential detriment when the false negative and false positive rates of PET in particular clinical settings are not recognized and the PET results are not appropriately confirmed.
Conclusions
This analysis suggests that while PET can be useful, its value depends on many factors. PET appears to be of benefit when the chance of N2,3 or M1 involvement is moderate or high, when the extent of traditional imaging (abdominal/pelvis CT, bone scan) is low, and when the rate of invasive mediastinal staging is high. The impact of PET is low in other settings. If PET is used to avoid invasive mediastinal staging in clinical settings in which the risk of N2,3 involvement is moderately high, PET can lead to lower preoperative and higher intraoperative detection rates of N2 disease. Finally, the data suggest a significant risk of missed curative-intent treatment if positive PET findings are not interpreted carefully. Accurate evaluation is a complex interplay of various clinical aspects (symptoms), risk of metastases, extent of imaging, confirmation of suspicious findings, the thoroughness of intra-operative assessment and of follow-up.
The overriding conclusion of this analysis of RCTs comparing PET to traditional evaluation of lung cancer patients is that the results are dependent on the clinical setting. A blanket recommendation for PET may be too simplistic—various clinical aspects affect the value of PET. A more judicious use of PET may lower costs without negatively impacting the accuracy of evaluation of patients with lung cancer.
Acknowledgements
Disclosure: The authors declare no conflict of interest.
References
- Herder GJ, Kramer H, Hoekstra OS, et al. Traditional versus up-front [18F] fluorodeoxyglucose-positron emission tomography staging of non-small-cell lung cancer: a Dutch cooperative randomized study. J Clin Oncol 2006;24:1800-6. [PubMed]
- Maziak DE, Darling GE, Inculet RI, et al. Positron Emission Tomography in Staging Early Lung Cancer. Ann Intern Med 2009;151:221-8. [PubMed]
- Viney RC, Boyer MJ, King MT, et al. Randomized controlled trial of the role of positron emission tomography in the management of stage I and II non-small-cell lung cancer. J Clin Oncol 2004;22:2357-62. [PubMed]
- van Tinteren H, Hoekstra OS, Smit EF, et al. Effectiveness of positron emission tomography in the preoperative assessment of patients with suspected non-small-cell lung cancer: the PLUS multicentre randomised trial. Lancet 2002;359:1388-93. [PubMed]
- Fischer B, Lassen U, Mortensen J, et al. Preoperative Staging of Lung Cancer with Combined PET-CT. N Engl J Med 2009;361:32-9. [PubMed]
- Wong G, Greenhalgh T, Westhorp G, et al. Realist methods in medical education research: what are they and what can they contribute? Medical Education 2012;46:89-96. [PubMed]
- Wong G, Greenhalgh T, Westhorp G, et al. RAMESES publication standards: realist syntheses. BMC Med 2013;11:21. [PubMed]
- Pawson R, Greenhalgh T, Gill H, et al. Realist review-a new method of systematic review designed for complex policy interventions. J Health Serv Res Policy 2005;10:21-34. [PubMed]
- Silvestri GA, Gonzalez AV, Jantz M, et al. Methods for staging non-small cell lung cancer: Diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest 2013;143:e211S-50S.
- Darling GE, Maziak D, Inculet R, et al. Positron Emission Tomography-Computed Tomography Compared with Invasive Mediastinal Staging in Non-small Cell Lung Cancer-Results of mediastinal staging in the early lung positron emission tomography trial. J Thorac Oncol 2011;6:1367-72. [PubMed]
- Detterbeck F, Puchalski J, Rubinowitz A, et al. Classification of the thoroughness of mediastinal staging of lung cancer. Chest 2010;137:436-42. [PubMed]
- Silvestri GA, Gould MK, Margolis ML, et al. Non-invasive staging of non-small cell lung cancer: ACCP evidenced-based clinical practice guidelines (2nd Edition). Chest 2007;132:178S-201S.
- Crinò L, Weder W, van Meerbeeck J, et al. Early stage and locally advanced (non-metastatic) non-small-cell lung cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann Oncol 2010;21:v103-v115. [PubMed]
- Ettinger DS, Akerley W, Borghaei H, et al. Non-small cell lung cancer. J Natl Compr Canc Netw 2012;10:1236-71. [PubMed]
- Morgensztern D, Goodgame B, Baggstrom M, et al. The effect of FDG-PET on the stage distribution of non-small cell lung cancer. J Thorac Oncol 2008;3:135-9. [PubMed]
- Morgensztern D, Waqar S, Subramanian J, et al. Improving survival for stage IV non-small cell lung cancer: a surveillance, epidemiology, and end results survey from 1990 to 2005. J Thorac Oncol 2009;4:1524-9. [PubMed]
- Morgensztern D, Ng SH, Gao F, et al. Trends in stage distribution for patients with non-small cell lung cancer: a National Cancer Database survey. J Thorac Oncol 2010;5:29-33. [PubMed]
- Kozower BD, Meyers BF, Reed CE, et al. Does Positron Emission Tomography Prevent Nontherapeutic Pulmonary Resections for Clinical Stage IA Lung Cancer? Ann Thorac Surg 2008;85:1166-9. [PubMed]
- Gould MK, Kuschner WG, Rydzak CE, et al. Test performance of positron emission tomography and computed tomography for mediastinal staging in patients with non-small-cell lung cancer: a meta-analysis. Ann Intern Med 2003;139:879-92. [PubMed]
- Pozo-Rodríguez F, Martín de Nicolás JL, Sánchez-Nistal MA, et al. Accuracy of Helical Computed Tomography and [18F] Fluorodeoxyglucose Positron Emission Tomography for Identifying Lymph Node Mediastinal Metastases in Potentially Resectable Non-Small-Cell Lung Cancer. J Clin Oncol 2005;23:8348-56. [PubMed]
- Serra M, Cirera L, Rami-Porta R, et al. Routine positron tomography (PET) and selective mediastinoscopy is as good as routine mediastinoscopy to rule out N2 disease in non-small cell lung cancer (NSCLC). J Clin Oncol 2006;2006:24.
- Verhagen AFT, Bootsma GP, Tjan-Heijnen VCG, et al. FDG-PET in staging lung cancer: How does it change the algorithm? Lung Cancer 2004;44:175-81. [PubMed]
- Cerfolio RJ, Bryant AS, Ojha B, et al. Improving the inaccuracies of clinical staging of patients with NSCLC: a prospective trial. Ann Thorac Surg 2005;80:1207-13; discussion 1213-4. [PubMed]
- Dietlein M, Weber K, Gandjour A, et al. Cost-effectiveness of FDG-PET for the management of potentially operable non-small cell lung cancer; priority for a PET-based strategy after nodal-negative CT results. Eur J Nucl Med 2000;27:1598-609. [PubMed]