Prediction of prognosis of patients with lung cancer in combination with the immune score

Abstract Purpose: The host’s immune response to malignant tumors is fundamental to tumorigenesis and tumor development. The immune score is currently used to assess prognosis and to guide immunotherapy; however, its association with lung cancer prognosis is not clear. Methods: Clinical features and immune score data of lung cancer patients from The Cancer Genome Atlas were obtained to build a clinical prognosis nomogram. The model’s accuracy was verified by calibration curves. Results: In total, 1005 patients with lung cancer were included. Patients were divided into three groups according to low, medium, and high immune scores. Compared with patients in the low immune score group, the disease-free survival (DFS) of patients in the medium and high immune score groups was significantly longer; the hazard ratio (HR) and 95% confidence interval (95% CI) were 0.77 [0.60–0.99] and 0.74 [0.60–0.91], respectively. The overall survival (OS) of patients in the medium and high immune score groups was significantly longer than in the low immune score group; the HR and 95% CI were 0.74 [0.57–0.96] and 0.69 [0.55–0.88], respectively. A clinical prediction model was established to predict the survival prognosis. As verified by calibration curves, the model showed good predictive ability, especially for predicting 3-/5-year DFS and OS. Conclusion: Patients with lung cancer with medium and high immune scores had longer DFS and OS than those in the low immune score group. The clinical prediction model combining clinical features and the immune score effectively predicted patient prognosis, and its predictions were consistent with actual clinical outcomes.


Introduction
Currently, the main prognostic indicators for patients with lung cancer are based on the tumor-node-metastasis (TNM) staging system developed by the American Joint Committee on Cancer/Union for International Cancer Control (AJCC/UICC). This system provides exhaustive guidelines for the prognosis and treatment of patients with cancer. However, in clinical practice, there is a marked difference in the survival prognosis among patients having the same TNM stage. The disease of some patients with advanced lung cancer may remain stable for several years [1,2]. In addition, approximately 10-25% of patients with TNM I/II tumors, who received prompt radical surgical treatment and had no lymph node or distant metastases according to pathological findings, experienced cancer recurrence and rapid progression or even death [2,3]. The reason for these differences is that the TNM assessment system is based solely on the biological behavior of the tumor, without considering the immune response of the host. Indeed, increasing evidence has indicated that cancer progression is greatly influenced by the host's immune response [4][5][6][7]. The impact of the host's immune response on the tumor is mainly manifested by the degree of infiltration of immune cells (mainly cytotoxic cells and memory T lymphocytes) in tumor tissues. Given the diversity of the immune microenvironment in colon cancer, Van den Eynde et al. reported on the key role played by cytotoxic cells and memory T lymphocytes in inhibiting the growth, invasion, and metastasis of primary tumors, and created a scoring system based on immune cell density for scoring immune microenvironments. The authors classified and quantitatively determined the lymphocyte populations infiltrating the tumor, and then used this information to predict the prognosis of patients with colon cancer, including patients with early-stage cancer [8]. The study by Minami et al. revealed the close correlation between immune infiltrating cells and the prognosis of patients with lung cancer. However, these findings have not yet been converted into an effective tool for predicting the survival prognosis of patients with lung cancer [9].
Following a search of the literature, we learned that only a limited number of studies have been published on the assessment of the survival prognosis of patients with lung cancer based on the immune score. Thus, we established a clinical prediction model to estimate the survival prognosis of patients with lung cancer.

Sample inclusion
The cases included in the present study were extracted from The Cancer Genome Atlas (TCGA), which currently contains data on over 200 types of cancer together with the associated clinical information and other information such as DNA methylation and RNA expression; it is the largest cancer genome analysis database in the world [10]. The sample information in the present study was downloaded from the public website of TCGA. The downloaded contents included the identity (ID), sex, age, TNM status, clinical stage of tumor, pathological classification of tumor, disease-free survival (DFS), and overall survival (OS) of patients with lung cancer.
The immune score in TCGA was calculated from tumor gene expression data. The overlapping data that constitute the immune signal were obtained by comparing the gene expression profile of normal hematopoietic tissues with that of other normal cells. The immune score was established to estimate the level of infiltrating immune cells and reflects the degree of infiltration of immune cells in tumor tissues. The basis and method of calculating the immune score using TCGA datasets have been detailed in previous studies [11].

Data preprocessing
Upon completion of the data retrieval, the data were screened first to remove duplicate cases and then cases with missing information. R software (Ver. 3.6.1) was used to match (integrate) the clinical characteristics and the immune score data according to the ID of patients with lung cancer, and to further analyze the integrated data. See Figure 1 for details on specific operating steps and the sample size at each stage.
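The screening and matching steps described above can be illustrated with a minimal sketch. It is written in plain Python rather than the R code actually used in the study, and the field names (`id`, `immune_score`) and the `preprocess` helper are hypothetical, chosen only for illustration.

```python
def preprocess(clinical_rows, immune_rows, required):
    """Deduplicate by patient ID, drop incomplete cases, then merge on ID.

    Field names ("id", "immune_score") are illustrative assumptions,
    not the column names of the TCGA download.
    """
    seen = set()
    unique = []
    for row in clinical_rows:
        if row["id"] not in seen:  # keep the first occurrence of each ID
            seen.add(row["id"])
            unique.append(row)

    # Remove cases with missing values in any required clinical field.
    complete = [
        r for r in unique
        if all(r.get(k) not in (None, "") for k in required)
    ]

    # Match (integrate) clinical features with immune scores by patient ID.
    scores = {
        r["id"]: r["immune_score"]
        for r in immune_rows
        if r.get("immune_score") is not None
    }
    return [dict(r, immune_score=scores[r["id"]])
            for r in complete if r["id"] in scores]


clinical = [
    {"id": "P1", "age": 60, "stage": "I"},
    {"id": "P1", "age": 60, "stage": "I"},     # duplicate case
    {"id": "P2", "age": None, "stage": "II"},  # missing age
    {"id": "P3", "age": 71, "stage": "III"},
]
immune = [
    {"id": "P1", "immune_score": 500.0},
    {"id": "P3", "immune_score": 1300.0},
]
merged = preprocess(clinical, immune, required=["age", "stage"])
```

Only patients with complete clinical data and a matching immune score remain in `merged`, mirroring the order of operations in the text (duplicates first, then missing information, then matching by ID).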

Statistical analysis
The primary endpoints in the present study were DFS and OS. OS was defined as the time to death from any cause, while DFS was defined as the time to recurrence of the primary tumor. X-tile software (Ver. 3.6.1; School of Medicine, Yale University, U.S.A.) was used, together with the immune score, survival status, and OS of patients, to determine the optimal cut-off values of the immune score and to divide the patients into three groups with low, medium, and high immune scores. The chi-square test was used for categorical data. R software (Ver. 3.6.1) was used to plot the Kaplan-Meier survival curves. Univariate Cox proportional hazard regression models were used to analyze age, sex, TNM stage, pathological type of tumor, and immune score; variables that were significant in the univariate analysis were then included in a multivariate Cox proportional hazard regression model to identify independent predictors of DFS and OS. Next, a prognosis nomogram was established. A bootstrap algorithm with 1000 resamples was adopted for internal validation. To assess the agreement between the predictions of the clinical prediction model and the actual clinical outcomes of patients, 3-/5-year DFS and OS calibration plots were drawn. All statistical tests were two-sided, and P<0.05 was considered statistically significant.
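For readers unfamiliar with the Kaplan-Meier method mentioned above, the product-limit estimate can be sketched in a few lines. This is a plain-Python illustration with made-up follow-up data, not the R routine used in the study; the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if the event (death or recurrence) was observed,
             0 if the patient was censored at that time
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # events and censored cases both leave the risk set
    return curve


# Illustrative data only: five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate is well suited to endpoints such as DFS and OS.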

Clinical features of patients
A total of 1005 patients with lung cancer were included in the present study. The average age was 64.42 (range, 33-90) years and 731 patients (72.74%) were older than 60 years. There were 603 male patients (60.00%). In terms of TNM stage, 521 patients (51.84%) were at stage I, 284 patients (28.26%) at stage II, 167 patients (16.62%) at stage III, and 33 patients (3.29%) at stage IV. The average immune score was 723.74 (range, -1651.61 to 3286.67). X-tile software was used to identify the most appropriate cut-off values of the immune score. The cut-off values were 698.1 and 1246.3: the low immune score group comprised scores ≤698.1, the medium group scores >698.1 and ≤1246.3, and the high group scores >1246.3, with 489 patients (48.66%) being allocated into the low immune score group, 213 patients (21.20%) into the medium immune score group, and 303 patients (30.15%) into the high immune score group. The median OS was 21.90 months (range, 0-238.11 months) and the DFS was 18.99 months (range, 0-238.11 months) (Table 1).
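The three-group allocation can be expressed as a simple rule. The sketch below assumes the cut-off values 698.1 and 1246.3 that appear in the legends of the survival figures; the function name `immune_group` is hypothetical, and the cut-offs themselves came from X-tile, which is not reproduced here.

```python
# Cut-offs as they appear in the survival figure legends (assumed here to be
# the X-tile output); boundaries are inclusive on the upper side of each group.
LOW_CUT = 698.1
HIGH_CUT = 1246.3


def immune_group(score):
    """Assign a patient to the low, medium, or high immune score group."""
    if score <= LOW_CUT:
        return "low"
    if score <= HIGH_CUT:
        return "medium"
    return "high"


# The cohort mean (723.74) falls just inside the medium group.
groups = [immune_group(s) for s in (-1651.61, 723.74, 3286.67)]
```

Because nearly half of the cohort falls at or below 698.1, the low group is the largest of the three.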

Analysis of DFS and OS with univariate and multivariate Cox proportional hazard regression models
The results of the univariate Cox proportional hazard regression model showed that the DFS of the medium and high immune score groups was significantly longer than that of the low immune score group. The hazard ratio (HR) for the medium immune score group was 0.77 (95% CI [0.60-0.99], P=0.04), while the HR for the high immune score group was 0.74 (95% CI [0.60-0.91], P<0.01). Kaplan-Meier survival curves (Figure 2) indicated that there were significant differences among groups when stratified by age, sex, TNM stage, or pathological type. According to the results of the univariate analysis for OS, compared with TNM stage I patients, the survival prognosis of patients at stages II, III, and IV was significantly poorer. The HRs and 95% CIs were reported separately for each TNM stage.
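As a reading aid for the hazard ratios above: in a Cox model the HR is the exponential of the regression coefficient, and the conventional 95% CI is symmetric on the log scale. The sketch below shows that relationship under the usual Wald assumption (z = 1.96) and uses the reported DFS values for the medium immune score group as input. The helper names are hypothetical, and the standard error is back-calculated from the published interval rather than taken from the study.

```python
import math


def log_scale_se(lower, upper, z=1.96):
    """Back-calculate the standard error of log(HR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def hr_with_ci(beta, se, z=1.96):
    """Hazard ratio and Wald confidence interval from a Cox coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Reported DFS result for the medium immune score group: HR 0.77 [0.60-0.99].
se = log_scale_se(0.60, 0.99)
hr, lower, upper = hr_with_ci(math.log(0.77), se)
```

Rebuilding the interval around the reported point estimate returns bounds that round to 0.60 and 0.99, which indicates that the published HR sits close to the geometric midpoint of its interval, as expected for a Wald interval.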

[Figure: Kaplan-Meier survival curves by immune score group (immune score ≤698.1; 698.1 < immune score ≤1246.3; immune score >1246.3). Panels A and B; panel title: Overall survival; x-axis: Time (year).] (Figure 3).

Prognosis nomogram
A clinical prognosis nomogram was established by combining all the independent factors found to significantly predict the prognosis of patients with lung cancer (Figure 4). The 3-/5-year DFS and OS calibration plots were drawn to determine the accuracy of the clinical prediction model (Figure 5), and showed that the model's predictions were largely consistent with the actual patient outcomes. The calibration plots also showed that the predictive ability for 5-year DFS and OS was lower than that for 3-year DFS and OS.
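The idea behind a calibration plot can be sketched as follows: patients are grouped by predicted survival probability, and the mean prediction in each group is compared with the fraction actually surviving. This simplified illustration assumes that every patient's status at the time horizon is known (no censoring before 3 or 5 years), whereas calibration for survival models normally uses Kaplan-Meier estimates within each group. The function name and data are hypothetical.

```python
def calibration_table(predicted, observed, n_bins=5):
    """Compare mean predicted survival with the observed surviving fraction.

    predicted : model-predicted probability of surviving to the horizon
    observed  : 1 if the patient survived to the horizon, else 0
    Simplifying assumption: no patient is censored before the horizon.
    """
    pairs = sorted(zip(predicted, observed))
    if not pairs:
        return []
    size = -(-len(pairs) // n_bins)  # ceiling division: patients per group
    rows = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, obs_rate))
    return rows


# Illustrative data only: a well-ordered but imperfectly calibrated model.
rows = calibration_table([0.2, 0.2, 0.8, 0.8], [0, 0, 1, 1], n_bins=2)
```

Points close to the diagonal (mean prediction equal to observed fraction) indicate good calibration; a wider gap at 5 years than at 3 years would correspond to the loss of predictive ability noted above.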

Discussion
Lung cancer is a common malignant tumor with a complicated pathogenesis and high mortality that severely threatens public health worldwide. Currently, the TNM stage is considered an optimal indicator to estimate the prognosis of patients with lung cancer [12]. Considering the limitations of TNM staging, which is based only on the biological behavior of the tumor, continued in-depth studies on the tumor microenvironments in recent years have led to the hypothesis that a patient's own immune system also plays an important role in the occurrence, invasion, and metastasis of tumors. Normal cells in the tumor microenvironment mainly include infiltrating stromal cells and immune cells, which affect the biological behavior of the tumor and regulate the signaling molecules in the tumor microenvironment [11,13]. Thus, the influence of the host's infiltrating immune cells and the degree of autoimmunity of the tumor have received increasing attention and probably have an important impact on the prognosis of tumor patients. Increasing evidence indicates that the massive infiltration of intracellular cytotoxic T lymphocytes (CD8+) and memory T cells (CD4+) in lung cancer is closely correlated to the good prognosis of patients with lung cancer [13,14]. CD8+ T cells are the main effector cells, while CD4+ T cells can induce and activate CD8+ T cells in the tumor microenvironment.
Studies have shown that such T lymphocytes play a role in inhibiting the growth, invasion, and metastasis of several tumors, such as lung, ovarian, and rectal cancers [4,15,16]. Further knowledge regarding tumor-associated normal cells in tumor tissues has deepened the understanding of the biological behavior of primary tumors and the host's immune response and has provided new ideas for tumor treatment. Some researchers have proposed using the immune score as an immune biomarker to guide the clinical immunotherapy of tumors [17,18]. Meanwhile, a more valuable clinical prediction model should be established in combination with the immune score to help clinicians assess the survival prognosis of tumor patients. In the present study, we evaluated the prognostic value of the immune score in patients with lung cancer. Univariate and multivariate analyses of OS and DFS showed that both medium and high immune scores were significantly associated with good prognosis. In addition, we integrated all the clinicopathological factors to construct a nomogram survival prediction model for patients with lung cancer. The immune score in the present study was defined using a TCGA dataset and adopted a novel assessment method based on the transcription profiles of cancer samples to predict nontumor components in tumor tissues. Gene expression characteristics were used to infer the proportion of stromal and immune cells in tumor samples, and to identify the specific features associated with immune cell infiltration in tumor tissues. The immune score was calculated through single-sample gene set enrichment analysis to predict the degree of infiltration of immune cells [19][20][21][22][23]. Such studies have been reported in several tumors, such as lung, breast, and ovarian cancers [24,25].

In the present study, the immune score from TCGA was used to assess the prognosis of patients with lung cancer. Patients were divided into three groups according to the immune score: low, medium, and high immune score groups. We found that, compared with the low immune score group, the DFS and OS of patients with lung cancer in the medium and high immune score groups were significantly longer. A possible reason is that the higher the immune score, the stronger the host's immune response to the tumor, and the more immune cells can be recruited into the tumor microenvironment to play an antitumor role. The analysis of DFS with univariate and multivariate Cox proportional hazard regression models showed that, compared with the immune score groups, the results in the TNM groups were positive, suggesting that the immune score may be superior to the TNM staging system in predicting the DFS of patients with lung cancer. This conclusion has some guiding significance for clinical practice. Patients with stage I/II lung cancer according to the TNM staging system who have a low immune score should be treated actively and followed up closely to monitor disease development.
To predict the survival prognosis of patients with lung cancer more effectively and accurately, a survival prognosis model was developed by combining commonly used clinicopathological features with the novel immune score. Validation of this model showed that it had good predictive ability and could serve as an effective reference for clinicians to better quantify the survival prognosis of patients with lung cancer. At the same time, we found that the predictive power for 5-year DFS and OS was lower than that for 3-year DFS and OS in the calibration plots. We believe that the reason is that human immunity is a dynamic, time-dependent process. With increasing age, the body function, gene expression, and immunity of patients change, leading to a decrease in the predictive power of this model.
Both patients and clinicians can obtain a more individualized survival prediction using this clinical prediction model, which may help to better predict a patient's disease development. Further, the application of the immune score in patients with lung cancer will provide a basis for choosing a better treatment plan in clinical management. However, the present study has certain limitations, mainly the following: the treatment received by the included patients with lung cancer was unknown; thus, the effect of specific treatments on patient survival could not be determined. Therefore, it is necessary to collect additional relevant information from patients with lung cancer, so as to establish a more effective and accurate clinical prediction model.

Conclusion
The immune score can be used as an effective indicator for predicting the survival prognosis of patients with lung cancer. The higher the immune score, the better the prognosis of patients with lung cancer. In addition, the clinical prediction model established using the immune score can help clinicians assess the prognosis of patients with lung cancer more accurately, effectively, and conveniently and will assist in identifying patient subgroups requiring active adjuvant therapy, which underlines its clinical significance.