Development and validation of a nomogram to predict synchronous lung metastases in patients with ovarian cancer: a large cohort study

Abstract Purpose: Lung metastasis is an independent risk factor affecting the prognosis of ovarian cancer patients. We developed and validated a nomogram to predict the risk of synchronous lung metastases in newly diagnosed ovarian cancer patients. Methods: Data of ovarian cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database between 2010 and 2015 were retrospectively collected. The nomogram was built on the basis of logistic regression. The concordance index (C-index) was used to evaluate the discrimination of the synchronous lung metastasis nomogram. Calibration plots were drawn to analyze the consistency between the observed and predicted probabilities of synchronous lung metastases. The Kaplan–Meier method was used to estimate overall survival, and factors significant at P<0.05 were included in multivariate Cox regression analysis to determine the independent prognostic factors in patients with synchronous lung metastases. Results: Overall, 16059 eligible patients were randomly divided into training (n=11242) and validation (n=4817) cohorts. AJCC T stage, AJCC N stage, bone metastases, brain metastases, and liver metastases were identified as predictors of synchronous lung metastases, and a nomogram was constructed from these independent predictors. The nomogram was well calibrated and showed good discriminative ability. Mixed histological type, chemotherapy, and primary site surgery were factors affecting the overall survival of patients with synchronous lung metastases. Conclusion: The clinical prediction model has high accuracy and can be used to predict lung metastasis risk in newly diagnosed ovarian cancer patients, which can guide the treatment of patients with synchronous lung metastases.


Introduction
Ovarian cancer is among the most common malignant tumors of the female reproductive system and is the fifth most common cause of cancer-related deaths among American women. In 2018, an estimated 14070 people died of ovarian cancer in the United States [1]. Since the symptoms of ovarian cancer are nonspecific and there is currently no effective screening method, most patients are already at advanced stages (III and IV) at the time of diagnosis, accompanied by synchronous distant metastases [2,3].
Lung metastasis is the third most common distant metastatic site of ovarian cancer, accounting for 28.42% of distant metastatic sites. The location of distant metastases is an independent prognostic factor for overall survival [4]. A previous study showed that the risk factors for distant metastases are stage, grade, and lymph node involvement [5]; however, the sample size of that study was small, and the median interval between the diagnosis of ovarian cancer and the recording of metastatic disease was 44 months [5]. There are few studies on the risk factors of synchronous lung metastases, and most of them are case reports [6,7].
Identifying the risk factors for synchronous lung metastases can ensure that high-risk patients are thoroughly investigated at the initial diagnosis. These patients can then be treated as early as possible or provided with appropriate preventive treatment. More studies and real-world evidence are needed to determine the risk factors for synchronous lung metastases in ovarian cancer patients.
The purpose of the present study was to use the Surveillance, Epidemiology, and End Results (SEER) database to characterize the prevalence, related factors, and prognostic factors of synchronous lung metastases in ovarian cancer patients. In addition, a nomogram to predict the risk of synchronous lung metastases was developed on the basis of clinical factors, which may guide screening.

Study population
Data were obtained from the SEER database. The SEER*Stat 8.3.5 software (https://seer.cancer.gov/data/) was used to access the database. The site code was restricted to the ovary. Since the details of metastases were not recorded before 2010, patients with primary cancer of the ovary who were aged ≥ 18 years at diagnosis and diagnosed between 2010 and 2015 were analyzed. The exclusion criteria were as follows: (1) unknown grade; (2) unknown AJCC T or N stage, or AJCC T0 stage; (3) unknown metastasis information; (4) unknown tumor size; (5) unknown laterality; and (6) unknown therapy information. The flowchart of patient selection is shown in Figure 1. According to the inclusion and exclusion criteria, 16059 patients with ovarian cancer were finally enrolled in our study. We further randomly divided the patients in a 7:3 ratio to form a training cohort (n=11242) for nomogram construction and a validation cohort (n=4817) for internal verification.
Data regarding clinical characteristics, including age, race, marital status, insurance status, year of diagnosis, household income at diagnosis, histological type, grade, laterality, clinical AJCC T and N stage, tumor size, metastatic status, and therapy information, were collected from the SEER database. Since all information from the SEER database was de-identified and no personal identifying information was used in this analysis, informed consent was not required. The present study complied with the 1964 Helsinki Declaration, its later amendments, and comparable ethical standards.
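The 7:3 random division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `split_cohort`, the seed, and the choice to round the training size up are assumptions, the last made only because it reproduces the reported cohort sizes.

```python
import math
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patient IDs into training and validation cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    # Assumption: round the training size up; this matches 11242/4817.
    n_train = math.ceil(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical patient identifiers 0..16058 stand in for SEER records.
train, validation = split_cohort(range(16059))
print(len(train), len(validation))  # 11242 4817
```

Fixing the seed matters in practice because the nomogram is fitted on one cohort and checked on the other, so the split has to be recoverable.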

Statistical analysis
Statistical analysis was performed using the SPSS 21 software. Categorical data were presented as frequency (%) and analyzed using the chi-squared test. The Kolmogorov-Smirnov test was used to verify the normality of variables. Normally distributed variables were expressed as mean ± standard deviation, while non-normally distributed variables were expressed as median (interquartile range). Odds ratios (ORs) from the logistic regression analyses and hazard ratios from the Cox regression analysis were calculated with 95% confidence intervals (CIs). Univariate and multivariate logistic regression analyses were used to determine the risk factors of synchronous lung metastases in patients with ovarian cancer. Factors with a P-value less than 0.05 in the univariate analysis were incorporated into the multivariable logistic regression model.
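For a single binary factor, the odds ratio from univariate logistic regression equals the cross-product ratio of the corresponding 2x2 table, so the quantity can be sketched without a regression package. The example below is illustrative only: the function name and the counts are hypothetical, and the Wald (Woolf) interval on the log scale is one common way to obtain a 95% CI, not necessarily the one produced by SPSS.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases, z=1.96):
    """Odds ratio with a Wald 95% CI computed on the log-odds scale."""
    a, b, c, d = (exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's method.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: lung metastases in patients with vs. without a factor.
or_, lo, hi = odds_ratio_ci(40, 160, 60, 1740)
print(f"OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

A factor whose interval excludes 1 corresponds to P<0.05 in the univariate screen and would be carried forward to the multivariable model.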
A synchronous lung metastases nomogram was formulated on the basis of the results of multivariate logistic analysis using the rms package in R version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria; www.r-project.org). Receiver operating characteristic (ROC) curves were drawn. The nomograms were validated both internally and externally, and the stability of the prognostic nomogram and the synchronous lung metastasis nomogram was evaluated by internal validation with 1000 bootstrap samples. The C-index (Harrell's concordance index) was used to assess the predictive accuracy of the nomograms. Calibration plots were drawn to analyze the consistency between the observed and predicted probabilities. Overall survival was estimated by the Kaplan-Meier method, and differences between groups were compared using the log-rank test. A multivariable Cox regression model incorporating the factors that were significant in the Kaplan-Meier analysis (P<0.05) was used to identify the independent prognostic factors in patients with synchronous lung metastases.
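The Kaplan-Meier estimate of overall survival mentioned above can be sketched as a product-limit calculation. This is a minimal illustration under stated assumptions: the function name and the follow-up data are hypothetical, and the study itself used statistical software rather than hand-written code.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient (e.g. months).
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Hypothetical follow-up (months) for five patients; 0 marks censoring.
curve = kaplan_meier([6, 12, 12, 24, 36], [1, 1, 0, 1, 0])
print(curve)
```

Censored patients contribute to the risk set until their last follow-up but never trigger a drop in the curve, which is why the estimate differs from a simple proportion of survivors.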

Patients' basic information
According to the inclusion and exclusion criteria, data of 16059 of the 35333 ovarian cancer patients registered between 2010 and 2015 were collected from the SEER database. The patients were divided into training (n=11242) and validation (n=4817) cohorts. The basic information of the patients is listed in Table 1. The median age of the patients was 59 years. Among these patients, 13223 (82.3%) were white, 1057 (6.6%) were black, and 1711 (10.7%) were of other races. A total of 3377 (21.0%) patients were unmarried, 8549 (53.2%) were married, and 3486 (21.7%) were separated. The number of insured and uninsured patients was 861 (3.5%) and 15337 (95.5%), respectively. The median household income was 6255. The number of patients with tumor diameters <2 cm, 2-5 cm, >5 cm was 1311 (8.

Risk factors for lung metastasis
Univariable logistic analysis showed that factors closely related to the occurrence of lung metastasis included the following: older patient age (OR = 1.015; 95% CI, 1. Multivariable logistic regression analysis showed that higher T and N stages and the presence of bone, liver, and brain metastases were associated with synchronous lung metastases (Table 2).

Nomogram development
A nomogram to predict synchronous lung metastases in patients with ovarian cancer was developed in the training cohort. The risk factors determined by multivariable logistic regression analysis, namely higher T and N stage and the presence of bone, liver, and brain metastases, were incorporated into the final nomogram (Figure 2).
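A nomogram is a graphical form of the fitted logistic model: each predictor's coefficient is rescaled to a points axis, and the total maps to a predicted probability. The sketch below illustrates that mapping only. The intercept and coefficients are hypothetical placeholders, not the values fitted in this study, and T and N stage are simplified to single binary indicators.

```python
import math

# Hypothetical log odds ratios for illustration; NOT the study's estimates.
INTERCEPT = -5.0
COEFS = {
    "T3": 1.0,
    "N1": 0.8,
    "bone_mets": 1.5,
    "liver_mets": 1.2,
    "brain_mets": 2.0,
}


def predicted_probability(patient):
    """Logistic model: probability from the linear predictor."""
    linear_predictor = INTERCEPT + sum(
        COEFS[name] for name, present in patient.items() if present
    )
    return 1 / (1 + math.exp(-linear_predictor))


def nomogram_points(coefs):
    """Rescale coefficients so the strongest predictor scores 100 points."""
    max_effect = max(abs(beta) for beta in coefs.values())
    return {name: 100 * beta / max_effect for name, beta in coefs.items()}


points = nomogram_points(COEFS)
patient = {"T3": True, "N1": False, "bone_mets": True,
           "liver_mets": False, "brain_mets": False}
print(points["bone_mets"], round(predicted_probability(patient), 3))
```

Because the points are a linear rescaling of the coefficients, adding points on the chart is equivalent to adding terms of the linear predictor.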

ROC curve analysis and evaluation of predictive value
ROC curves were drawn to determine the predictive value of the synchronous lung metastasis nomogram in the training and validation cohorts (Figure 3A,C). We verified the nomogram internally and externally. The C-index was used to evaluate the prediction accuracy of the nomogram. As shown in Figure 3B, internal verification of the nomogram gave a C-index of 0.761 (0.736-0.787). As shown in Figure 3D, external verification in the validation cohort gave a C-index of 0.757 (0.718-0.795). Verification of the nomogram showed agreement between the observed and predicted values. (Table 3).
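For a binary outcome such as the presence of lung metastases, the C-index equals the area under the ROC curve: the share of (patient with metastases, patient without) pairs in which the former received the higher predicted risk. The sketch below computes it and a bootstrap percentile interval. The function names and data are hypothetical, and the percentile interval is an assumption about how such a range could be obtained, not a description of the authors' exact procedure.

```python
import random


def c_index(probabilities, outcomes):
    """Concordance for a binary outcome; tied predictions count as 0.5."""
    cases = [p for p, y in zip(probabilities, outcomes) if y == 1]
    controls = [p for p, y in zip(probabilities, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def bootstrap_ci(probabilities, outcomes, n_boot=1000, seed=0, alpha=0.05):
    """Percentile bootstrap interval for the C-index."""
    c_index(probabilities, outcomes)  # fail early if one class is missing
    rng = random.Random(seed)
    n = len(outcomes)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one class
            stats.append(c_index([probabilities[i] for i in idx], ys))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted risks and observed outcomes (1 = lung metastases).
risks = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]
observed = [0, 0, 0, 1, 0, 1, 0, 1]
print(c_index(risks, observed), bootstrap_ci(risks, observed, n_boot=200))
```

A value of 0.5 corresponds to chance and 1.0 to perfect ranking, which gives a scale for reading the reported values of about 0.76.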

Discussion
Ovarian cancer is the seventh most common cancer among women and the eighth most common cause of cancer death worldwide, with a 5-year overall survival rate of <50% [8]. Two-thirds of the patients are already at advanced stages at the time of diagnosis (Stage III/IV) [9]. When the lungs are affected, the main route of metastasis is through the pleura. Lung metastases usually present as visceral pleural involvement with continuous infiltration. Occasionally, isolated lesions are observed. Invasion of lymphatic and blood vessels also occurs [10]. The latency period from the diagnosis of ovarian cancer to the development of lung metastases can be as long as 108 months [11]. Compared with standard chemotherapy alone, early detection of lung metastases can increase the chances of timely, more aggressive treatment, which may prolong survival [4]. Active chemotherapy can significantly reduce the tumor load and metastasis of ovarian cancer [12]. Surgical removal of isolated lung metastatic lesions is reasonable [13]. Targeted therapy is also a promising treatment for metastatic ovarian cancer [14]. Routine imaging studies, such as computed tomography or magnetic resonance imaging, have not shown high sensitivity and specificity for diagnosing micrometastases <1 cm [15]. Therefore, there is a need for a non-invasive method to predict the likelihood of synchronous lung metastases in ovarian cancer patients. We used data from the SEER database to develop and validate a predictive nomogram, which demonstrated good discrimination and calibration and can provide a personalized estimate of the likelihood of synchronous lung metastases in ovarian cancer patients.
To the best of our knowledge, the present study is the first to generate a risk model based on clinical and tumor characteristics from the population-based Surveillance, Epidemiology, and End Results (SEER) database to predict the risk of synchronous lung metastases in newly diagnosed ovarian cancer patients. We found that the higher the AJCC T and N stages, the higher the likelihood of lung metastases, which is consistent with findings on bone metastasis of ovarian cancer and on metastases of other tumor types [16-18]. Previous studies have shown that poor differentiation and lymph node involvement are risk factors for distant metastasis [4]. We found that liver metastases, brain metastases, and bone metastases are risk factors for synchronous lung metastases. Distant metastases found elsewhere in the body indicate that the cancer has already disseminated [19], so the probability of lung metastases is higher. We verified the nomogram internally and externally. The nomogram of synchronous lung metastases includes five factors: AJCC T stage, AJCC N stage, bone metastases, liver metastases, and brain metastases. The nomogram showed agreement between the predicted and observed results in the verification. In addition, the C-indices of internal and external verification of the nomogram were 0.761 (0.736-0.787) and 0.757 (0.718-0.795), respectively, indicating good discrimination. For patients with a higher risk of synchronous metastases predicted by this model, imaging examination should be performed promptly to detect lung metastases at the initial diagnosis, so as to better guide clinical management.
The determination of prognostic factors related to synchronous lung metastases in these patients may help doctors to provide targeted treatment strategies for patients at different risk levels and improve patient survival and quality of life. Previous studies have shown that lung metastases can significantly worsen the prognosis of patients [20]. The median survival time after the diagnosis of distant disease is 12 months [5]. In this study, the 3- and 5-year survival rates for 411 patients with synchronous lung metastases were 33.8 and 22.8%, respectively, similar to other studies [21,22]. Primary site surgical treatment and chemotherapy can improve overall survival. Therefore, for patients with ovarian cancer with synchronous lung metastases, active surgery and chemotherapy are encouraged. At the same time, the mixed histological type is a high-risk factor for mortality, and physicians should pay close attention to it. The present study has several limitations that should be noted. The main limitation is that the variables used to construct the nomogram were limited to clinicopathological features because important tumor biomarkers are not available in the SEER database. Another limitation is that, although the established nomogram shows good discrimination and calibration, it still requires further validation in large-scale external cohorts. Third, only patients with synchronous lung metastases were analyzed; metachronous lung metastases that occurred later in the disease may not be recorded in the SEER database and were therefore not analyzed. Finally, this was a retrospective study, and the patients were selected from hospitals, so selection bias may be present.

Conclusion
Lung metastasis is an independent risk factor affecting the prognosis of patients with ovarian cancer. At the initial diagnosis of ovarian cancer, early detection of synchronous lung metastases through routine screening is beneficial for high-risk patients.
The present study is the first to use the population-based SEER database to generate a risk model based on clinical and tumor characteristics that predicts the risk of synchronous lung metastases in newly diagnosed ovarian cancer patients with high accuracy. The present study also preliminarily determined the prognostic factors related to synchronous lung metastases in patients with ovarian cancer, which will help doctors to provide targeted treatment strategies for patients at different risk levels and improve the survival rate and quality of life of patients.