An miRNA signature associated with tumor mutation burden in endometrial cancer

Abstract Tumor mutation burden (TMB) is an essential biomarker for predicting immunotherapy response. TMB has mainly been measured by whole-exome sequencing (WES), which is costly and difficult to apply widely. In the present study, we aimed to establish and validate a miRNA signature to predict TMB level in endometrial cancer using The Cancer Genome Atlas (TCGA) database. MiRNA expression and somatic mutation profiles of uterine corpus endometrial carcinoma (UCEC) were downloaded from the TCGA database. A total of 518 patients with UCEC were randomly assigned to a training set (n=311) and a validation set (n=207). Thirty-five miRNAs differentially expressed between the high-TMB and low-TMB groups were identified in the training set. The least absolute shrinkage and selection operator (LASSO) method was used to select 26 miRNAs to establish the optimal signature. The accuracy of the miRNA signature for predicting TMB level was 0.833 in the training set, 0.749 in the validation set and 0.799 in the total set. Moreover, the miRNA signature correlated significantly with the expression of immune checkpoint-related genes (PD-1, PD-L1, CTLA-4) and mismatch repair-related genes (BRCA1, BRCA2, MLH1, MSH6). In conclusion, this miRNA signature could predict TMB level in endometrial cancer and might help guide immunotherapy in endometrial cancer.


Introduction
Endometrial cancer was the fourth most common cancer in women worldwide and the most common gynecological malignancy in developed countries [1,2]. The American Cancer Society estimated 61,880 new cases of and 12,160 deaths from uterine corpus carcinoma in 2019 [1]. The majority (67%) of patients were diagnosed at an early stage with localized disease, and approximately 30% of patients had regional/distant disease at a late stage [1]. Early-stage patients could achieve favorable outcomes, with a 5-year overall survival rate of 80-95%, but patients with advanced-stage disease had worse survival, with 5-year survival rates of 68% and 17% for stage III and stage IV, respectively [3,4]. Patients with localized disease could benefit from radical surgery; however, for metastatic/recurrent patients, current therapeutic strategies offered limited survival benefit [5,6].
In recent years, immunotherapy, including anti-programmed death-1 (PD-1)/anti-programmed death-ligand-1 (PD-L1) inhibitors and anti-cytotoxic T-lymphocyte antigen 4 (CTLA-4) inhibitors, had brought major therapeutic advances in various types of recurrent/metastatic cancers [7][8][9]. Pembrolizumab (a PD-1 immune checkpoint inhibitor), approved by the Food and Drug Administration (FDA), showed good responses in mismatch repair deficient (dMMR)/microsatellite instability-high (MSI-H) solid tumors including colon cancer and endometrial cancer [10,11]. In a phase 2 clinical trial, 20% (3/15) of patients with dMMR/MSI-H endometrial cancer had a complete response and 33% (5/15) had a partial response after anti-PD-1 treatment [12]. Tumor mutation burden (TMB), an essential biomarker for immunotherapy response, was defined as the total number of gene variants [13]. High TMB reflected a large number of mutated genes, which encoded aberrant tumor neoantigens [14]. Higher TMB suggested a better response to immunotherapy in various tumor types [13,15]. TMB was largely assessed by whole-exome sequencing, which required sufficient tumor samples and was too expensive to be widely implemented. Therefore, a low-cost and convenient tool for predicting TMB level was urgently needed. Mutated genes underwent transcriptional and post-transcriptional modifications to generate abnormal oncoproteins, which triggered an anti-cancer immune response. MicroRNAs (miRNAs) were viewed as crucial regulators of post-transcriptional gene modification [16].
MiRNAs were small, non-coding RNA molecules, approximately 20-22 nucleotides in length, that participated in post-transcriptional modification and degradation of targeted messenger RNAs, processes involved in tumor proliferation/tumor suppression [16]. In the past few years, several studies had revealed that miRNAs could be candidate biomarkers for carcinogenesis, diagnosis/differential diagnosis and prognosis of gynecological cancers [17][18][19][20]. In endometrial cancer, miRNAs appeared to be associated with initiating lesions and with worse clinical outcomes such as positive lymph node status, lymph-vascular space invasion and shorter overall survival [21,22]. However, the role of miRNAs in personalized cancer treatment had not yet been fully elucidated. A study conducted by Peng et al. [23] revealed that three miRNAs (hsa-miR-320d, hsa-miR-320c, hsa-miR-320b) might predict immunotherapy response in non-small cell lung cancer.
MiRNAs were crucial in the post-transcriptional regulation of gene expression. Theoretically, mutated genes might be modified by miRNAs, encoding large numbers of neoantigens. However, the association between miRNAs and TMB level had not been investigated before. In the present study, we hypothesized that TMB level might affect miRNA expression and that differentially expressed miRNAs might predict TMB level. We therefore tried to establish and validate a miRNA signature for predicting TMB level in endometrial cancer using The Cancer Genome Atlas (TCGA) database.

Data source and cases grouping
First, transcriptome data and gene expression data of uterine corpus endometrial carcinoma (UCEC), including 25 normal tissues and 552 tumor samples, were downloaded from the TCGA database via the GDC portal (https://portal.gdc.cancer.gov/). Second, we obtained somatic mutation profiles of 530 UCEC samples from the 'Masked Somatic Mutation' category in the TCGA database (https://cancergenome.nih.gov/), which included four types of mutation profiles produced by four different software processes. We selected the somatic mutation profiles based on the 'MuTect2 Variant Aggregation and Masking' process for subsequent analysis. Third, isoform expression data of UCEC (22 matched normal tissues and 546 tumor samples) were acquired from the TCGA database.
Clinical information on age at diagnosis, race, ethnicity, menopause status, histological type, tumor grade, clinical stage, survival time and survival status was collected. In total, 518 patients with both miRNA expression data and somatic mutation data were included in the analysis. These 518 patients were then randomly assigned to a training set (60%) and a validation set (40%) using the R package 'caret'. The flow chart of the study design is shown in Supplementary Figure S1.
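The 60%/40% random assignment can be illustrated with a minimal sketch. This is a plain-Python stand-in for illustration only: the source names the R package 'caret' but does not report the function, seed or stratification used, so the function name `split_train_validation`, the seed and the patient identifiers below are all assumptions.

```python
import random

def split_train_validation(sample_ids, train_fraction=0.6, seed=42):
    """Shuffle sample IDs and split them into training and validation lists.

    The seed and the unstratified shuffle are assumptions; the source does
    not state how the random assignment was configured.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical patient identifiers standing in for the 518 UCEC cases.
patients = [f"patient_{i:03d}" for i in range(518)]
train_ids, validation_ids = split_train_validation(patients)
print(len(train_ids), len(validation_ids))
```

With 518 patients and a training fraction of 0.6, rounding gives set sizes of 311 and 207, matching the sizes reported in the study.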

Differentially expressed miRNA in two TMB groups in the training set
TMB, defined as the total number of somatic coding errors in genes, could be calculated as (total count of gene variants)/(total length of exons) [14]. We calculated the TMB of each sample via a Perl script (https://www.perl.org/) and divided all patients into a low-TMB group and a high-TMB group at the median TMB. We then screened all miRNAs and selected those differentially expressed between the two TMB groups with fold change ≥1.5 and FDR<0.01 using the R package 'limma'. All differentially expressed miRNAs were presented in a heatmap using the R package 'pheatmap'.
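As a rough illustration of the steps above, the following sketch computes TMB as variants per megabase, splits samples at the median, and applies the fold change and FDR thresholds. It is not the authors' Perl or 'limma' code. The exome length of 38 Mb, the rule that samples exactly at the median count as low, the use of log2 fold change, and all sample values are assumptions made for the example.

```python
import math
from statistics import median

# Assumption: exome length in megabases; the source does not state the value used.
EXOME_LENGTH_MB = 38.0

def tumor_mutation_burden(variant_count, exome_length_mb=EXOME_LENGTH_MB):
    """TMB = (number of variants) / (total exon length), in mutations per Mb."""
    return variant_count / exome_length_mb

def split_by_median(tmb_by_sample):
    """Label a sample 'high' if its TMB exceeds the median, else 'low'.

    Treating ties with the median as 'low' is an assumption.
    """
    cutoff = median(tmb_by_sample.values())
    groups = {s: ("high" if v > cutoff else "low") for s, v in tmb_by_sample.items()}
    return groups, cutoff

def is_differentially_expressed(log2_fold_change, fdr,
                                min_fold_change=1.5, max_fdr=0.01):
    """Apply fold change >= 1.5 (in either direction) and FDR < 0.01."""
    return abs(log2_fold_change) >= math.log2(min_fold_change) and fdr < max_fdr

# Hypothetical variant counts for four samples.
variant_counts = {"S1": 38, "S2": 76, "S3": 380, "S4": 1900}
tmb = {s: tumor_mutation_burden(n) for s, n in variant_counts.items()}
groups, cutoff = split_by_median(tmb)
print(cutoff, groups)
```

The filter function expects a fold change and an adjusted P value that have already been estimated elsewhere; it only shows how the two thresholds combine.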

Construction and validation of TMB-related miRNA signature by LASSO
The least absolute shrinkage and selection operator (LASSO) method, which selects features with strong predictive value and low correlation with each other to prevent overfitting, was applied to select optimal features from the high-dimensional data [24]. LASSO regression was performed in the training set with the R package 'glmnet' to establish the optimal miRNA signature for predicting TMB level. Each miRNA in the signature had its own regression coefficient (β) for predicting TMB, and the classifier index was calculated as:

$$\text{index} = \sum_{i=1}^{n} \beta_i \times (\text{expression of miRNA}_i)$$

where n is the number of miRNAs in the signature. Receiver operating characteristic (ROC) curve analysis was performed to verify the accuracy of the miRNA signature in the total set, training set and validation set. Principal component analysis (PCA) was used before LASSO to present all differentially expressed miRNAs for predicting TMB level, and again after LASSO to present the optimal miRNAs in the signature. All samples were presented in two-dimensional PCA plots.
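To make the classifier index and its ROC evaluation concrete, the sketch below computes the weighted sum of expression values and a rank-based area under the ROC curve. The miRNA names, coefficients, expression values and labels are hypothetical and are not taken from the study's signature. The AUC here is the standard pairwise (Mann-Whitney) estimate, which may differ from the exact procedure the authors used.

```python
def signature_index(expression, coefficients):
    """Weighted sum of miRNA expression values: sum of beta_i * expression_i."""
    return sum(beta * expression[mirna] for mirna, beta in coefficients.items())

def roc_auc(scores, labels):
    """Probability that a random high-TMB sample scores above a low-TMB one.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical two-miRNA signature and expression profiles.
coefficients = {"miR-a": 0.8, "miR-b": -0.5}
samples = [
    ({"miR-a": 2.0, "miR-b": 1.0}, 1),    # label 1 = high TMB
    ({"miR-a": 1.0, "miR-b": 2.0}, 0),    # label 0 = low TMB
    ({"miR-a": 1.5, "miR-b": 0.0}, 1),
    ({"miR-a": 0.5, "miR-b": 0.0}, 0),
    ({"miR-a": 0.25, "miR-b": 0.0}, 1),
    ({"miR-a": 0.125, "miR-b": 0.0}, 0),
]
scores = [signature_index(expr, coefficients) for expr, _ in samples]
labels = [label for _, label in samples]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

The index formula in the text has no intercept term, so none is included here; a fitted logistic LASSO model would normally also carry one.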

Functional analysis
Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses were performed on the signature miRNAs differentially expressed between the high-TMB and low-TMB groups using the online DIANA-mirPath software (version 3.0), based on TarBase 7.0 (http://snf-515788.vm.okeanos.grnet.gr/), with P value <0.01 [25]. The top 20 significant KEGG pathways and GO biological process (BP) terms were then visualized in bubble plots using the R package 'ggplot2'. Moreover, we investigated the targets of all differentially expressed miRNAs in this signature using three miRNA target prediction databases (miRDB, TargetScan and miRTarBase) [26][27][28]. All predicted targets in these three databases were presented in Venn diagrams using the R package 'VennDiagram'.

Correlation of miRNA-based signature with TMB and gene expression
We extracted the gene expression of each sample from the TCGA database and calculated the TMB level and signature index of each sample. We then investigated the association of the miRNA signature with TMB level and with the expression of several genes, including mismatch repair (MMR)-related genes (BRCA1, BRCA2, MLH1, MSH2, MSH6, PMS2) and immune checkpoint-related genes (PD-1, PD-L1, CTLA-4), using the Spearman method.
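The Spearman method is a Pearson correlation computed on ranks, which the following minimal sketch makes explicit. It uses average ranks for ties. The signature index values and expression values in the example are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical signature index values and expression of one gene.
index_values = [0.1, 0.4, 0.2, 0.9]
gene_expression = [1.0, 5.0, 2.0, 3.0]
rho = spearman(index_values, gene_expression)
print(round(rho, 3))
```

Because only ranks enter the calculation, the coefficient captures monotonic association and is insensitive to the scale of the expression values.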

Statistical analysis
All statistical analyses were conducted in R software (version 4.0.0) for Windows. Differences in clinicopathological characteristics between the training set and the validation set were tested by the Chi-square test. Continuous variables such as age at diagnosis and TMB were compared by nonparametric tests (e.g., rank sum tests). Statistical significance was set at P<0.05.
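As an illustration of the Chi-square comparison between the two sets, the sketch below computes the Pearson Chi-square statistic for a contingency table and, for the one-degree-of-freedom case only, its P value. The counts are hypothetical. No continuity correction is applied, so results for 2x2 tables can differ from software that applies one by default.

```python
import math

def chi_square_statistic(table):
    """Return the Pearson Chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

def p_value_one_dof(statistic):
    """Upper-tail P value of a Chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts of a binary characteristic:
# rows = training set (n=311) and validation set (n=207).
table = [[150, 161],
         [100, 107]]
statistic, dof = chi_square_statistic(table)
p_value = p_value_one_dof(statistic)
print(dof, p_value > 0.05)
```

In this made-up table the two sets have nearly identical proportions, so the P value is far above the 0.05 threshold.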

Differentially expressed miRNA in two TMB groups
A total of 518 patients with UCEC from the TCGA database were randomized into the training set (n=311) and the validation set (n=207) (Supplementary Figure S1). Demographic and clinicopathological characteristics did not differ significantly between these two cohorts (Table 1). Then, 35 miRNAs differentially expressed between the two TMB groups were selected by FC ≥1.5 and P value <0.01. The expression of all differentially expressed miRNAs in each sample is presented in a heatmap (Figure 1). Among these 35 differentially expressed miRNAs, 21 were up-regulated and 14 were down-regulated in the high-TMB group (Supplementary Table S1).

Function analysis of miRNAs in the signature
To investigate the functional role of the 26 miRNAs in the signature, we performed KEGG and GO analysis via DIANA-mirPath software. By KEGG analysis, these miRNAs were mostly enriched in carcinogenesis processes and in activated tumor proliferation pathways (Ras pathway, PI3K-Akt pathway and Rap1 pathway) (Figure 3A). For the BP component of GO analysis, most miRNAs were enriched in cellular metabolism and protein modification processes (Figure 3B and Supplementary Table S3). Using three target prediction databases (miRDB, TargetScan and miRTarBase), we discovered that MMR-related genes (BRCA1, BRCA2, MLH1, MSH2, MSH6, PMS2) and immune checkpoint-related genes (PD-1, PD-L1 and CTLA-4) were possible targets of the 26 miRNAs in the signature (Supplementary Table S4). All predicted targets of the 26 miRNAs are shown in a Venn diagram (Supplementary Figure S2).

Discussion
Immunotherapy had been introduced in advanced/recurrent endometrial cancer, with good responses. In the Keynote 028 clinical trial of Pembrolizumab, which enrolled 24 patients with PD-L1 positive endometrial cancer, the overall response rate was 13%, and the 6-month progression-free survival and overall survival rates were 19% and 68.8%, respectively [10]. Regardless of PD-L1 expression, dMMR/MSI-H endometrial cancers also responded well to immunotherapy [12]. TMB level was viewed as another indicator of immunotherapy response, and higher TMB was associated with longer overall survival after immunotherapy across multiple cancer types [13,29]. However, the gold standard for TMB measurement was whole-exome sequencing, which was costly and difficult for many institutions to put into use. Therefore, alternative methods for predicting TMB level were needed. In the present study, our group was the first to establish and validate a TMB-related miRNA signature in endometrial cancer, and we found that this signature predicted TMB level well and had the potential to be an alternative tool for estimating TMB level in endometrial cancer.
To investigate the functional role of the differentially expressed miRNAs in the signature, we performed KEGG and GO analysis and found that they were mostly enriched in carcinogenesis-related processes and tumor proliferation pathways (Ras, PI3K-Akt and Rap1 pathways). Furthermore, we predicted possible target genes of these miRNAs and, surprisingly, discovered that immune checkpoint-related genes (PD-1, PD-L1, CTLA4) and MMR-related genes (BRCA1, BRCA2, MLH1, MSH2, MSH6, PMS2) were potential targets of the signature. Based on these findings, we explored the correlation between the miRNA signature and the expression of these genes, and found that the signature was positively correlated with the expression of PD-L1 and CTLA4 but negatively correlated with PD-1 expression. In addition to immune checkpoint-related genes, MMR-related genes (BRCA1, BRCA2, MLH1, MSH6) were also associated with the signature. In fact, Pembrolizumab was recommended for all MSI-H solid tumors and 78% had responses lasting at least 6 months; however, most endometrial cancers were microsatellite stable (MSS) [2]. Treatment of MSS endometrial cancers might focus more on targeted therapy such as anti-angiogenic agents and poly (ADP-ribose) polymerase (PARP) inhibitors. Given our finding that this miRNA-based signature correlated with MMR-related genes, it might help guide the use of PARP inhibitors against MMR deficiency in endometrial cancer.
Our study established and validated a miRNA signature to predict TMB level, which might provide evidence for the predictive value of miRNAs for immunotherapy response. In addition, this signature correlated with the expression of immune checkpoint-related genes and MMR-related genes, which may promote further exploration of immunotherapy and targeted therapy in endometrial cancer. However, this study was limited to database analysis, and the efficiency and accuracy of the miRNA signature need to be verified in further clinical investigations.

Conclusions
We demonstrated that a miRNA signature was a useful tool for predicting TMB level in endometrial cancer and that it correlated with the expression of immune checkpoint-related genes and MMR-related genes.