Association analysis of miRNA-related genetic polymorphisms in miR-143/145 and KRAS with colorectal cancer susceptibility and survival

Abstract Background: There is accumulating evidence of aberrant expression of miR-143 and miR-145 and their target gene KRAS in colorectal cancer (CRC). We hypothesize that single nucleotide polymorphisms (SNPs) within or near mRNA–microRNA (miRNA) binding sites may affect miRNA/target gene interaction, resulting in differential mRNA/protein expression and promoting the development and progression of CRC. Methods: We conducted a case–control study of 507 patients with CRC recruited from a tertiary hospital and 497 population-based controls to assess the association of genetic polymorphisms in miR-143/145 and the KRAS 3′ untranslated region (3′UTR) with susceptibility to CRC and patients’ survival. In addition, genetic variations of genomic regions located from 500 bp upstream to 500 bp downstream of the miR-143/miR-145 gene and the 3′UTR of KRAS were selected for analysis using the Haploview and HaploReg software. Results: Using publicly available expression profiling data, we found that miR-143/145 and KRAS expression were all reduced in rectal cancer tissue compared with adjacent non-neoplastic large intestinal mucosa. The rs74693964 C/T variant located 65 bp downstream of miR-145 genomic regions was observed to be associated with susceptibility to CRC (adjusted odds ratio (OR): 2.414, 95% CI: 1.385–4.206). Cumulative effects of miR-143 and miR-145 on CRC risk were observed (Ptrend=0.03). Patients having CRC carrying variant genotype TT of KRAS rs712 had poorer survival (log-rank P=0.044, adjusted hazard ratio (HR): 4.328, 95% CI: 1.236–15.147). Conclusions: Our results indicate that miRNA-related polymorphisms in miR-143/145 and KRAS are likely to be deleterious and represent potential biomarkers for susceptibility to CRC and patients’ survival.


Introduction
Colorectal cancer (CRC) is one of the most commonly occurring malignancies worldwide. According to The Global Burden of Cancer 2013, colon and rectal cancer ranked third for cancer incidence and fourth for cancer deaths [1]. In China, incidence and mortality statistics of CRC for 2014, published by the National Cancer Center, showed a similar trend, with CRC ranking third for cancer incidence and fifth for cancer deaths [2].
The development of CRC is a multifactorial and multistep process involving the gain and maintenance of specific genomic alterations [3]. Over the past few decades, many associations have been identified between the variation of protein-coding genes and CRC. In recent years, high-resolution maps of the human transcriptome have led to the discovery of a large number of non-protein-coding RNA genes and brought about a paradigm shift in our understanding of the function of variations in non-coding RNAs (ncRNAs) [4]. The ncRNAs include a class of short RNA molecules termed microRNAs (miRNAs), which are endogenous small ncRNAs that repress protein-coding genes by binding to target sites in the 3′ untranslated region (3′UTR) of mRNAs. These miRNAs are involved in the regulation of almost all physiological and pathological processes, including cell proliferation, differentiation, and apoptosis [5].
MiR-143 and miR-145, which are located close to each other on 5q33, are co-transcribed from a single promoter and generate a primary transcript containing both miRNAs [6]. In 2003, miR-143 and miR-145 were reported to be down-regulated in colorectal tissue for the first time [7]. Subsequently, a series of studies confirmed these results [8][9][10]. Decreased expression of these two miRNAs is involved in various cancer-related events, including proliferation, invasion, and migration, suggesting that they have anti-tumorigenic activity [11][12][13]. The KRAS oncogene is an important upstream mediator of the MAPK pathway, and its overexpression can lead to increased activation of the RAF/MEK/MAPK pathway, thereby promoting tumorigenesis [14]. KRAS is an important target of miR-143/145, which has been identified not only by computational predictions using software such as TargetScan, miRanda, and PicTar, but also by experimental validation [15,16].
Mutations in either miRNAs or their co-expressed miRNA binding sites are often deleterious, which can affect miRNA/target gene interaction, resulting in differential mRNA or protein expression and increased susceptibility to common diseases [17]. This view is supported by studies of miRNA-related genetic alterations in different types of cancer, including CRC [18][19][20]. However, published evidence on the relationship between genetic variations of miR-143/145 and the 3′UTR of KRAS and CRC susceptibility is limited, and the question has not been comprehensively investigated. Therefore, we conducted a case-control study to assess the association of these candidate biomarkers with the risk of CRC.

Study population
The present study was conducted in Hangzhou City, Zhejiang Province, China. Five hundred and seven patients with CRC and four hundred and ninety-seven cancer-free controls were enrolled in the study from May 2014 to May 2015. These patients were recruited from a tertiary hospital in Hangzhou, Zhejiang Province, China. Eligible cases were newly diagnosed and histologically confirmed CRC without pre-operative radiotherapy or chemotherapy. The control population was recruited from among the individuals who came to the community health service centre for medical examinations. The controls had no cancer history or intestinal diseases. All participants were Han Chinese and had lived in Zhejiang Province for more than 20 years.
The study was approved by the Medical Ethical Committee of Hangzhou Center for Disease Control and Prevention (No. 2019-4). All participants provided written informed consent. Face-to-face interviews were conducted by trained interviewers who administered a structured questionnaire asking about demographic characteristics, family history of cancer, previous medical history, and lifestyle-related factors. Smoking history was defined as having smoked at least one cigarette per day for more than 1 year. Chronic alcohol drinking or tea drinking was defined as having consumed an alcoholic drink or tea at least once per day for more than 3 months.

Polymorphism selection and genotyping
First, single nucleotide polymorphisms (SNPs) of genomic regions located from 500 bp upstream to 500 bp downstream of the miR-143/miR-145 gene and the 3′UTR of KRAS were downloaded from 1000 Genomes (http://www.internationalgenome.org/) if they had a minor allele frequency > 0.05 within the Southern Han Chinese (CHS) population. Then, tag single nucleotide polymorphisms (tagSNPs) representing SNPs with a pairwise correlation of r² > 0.8 were further selected using the tagger algorithm implemented in the Haploview software. The function of the tagSNPs was predicted using RegulomeDB and HaploReg. Finally, five polymorphisms were selected for study: KRAS rs712, rs1137196, miR-143 rs41291957, miR-145 rs74693964, and rs80026971. Detailed information regarding the selected SNPs is listed in Supplementary Table S1.
In all the patients, 5 ml of peripheral blood was collected in an anticoagulant tube at the time of pre-surgery examination and stored at −80 °C. Genomic DNA was extracted from peripheral blood samples using a magnetic bead method with KingFisher Flex (Thermo Scientific, U.S.A.). The concentration and purity of the DNA samples were determined using a NanoDrop2000 spectrophotometer (Thermo Scientific, U.S.A.). Genotyping was performed using the Agena MassArray Genotyping Platform (Agena Inc., San Diego, CA, U.S.A.). Five percent blinded samples were repetitively genotyped and a negative control was interspersed throughout the genotyping assays. The detection rates of all SNP genotyping assays were ≥96%. The concordance rates for duplicated samples were 100%.
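The two selection criteria above (minor allele frequency > 0.05 and pairwise r² > 0.8) can be illustrated with a minimal Python sketch. It computes both quantities from phased haplotypes coded 0/1. The haplotype vectors are hypothetical toy data, and the function names are illustrative; this shows only the underlying definitions and is not the Haploview tagger implementation.

```python
def minor_allele_frequency(alleles):
    """Minor allele frequency of a biallelic SNP coded 0/1 across haplotypes."""
    freq = sum(alleles) / len(alleles)
    return min(freq, 1.0 - freq)


def r_squared(snp_a, snp_b):
    """Pairwise linkage disequilibrium r^2 from phased haplotypes coded 0/1."""
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical phased haplotypes (10 chromosomes), for illustration only
snp1 = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
snp2 = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]  # perfectly correlated with snp1
snp3 = [0, 1, 0, 1, 0, 1, 0, 1, 0, 0]  # weakly correlated with snp1

for name, snp in (("snp2", snp2), ("snp3", snp3)):
    tagged = r_squared(snp1, snp) > 0.8
    print(name, "MAF:", minor_allele_frequency(snp), "tagged by snp1:", tagged)
```

Under these definitions, a SNP passing the frequency filter is represented by a tagSNP only if the two are in strong linkage disequilibrium; weakly correlated SNPs need their own tag.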

Gene expression analysis
The miR-143/145 microarray data were downloaded from the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo), accession no. GSE38389. This dataset comprises 66 paired samples of rectal tumor tissue and non-neoplastic large intestinal mucosa, with miRNA expression profiles measured on the GPL11039 platform (Exiqon miRCURY LNA microRNA array v.9.2 Extended Version).
We used Oncomine (www.oncomine.org, last accessed on 15 March 2021) to conduct a meta-analysis of KRAS gene expression. We selected qualifying datasets from the Oncomine database using the following filters: Gene: 'KRAS'; Cancer type: 'Colorectal cancer'; Analysis type: 'Cancer vs. Normal Analysis'; Data Type: 'mRNA'. Seven arrays (Ki colon, Kaiser Colon, Skrzypczak Colorectal 2, TCGA Colorectal, Skrzypczak Colorectal, Hong Colorectal, Gaedcke Colorectal) including 578 CRC cases and 179 controls were included in the meta-analysis. Furthermore, we used the UALCAN database (http://ualcan.path.uab.edu/) to analyze KRAS mRNA expression based on the TCGA CRC dataset.

CRC death surveillance
A follow-up survey of the patients with CRC who held Hangzhou household registration was performed using the Cancer Registration System and Death Surveillance System of Hangzhou Center for Disease Control and Prevention. The date of censorship was 1 January 2019. Patients were matched to records in the surveillance systems by identification card (ID) number and name in order to obtain their survival outcomes. The date and cause of death were recorded for the survival analysis.
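How follow-up time and event status can be derived under administrative censoring is sketched below in Python. The censoring date of 1 January 2019 is taken from the text; the function name, the use of the diagnosis date as time origin, and the example dates are assumptions made for illustration and do not reproduce the surveillance system's actual processing.

```python
from datetime import date

# Date of censorship stated in the text
CENSOR_DATE = date(2019, 1, 1)


def overall_survival(diagnosis, death=None, censor=CENSOR_DATE):
    """Return (follow_up_days, event) for one patient.

    event = 1 if death was observed on or before the censoring date,
    event = 0 if the patient was censored (no recorded death by then).
    """
    if death is not None and death <= censor:
        return (death - diagnosis).days, 1
    return (censor - diagnosis).days, 0


# Hypothetical patients, for illustration only
print(overall_survival(date(2014, 6, 1), death=date(2016, 6, 1)))  # death observed
print(overall_survival(date(2015, 1, 1)))                          # censored
```

The resulting pairs of follow-up time and event indicator are the standard input for Kaplan-Meier estimation and Cox regression.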

Statistical analysis
A two-sided Student's t test was used to compare quantitative data, and a chi-square test was used to compare categorical data between the two groups. Departures from Hardy-Weinberg equilibrium were tested using a goodness-of-fit chi-square test. Multivariate logistic regression analysis was performed to explore the association between the selected SNPs and risk of CRC with adjustment for age, gender, and family history of cancer. A likelihood ratio test was used to assess the interaction effects between the SNPs and smoking with respect to CRC. The Cochran-Armitage test was used for trend analysis. Kaplan-Meier survival analysis and the log-rank test were used to assess survival outcome, that is, overall survival (OS) of the patients in relation to the genotypes. Multivariate Cox regression analysis was performed to calculate the relative risk [hazard ratio (HR)] and 95% confidence interval (CI) associated with genetic polymorphisms from cancer diagnosis until the end of the study or death. Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS) version 25 (IBM, New York, U.S.A.). A P-value <0.05 was considered statistically significant.
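Two of the tests named above, the Hardy-Weinberg goodness-of-fit chi-square test and the Cochran-Armitage trend test, can be written out in a few lines of Python. This is a minimal sketch using the textbook formulas with one degree of freedom; it is not the SPSS procedure used in the study, the function names are illustrative, and all genotype counts are hypothetical.

```python
import math


def chi2_p_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def hwe_test(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("test is undefined for a monomorphic SNP")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi2_p_1df(stat)


def cochran_armitage(cases, controls, scores=None):
    """Cochran-Armitage trend test across ordered categories (scores 0, 1, 2, ...)."""
    scores = list(range(len(cases))) if scores is None else scores
    totals = [a + b for a, b in zip(cases, controls)]
    n = sum(totals)
    p_bar = sum(cases) / n  # overall proportion of cases
    t = sum(s * (c - m * p_bar) for s, c, m in zip(scores, cases, totals))
    var = p_bar * (1 - p_bar) * (
        sum(s * s * m for s, m in zip(scores, totals))
        - sum(s * m for s, m in zip(scores, totals)) ** 2 / n
    )
    stat = t * t / var
    return stat, chi2_p_1df(stat)


# Hypothetical counts, for illustration only
print(hwe_test(25, 50, 25))                          # genotype counts in controls
print(cochran_armitage([10, 20, 30], [30, 20, 10]))  # cases/controls by category
```

The trend test uses a single degree of freedom because it tests a linear change in the proportion of cases across the ordered categories rather than any difference among them.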

Characteristics of the study population
Of the 507 patients with CRC, 209 had colon cancer and 298 had rectal cancer. The baseline characteristics and lifestyle factors are shown in Table 1. There was no significant difference in age between the patients and the controls, but the proportion of males was higher in the cancer group (64.89% vs. 57.95%). In addition, patients with CRC were more likely to have a lower education level and lower body mass index (P<0.05) than the controls. Patients with CRC also reported higher percentages of family history of cancer and history of appendicitis in comparison with controls (P=0.034 and P<0.001, respectively). On the other hand, no significant differences were found between the patients with CRC and the controls with respect to tobacco smoking, alcohol drinking, or tea drinking.

Expression analysis of miR-143, miR-145, and KRAS
We extracted published microarray data from GEO dataset GSE38389 and compared the expression of miR-143 and miR-145 between rectal cancer tissue and adjacent non-neoplastic large intestinal mucosa. MiR-143 was under-expressed (log2-fold difference < −1) in 26 out of 66 matched pairs of rectal tumor samples and non-neoplastic samples (P-value for paired t test <0.001). MiR-145 showed the same trend, with decreased expression (log2-fold difference < −1) in 28 out of 66 pairs of samples (P-value for paired t test <0.001) (Figure 1A,B).
We performed a statistical comparison of KRAS expression across multiple CRC studies published in the Oncomine database. Seven independent microarray studies comprising a total of 578 CRCs and 179 normal colorectal mucosa samples were evaluated in the Oncomine meta-analysis. The meta-analysis showed that KRAS mRNA was under-expressed in CRC tissue (median rank = 2534.5, P=0.001). Based on the UALCAN data, KRAS mRNA expression was down-regulated in CRC tissue (Figure 2A,B). The expression of KRAS in CRC samples at any stage was lower than that in normal samples (Figure 3A,B).
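The paired comparison described above can be sketched in Python as follows. The sketch assumes that expression values are already on a log2 scale, so that the log2-fold difference of a pair is the tumor value minus the matched normal value; the function name and the four toy pairs are hypothetical and do not come from GSE38389.

```python
import math


def paired_summary(tumor_log2, normal_log2, threshold=-1.0):
    """Count under-expressed pairs and compute the paired t statistic.

    Inputs are log2-scale expression values for matched tumor/normal samples.
    """
    diffs = [t - n for t, n in zip(tumor_log2, normal_log2)]
    k = len(diffs)
    mean = sum(diffs) / k
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (k - 1))
    t_stat = mean / (sd / math.sqrt(k))  # compared with a t distribution, k - 1 df
    n_under = sum(1 for d in diffs if d < threshold)
    return n_under, t_stat


# Hypothetical log2 intensities for four matched pairs, for illustration only
tumor = [8.0, 7.5, 9.0, 6.0]
normal = [9.5, 8.0, 9.2, 8.5]
n_under, t_stat = paired_summary(tumor, normal)
print(n_under, "of", len(tumor), "pairs under-expressed; paired t =", round(t_stat, 3))
```

A log2-fold difference below −1 corresponds to tumor expression of less than half the level in the matched normal sample.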

Polymorphisms of miR-143, miR-145, and KRAS 3′UTR and risk of CRC
KRAS rs712 and rs1137196, miR-143 rs41291957, and miR-145 rs74693964 and rs80026971 were genotyped in the present study. The genotype distributions of the five SNPs in the control group all conformed to Hardy-Weinberg equilibrium; their associations with the risk of CRC are presented in Tables 2 and 3. When stratified by smoking status, we found that the genotype distributions of miR-143 rs41291957 among non-smokers differed significantly between cases and controls. Compared with the GG genotype, those carrying the heterozygous genotype GA had a nearly 40% increased risk of developing CRC (adjusted OR = 1.397, 95% CI: 1.007-1.936). In non-smokers, miR-145 rs74693964 remained a significant risk factor for CRC among subjects carrying the CT genotype (adjusted OR = 3.086, 95% CI: 1.468-6.484). Interaction analyses of the two SNPs and tobacco smoking were conducted using a multiplicative model; neither interaction effect showed statistical significance (Table 4).
Although miR-143 and miR-145 are located close to each other on 5q33, our analysis showed no linkage disequilibrium between them. To evaluate the potential cumulative effects of miR-143 and miR-145, we defined at-risk genotypes as those with OR values greater than 1 under a dominant model of rs41291957 and rs74693964. We compared the distributions of the number of at-risk genotypes between cancer cases and controls. The risk of having CRC increased with the number of at-risk genotypes (Ptrend = 0.003). When split by cancer location, individuals harboring two at-risk genotypes had an increased risk of rectal cancer relative to those with none (OR = 3.738, 95% CI: 1.725-8.101) (Table 5). (Table 6).
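The odds ratios and confidence intervals reported above can be read on the log scale, as the following Python sketch illustrates. It assumes that the reported intervals are Wald intervals, that is, the regression coefficient ± 1.96 standard errors back-transformed by exponentiation; this is the usual output of logistic regression software, but it is an assumption here, and the function name is illustrative. Under that assumption, the standard error, z statistic, and approximate two-sided P-value can be recovered from the published numbers alone.

```python
import math


def wald_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover log-scale SE, z, two-sided p and CI midpoint from a ratio and its 95% CI.

    Assumes a Wald interval: exp(beta +/- z_crit * SE).
    """
    log_se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / log_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)
    return log_se, z, p, midpoint


# Values reported in the text for non-smokers
for label, est, lo, hi in (
    ("rs41291957 GA vs GG", 1.397, 1.007, 1.936),
    ("rs74693964 CT vs CC", 3.086, 1.468, 6.484),
):
    se, z, p, mid = wald_from_ci(est, lo, hi)
    print(f"{label}: SE(log OR)={se:.3f}, z={z:.2f}, p={p:.4f}, CI midpoint={mid:.3f}")
```

For a Wald interval the geometric mean of the two bounds equals the point estimate, which offers a quick consistency check on reported ratios: both intervals above satisfy it to within rounding.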
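The counting of at-risk genotypes under a dominant model can be sketched in Python as follows. The sketch assumes that carriers of the variant allele (A for rs41291957, T for rs74693964) are the at-risk group, which is consistent with the odds ratios above 1 reported for the GA and CT genotypes; the function names and the genotype pairs are hypothetical.

```python
from collections import Counter


def n_at_risk(rs41291957, rs74693964):
    """Number of at-risk genotypes (0, 1 or 2) under a dominant model.

    Assumption: any carrier of the variant allele (A or T) is at risk.
    """
    at_risk_143 = "A" in rs41291957  # GA or AA versus GG
    at_risk_145 = "T" in rs74693964  # CT or TT versus CC
    return int(at_risk_143) + int(at_risk_145)


def tabulate(genotype_pairs):
    """Counts of individuals carrying 0, 1 and 2 at-risk genotypes."""
    counts = Counter(n_at_risk(a, b) for a, b in genotype_pairs)
    return [counts.get(k, 0) for k in (0, 1, 2)]


# Hypothetical genotype pairs (rs41291957, rs74693964), for illustration only
cases = [("GG", "CC"), ("GA", "CC"), ("GA", "CT"), ("AA", "CT")]
controls = [("GG", "CC"), ("GG", "CC"), ("GA", "CC"), ("GG", "CT")]
print("cases:", tabulate(cases))
print("controls:", tabulate(controls))
```

The two resulting vectors of counts form the ordered 2 × 3 table on which a trend test across 0, 1, and 2 at-risk genotypes is computed.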

Discussion
MiR-143 and miR-145 are located close to each other on 5q33 and may originate from the same primary miRNA. Michael et al. showed that mature miR-143 and miR-145 displayed consistently decreased expression levels in colorectal neoplasms when compared with healthy colorectal mucosa. Several other studies have confirmed this finding [9,21]. The present study found that both miR-143 and miR-145 showed reduced expression in rectal cancer when compared with adjacent non-neoplastic large intestinal mucosa based on microarray gene expression datasets from GEO, which is consistent with previous studies. KRAS is one of the most frequently mutated genes in CRC. A number of recent studies have demonstrated the significance of KRAS mutation in CRC carcinogenesis [15,22]; however, KRAS gene expression status in CRC has been less reported. In view of this, we conducted a meta-analysis of KRAS gene expression from multiple CRC studies published in the Oncomine database. We found that KRAS expression was down-regulated in CRC tissue [...] [16]. This result was interpreted in terms of a feed-forward mechanism in which the miR-143/145 polycistronic cluster targets the RAS-responsive element-binding protein RREB1 and KRAS, which, in turn, induce down-regulation of the cluster [14].
Emerging evidence has shown that miRNA-related SNPs may alter an individual's susceptibility to CRC by disrupting miRNA processing, expression, or interaction with target mRNA [23]. However, no SNP within the miR-143 and miR-145 genes could be identified by searching HapMap and the SNP database (dbSNP). Thus we selected SNPs within the miRNA regulatory region/transcription factor-binding sites for further study. MiR-145 rs74693964 is located 65 bp downstream of miR-145. According to functional predictions based on HaploReg annotations [24] and the RegulomeDB database [25], this SNP lies in a region carrying promoter or enhancer histone modifications in more than 20 tissues, including colonic mucosa and rectal mucosa. In the present study, individuals with the CT genotype of rs74693964 in the Chinese population had a two-fold increased risk of CRC when compared with those carrying the CC genotype. After stratification by smoking status, miR-145 rs74693964 was found to be significantly associated with an increased risk of CRC among non-smokers. To date, only two studies have reported an association of miR-145 rs74693964 with risk of cancer; one was a study of cervical cancer and the other of non-small-cell lung cancer [26,27]. No similar study involving CRC has yet been reported. To our knowledge, the present work is the first investigation of the link between miR-145 rs74693964 and risk of CRC in the Chinese population.
In previous studies, Li et al. [28] reported a significant effect of mutant genotypes or alleles of rs41291957 on the risk of CRC. On the other hand, Ying et al. [29] failed to find any association between rs41291957 and susceptibility to CRC. In our study, risk of rectal cancer was shown to be associated with the rs41291957 heterozygous genotype. Rs41291957 is located 91 base pairs (bp) upstream of miR-143. Saini et al. demonstrated that up to 60% of miRNAs have transcription factor binding sites (TFBSs) within 1 kilobase (kb) of the start of the pre-miRNA [30], indicating that rs41291957 in the promoter region may be involved in the transcriptional activation of miR-143. Furthermore, bioinformatic predictions using HaploReg and RegulomeDB indicated that rs41291957 is probably involved in epigenetic modifications that promote colorectal tumorigenesis.
In the present study, the cumulative effects of significant polymorphisms of miR-143 and miR-145 were evaluated in CRC. The risk of CRC increased with the number of at-risk genotypes, especially in rectal cancer. The average SNP density of clustered miRNAs was significantly lower than that of the individual miRNAs, which may to some degree reflect the critical biological functions regulated by clustered miRNAs [31]. The miR-143/145 cluster co-ordinately plays an important part in the carcinogenesis of CRC [32]. It is thus a reasonable assumption that the more mutations occur in the miR-143/145 cluster, the greater the risk of CRC.
KRAS is a direct target of miR-143/145. In the present study, no SNP was identified within the binding region of miR-143/145, nor was there any association with risk of having CRC. However, our results indicated that the rs712 G>T polymorphism in the 3′UTR of the KRAS gene may modulate survival outcome of patients with CRC. Multiple miRNAs, including miR-200b, miR-200c, and miR-429, target rs712. The miR-200 family (miR-200b, miR-200c, and miR-429) has been widely investigated with regard to its role in tumor metastasis and regulating cancer stem cells [33]. Pichler et al. found that miR-200 family expression was associated with poor prognosis in patients with CRC and with cancer stem cell properties in CRC [34]. Therefore, the rs712 G>T change might attenuate its binding capacity with the miR-200 family. Although the association between the KRAS rs712 polymorphism and cancer risk has been widely studied [35][36][37], the effects of this polymorphism on survival of patients with CRC are still unclear. Schneiderova et al. [38] indicated that individuals with colon cancer carrying the heterozygous GT genotype had longer OS. On the other hand, the impact of rs712 on survival of patients with CRC was not significant in a study by Dai and colleagues [39]. Our study suggests that a poor prognosis in Chinese patients with CRC is associated with the homozygous TT genotype. The limited and conflicting results on the prognostic value of KRAS rs712 as a predictor for survival of patients with CRC indicate that more studies in different populations are required.
There were some limitations to the present study. First, owing to the lack of RNA samples for the study population, we were unable to carry out functional validation tests. The biological functions of the selected SNPs in CRC were inferred and predicted using the available online tools. Second, the participants in the cancer group and control group were collected from a hospital and from the community, respectively; thus, selection bias cannot be ignored. Finally, the relatively small sample size, especially for the survival analysis, may have hindered the ability of the study to detect weak gene-disease associations and gene-environment interactions.

Conclusions
In conclusion, our results suggest that rs74693964 C/T and rs41291957 G/A in the miR-143/145 cluster might have cumulative effects on risk of rectal cancer. Rs712 G/T in KRAS might be associated with poorer survival in CRC. Further large population-based prospective studies as well as functional validation are warranted to advance our understanding of the role of these factors in CRC.

Data Availability
The analysed datasets generated during the study are available from the corresponding author on reasonable request.