APC gene 3′UTR SNPs and interactions with environmental factors are correlated with risk of colorectal cancer in Chinese Han population

Abstract Objective: To study the correlation between adenomatous polyposis coli (APC) gene 3′ untranslated region (UTR) single nucleotide polymorphisms (SNPs) and their interactions with environmental factors and the risk of colorectal cancer (CRC) in a Chinese Han population. Methods: Genotypes of APC gene 3′UTR rs1804197, rs41116, rs448475, and rs397768 loci in 340 Chinese Han patients with CRC and 340 healthy controls were analyzed. All patients with CRC were analyzed for progression-free survival (PFS) during a 3-year follow-up. Results: The risk of CRC in subjects carrying the APC gene rs1804197 A allele was 2.95-times higher than for the C allele carriers. The interactions of the rs1804197 SNP with body mass index (BMI) and smoking were associated with the risk of CRC. The risk of CRC in the APC gene rs397768 G allele carriers was 1.68-times higher than in the A allele carriers. The interaction between the rs397768 locus SNP and gender was also associated with the risk of CRC. The 3-year PFS of patients with APC gene rs1804197 AA genotype, CA genotype, and CC genotype CRC decreased in this order, with significant difference. In addition, the 3-year PFS of rs397768 locus GG genotype, AG genotype, and AA genotype CRC patients decreased in this order, and the difference was significant. Conclusion: The rs1804197 locus in the 3′UTR region of the APC gene and its interactions with BMI and smoking are associated with the risk of CRC in a Chinese Han population. In addition, the interaction between rs397768 locus SNP and gender is related to the risk of CRC.


Introduction
Colorectal cancer (CRC) is a common malignant tumor of the digestive tract, and its incidence in China has shown an increasing trend in recent years [1]. With the development of tumor molecular biology, the study and understanding of CRC have entered a new stage. CRC is a systemic disease involving multiple stages, signaling pathways, and pathology-related genes, and is characterized by an ethnic distribution, familial aggregation, and genetic defects [2][3][4].
Owing to biological diversity, that is, human heterogeneity and the multiple steps leading up to carcinogenesis, individuals have different sensitivities to carcinogen exposure [5]. Genetic polymorphism is a critical reason for differences in response to environmental factors. Single nucleotide polymorphisms (SNPs) are an important part of functional genomics and one of the genetic bases of individual differences, and their analysis is important for studying the genetic features of tumors. For example, XPC gene polymorphisms have been linked to susceptibility to CRC [6,7]. Thus, properly identifying genes and their allelic variants, and exploring the interactions between environmental factors and susceptibility genes, is extremely important for defining higher-risk populations, assessing the risk of disease, and developing an effective early warning system. The adenomatous polyposis coli (APC) gene is one of the genes most closely related to CRC. This tumor suppressor gene was discovered by Herrera et al. [8]. Its product, APC protein, is mainly involved in Wnt/β-catenin signaling. It forms a complex with axin and glycogen synthase kinase-3β (GSK-3β), which ensures the normal regulation of the Wnt/β-catenin signaling pathway in cell differentiation, proliferation, polarity, and migration [9]. There are diverse variants of the APC gene in patients with CRC [10,11]. Dysfunction of this gene is believed to be related to the development of CRC, by affecting not only the proliferation and differentiation of epithelial cells, but also their adhesion and migration [12,13].
Table note: The criterion for smoking was one or more cigarettes per day; the criterion for drinking alcohol was more than 50 g of alcohol per day. Abbreviation: SD, standard deviation.
In the present study, the rs1804197, rs41116, rs448475, and rs397768 loci in the 3′ untranslated region (3′UTR) of the APC gene were selected. The minor allele frequency (MAF) of the rs1804197 locus in the southern Chinese Han population is 0.1095. A previous study found that the rs1804197 SNP was associated with autism spectrum disorder (ASD) [14]. The MAF of the rs41116 locus is 0.3190. A number of studies have focused on this locus [15][16][17]; however, no study has yet confirmed that it is related to CRC risk. The MAF of rs448475 is 0.3190. It was reported that the SNP at this locus is associated with non-syndromic cleft lip with or without cleft palate (NSCL±P), which may be related to the binding of microRNA-617 [18]. The MAF of the rs397768 locus is 0.2143, and this locus was previously shown to potentially be associated with breast cancer [17]. These four SNP loci are located in the 3′UTR of the APC gene, where microRNA binding can lead to mRNA degradation or inhibition of mRNA translation and thereby regulate APC expression. The aim of the current study was to analyze the association between the rs1804197, rs41116, rs448475, and rs397768 SNPs in the 3′UTR of the APC gene and the risk of CRC, as well as to explore the effect of their interactions with environmental factors on CRC.
In summary, through a case-control study, we found that the rs1804197 locus in the 3′UTR of the APC gene and its interactions with body mass index (BMI) and smoking were associated with the risk of CRC in the Chinese Han population. Moreover, the SNP of the rs397768 locus and its interaction with gender were related to the risk of CRC.

Basic information on the participants
A total of 340 Chinese Han patients with CRC were recruited from Taizhou Cancer Hospital and Zhejiang Cancer Hospital between February 2014 and January 2016, including 177 males and 163 females, aged 39-85 years. All patients were identified as having CRC by pathological diagnosis. The tumor-node-metastasis (TNM) staging was established with reference to the International Union Against Cancer (UICC) cancer staging criteria [19], with 54 cases in stage I, 62 cases in stage II, 126 cases in stage III, and 98 cases in stage IV. Another cohort of 340 healthy individuals without CRC was enrolled as a control group, with 181 males and 159 females, aged 41-83 years. There were no significant differences between the two groups in age, gender, BMI, smoking, drinking status, and other factors (P>0.05), as shown in Table 1. The control subjects did not have a history of tumors. Informed consent was signed by all CRC patients and control subjects who participated in the present study. The study was approved by the Medical Ethics Committee of Taizhou Cancer Hospital and Zhejiang Cancer Hospital. The recruitment was performed in accordance with the principles of the World Medical Association's Declaration of Helsinki.

Genotype analysis
To determine the genotypes of the APC gene 3′UTR SNPs, 5 ml of peripheral venous blood was collected from all participants, and the genomic DNA was extracted using a DNA extraction kit (TIANGEN Biotech Co. Ltd., Beijing, China) and stored in a freezer at −70 °C before testing. The genotype was analyzed by PCR/Sanger sequencing. The rs1804197 locus primers were: 5′-GAG GGT TTT TGT TCT GGA AGC C-3′ (forward); 5′-CCA TCA AGA GTG CCT CCC AA-3′ (reverse). The rs41116 locus primers were: 5′-CAT TCC ATG CGT TGG CAC TT-3′ (forward); 5′-AGT CTG TGC TAG GCT GCT TG-3′ (reverse). The rs448475 locus primers were: 5′-TCC CTG CCT GTT AAG GAA ACT-3′ (forward); 5′-CCT CCA CTG TAT AAG GGG ACA C-3′ (reverse). The rs397768 locus primers were: 5′-ACA CTC TGT ATT TGG GGA GGG-3′ (forward); 5′-TCA AGG CAC CAG GTA GGT GT-3′ (reverse). The PCR mixture contained 20 ng of genomic DNA, 2 μl of 10× PCR buffer, 20 pmol of each primer, 0.5 U of Taq DNA polymerase, and 1.6 μl of 2.5 mmol/l dNTP. The PCR was carried out under the following conditions: pre-denaturation at 94 °C for 1 min; then denaturation at 94 °C for 30 s, annealing at 56 °C for 30 s, and extension at 72 °C for 1 min, for a total of 35 cycles; followed by a final extension at 72 °C for 10 min. Sanger sequencing was used to determine the sequence of the PCR products, and the SNP genotype was determined by comparison with the sequence in an online database (https://www.ncbi.nlm.nih.gov/snp/).
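The thermocycling conditions above can be written down as data, which makes the total programmed hold time easy to check. The following is a minimal sketch: the variable and function names are illustrative, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# Thermocycling program from the text as (step, temperature in °C, hold in s).
# Names are illustrative; ramp times between steps are not modeled.
INITIAL = [("pre-denaturation", 94, 60)]
CYCLE = [
    ("denaturation", 94, 30),
    ("annealing", 56, 30),
    ("extension", 72, 60),
]
FINAL = [("final extension", 72, 600)]
N_CYCLES = 35


def total_hold_seconds(initial, cycle, final, n_cycles):
    """Sum the programmed hold times, repeating the cycle block n_cycles times."""
    once = sum(hold for _, _, hold in initial) + sum(hold for _, _, hold in final)
    per_cycle = sum(hold for _, _, hold in cycle)
    return once + n_cycles * per_cycle


total_s = total_hold_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.0f} min)")
```

With the values given in the text, the holds add up to 60 + 35 × 120 + 600 = 4860 s, that is, 81 min before ramping is taken into account.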

Clinical follow-up
All patients were followed up for 3 years. The first patient was followed up from February 2014 to February 2017, and the last patient was followed up until January 2019. Progression-free survival (PFS) was recorded for all patients.

Statistical analyses
Continuous variables were expressed as mean ± SD and statistical analyses were conducted using an independent t test. The categorical variables were expressed as a percentage [n (%)] and the statistical analysis was performed using the χ² test. The genotype frequency was analyzed for Hardy-Weinberg equilibrium by the χ² test. The correlation between SNPs in the 3′UTR of the APC gene and the risk of CRC was determined based on the distribution of allele frequencies and genetic models (additive, dominant, and recessive models). The odds ratio (OR) and 95% confidence interval (CI) were calculated in an unconditional logistic regression analysis, adjusted for age, gender, BMI, smoking, drinking, and other factors. Statistical analyses in the present study were performed using SPSS22.0 software (IBM, Chicago, IL). Multifactor dimensionality reduction (MDR) analysis was used to analyze the interaction between the rs1804197 and rs397768 loci and age, gender, BMI, smoking, drinking, and other factors. All tests were two-tailed and P<0.05 was considered statistically significant.
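As an illustration of the Hardy-Weinberg check described above, the following minimal sketch implements the χ² goodness-of-fit test with one degree of freedom for a biallelic locus. The function name and the genotype counts are hypothetical and are not taken from the study's tables; the study itself used SPSS.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa and n_bb are the observed homozygote counts and n_ab the
    heterozygote count at one biallelic locus.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts for 340 subjects, for illustration only.
chi2, p_value = hwe_chi_square(270, 65, 5)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 here means that the observed genotype counts do not deviate significantly from the Hardy-Weinberg expectation, which is the criterion applied in the study.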

Correlation between SNPs in the 3′UTR of the APC gene and the risk of CRC
The genotype frequency distributions of the rs1804197, rs41116, rs448475, and rs397768 loci in the 3′UTR of the APC gene were consistent with Hardy-Weinberg equilibrium (P>0.05), as shown in Table 2. Taking the CC genotype of the rs1804197 locus as a reference, the CA and AA genotype frequencies of CRC patients were significantly higher than those in the control group (P<0.05). The risk of CRC was not significantly increased in the additive model; however, it was significantly enhanced in both the dominant and recessive models (P<0.001). The risk of CRC in subjects with the A allele was 2.95-times higher than in C allele carriers (95% CI: 2.14-4.05, P<0.001). There were no significant differences in the genotype and allele frequencies of rs41116 and rs448475 loci between CRC patients and controls (P>0.05). Taking the AA genotype of the rs397768 locus as a reference, the difference in the AG genotype frequency between CRC patients and the control group was not statistically significant (P>0.05), while the frequency of the GG genotype was significantly higher in CRC patients than in the control group (P<0.001). The risk of CRC was not increased in the additive model (P>0.05); however, it was significantly increased in the dominant and recessive models (P<0.05). The risk of CRC in the G allele carriers was 1.68-times higher than that in the A allele carriers (95% CI: 1.30-2.18, P<0.001).
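To make the allele-based and model-based comparisons concrete, the following minimal sketch computes an unadjusted OR with a Woolf (logit) 95% CI from a 2×2 table, and shows how genotype counts collapse into allele, dominant, and recessive contrasts. It is not the adjusted logistic regression used in the study, so its output will generally differ from the reported ORs; the function names and genotype counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (logit) confidence interval.

    a, b: exposed and unexposed counts among cases
    c, d: exposed and unexposed counts among controls
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def allele_counts(ref_hom, het, alt_hom):
    """(risk alleles, reference alleles) from genotype counts."""
    return 2 * alt_hom + het, 2 * ref_hom + het


def dominant_counts(ref_hom, het, alt_hom):
    """(carriers of at least one risk allele, reference homozygotes)."""
    return het + alt_hom, ref_hom


def recessive_counts(ref_hom, het, alt_hom):
    """(risk-allele homozygotes, all other genotypes)."""
    return alt_hom, ref_hom + het


# Hypothetical (CC, CA, AA) counts, for illustration only.
cases = (200, 110, 30)
controls = (270, 65, 5)

for name, collapse in [("allele", allele_counts),
                       ("dominant", dominant_counts),
                       ("recessive", recessive_counts)]:
    a, b = collapse(*cases)
    c, d = collapse(*controls)
    est, lower, upper = odds_ratio_ci(a, b, c, d)
    print(f"{name}: OR = {est:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

The Woolf interval requires all four cells to be non-zero; sparse tables would need a continuity correction or an exact method.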

Stratified analysis of the correlation between APC gene rs1804197 SNP and the risk of CRC
A stratified analysis of age, gender, BMI, smoking, and alcohol drinking status revealed that the risk of CRC was significantly increased in the APC gene rs1804197 A allele carriers at both ≥60 and <60 years of age (P<0.05). The risk of CRC was significantly increased in both males and females carrying the APC gene rs1804197 A allele (P<0.05). For BMI, the risk of CRC in the APC gene rs1804197 A allele carriers was significantly increased only in subjects with BMI ≥ 24 kg/m² (P<0.001), while in the population with BMI < 24 kg/m², the risk of CRC in the A allele carriers was decreased, but not significantly (P>0.05). In non-smokers, the risk of CRC was significantly higher in subjects carrying the APC gene rs1804197 A allele (P<0.001), whereas in smokers, the subjects carrying the A allele were not at significantly increased risk for CRC (P>0.05). In addition, in both drinking and non-drinking subjects, the risk of CRC in the APC gene rs1804197 A allele carriers was significantly increased (P<0.001), as shown in Table 3.

Stratified analysis of the correlation between APC gene rs397768 SNP and the risk of CRC
Further, analysis stratified by age, gender, BMI, smoking, and drinking status showed that the risk of CRC was not significantly increased in the APC gene rs397768 G allele carriers in subjects aged both ≥60 and <60 years (P<0.05). The risk of CRC was significantly increased in females carrying the G allele of the APC gene rs397768 locus (P<0.05), while males carrying the G allele were not at significantly increased risk for CRC (P>0.05). Carriers of the G allele of the rs397768 locus had no significantly increased risk of CRC at either BMI ≥ 24 kg/m² or BMI < 24 kg/m² (P>0.05). Moreover, in both smokers and non-smokers, the risk of CRC was significantly higher in carriers of the G allele of the rs397768 locus (P<0.05), whereas in both drinking and non-drinking subjects, carriers of the G allele did not have a significantly increased risk of CRC (P>0.05), as shown in detail in Table 4.

Haplotype analysis
Four haplotypes were found in the APC gene rs1804197, rs41116, rs448475, and rs397768 loci, namely, ACGG, CTCG, CTCA, and ATCA (Figure 1). Analysis of the results showed that the risk of CRC in carriers    Table 5.
The false-positive report probabilities (FPRPs) at different levels of prior probability are shown in Table 6. In those aged ≥60 or <60 years, males, females, those with BMI ≥ 24 kg/m², with a smoking history, no smoking history, and no drinking history, there were significant correlations between the susceptibility to CRC and the rs1804197 CA/AA genotype, with an FPRP below 0.2 when the prior probability was 0.1. In females and those without a history of smoking, carrying the rs397768 AG/GG genotype was significantly associated with susceptibility to CRC; when the prior probability was 0.1, the FPRP was less than 0.2. In those with a history of smoking, the FPRP value of the correlation between carrying the rs397768 AG/GG genotype and the susceptibility to CRC was greater than 0.2, indicating that the sample size might be small and the results might be biased. This finding therefore needs to be confirmed in studies with larger samples.
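The FPRP values discussed above depend on three quantities: the observed P value, the statistical power, and the prior probability of a true association. The following minimal sketch uses the commonly cited formulation FPRP = α(1 − π) / [α(1 − π) + (1 − β)π]. The text does not give the alternative OR or the exact procedure used, so the power calculation here (a normal approximation on the log-OR scale, with the alternative OR set to 1.5) is an assumption, and the function names are illustrative.

```python
from math import log
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def fprp(p_value, power, prior):
    """False-positive report probability for a given prior probability."""
    false_positive = p_value * (1.0 - prior)
    true_positive = power * prior
    return false_positive / (false_positive + true_positive)


def p_and_power_from_ci(or_hat, ci_low, ci_high, or_alt=1.5):
    """Approximate the two-sided P value and the power from an OR and its 95% CI.

    Assumption: log(OR) is approximately normal. Power is evaluated against
    or_alt at a significance level equal to the observed P value.
    """
    se = (log(ci_high) - log(ci_low)) / (2 * 1.96)
    z_obs = abs(log(or_hat)) / se
    p_value = 2 * _STD_NORMAL.cdf(-z_obs)
    shift = log(or_alt) / se
    power = _STD_NORMAL.cdf(shift - z_obs) + _STD_NORMAL.cdf(-shift - z_obs)
    return p_value, power


# Illustration with the allelic OR reported for rs1804197 (2.95, 95% CI 2.14-4.05).
# Table 6 is based on stratified genotype ORs, so this is not a reproduction of it.
p_value, power = p_and_power_from_ci(2.95, 2.14, 4.05)
for prior in (0.25, 0.1, 0.01):
    print(f"prior = {prior}: FPRP = {fprp(p_value, power, prior):.2e}")
```

As a worked check of the formula, P = 0.05, power = 0.8, and prior = 0.1 give 0.045 / (0.045 + 0.08) = 0.36, which would exceed the 0.2 threshold used in the text.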

MDR analysis of the interaction between genes and environmental factors
MDR was used to analyze the interaction between APC gene SNPs rs1804197 and rs397768 and environmental factors such as age, sex, BMI, smoking, and drinking status. The results showed a robust interaction between rs397768 and smoking, followed by the interaction between rs397768 and gender, as shown in Figure 2.

Association of the APC gene 3′UTR SNPs with the PFS of patients with CRC
After 3 years of follow-up, we found that the PFS of CRC patients significantly differed among subjects with the APC gene rs1804197 AA, CA, and CC genotypes, in decreasing order (P=0.004; Figure 3A). However, there was no significant difference in the 3-year PFS between different genotypes in the rs41116 and rs448475 loci of CRC patients (P>0.05; Figure 3B,C). The 3-year PFS of CRC patients differed significantly among subjects with rs397768 locus GG, AG, and AA genotypes, in decreasing order (P<0.001; Figure 3D).
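The text does not name the estimator behind the PFS curves in Figure 3. Assuming they are Kaplan-Meier curves, which is the usual choice for this kind of comparison, a minimal sketch of the estimator is given below. The follow-up times and event indicators are hypothetical, and the function name is illustrative.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival function.

    times: follow-up time per patient (for example, in months)
    events: 1 if progression was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data in months, for illustration only.
months = [6, 12, 12, 20, 36, 36]
progressed = [1, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(months, progressed):
    print(f"month {t}: PFS = {s:.3f}")
```

Comparing genotype groups would additionally require a test such as the log-rank test, which is not shown here.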

Discussion
With increasing progress in molecular biology, the study of the etiology of CRC has advanced from environmental factors to environmental-genetic interactions, and has gradually moved on to analyses at the molecular level [20,21]. By investigating the molecular mechanisms behind the formation of tumors, a clearer understanding of the relationship among environment, genetics, genes, and CRC has been established. This advance in our knowledge has changed the treatment of CRC from surgery-based approaches to a comprehensive toolkit of surgery, radiotherapy, chemotherapy, and targeted therapy [22]. After extensive studies of the pathogenesis, it was gradually realized that the progression of CRC is a continuous multistage process. Under the influence of internal and external environmental factors, genetic changes accumulate continuously and the normal regulation of cell division, apoptosis, and tissue self-stability is disrupted; survival advantages thereby develop, eventually leading to a malignant tumor [23,24]. Previous studies found that SNPs are one of the main causes of differences in tumor susceptibility [25]. Genetic material is impaired by a variety of factors, leading to reduced genomic stability; in particular, the interaction between genetic and environmental factors has been recognized as a factor contributing to a variety of tumors [26].
The APC gene is a tumor suppressor gene, and the product of its expression is an important component of the Wnt signaling pathway, which plays a critical regulatory role in cell growth, apoptosis, and signal transmission [27]. Inactivation of the APC gene results in disruption of the degradation of β-catenin, leading to the accumulation of free β-catenin in the cytoplasm and its translocation into the nucleus. This in turn activates Tcf/Lef, causing abnormal transcription of the c-myc, c-jun, and cyclin D1 genes, which eventually causes cancers to develop [28]. Changes in APC protein expression levels may be associated with the development of CRC [29]. In the present study, the selected SNP loci are located in the 3′UTR of the APC gene, which is the region where microRNAs bind to the APC transcript. The regulation of APC gene expression by microRNAs may be related to the efficiency of binding of microRNA to the APC gene 3′UTR.
The results of the present study show that the risk of CRC in subjects carrying the A allele of the rs1804197 locus in the APC gene was 2.95-times higher than in carriers of the C allele. A previous study found that the rs1804197 SNP is associated with ASD [14], and suggested that this SNP may be related to the expression level of APC protein. Our further analyses showed that the interactions of the rs1804197 SNP with BMI and smoking were associated with the risk of CRC, suggesting that obesity and smoking may have a certain impact on the risk of CRC in patients with different rs1804197 genotypes. In obese patients, the rs1804197 A allele carriers have a higher risk of CRC, whereas in non-smokers, the rs1804197 A allele carriers have a higher risk of it. Interestingly, there is no cumulative effect between smoking and the rs1804197 SNP; however, there is a cumulative effect between non-smoking and the rs1804197 SNP. We speculate that smoking may also be a risk factor for CRC and confers a higher risk of CRC regardless of which rs1804197 locus genotype is carried.
We also found that the risk of CRC in APC gene rs397768 G allele carriers was 1.68-times higher than in A allele carriers, after adjustment for age, gender, BMI, smoking, drinking, and other factors. Further analyses revealed that the interaction between rs397768 locus SNP and gender was associated with the risk of CRC, as it was shown that only in female subjects was the risk of CRC significantly increased in the APC gene rs397768 G allele carriers. We suspect that this may be related to bad habits. Specifically, men are more likely to be smokers, drinkers, and have bad eating habits than women. Therefore, the risk of CRC is relatively high in males, regardless of the alleles of rs397768. It was also shown previously that men are more susceptible to CRC than women [30].
We further found that the APC gene rs1804197 SNP was associated with 3-year PFS in patients with CRC, and the 3-year PFS differed among patients with AA, CA, and CC genotypes, in descending order. The APC gene rs397768 locus SNP was also associated with 3-year PFS in patients with CRC, and the 3-year PFS varied among CRC patients with GG, AG, and AA genotypes, in descending order. We considered that the reason for this may be that APC gene rs1804197 and rs397768 SNPs are related to the expression level of APC. Low expression or inactivation of APC is one of the causes of cell hyperproliferation, which may eventually lead to a reduction in 3-year PFS in CRC patients. However, further studies are needed to confirm this.
The present study has several limitations. First, it was not clear which microRNA binds to the rs1804197 and rs397768 loci in the 3′UTR of the APC gene, and therefore there is no direct evidence to support the correlation between the regulation of APC expression by microRNAs binding to the rs1804197 and rs397768 loci and CRC risk. Second, the 3-year follow-up of CRC patients was relatively short. However, considering the high rate of loss to follow-up after 5 years, we decided to use the data from 3 years of follow-up. In addition, genotype-based mRNA expression analysis was not performed in the present study, so further studies are needed to address this.