The association of polymorphisms in lncRNA-H19 with hepatocellular cancer risk and prognosis

Hepatocellular cancer (HCC) is one of the major causes of cancer-related mortality. Genetic polymorphisms may affect cancer susceptibility and clinical outcomes. We aimed to investigate the association of single nucleotide polymorphisms (SNPs) in the lncRNA-H19 gene with the risk and prognosis of HCC. A total of 944 samples, comprising 472 HCC patients and 472 matched controls, were included in the risk analysis, and 350 of the HCC samples were investigated in the prognosis analysis. SNP genotyping was performed with the KASP method. The TT + CT genotype of rs2839698 was associated with a 1.32-fold increased HCC risk (P=0.037, 95% confidence interval (CI) = 1.02–1.70). In the stratified analysis, rs2839698 (odds ratio (OR) = 1.57, P=0.007, 95% CI = 1.13–2.18) and rs3024270 (OR = 1.71, P=0.019, 95% CI = 1.09–2.68) showed a more pronounced increase in HCC risk in the age ≤60 subgroup. rs2839698 was also associated with an increased HCC risk in the ever-smoking subgroup, whereas rs2735971 was associated with a decreased HCC risk in the male subgroup. Furthermore, haplotype analysis showed that the rs2735971-rs2839698-rs3024270 G-T-C haplotype significantly increased the risk of HCC (OR = 1.23, 95% CI = 1.01–1.51, P=0.043). Multilogistic analysis revealed no significant interaction effects between the SNPs and environmental factors. In the ever-smoking subgroup, rs2839698 was associated with a significantly poorer prognosis (hazard ratio (HR) = 5.19, 95% CI = 1.12–24.07, P=0.035). The lncRNA-H19 rs2839698 SNP has the potential to be a predictor of HCC risk and prognosis.


Introduction
Hepatocellular cancer (HCC) is the major liver malignancy and the second leading cause of cancer-related mortality worldwide [1]. Both hereditary and environmental factors have been shown to be associated with the incidence of HCC [2]. Many single nucleotide polymorphisms (SNPs) in coding and non-coding genes have been reported to be related to HCC risk, and they are of great value for selecting individuals who would benefit from specific diagnostic and preventive measures [3]. However, few studies have investigated lncRNA polymorphisms as predictive biomarkers for HCC risk and prognosis. Furthermore, many studies have reported that gene polymorphisms can serve as predictors of cancer diagnosis and prognosis [4,5], suggesting that polymorphisms have a valuable application in both settings.
In the human genome, approximately 5–10% of sequences are stably transcribed, and only approximately 1% are protein-coding sequences, while a large part of the remainder are transcribed as non-coding RNAs (ncRNAs) [6]. lncRNAs, which are longer than 200 nt, are among the most important members of the ncRNA family and have been found to be genetically altered and differentially expressed in tumors [7,8]. The lncRNA-H19 gene, located on chromosome 11p15.5 [9], has been reported to be one of the major genes in cancer [10]. Many studies have described H19 as an oncogenic lncRNA in multiple cancers, such as colorectal cancer [11], gastric cancer [12], breast cancer [13], and bladder cancer [14]. In addition, recent research has shown that lncRNA-H19 plays an important role in cancer initiation, progression, and metastasis, promotes tumor growth, and indicates poor prognosis [11,15,16].
It is well accepted that lncRNA-H19 plays an important role in the incidence and prognosis of cancers and that SNPs in H19 are promising biomarkers for cancer risk [17]. A meta-analysis of the association between H19 polymorphisms and cancer risk was recently published [18], but it did not analyze the interaction between H19 SNPs and environmental factors or the association between H19 SNPs and cancer prognosis. Moreover, no study has yet investigated H19 polymorphisms in relation to both HCC risk and prognosis. It therefore remains unclear whether lncRNA-H19 polymorphisms play a role in HCC and whether they could be promising biomarkers for HCC risk and prognosis.
In the present study, we selected three potentially functional SNPs in the lncRNA-H19 gene, following a candidate gene association strategy, to explore the relationship between H19 polymorphisms and HCC risk and prognosis. We aimed to identify predictive biomarkers for the risk and prognosis of HCC, to provide a basis for using H19 gene polymorphisms as predictive biomarkers in individuals, and to improve the understanding of the etiology and development of HCC.

Patients and study design
This research project was approved by the Ethical Committee of the Shengjing Hospital of the China Medical University, and written informed consent was obtained. The present study was designed as two independent but related parts: a risk study and a prognosis study. The risk study involved a total of 944 participants, including 472 HCC patients and 472 sex- and age-matched (±5 years) frequency-matched controls, recruited at the Shengjing Hospital of China Medical University from 2013 to 2015. The response rate for both cases and controls was 90% or higher.
To investigate the relationship between lncRNA-H19 polymorphisms and overall survival in HCC patients, we analyzed data from 350 HCC patients for whom death and survival information was available. All of these patients had pathologically confirmed HCC. Patients (i) with distant metastasis found preoperatively, (ii) who underwent preoperative radiotherapy or chemotherapy, or (iii) with incomplete pathological data entries were excluded from the prognosis analysis. Follow-up was completed by 10 July 2017.

Selection of polymorphism sites
The polymorphisms of lncRNA-H19 studied here were selected from HapMap data [19]. TagSNPs were selected by Tagger via Haploview with the following criteria: pairwise tagging in the HapMap population with r² ≥ 0.8; a minor allele frequency (MAF) ≥5%; and Han Chinese in Beijing (CHB) ethnicity. The search region was extended by 10 kbp both upstream and downstream of H19. This yielded 17 candidate SNPs (Supplementary Figure S1 and Materials). We then cross-referenced these candidates with a published study [17] and took the intersection as the promising target SNPs. Ultimately, three SNPs covering the lncRNA-H19 gene were selected for our study: rs2735971 (G→A), rs2839698 (C→T), and rs3024270 (G→C).
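The two numeric tagging criteria can be illustrated with a minimal Python sketch. This is not the Tagger/Haploview implementation; it only shows how a MAF and a pairwise r² are computed for biallelic SNPs. The function names and all counts are hypothetical, and the r² calculation assumes phased haplotype counts are available.

```python
def minor_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """MAF at a biallelic SNP from genotype counts (hypothetical helper)."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    alt_freq = (2 * n_alt_hom + n_het) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1 - alt_freq)


def pairwise_r2(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    n_AB is the number of haplotypes carrying allele A at SNP 1 and
    allele B at SNP 2, and so on (hypothetical helper).
    """
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = p_AB - p_A * p_B  # linkage disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


def meets_tagging_criteria(maf, r2, maf_min=0.05, r2_min=0.8):
    """Apply the thresholds quoted in the text (MAF >= 5%, r^2 >= 0.8)."""
    return maf >= maf_min and r2 >= r2_min


# Hypothetical counts, for illustration only.
maf = minor_allele_frequency(n_ref_hom=50, n_het=40, n_alt_hom=10)
r2 = pairwise_r2(n_AB=60, n_Ab=10, n_aB=5, n_ab=25)
print(maf, r2, meets_tagging_criteria(maf, r2))
```

With these made-up counts the MAF threshold is met but the two SNPs are not in strong enough linkage disequilibrium for one to tag the other.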

Genotyping
Genomic DNA was extracted as previously described [20] and diluted to a working concentration of 20 ng/μl for genotyping. The genotyping assay was performed by Gene Company using KASP (Gene Company, Shanghai, China). Information on the KASP primers is summarized in Supplementary Table S1. Five percent of the samples were genotyped in duplicate; the concordance rate of the repeated samples was 100%, suggesting that the genotyping results were reliable.

Statistical analysis
The χ² test was used to compare the demographic characteristics of the samples, and ANOVA was used to compare age. Multivariate logistic regression with adjustment for age and gender was used to assess the association between the selected SNPs and HCC risk. SHEsis software was used to analyze the haplotypes of the selected gene [21]. The association between polymorphisms and clinical parameters was analyzed by the χ² test. Univariate and multivariate survival analyses were conducted with the log-rank test and the Cox proportional hazards model. Statistical analysis was performed using SPSS version 18.0 software (SPSS, Chicago, IL, U.S.A.), and a P-value <0.05 was considered statistically significant.
Table notes. Abbreviations: Chr. Pos., chromosomal position; CI, confidence interval; Loc., localization; OR, odds ratio; P HWE, P-value for HWE. a The sort order follows the SNP location in its gene from the 5′ to the 3′ end. b P-value was calculated with adjustment for age and gender. Bold text in this table indicates significance (P<0.05).
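As a rough illustration of two quantities used in the risk analysis, the following Python sketch computes a Hardy-Weinberg equilibrium χ² test (1 degree of freedom) and a crude odds ratio with a Wald 95% CI from a 2×2 table. It is a simplified stand-in, not the study's procedure: the reported ORs came from logistic regression adjusted for age and gender in SPSS, whereas this sketch is unadjusted. The function names and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for HWE at a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def odds_ratio_wald(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Crude (unadjusted) odds ratio with a Wald confidence interval.

    'Exposed' here would be carriers of the variant genotype, e.g.
    TT + CT versus CC under a dominant model.
    """
    or_ = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    se_log_or = math.sqrt(
        1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp
    )
    low = math.exp(math.log(or_) - z * se_log_or)
    high = math.exp(math.log(or_) + z * se_log_or)
    return or_, low, high


# Hypothetical genotype and carrier counts, for illustration only.
print(hwe_chi_square(n_aa=120, n_ab=230, n_bb=122))
print(odds_ratio_wald(case_exp=260, case_unexp=212, ctrl_exp=228, ctrl_unexp=244))
```

A P-value above 0.05 from `hwe_chi_square` would be read as no evidence of departure from HWE in that group.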

The association of lncRNA-H19 SNPs with HCC risk
The demographic characteristics of the HCC patients and controls are shown in Supplementary Table S2. Table 1 shows the genotype distributions of the three lncRNA-H19 SNPs (rs2735971, rs2839698, rs3024270) in cases and controls; all three conformed to Hardy-Weinberg equilibrium (HWE). The lncRNA-H19 rs2839698 polymorphism was associated with an increased risk of HCC. In the dominant model, the rs2839698 TT + CT genotype was associated with a 1.32-fold increased HCC risk compared with the CC genotype (P=0.037, Table 1). Subsequently, we conducted stratified analyses by gender, age, smoking, and drinking to assess the relationship between each SNP and HCC risk. The results in Table 2 indicate potential predictive value in specific subgroups. When stratified by gender, the AG genotype of rs2735971 showed a decreased HCC risk in the male subgroup (odds ratio (OR) = 0.72, P=0.048). When stratified by age, the CT genotype of rs2839698 showed an increased HCC risk in the age ≤60 subgroup (OR = 1.57, P=0.007), as did the CC genotype of rs3024270 (OR = 1.71, P=0.019). When stratified by smoking, the CT genotype of rs2839698 showed a more pronounced increase in HCC risk in the ever-smoker subgroup (OR = 1.89, P=0.041, Table 2).
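A reader can sanity-check a reported ratio, interval, and P-value against each other. The sketch below back-calculates a two-sided Wald P-value from an OR and its 95% CI, assuming the interval is symmetric on the log scale. The function name is hypothetical, and the result can only approximate the reported P because the published figures are rounded and come from an adjusted model.

```python
import math


def wald_p_from_ci(estimate, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate a z statistic and two-sided P from a ratio and its 95% CI."""
    # On the log scale the CI is estimate +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# rs2839698, dominant model, values as reported in the abstract:
# OR = 1.32, 95% CI = 1.02-1.70, reported P = 0.037.
z, p = wald_p_from_ci(1.32, 1.02, 1.70)
print(f"z = {z:.2f}, back-calculated P = {p:.3f}")
```

The same function applies to hazard ratios, since those intervals are also constructed on the log scale.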

LncRNA-H19 SNP-environment interaction with HCC risk
Data mining was conducted to analyze possible interactions between lncRNA-H19 polymorphisms and environmental factors in HCC risk (Table 4); no significant results were found.

The association of lncRNA-H19 SNPs with HCC prognosis
The association between the clinical features of the HCC patients and overall survival in univariate analysis is shown in Supplementary Table S3. We also analyzed the relationship between each lncRNA-H19 SNP and the overall survival of HCC patients; no significant association was found in either the univariate or the multivariate survival analysis (Table 5). In the stratified analysis, rs2839698 was associated with a significantly poorer prognosis in the ever-smoking subgroup (hazard ratio (HR) = 5.19, 95% CI = 1.12–24.07, P=0.035, Table 6).
Table notes. a Haplotype for H19 rs2735971-rs2839698-rs3024270. Bold text indicates significant results.
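For readers unfamiliar with the univariate survival comparison named in the statistical analysis, here is a minimal two-group log-rank test in plain Python. It is a sketch only: the study used SPSS, the function name is hypothetical, and the follow-up times and event flags below are invented for illustration.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; event flag 1 = death observed, 0 = censored."""
    records = [(t, e, 0) for t, e in zip(times_a, events_a)]
    records += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in records if e})

    obs_minus_exp = 0.0  # sum over event times of (observed - expected) deaths in group A
    variance = 0.0       # hypergeometric variance of that sum
    for t in event_times:
        n_a = sum(1 for u, _, g in records if u >= t and g == 0)  # at risk, group A
        n_b = sum(1 for u, _, g in records if u >= t and g == 1)  # at risk, group B
        d_a = sum(1 for u, e, g in records if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in records if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = obs_minus_exp ** 2 / variance
    # Chi-square upper tail with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical survival times in months for two genotype groups.
variant = ([6, 9, 14, 20, 31], [1, 1, 1, 0, 1])
wild_type = ([12, 25, 33, 40, 48], [1, 0, 1, 0, 0])
print(logrank_test(*variant, *wild_type))
```

A multivariate analysis such as the Cox model additionally adjusts for covariates, which this unadjusted comparison does not do.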

Discussion
The H19 gene is 2.3 kb long, is located at 11p15.5, and contains five exons and three introns [22]. It is well accepted that lncRNA-H19 plays an important role in the development, migration, invasion, and metastasis of cancers [23]. As a long ncRNA, H19 lacks an ORF and is not translated into protein; its end product is an RNA sequence that can itself participate in RNA regulation [24]. Because the relationship between H19 variants and cancer risk and prognosis still needs to be clarified, H19 polymorphisms have attracted great interest in recent years [17]. Many studies have reported that H19 SNPs are related to cancer risk and prognosis, such as rs217727 with breast cancer [13] and rs2839698 with gastric cancer [25], whereas rs1859168 was reported to reduce the risk of pancreatic cancer [26]. However, the association between lncRNA-H19 SNPs and HCC risk and prognosis has not yet been reported.
To investigate the role of H19 SNPs in HCC risk and prognosis, we screened three intron SNPs in the H19 gene. Under the dominant model, the TT + CT genotype of rs2839698 was associated with a 1.32-fold increased HCC risk compared with the CC wild-type; this is the first report indicating that an H19 SNP is related to HCC risk. Thus, rs2839698 may serve as a promising predictor of HCC risk. rs2839698 is an intron SNP, and it is now accepted that intron SNPs may have functions of their own, such as affecting alternative splicing [27,28]. Because some intron polymorphisms occupy important locations and may be functional, intron SNPs such as rs2839698 are reasonable candidates for study.
When stratified by gender, age, smoking, and drinking, rs2839698 and rs3024270 showed higher ORs of 1.57 and 1.71, respectively, in the age ≤60 subgroup. In the ever-smoking subgroup, rs2839698 was associated with a 1.89-fold increased HCC risk. It has been reported that the expression of H19 can be induced by cigarette smoke condensate in human respiratory epithelial cells [29]. We therefore suppose that mature lncRNA-H19 could be affected by cigarette smoke and that, in ever smokers, the polymorphisms may contribute more to risk than the environmental factors do. These results indicate that the promising H19 SNPs may be better biomarkers in certain subgroups and could benefit individualized diagnosis of HCC in those populations.
Furthermore, we performed an interaction analysis of the lncRNA-H19 SNPs and environmental factors, including smoking and drinking, but found no interaction between the three polymorphisms and these factors. In the haplotype analysis, the rs2735971-rs2839698-rs3024270 G-T-C haplotype was found to significantly increase the risk of HCC (OR = 1.23, 95% CI = 1.01–1.51, P=0.043), suggesting that carriers of the G-T-C haplotype are at higher risk than carriers of other haplotypes. We further performed univariate and multivariate Cox proportional hazards regression analyses of overall survival time to explore the association between lncRNA-H19 SNPs and HCC prognosis. No significant association was found between H19 SNPs and HCC overall survival. In the subgroup analysis of HCC prognosis, however, rs2839698 was associated with significantly poorer survival in the ever-smoking subgroup, which suggests that it may affect HCC prognosis and could be a promising prognostic biomarker. As discussed above, the expression of H19 can be induced by cigarette smoke [29]. Thus, in ever smokers, the polymorphisms may contribute more than the environmental factors, and individuals carrying the variant genotype, who also have a higher cancer risk, could have poorer survival.
However, the present study has several limitations. First, the sample size was relatively small, which may have limited further subgroup and interaction analyses for the variant genotypes. Second, we studied only the local population and did not include residents of other areas. Third, the controls were collected from the health check program of our hospital and no information on their HBV status was available, so the influence of this factor could not be assessed. In future, larger and multicenter samples are needed to confirm our findings.
In conclusion, we found that the intron SNP rs2839698 of lncRNA-H19 was associated with an increased risk of HCC. More significant findings were seen for rs2839698 and rs3024270 in the age ≤60 subgroup. In the ever-smoking subgroup, rs2839698 also showed a clearly increased HCC risk. In contrast, rs2735971 showed a decreased HCC risk in the male subgroup. In addition, the haplotype analysis showed that the rs2735971-rs2839698-rs3024270 G-T-C haplotype significantly increased the risk of HCC. The MDR analysis yielded no significant findings. In the prognosis analysis, rs2839698 was associated with a poor prognosis in the ever-smoking subgroup. Larger-scale studies are needed to confirm our results.