Circulating lncRNA UCA1 and lncRNA PGM5-AS1 act as potential diagnostic biomarkers for early-stage colorectal cancer

Abstract Background: Colorectal cancer (CRC) is one of the most common and significant malignant diseases worldwide. In the present study, we evaluated two long non-coding RNAs (lncRNAs) in CRC patients as diagnostic markers for early-stage CRC. Methods: Using Gene Expression Omnibus (GEO) datasets GSE102340, GSE126092, GSE109454 and GSE115856, 14 differentially expressed lncRNAs were identified between cancer and adjacent tissues, among which, the two most differentially expressed were confirmed using quantitative real-time polymerase chain reaction (qRT-PCR) in 200 healthy controls and 188 CRC patients. A receiver operating characteristic (ROC) analysis was employed to evaluate the diagnostic accuracy for CRC. Results: From four GEO datasets, three up-regulated and eleven down-regulated lncRNAs were identified in CRC tissues, among which, lncRNA urothelial carcinoma-associated 1 (UCA1) and lncRNA phosphoglucomutase 5-antisense RNA 1 (PGM5-AS1) were the most significantly up- and down-regulated lncRNAs in CRC patient plasma, respectively. The area under the ROC curve was calculated to be 0.766, 0.754 and 0.798 for UCA1, PGM5-AS1 and the combination of these two lncRNAs, respectively. Moreover, the diagnostic potential of these two lncRNAs was even higher for the early stages of CRC. The combination of UCA1 and PGM5-AS1 enhanced the AUC to 0.832, and when the lncRNAs were used with carcinoembryonic antigen (CEA), the AUC was further improved to 0.874. Conclusion: In the present study, we identified two lncRNAs, UCA1 and PGM5-AS1, in CRC patients’ plasma, which have the potential to be used as diagnostic biomarkers of CRC.


Introduction
Colorectal cancer (CRC) is one of the deadliest cancers: roughly 1.8 million new cases were diagnosed and more than 900,000 patients died of CRC in 2020 [1]. Although many effective new treatments have been developed in recent decades, the long-term survival rate remains low owing to the lack of effective methods for early-stage diagnosis [2]. Body fluid-based testing is an inexpensive and non-invasive method for the early-stage diagnosis of cancer and provides crucial information for monitoring tumor progression and evaluating prognosis [3]. For example, the detection of α-fetoprotein (AFP) is of great importance in the early-stage diagnosis of hepatocellular carcinoma [4]. However, the serum tumor biomarkers currently used for CRC, carbohydrate antigen 19-9 and carcinoembryonic antigen (CEA), show a low positive rate and poor sensitivity for early-stage CRC. Hence, there is an urgent need to find new biomarkers with high sensitivity and specificity for the early-stage diagnosis of CRC.
In the present study, we aimed to find novel plasma lncRNAs as biomarkers for the early-stage diagnosis of CRC. By analyzing four Gene Expression Omnibus (GEO) datasets, we obtained a number of differentially expressed lncRNAs, among which lncRNA urothelial carcinoma-associated 1 (UCA1) and phosphoglucomutase 5-antisense RNA 1 (PGM5-AS1) were confirmed using quantitative real-time polymerase chain reaction (qRT-PCR). These lncRNAs show great potential as diagnostic biomarkers for early-stage CRC and may provide new insights into the pathological mechanism of CRC.

LncRNA expression in public datasets
Expression profiles matching 'colorectal cancer' and 'LncRNA' in the GEO public database were searched to identify genes differentially expressed between CRC tissues and adjacent tissues, and four datasets, GSE102340, GSE126092, GSE109454 and GSE115856, were selected. Information on the four GEO datasets is summarized in Supplementary Table S1. GEO2R was used to analyze the differentially expressed genes in each dataset with the following criteria: logFC ≥ 1 and P-value < 0.05 for up-regulated genes, and logFC ≤ −1 and P-value < 0.05 for down-regulated genes. The online Venn diagram tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) was then used to identify the differentially expressed genes common to all four datasets.
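The thresholding and intersection steps described above can be sketched in a few lines of Python. This is only an illustration, not the GEO2R or Venn tool workflow itself: it assumes each dataset's results have already been exported as records with hypothetical field names `gene`, `logFC` and `pvalue`, and the gene names and values in the toy input are invented placeholders.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]  # hypothetical record: {"gene": str, "logFC": float, "pvalue": float}


def split_by_direction(rows: List[Row],
                       lfc_cutoff: float = 1.0,
                       p_cutoff: float = 0.05) -> Tuple[Set[str], Set[str]]:
    """Return (up, down) gene sets using logFC >= 1 / <= -1 and P < 0.05."""
    up: Set[str] = set()
    down: Set[str] = set()
    for row in rows:
        if row["pvalue"] >= p_cutoff:
            continue  # not significant in this dataset
        if row["logFC"] >= lfc_cutoff:
            up.add(row["gene"])
        elif row["logFC"] <= -lfc_cutoff:
            down.add(row["gene"])
    return up, down


def common_genes(datasets: Dict[str, List[Row]]) -> Tuple[Set[str], Set[str]]:
    """Intersect the up- and down-regulated sets across all datasets."""
    ups, downs = [], []
    for rows in datasets.values():
        up, down = split_by_direction(rows)
        ups.append(up)
        downs.append(down)
    if not ups:
        return set(), set()
    return set.intersection(*ups), set.intersection(*downs)


# Toy input with placeholder gene names (not real study data).
toy = {
    "dataset_A": [
        {"gene": "LNC_X", "logFC": 2.1, "pvalue": 0.001},
        {"gene": "LNC_Y", "logFC": -1.5, "pvalue": 0.01},
        {"gene": "LNC_Z", "logFC": 1.2, "pvalue": 0.20},
    ],
    "dataset_B": [
        {"gene": "LNC_X", "logFC": 1.0, "pvalue": 0.04},
        {"gene": "LNC_Y", "logFC": -3.0, "pvalue": 0.002},
        {"gene": "LNC_Z", "logFC": 1.8, "pvalue": 0.01},
    ],
}
common_up, common_down = common_genes(toy)
print(sorted(common_up), sorted(common_down))
```

In the toy input, `LNC_Z` passes the fold-change cutoff in both datasets but fails the P-value cutoff in one of them, so it is dropped from the common up-regulated set.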

Sample collection
The present study included 188 CRC patients enrolled at the Affiliated Hospital of Qingdao University from 3 April 2019 to 30 September 2020. The CRC patients were diagnosed by two independent pathologists. A total of 200 healthy individuals who underwent health examinations at the Health Management Center were selected as controls. Controls were required to be free of related diseases such as immunodeficiency diseases, hypertension and diabetes. None of the CRC patients had received treatment before sampling.

RNA isolation
Plasma was collected from CRC patients and controls in separating gel coagulation-promoting tubes and used for total RNA preparation with a TIANGEN RNA extraction kit (Beijing, China). A NanoDrop OneC Spectrophotometer (Thermo Scientific, Rockford, IL) was used to measure the quantity and purity of the RNA. The OD260/OD280 ratio of the samples used for reverse transcription was between 1.8 and 2.0.

qRT-PCR
The complementary DNA (cDNA) for real-time PCR was generated from 1.0 μg of RNA using the PrimeScript RT reagent Kit with gDNA Eraser (#RR047A, Takara, Tokyo, Japan), a reverse-transcription kit for real-time RT-PCR (RT-qPCR) that includes a genomic DNA elimination reaction. Real-time PCR was performed with an initial denaturation step at 95 °C for 30 s, followed by 45 cycles of 95 °C for 5 s and 60 °C for 30 s. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as the internal control. LncRNAs showing non-specific amplification or primer dimerization were not included in the present study. In addition, lncRNAs that were barely detectable by qRT-PCR because of their low copy number (Ct value > 35) were excluded. The primers for the lncRNAs were designed on exon regions spanning introns to avoid amplification from genomic DNA. The primers were as follows: UCA1, forward: 5′-GCCGAGAGCCGATCAGACAAAC-3′, reverse: 5′-AACGGATGAAGCCTGCTTGGAAG-3′; PGM5-AS1, forward: 5′-AGCTGGTGGAATCATTCTAACA-3′, reverse: 5′-GAGATAGGTCGATTCGGAGATC-3′; GAPDH, forward: 5′-TGACTTCAACAGCGACACCCA-3′, reverse: 5′-CACCCTGTTGCTGTAGCCAAA-3′. The relative lncRNA levels were calculated by the 2^−ΔΔCt method.
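As a worked illustration of the relative quantification step, the sketch below assumes the commonly used 2^−ΔΔCt form, with GAPDH as the reference gene and the mean ΔCt of the control group as the calibrator. The choice of calibrator, the function names and the Ct values are assumptions made for the example and are not taken from the study.

```python
from statistics import mean
from typing import List, Tuple

CT_CUTOFF = 35.0  # lncRNAs with Ct above this value were excluded in the text


def is_detectable(ct: float, cutoff: float = CT_CUTOFF) -> bool:
    """True when the Ct value is at or below the detection cutoff."""
    return ct <= cutoff


def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Delta Ct: target Ct normalized to the reference gene (here GAPDH)."""
    return ct_target - ct_reference


def relative_expression(ct_target: float,
                        ct_reference: float,
                        calibrator_delta_ct: float) -> float:
    """2^-(ddCt), where ddCt = sample dCt minus calibrator dCt."""
    ddct = delta_ct(ct_target, ct_reference) - calibrator_delta_ct
    return 2.0 ** (-ddct)


# Invented (target Ct, GAPDH Ct) pairs for two control samples.
controls: List[Tuple[float, float]] = [(30.0, 20.0), (31.0, 21.0)]
calibrator = mean(delta_ct(t, r) for t, r in controls)  # mean control dCt

# Invented patient sample: target Ct 28, GAPDH Ct 20.
fold_change = relative_expression(28.0, 20.0, calibrator)
print(fold_change)
```

With these invented numbers the control ΔCt is 10 and the patient ΔCt is 8, so ΔΔCt = −2 and the relative expression is 2^2 = 4, i.e. a four-fold increase over the control mean.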

Statistical analysis
The qRT-PCR results of the CRC patients were compared with those of the healthy individuals using the Mann-Whitney U test. The diagnostic potential of the lncRNAs was evaluated using ROC curves. The experimental results are presented as the mean ± standard deviation (SD), and P < 0.05 was considered statistically significant. The statistical analyses were performed using the SPSS package, version 19.0 (Chicago, IL).
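The two analyses named above are closely related: the area under the ROC curve equals the Mann-Whitney U statistic divided by the number of case-control pairs. The sketch below illustrates that relationship in plain Python. It is not the SPSS procedure used in the study, it does not compute a P-value, and the expression values are invented.

```python
from typing import Sequence


def mann_whitney_u(cases: Sequence[float], controls: Sequence[float]) -> float:
    """U statistic: number of (case, control) pairs with case > control.

    Tied pairs contribute 0.5 each.
    """
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def auc_from_u(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC = U / (n_cases * n_controls).

    This is the probability that a randomly chosen case has a higher
    marker value than a randomly chosen control.
    """
    return mann_whitney_u(cases, controls) / (len(cases) * len(controls))


# Invented relative expression values for an up-regulated marker.
patient_levels = [2.5, 3.1, 1.8, 4.0]
control_levels = [1.0, 1.2, 2.0, 0.9]

auc = auc_from_u(patient_levels, control_levels)
print(auc)
```

For a down-regulated marker, the same function returns a value below 0.5; swapping the two arguments (or taking 1 − AUC) gives the discriminative AUC in that case.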

Validation of the diagnostic potential of the lncRNAs
To validate the potential of these lncRNAs in the diagnosis of CRC, we measured their expression levels in the plasma of 188 CRC patients and 200 healthy donors. Among these lncRNAs, the expression of UCA1 and PGM5-AS1 was markedly up- and down-regulated, respectively (P<0.001, Figure 2A). We also analyzed their expression in the four GEO datasets, GSE102340, GSE126092, GSE109454 and GSE115856, which confirmed the elevated expression of UCA1 and the decreased expression of PGM5-AS1 in CRC tumors (Figure 2B).

Correlation between UCA1 and PGM5-AS1 expression and clinicopathological characteristics
The correlations between the expression levels of UCA1 and PGM5-AS1 and the clinicopathological characteristics of the CRC patients were analyzed (Table 2). Neither UCA1 nor PGM5-AS1 expression correlated with the age, gender, differentiation or AJCC stage of the patients.

Diagnostic value of lncRNAs with conventional biomarkers
CEA is one of the most commonly used CRC markers in clinical practice. We therefore assessed its diagnostic value in combination with UCA1 and PGM5-AS1. The plasma CEA level in the healthy controls was significantly lower than that in the CRC patients overall and in the early-stage patients (Figure 4A,B). The AUC of CEA alone was 0.690 (95% CI: 0.636-0.743), which increased to 0.838 (95% CI: 0.798-0.879) when CEA was used in combination with UCA1 and PGM5-AS1 (Figure 4C). In addition, when CEA was combined with UCA1 and PGM5-AS1, the AUC for early-stage CRC improved to 0.874 (95% CI: 0.831-0.917) (Figure 4D).

Discussion
Molecular diagnostic markers have been developed for different forms of tumors, and some of them have been used clinically [19]. However, most of these biomarkers are not adequately sensitive for the diagnosis of early-stage tumors [20]. Thus, new and more sensitive markers are urgently needed [21]. LncRNAs have shown great potential in the diagnosis of various tumors, such as bladder cancer, lung cancer and prostate cancer [16,17,22-27]. In this study, we analyzed a total of 74 samples from the GSE102340, GSE126092, GSE109454 and GSE115856 datasets to find potential lncRNA markers for the diagnosis of CRC. We identified the two lncRNAs with the most significant changes in expression, UCA1 and PGM5-AS1, which could be used as potential markers for the diagnosis of early-stage CRC. UCA1 belongs to the human endogenous retrovirus H family and is located on chromosome 19p13.12 [28]. Studies have shown that UCA1 is universally expressed in embryonic tissues [29]. Recently, accumulating reports have revealed that UCA1 may function as an oncogenic lncRNA in the occurrence and progression of tumors, as well as a prognostic marker, because its expression is highly correlated with high metastatic propensity and poor survival of cancer patients at advanced TNM stages [30-32]. Wang et al. reported that overexpression of UCA1 promotes the migration and proliferation of gastric cancer cells, indicating that UCA1 may be used as a therapeutic target for gastric cancer [33]. Xue et al. found that the expression of UCA1 is elevated in hypoxic bladder cancer cell-derived exosomes compared with that of the healthy donors, which promotes bladder tumor growth through epithelial-mesenchymal transition (EMT) [34]. Our study showed that, compared with the healthy donors, the plasma expression of UCA1 was also increased in CRC patients, especially in early-stage CRC patients (stages I and II).
Metabolic reprogramming is regarded as an important hallmark of cancer [35], and phosphoglucomutase (PGM) plays a key role in it through the metabolism of glucose-1-phosphate and glucose-6-phosphate. LncRNA PGM5-AS1 has been identified as a tumor suppressor in CRC by a number of studies [36], as overexpression of PGM5-AS1 can inhibit CRC cell growth [37]. The expression of PGM5-AS1 may be positively correlated with PGM5 expression, and both are significantly down-regulated in CRC patients. It was also reported that PGM5-AS1 is down-regulated in CRC tissues and cells. PGM5-AS1 may promote tumor proliferation, migration and invasion by modulating the inhibitory effect of miR-100-5p on the tumor suppressor gene SMAD4 [38]. These findings indirectly indicate that, consistent with our study, low expression of PGM5-AS1 is beneficial to CRC patients. However, the effects of PGM5-AS1 on CRC progression are multifaceted. For example, Zhu et al. demonstrated that PGM5-AS1 may be associated with CRC progression and that high PGM5-AS1 expression levels were associated with worse overall survival in CRC, suggesting that it could be used as a novel potential therapeutic and prognostic target for CRC [39]. Therefore, the molecular mechanism underlying the different functions of PGM5-AS1 in CRC requires further study.
Our study has some limitations. Firstly, clinical verification of the lncRNAs in larger sample sizes is needed before they can be applied in clinical practice. Secondly, further work is required to identify more key lncRNAs related to the early-stage diagnosis of CRC and to determine the combination that gives the best diagnostic efficiency. Thirdly, the mechanisms by which the differentially expressed lncRNAs are involved in tumorigenesis are still not well understood and require further investigation.

Conclusion
In summary, we identified two differentially expressed lncRNAs, UCA1 and PGM5-AS1, in the plasma of CRC patients. Both showed great diagnostic potential for CRC, and combining them with traditional markers could improve the diagnosis of CRC, especially early-stage CRC.

Data Availability
All data analysed in the present paper are already included in the manuscript, including two tables.