Long noncoding RNAs (lncRNAs) represent one of the largest classes of transcripts and are highly diverse in terms of characteristics and functions. Advances in high-throughput sequencing platforms have enabled the rapid discovery of lncRNAs and their identification as key regulatory molecules that are involved in various cellular processes and dysregulated in various human diseases. Here, we summarize the current knowledge of the functions and underlying mechanisms of lncRNA activity with a particular focus on cancer biology. We also discuss the potential of lncRNAs as diagnostic and therapeutic targets for clinical applications.
The concept of ‘junk DNA’, i.e. regions of genomic DNA that lack protein-coding capacity and are thus thought to have no functional relevance, has long been a contested topic in biology. In recent years, advances in high-throughput sequencing platforms have revealed an astounding number of noncoding RNAs (ncRNAs) and suggested that they play critical roles in gene regulation and developmental processes. Intriguingly, although ∼70% of the genome is transcribed into RNA, only 1.5% of the genome encodes proteins. Furthermore, protein-coding transcripts make up only half of the human transcriptome (Table 1). These revelations have challenged the central dogma of gene regulation and suggested that ncRNAs may play key roles in various physiological and pathophysiological conditions.
|RNA type|Relative abundance (%)|
|Other (microRNA, small nucleolar RNA, etc.)|40|
ncRNAs are generally divided by size into two categories: small ncRNAs, which are shorter than 200 nucleotides, and long ncRNAs (lncRNAs), which are longer than 200 nucleotides. Small ncRNAs include the well-characterized transfer RNAs, which are essential for protein synthesis, and microRNAs (miRNAs), which act as important post-transcriptional regulators of gene expression via the RNA-induced silencing complex. The lncRNA class, on the other hand, is less well understood and its nomenclature is still evolving. An lncRNA is typically transcribed by RNA polymerase II (either from intronic regions of protein-coding loci or from noncoding loci) and carries a capped 5′-end and a polyadenylated 3′-end. lncRNA genes can be subject to the same epigenetic modifications as protein-coding genes, such as histone 3 lysine 4 trimethylation (H3K4me3) at transcription start sites. lncRNAs may also undergo splicing to form the mature transcripts. Examples of lncRNAs include long intergenic noncoding RNAs (lincRNAs) such as HOTAIR and HOTTIP, antisense RNAs (RNAs that are transcribed from the opposite strand of a protein-coding gene and have overlapping sequences with that gene), transcribed ultraconserved regions (RNAs transcribed from genomic regions that are highly conserved across species), pseudogenes (nonprotein-coding copies of protein-coding genes) and circular RNAs.
One of the first lncRNAs identified was the X-inactive specific transcript (XIST). XIST was discovered in the 1990s by scanning through cDNA libraries in the hope of finding novel genes that were important for X chromosome inactivation. By searching for clones of interest, researchers aimed to study the expression patterns of these putative genes in an attempt to uncover their role in the X inactivation process. Subsequent experimental validation confirmed the noncoding property of XIST and its critical role as a positive regulator for the initiation of X inactivation.
Over the past decade, high-throughput RNA sequencing has enabled the methodical identification of lncRNA transcripts and the subsequent delineation of their functions in various biological processes [8,9]. Increasing evidence has demonstrated the tissue-specific expression of particular lncRNAs and the impact of their dysregulation on the progression of human disease. In this review, we will discuss current knowledge about the functions and underlying mechanisms of lncRNA activity, with examples of lncRNAs that have been specifically implicated in cancer biology. In addition, we will discuss the potential utility of these lncRNAs as diagnostic markers or therapeutic targets.
Mechanisms of lncRNA function
To date, functional studies have revealed that lncRNAs play a diverse range of regulatory roles. Functionalities include epigenetic regulation via molecular scaffolding, regulation of mRNA processing, molecular decoying and lncRNA-derived peptides (Figure 1). lncRNAs may exert their regulatory effects via a sequence-based mechanism analogous to miRNAs, which post-transcriptionally regulate gene expression by binding specifically to response elements and/or motifs on target transcripts. These revelations thereby indicate a broad spectrum of endogenous lncRNA functions and highlight the functional capacity of the noncoding genome.
Figure 1. Functions and mechanisms of lncRNAs.
Epigenetic regulation via molecular scaffolding
Early studies on lncRNAs revealed that they may be involved in the epigenetic regulation of target genes. Several studies have demonstrated that epigenetic regulation by lncRNAs usually leads to the transcriptional repression of target genes [13–15]. In this mode of regulation, lncRNAs serve as tethers linking different protein units together. These lncRNAs are characterized by the presence of various domains capable of binding distinct effector molecules. This simultaneous binding enhances the proximity and subsequent interactions between these units, which ultimately result in the typical transcriptional repressive effect on the target gene.
Among the best-studied epigenetic interactors of lncRNAs are the PRC1 and PRC2 polycomb complexes. PRC1 consists of multiple proteins, such as Chromobox (CBX), RING1/2 and MEL18, whose primary function is to ubiquitinate histone H2A at lysine 119. In PRC2, protein units such as EZH2, EED and SUZ12 act in concert to facilitate the trimethylation of histone H3 at lysine 27. Numerous studies have demonstrated the direct binding of lncRNAs to PRC1/2 proteins. For example, the lncRNA ANRIL was found to bind directly to subunits of the PRC2 complex, thereby recruiting PRC2 to the INK4B/ARF locus to repress transcription of the tumor suppressor p15 via chromatin compaction. Notably, the expression levels of the EZH2 and BMI1 subunits of the PRC1/2 complexes were found to be elevated in a variety of solid tumors, and both subunits have been implicated in tumor initiation and progression [17,18]. Interestingly, two previous studies have investigated the binding specificity of the PRC2 complex and reported the promiscuous binding of subunits such as EZH2 to a wide variety of RNAs [19,20]. However, the PRC2 complex as a whole exhibits more selective RNA-binding behavior that may be partially mediated by the EED subcomponent. Nevertheless, both studies highlighted the importance of RNA binding for the recruitment and subsequent regulation of EZH2 activity.
The essential role of lncRNAs as scaffolds for the recruitment of epigenetic regulators, and hence for epigenetic regulation, was further supported by studies of mouse polycomb proteins, in which the addition of RNase abrogated the binding of CBX7 to heterochromatin across the genome. In addition, related studies have identified putative PRC2-binding motifs that consist of GC-rich hairpin structures on lncRNAs, suggesting that RNA secondary structure, in addition to sequence conservation, is important for lncRNA function in physiological and pathological conditions.
Regulation of mRNA processing
In addition to transcriptional control of gene expression, regulation at the post-transcriptional level is also crucial, as is evident in miRNA-mediated silencing of target genes. Post-transcriptional processing includes mRNA splicing, editing and export. Of particular importance is the presence of nuclear paraspeckles, subnuclear bodies located within the interchromatin space [22,23]. Nuclear paraspeckles are crucial for the regulation of gene expression because they act as temporary storage sites where mRNA transcripts are held for further processing prior to their export into the cytosol for translation initiation. Studies have revealed the involvement of lncRNAs such as MALAT1 and NEAT1 in mRNA processing by identifying the localization and abundance of these lncRNAs in nuclear paraspeckles. In addition, knockdown experiments have demonstrated a critical role of MALAT1 in the recruitment of SR splicing factors, including SRSF1 and SC35, to the nuclear paraspeckles. Mechanistically, MALAT1 is capable of modulating the phosphorylation of these splicing factors, thereby titrating the levels of phosphorylated and dephosphorylated pools and indirectly regulating alternative splicing of mRNA transcripts. In cancer, MALAT1 and NEAT1 transcripts show elevated expression in a variety of tumors, and their expression has been correlated with poor prognosis and disease outcome. The exact mechanism by which they contribute to tumorigenesis is, however, yet to be fully understood.
lncRNAs have also been implicated in the regulation of mRNA transcript stability through direct RNA–RNA interactions. The fundamental basis of this interaction lies in the sequence homology between the lncRNAs and mRNAs, resembling the regulation of mRNA expression by miRNAs. lncRNAs containing Alu repeats are capable of hybridizing to mRNAs, forming transient dsRNA molecules that serve as a signal for destabilizing factors such as STAU1 to mediate mRNA degradation.
Apart from modulating mRNA processing and stability, lncRNAs can regulate gene expression post-transcriptionally by acting as molecular decoys that sequester specific transcripts, proteins or miRNAs that mediate gene expression. By doing so, these lncRNAs are capable of titrating the levels of gene products, resulting in either an increase or a depletion depending on the type of molecules they sequester. An example of an lncRNA decoy is growth arrest-specific 5 (GAS5), which was recently shown to confer glucocorticoid resistance in cells. Analysis of its secondary structure revealed an RNA stem-loop motif that mimics the DNA motif of the hormone response elements (HREs) present in the promoter region of glucocorticoid-responsive genes. By competitively binding to the DNA-binding domain of glucocorticoid receptors, GAS5 is capable of preventing the physical interaction of the receptors with the HREs, leading to the transcriptional repression of glucocorticoid-responsive genes and ultimately regulating steroid hormone activity in cells.
Another mechanism of sequestration occurs when lncRNAs act as decoys for microRNAs. RNA transcripts that contain microRNA response elements (MREs) for the same microRNAs can act as endogenous sponges by competing for these shared microRNAs. In this model, all RNA transcripts, including lncRNAs and protein-coding mRNAs, may function as competing endogenous RNAs (ceRNAs) that bind and sequester a common pool of miRNAs, thus exerting an added trans-acting regulatory effect on gene expression. In the context of cancer, it has been shown that PTENP1, the pseudogene of the well-characterized PTEN tumor suppressor gene, can function as a bona fide tumor suppressor by actively competing with PTEN for microRNA binding. More importantly, the expression level of PTENP1 was found to be lower in prostate and colorectal cancers, indicating the possibility of a genomic loss of the locus during tumor progression. Similarly, the lncRNA FER1L4 was reported to function as a ceRNA for the transcript of retinoblastoma 1 (RB1), a well-established tumor suppressor gene, in gastric cancer. Consistently, the expression of both FER1L4 and RB1 was significantly altered in gastric cancer tissues compared with paracancerous tissues. This competing endogenous RNA activity may thus provide new insights into the regulatory function of ncRNAs and of the 3′-UTRs of protein-coding genes.
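The ceRNA model described above can be illustrated with a toy calculation. The following Python sketch assumes, as a simplification, that an MRE is an exact match to the reverse complement of miRNA nucleotides 2-8 (the seed region); actual MRE prediction relies on more elaborate rules. The function names and any sequences passed to them are hypothetical illustrations, not data from the studies cited here.

```python
# Minimal sketch: which miRNAs could two transcripts compete for?
# Assumption: an MRE is an exact match to the reverse complement of the
# miRNA seed (nucleotides 2-8). All names here are illustrative only.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def seed_site(mirna: str) -> str:
    """Return the transcript motif that pairs with miRNA nucleotides 2-8."""
    seed = mirna[1:8]  # 0-based slice covering nucleotides 2 to 8
    return reverse_complement(seed)


def count_mres(transcript: str, mirna: str) -> int:
    """Count (possibly overlapping) seed-match sites for one miRNA."""
    site = seed_site(mirna)
    count = 0
    start = transcript.find(site)
    while start != -1:
        count += 1
        start = transcript.find(site, start + 1)
    return count


def shared_mirnas(transcript_a: str, transcript_b: str, mirnas: dict) -> list:
    """Return names of miRNAs with at least one site on both transcripts."""
    return sorted(
        name
        for name, sequence in mirnas.items()
        if count_mres(transcript_a, sequence) > 0
        and count_mres(transcript_b, sequence) > 0
    )
```

Under this simplification, two transcripts are candidate ceRNAs for each other when `shared_mirnas` returns a non-empty list; the number of sites per transcript, given by `count_mres`, offers a rough proxy for how strongly each could titrate the shared miRNA pool.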
Despite the noncoding definition of lncRNAs, several studies have reported that lncRNAs can contain putative small open reading frames (ORFs) that are translated into peptides of <100 amino acids [34,35]. A recent study revealed the translation of a novel peptide termed ‘small regulatory polypeptide of amino acid response’ (SPAR), which is encoded by LINC00961. SPAR is conserved between mouse and human, and its down-regulation promotes the activation of mTORC1, which consequently stimulates muscle regeneration. As lncRNAs are expressed in a tissue-specific manner, these exciting findings highlight the potential of these hidden peptides in the tissue- or organ-specific regulation of biological processes in mammals.
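As a rough illustration of how putative small ORFs of this kind might be screened for, the Python sketch below scans the three forward reading frames of a transcript for AUG-to-stop ORFs that would encode fewer than 100 amino acids. It is a minimal sketch under stated assumptions: only canonical AUG starts and forward-strand frames are considered, and the function name and the `max_peptide_length` parameter are hypothetical. Finding an ORF in this way says nothing about whether it is actually translated.

```python
# Minimal sketch: list small ORFs (AUG ... stop) in the three forward frames.
# Assumptions: canonical AUG start codons only; the <100 amino acid cutoff
# follows the text above. Names are illustrative, not from a published tool.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_small_orfs(rna: str, max_peptide_length: int = 100) -> list:
    """Return sorted (start, end, peptide_length) tuples for small ORFs.

    start and end are 0-based, end-exclusive, and end includes the stop codon.
    peptide_length counts amino acids, so the stop codon is excluded.
    """
    rna = rna.upper().replace("T", "U")  # accept cDNA-style input as well
    orfs = []
    for frame in range(3):
        orf_start = None
        for pos in range(frame, len(rna) - 2, 3):
            codon = rna[pos:pos + 3]
            if orf_start is None and codon == "AUG":
                orf_start = pos  # first in-frame start codon opens an ORF
            elif orf_start is not None and codon in STOP_CODONS:
                peptide_length = (pos - orf_start) // 3
                if peptide_length < max_peptide_length:
                    orfs.append((orf_start, pos + 3, peptide_length))
                orf_start = None  # close the ORF and keep scanning
    return sorted(orfs)
```

In practice, candidate small ORFs identified by sequence alone would still need supporting evidence of translation, which this sketch does not attempt to model.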
Clinical relevance of lncRNAs in human disease
Biomarkers for diseases
The key to a successful biomarker lies in its selective expression and the ability to be measured in an objective manner. Studies have identified lncRNAs that are tissue-specific and/or disease-specific. A notable example is PCA3, a prostate-specific lncRNA that is markedly overexpressed in prostate cancer. Critically, PCA3 is detectable in the urine samples of prostate cancer patients, and hence its measurement is relatively noninvasive. To date, PCA3 has been successfully utilized as a biomarker in diagnostic assays and employed in the clinic. Another notable example is PR-lncRNA-1, which was found to be down-regulated in colorectal cancer. Functionally, PR-lncRNA-1 inhibits growth and promotes apoptosis of colorectal cancer cells by enhancing p53 transcriptional activation. PR-lncRNA-1 may thus have potential clinical applications as a biomarker for colorectal cancer.
Therapeutic targets in diseases
lncRNAs with highly specific expression patterns in certain tissues and diseases have garnered increasing interest as molecular targets for therapy. Although the clinical use of RNAi-based therapies is still in its infancy, similar approaches could be adapted to target lncRNAs whose expression is dysregulated in diseases. For example, the lncRNA SAMMSON was found to be overexpressed in >90% of human melanomas, and the silencing of SAMMSON induces cell death and sensitizes melanoma to MAPK inhibitors both in vitro and in patient-derived xenografts. Hence, designing short hairpin RNAs or antisense oligonucleotides against SAMMSON may have significant therapeutic value. Apart from SAMMSON, several lncRNAs have been reported to be dysregulated in various cancer types and may be attractive therapeutic targets for the respective cancer types. A good example is BCAR4, whose expression was found to be up-regulated in antiestrogen-resistant breast cancer and which promotes cancer cell proliferation and migration. Mechanistically, BCAR4 is capable of binding and activating transcription factors such as SNIP1 and PNUTS, which ultimately drive the hedgehog/GLI2 transcriptional program. As such, targeting BCAR4 may yield a promising therapeutic effect in antiestrogen-resistant breast cancer. Nevertheless, the underlying mechanisms of lncRNA-induced cellular effects, as well as the safety and efficacy of lncRNA-based therapies, will have to be fully investigated before such therapies can become clinically viable.
Systematic annotation and characterization of lncRNAs
Previous studies have demonstrated the biological significance of several previously uncharacterized lncRNAs such as HOTTIP and HOTAIR [44,45]. However, the accurate characterization of lncRNAs can be complicated by the inherent difficulties in differentiating them from mRNAs and other RNA species, especially with the realization that lncRNAs do contain ORFs and can thereby encode functional peptides. Looking forward, the establishment of a robust, systematic method of annotating lncRNAs will be critical for the understanding of their importance in physiological and pathophysiological states. More importantly, the realization that RNA molecules are not just an intermediary between DNA and protein, but in fact multitaskers with both coding and coding-independent functions, will have to be reinforced in order to further elucidate the transcriptome landscape.
Mapping the somatic mutations of lncRNAs
While there is a consensus that lncRNAs are often dysregulated in pathological conditions, it is unclear whether lncRNAs undergo point mutations, insertions or deletions in the course of disease progression. Studies have hinted at the possibility of somatic mutations of lncRNAs in cancer: a chromosomal translocation of ETV1 to a prostate-specific lncRNA, PCAT-14, was observed in a prostate cancer patient, generating an oncogenic fusion product that contributes to tumor progression. Similarly, a reciprocal translocation, t(1;3)(q25;q27), resulted in an oncogenic GAS5-BCL6 chimeric transcript in a B-cell lymphoma patient. Despite the limited data on somatic aberrations of lncRNAs, this field of research will be of increasing clinical importance, as is evident in the case of well-established oncogenes such as KRAS, whose expression level does not change with its mutational status.
Classification of RNA structural domains
Besides the sequence-specific basis of lncRNA function, RNA structure also plays an important role in the mechanisms of lncRNA activity. As discussed earlier, RNA structural motifs are crucial to the recognition and binding of effector proteins such as the repressive PRC2 complex. A well-studied example of an RNA structure is the hairpin, which has a stem-loop-stem design. This classical hairpin structure is fundamental to the synthesis of miRNAs. Despite the evident relationship between lncRNA domains and functions, the global profiles of lncRNA structures have yet to be characterized and understood. Furthermore, lncRNAs with highly diverse sequences have been reported to form similar secondary structures, thereby supporting the hypothesis that RNA secondary structure plays a pivotal role in the formation of RNA domains. To this end, a previous study has revealed novel regulatory features arising from RNA secondary structures using an in vivo, high-throughput, genome-wide profiling method, structure-seq. In addition, sophisticated computational algorithms have been designed and improved to better predict RNA folding and structures. Experimentally, the development of the PARS-Seq and Frag-Seq techniques has allowed the global evaluation of RNA structures in an unbiased manner [52,53]. RNA samples are treated with specific RNases that cleave at selective structural positions, and the resulting fragments are then processed and sequenced to identify nucleotide locations where putative secondary structures are preferentially formed [52,53]. This will enable the elucidation of the overall structural landscape of RNA transcripts and the identification of the key structural domains that are critical for their functions.
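The nuclease-based readout described above can be sketched as a simple per-nucleotide calculation. The Python example below assumes that, for each position, read counts are available from a double-strand-specific and a single-strand-specific cleavage library, and scores each position by the log2 ratio of the two. The pseudocount, the threshold and all function and parameter names are hypothetical choices made for illustration; the published protocols apply their own normalization, which is not reproduced here.

```python
# Minimal sketch: per-nucleotide structure score from two cleavage libraries.
# Assumptions: ds_counts come from a double-strand-specific nuclease and
# ss_counts from a single-strand-specific one; pseudocount and threshold
# are arbitrary illustrative values, not taken from the cited methods.
import math


def structure_scores(ds_counts, ss_counts, pseudocount=1.0):
    """Return log2((ds + pseudocount) / (ss + pseudocount)) per position.

    Positive scores suggest a paired (double-stranded) nucleotide,
    negative scores an unpaired (single-stranded) one.
    """
    if len(ds_counts) != len(ss_counts):
        raise ValueError("count vectors must cover the same positions")
    return [
        math.log2((ds + pseudocount) / (ss + pseudocount))
        for ds, ss in zip(ds_counts, ss_counts)
    ]


def paired_positions(scores, threshold=1.0):
    """Return 0-based positions whose score meets or exceeds the threshold."""
    return [index for index, score in enumerate(scores) if score >= threshold]
```

Runs of consecutive positions returned by `paired_positions` would be candidate stem regions, which could then be compared with computational folding predictions for the same transcript.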
In addition, techniques such as sequencing of psoralen cross-linked, ligated and selected hybrids (SPLASH) and psoralen analysis of RNA interactions and structures (PARIS) enable the in vivo identification of alternative RNA structures as well as the variety and dynamics of long-range RNA–RNA interactions [54,55]. These approaches have unveiled the inherent complexity of RNA folding and aid in the better understanding of RNA architectures and their effect on cell biology.
Over the past decade, advancements in high-throughput sequencing platforms have enabled the rapid discovery of various ncRNA species, shedding light on the complexity of the intricate and yet dynamic process of gene regulation. In particular, lncRNAs have received increasing recognition as crucial mediators of gene regulation and cellular processes. In the context of cancer, the expression levels of lncRNAs are often dysregulated and this perturbation may drive tumorigenesis by disrupting key cellular processes. Future studies that deepen our understanding of the biological roles and properties of lncRNAs will enable researchers to harness their potential as valuable diagnostic and therapeutic targets for human diseases.
ceRNAs: competing endogenous RNAs
GAS5: growth arrest-specific 5
HREs: hormone response elements
lncRNAs: long noncoding RNAs
MREs: microRNA response elements
ORFs: open reading frames
PARS-Seq: parallel analysis of mRNA structure-sequencing
SPAR: small regulatory polypeptide of amino acid response
XIST: X-inactive specific transcript
We apologize to all colleagues whose work could not be cited due to space constraints. We thank Tay lab members for critical reading of the manuscript. Y.T. was supported by a Singapore National Research Foundation Fellowship, a National University of Singapore President's Assistant Professorship and the RNA Biology Center at CSI Singapore, NUS, from funding by the Singapore Ministry of Education's Tier 3 [grant number MOE2014-T3-1-006].
The Authors declare that there are no competing interests associated with the manuscript.