Unr (upstream of N-ras) is a post-transcriptional regulator of gene expression, essential for mammalian development and mutated in many human cancers. The expression of unr is itself regulated at many levels; transcription of unr, which also affects expression of the downstream N-ras gene, is tissue and developmental stage-dependent and is repressed by c-Myc and Max (Myc associated factor X). Alternative splicing gives rise to six transcript variants, which include three different 5′-UTRs. The transcripts are further diversified by the use of three alternative polyadenylation signals, which governs whether AU-rich instability elements are present in the 3′-UTR or not. Translation of at least some unr transcripts can occur by internal initiation and is regulated in a cell-cycle-dependent manner; binding of PTB (polypyrimidine tract-binding protein) and Unr to the 5′-UTR inhibits translation, but these are displaced by heterogeneous nuclear ribonucleoproteins C1/C2 (hnRNPC1/C2) during mitosis to stimulate translation. Finally, Unr is post-translationally modified by phosphorylation and lysine acetylation, although it is not yet known how these modifications affect Unr activity.
Unr [upstream of N-ras; also known as cold shock domain (CSD) containing protein E1 (CSDE1)] is a highly conserved RNA-binding protein that contains five copies of the CSD. Unr functions in post-transcriptional control of gene expression, including mRNA-specific roles in the stimulation of IRES (internal ribosome entry site)-mediated translation initiation [2–5], inhibition of translation [6–9], translation-coupled mRNA degradation [10,11] and mRNA stabilization. In addition to these mRNA-specific functions, Unr binds a large subset of human mRNPs (messenger ribonucleoproteins; Swagat Ray, Pól Ó Catnaigh, Nigel Dyer, Sascha Ott and Emma Anderson, unpublished data), suggesting that Unr may also have a more general role in regulating gene expression. At the level of the organism, the expression and activity of Unr are clearly important; homozygous knockout of the unr gene in mice leads to embryonic lethality, and protein-affecting mutations in Unr have been found in more than 20 human cancers [integrative oncogenomics (IntOGen) and catalogue of somatic mutations in cancer (COSMIC) v73 databases]. This review focuses on the mechanisms by which Unr expression and function in mammalian cells are determined, from transcription and RNA processing to translation and post-translational regulation.
Transcriptional regulation of the unr/N-ras locus
The unr gene was originally identified by virtue of its proximity to the N-ras oncogene, from which it was reported to be separated by 130 bp in mice. Southern blotting suggested the presence of one copy of unr in mice and two copies in humans, only one of which was linked to N-ras. However, sequencing of the human genome has revealed only one intact unr gene in humans, linked to N-ras on chromosome 1 (Figure 1), although sequences aligning to the 5′- and 3′-UTRs of unr can be found on chromosomes 4, 5, 7, 10 and X (Pól Ó Catnaigh and Emma Anderson, unpublished observations). A combination of classical cDNA cloning experiments [15,16] and genomic sequencing data has revealed that there is overlap between the human unr and N-ras genes, with the most 5′ transcription start site of N-ras located just 46 bp downstream of the most 3′ polyadenylation signal of unr (Figure 1). This unusually close gene linkage is evolutionarily conserved and can be traced back to the divergence of reptilian and bird lineages, suggesting that this tandem arrangement is important for the expression of unr, N-ras or both genes. The two genes do not share a promoter, with the N-ras promoter located partly within the unr gene. Thus it might be expected that transcription of unr could negatively affect transcription of N-ras, since RNA polymerase II activity on the unr gene may displace proteins bound to the N-ras promoter. To address this in vivo, unr and N-ras mRNA levels were measured in a range of tissues from wild-type and heterozygous unr+/– mice. An approximately 50% reduction in unr mRNA levels correlated with a 20%–65% increase in N-ras mRNA levels in the tissues tested, suggesting that unr transcription has a moderate interfering effect on N-ras transcription, although no phenotypic effect was observed in the heterozygous mice. This suggests that although somatic mutations that reduce the level of unr expression are found in some cancers, the concomitant increase in N-ras expression is unlikely to be the oncogenic driver.
Figure 1. Human unr/N-ras locus on chromosome 1
Transcription of unr, on the other hand, is not affected by changes in N-ras expression; insertion of MLV (murine leukaemia virus) between the unr and N-ras genes in mice induced B-cell lymphomas due to N-ras overexpression, whereas unr expression was unchanged. Unr is ubiquitously expressed across all tissues, although there are differences in the level of expression between tissues (7-fold higher in mouse testis than liver) and during development. For example, expression of unr in the mouse testis peaks ∼3 weeks after birth, but in the small intestine it decreases from birth until it is undetectable in this tissue in adult mice. Although little work has been carried out directly on regulation of unr transcription, unr was identified in a screen for genes regulated by c-Myc. c-Myc and its binding partner Max (Myc associated factor X) were shown to bind the unr promoter directly, leading to transcriptional repression. Interestingly, Unr has been shown to have a positive effect on c-Myc mRNA translation, so the combination of transcriptional and translational control may contribute to auto-regulatory feedback loops for both Unr and c-Myc.
Human genome and EST (expressed sequence tag) sequencing projects have identified six alternatively spliced transcript variants of unr (Figure 2A) that vary in the inclusion of exons 2, 3 and 6 (out of a total of 21 exons). Transcript variants 1 and 2, which encode 798 and 767 amino acid proteins respectively (protein isoforms 1 and 2; Figure 2B), represent the first cDNAs identified and differ in the inclusion or exclusion of exon 6 (originally denoted exon 5), which encodes 31 amino acids between CSDs 1 and 2. Subsequently, two further pairs of unr transcripts (+/– exon 6 in each pair) were identified. Transcript variants 4 and 3 are the longest and contain exon 3, which is missing in the other transcripts. Exon 3 contains 9 nt of 5′-UTR sequence and 138 nt of coding sequence, hence transcripts 4 and 3 encode proteins that are N-terminally extended by 46 amino acids (protein isoforms 4 and 3; Figure 2B) relative to isoforms 1 and 2. Transcript variants 5 and 6 are the shortest and are missing exons 2 and 3. Exon 2 consists entirely of 5′-UTR sequence, hence transcripts 5 and 6 have much shorter 5′-UTRs than the other transcripts but encode the same protein isoforms (1 and 2) as transcripts 1 and 2. An early study of the relative abundance of transcripts 1 and 2 in rat tissues used ribonuclease protection assays to distinguish between transcripts that included or excluded exon 6. This study found 10-fold higher levels of transcripts lacking exon 6 in all tissues tested, apart from the brain, where the two were roughly equally abundant. In terms of whether the protein isoforms have differential activity, isoform 2, lacking the 31 amino acids of exon 6, was shown to be more active in stimulating translation from the human rhinovirus type 2 (HRV-2) IRES than isoform 1. However, nothing is known about the relative abundance or tissue distribution of the more recently identified splice variants of unr or whether the different protein isoforms have specific or redundant functions.
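The relationship between exon usage, transcript variant and protein isoform described above can be summarized in a short sketch. The class and field names are hypothetical, and the assignment of exon 6 within each pair (variants 1, 4 and 5 including it; variants 2, 3 and 6 lacking it) is an assumption inferred from the order in which the pairs are listed; the lengths of isoforms 3 and 4 are simply calculated from the numbers quoted in the text rather than taken from a database.

```python
from dataclasses import dataclass

ISOFORM1_AA = 798        # length of protein isoform 1 (from the text)
EXON6_AA = 31            # amino acids encoded by exon 6 (from the text)
EXON3_CODING_NT = 138    # coding nucleotides in exon 3 (from the text)


@dataclass(frozen=True)
class TranscriptVariant:
    """Hypothetical record of which alternative exons a variant includes."""
    number: int
    has_exon2: bool   # exon 2 is entirely 5'-UTR
    has_exon3: bool   # exon 3 adds 9 nt of 5'-UTR plus 138 nt of coding sequence
    has_exon6: bool   # exon 6 encodes 31 amino acids between CSDs 1 and 2

    @property
    def protein_isoform(self) -> int:
        # Exon 3 gives the N-terminally extended isoforms (3 and 4);
        # exon 6 distinguishes the longer member of each pair.
        if self.has_exon3:
            return 4 if self.has_exon6 else 3
        return 1 if self.has_exon6 else 2

    @property
    def protein_length_aa(self) -> int:
        length = ISOFORM1_AA
        if not self.has_exon6:
            length -= EXON6_AA
        if self.has_exon3:
            length += EXON3_CODING_NT // 3   # 138 nt -> 46 codons
        return length


# Assumed exon 6 status within each pair (see lead-in).
VARIANTS = [
    TranscriptVariant(1, has_exon2=True, has_exon3=False, has_exon6=True),
    TranscriptVariant(2, has_exon2=True, has_exon3=False, has_exon6=False),
    TranscriptVariant(3, has_exon2=True, has_exon3=True, has_exon6=False),
    TranscriptVariant(4, has_exon2=True, has_exon3=True, has_exon6=True),
    TranscriptVariant(5, has_exon2=False, has_exon3=False, has_exon6=True),
    TranscriptVariant(6, has_exon2=False, has_exon3=False, has_exon6=False),
]

for v in VARIANTS:
    print(f"variant {v.number}: isoform {v.protein_isoform}, "
          f"{v.protein_length_aa} aa")
```

Under these assumptions, six transcripts collapse onto four protein isoforms, because exon 2 alters only the 5′-UTR and therefore cannot change the encoded protein.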
Figure 2. Structure of the unr transcript variants and protein isoforms
Further heterogeneity of unr transcripts is introduced by the presence of three alternative polyadenylation signals within the 3′-UTR, which give rise to transcripts containing 3′-UTRs of approximately 200, 900 or 1250 nt. Northern blotting of mouse tissue samples with differential probes suggested that the two more distal polyadenylation signals, giving rise to transcripts between 3.8 and 4.2 kb in length, are preferentially utilized in most tissues, but that a 3.2 kb transcript arising from use of the proximal polyadenylation signal is equally abundant in the testis. The 3′-UTR of unr contains five AU rich elements (AREs), RNA sequences which are known to confer deadenylation-dependent instability on the transcript. The unr AREs are all located between the first and the second polyadenylation signal, so that the more abundant longer 3′-UTR isoforms all contain these instability determinants. Expression of unr mRNA, with or without its 3′-UTR, in vitro and in unr-null mouse embryonic stem cells showed that unr mRNA had a short half-life of 1.5 h, but that removal of its 3′-UTR increased the half-life 3-fold. However, addition of the unr 3′-UTR to a luciferase reporter did not confer instability on the transcript, so unr 5′-UTR and/or coding sequences, or indeed Unr protein, are also implicated in the instability of unr mRNA.
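To give a feel for what a 3-fold difference in half-life means for transcript levels, the following sketch assumes simple first-order (exponential) decay, which is a common simplification and is not stated in the studies cited above. The function and constant names are hypothetical; the 4.5 h value is calculated from the reported 1.5 h half-life and 3-fold increase, not independently reported.

```python
import math

HALF_LIFE_WITH_UTR_H = 1.5      # reported half-life of unr mRNA (hours)
FOLD_INCREASE_WITHOUT_UTR = 3   # reported effect of removing the 3'-UTR
HALF_LIFE_WITHOUT_UTR_H = HALF_LIFE_WITH_UTR_H * FOLD_INCREASE_WITHOUT_UTR

# 3'-UTR isoforms from the text: approximate length (nt) and whether the
# AREs, which lie between the first and second poly(A) signals, are included.
THREE_PRIME_UTRS = {
    "proximal": {"length_nt": 200, "contains_AREs": False},
    "middle": {"length_nt": 900, "contains_AREs": True},
    "distal": {"length_nt": 1250, "contains_AREs": True},
}


def decay_constant(half_life_h: float) -> float:
    """First-order decay constant (per hour) for a given half-life."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h: float, half_life_h: float) -> float:
    """Fraction of the starting mRNA left after t_h hours (assumed
    first-order decay with no new transcription)."""
    return 0.5 ** (t_h / half_life_h)


for t in (1.5, 3.0, 4.5):
    with_utr = fraction_remaining(t, HALF_LIFE_WITH_UTR_H)
    without_utr = fraction_remaining(t, HALF_LIFE_WITHOUT_UTR_H)
    print(f"t = {t} h: with 3'-UTR {with_utr:.2f}, without {without_utr:.2f}")
```

Under this assumption, a quarter of the full-length transcript would remain after 3 h, compared with roughly two-thirds of the transcript lacking the 3′-UTR.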
The regulation of translation of unr mRNA has been studied by two groups who looked at the 5′-UTR of transcripts 1 and 2 (although lacking the first 66 nt according to the longest EST in the NCBI database, giving a 446-nt 5′-UTR). This relatively long 5′-UTR, predicted to contain stable secondary structure (ΔG=–105.2 kcal/mol), was shown to harbour IRES activity in a dicistronic assay in vitro and in a number of cell lines [6,24]. The IRES activity was shown to be approximately half that of the EMCV (encephalomyocarditis virus) or c-Myc IRESs used as positive controls. These groups also showed that translation from the unr IRES was inhibited by the binding of PTB (polypyrimidine tract-binding protein) and by Unr itself, enabling feedback control of unr expression. Unr is expressed at higher levels during G2/M phase of the cell cycle and stimulates cyclin-dependent kinase (CDK)11 IRES activity during G2/M to produce the CDK11p58 isoform. It was subsequently shown that during mitosis, hnRNPC1/C2 are released from the nucleus and bind to the unr 5′-UTR, displacing PTB and Unr and stimulating nascent Unr translation. The 5′-UTR of unr transcript variants 3 and 4 is 9 nt longer at the 3′-end than the 5′-UTR used in the studies above. Although this is a minor difference, further studies will be needed to determine whether the efficiency of translation is affected by the slightly different context of the start codon or by potential changes to secondary structure and/or long-range RNA–RNA interactions. The 5′-UTR of transcript variants 5 and 6 (missing exons 2 and 3), however, is much shorter at 127 nt and does not contain the PTB-binding sites present in the longer 5′-UTRs of transcripts 1–4, suggesting that translational regulation of these transcripts may be quite different.
Additional features of the 5′-UTR of unr transcripts 1–4 include two short upstream ORFs (uORFs) of 8 and 10 amino acids, ∼300 nt upstream of the start codon. The effect of these short uORFs on unr translation has not been investigated; it may be that internal ribosome recruitment bypasses any regulatory effect of the uORFs and/or they may contribute to repression of unr translation when the IRES is inactive. We are currently investigating the relative level of translation from each of the transcript variants in conjunction with the level of expression of each variant to determine their contribution to Unr protein levels in vivo.
Following translation, protein levels and activity may be regulated by post-translational modifications (PTMs), by interaction with specific protein-binding partners and complexes or by localization to specific sub-cellular sites, perhaps as a consequence of PTMs.
Modified peptides of Unr have been identified in several proteomic screens for PTMs; three serine phosphorylation sites (Ser116, Ser123 and Ser514; numbering according to protein isoform 1; Figure 3A) were identified by high resolution MS (mass spectrometry) of phosphopeptide-enriched lysates from cells arrested in mitosis [26,27]. Ser116 and Ser123 are encoded by exon 6, between CSDs 1 and 2, so are only present in protein isoforms 1 and 4; Ser514 is encoded by exon 15, between CSDs 3 and 4, and is present in all isoforms. Global analysis of lysine acetylation in a myeloid leukaemia cell line by MS of immunoaffinity purified lysine-acetylated peptides revealed that Unr can be acetylated at position Lys81 (Figure 3A), within CSD1. Compared with phosphorylation, which adds negative charge and is usually found in unstructured regions of proteins, acetylation neutralizes the positive charge of lysine and is usually found in structured protein regions.
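Because the sites above are numbered according to isoform 1, their presence and position in the other isoforms follow from the exon 6 and exon 3 differences described earlier. The sketch below works this through; the function and dictionary names are hypothetical, and the renumbered positions are arithmetic projections that assume the 46-amino-acid extension is simply prepended and that Lys81 lies upstream, and Ser514 downstream, of the exon 6-encoded segment. They are not experimentally reported coordinates.

```python
N_TERMINAL_EXTENSION_AA = 46   # exon 3-encoded extension (isoforms 3 and 4)
EXON6_AA = 31                  # exon 6-encoded segment (isoforms 1 and 4)

# Sites numbered according to isoform 1, with their location relative to
# the exon 6-encoded segment (assumed from the domain order in the text).
SITES = {
    "Lys81": {"position": 81, "location": "upstream"},     # within CSD1
    "Ser116": {"position": 116, "location": "exon6"},
    "Ser123": {"position": 123, "location": "exon6"},
    "Ser514": {"position": 514, "location": "downstream"},
}

ISOFORMS = {
    1: {"has_exon6": True, "extended": False},
    2: {"has_exon6": False, "extended": False},
    3: {"has_exon6": False, "extended": True},
    4: {"has_exon6": True, "extended": True},
}


def position_in_isoform(site: str, isoform: int):
    """Projected residue number of a site in an isoform, or None if the
    residue is absent because exon 6 is skipped."""
    info = SITES[site]
    iso = ISOFORMS[isoform]
    if info["location"] == "exon6" and not iso["has_exon6"]:
        return None
    position = info["position"]
    if info["location"] == "downstream" and not iso["has_exon6"]:
        position -= EXON6_AA
    if iso["extended"]:
        position += N_TERMINAL_EXTENSION_AA
    return position


for site in SITES:
    row = {iso: position_in_isoform(site, iso) for iso in ISOFORMS}
    print(site, row)
```

Any comparison of modification data across isoforms would need this kind of coordinate mapping, since the same residue carries a different number in each isoform.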
Figure 3. Sites of PTM of Unr
Use of online prediction tools (www.ubpred.org and www.abgent.com/sumoplot) suggests that Unr has a number of other sites that have the potential to be modified by the addition of ubiquitin or SUMO (small ubiquitin-like modifier). The most likely sites of ubiquitination and SUMOylation and their probability of modification are shown in Figure 3(B), but these have yet to be confirmed experimentally. Ubiquitination is often associated with proteasome-dependent degradation, although mono-ubiquitination does not target proteins to the proteasome and instead regulates protein activity or sub-cellular location. SUMOylation does not directly target proteins to the proteasome, but does regulate protein activity and trafficking.
Unr has been purified with different protein-binding partners in different contexts; for example with Unrip [Unr interacting protein; also known as STRAP (serine threonine receptor-associated protein)] on the HRV-2 IRES and with poly(A)-binding protein (PABP1) on the c-fos and pabp1 mRNAs, as part of mRNA-specific complexes. Unr may have different functions when bound to its different protein partners, but it is currently unknown how Unr is partitioned into different protein complexes.
Early sub-cellular fractionation studies showed that the majority of Unr was found in the cytoplasmic fraction, specifically the microsomal and soluble fractions rather than plasma membrane or mitochondrial fractions. Confocal immunofluorescence microscopy shows a diffuse cytoplasmic localization of Unr in a number of cell lines including MCF-7 and HeLa (Swagat Ray, Pól Ó Catnaigh and Emma Anderson, unpublished data). However, Unr can be recruited to specific sub-cellular structures under certain conditions; for example, Unr is one of many RNA-binding proteins recruited to cytoplasmic stress granules in response to oxidative stress in HeLa cells, presumably sequestering it from its cytosolic activity.
The effect of PTMs on Unr is yet to be determined, but we speculate that they may contribute to which proteins Unr interacts with and/or where Unr localizes within the cell.
Unr, an RNA-binding protein essential for mammalian development, has been shown to regulate a number of cellular processes, such as mitosis, differentiation and apoptosis. Mutations in the unr gene have also been linked to many cancers, further highlighting the importance of appropriate Unr expression and activity. Unr expression is regulated at the level of transcription, RNA processing and translation. Unr is also post-translationally modified and interacts with different sub-cellular complexes in response to changing cellular conditions, although further work will be required to investigate how these modifications and interactions control Unr function.
This work was supported by the Biotechnology and Biological Sciences Research Council [grant number BB/J001791/1] to E.C.A.
ARE, AU rich element
CSD, cold shock domain
EST, expressed sequence tag
hnRNPC1/C2, heterogeneous nuclear ribonucleoproteins C1/C2
HRV-2, human rhinovirus type 2
IRES, internal ribosome entry site
PABP1, poly(A)-binding protein 1
PTB, polypyrimidine tract-binding protein
STRAP, serine threonine receptor associated protein
SUMO, small ubiquitin-like modifier
Unr, upstream of N-ras
Unrip, Unr interacting protein
Translation UK 2015: Held at the University of Aberdeen, U.K., 7–9 July 2015.