Alternative splicing is widely credited with expanding the information encoded within the transcriptome. In recent years, several tightly regulated alternative splicing events have been reported that do not generate protein products, but instead produce unstable mRNA isoforms. These transcripts are either targeted for NMD (nonsense-mediated decay) or retained in the nucleus and degraded. In the present review I discuss the regulation of these events, and how many have been implicated in control of gene expression that is instrumental to a number of developmental paradigms. I further discuss their relevance to disease settings and conclude by highlighting technologies that will aid identification of more candidate events in future.
Principles of splicing
Post-transcriptional splicing of RNA is an essential and precisely regulated mechanism that ensures removal of introns and ligation of exons within multi-exon eukaryotic genes in order to produce a translatable mRNA. The spliceosome, a dynamic RNA–protein complex, carries out this splicing of pre-mRNA. Specifically, five snRNAs (small nuclear RNAs) and their associated RBPs (RNA-binding proteins), collectively termed snRNP (small nuclear ribonucleoprotein) complexes, are recruited to cis-elements in the pre-mRNA sequence via base pairing to the snRNAs. The cis-elements that are recognized by the snRNPs at exon/intron and intron/exon junctions consist of the 5′ splice site (AG/GURAG), 3′ splice site (a polypyrimidine tract followed by acceptor AG dinucleotide) and the branch point sequence (YURAY) (Figure 1a). Once an snRNP is recruited to its corresponding cis-element it performs a specific role in the splicing reaction at that site (reviewed comprehensively in ). In addition to the snRNPs, over 100 spliceosome-associated RBPs also recognize the same or additional cis-elements in the RNA and influence snRNP recruitment. For example, U2AF65 recognition of the polypyrimidine tract is required for recruitment of the U2 snRNP to the branch point sequence at early stages of splicing [2,3] (Figure 1a). Accordingly, the spliceosome is constantly re-modelled as splicing proceeds, allowing additional snRNPs and RBPs to interact with the RNA after previous steps have been completed .
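As a minimal sketch of how consensus motifs such as the branch point sequence quoted above can be located in a sequence, the following Python snippet expands IUPAC ambiguity codes (R = A or G, Y = C or U) into a regular expression and reports every match. The function names and the example RNA fragment are hypothetical and chosen purely for illustration. Exact consensus matching is also a simplification: real splice-site prediction generally relies on position weight matrices rather than strict motif matches.

```python
import re

# IUPAC ambiguity codes needed for the consensus motifs quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus such as 'YURAY' into a regex pattern."""
    return "".join(IUPAC[base] for base in consensus)

def find_motif(rna, consensus):
    """Return 0-based start positions of every match, including overlapping ones."""
    # A lookahead group lets overlapping occurrences be reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(rna)]

# Hypothetical RNA fragment, invented for illustration only.
example = "GGCUAACUUUUUCCCUUCAGG"
branch_point_candidates = find_motif(example, "YURAY")
print(branch_point_candidates)
```

To search for the 5′ splice site consensus in the same way, the slash marking the exon/intron boundary would first need to be removed from the motif string.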
Figure 1. Regulated use of cis- and trans-acting factors directs splicing and can lead to various types of alternative-splicing events
Splicing requires the presence of the 5′ splice site, 3′ splice site and branch-point sequence being located in appropriate spatial relation to one another [1,4]. However, correct combinations of these cis-acting elements do not always lead to a constitutive splicing event. Instead, alternative splicing is used to vastly expand the repertoire of mRNAs generated from a single transcript. Indeed, 95% of all human genes are expected to be alternatively spliced through selective use of cassette exons, use of alternative 5′ splice sites or 3′ splice sites, or through intron retention [5,6] (Figure 1b).
Alternative splicing is regulated by additional trans-acting RBPs, often with tissue-specific expression patterns and/or under post-translational control, that recognize certain cis-elements surrounding regulated junctions [7–10] (Figure 1a). This includes short SE (splicing enhancer) or SS (splicing silencer) motifs located within exons (ESE and ESS) or introns (ISE and ISS) that promote either exon inclusion or exclusion [11,12]. For example, the SR (serine/arginine) repeat-rich proteins are a conserved family of proteins that typically interact with enhancer sites and positively influence snRNP recruitment to the neighbouring splice junctions . SR protein binding to an exon has been demonstrated to enhance recruitment of the U1 snRNP to the 5′ splice site  and U2 snRNP to the upstream branch point region [15,16], and SR proteins are required for the assembly of the U4/U6 and U5 tri-snRNP on the pre-mRNA at later stages of splicing . In opposition, among others, are the hnRNPs (heterogeneous nuclear ribonucleoproteins), which tend to bind silencer sequences and promote skipping of exons . This is typically achieved by restricting access of other spliceosome-associated components to cis-elements in the RNA. For example, hnRNP C has a higher affinity for uridine-rich tracts than U2AF65, but is less tolerant of interspersed cytidines . The composition of the polypyrimidine tract will therefore determine whether hnRNP C blocks U2AF65 binding to the site and hence U2 snRNP recruitment.
In addition to enhancing the coding information contained within the transcriptome, it has become apparent that production of unstable mRNA isoforms via alternative splicing can be used by the cell as a strategy to quantitatively regulate gene expression. This can be through alternative splicing events that result in targeting of the transcript for NMD (nonsense-mediated decay). This is most commonly referred to as AS-NMD (alternative splicing-coupled NMD) or, alternatively, RUST (regulated unproductive splicing and translation). In addition, this can be achieved by controlling nuclear–cytoplasmic localization of certain transcripts within the cell via regulated intron retention.
NMD is an RNA surveillance pathway that targets PTC (premature termination codon)-containing transcripts for degradation during translation. Recognition of these transcripts involves both nuclear and cytoplasmic events  (Figure 2a). First, during pre-mRNA splicing, the EJC (exon–junction complex) is deposited on mature exon–exon junctions. This acts as a positional memory of the splicing event and is additionally required for correct nucleocytoplasmic shuttling of the mRNA [20,21]. Among others, two proteins present in the EJC deposited in the nucleus are the upstream frameshift family members UPF2 and UPF3, which are critical to later NMD activation. After shuttling to the cytoplasm the initial rounds of mRNA translation appear to clear EJC components from the transcript [22–24]. Once a PTC is recognized, the UPF1-containing SURF (SMG1, UPF1, ERF1 and ERF3) complex is deposited on the mRNA at the site of the terminating ribosome [25,26]. Should this be more than ~50 nucleotides upstream of an EJC that has not been cleared, then UPF1 interacts with UPF2 and UPF3, leading to UPF1 phosphorylation and, as a consequence, translational repression through target degradation . This NMD activation can be through inclusion of a PTC-containing exon [28–30], presence of an upstream ORF [31,32], skipping of a section of the 3′UTR  or through retention of a PTC-containing intron in a transcript which remains actively shuttled to the cytosol [28,33–35] (Figure 2b). A further observation is that longer 3′UTRs are more susceptible to NMD, which is likely determined by the distance between the termination codon and poly-A tail [36,37].
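The ~50-nucleotide rule described above lends itself to a simple positional check. The following Python sketch is illustrative only: the function name, the coordinate convention and the use of exon–exon junction positions as a stand-in for EJC positions are all assumptions made for this example, and the rule itself is a heuristic that does not capture other triggers mentioned in the text, such as upstream ORFs or long 3′UTRs.

```python
def is_predicted_nmd_target(stop_codon_end, exon_junctions, min_distance=50):
    """Apply the ~50-nucleotide rule to a spliced transcript.

    stop_codon_end: mRNA coordinate immediately after the termination codon.
    exon_junctions: mRNA coordinates of exon-exon junctions, used here as an
        assumed proxy for the positions of deposited EJCs.
    min_distance: threshold in nucleotides (about 50 according to the text).

    Returns True if at least one junction lies more than min_distance
    nucleotides downstream of the termination codon.
    """
    return any(junction - stop_codon_end > min_distance
               for junction in exon_junctions)

# Hypothetical transcript with junctions at positions 100, 250 and 420.
junctions = [100, 250, 420]
print(is_predicted_nmd_target(300, junctions))  # stop codon 120 nt upstream of the last junction
print(is_predicted_nmd_target(500, junctions))  # stop codon in the final exon
```

Under these assumptions, a termination codon in the final exon is never flagged, whereas one introduced by an upstream PTC-containing exon usually is.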
Figure 2. NMD targets PTC-containing transcripts for degradation during translation
Until recently it was expected that the main role of NMD was to clean up errors in splicing, and to eliminate both nonsense and frame-shift mutations that lead to production of damaging truncated proteins. However, this view has changed in light of findings demonstrating that up to a third of human transcripts are predicted to be NMD targets [38–40], and that many conserved cassette exons contain PTCs in all three frames, suggesting a specific role in triggering NMD [28,39]. Most importantly, many alternative splicing events of PTC-containing exons or within 3′UTRs appear to be both highly conserved and dynamically regulated. Collectively this points towards a regulatory role of NMD, in addition to the surveillance role, in which alternative splicing and translation are coupled to NMD to regulate gene expression [38,41,42]. Furthermore, it is expected that AS-NMD represents a particularly pervasive means by which gene expression can be regulated [38,43].
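The observation that some cassette exons carry PTCs in all three frames can be checked directly on an exon sequence. The sketch below is a hypothetical helper written for illustration, not a published method. It numbers frames relative to the first nucleotide of the exon, whereas the frame actually used in a transcript depends on the upstream coding sequence, and it ignores codons that span exon boundaries.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def frames_with_stop(exon):
    """Return the set of frames (0, 1, 2) in which the exon contains a stop codon.

    Frames are counted from the first nucleotide of the exon; DNA input
    is accepted and converted to RNA.
    """
    exon = exon.upper().replace("T", "U")
    frames = set()
    for frame in range(3):
        # Step through complete codons only, starting at the frame offset.
        for i in range(frame, len(exon) - 2, 3):
            if exon[i:i + 3] in STOP_CODONS:
                frames.add(frame)
                break
    return frames

def has_stop_in_all_frames(exon):
    """True if a stop codon is present in each of the three frames."""
    return frames_with_stop(exon) == {0, 1, 2}

# Hypothetical sequences, invented for illustration only.
print(has_stop_in_all_frames("UAAGUAGCUGA"))
print(frames_with_stop("AUGGCCAAA"))
```

An exon that returns True here would be expected to introduce a PTC regardless of the reading frame in which it is entered.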
Although the deliberate production of aberrant mRNA isoforms with low stability at first seems counterproductive, using AS-NMD as a means of regulating gene expression has a number of potential benefits to the cell. First, it offers a means by which the expression levels of a specific gene can be fine-tuned without having to interfere with the global activity of transcription factors or translational regulators, many of which bind hundreds, if not thousands, of sites across the genome/transcriptome [10,44–46]. Secondly, continual production of mRNA that is typically just a single event away from generating a translatable mRNA isoform offers a means to generate a rapidly responding molecular switch with which to respond to external cues . Finally, it can permit limited translation into a protein by directing transcript degradation after just a single round of translation. In doing so, NMD acts as a brake on protein expression as the message, in effect, self-destructs [35,45,46].
What is the meaning of this nonsense?
Since its initial suggestion, several AS-NMD candidate events have been identified. Following previous reports demonstrating that one member of the SR protein family regulates its own expression through coupling splicing to NMD , Lareau et al.  demonstrated that in fact all members of the SR protein family contained ultraconserved elements that were alternatively spliced in order to direct NMD. This included examples of PTC-containing exons, alternative splicing of the 3′UTRs and retention of PTC-containing introns. It is expected that the presence of these ultraconserved elements represents a negative-feedback loop since stabilization of NMD products through translational inhibition led to decreases in coding variants. Indeed, negative feedback has been confirmed for a tissue-specific hnRNP family member and splicing factor, PTBP1 (polypyrimidine tract-binding protein 1), which regulates its own levels through PTBP1-dependent skipping of exon 11 to create an AS-NMD transcript . PTBP1 is a regulator of alternative splicing, mRNA stability and localization, and this autoregulation is expected to fine-tune PTBP1 protein to appropriate levels.
Many other splicing factors similarly contain highly conserved regions consistent with generating NMD products [49–51], some of which demonstrate autoregulation or are implicated in negative-feedback loops [44,52–56]. This includes components of the core splicing machinery [56,57]. Collectively this suggests AS-NMD has a crucial role across evolution in modulating expression levels of many splicing factors, perhaps to regulate tissue-specific splicing patterns via regulation of mRNA stability , or to permit rapid response to external cues .
However, AS-NMD is not limited to RBPs. Network analysis following silencing of NMD proteins identifies transcription factors, signalling proteins and metabolic proteins as other protein families whose expression is regulated via NMD [58,59]. Moreover, many NMD components are themselves NMD-sensitive, suggesting autoregulation of the mechanism itself [37,60].
In recent years, several studies have progressed beyond identification of just the regulated sites and successfully distinguished the quality control aspects of NMD from bona fide AS-NMD to demonstrate direct functional consequences of specific events [29,35,44,61]. For example, the transition from epithelial to mesenchymal states is an important step in both development and the progression of epithelial tumour metastasis. Valacca et al.  have dissected a role for AS-NMD in this transition in which phosphorylation status of the RBP, Sam68 (Src-associated in mitosis 68 kDa protein), regulates splicing of the 3′UTR of the SR protein SF2 (splicing factor 2) in response to extracellular cues. Production of an NMD-sensitive and unstable SF2 transcript, following ERK (extracellular-signal-regulated kinase)-mediated Sam68 phosphorylation, leads to a reduction in the inclusion of exon 11 in the RON proto-oncogene. This generates a constitutively active form of RON that promotes this epithelial to mesenchymal transition.
PTBP1 and PTBP2 are two functionally related proteins which regulate alternative splicing of many overlapping and unique exons . The two proteins display opposing expression patterns during neuronal development. With the exception of neurons, PTBP1 is ubiquitously expressed. In contrast, PTBP2 is restricted to post-mitotic neurons, despite RNA transcripts being found in neural progenitor cells . A switch in usage from PTBP1 to PTBP2 occurs during neuronal differentiation, and this is expected to account for ~25% of the changes in alternative splicing that occur during this fate-determining transition. Importantly this switch is regulated by AS-NMD. Both genes contain homologous exons whose skipping, promoted by PTBP1, generates a PTC-containing transcript. In the case of PTBP1 this is as part of the autoregulatory feedback mechanism to limit PTBP1 levels , whereas in PTBP2 this forms the molecular switch implicated in neuronal differentiation [44,62]. Specifically, during neuronal differentiation, elevated miR-124 levels lead to down-regulation of PTBP1 and the subsequent inclusion of PTBP2 exon 10 [44,63]. This results in active translation of PTBP2 and elegantly demonstrates how AS-NMD of a single exon can lead to far-reaching effects on a population of genes. This transition may be even further strengthened by a general reduction of NMD activity which inhibits cell proliferation, inhibits TGFB (transforming growth factor β) signalling and drives expression of NMD-targeting miRNAs to reinforce this decision .
Another neuronal-specific regulator of alternative splicing, NOVA, had previously been demonstrated to regulate inclusion of a number of cassette exons and poly-A site choice by binding to YCAY clusters [8,64,65]. By focusing in on nuclear binding events, NOVA has now been shown to bind YCAY clusters deep within long introns while in the nucleus, and regulate multiple AS-NMD events that are ultimately important in determining the steady-state expression of host transcripts through regulation of their stability . This regulation is dynamic since seizure induction, which results in NOVA shuttling to the cytoplasm, led to corresponding changes in inclusion or skipping of these sites and changes to host gene levels as mRNA stability is modified. Indeed, these are present in several transcripts linked to seizure development such as DLG3 (discs large homologue 3) , SLC4a10 (solute carrier family 4, sodium bicarbonate transporter, member 10)  and SLC4a3 . Importantly, several of these sites are cryptic events not apparent in current gene annotations. This demonstrates the efficiency of NMD in eliminating these transcripts while additionally highlighting that many AS-NMD events may be presently unknown due to limitations in methods used to identify them.
Finally, a particularly elegant use of AS-NMD has been demonstrated in the local control of protein synthesis in neurons, which aids axon guidance during development . During embryonic development the commissural neurons of the spinal cord are first attracted to the ventral midline, cross, and then are repelled from the ventral midline in order to carry on their intended course of projection . This switch from attraction to repulsion is mediated by cues from the glial-containing floor plate , and involves a switch from Robo3.1 to Robo3.2 expression in the growth cone after ventral midline crossing . The two closely related transcripts coding these variants differ only by the presence of one retained intron in the Robo3.2 transcript. This introduces a PTC and makes Robo3.2 an NMD-sensitive transcript. Colak et al.  elegantly demonstrate in mouse samples that, unlike Robo3.1, which is translated in the cell body, substantial Robo3.2 mRNA is transported along axons and locally translated in response to floor plate cues after ventral midline crossing. NMD components are additionally associated with the Robo3.2 transcripts and found enriched at the growth cone, suggesting that local regulation of RNA stability by NMD may also be involved. Indeed, this leads to an excellent exploitation of the NMD machinery since, upon initiation of Robo3.2 translation, it is expected that only a single round of translation occurs before the transcripts are targeted for degradation. This is sufficient to drive adequate Robo3.2 expression to direct repulsion, perhaps one protein per transcript, but not so much that over-repulsion ensues. It is expected that such local control of AS-NMD events will be important for other axonal guidance events given the elevated expression of NMD components in growth cones of other neurons [35,70].
Regulation of AS-NMD
AS-NMD must be regulated on a cell-type specific basis in order to ensure appropriate gene expression levels of regulated targets are met. The limitation of AS-NMD to a single round of translation is one manner in which this could be achieved . However, recent work using CLIP (cross-linking and immunoprecipitation) from the group of Phillip Sharp has shed additional light on the fine-tuning mechanism of AS-NMD that permits different steady-state protein concentrations . CLIP analysis of the RNA targets of the RBP, Rbfox2 (RBP, fox-1 homologue 2), revealed an enrichment around cassette exons which had reduced splice site strength, and a preference for binding events located downstream of exons to lead to their inclusion. However, several strongly bound sites failed to reveal corresponding changes in inclusion when looking at RNA-seq data generated following Rbfox2 silencing. Closer inspection revealed that a significant proportion of these bound sites neighboured PTC-containing cassette exons, inclusion of which would lead to instability and degradation of the host transcript, and consequently to underestimation of the extent of regulation in RNA-seq data. Many of the genes with these identified events were RBPs, again hinting at the important role of AS-NMD in the regulation of RBP expression levels. Crucially, however, it was found that the protein expression levels of Rbfox2 could modulate the threshold of previously documented autoregulation of AS-NMD events in both Tia1 and Ptbp2, leading to corresponding changes in total gene expression [44,52,72]. This importantly sets a precedent for the re-analysis of other CLIP datasets to look for similar genome-wide regulation of AS-NMD events by other RBPs.
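The reason RNA-seq can underestimate regulation of PTC-containing exons can be illustrated with a deliberately simple steady-state model, in which the abundance of each isoform equals its production rate divided by its decay rate. This model, the function name and the numerical values are assumptions introduced here for illustration; they are not taken from the studies discussed above.

```python
def observed_inclusion(true_inclusion, decay_included, decay_skipped):
    """Inclusion level measured at steady state when isoforms decay differently.

    true_inclusion: fraction of newly spliced transcripts that include the exon.
    decay_included: decay rate of the exon-included (e.g. NMD-sensitive) isoform.
    decay_skipped: decay rate of the exon-skipped isoform.

    Assumes steady-state abundance = production / decay for each isoform.
    """
    included = true_inclusion / decay_included
    skipped = (1.0 - true_inclusion) / decay_skipped
    return included / (included + skipped)

# Hypothetical case: half of transcripts include a PTC-containing exon,
# and the included isoform is degraded five times faster.
print(observed_inclusion(0.5, decay_included=5.0, decay_skipped=1.0))
```

In this hypothetical case an exon included in half of all splicing events would appear to be included in only about one sixth of the transcripts detected, which is why inhibiting NMD or enriching for nuclear RNA can reveal the true extent of regulation.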
Further to this, as direct NMD targets, at least seven components of the NMD pathway (UPF1, UPF2, UPF3B, SMG1, SMG5, SMG6 and SMG7) are subject to feedback regulation that is exerted in both a cell-type-specific and developmentally regulated manner [60,73]. Moreover, this feedback could be separated into UPF3B-sensitive and -insensitive branches, corroborating the idea that there are multiple branches of the NMD pathway which have requirements for different combinations of NMD factors [74,75]. The purpose of this feedback is expected to be to ensure correct buffering of NMD components in response to external stimuli. Supporting this, it was demonstrated that three of these regulated genes (SMG1, SMG5 and SMG6) are rate-limiting to NMD in general, and blocking the autoregulated up-regulation of one of these following UPF1 depletion severely hampered the NMD response . In contrast, overexpression of one of these rate-limiting factors, in order to induce overactive NMD, resulted in reduced cell proliferation. This supports the hypothesis that maintaining expression of NMD components at cell-type specific requirements, and tight feedback regulation of this pathway, are critical for normal cellular homoeostasis [30,60].
Intron retention-mediated regulation of gene expression
In addition to AS-NMD, co-ordinated intron retention can also regulate gene expression. Although AS-NMD may ensue if the transcript is efficiently exported to the cytosol and recognized by NMD machinery (Figure 2b) [28,33–35], in other cases export is inhibited, leading to nuclear retention and degradation (Figure 2c) [76–79]. Although the precise mechanism by which transcripts with retained introns are degraded needs further study, it appears that both components of the nuclear exosome (e.g. Dis3, Exosc10 and Exosc9) and nuclear pore-associated proteins (e.g. Tpr) are involved . Intron retention is less well understood than the mechanisms underlying AS-NMD of cassette exons, but appears to be dependent on a high GC content of introns, nucleosome density and weak splice sites, and to be linked to reduced availability of splicing components [34,80,81]. Increasing evidence suggests it could be of comparable importance and prevalence to AS-NMD. Indeed, up to 15% of human protein-coding genes have evidence of such events , although many are in 3′UTRs and may yet prove to be processed as AS-NMD targets.
The importance of intron retention is highlighted by two key reports. In leucocyte differentiation pathways the differential regulation of 86 intron retention events leads to the NMD of a collection of functionally related genes linked to leucocyte function at specific developmental time points . Meanwhile, in the case of neuronal differentiation, PTBP1 represses a number of 3′UTR intron removal events, which leads to nuclear retention and degradation of the host transcripts in an NMD-independent fashion. Following the PTBP1 to PTBP2 switch [44,63], these introns are removed to produce translation-competent transcripts that are exported to the cytosol and then participate in the neuronal differentiation programme . In fact PTBP1 may have a more general role in retaining pre-mRNA in the nucleus , together with the U1 snRNP and U2AF65 . These two studies clearly identify the coupling of alternative splicing to nuclear retention and degradation as a powerful means of regulating gene expression, and it is predicted that other examples will surface in due course. For a more extensive review on intron retention, please see .
Unstable mRNA isoforms and disease
Splicing events that result in unstable mRNA isoforms have clear relevance to disease settings and may represent novel therapeutic targets. As many as a third of mutations leading to genetic diseases are due to mutations that create unstable NMD-sensitive transcripts [43,85–87]. Among others, this includes α-thalassaemia , β-thalassaemia , retinitis pigmentosa , ataxia telangiectasia , spinal muscular atrophy  and numerous cancers [93–96]. Accordingly, drugs which promote PTC read-through are attracting considerable attention as potential therapies [97–99], although these are likely to be specific to individual cases .
In addition, reports have demonstrated disease-relevant disruption of AS-NMD events involved in autoregulation of certain genes. FUS is an ALS (amyotrophic lateral sclerosis)- and FTLD (fronto-temporal lobar degeneration)-associated RBP which has been shown to control its own expression through regulated skipping of exon 7, leading to an NMD-sensitive transcript [101,102]. However, discordant regulation of this exon was reported with three pathogenic mutations that all lead to cytoplasmic accumulation of FUS. This led to higher levels of inclusion of the exon and the subsequent self-promotion of the cytoplasmic localization. This autoregulation could be re-induced in mutants through use of exon-skipping antisense oligonucleotides, which importantly led to restoration of nuclear FUS localization, suggesting this may be a useful target for ALS/FTLD should appropriate delivery strategies be developed. Similar disruption to the autoregulation of another ALS/FTLD-associated protein, TDP-43 (TAR DNA-binding protein-43), is also expected to be pathogenic [103,104], although this remains to be definitively confirmed.
Finally, genetic perturbations to NMD pathway components lead to genome-wide effects on mRNA stability and disruption of gene regulation that contributes to disease. As evidence of this, naturally occurring mutations in the UPF3B gene are a direct cause of intellectual disability and other mental disorders [105–107]. This is expected to affect only UPF3B-sensitive NMD targets in one branch of the NMD pathway [74,75]. Indeed only approximately 5% of human genes are affected, whereas the protein product of its paralogue, UPF3A, is stabilized in order to compensate .
The discussed examples make it clear that regulated splicing to produce unstable mRNA isoforms has a fundamental role in regulating gene expression genome-wide and adds a further dimension to regulation of the transcriptome by alternative splicing. It is expected that growing appreciation of the best methods to identify these events will dramatically expand the known cases in the future. Indeed, the precise scale of unstable alternative mRNA isoforms remains unclear. This is partly because much genomic analysis focuses on reference annotations rather than additionally exploring novel splicing events where new candidates may be found. Indeed, recent reports have identified many cryptic splicing events that are tightly repressed and therefore not present in current annotations, but which are clearly under dynamic regulation [18,29,72]. Such cryptic splice sites are widespread across the genome . It will therefore be interesting to see how many of these cryptic sites are used in additional AS-NMD or intron-retention events as appropriate sequencing approaches to detect them are used.
The discovery of unstable alternative mRNA isoforms is made harder by the fact that they are efficiently degraded and difficult to detect with methods such as whole-cell RNA-seq . Re-evaluation of current genome annotations using functional genomics data has recently expanded our knowledge of the transcriptome in the fruitfly [110,111]. This has increased the known number of regulated exons in this species, many of which are candidates for regulation by NMD, and it is hoped that similar analysis will be as fruitful in more complex species. This will provide new co-ordinates for mapping of next generation sequencing data that will incorporate many new AS-NMD candidate events. Further to this, the splicing events underlying both AS-NMD and intron retention are carried out in the nucleus, with the resulting transcripts then rapidly degraded in the cytoplasm or nucleus respectively. Therefore the use of nuclear RNA-seq will help enrich for these characteristic events before their degradation.
As an alternative, CLIP-based studies have repeatedly revealed novel cryptic sites that are implicated in the regulation of gene expression via generation of unstable mRNAs [18,29,72]. This is because CLIP can cross-link interactions between nuclear RBPs and the low abundance unstable targets before their degradation. The result is that CLIP studies of candidate regulatory proteins can hugely facilitate cryptic transcript detection. Numerous CLIP studies have now been undertaken , and it is expected that re-analysis of existing data will reveal many more regulated splicing events that lead to unstable mRNA isoforms with both biological and disease relevance.
RNA UK 2014: An Independent Meeting held at Low Wood Hotel, Windermere, U.K., 24–26 January 2014. Organized and Edited by Niki Gray, Gracjan Michlewski and Steve West (University of Edinburgh, U.K.).
Abbreviations: ALS, amyotrophic lateral sclerosis; AS-NMD, alternative splicing-coupled NMD; FTLD, fronto-temporal lobar degeneration; PTBP, polypyrimidine tract-binding protein; PTC, premature termination codon; Rbfox2, RBP, fox-1 homologue 2; Sam68, Src-associated in mitosis 68 kDa protein; SF2, splicing factor 2; SLC4, solute carrier family 4; snRNA, small nuclear RNA; snRNP, small nuclear ribonucleoprotein; SURF, SMG1, UPF1, ERF1 and ERF3.
I thank Jernej Ule and Andrea D’Ambrogio for helpful discussion and a critical reading of the paper during preparation.
Funding from the European Research Council supported this work.