Alternative pre-mRNA splicing is frequently used to expand the protein-coding capacity of genomes, and to regulate gene expression at the post-transcriptional level. It is a significant challenge to decipher the molecular language of tissue-specific splicing because the inherent flexibility of these mechanisms is specified by numerous short sequence motifs distributed in introns and exons. In the present study, we employ the glutamate NMDA (N-methyl-D-aspartate) R1 receptor (GRIN1) transcript as a model system to identify the molecular determinants for a brain region-specific exon silencing mechanism. We identify a set of guanosine-rich motifs that function co-operatively to regulate the CI cassette exon in a manner consistent with its in vivo splicing pattern. Whereas hnRNP (heterogeneous nuclear ribonucleoprotein) A1 mediates silencing of the CI cassette exon in conjunction with the guanosine-rich motifs, hnRNP H functions as an antagonist to silencing. Genome-wide analysis shows that, while this motif pattern is rarely present in human and mouse exons, those exons for which the pattern is conserved are generally found to be skipped exons. The identification of a similar arrangement of guanosine-rich motifs in transcripts of the hnRNP H family of splicing factors has implications for their co-ordinate regulation at the level of splicing.
Alternative pre-mRNA splicing diversifies the expression of most human protein coding genes, but these regulatory mechanisms are poorly understood [1–3]. A central problem in post-genome biology is to understand the molecular codes that allow for intricate adjustments and co-ordination of splicing patterns on a global scale, and how this inherent flexibility allows for splicing errors in conjunction with human disease (Figure 1).
The problem of inherent flexibility
Splicing decisions are made in the context of the spliceosome, which is assembled by the stepwise association of pre-mRNA with the U1, U2 and U4/U5/U6 small nuclear ribonucleoprotein particles. The spliceosome is central to exon and intron recognition, to catalytic activity and to the coupling of the splicing process with transcription, 3′-end formation and nuclear mRNA export. The recognition of an internal cassette exon, which is the most common type of splicing decision, depends on the strengths of the adjacent 3′- and 5′-splice sites, as well as on numerous control sequences that are distinct from the splice sites [6,7]. These control sequences include exonic and intronic splicing enhancers (ESEs and ISEs) and silencers (ESSs and ISSs). Splicing factors that play important roles in recognizing these control sequences include the serine/arginine-rich (SR) and hnRNP (heterogeneous nuclear ribonucleoprotein) families of protein factors. The SR splicing factors generally mediate splicing enhancement through the recognition of ESE motifs, whereas the hnRNP splicing factors play diverse regulatory roles in enhancement and silencing. Exon silencing of many transcripts is mediated by the polypyrimidine tract-binding protein (or hnRNP I), which recognizes UCUU and (UC)n sequence motifs, or by hnRNP A1, which recognizes UAGGG[U/A] motifs. In contrast, splicing enhancement is mediated by the hnRNP H protein family, which recognizes GGGA motifs, or by CUGBP and ETR3-type factors, which recognize (GU)n motifs. SR and hnRNP splicing factors, whose levels vary in different cell types, are thought to play crucial roles in the regulation of tissue-specific splicing decisions by shifting the balance of control towards enhancement or silencing (see e.g. [13–15]).
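The factor-to-motif relationships listed above can be written down as a small lookup table and used to scan an RNA sequence. The following is only a minimal sketch: the names `MOTIFS`, `MIN_REPEATS` and `scan` are hypothetical, and the minimum repeat number used for the (UC)n and (GU)n motifs is an assumption, since the text does not specify n.

```python
import re

# Assumption: the text gives (UC)n and (GU)n without a value for n;
# three or more repeats is an arbitrary illustrative threshold.
MIN_REPEATS = 3

# Motifs as described in the text, written as regular expressions.
MOTIFS = {
    "PTB (hnRNP I)": [r"UCUU", rf"(?:UC){{{MIN_REPEATS},}}"],
    "hnRNP A1": [r"UAGGG[UA]"],
    "hnRNP H family": [r"GGGA"],
    "CUGBP/ETR3-type": [rf"(?:GU){{{MIN_REPEATS},}}"],
}

def scan(rna):
    """Return {factor: sorted 0-based start positions} for motifs found in rna.

    DNA input is accepted and converted to RNA (T -> U).
    Matches for a given pattern are non-overlapping.
    """
    rna = rna.upper().replace("T", "U")
    hits = {}
    for factor, patterns in MOTIFS.items():
        starts = sorted({m.start() for p in patterns for m in re.finditer(p, rna)})
        if starts:
            hits[factor] = starts
    return hits
```

A motif match of this kind indicates only a candidate binding site; as the text notes, whether a site acts in enhancement or silencing depends on the factors present in a given cell type.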
A pattern of guanosine-rich sequence motifs directs splicing silencing of the CI cassette exon
The CI cassette exon of the glutamate NMDA (N-methyl-D-aspartate) R1 receptor (GRIN1) transcript was the subject of the present study because of its strong tissue specificity, which involves prominent exon inclusion in the forebrain and exon skipping in the hindbrain (Figure 2). The NI cassette exon, which lies upstream in the same pre-mRNA transcript, is spliced in a largely reciprocal pattern. Glutamate NMDA receptors, which are expressed as complexes of NR1 and NR2 subunits, play important roles in learning and memory functions in the brain. Alternative splicing of the CI cassette exon of the NR1 subunit regulates the localization of the receptor and its mode of trafficking to the synapse [17,18]. We demonstrated previously in co-expression assays that the RNA-binding protein, NAPOR/CUGBP2, mediates enhancement of the CI cassette exon and silencing of the NI exon. These functional characteristics, together with its forebrain-enriched expression, suggest that NAPOR/CUGBP2 is a good candidate for an in vivo splicing regulator with dual functionality.
Tissue specificity and working models for splicing regulation of the NI and CI cassette exons of GRIN1 pre-mRNA
This study was initially motivated by the desire to understand why the CI cassette exon, which has strong adjacent splice sites, is prominently skipped in the hindbrain and why a splicing activator is required for exon inclusion in the forebrain. Theoretically, a strong silencing mechanism imposed on an otherwise strong exon would explain this apparent dilemma. To address this problem, K. Han (a graduate student in my laboratory) evaluated the in vivo splicing pattern of the CI cassette exon after site-directed mutagenesis of predicted ESE motifs as well as sequences in the downstream intron. Of the seven ESE motifs tested, six behaved like true enhancers, since 1–3 bp changes in these motifs increased exon skipping. These motifs, which are specific for the splicing factors ASF/SF2, SC35 and SRp40, are shown schematically in Figure 2. In one case, however, the predicted ASF/SF2 motif, CGUAGGU, behaved like a strong splicing silencer. Mutations within the UAGG sequence of the ASF/SF2 motif resulted in strong exon inclusion, indicating that the primary role of this motif is to silence splicing. UAGG motifs have been well characterized in other systems as signals for hnRNP A1-mediated silencing [20–24]. A second UAGG motif in the CI cassette exon was also tested and found to play an important role in strengthening the silencing effect. Additional experiments examined the roles of intronic sequences downstream of the CI cassette exon. A GGGG motif, which is positioned immediately adjacent to the 5′-splice site of the CI cassette exon, was shown to play an important role in the silencing mechanism. The identification of an intronic GGGG motif as a component of the silencing mechanism was surprising, since studies in other systems have demonstrated that guanosine-rich sequences in downstream introns generally play enhancing roles in transcripts such as β-globin. In addition, the downstream control region of the c-src transcript contains guanosine-rich sequences, which play important roles in the enhancement of neuron-specific inclusion of the NI exon.
Our results suggested a mechanism in which two exonic UAGGs and the GGGG adjacent to the 5′-splice site function co-operatively to silence the CI cassette exon. To investigate this model further, we asked how the number and position of these motifs affect splicing silencing. Different sets of splicing reporters with a GGGG adjacent to the 5′-splice site were constructed so that the UAGGs were varied in number and position within the exon. These results clearly showed that a single UAGG in the exon contributed to weak silencing, whereas two UAGGs produced strong silencing. In contrast, the effects of position were modest. When a third UAGG motif was introduced into the exon, exon skipping increased to nearly 100%. Finally, the disruption of the entire set of UAGG and GGGG motifs in and adjacent to the CI cassette exon allowed for 100% exon inclusion, in agreement with the idea that these motifs enforce silencing of an otherwise strong exon.
Silencing of the CI cassette exon is mediated by hnRNP A1, and antagonized by hnRNP H through the guanosine-rich motifs
Which protein factors interact directly with the UAGG and GGGG motifs, and what are their roles in splicing silencing? Proteins interacting directly with the exonic UAGG motif at position 91 and with the GGGG motif were identified by UV cross-linking and immunoprecipitation analysis using radiolabelled RNA substrates and HeLa nuclear extracts. These experiments showed that hnRNP A1 interacts with the UAGG motif, as expected, whereas proteins of the hnRNP H family interact prominently with the GGGG motif. Subsequently, co-expression of each of these splicing factors with the CI cassette exon splicing reporter was tested to assess in vivo roles of these factors in splicing silencing. Although hnRNP A1 was found to play a role in silencing, as expected, hnRNP H and F were found to enhance exon inclusion. These experiments also demonstrated a silencing role for a distal region of the downstream intron, which contains several UAGG motifs, and which is required for hnRNP A1-mediated silencing.
Genome-wide analysis of UAGG and GGGG motifs in human and mouse exons
To identify additional transcripts that harbour exons with UAGG and GGGG motifs, and to determine how frequently these motifs are associated with skipped exons throughout the genome, we performed a genome-wide analysis in collaboration with G. Yeo and C. Burge at the Massachusetts Institute of Technology. A gene database containing approx. 96000 orthologous human and mouse exons was searched for the motif pattern (GGGG in the first 10 bases of the intron and ≥1 UAGG in the exon). Although 200 exons of each species were found to contain the motif pattern, in only 19 exons was the motif pattern conserved in sequence and position in human and mouse orthologous exons (approx. 0.02% of the exons searched). Thus we conclude that the conserved motif pattern is quite rare.
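The search criterion described above (GGGG in the first 10 bases of the downstream intron and at least one UAGG in the exon) can be illustrated with a short sketch. This is not the pipeline used in the study: the function names are hypothetical, the GGGG motif is assumed to lie entirely within the 10-base window, and conservation is reduced to a simplified test that requires one UAGG and one GGGG at identical offsets in the human and mouse sequences, without any alignment.

```python
import re

def motif_positions(seq, motif):
    """Return 0-based start positions of motif in seq, allowing overlaps.

    DNA input is accepted and converted to RNA (T -> U).
    """
    seq = seq.upper().replace("T", "U")
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]

def has_silencing_pattern(exon, downstream_intron, window=10):
    """True if the exon has >=1 UAGG and the first `window` intron bases contain GGGG."""
    exon_hits = motif_positions(exon, "UAGG")
    intron_hits = motif_positions(downstream_intron[:window], "GGGG")
    return bool(exon_hits) and bool(intron_hits)

def pattern_is_conserved(human, mouse, window=10):
    """Simplified conservation test (an assumption, not the published method).

    human, mouse: (exon, downstream_intron) tuples for orthologous exons.
    Requires at least one UAGG and one GGGG at the same offset in both species.
    """
    h_exon, h_intron = human
    m_exon, m_intron = mouse
    shared_uagg = set(motif_positions(h_exon, "UAGG")) & set(motif_positions(m_exon, "UAGG"))
    shared_gggg = set(motif_positions(h_intron[:window], "GGGG")) & set(
        motif_positions(m_intron[:window], "GGGG")
    )
    return bool(shared_uagg) and bool(shared_gggg)
```

Requiring identical offsets is stricter than a real orthologue comparison would need to be, since insertions or deletions shift positions; an alignment-based comparison would be the natural refinement.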
In the dataset of 19 exons with the motif pattern, it was surprising to find transcripts that encode two known splicing factors, hnRNP H1 and H3. Exon 5 of hnRNP H1 and exon 3 of hnRNP H3 are identical in length (139 bp) and nearly identical in sequence, with an exonic UAGG and a 5′-splice site GGGG motif. Evidence that these are skipped exons comes from EST and cDNA data, and from reverse transcriptase–PCR analysis in a variety of human and mouse tissues. These results suggest that the expression of the hnRNP H protein family may be negatively regulated by hnRNP A1, since exon skipping causes a frameshift early in the coding sequence. Furthermore, these exons may be subject to positive autoregulation by hnRNP H family members, based on the effects of these proteins on CI cassette exon inclusion. The analysis of 16 examples from the dataset of 19 exons using reverse transcriptase–PCR approaches demonstrated that a high rate of exon skipping is associated with these exons.
To probe the association between the guanosine-rich silencing motif pattern and exon skipping further, we used an independent computational approach. In this experiment, the dataset of 96000 human exons was sorted into two groups, one with and one without the motif pattern. Exons in the dataset were restricted to ≤250 bp to represent typical exon lengths. Next, ESTs and cDNAs corresponding to skipped exons were mapped on to each of these datasets. Exons containing the UAGG and GGGG motif pattern showed a significantly higher frequency of exon skipping (18.8%) compared with exons lacking the motif pattern (4.6%). This computational approach also showed a significant association of exon skipping for the reciprocal arrangement of motifs (≥1 GGGG exonic motif and UAGG in the first 10 bases of the intron).
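The comparison described above amounts to partitioning length-filtered exons by the presence of the motif pattern and computing the fraction of skipped exons in each group. A minimal sketch follows; the `Exon` record and its fields are hypothetical, the skipping flag is assumed to have been derived beforehand from EST/cDNA mapping, and no significance test is included.

```python
from dataclasses import dataclass

@dataclass
class Exon:
    length: int        # exon length in bases
    has_pattern: bool  # motif pattern present (assumed precomputed)
    skipped: bool      # EST/cDNA evidence of skipping (assumed precomputed)

def skipping_frequencies(exons, max_length=250):
    """Return the fraction of skipped exons with and without the motif pattern.

    Exons longer than max_length are excluded, mirroring the <=250 bp filter.
    A group with no exons yields NaN.
    """
    counts = {True: [0, 0], False: [0, 0]}  # has_pattern -> [skipped, total]
    for exon in exons:
        if exon.length > max_length:
            continue
        counts[exon.has_pattern][1] += 1
        if exon.skipped:
            counts[exon.has_pattern][0] += 1

    def fraction(skipped, total):
        return skipped / total if total else float("nan")

    return {
        "with_pattern": fraction(*counts[True]),
        "without_pattern": fraction(*counts[False]),
    }
```

With real data, the two fractions would then be compared using a test for a difference in proportions to judge significance.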
Summary and prospects
The present study defines a pattern of UAGG and GGGG motifs involved in silencing the CI cassette exon of the GRIN1 transcript, which can be used to identify other skipped exons in the human genome. Each predicted ESE motif within the CI exon was confirmed to have an enhancing role except for the ASF/SF2 motif, CGUAGGU, which harbours the silencer, UAGG. Two exonic UAGGs and the intronic GGGG motif function co-operatively to impose silencing on the CI cassette exon. In our assays, silencing was shown to involve the repressive function of the well-known splicing factor, hnRNP A1, and hnRNP H was shown to counteract this effect. Thus the ratio of hnRNP H and A1 proteins may be an important factor in determining the in vivo patterns of CI cassette exon inclusion in the brain. Sequence motifs involved in splicing repression as mediated by hnRNP A1 have been well studied in a variety of transcripts. Whereas the presence of an exonic UAGG is a relatively common feature of exons silenced by hnRNP A1, a GGGG motif adjacent to the 5′-splice site has not been reported previously. Whether silencing of the CI cassette exon involves a direct or indirect interaction of hnRNP A1 with the GGGG motif will need to be addressed in future experiments.
When this analysis was extended using bioinformatics to explore the wider role of the silencing motif pattern in the genome, a conserved pattern of UAGG and GGGG motifs was found to be a rare sequence feature of human and mouse exons. It is intriguing, however, that the group of exons containing the motif pattern is associated with a significantly higher frequency of exon skipping compared with exons lacking these motifs. In general, a perfect correlation between exon skipping and the presence of a silencing motif code would not be expected, since an opposing mode of regulation, such as that specified by ESEs and/or ISEs, may mask silencing in particular cells or tissues.
The prevalence of alternative splicing events in the human genome calls for an understanding of these mechanisms of control. Results from the present study suggest that it may be generally useful to identify co-regulated exons using searches for biochemically defined sequence motif patterns. These results may have additional value in interpreting the effects of some types of splicing abnormalities associated with inherited mutations or disease pathologies.
Genes: Regulation, Processing and Interference: A Focus Topic at BioScience2004, held at SECC Glasgow, U.K., 18–22 July 2004. Edited by I. McEwan (Aberdeen, U.K.), B. White (Glasgow, U.K.), S. Graham (Glasgow, U.K.), S. Roberts (Manchester, U.K.), A. Sharrocks (Manchester, U.K.), D. Black (Organon, U.K.), S. Newbury (Oxford, U.K.), J. Sayers (Sheffield, U.K.) and A. Lloyd (University College London, U.K.).
This report is based on the collaborative study by K. Han, G. Yeo, C. Burge and P. Grabowski. I thank my co-authors for their contributions to this work.