The alternative splicing of human genes is dependent on SR proteins, a family of essential splicing factors whose name derives from a signature C-terminal domain rich in arginine–serine dipeptide repeats (RS domains). Although the SRPKs (SR-specific protein kinases) phosphorylate these repeats, RS domains also contain prolines with flanking serines that are phosphorylated by a second family of protein kinases known as the CLKs (Cdc2-like kinases). The role of specific serine–proline phosphorylation within the RS domain has been difficult to assign since CLKs also phosphorylate arginine–serine dipeptides and, thus, display overlapping residue specificities with the SRPKs. In the present study, we address the effects of discrete serine–proline phosphorylation on the conformation and cellular function of the SR protein SRSF1 (SR protein splicing factor 1). Using chemical tagging and dephosphorylation experiments, we show that modification of serine–proline dipeptides broadly amplifies the conformational ensemble of SRSF1. The induction of these new structural forms triggers SRSF1 mobilization in the nucleus and alters its binding mechanism to an exonic splicing enhancer in precursor mRNA. These physical events correlate with changes in the alternative splicing of over 100 human genes based on a global splicing assay. Overall, these studies draw a direct causal relationship between a specific type of chemical modification in an SR protein and the regulation of alternative gene splicing programmes.
Most genes contain introns that are removed in a process known as mRNA splicing. This chemical transformation occurs at the spliceosome, a macromolecular complex composed of five RNA–protein complexes (the U1, U2, U4, U5 and U6 snRNPs) and over 100 protein subunits. In the latter group, the SR proteins [splicing factors containing a C-terminal domain rich in arginine–serine repeats (RS domain)] are essential factors that promote the splicing of precursor mRNA in SR protein-deficient S100 cell extracts. Because the SR protein family consists of 12 members [SRSF1–12 (SR protein splicing factors 1–12)] and any individual member can complement S100 extracts, it was thought originally that SR proteins are functionally redundant [3,4]. However, individual knockouts of several SR proteins (SRSF1–3) were found to be embryonically lethal in mice, implying that each SR protein may have a specialized function [5–7]. Although SR proteins play numerous roles in mRNA processing, their ability to select 5′ and 3′ splice sites within precursor mRNA positions them as key regulators of alternative splicing, a process whereby a single gene produces multiple, unique mRNA isoforms [9,10]. Despite this premier function, how SR proteins guide splice-site selection events is still poorly understood. All SR proteins contain one or two small RNA recognition motifs (RRMs) that are flanked by a signature C-terminal domain rich in arginine–serine dipeptide repeats (RS domains). RS domain phosphorylation is necessary for formation of primitive complexes of the spliceosome where the splice sites are initially established. However, dephosphorylation drives formation of the fully mature, active spliceosome. Thus, correlating specific modes of RS domain phosphorylation and SR protein function is important for understanding gene processing and its broader impact on cell function.
SR proteins are phosphorylated by two major protein kinase families: the SRPKs (SR-specific protein kinases) and the CLKs (Cdc2-like kinases). SRPKs are present in both the cytoplasmic and nuclear compartments, although their function is best understood in the cytoplasm. SRPKs phosphorylate multiple arginine–serine dipeptide repeats in RS domains, a process that facilitates binding of cytoplasmic SR proteins to the SR-transportin for nuclear import. Peptide mapping and immunofluorescence studies reveal that the RS domain of the prototype SR protein SRSF1 can be divided into two functional regions, RS1 and RS2 [14–16]. SRPK1 rapidly phosphorylates eight serines in RS1, directing the splicing factor to the nucleus where it largely resides in speckles (Figure 1A). In comparison, CLKs are strictly localized to the nucleus and have distinct substrate specificities. Like SRPKs, CLKs phosphorylate Arg–Ser dipeptides but, unlike SRPKs, also modify Ser–Pro dipeptides [14,18]. In addition, SRPKs and CLKs may display region-specific differences within the RS domain. CLK1 releases SRSF1 from nuclear speckles only when RS2 is present, suggesting that the kinase may specifically modify RS2 and regulate subnuclear localization (Figure 1A). Since RS2 contains a mixture of both Arg–Ser and Ser–Pro dipeptides, it is currently unknown whether the effects of CLK1 on SRSF1 nuclear localization are due to one or both phosphorylation modes. Furthermore, CLK expression causes very specific changes in splicing of the E1a mini-gene compared with SRPK. Such findings indicate that differences in phosphorylation specificities may correlate not only with SR protein subnuclear localization but also with changes in splicing outcomes.
Figure 1. Formation of hyper-phosphorylated SRSF1 using CLK1
The CLK and SRPK families employ very different mechanisms for RS domain phosphorylation. SRPKs use a docking groove to feed Arg–Ser dipeptide repeats into the active site, a process that occurs in a directional, semi-processive manner for SRSF1 [20–22]. In comparison, CLKs lack this docking groove and instead possess an intrinsically disordered N-terminal extension that directly contacts the RS domains of SR proteins and draws them into the active site. Although SRPK1 rapidly phosphorylates RS1 in SRSF1, it can also phosphorylate Arg–Ser dipeptides in RS2. However, the net rate of the latter reaction is ~100-fold lower and is not important for cytoplasmic–nuclear distribution of SRSF1. In comparison, CLK1 has no preference for either segment, modifying both RS1 and RS2 with equal efficiency (Figure 1A). Previous studies showed that CLK1 phosphorylation induces a mobility shift for SRSF1 on SDS/PAGE. This shift does not occur with SRPK1 and is dependent on the phosphorylation of the Ser–Pro dipeptides in RS2. These findings reveal that CLKs use a novel mechanism to expand their substrate specificities from strictly phosphorylating Arg–Ser repeats to also modifying Ser–Pro dipeptides conserved in all SR proteins. However, it is still not clear whether the unique cellular function of CLKs compared with SRPKs is the result of dipeptide-specific (Arg–Ser versus Ser–Pro) or region-specific (RS1 versus RS2) phosphorylation in the RS domain.
Although phosphorylation is critical for splicing, a molecular understanding of how residue-specific modifications impact SR protein conformation and interactions with exonic splicing enhancers (ESEs) in the spliceosome has been very difficult to define. Indeed, very little is known about the structure of any full-length SR protein and its phosphorylation state. SR proteins have very high isoelectric points (≥11), making phosphopeptide mapping impossible by traditional 2D isoelectric focusing methods. Furthermore, SR proteins display poor solubility and aggregate at high concentrations. The latter feature, while vexing for traditional structural approaches, may be important for biological function as SR proteins engage in protein–protein interactions within the spliceosome [25–27]. In comparison with these limitations, the NMR structures of several RRMs have been solved. The RRMs from SRSF1 and SRSF2 reveal a traditional RNA binding motif with a four-stranded β sheet packed against two α helices [28,29]. However, in stark contrast with these well-folded domains, the RS domains of SR proteins possess numerous disorder-promoting amino acids and are thus considered intrinsically disordered. The lack of 3D data on a complete SR protein is likely due to solubility limitations imposed by the disordered RS domain. These inherent drawbacks present a problem for traditional molecular approaches and demand alternative methods for viewing the conformation of the SR protein in the context of its critical biological function.
Given the special substrate specificities of CLKs, we wished to understand the roles of Arg–Ser versus Ser–Pro phosphorylation in this kinase family and evaluate their effects on SR protein function. Although highly enriched in Arg–Ser stretches, SR proteins also contain multiple Ser–Pro dipeptides scattered throughout their RS domains (Supplementary Figure S1). To analyse how these two dipeptide types influence SR protein biological activity we studied the prototype SRSF1. Based on chemical chelation and dephosphorylation experiments, we found that the Ser–Pro phosphorylation significantly increases the conformational heterogeneity of SRSF1. This broadening of the conformational ensemble induces changes in the relationship of the RS domain to its N-terminal RRMs and induces co-operative binding of SRSF1 to an ESE. Immunofluorescence studies indicate that the Ser–Pro dipeptides are necessary for CLK1-induced dispersion of nuclear speckles. Using a RASL-seq assay to monitor global splicing effects [31,32], we found that the Ser–Pro dipeptides in SRSF1 control the alternative splicing of more than 100 genes. The data suggest a model in which proline-directed phosphorylation uniquely mobilizes SRSF1 from storage in nuclear speckles to active sites of splicing where it engages precursor mRNA and broadly alters exon usage.
MATERIALS AND METHODS
Adenosine triphosphate (ATP), 3-(N-morpholino)propanesulphonic acid (MOPS), Tris(hydroxymethyl)aminomethane (Tris), MgCl2, NaCl, EDTA, acetic acid, lysozyme, DNase, RNase, Phenix imaging film, BSA and liquid scintillant were obtained from Fisher Scientific. γ32P-ATP was obtained from NEN Products, a division of PerkinElmer Life Sciences. Protease inhibitor cocktail and LysC were obtained from Roche. MnCl2 (10×) and 10× PMP buffer [500 mM HEPES, 100 mM NaCl, 20 mM DTT, 0.1% Brij 35, pH 7.5] were obtained from NEB. The Ron ESE RNA [AGGCGGAGGAAGC] was purchased from Integrated DNA Technologies, Hybond ECL nitrocellulose blotting membrane from Amersham, the Bio-Dot microfiltration apparatus from Bio-Rad, the KinaseMax™ Kit from Ambion and Zip-Tips C4 from Millipore.
Phosphorylation reactions: manual mixing
Substrate phosphorylation by SRPK1 and CLK1 was carried out in the presence of 100 mM MOPS (pH 7.4), 10 mM free Mg2+ and 5 mg/ml BSA, at 23°C according to previously published procedures unless otherwise stated. For single turnover experiments, reactions were carried out with 1 μM enzyme, 0.2 μM SRSF1 and 100 μM 32P-ATP (4000–8000 cpm·pmol−1). For phosphatase experiments, 0.2 μM SRSF1 was pre-phosphorylated with 1 μM enzyme and 100 μM 32P-ATP (4000–8000 cpm·pmol−1) in the presence of 50 mM HEPES (pH 7.5), 10 mM NaCl, 2 mM DTT, 0.01% Brij 35 and 10 mM free Mg2+ for 1 h, and the dephosphorylation reaction was then initiated with 1 μM PP1γ and 10 mM free Mn2+. All reactions were carried out in a total volume of 10 μl and quenched with 10 μl SDS/PAGE loading buffer. Phosphorylated SRSF1 was separated from unreacted 32P-ATP on a 10% SDS/PAGE or a 10% Phos-tag SDS/PAGE by running for 1 h at 170 V (SDS/PAGE) or 2 h at 100 V (Phos-tag SDS/PAGE). Protein bands were cut from the dried gel and quantified on the 32P channel in liquid scintillant. The total amount of phosphoproduct was then determined from the specific activity (cpm·pmol−1) of the reaction mixture.
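The conversion from band counts to phosphoryl content can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the function name, the example counts, and the assumption that the entire 10-μl reaction is loaded and recovered in the excised band are all hypothetical.

```python
def phosphates_per_substrate(band_cpm, specific_activity_cpm_per_pmol,
                             substrate_uM, volume_ul):
    """Average number of phosphates incorporated per substrate molecule.

    Assumes (hypothetically) that the whole reaction is loaded on the gel
    and that the excised band contains all of the substrate.
    """
    if specific_activity_cpm_per_pmol <= 0:
        raise ValueError("specific activity must be positive")
    if substrate_uM <= 0 or volume_ul <= 0:
        raise ValueError("substrate concentration and volume must be positive")
    # cpm divided by cpm/pmol gives pmol of phosphate in the band
    pmol_phosphate = band_cpm / specific_activity_cpm_per_pmol
    # 1 uM x 1 ul = 1 pmol, so 0.2 uM in 10 ul is 2 pmol of SRSF1
    pmol_substrate = substrate_uM * volume_ul
    return pmol_phosphate / pmol_substrate


# Illustrative numbers only: a band of 180000 cpm at 5000 cpm/pmol
print(phosphates_per_substrate(180_000, 5000, 0.2, 10))
```

With these illustrative inputs the band holds 36 pmol of phosphate on 2 pmol of SRSF1, which corresponds to 18 phosphates per protein.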
RNA binding experiments
The Ron ESE (AGGCGGAGGAAGC) was labelled with 32P-ATP using the KinaseMax 5′ labelling kit. Labelled RNA (10 nM) and SRSF1 proteins (various concentrations) were then incubated for 30 min at room temperature in 100 mM MOPS (pH 7.2), 10 mM free Mg2+, 5 mg/ml BSA, 75 mM NaCl, 10% glycerol and 0.2 U/μl RNase inhibitor in a total volume of 25 μl. After incubation, samples were bound to a 0.45-μm nitrocellulose blotting membrane using a Bio-Rad Bio-Dot Apparatus and washed three times with 400 μl of wash buffer (20 mM Tris, pH 7.5, 100 mM NaCl). Filter spots for each sample were cut from the membrane and counted.
Mass spectrometric and LysC proteolysis experiments
MALDI-TOF analyses were carried out using a Voyager DE-STR spectrometer. cSR(3SAP) (1 μM) was phosphorylated using SRPK1 or CLK1 (300 nM) and ATP (0.1 mM) in 50 mM MOPS (pH 7.4) and 10 mM free Mg2+ for 2 h in a total volume of 100 μl at room temperature. Reactions were then quenched with acetic acid (5%), desalted with Zip-Tip C4 and eluted with 80% acetonitrile and 2% acetic acid. Unphosphorylated sample controls were prepared in the same manner without ATP. The matrix solution consisted of sinapinic acid in 70% acetonitrile and 0.1% TFA. Final matrix solution pH was 2.0. For the proteolysis experiments, cSR(3SAP) was phosphorylated with either SRPK1 or CLK1 and 32P-ATP (100 μM) and then treated with the protease LysC (100 ng) for 2 h at 37°C. The N- and C-terminal fragments were excised and counted.
Live cell imaging using confocal microscopy
For live cell imaging, HeLa cells were seeded and transfected in a 35-mm glass bottom dish. Forty-eight hours after transfection, live imaging was performed on cells expressing GFP-tagged SRSF1 and SR(3SAP) with and without myc-tagged CLK1. Confocal (and phase-contrast) images were acquired using an Olympus FV1000 with a 488-laser line. Images were linearly analysed and pseudo-coloured using ImageJ analysis software.
Identification of alternatively spliced genes using RASL-seq
Three 60-mm plates per construct were transfected with 2 μg of either GFP-SRSF1 or GFP-SR(3SAP). Total RNA was isolated using a Biomiga RNA isolation kit and RASL-seq libraries were prepared as previously described. The data were processed using Microsoft Excel and statistical analysis was performed using a paired t-test (P<0.05, fold changes of isoform ratios between groups >1.5). To validate several spliced gene forms, reverse transcription was performed on the RNA used for the RASL-seq samples using a Qiagen one-step RT-PCR kit in a 25-μl reaction mixture. The cDNA was then resolved on a 2% agarose gel and imaged using a Bio-Rad gel doc system.
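For readers who wish to reproduce the statistical filter outside a spreadsheet, a paired t-test on triplicate isoform ratios can be sketched as below. This is an assumption-laden illustration, not the workflow used in the study: the function names and the example ratios are hypothetical, and the closed-form P value applies only to triplicates (two degrees of freedom).

```python
import math
from statistics import mean, stdev


def two_sided_p_df2(t):
    """Two-sided P value for Student's t with 2 degrees of freedom.

    For 2 degrees of freedom the CDF has the closed form
    F(t) = 1/2 + t / (2 * sqrt(2 + t^2)), so no statistics library is needed.
    """
    return 1.0 - abs(t) / math.sqrt(2.0 + t * t)


def paired_t_test_triplicate(ratios_a, ratios_b):
    """Paired t-test on three matched short/long isoform ratios."""
    if len(ratios_a) != 3 or len(ratios_b) != 3:
        raise ValueError("this sketch handles triplicates only")
    diffs = [a - b for a, b in zip(ratios_a, ratios_b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("identical differences: t statistic is undefined")
    t = mean(diffs) / (sd / math.sqrt(3))
    return t, two_sided_p_df2(t)


# Hypothetical short/long ratios for one splicing event
wild_type = [2.0, 2.2, 2.1]   # GFP-SRSF1 replicates
mutant = [1.0, 1.1, 1.2]      # GFP-SR(3SAP) replicates
t_stat, p_value = paired_t_test_triplicate(wild_type, mutant)
fold_change = mean(wild_type) / mean(mutant)
print(f"t = {t_stat:.2f}, P = {p_value:.4f}, fold change = {fold_change:.2f}")
print("passes filter:", p_value < 0.05 and fold_change > 1.5)
```

An event would be retained only when both conditions stated in the text hold, that is, P<0.05 and a fold change in isoform ratio greater than 1.5.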
CLK1 catalytic function is not affected by RS domain pre-phosphorylation
Since SR proteins are phosphorylated by cytoplasmic SRPKs prior to nuclear entry and CLK phosphorylation, we wished to address whether SRPK1 pre-phosphorylation alters the kinetic behaviour of the subsequent CLK1 reaction. To address this we studied both reactions using SRSF1 (Figure 1A). We showed that CLK1 phosphorylates SRSF1 at ~18 sites, inducing a gel shift from 35 to ~38 kDa on SDS/PAGE (Figure 1B, upper panel). To determine whether SRPK1 affects the formation rate of the maximally phosphorylated state, we pre-phosphorylated RS1 at just over 10 sites using low, catalytic amounts of SRPK1 before adding CLK1 (Figure 1B, middle panel). Based on prior mapping studies, all serines in RS1 and a few additional serines in the N-terminal segment of RS2 are modified under this total phosphoryl content. Incorporation of 32P into SRSF1 (normalized to the total substrate concentration) is plotted as a function of time and fitted to an exponential function (Figure 1B, bottom panel). Upon addition of CLK1, the hyper-phosphorylated state is obtained at the same rate with or without SRPK1 pre-phosphorylation (t1/2 ~15 min). A slightly higher level of phosphorylation is attained with SRPK1 pre-phosphorylation, which could be due to more efficient Arg–Ser phosphorylation by SRPK1 compared with CLK1. We showed in replicate experiments that the hyper-phosphorylated species is generated with similar rate constants (data not shown). These findings indicate that CLK1 can phosphorylate RS2 with the same efficiency regardless of SRPK1 pre-phosphorylation.
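The relationship between the reported half-time and the exponential progress curve can be made concrete with a short sketch. A single-exponential form is assumed here because the text states only that an exponential function was fitted; the function names and the start and end phosphoryl contents are illustrative, not fitted values from the study.

```python
import math


def rate_constant_from_half_time(t_half_min):
    """First-order rate constant (per minute) from a half-time in minutes."""
    if t_half_min <= 0:
        raise ValueError("half-time must be positive")
    return math.log(2) / t_half_min


def phosphates_at(t_min, p_start, p_end, t_half_min):
    """Single-exponential approach from p_start to p_end phosphates.

    P(t) = p_start + (p_end - p_start) * (1 - exp(-k * t))
    """
    k = rate_constant_from_half_time(t_half_min)
    return p_start + (p_end - p_start) * (1.0 - math.exp(-k * t_min))


k = rate_constant_from_half_time(15.0)
print(f"k = {k:.4f} per min")

# Illustrative end points: CLK1 alone (0 to 18 sites) versus CLK1 added to
# SRSF1 carrying about 10 phosphates from SRPK1 (10 to 18 sites).
for t in (0, 15, 30, 60):
    alone = phosphates_at(t, 0.0, 18.0, 15.0)
    primed = phosphates_at(t, 10.0, 18.0, 15.0)
    print(f"t = {t:2d} min  CLK1 alone: {alone:5.2f}  pre-phosphorylated: {primed:5.2f}")
```

A half-time of ~15 min corresponds to a rate constant of about 0.046 min−1, and both curves reach the midpoint of their respective amplitudes at 15 min, which is what identical rates with different starting points would look like.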
Figure 2. Phospho-mapping of SRPK1- and CLK1-phosphorylated SR(3SAP)
Ser–Pro dipeptide phosphorylation induces conformational changes in SRSF1
To further investigate SRSF1 phosphorylation, we used a dinuclear phosphate-tagging complex (Phos-tag) to analyse the samples in Figure 1B. Phos-tag is expected to retard phospho-protein migration on SDS/PAGE in direct proportion to the number of phosphates added. Indeed, we observed a large increase in observed molecular weight for SRSF1 from 35 to 70 kDa upon SRPK1 treatment (Figure 1C, upper gel). However, CLK1 addition to this SRPK1 pre-phosphorylated protein led to unexpected results. As the 70-kDa band corresponding to SRPK1-phosphorylated SRSF1 disappears, a dispersed collection of species ranging from 50 to 90 kDa develops. Thus, CLK1 phosphorylation does not induce a discrete increase in observed molecular weight. This phenomenon also occurs in the absence of SRPK1 pre-phosphorylation (Figure 1C, lower gel). Whether SRSF1 is phosphorylated solely by CLK1 or by sequential kinase reactions, SRSF1 displays similar band dispersion between 50 and 90 kDa. In all cases, these new species migrate above the hypo- and hyper-phosphorylated forms observed without the Phos-tag reagent. These findings suggest that CLK1-phosphorylated SRSF1 adopts multiple conformational states that can be tracked chemically.
To determine whether the additional CLK1-induced forms result from Ser–Pro phosphorylation, we studied single alanine mutants at all three serines flanking the prolines in RS2. Upon CLK1 treatment, all three mutants [SR(227), SR(237), SR(238)] displayed multiple forms on Phos-tag SDS/PAGE between 50 and 70 kDa (Figure 1D). Since no single Ser–Pro dipeptide is responsible for the dispersion, we focused our attention on a triple mutant that substitutes all serines flanking prolines in RS2 with alanines [SR(3SAP)]. Unlike the wild-type substrate, SR(3SAP) did not undergo a gel shift on SDS/PAGE (Figure 1E, upper gel). Most importantly, CLK1 phosphorylation of SR(3SAP) did not induce dispersion upon chemical tagging with Phos-tag (Figure 1E, lower gel). Also, CLK1 addition to SRPK1-phosphorylated SR(3SAP) did not lead to dispersion although a slight increase in mobility was observed with time (Figure 1F). Overall, these findings indicate that CLK1 phosphorylation of Ser–Pro dipeptides induces additional conformational states of SRSF1.
Ser–Pro and Arg–Ser phosphorylation occur independently in RS2
To determine whether increased conformational heterogeneity in SRSF1 is due only to Ser–Pro phosphorylation, we mapped phosphorylation sites in SR(3SAP) using an engineered footprinting strategy and asked whether CLK1 still modifies Arg–Ser dipeptides in RS2. In this method, a unique Arg-to-Lys mutation in the centre of the RS domain, near the RS1/2 boundary, is introduced (Figure 2A). Furthermore, five Lys-to-Arg mutations in RRM2 are included so that upon treatment with the protease LysC, N- and C-terminal fragments including mostly RS1 and RS2 can be distinguished on SDS/PAGE. We then made three alanine mutations in the Ser–Pro cluster in RS2 to generate the cleavable substrate cSR(3SAP). To evaluate whether mutation of the Ser–Pro dipeptides alters the RS domain, we first monitored the phosphoryl content of cSR(3SAP) as a function of either SRPK1 or CLK1 phosphorylation using MALDI-TOF. We found that both kinases added 14 phosphates on to cSR(3SAP) after a 2-h incubation period, values that match the total number of available serines flanking arginines (or lysine) in the RS domain (Figures 2B and 2C).
To map phosphorylation sites we completely phosphorylated cSR(3SAP) with SRPK1, treated it with LysC and resolved all fragments on SDS/PAGE (Figure 2D). We observed a fragmentation pattern similar to that observed previously for an SRSF1 cleavage substrate lacking the triple serine mutation. Three major peptides corresponding to the N- and C-terminal fragments (19 and 5 kDa) and an intermediate band (24 kDa) resulting from incomplete proteolysis in the RS domain were obtained. By excising and counting the bands we obtained an N/C ratio (ratio of generation of N- and C-terminal fragments) of ~2 (N/C=1.8), which is consistent with nine phosphates in the N-terminal fragment and five phosphates in the C-terminal fragment. To confirm their identities, we performed Ni-resin pull-down experiments to show that the C-terminal fragment contains the His-tag as expected and the N-terminal fragment does not (Figure 2E). We repeated the LysC cleavage experiment using CLK1-phosphorylated cSR(3SAP) to show that both the N and C bands are generated at a similar ratio (N/C=2.2) (Figure 2F). Since CLK1 phosphorylates cSR(3SAP) to the same extent as SRPK1 (Figures 2B and 2C), we conclude that the N- and C-terminal halves of the RS domain in the mutant are equally phosphorylated by both kinases. For these assays, SR proteins are incubated with kinase for 2 h, a time sufficient to modify not only the fast SRPK1 sites in RS1 but also the slower SRPK1 sites in RS2. Under these conditions, all Arg–Ser dipeptides in the RS domain are modified by SRPK1. Overall, the LysC proteolysis studies indicate that CLK1 is able to phosphorylate Arg–Ser dipeptides in RS2 even when the serines flanking the prolines are removed.
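The arithmetic linking the N/C ratios to the phosphate distribution can be checked with a brief sketch. It assumes, as a simplification not stated in the text, that the 32P counts in each excised band are proportional to the number of phosphates on that fragment and that both fragments are recovered equally; the function name is hypothetical.

```python
def split_phosphates(total_phosphates, n_to_c_ratio):
    """Partition a total phosphoryl content between the N- and C-terminal
    fragments given the measured N/C ratio of 32P counts."""
    if n_to_c_ratio <= 0:
        raise ValueError("N/C ratio must be positive")
    c_fragment = total_phosphates / (1.0 + n_to_c_ratio)
    n_fragment = total_phosphates - c_fragment
    return n_fragment, c_fragment


# 14 phosphates in total (MALDI-TOF) and the two reported N/C ratios
for kinase, ratio in (("SRPK1", 1.8), ("CLK1", 2.2)):
    n_frag, c_frag = split_phosphates(14, ratio)
    print(f"{kinase}: N/C = {ratio} -> N = {n_frag:.1f}, C = {c_frag:.1f}")
```

An N/C ratio of 1.8 reproduces the 9:5 split quoted above, and a ratio of 2.2 corresponds to roughly 9.6 and 4.4 phosphates, a difference of less than one site per fragment.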
RS domain flexibility correlates with dephosphorylation efficiency
Having demonstrated that Ser–Pro phosphorylation is responsible for conformational heterogeneity (Figure 1), we wished to determine whether these structural changes affect RS domain function. Since SR protein dephosphorylation controls spliceosome maturation [12,33], factors that regulate phosphatase activity towards the RS domain are relevant for the splicing mechanism. We treated SRSF1 and SR(3SAP) with catalytic amounts of CLK1 to maximally phosphorylate their respective RS domains and then monitored PP1-dependent dephosphorylation. We found that CLK1-phosphorylated SRSF1 was more rapidly dephosphorylated than CLK1-phosphorylated SR(3SAP) (Figure 3A), suggesting that the RS domain is more accessible to PP1 upon Ser–Pro phosphorylation. To confirm that the observed differences between SRSF1 and SR(3SAP) are the result of Ser–Pro phosphorylation, we performed the same experiments using SRPK1-phosphorylated SR proteins and found that PP1 dephosphorylated SRSF1 and SR(3SAP) with similar efficiencies (Figure 3B). Overall, these findings suggest that Ser–Pro dipeptide phosphorylation induces dynamic structural changes in SRSF1 that enhance the ability of PP1 to dephosphorylate the RS domain.
Figure 3. Ser–Pro phosphorylation by CLK1 alters PP1 activity towards SRSF1
Phosphorylation promotes co-operative SRSF1 binding to RNA
We next wished to address whether CLK1-induced increases in conformational flexibility modify RNA interactions. To address this we studied the phosphorylation-dependent interaction of SRSF1 and SR(3SAP) with the Ron ESE using a filter-binding assay. SRSF1 (either unphosphorylated or phosphorylated with catalytic amounts of SRPK1 or CLK1) was incubated with the ESE and the amount bound determined in single filter spot analyses. We found that although alanine substitution of serines in the Ser–Pro dipeptides led to a slight improvement in half-maximal saturation (K0.5), the SRPK1-phosphorylated SR proteins bound the ESE identically (Figures 4A and 4B and Table 1). In the absence or presence of SRPK1 phosphorylation, SRSF1 and SR(3SAP) displayed similar co-operativity (N=1.1–1.3) (Figure 4D), which could be due to protein–protein interactions that facilitate ESE binding. Since the ESE is short (13mer), we do not anticipate that multiple SR proteins bind a single RNA molecule. Surprisingly, CLK1 phosphorylation led to very large increases in co-operative SRSF1 binding (Figure 4C and Table 1). This increase in binding co-operativity for SRSF1 upon CLK1 phosphorylation is similar to that described in a previous report. We now show that this mechanism shift is due to Ser–Pro phosphorylation since CLK1 treatment of SR(3SAP) leads to reduced levels of co-operative RNA association, similar to that for SRPK1-phosphorylated SRSF1 (Figures 4C and 4D and Table 1). These findings indicate that although CLK1 phosphorylates numerous serines in the RS domain, discrete phosphorylation of Ser–Pro dipeptides enhances co-operative association with RNA. This effect suggests that local changes in RS domain conformation induced by CLK1 impact the accessibility of the neighbouring RRMs.
Table 1. Effects of phosphorylation on SRSF1 and SR(3SAP) binding to the Ron ESE
| N | Kd (nM) | K0.5 (nM) | N | Kd (nM) | K0.5 (nM) |
N, cooperativity constant for the SR protein.
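To illustrate how the co-operativity constant N and the half-saturation value K0.5 shape a binding curve, the sketch below uses the standard Hill equation. The text does not state which equation was fitted, so the Hill form is an assumption here, as are the function names and the K0.5 of 50 nM; N=3 is a hypothetical stand-in for a strongly co-operative case, whereas N=1.2 lies within the 1.1–1.3 range reported above.

```python
def hill_fraction_bound(protein_nM, k_half_nM, n):
    """Fraction of RNA bound according to the Hill equation:
    theta = [P]^N / (K0.5^N + [P]^N)."""
    if protein_nM < 0 or k_half_nM <= 0 or n <= 0:
        raise ValueError("concentrations and N must be positive")
    x = (protein_nM / k_half_nM) ** n
    return x / (1.0 + x)


def span_10_to_90(n):
    """Fold increase in protein needed to go from 10% to 90% bound.
    For the Hill equation this equals 81^(1/N)."""
    return 81.0 ** (1.0 / n)


K_HALF = 50.0  # nM, illustrative value only
for n in (1.2, 3.0):
    curve = [hill_fraction_bound(c, K_HALF, n) for c in (10, 25, 50, 100, 250)]
    print(f"N = {n}: " + ", ".join(f"{theta:.2f}" for theta in curve)
          + f"  (10-90% span: {span_10_to_90(n):.1f}-fold)")
```

Whatever the value of N, half of the RNA is bound when the protein concentration equals K0.5; a larger N only steepens the transition, so the switch from mostly free to mostly bound RNA occurs over a narrower concentration range.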
CLK1-dependent nuclear mobilization of SRSF1 is mediated by Ser–Pro dipeptides
We showed previously that CLK1 induces a change from speckles to a diffuse localization of SRSF1 in the nucleus only in the presence of the RS2 segment. This change could be due either to increases in bulk phosphorylation (both Arg–Ser and Ser–Pro phosphorylation) or to discrete Ser–Pro phosphorylation. To address this, we initially expressed a GFP-tagged form of SRSF1 (GFP-SRSF1) in HeLa cells and showed, as expected, that the SR protein localizes largely to nuclear speckles (Figure 5). In comparison, CLK1 expression leads to speckle dissolution and diffuse GFP-SRSF1 localization in the nucleus. These results, obtained with live HeLa cells, are similar to previous findings using fixed cells. To address whether this phenomenon is due to the presence of Ser–Pro dipeptides, we studied the localization of GFP-SR(3SAP), which replaces serines flanking prolines in the RS2 segment with alanine. We found that although GFP-SR(3SAP) localizes in speckles, CLK1 expression does not cause dissolution of these speckles (Figure 5). These findings indicate that CLK1-induced mobilization of SRSF1 in the nucleus is mediated directly by the presence of Ser–Pro dipeptides in the RS2 segment of the RS domain.
Figure 5. Effects of Ser–Pro dipeptides on SRSF1 sub-nuclear localization
Ser–Pro dipeptides in SRSF1 regulate the alternative splicing of numerous genes
Ser–Pro dipeptides are present in all SR proteins but their role in splicing is not appreciated. To understand whether the serines flanking prolines in SRSF1 can induce site-specific exon inclusion or exclusion on a genome-wide basis, we conducted an oligonucleotide-mediated RNA annealing, selection and ligation assay with high throughput sequencing (RASL-seq) to target 5531 annotated genes that are conserved between humans and mice and undergo alternative splicing [31,32]. In this method, over 14000 probes directed towards known alternative splice junctions were annealed with the goal of identifying the ratio of short and long mRNA forms. Using HeLa cells, we overexpressed GFP-tagged SRSF1 and SR(3SAP) separately in triplicate, isolated mRNA and conducted RASL-seq. As a control, we first performed Western blots of HeLa cell lysates to show that GFP-SRSF1 displays broader dispersion on Phos-tag SDS/PAGE compared with GFP-SR(3SAP), consistent with Ser–Pro phosphorylation in the wild-type SR protein (Figure 6A). We observed less dispersion in the HeLa lysates compared with our in vitro assays (Figure 1), suggesting that the larger GFP tag might alter mobility. Using RASL-seq, we detected ~1000 events that expressed both mRNA isoforms (short and long) in HeLa cells. We computed the ratio of short/long isoforms for both SRSF1 and SR(3SAP). We then calculated the ratios of short/long isoforms in SRSF1-overexpressing cells relative to SR(3SAP)-overexpressing cells, using a cutoff ≥1.5 to pool a set of genes where exon exclusion was predominant, giving enrichment in short isoforms (Figure 6B), and a cutoff ≤0.66 to pool a set of genes where exon inclusion was predominant, giving enrichment in long isoforms (Figure 6C).
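The pooling step can be summarized in a short sketch. The cutoffs of 1.5 and 0.66 are those given in the text (0.66 being approximately the reciprocal of 1.5); the function name, the labels and the read counts are hypothetical and serve only to show how a ratio of ratios assigns an event to a pool.

```python
def classify_event(short_wt, long_wt, short_mut, long_mut,
                   exclusion_cutoff=1.5, inclusion_cutoff=0.66):
    """Assign a splicing event to a pool from isoform read counts.

    The short/long ratio in SRSF1-expressing cells is divided by the
    short/long ratio in SR(3SAP)-expressing cells.
    """
    if min(short_wt, long_wt, short_mut, long_mut) <= 0:
        raise ValueError("both isoforms must be detected in both samples")
    relative_ratio = (short_wt / long_wt) / (short_mut / long_mut)
    if relative_ratio >= exclusion_cutoff:
        return "exclusion"   # short isoform enriched with wild-type SRSF1
    if relative_ratio <= inclusion_cutoff:
        return "inclusion"   # long isoform enriched with wild-type SRSF1
    return "unchanged"


# Hypothetical read counts: (short_wt, long_wt, short_mut, long_mut)
events = {
    "event_A": (300, 100, 100, 100),
    "event_B": (100, 300, 100, 100),
    "event_C": (110, 100, 100, 100),
}
for name, counts in events.items():
    print(name, classify_event(*counts))
```

Requiring both isoforms to be detected mirrors the restriction to the ~1000 events that expressed short and long forms; the P-value filter described in the Methods would be applied in addition to this fold-change criterion.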
Figure 6. Ser–Pro dipeptides regulate alternative gene splicing
The analysis of biological triplicates resulted in 37 genes where the short/long ratio was >1.5 and P value <0.05 and 84 genes where the short/long ratio was ≤0.66 and P value <0.05 (Figure 6D and Supplementary Figure S2). CLK1 is known to induce its own splicing by generating a truncated mRNA. Interestingly, we detected this alternative splice variant in the set of 37 genes. Furthermore, Tra2α, an SR-like protein that facilitates exon inclusion, was enriched in the pool of 84 genes. The observation of these genes among our data set supports the validity of our experimental approach. Our data suggest that CLK1 and Tra2α exon inclusion are mediated by SRSF1 and regulated by CLK1 phosphorylation of proline-directed serines in SRSF1.
To further confirm the validity of the RASL-seq assay, we conducted RT-PCR on a panel of genes from both exon inclusion/exclusion sets. CLK1, CCDC50 (coiled coil domain containing protein 50), which associates with the cytoskeleton and mitotic apparatus, and CCAR1 (cell division cycle and apoptosis regulator protein 1) were tested for enrichment of short isoforms. We found an enrichment of short (green asterisks) over long isoforms in cells expressing SRSF1 compared with SR(3SAP) (Figure 6E). We also tested Tra2α, GOLPH3L (Golgi phosphoprotein 3 like) and CYTSB (cytospin B), also known as SPECC1 (sperm antigen with calponin homology and coiled-coil domains 1), for enrichment of long isoforms. We found an enrichment of the long (red asterisks) over short isoforms in cells expressing SRSF1 compared with the mutant SR(3SAP) (Figure 6F). To better understand the biological process spectrum affected by these two sets of genes, we employed the Panther bioinformatics tool. Using pie charts, we display the CLK1-mediated effects of Ser–Pro phosphorylation in SRSF1 as a function of gene categories (Figures 6G and 6H). In particular, we observed that the effects extend to a very broad array of gene classes rather than concentrate on a limited range of biological processes. These findings show that Ser–Pro phosphorylation in SRSF1 by CLK1 is involved in alternative splicing and, thus, identify it as a new layer of the splicing regulatory process.
RS domains in SR proteins can vary in length from 50 to over 300 residues and contain numerous Arg–Ser dipeptides in different configurations (Supplementary Figure S1). Although much attention has been devoted to Arg–Ser phosphorylation, the effects of Ser–Pro phosphorylation have not been characterized even though all RS domains contain two to six of these dipeptides. Prolines can play a specialized structural role since they can populate two discrete conformational states defined by cis/trans rotations about the peptide bond. These conformational states are thought to be generally important for protein folding but are now becoming relevant in signal transduction pathways. Phosphorylation of Ser/Thr–Pro dipeptides can alter conformation and present a recognition signal for ubiquitin-dependent degradation of proteins involved in tumour growth and suppression. Furthermore, Pin1, a prolyl isomerase that specifically isomerizes phosphorylated Ser/Thr–Pro dipeptides, offers a tunable mechanism for either promoting or blocking a protein for degradation. Overall, since Ser–Pro dipeptides are conserved in RS domains (Supplementary Figure S1), it is very possible that protein kinases directed towards these serines could profoundly impact both the conformation and biological activity of the SR proteins. In the present study, we built upon these observations and asked whether specific Ser–Pro phosphorylation catalysed by nuclear CLK1 could alter the conformation and splicing activity of an SR protein.
Altering the structural landscape of an SR protein
The SR protein SRSF1 has three Ser–Pro dipeptides in the RS2 segment of the RS domain whose phosphorylation state is uniquely controlled by the CLK family of kinases (Figure 1A). Although we showed previously that Ser–Pro phosphorylation of SRSF1 leads to a small migration shift on SDS/PAGE, the nature of the structural change responsible for this effect has not been defined. We now present three independent lines of evidence indicating that Ser–Pro phosphorylation by CLK1 expands the SRSF1 conformational ensemble. First, although SRPK1 modification leads to a discrete migratory shift on Phos-tag SDS/PAGE proportional to its phosphorylation state (~15 phosphates), the minimal, secondary phosphorylation of only three Ser–Pro dipeptides by CLK1 causes broad molecular weight dispersion. These findings suggest that the phosphate-chelating agent is sensitive to numerous RS domain conformations induced by CLK1. Interestingly, this phenomenon is not observed with SRPK1, suggesting that this kinase does not induce a similar degree of RS domain heterogeneity. Secondly, the rate of RS domain dephosphorylation by PP1 is enhanced by CLK1, suggesting increased RS domain flexibility and phosphatase access upon Ser–Pro dipeptide modification. Thirdly, Ser–Pro phosphorylation alters the mechanism whereby SRSF1 interacts with an ESE, suggesting indirect effects of the RS domain on RRM function. Overall, the combined data show that phosphorylation of CLK1-specific sites, as opposed to only SRPK1-specific sites, induces flexibility and growth in the SRSF1 conformational ensemble.
Linking CLK1-specific phosphorylation to RRM function
RS domain phosphorylation plays a critical role in regulating SR protein function at numerous levels. For example, phosphorylation promotes SR protein transport from the cytoplasm to nuclear speckles [13,39]. Conversely, dephosphorylation facilitates the export of intron-less mRNA, establishing a phosphorylation–dephosphorylation cycle important for mRNA delivery to the cytoplasm. In the spliceosome, RS domain phosphorylation has been shown to promote interactions of SRSF1 with the 70K protein subunit of the U1 snRNP, a critical initiator of 5′ splice-site recognition. Phosphorylation also facilitates attachment of the U2 snRNP and U4/U6·U5 tri-snRNP in the spliceosomal E complex. These functions are dependent on RRM recognition of ESEs in the initial mRNA transcript. Our observations that Ser–Pro, as opposed to Arg–Ser, phosphorylation controls the co-operativity of SRSF1 binding to an ESE suggest that CLK1 can serve as a switch to regulate concentration-dependent RNA association. Other studies indicate that SR proteins compete for the concentration-dependent inclusion of certain exons, a process that affects alternative splicing [41,42]. Therefore, post-translational modifications that impact the SR protein–RNA bound state are expected to modify splice-site recruitment. Our findings draw a connection between specific Ser–Pro phosphorylation in an RS domain and the associative function of neighbouring RRMs in an SR protein.
Mobilizing SRSF1 in the nucleus
SR proteins are stored in speckles, membrane-free nuclear structures that house splicing factors, and released to nearby sites for co-transcriptional splicing. Phosphatase inhibitors promote, whereas kinase inhibitors block, this activity, suggesting that increases in phosphorylation are critical for SR protein mobilization and splicing. The RS1 segment is required for SRSF1 transport to nuclear speckles and SRPKs are likely to phosphorylate this region in the cytoplasm. Given their strict nuclear localization, CLKs may be the catalysts that induce SR protein mobilization from speckles to splicing sites. Indeed, CLK expression causes an increase in the nucleoplasmic distribution of SR proteins coupled with the loss of speckles, a process reversed by a CLK-specific inhibitor [15,44,45]. Although we showed that this process requires RS2 in SRSF1, that result does not establish whether the CLK-induced dispersion of speckles is due to Arg–Ser or Ser–Pro dipeptides since both are present in this region of the RS domain (Figure 1A). Also, speckle diffusion could be an effect of bulk increases in phosphorylation of all dipeptide types since both CLKs and SRPKs are present in the nucleus. SRPKs are known to associate with U1 snRNP and the U4/U6·U5 tri-snRNP in the spliceosome, whereas CLKs have been observed in nuclear speckles and at sites of active gene transcription. Our immunofluorescence data support a model in which CLK1-induced phosphorylation at Ser–Pro sites, rather than bulk increases in Arg–Ser phosphorylation, elicits SR protein nuclear mobility. This mobilization could reflect a necessary movement of SR proteins from speckles to sites of active splicing. Overall, these findings indicate that the specialized effects of nuclear CLK1 on SRSF1 are not due to regiospecific phosphorylation (RS1 versus RS2) but rather are the result of dipeptide-specific phosphorylation.
Proline-directed serine phosphorylation alters SRSF1-induced splicing patterns
Since the early observations that PP1 and its inhibitors can block various stages of spliceosomal development [12,33], it has been assumed that SR protein phosphorylation state could regulate alternative gene splicing. Indeed, CLK expression has been shown to affect the splicing of some individual genes including PKC (protein kinase C) β, Tau and Tra2β1 [48–50]. Furthermore, CLK1 has been shown to regulate itself through alternative splicing that promotes a truncated, inactive form. In the present study, we wished to identify whether CLK-specific Ser–Pro sites in an SR protein are responsible for these splicing changes and then determine the scope of these modifications on a global level. Using RASL-seq methods, we found that the splicing of over 100 genes is altered by the phosphorylation of the Ser–Pro dipeptides in SRSF1. In addition to laying a foundation for the special role of proline-directed phosphorylation, these studies may offer useful connections to human disease. For example, two of the genes identified in our assay have been linked to non-syndromic deafness (CCDC50) and juvenile myelomonocytic leukaemia (CYSTB) [51,52]. In the future, it will be important to express other members of the SR protein family, map their SRPK/CLK phosphorylation sites and ask whether specific Ser–Pro dipeptides play a specialized role similar to that for SRSF1.
We show, for the first time, that a specific type of RS domain phosphorylation directed at Ser–Pro dipeptides alters the conformation, nuclear mobility and splicing function of an SR protein. We found that proline-directed phosphorylation by nuclear CLK1 broadens the conformational ensemble of SRSF1 and directs the SR protein away from its storage site in speckles. Increases in nucleoplasmic distribution are accompanied by enhancements in the co-operative binding of SRSF1 to mRNA and dephosphorylation rate of the RS domain, essential steps for both early and late stages of spliceosomal development. These modifications to SRSF1 function correlate directly with changes in the alternative splicing of over 100 human genes. These studies, thus, define a direct link between a specific type of chemical modification catalysed by the CLK family of kinases and SR protein-directed splicing changes on a broader genomic level.
Malik Keshwani, Brandon Aubol, Chen-Ting Ma and Jinsong Qui performed all the experiments in the study. Xiang-Dong Fu, Patricia Jennings and Laurent Fattet participated in discussions and helped with data analysis. Joseph Adams and Malik Keshwani planned the experiments and wrote the paper.
We thank Dr Gourisankar Ghosh for critical reading of the manuscript and helpful suggestions. We also thank Jennifer Stowe for help in analysing RASL-seq data and Dr Kun-Liang Guan for the Phos-tag reagent.
This work was supported by the NIH [grant numbers GM67969, GM67969S1 and GM98528 (to J.A.A.)]; NIH [grant number GM52872 (to X.-D.F.)]; and NIH [grant number GM101467 (to P.A.J.)].