An ERK5–KLF2 signalling module regulates early embryonic gene expression and telomere rejuvenation in stem cells

The ERK5 MAP kinase signalling pathway drives transcription of naïve pluripotency genes in mouse Embryonic Stem Cells (mESCs). However, how ERK5 impacts on other aspects of mESC biology has not been investigated. Here, we employ quantitative proteomic profiling to identify proteins whose expression is regulated by the ERK5 pathway in mESCs. This reveals a function for ERK5 signalling in regulating dynamically expressed early embryonic 2-cell stage (2C) genes including the mESC rejuvenation factor ZSCAN4. ERK5 signalling and ZSCAN4 induction in mESCs increases telomere length, a key rejuvenative process required for prolonged culture. Mechanistically, ERK5 promotes ZSCAN4 and 2C gene expression via transcription of the KLF2 pluripotency transcription factor. Surprisingly, ERK5 also directly phosphorylates KLF2 to drive ubiquitin-dependent degradation, encoding negative feedback regulation of 2C gene expression. In summary, our data identify a regulatory module whereby ERK5 kinase and transcriptional activities bi-directionally control KLF2 levels to pattern 2C gene transcription and a key mESC rejuvenation process.


Introduction
Embryonic Stem Cells (ESCs) can self-renew or differentiate along any lineage in the adult body, a property known as pluripotency [1]. The fundamental regulatory mechanisms which govern pluripotency are therefore of intense interest to exploit pluripotent cells in regenerative therapeutics [2]. Cellular signalling network activity plays a critical role in ESC decision-making, by implementing specific gene expression signatures to define developmental choice. Although many signalling networks relevant for ESC decision-making have been identified, it remains a challenge to understand the molecular and biological functions of critical ESC signalling pathways in regulating pluripotency and lineage-specific transcriptional networks [3].
Recently, ERK5 (also known as BMK1) [4,5] was identified as a key regulator of the transition between naïve and primed pluripotency in mouse ESCs (mESCs) [6]. ERK5 signalling promotes the naïve state by driving the expression of pluripotency genes [6], although the wider molecular targets and biological functions of the ERK5 pathway in mESCs have not been identified. In this regard, ERK5 uniquely encodes a kinase domain and a putative transcriptional activation domain [7], both of which are required to support pluripotency gene expression [6]. Therefore, current data suggest that both the catalytic and transcriptional functions of ERK5 are likely to play a role in mediating ERK5 function in mESCs (Figure 1A). However, ERK5 substrates and wider transcriptional networks have not yet been explored in this context.
Here, we use state-of-the-art quantitative proteomics to identify the ERK5-responsive proteome in mESCs. Within a specific cohort of ERK5-dependent proteins is ZSCAN4, a key member of a network of early embryonic 2-cell stage specific (2C) genes [8] that promotes the attainment of naïve pluripotency [9,10] and stem cell 'rejuvenation' in vitro [11]. We show that ERK5 induction of the pluripotency transcription factor KLF2 mediates the expression of Zscan4 and other 2C genes. Furthermore, ERK5 signalling and ZSCAN4 induction promote telomere elongation, a key process that contributes to mESC rejuvenation. Unexpectedly, we find that ERK5 also directly phosphorylates KLF2 at dual pSer/Thr-Pro motifs, which recruits a Cullin family E3 ubiquitin ligase to promote KLF2 ubiquitylation and proteasomal degradation. KLF2 phosphorylation thereby enables a negative feedback loop to suppress the expression of Zscan4 and other 2C genes. In summary, our data provide molecular insight into ERK5 kinase and transcriptional functions in mESCs, which bi-directionally modulate KLF2 levels to pattern early embryonic 2C gene transcription and a key mESC rejuvenation process.

Results
Quantitative proteomic profiling identifies ERK5 regulated proteins in mESCs
ERK5 signalling drives transcription of naïve pluripotency genes in mESCs (Figure 1A) [6]. However, the wider impact of the ERK5 pathway on gene expression and proteome dynamics in mESCs remains uncertain. To tackle this question, we set out to systematically identify proteins whose expression is regulated by ERK5 signalling. To this end, we developed a quantitative proteomics workflow employing complementary strategies to specifically activate and inhibit ERK5 signalling in mESCs (Figure 1B). As the ligands that activate ERK5 in mESCs are not yet known, we employed a constitutively active mutant of the specific upstream kinase MEK5 (MEK5DD) [12,13] to specifically activate ERK5, and the selective ERK5 inhibitors XMD8-92 [14] and AX15836 [15] to inhibit ERK5 kinase activity.
As proof-of-principle, we demonstrate highly sensitive manipulation of ERK5 activity using C-terminal autophosphorylation and resulting retarded electrophoretic mobility as a readout [13]. MEK5DD expression in mESCs activates ERK5, and this is reversed by treatment with the selective ERK5 inhibitors XMD8-92 and AX15836 (Figure 1C). Importantly, the related MAP kinases ERK1/2 and p38 are not significantly activated or inhibited by modulation of the ERK5 pathway (Figure 1C). Furthermore, ERK5 activation is accompanied by translocation to the nucleus (Figure 1D) and increased expression of the pluripotency transcription factor KLF2 (Figure 1E). Previous work identifies Klf2 as an ERK5 target gene [6,16-19], where Klf2 transcription is driven by the ERK5 transcriptional activation domain and ERK5-mediated phosphorylation of MEF2 transcription factors at the Klf2 promoter [16]. Thus, these data confirm that our experimental approach selectively modulates ERK5 kinase activity and gene expression in mESCs.
These defined conditions for ERK5 activation and inhibition allowed us to conduct quantitative proteomic profiling to elucidate the ERK5-regulated proteome in mESCs. This experimental approach identified a total of 8732 proteins, of which 7639 were quantified by at least two unique peptides. The abundance of 66 proteins changes by >0.5 (log2) upon ERK5 activation by MEK5DD (Figure 1F), indicating that ERK5 signalling selectively regulates proteome dynamics. Furthermore, ERK5 inhibition by AX15836 largely suppresses proteins that are induced upon ERK5 activation (Figure 1G), indicating a key role for ERK5 kinase activity. Interestingly, AX15836 has recently been shown to both inhibit ERK5 kinase activity and drive paradoxical activation of the transcriptional activation domain [20], which may explain the partial effect of AX15836 in reversing induction of some ERK5 induced proteins.
KLF2, a key transcriptional target of ERK5 signalling [6,16-19], is significantly induced in response to ERK5 pathway activation as expected (Figure 1F,G). Amongst proteins whose expression is significantly induced by ERK5 signalling are ZSCAN4D and ZSCAN4F (Figure 1F,G), which are expressed in the early embryonic two-cell stage (2C) and play a central role in mESC genome stability and rejuvenation [8-11,21].
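The log2 abundance threshold used above to call ERK5-responsive proteins can be illustrated with a minimal sketch. It assumes the >0.5 (log2) cut-off is applied to the absolute log2 ratio of treated over control abundance; the function names and the abundance values are hypothetical and are not taken from the dataset.

```python
import math


def log2_fold_change(treated, control):
    """Return log2(treated / control) for two positive abundance values."""
    return math.log2(treated / control)


def changed_proteins(abundances, threshold=0.5):
    """Select proteins whose absolute log2 fold change exceeds the threshold.

    abundances maps a protein name to a (control, treated) tuple.
    Applying the threshold to the absolute value is an assumption.
    """
    hits = {}
    for name, (control, treated) in abundances.items():
        lfc = log2_fold_change(treated, control)
        if abs(lfc) > threshold:
            hits[name] = lfc
    return hits


# Hypothetical abundances (arbitrary units), not values from the study.
example = {
    "KLF2": (100.0, 220.0),
    "ZSCAN4D": (50.0, 90.0),
    "UNCHANGED": (80.0, 85.0),
}
hits = changed_proteins(example)
```

In this toy input the first two entries pass the threshold and the third does not.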

The ERK5 pathway drives the expression of ZSCAN4 and other 2C-stage genes
As a functional connection between ERK5 signalling and ZSCAN4 has not been reported, we first set out to validate the role of ERK5 in regulating ZSCAN4 expression. Analysis of ZSCAN4 protein levels by immunoblotting confirms that ERK5 activation by MEK5DD induces ZSCAN4 expression, and this is reversed by treatment with AX15836 (Figure 2A). This mirrors the regulation of known ERK5 pathway target KLF2 (Figure 2A), confirming ZSCAN4 as a novel protein target of the ERK5 signalling pathway in mESCs.
We then sought to determine the mechanism by which ERK5 signalling drives increased ZSCAN4 protein levels. As ERK5 signalling plays a key role in transcriptional regulation of stem cell-specific genes, we tested whether ERK5 activation induces expression of the Zscan4 gene cluster. Activation of ERK5 by MEK5DD induces expression of the known target gene Klf2 ( Figure 2B). Similarly, ERK5 activation induces expression of Zscan4 mRNA ( Figure 2B), suggesting that ERK5 promotes transcription of the Zscan4 gene cluster. Interestingly, ERK5 activation also induces expression of further genes that are specifically expressed at the early embryonic 2C stage, Zfp352 and Tdpoz3 ( Figure 2B), suggesting that ERK5 may have a more general function in regulating early embryonic gene expression in mESCs.
Finally, we employed Erk5/Mapk7 −/− mESCs [22] to confirm the role of ERK5 in the regulation of Zscan4/ 2C genes. Erk5 −/− mESCs expressing MEK5DD express a basal level of Zscan4, Zfp352 and Tdpoz3 mRNAs ( Figure 2C). The levels of Zscan4 and Tdpoz3 mRNA are strongly increased by expression of wild-type ERK5, in comparison with a kinase inactive mutant (D200A) of ERK5 (ERK5 KD; Figure 2C). Interestingly, Zfp352 mRNA is increased by both wild-type ERK5 and ERK5 KD ( Figure 2C), suggesting possible kinase independent functions of ERK5 in Zfp352 gene regulation. Taken together, our data indicate that ERK5 signalling can regulate the expression of early embryonic genes in mESCs, including the stem cell rejuvenation factor ZSCAN4.

ERK5-dependent ZSCAN4 induction drives telomere elongation
ZSCAN4 promotes mESC rejuvenation at least in part by promoting telomere maintenance [11,21]. Therefore, we asked whether ERK5 signalling to ZSCAN4 impacts on telomere length in mESCs. To quantify average telomere length, we employed an assay that determines the ratio of telomeric repeats to non-telomeric DNA [23]. As proof of principle for this approach, we performed a comparison of telomere length in mESCs cultured in MEK1/2 and GSK3 inhibitors (2i) [24] and those cultured in LIF/FBS. mESCs cultured in 2i have been shown to have shortened telomeres compared with LIF/FBS conditions [25]. Indeed, the telomere:non-telomere ratio is significantly lower for 2i mESCs than LIF/FBS mESCs (Figure 2D), confirming that this assay accurately reports perturbations in telomere length.
We then used this assay to investigate the function of ERK5 signalling in regulating telomere length. ERK5 activation by constitutively active MEK5DD induces a small but statistically significant increase in telomeric: non-telomeric ratio, which is not observed following co-treatment of mESC with the selective ERK5 inhibitor AX15836 ( Figure 2E). Overexpression of ZSCAN4 in mESCs increases the telomeric:non-telomeric ratio ( Figure 2F), confirming the key function of ZSCAN4 in regulating telomere length in this context. These data indicate that ERK5 pathway activation drives an increase in telomere length, and this is associated with ZSCAN4 induction.
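The telomeric:non-telomeric (T/S) ratio used in this section can be sketched as follows. This is an illustration only: it assumes a comparative Ct calculation with perfect amplification efficiency (a doubling per cycle) for both primer pairs, which the text does not specify, and all Ct values are hypothetical.

```python
from statistics import mean


def ts_ratio(ct_telomere, ct_single_locus, efficiency=2.0):
    """Relative T/S ratio for one biological replicate.

    ct_telomere and ct_single_locus are lists of technical-replicate Ct
    values for the telomeric repeat (T) and single locus control (S)
    primer pairs. Equal efficiency for both pairs is an assumption.
    """
    delta_ct = mean(ct_telomere) - mean(ct_single_locus)
    return efficiency ** (-delta_ct)


def relative_ts(sample_ts, reference_ts):
    """Express a sample T/S ratio relative to a reference condition."""
    return sample_ts / reference_ts


# Hypothetical Ct values (three technical replicates each).
control = ts_ratio([15.0, 15.5, 14.5], [20.0, 20.0, 20.0])
treated = ts_ratio([14.5, 14.5, 14.5], [20.0, 20.0, 20.0])
fold = relative_ts(treated, control)
```

A lower telomere Ct at an unchanged single locus Ct gives a higher T/S ratio, which is read as a longer average telomere length.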

The ERK5-KLF2 transcriptional axis controls Zscan4/2C gene expression
A key question arising from our results concerns the mechanism by which ERK5 signalling drives ZSCAN4 expression to promote telomere maintenance. As shown previously, a major transcriptional target of ERK5 in mESCs is the KLF2 transcription factor ( Figure 1F). Therefore, we tested the hypothesis that transcriptional induction of KLF2 is a critical mechanism by which ERK5 signalling drives the expression of Zscan4 and other 2C genes. To this end, we used CRISPR/Cas9 gene editing to generate mESCs in which expression of wild-type KLF2 is suppressed (Klf2 Δ/Δ mESCs). In these cells, ZSCAN4 is expressed at a low level, and this is robustly enhanced by the expression of wild-type KLF2 ( Figure 3A). Expression of Zscan4, Zfp352 and Tdpoz3 mRNAs is also increased by re-introduction of wild-type KLF2 ( Figure 3B), strongly suggesting that ERK5 signalling regulates expression of ZSCAN4/2C genes via KLF2-dependent transcriptional induction. Consistent with this notion, induction of ZSCAN4 expression by ERK5 activation is blunted in Klf2 Δ/Δ mESCs when compared with wild-type mESCs expressing endogenous levels of KLF2 ( Figure 3C). However, ERK5-dependent ZSCAN4 induction is not completely abolished in Klf2 Δ/Δ mESCs ( Figure 3D), consistent with possible residual expression of full-length KLF2 and/or a truncated KLF2 variant ( Figure 3C). Taken together, our results indicate that the ERK5-KLF2 axis promotes expression of Zscan4 and other 2C early embryonic genes.

ERK5 directly phosphorylates KLF2 at multiple pSer/Thr-Pro motifs
We have demonstrated that KLF2 is a transcriptional target of the ERK5 pathway in the regulation of ZSCAN4/2C genes. However, we next explored whether ERK5 might regulate KLF2 via other mechanisms. KLF2 comprises a predicted N-terminal MAP kinase docking motif and phosphomotifs that can be phosphorylated by CMGC family kinases, particularly the MAP kinase family (Figure 4A). This raised the intriguing possibility that ERK5 may additionally regulate KLF2 by direct phosphorylation. We therefore tested whether recombinant ERK5 phosphorylates recombinant KLF2 in vitro. ERK5, which is phosphorylated and activated by MEK5 in insect cells and then purified, directly phosphorylates recombinant GST-KLF2 in vitro (Figure 4B). GST-KLF2 phosphorylation is ablated by pre-treatment of the kinase assay with the selective ERK5 inhibitor XMD8-92 (Figure 4B), indicating that trace contaminant kinases are not responsible for KLF2 phosphorylation. Activated ERK5 does not efficiently phosphorylate GST-ATF2 19-96, confirming that the observed phosphorylation is specific for KLF2 and not the GST tag (Figure 4C). Furthermore, we show that ERK5 phosphorylates KLF2 to high stoichiometry of up to 0.5 pmol phosphate/pmol protein (Figure 4D), as might be expected for a bona fide ERK5 substrate.
Figure 2 (legend, continued). Error bars represent mean ± SEM. Statistical significance was determined by Student's t-test.
(C) Erk5/Mapk7 −/− mESCs were transfected with MEK5DD and either empty vector, wild-type ERK5 or kinase inactive (D200A) ERK5. mRNA levels of Klf2, Zscan4, Tdpoz3 and Zfp352 were determined by qRT-PCR. Klf2 induction is used as a positive control. Each data point represents one biological replicate calculated as an average of two technical replicates (n = 3). Error bars represent mean ± SEM. Statistical significance was determined by Student's t-test.
(D) Genomic DNA from mESCs maintained in LIF/FBS or 2i media was subjected to qPCR using primers against the telomeric repeats (T) and a single locus control region (S). T/S ratio was calculated to give the average relative telomere length. Each data point represents one biological replicate calculated as the average of three technical replicates (n = 7). A power calculation was used to determine sample size, which was randomly selected from biological replicates. Error bars represent mean ± SEM. Statistical significance was determined by Student's t-test.
(E) Genomic DNA from mESCs transfected with either empty vector or MEK5DD (48 h) and treated with either DMSO or 10 µM AX15836 (24 h prior to lysis) was collected and subjected to qPCR using primers against the telomeric repeats (T) and a single locus control region (S). T/S ratio was calculated to give the average relative telomere length. Each data point represents one biological replicate calculated as the average of three technical replicates (n = 22). A power calculation was used to determine sample size, which was randomly selected from biological replicates. Error bars represent mean ± SEM. Statistical significance was determined by Student's t-test.
(F) Genomic DNA from mESCs transfected with either empty vector or ZSCAN4 (48 h) was collected and subjected to qPCR using primers against the telomeric repeats (T) and a single locus control region (S). T/S ratio was calculated to give the average relative telomere length. Each data point represents one biological replicate calculated as the average of three technical replicates (n = 7). A power calculation was used to determine sample size, which was randomly selected from biological replicates. Error bars represent mean ± SEM. Statistical significance was determined by Student's t-test.
We next sought to identify the KLF2 sites which are phosphorylated by ERK5 in vitro. To this end, we subjected ERK5 phosphorylated KLF2 to tryptic digestion and HPLC analysis. Radioactive tracing identifies two major peaks of KLF2 phosphorylation (Figure 4E). Edman sequencing shows that the first peak of ERK5 phosphorylation corresponds to a dual phosphorylated Thr171/Ser175 peptide found within the KLF2 transcriptional repression domain, which contains the two Ser/Thr-Pro motifs characteristic of CMGC kinase substrates (Figure 4A). The second peak is a singly phosphorylated Thr243/Ser247 peptide consisting of two further Ser/Thr-Pro motifs towards the KLF2 DNA binding domain (Figure 4F). Interestingly, ERK5 primarily phosphorylates Ser247 but displays little activity towards Thr243 (Figure 4F), in contrast with phosphorylation of the Thr171/Ser175 motif. Therefore, ERK5 directly phosphorylates KLF2 on multiple, distinct Ser/Thr-Pro motifs characteristic of MAP kinase substrates.
Figure 3 (legend, continued). Wild-type (Klf2 +/+) and Klf2 Δ/Δ mESCs were transfected with either empty vector or MEK5DD (48 h) and treated with DMSO or 10 µM AX15836 (24 h prior to lysis). ERK5 activation was assessed by band-shift following ERK5 immunoblotting (P = phosphorylated active ERK5, UnP = unphosphorylated inactive ERK5), and by KLF2 induction. Expression levels of KLF2 and ZSCAN4 were determined by immunoblotting. ERK1/2 was used as a loading control. (D) ZSCAN4 levels from replicate wild-type (Klf2 +/+) and Klf2 Δ/Δ mESC samples represented in Figure 3C were quantified using Bio-Rad ImageLab software. Each individual data point represents one biological replicate (n = 3).
We then tested whether ERK5 activity promotes KLF2 phosphorylation at these motifs in mESCs. Using an ectopically-expressed 3xFLAG-tagged KLF2 reporter, which is not subject to transcriptional regulation by ERK5, we monitored KLF2 phosphorylation using a phospho-specific antibody against KLF2 Ser175, which is a major site of ERK5 phosphorylation (Figure 4F). Using this system, we observe that KLF2 is basally phosphorylated at Ser175 (Figure 4G), consistent with KLF2 phosphorylation by other kinases such as ERK1/2 [26]. However, co-expression of WT-ERK5, but not a kinase inactive mutant (D200A) of ERK5 (ERK5 KD), drives an increase in KLF2 Ser175 phosphorylation (Figure 4G). These data suggest that ERK5 can also phosphorylate KLF2 at Ser/Thr-Pro motifs in mESCs.

KLF2 phosphorylation drives Cullin E3 ligase recruitment and ubiquitin-dependent degradation
Our findings then raised the question of the mechanism(s) by which ERK5 phosphorylation impacts on KLF2 function. To address this, we performed affinity purification mass spectrometry to identify proteins that specifically interact with KLF2 in a manner that is dependent on Ser/Thr-Pro motif phosphorylation. Within the cohort of proteins that are enriched in immunoprecipitates of FLAG-tagged wild-type KLF2 compared with KLF2 in which the dual Ser/Thr-Pro phosphorylation motifs are mutated (KLF2-4A) are FBW7, an F-box WD40 containing Cullin substrate adaptor, and CUL1, a Cullin family RING-finger E3 ubiquitin ligase (CRL; Figure 5A). Previous studies implicate ubiquitylation in KLF2 regulation in mESCs [26], and FBW7 has been shown to mediate KLF2 ubiquitylation in endothelial cells [27,28]. We validate the interaction between HA-tagged FBW7 and FLAG-KLF2 (Figure 5B) and confirm that mutation of either the Thr171/Ser175 or Thr243/Ser247 phosphorylation motifs (KLF2 2A-N or 2A-C respectively) or all four Ser/Thr-Pro sites (KLF2-4A) disrupts this interaction. This suggests that multiple KLF2 phosphorylation motifs are required for recruitment of FBW7, which is characteristic of phosphorylation dependent ubiquitylation by FBW7 CRL complexes [29]. These data therefore indicate that KLF2 phosphorylation promotes recruitment of a CRL E3 ubiquitin ligase. Interestingly, KLF2 Ser/Thr-Pro site mutants appear to show increased expression in mESCs (Figure 5B), suggesting that they are less susceptible to proteasomal degradation.
We then tested the role of dual Ser/Thr-Pro motifs in driving KLF2 ubiquitylation using UBQLN1 Tandem Ubiquitin Binding Element (TUBE) resin to specifically enrich ubiquitylated proteins. FLAG-KLF2 is polyubiquitylated in mESCs, as demonstrated by the high-molecular weight species detected in HALO-TUBE pull-downs (KLF2-Ub n; Figure 5C). Treatment with the broad-specificity deubiquitinase USP2 abolishes the KLF2-Ub n signal (Figure 5C), confirming that these high-molecular weight species are poly-ubiquitylated KLF2. Furthermore, mutational inactivation of dual Ser/Thr-Pro phosphorylation motifs (KLF2-4A) abolishes KLF2 ubiquitylation (Figure 5D), suggesting that ERK5 phosphorylation of KLF2 at Ser/Thr-Pro motifs promotes KLF2 ubiquitylation. KLF2 phosphorylation and ubiquitylation impact on stability, as a cycloheximide timecourse indicates that wild-type KLF2 is a relatively short-lived protein compared with KLF2 in which the Ser/Thr-Pro motifs are mutated (KLF2-4A; Figure 5E,F). These data therefore implicate ERK5 not only in transcriptional induction of the Klf2 gene, but also in phosphorylation and resulting proteasomal degradation of the KLF2 protein product.
Figure 5 (legend, part 1 of 2). (A) Wild-type (Klf2 +/+) or Klf2 Δ/Δ mESCs were transfected with either wild-type (WT) or T171A/S175A/T243A/S247A mutant (4A) 3xFLAG-KLF2. FLAG-KLF2 was isolated by FLAG IP, and interacting proteins identified by mass spectrometry. Spectral counting was used to rank proteins that preferentially interact with WT KLF2 (interaction ratio >1) or 4A KLF2 (interaction ratio <1). (B) mESCs were transfected with HA-FBW7 and either empty vector or the indicated 3xFLAG-KLF2 construct. WT = wild-type, 2A N = T171A/S175A, 2A C = T243A/S247A, 4A = T171A/S175A/T243A/S247A. FLAG-KLF2 was isolated by FLAG IP, and HA-FBW7 interaction determined by immunoblotting. ERK1/2 was used as a loading control. (C) mESCs were transfected

ERK5 phosphorylation of KLF2 confers negative feedback regulation of ZSCAN4/2C gene expression
KLF2 transcriptional induction by ERK5 signalling promotes Zscan4/2C gene expression. However, ERK5 also directly phosphorylates KLF2 to drive ubiquitylation and degradation. This prompts the hypothesis that KLF2 phosphorylation by ERK5 confers a negative feedback loop to temper Zscan4/2C gene expression. To test this notion, we investigated the impact of the dual Ser/Thr-Pro phosphorylation motifs on KLF2-dependent transcriptional induction of Zscan4/2C genes. As shown previously, expression of wild-type KLF2 drives expression of Zscan4, Zfp352 and Tdpoz3 mRNAs (Figure 5G). However, mutation of the dual Ser/Thr-Pro motifs (KLF2-4A) further augments Zscan4, Zfp352 and Tdpoz3 expression, suggesting that KLF2 phosphorylation and ubiquitin-mediated turnover suppresses KLF2 transcriptional induction of Zscan4 and other early embryonic genes. Wild-type KLF2 and KLF2-4A are predominantly, but not exclusively, nuclear localised in mESCs (Figure 5H), indicating that KLF2 phosphorylation does not impact on transcriptional activity by altering subcellular localisation. These results therefore suggest that ERK5 signalling promotes ZSCAN4 expression by KLF2 transcriptional induction, and KLF2 function is then tempered by a negative feedback loop dependent upon direct ERK5 phosphorylation and resulting ubiquitylation (Figure 6).

Discussion
ERK5 is a unique kinase encoding both a kinase domain and a transcriptional activation domain. It was identified as a transcriptional activator of pluripotency genes in mESCs, including the Klf2 transcription factor [22]. However, other functions of ERK5 in regulating gene expression and proteome dynamics in mESCs have not been systematically explored. We use quantitative proteomics to identify the ERK5-dependent proteome, which unveils a function for ERK5 in driving expression of the ZSCAN4 rejuvenation factor and other early embryonic two-cell stage genes. We show that the mechanism by which ERK5 induces expression of these genes is via KLF2, which is a transcriptional target of the ERK5 signalling pathway.
The ERK5 pathway is subject to complex regulation that we hypothesise plays a critical role in establishing the regulatory dynamics of ZSCAN4 expression. Previous work has shown that ERK5 is phosphorylated and activated by MEK5, resulting in ERK5 nuclear translocation [30,31]. Once localised to the nucleus, ERK5 induces KLF2 transcription via a dual mechanism involving members of the MEF2 family of transcription factors. This involves direct phosphorylation and activation of MEF2 [13] and transcriptional co-activation of MEF2 target genes via the ERK5 C-terminal transcriptional activation domain [7,16]. Recent work has shown that the ERK5 kinase inhibitor AX15836 both inhibits ERK5 kinase activity and drives paradoxical activation of the C-terminal transcriptional activation domain [20], adding to the complexity of ERK5 signalling. Curiously, we now report that ERK5 also directly phosphorylates KLF2, which, in contrast to ERK5 phosphorylation of MEF2, negatively regulates KLF2 by promoting ubiquitylation and degradation. Furthermore, the presence of ERK5 phosphorylation sites within the KLF2 transcriptional repression domain suggests that ERK5 may also modulate KLF2 transcriptional function, although this hypothesis has not yet been tested. Thus, ERK5 activation and nuclear translocation presumably enables both ERK5 dependent transcriptional induction of Klf2 and direct phosphorylation and negative regulation of largely nuclear KLF2 protein. We propose this generates a dynamic system of amplification followed by negative feedback, which could account for oscillatory ZSCAN4 expression observed in mESCs [21]. We have shown that ERK5 signalling to ZSCAN4 promotes telomere maintenance, i.e. increased telomere length, in mESCs. However, the mechanism by which ZSCAN4 drives telomere maintenance remains a key question.
The ZSCAN4 network leads to global demethylation of the mESC genome [32], and reportedly mediates telomere maintenance by suppressing DNA methylation to enable telomere extension via homologous recombination [33]. This is driven by ubiquitylation and proteasomal degradation of the key maintenance DNA methyltransferase, DNMT1, in a mechanism dependent on the E3 ubiquitin ligase UHRF1 [33]. Therefore, it will be important to investigate the function of ERK5 signalling to ZSCAN4 in suppression of DNMT1 expression, which could underpin increased telomere elongation activity observed upon ERK5 pathway activation. Of interest, we have shown previously and in this paper that ERK5 signalling acts to suppress expression of de novo DNA methyltransferases DNMT3A and DNMT3B [6].
Finally, the mechanism by which KLF2 controls ZSCAN4 has not yet been established. KLF2 genome occupancy studies have not identified KLF2 binding sites in the ZSCAN4 gene cluster, although this genomic region is underrepresented in such datasets. Recent data shows that Kruppel-like factors (KZFP factors) related to KLF2 promote activation of transposable element-based enhancers during zygotic genome activation, which in turn induces expression of early embryonic genes during development [34]. An exciting possibility is that this transposable element-based mechanism may be responsible for the expression of early embryonic genes downstream of the ERK5-KLF2 signalling axis in mESCs.

Experimental procedures
Many reagents developed for this study are available by request at the MRC-PPU reagents & services website (https://mrcppureagents.dundee.ac.uk/).
Figure 6 (legend). The ERK5 pathway is subject to complex regulation that we propose plays a critical role in establishing the regulatory dynamics of ZSCAN4 expression. ERK5 activation drives Klf2 transcription, which drives ZSCAN4/2C gene expression. However, ERK5 also directly phosphorylates KLF2, which negatively regulates KLF2 function by promoting ubiquitylation and degradation. Thus, ERK5 both promotes KLF2 transcriptional induction and negatively regulates KLF2 via direct phosphorylation, generating a dynamic system of amplification followed by negative feedback that accounts for oscillatory ZSCAN4 expression observed in mESCs.

mESC culture, transfection and lysis
CCE mESCs were cultured on gelatin coated plates in media containing LIF, 10% FBS (Gibco), and 5% knockout serum replacement (Invitrogen) unless otherwise stated. mESCs were transfected with pCAGGS expression vectors using Lipofectamine LTX (Life Technologies), selected with puromycin after 24 h and cultured for the stated times. For CRISPR/Cas9, mESCs were transfected with pX335 and pKN7 (Addgene) and selected, then either lysed or clones isolated. Cell extracts were made in lysis buffer (20 mM Tris [pH 7.4], 150 mM NaCl, 1 mM EDTA, 1% NP-40 [v/v], 0.5% sodium deoxycholate [w/v], 10 mM β-glycerophosphate, 10 mM sodium pyrophosphate, 1 mM NaF, 2 mM Na3VO4, and Roche Complete Protease Inhibitor Cocktail Tablets).

Quantitative proteomics
Protein extraction and digestion
mESCs were washed in PBS and then lysed in 8.5 M urea, 50 mM ammonium bicarbonate (pH 8.0) supplemented with protease inhibitors. Lysates were sonicated in an ice-water bath for 5 min using a Biosonicator operated at 50% power with 30 s on/off cycles. The lysates were then centrifuged at 14 000 rpm for 10 min at 4°C and supernatants collected. Protein concentration of the lysate was determined by BCA protein assay. Proteins were reduced with 5 mM DTT at 55°C for 30 min and cooled to room temperature. Reduced lysates were then alkylated with 10 mM iodoacetamide at room temperature for 30 min in the dark. The alkylation reaction was quenched by the addition of another 5 mM DTT. After 20 min of incubation at room temperature, the lysate was digested using Lys-C at a weight ratio of 1 : 200 (enzyme/lysate) at 37°C for 4 h. The samples were further diluted to 1.5 M urea with 50 mM ammonium bicarbonate (pH 8.0), and sequencing-grade trypsin was added at a weight ratio of 1 : 50 (enzyme/lysate) and incubated overnight at 37°C. The digest was acidified to pH 3.0 by addition of TFA to 0.2% and gently mixed at room temperature for 15 min; the resulting precipitates were removed by centrifugation at 7100 RCF for 15 min. The acidified lysate was then desalted using a C18 SPE cartridge (Waters) and the eluate was divided into 100 µg aliquots and dried by vacuum centrifugation. To check the digests, 1 µg of each sample was analysed by mass spectrometry prior to TMT labelling.

TMT labelling and high pH reverse phase fractionation
100 µg of peptide from each sample was re-suspended in 100 mM triethylammonium bicarbonate buffer (pH 8.5). 0.8 mg of TMT tag (Thermo) dissolved in 41 µl of anhydrous acetonitrile was transferred to the peptide sample and incubated for 60 min at room temperature. The TMT labelling reaction was quenched with 5% hydroxylamine. 1 µg of each labelled sample was analysed by mass spectrometry to assess the labelling efficiency before pooling. After checking the labelling efficiency, the TMT-labelled peptides were mixed together and dried by vacuum centrifugation. After drying, the mixture of TMT-labelled peptides was dissolved in 0.2% TFA and then desalted using a C18 SPE cartridge. The desalted peptides were subjected to orthogonal basic pH reverse phase fractionation, collected in a 96-well plate and consolidated into a total of 20 fractions, which were dried by vacuum centrifugation.

LC-MS/MS analysis
Each fraction was dissolved in 0.1% FA and quantified by Nanodrop. 1 µg of peptide was loaded on a C18 trap column at a flow rate of 5 µl/min. Peptide separations were performed over an EASY-Spray column (C18, 2 µm, 75 µm × 50 cm) with an integrated nano electrospray emitter at a flow rate of 300 nl/min. The LC separations were performed with a Thermo Dionex Ultimate 3000 RSLC Nano liquid chromatography instrument. Peptides were separated with a 180 min segmented gradient as follows: 7% to 25% buffer B (80% ACN/0.1% FA) in 125 min, 25% to 35% buffer B for 30 min, 35% to 99% buffer B for 5 min, followed by a 5 min 99% wash and 15 min equilibration with buffer A (0.1% FA).
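As a quick consistency check, the segmented gradient can be written out as a small table and summed. This is a sketch only: the linear change of buffer B within each segment is an assumption, the segment labels are hypothetical, and the buffer B percentage during equilibration is left unset because the text does not state it.

```python
# Gradient segments as (label, start %B, end %B, duration in minutes).
# The equilibration step runs with buffer A; its %B is not stated in the
# text, so it is stored as None rather than guessed.
GRADIENT = [
    ("separation 1", 7, 25, 125),
    ("separation 2", 25, 35, 30),
    ("ramp", 35, 99, 5),
    ("wash", 99, 99, 5),
    ("equilibration", None, None, 15),
]


def total_runtime(segments):
    """Sum the segment durations in minutes."""
    return sum(duration for _, _, _, duration in segments)


def percent_b_at(segments, t):
    """Linearly interpolate %B at time t (minutes from gradient start).

    Returns None when t is outside the run or %B is unspecified.
    """
    if t < 0:
        return None
    elapsed = 0.0
    for _, start, end, duration in segments:
        if t <= elapsed + duration:
            if start is None:
                return None
            return start + (end - start) * (t - elapsed) / duration
        elapsed += duration
    return None


runtime = total_runtime(GRADIENT)
midpoint_b = percent_b_at(GRADIENT, 62.5)
```

The segment durations add up to the stated 180 min run.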
Data acquisition on the Orbitrap Fusion Tribrid platform with instrument control software version 3.0 was carried out using a data-dependent method with multinotch synchronous precursor selection MS3 scanning for TMT-9plex tags. The mass spectrometer was operated in data-dependent most intense precursors Top Speed mode with 3 s per cycle. The survey scan was acquired from m/z 375 to 1500 at a resolving power of 120 000 with an AGC target of 400 000. The maximum injection time for the full scan was set to 60 ms. For the MS/MS analysis, monoisotopic precursor selection was set to peptide. The AGC target was set to 50 000 with a maximum injection time of 120 ms. Precursors with an unknown charge state, a charge state of 1, or a charge state higher than 7 were excluded. The MS/MS analyses were performed by 1.2 m/z isolation with the quadrupole, normalised HCD collision energy of 37% and analysis of fragment ions in the Orbitrap using 15 000 resolving power with auto normal range scan starting from m/z 110. Dynamic exclusion was set to 60 s. For the MS3 scan, the MS3 precursor population from MS2 scan ranging from m/z 300-100 was isolated using the SPS waveform and then fragmented by HCD. The HCD normalised collision energy was set to 65. The MS3 scans were acquired from m/z 100 to 500 with a resolution of 50 000 and an AGC target of 50 000. The maximum injection time for full scan was set to 86 ms.

Data processing and spectra assignment
Data from the Orbitrap Fusion were processed using Proteome Discoverer Software (version 2.2). MS2 spectra were searched using Mascot against a UniProt Mouse database appended to a list of common contaminants (10 090 total sequences). The search parameters were specified as trypsin enzyme, two missed cleavages allowed, minimum peptide length of 6, precursor mass tolerance of 20 ppm, and a fragment mass tolerance of 0.05 Daltons. Oxidation of methionine and TMT at lysine and peptide N-termini were set as variable modifications. Carbamidomethylation of cysteine was set as a fixed modification. Peptide spectral match error rates were determined using the target-decoy strategy coupled to Percolator modelling of positive and false matches. Data were filtered at the peptide spectral match level to control for false discoveries using a q-value cut-off of 0.01, as determined by Percolator. For quantification, signal-to-noise values higher than 10 for unique and razor peptides were summed within each TMT channel, and each channel was normalised to total peptide amount. The calculated P-values were then adjusted according to the Benjamini-Hochberg procedure. Significantly regulated proteins (P-value < 0.05) were further inspected manually with reference to the standard deviations of biological replicates.
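The Benjamini-Hochberg adjustment mentioned above can be illustrated with a short Python sketch. This is not the Proteome Discoverer implementation; the function name `benjamini_hochberg` and the example P-values are hypothetical and serve only to show how the adjustment works in principle.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values
    # never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-protein P-values, for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this made-up example, two of the four proteins with a raw P-value below 0.05 remain below the threshold after adjustment.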

Immunofluorescence microscopy
mESCs were seeded on gelatin-coated coverslips and, where required, transfected as described above. After 24 h, mESCs were fixed with 4% PFA (w/v) in PBS, before being permeabilised in 0.5% Triton X-100 in PBS (v/v) for 5 min at room temperature. Coverslips were then blocked with 1% Fish gelatin in PBS (w/v) and incubated with primary antibodies for 2 h at room temperature in a humid chamber. After three washes with PBS, secondary antibodies conjugated to fluorophores were diluted 1 : 500 in blocking buffer and incubated on coverslips for 1 h at room temperature in a humid chamber. Where the cytoskeleton was being observed, Actin Red 555 reagent was added to the secondary antibody mix. After three washes with PBS, 0.1 mg/ml Hoechst was incubated with the coverslips for 5 min at room temperature in a humid chamber to stain nuclei. After three more washes with PBS, coverslips were mounted onto cover slides using Fluorsave reagent (Millipore). Images were taken using a Leica SP8 confocal microscope and processed using FIJI and Photoshop CSC software (Adobe).

RNA extraction and quantitative PCR
RNA was extracted using the OMEGA total RNA kit and reverse transcribed using iScript reverse transcriptase (Bio-Rad). qPCR was performed using TB Green Premix Ex Taq (Takara). The ΔCt method using Gapdh as a reference gene was used to analyse relative expression and the 2^(−ΔΔCt) (Livak) method was used to normalise to control. Primers used are listed in Table 1.
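As an illustration of the ΔΔCt calculation, the following Python sketch computes relative expression from Ct values. The function name and the example Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both target and reference primers, as the Livak method does.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus control by the 2^(-ddCt) method."""
    # dCt: target gene Ct minus reference gene (e.g. Gapdh) Ct.
    delta_ct_sample = ct_target - ct_ref
    delta_ct_control = ct_target_ctrl - ct_ref_ctrl
    # ddCt: sample dCt relative to the control condition.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# treated sample than in the control, with an unchanged reference gene.
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_ctrl=26.0, ct_ref_ctrl=18.0)
```

Here a ΔΔCt of −2 corresponds to a four-fold increase in expression relative to control.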

Recombinant ERK5 kinase assay
200 ng pure active ERK5 was incubated with the indicated inhibitor in 50 mM Tris-HCl (pH 7.5), 0.1 mM EGTA, and 1 mM 2-mercaptoethanol. The reaction was initiated by adding 10 mM magnesium acetate, 100 µM [γ-32P]-ATP (500 cpm/pmol), and 5 µg of the indicated substrate (GST-KLF2 or GST) and incubated Once colourless, gel pieces were shrunk with acetonitrile for 15 min, the supernatant aspirated and gel pieces dried using a SpeedVac. Dry gel pieces were swollen with 25 mM triethylammonium bicarbonate containing 5 µg/ml of trypsin and incubated at 30°C overnight on a Thermomixer. After trypsinisation, an equivalent volume of acetonitrile was added to the digested gel pieces and incubated for a further 15 min. Supernatant was transferred to a clean tube and dried in a SpeedVac. Further extraction of peptides from the gel pieces was achieved by incubation with 100 µl 50% acetonitrile with 2.5% formic acid for 15 min at room temperature. This supernatant was combined with the dry peptides and again dried in a SpeedVac. Digestion transferred >95% of the 32P from the gel pieces to the recovered dried peptides. Peptides were fractionated on a HPLC column (Vydac C18) equilibrated in 0.1% (w/v) trifluoroacetic acid (TFA), using a linear acetonitrile gradient at a flow rate of 0.2 ml/min. 100 µl fractions were collected and analysed by LC-MS/MS. The resulting data were searched using Mascot (matrixscience.com), allowing for Phospho (Ser/Thr), Phospho (Tyr), Oxidation (Met) and Dioxidation (Met) modifications. To identify phosphorylation sites, radioactive fractions were subjected to solid-phase Edman degradation of the peptide coupled to Sequelon-AA membranes (Applied Biosystems), using an Applied Biosystems 494C sequencer.

HALO-TUBE ubiquitin pull-down assay
HALO-tagged UBQLN1 Tandem Ubiquitin Binding Element (TUBE) (MRC-PPU R&S DU23799) is a tetramer of the UBQLN1 UBA ubiquitin binding domain (aa536-589) provided by Dr. A. Knebel (MRC-PPU, Dundee). A mutant TUBE incapable of binding ubiquitin (M557K, L584K) was also used. HALO-TUBE beads were prepared by washing 1 ml packed HALO resin three times with HALO wash buffer (50 mM Tris-HCl (pH 7.5), 0.5 M NaCl, 1% Triton-X100 (v/v)), resulting in a 1 : 4 slurry. The slurry was combined with 7 mg of HALO-TUBE or HALO-mutant TUBE and incubated on a rotating wheel at 4°C overnight. After five washes in 10 ml HALO wash buffer and then in 10 ml HALO storage buffer (50 mM Tris-HCl (pH 7.5), 150 mM NaCl, 0.1 mM EGTA, 270 mM sucrose, 0.07% β-mercaptoethanol (v/v)), the beads were stored as a 20% slurry at 4°C and washed three times in lysis buffer immediately prior to use. To analyse ubiquitylation of a protein of interest, 10 µl washed packed beads were incubated with 1 mg clarified cell lysate for 3 h at 4°C with shaking. After three washes with lysis buffer, beads were analysed by SDS-PAGE and immunoblotting.

Telomere qPCR assay
Average telomere length was determined by qPCR, quantifying the ratio of telomeric DNA repeats to a single-copy gene, acidic ribosomal phosphoprotein PO (36B4). Cells were seeded in 12-well plates and genomic DNA was extracted using the DNeasy Blood and Tissue Kit (QIAGEN) according to the manufacturer's instructions. Primers were supplied by Sigma-Aldrich and are listed in Table 2.
Each reaction for the telomere portion of the assay included 12.5 µl SYBR Green PCR Master Mix (Takara), 300 nM each of the forward and reverse primers, 20 ng genomic DNA, and enough double-distilled H2O to yield a 25 µl reaction. Three 20 ng samples of each DNA were placed in adjacent wells of a 96-well plate. An automated thermocycler (Prism 7000 Sequence Detection System, Applied Biosystems) was used with the following reaction conditions: 95°C for 10 min followed by 30 cycles of data collection at 95°C for 15 s and a 56°C anneal-extend step for 1 min. Each reaction for the 36B4 portion contained 12.5 µl SYBR Green PCR Master Mix (Takara), 300 nM forward primer, 500 nM reverse primer, 20 ng genomic DNA and enough double-distilled H2O to yield a 25 µl reaction. Three 20 ng samples of each DNA were placed in adjacent wells of a 96-well plate. The thermocycler reaction conditions were: 95°C for 10 min, followed by 35 cycles of data collection at 95°C for 15 s, with 52°C annealing for 20 s, followed by extension at 72°C for 30 s. Telomere and 36B4 reactions were performed on separate plates. Therefore, to minimise variation due to location in the plate, each sample was loaded into corresponding positions on each plate, such that the two plates had the same layout. Analysis was performed by subtracting the 36B4 Ct value from the corresponding telomere Ct value and normalising to the average of the control conditions. These log ratios were then plotted using GraphPad Prism software, which was used to perform statistical tests. The original telomere qPCR protocol and data analysis are reported in [23].
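The analysis step described above can be sketched in Python as follows. The function name, sample names and Ct values are hypothetical. Averaging the triplicate wells before subtraction, and converting the normalised log ratio to a relative telomere/single-copy ratio as 2^(−ΔΔCt), are assumptions that are not stated explicitly in the text.

```python
from statistics import mean


def telomere_log_ratios(ct_values, control_samples):
    """Telomere Ct minus 36B4 Ct per sample, normalised to the control mean.

    ct_values maps a sample name to (telomere well Cts, 36B4 well Cts).
    """
    # Assumption: triplicate wells are averaged before subtraction.
    delta_ct = {
        name: mean(tel_wells) - mean(ref_wells)
        for name, (tel_wells, ref_wells) in ct_values.items()
    }
    # Normalise to the average dCt of the control conditions.
    baseline = mean(delta_ct[name] for name in control_samples)
    return {name: d - baseline for name, d in delta_ct.items()}


# Hypothetical Ct values, for illustration only.
ct_values = {
    "control_1": ([15.0, 15.0, 15.0], [20.0, 20.0, 20.0]),
    "control_2": ([15.4, 15.4, 15.4], [20.0, 20.0, 20.0]),
    "treated_1": ([14.2, 14.2, 14.2], [20.0, 20.0, 20.0]),
}
log_ratios = telomere_log_ratios(ct_values, ["control_1", "control_2"])

# Assumed conversion to a relative telomere/36B4 ratio (lower Ct = more template).
relative_ts = {name: 2.0 ** (-value) for name, value in log_ratios.items()}
```

With this sign convention, a more negative normalised value indicates more telomeric repeat signal relative to 36B4 than in the control conditions.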

Statistical analysis
Data are presented as the average with error bars indicating standard error of the mean (SEM). Statistical significance of differences between experimental groups was assessed using a Student's t-test. Differences in averages were considered significant if P < 0.05. Representative western blots are shown. Power calculations for telomere qPCR assays were performed using an online tool (http://powerandsamplesize.com/Calculators/Compare-2-Means/2-Sample-Equality), with the following parameters: sample size nB = number of observed measurements for control group, Power = 0.80, P-value = 0.05, Group A mean = mean of observed measurements for test group, Group B mean = mean of observed measurements for control group, Standard deviation = calculated standard deviation based on observed measurements for test group, Sampling ratio = 1. Total sample size was then read off the graph at Power = 0.80 and observed test group mean. Total sample size was divided by two to give the number of samples required for each group, and replicates were randomly selected for each sample group. A Student's t-test was performed on these sample groups to determine significance.
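For orientation, a two-sample power calculation of this kind can be sketched in Python using the standard normal-approximation formula for comparing two means. Whether the online tool uses exactly this formula is an assumption, and the function name and example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(mean_a, mean_b, sd, alpha=0.05, power=0.80, ratio=1.0):
    """Group B sample size for a two-sided, two-sample test of equal means.

    Normal approximation:
    n_B = (1 + 1/ratio) * (sd * (z_{1-alpha/2} + z_{power}) / (mean_a - mean_b))^2
    With ratio = 1, group A needs the same number of samples.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n_b = (1 + 1 / ratio) * (sd * (z_alpha + z_beta) / (mean_a - mean_b)) ** 2
    return ceil(n_b)


# Hypothetical inputs: a difference in means of one standard deviation.
n_each = samples_per_group(mean_a=1.0, mean_b=0.0, sd=1.0)
```

The function returns the per-group sample size directly, which corresponds to the total sample size divided by two when the sampling ratio is 1.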