The mechanisms that regulate the nucleocytoplasmic localization of human deubiquitinases remain largely unknown. The nuclear export receptor CRM1 binds to specific amino acid motifs termed NESs (nuclear export sequences). By using in silico prediction and experimental validation of candidate sequences, we identified 32 active NESs and 78 inactive NES-like motifs in human deubiquitinases. These results allowed us to evaluate the performance of three programs widely used for NES prediction, and to add novel information to the recently redefined NES consensus. The novel NESs identified in the present study reveal a subset of 22 deubiquitinases bearing motifs that might mediate their binding to CRM1. We tested the effect of the CRM1 inhibitor LMB (leptomycin B) on the localization of YFP (yellow fluorescent protein)- or GFP (green fluorescent protein)-tagged versions of six NES-bearing deubiquitinases [USP (ubiquitin-specific peptidase) 1, USP3, USP7, USP21, CYLD (cylindromatosis) and OTUD7B (OTU-domain-containing 7B)]. YFP–USP21 and, to a lesser extent, GFP–OTUD7B relocated from the cytoplasm to the nucleus in the presence of LMB, revealing their nucleocytoplasmic shuttling capability. Two sequence motifs in USP21 had been identified during our survey as active NESs in the export assay. Using site-directed mutagenesis, we show that one of these motifs mediates USP21 nuclear export, whereas the second motif is not functional in the context of full-length USP21.
Ubiquitylation, the covalent attachment of one or more units of the 76-amino-acid peptide ubiquitin, is an important post-translational modification that regulates the levels, activity and localization of many cellular proteins. Ubiquitylation is a reversible modification, and the removal of ubiquitin from substrates is catalysed by a group of enzymes termed deubiquitylating enzymes or DUBs (deubiquitinases). By specifically removing ubiquitin chains, DUBs modulate the ubiquitylation status of a variety of proteins, and thereby contribute to the regulation of important cellular processes, such as gene expression, proliferation, repair of DNA damage and apoptosis.
The human genome encodes approximately 90 DUBs, which can be classified into five different subclasses according to the structure of their catalytic domain. A minority of human DUBs have previously been relatively well characterized. For example, USP (ubiquitin-specific peptidase) 7 has been shown to regulate the activity of the tumour-suppressor proteins p53, FOXO (forkhead box O) and PTEN (phosphatase and tensin homologue deleted on chromosome 10). The involvement of USP7 and other deubiquitinases in processes and pathways related to tumour development has raised the possibility that these proteins might represent interesting targets for anticancer therapy [11,12]. However, a significant hurdle towards the potential development of DUBs as therapeutic targets is the limited information still available on the basic biology of this family of proteins. In fact, despite recent advances, the physiological role and the mechanisms that regulate the function of most human DUBs are yet to be identified.
One of the aspects of DUB biology that remains poorly characterized is the regulation of their nucleocytoplasmic distribution, which, in turn, might critically regulate the accessibility of these enzymes to their substrates. In this regard, there is evidence that several DUBs may act on substrates located in both the nucleus and the cytoplasm. For example, USP21 catalyses the deubiquitylation of the nuclear histone H2A and the cytoplasmic protein Rip1 (receptor-interacting protein 1), but how the access of USP21 to its targets may be regulated is still unknown. Furthermore, it has been shown that DUB-mediated deubiquitylation may regulate the translocation of some substrates between the nucleus and cytoplasm. However, very little is known about the mechanisms that may control the nucleocytoplasmic distribution of DUBs themselves. A systematic localization analysis of the 20 DUBs encoded by the fission yeast Schizosaccharomyces pombe has been reported recently, but few studies have specifically addressed the localization of DUBs in higher organisms [17–19].
Transport of proteins between the nucleus and the cytoplasm is carried out by specialized receptors that bind their cargoes in one compartment and escort them through the nuclear pore complex before releasing them in the other compartment. Transport receptors recognize specific amino acid motifs that function as targeting signals in the cargo proteins. Thus import receptors recognize and bind NLSs (nuclear localization signals), whereas export receptors bind to NESs (nuclear export sequences). ‘Classical’ NLSs consist of one or two short stretches of basic amino acids, as exemplified by the SV40 (simian virus 40) large T antigen NLS (PKKKRKV). On the other hand, the best-characterized NESs, termed ‘leucine-rich’ NESs, generally consist of a stretch of hydrophobic amino acids with a poorly conserved ‘consensus’ pattern. This type of NES bridges the interaction of the cargo proteins with the export receptor CRM1.
Some proteins can shuttle back and forth between the nucleus and the cytoplasm, either constitutively or in a regulated manner. Nucleocytoplasmic shuttling has long been recognized as an essential regulatory mechanism for many cellular proteins. Most shuttling proteins identified to date exit the nucleus by interacting with the export receptor CRM1. Therefore, we reasoned that a comprehensive survey of CRM1-binding motifs (i.e. leucine-rich NESs) in human DUBs might represent a feasible strategy to begin to elucidate, on a whole-family scale, the mechanisms that control the nucleocytoplasmic distribution of these proteins.
A common approach to NES identification begins by examining the primary sequence of the protein(s) of interest using informatics programs, such as NetNES, ELM or NES Finder (http://res.erasmusmc.nl/fornerodlab/), to pinpoint amino acid motifs that might constitute a NES. However, it has been shown that the NES-binding domain of CRM1 can accommodate a large variety of sequences, differing in both composition and size [27–30] and, as a result, in silico prediction of cNESs (candidate NESs) is a challenging task. In this context, it is important to note that, as far as we know, the accuracy of the different programs available for NES prediction has not yet been directly compared in a prospective manner.
To identify sequence motifs in human DUBs that might mediate their binding to CRM1, we used in silico prediction of cNESs in 85 human DUBs, followed by experimental testing of the predicted cNES in a nuclear export assay. With this two-step approach, we identified 32 novel functional NESs and 78 non-functional NES-like motifs in this protein family. This relatively large set of experimentally validated data allowed us to prospectively compare the accuracy of commonly used NES prediction programs, and to gain further insight into the amino acid features of nuclear export sequences. On the other hand, the present study reveals a subset of 22 human DUBs bearing one or more sequence motifs that might mediate their binding to the export receptor CRM1, raising the possibility that the function of some of these enzymes may be regulated by active nucleocytoplasmic shuttling. In fact, using fluorescently tagged proteins, we demonstrate the ability of YFP (yellow fluorescent protein)–USP21 and GFP (green fluorescent protein)–OTUD (OTU-domain-containing) 7B to shuttle between the nucleus and the cytoplasm. Finally, by using site-directed mutagenesis, we show that CRM1-dependent nucleocytoplasmic shuttling of YFP–USP21 is mediated by one of the novel NESs identified in our global survey. These findings reveal a mechanism that may regulate the dynamic access of this enzyme to its previously described nuclear and cytoplasmic substrates [13,14].
NES prediction using bioinformatic analysis
The amino acid sequence of each of 85 human DUBs was retrieved from NCBI (http://www.ncbi.nlm.nih.gov/protein) and submitted for analysis to three different web-based NES prediction servers: NetNES (http://www.cbs.dtu.dk/services/NetNES/), ELM (http://elm.eu.org/links.html) and NES Finder (http://res.erasmusmc.nl/fornerodlab/). Both ELM and NES Finder identify candidate NESs as linear sequence motifs resembling the ‘consensus’ NES, although the specific amino acid pattern defining a candidate NES differs between the two programs. The outputs of these programs are short amino acid sequences predicted to be cNESs. The NetNES program, on the other hand, combines neural networks and hidden Markov models to calculate a ‘NES score’ for each amino acid in the protein under analysis. The output of the NetNES program is a graphical representation of the NES score for each residue. Residues with a score exceeding the threshold value pre-determined by the program (0.5) are predicted to be part of an export sequence.
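The thresholding step described for NetNES can be illustrated with a minimal sketch. It assumes the per-residue scores are already available as a plain list of numbers; the function name `residues_above_threshold` and the example scores are hypothetical and are not taken from NetNES or from any real output of the server.

```python
from typing import List, Tuple


def residues_above_threshold(
    scores: List[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs of consecutive residues
    whose score strictly exceeds the threshold."""
    runs: List[Tuple[int, int]] = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = position  # a new run opens here
        elif start is not None:
            runs.append((start, position - 1))  # the run closed at the previous residue
            start = None
    if start is not None:
        runs.append((start, len(scores)))  # run extends to the last residue
    return runs


# Hypothetical per-residue scores, for illustration only
example_scores = [0.1, 0.2, 0.6, 0.8, 0.7, 0.3, 0.1, 0.55, 0.9]
print(residues_above_threshold(example_scores))
print(residues_above_threshold(example_scores, threshold=0.7))
```

Raising the cut-off to 0.7, as was done when selecting the NetNES-only candidates for testing, simply narrows the runs that are reported.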
Plasmids, cloning procedures and site-directed mutagenesis
To evaluate the activity of the predicted DUB cNESs, a series of 110 expression plasmids based on the pRev(1.4)–GFP export assay reporter were constructed. To this end, double-stranded DNA fragments encoding a 19-amino-acid sequence, encompassing the amino acids predicted as cNES and several flanking residues, were generated by annealing of two complementary oligonucleotides and extension with Klenow enzyme (Fermentas). These DNA fragments were cloned into the pRev(1.4)–GFP vector (a gift from Dr Beric Henderson, Westmead Institute for Cancer Research, University of Sydney, Sydney, Australia) using the BamHI/PinAI restriction sites, and confirmed by DNA sequencing. Plasmids encoding several full-length DUBs were generously provided by Dr Rene Bernards (Netherlands Cancer Institute, Amsterdam, The Netherlands) (USP1), Dr Pier Paolo di Fiore (University of Milan, Milan, Italy) (USP3), Dr Roger Everett (University of Glasgow, Glasgow, U.K.) (USP7), Dr David Barford (Institute of Cancer Research, London, U.K.) (CYLD) and Dr Paul C. Evans (Imperial College London, London, U.K.) (OTUD7B), or obtained from the laboratory of Dr John W. Harper (Harvard Medical School, Boston, U.S.A.) through Addgene (USP21). USP3, USP7 and USP21 cDNAs were amplified by PCR using high-fidelity Pfu UltraII fusion HS DNA polymerase (Stratagene) and subcloned into pEYFP-C1 (Clontech) as HindIII/BamHI (USP3 and USP7) or XhoI/EcoRI (USP21) fragments.
The QuikChange® Lightning Site-Directed Mutagenesis Kit (Stratagene) was used to introduce NES-inactivating mutations into full-length YFP–USP21. The presence of the mutations was confirmed by DNA sequencing.
Cell culture, transfection and LMB (leptomycin B) treatment
HeLa cells were grown in Dulbecco's modified Eagle's medium, supplemented with 10% fetal bovine serum, 100 units/ml penicillin and 100 μg/ml streptomycin (all from Invitrogen). The cells were seeded on to sterile glass coverslips in 12-well trays 24 h before transfection. Transfections were carried out with FuGENE®6 (Roche Diagnostics) following the manufacturer's protocol. LMB (Apollo Scientific) was added to the culture medium to a final concentration of 6 ng/ml for the indicated period of time.
Fluorescence microscopy analysis
At 24 h after transfection, the cells expressing proteins tagged with GFP or YFP were fixed with 3.7% formaldehyde in PBS for 30 min, incubated with Hoechst 33258 (Sigma) to visualize the nuclei, washed with PBS and mounted on to microscope slides using Vectashield (Vector). Slides were examined using a Zeiss Axioskop fluorescence microscope, and images were taken with a Nikon DS-Qi1Mc digital camera and the NIS-Elements F software.
In vivo nuclear export assay
The nuclear export assay was carried out essentially as reported previously. Briefly, pRev(1.4)–GFP-based plasmids containing the candidate DUB NESs were transfected into HeLa cells. Survivin NES and the empty pRev(1.4)–GFP were used as positive and negative controls respectively. At 24 h post-transfection, cells were treated for 3 h with either 10 μg/ml CHX (cycloheximide; Sigma) plus 5 μg/ml ActD (actinomycin D; Sigma) or with CHX alone. CHX is added to ensure that cytoplasmic GFP arises from nuclear export and not from new protein synthesis, whereas ActD allows the detection of weak NESs by preventing nuclear import mediated by Rev. The subcellular localization of the fluorescent proteins was determined in at least 200 cells per sample. Using the scoring system described in the original paper, the activity of the functional NESs was rated between 1+ and 9+. The CRM1-dependence of the functional NESs identified in the nuclear export assay was confirmed by adding LMB (6 ng/ml) to the transfected cells.
Sequence alignment and modelling
Multiple sequence alignment of DUB cNESs was carried out with ClustalW2, and edited to account for conserved hydrophobic positions.
The Figures showing the structure of USP21 bound to ubiquitin were prepared with PyMOL (http://www.pymol.org).
In silico prediction of candidate NESs in human DUBs
In order to identify sequence motifs that might mediate the binding of human DUBs to the export receptor CRM1 we took the two-step approach illustrated in Figure 1(A): in silico prediction of cNESs, followed by experimental testing of a subset of the predicted cNESs.
In silico prediction of cNESs in human DUBs
In the first step, the primary amino acid sequence of 85 human DUBs was analysed using three web-based NES prediction programs: NetNES, ELM and NES Finder. As summarized in Figure 1(B), and detailed in Supplementary Table S1 (at http://www.BiochemJ.org/bj/441/bj4410209add.htm), 75 cNESs were identified in the amino acid sequence of human DUBs using the NetNES program, 71 using the ELM program and 351 using the NES Finder program. In total, 428 different cNESs were identified in the analysis. Even considering that these programs use different approaches for NES prediction (see the Experimental section), the limited degree of coincidence in the cNESs predicted in human DUBs by NetNES, ELM and NES Finder was remarkable. As an example of this lack of agreement, Figure 1(C) shows the six different cNESs predicted in USP1. The amino acid motifs within this protein identified as cNESs by the different programs did not overlap. Only 63 of the 428 cNES motifs identified were predicted by more than one program, and only six were predicted by the three programs.
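The counts quoted above are mutually consistent, which can be checked with a short inclusion-exclusion calculation. The sketch below uses only the numbers given in the text; the variable names are ours, and the breakdown into motifs predicted by exactly two programs is derived rather than reported.

```python
# Per-program predictions quoted in the text
netnes, elm, nes_finder = 75, 71, 351
unique_total = 428      # different cNESs overall
multi_program = 63      # predicted by more than one program
all_three = 6           # predicted by all three programs

# Derived quantities (not stated explicitly in the text)
exactly_two = multi_program - all_three
single_program = unique_total - multi_program
total_calls = netnes + elm + nes_finder

# A motif found by exactly two programs is counted one extra time in the
# per-program totals, and a motif found by all three is counted two extra times.
redundant_calls = exactly_two + 2 * all_three

print(exactly_two, single_program, total_calls, redundant_calls)
assert total_calls - redundant_calls == unique_total
```

The derived figure of 365 cNESs predicted by a single program agrees with the value listed for that category in Table 1.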
Experimental testing of DUB cNESs
To evaluate the activity of cNESs, we used the pRev(1.4)–GFP-based nuclear export assay. Rev(1.4)–GFP is a chimaeric protein resulting from the fusion of an export-deficient mutant of the HIV Rev protein and GFP, which localizes to nucleoli. In the assay, active NESs are identified on the basis of their ability to mediate export of Rev(1.4)–GFP to the cytoplasm. ActD, which disrupts nucleoli and blocks nuclear import mediated by the Rev NLS, can be added to reveal the activity of weaker NESs. Besides identifying active export sequences, the assay allows for the comparison of the level of activity of the different NESs, based on the degree of cytoplasmic accumulation of the Rev(1.4)-NES–GFP chimaeric proteins in the presence/absence of ActD.
The high number of candidate DUB NESs predicted in silico made it impractical to attempt the functional testing of all of them. Thus we aimed to test the 63 candidate NESs predicted by more than one program (Figure 1B), and a subset of the cNESs predicted by single programs. This subset included, on one hand, the 16 sequences predicted only by NetNES with a ‘NES score’ higher than 0.7. On the other hand, we selected the first 16 sequences predicted only by ELM, and the first 16 sequences predicted only by NES Finder from the list of predicted DUB cNESs (Supplementary Table S1). In addition, the recent report that USP10 may undergo CRM1-dependent nuclear export prompted us to test three cNESs predicted by the NES Finder program in this protein.
Despite repeated attempts, four sequences could not be cloned in the pRev(1.4)–GFP vector and, thus, the final number of DUB cNESs tested was 110. Figure 2 shows representative examples of the results obtained in the export assay. The pRev(1.4)–GFP empty vector was used as a negative control, and a NES identified previously in Survivin was included as a positive control. As shown in Figure 2, the DUB cNESs FIN6, ELM33 and ELM3, but not FIN1, were capable of relocating Rev(1.4)–GFP to the cytoplasm. This cytoplasmic relocation was efficiently reverted by the specific CRM1 inhibitor LMB. These results, therefore, validated the sequences FIN6, ELM33 and ELM3 as active CRM1-dependent NESs. Using the nuclear export assay scoring system, ELM3 was assigned the highest export activity (9+), whereas FIN6 (3+) and ELM33 (7+) were weaker NESs in this experimental setting. Supplementary Table S2 (at http://www.BiochemJ.org/bj/441/bj4410209add.htm) provides a detailed account of the results obtained with the 110 DUB cNESs tested, including the export activity score of those sequences that tested positive in the assay. In total, 32 of the sequences tested were validated as functional NESs, whereas 78 sequences (including the three USP10 cNESs) were inactive and therefore represent non-functional NES-like motifs.
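The bookkeeping behind the figure of 110 tested sequences, and the hit rates reported in Table 1, can be reproduced with a few lines of arithmetic. The sketch below uses only counts stated in the text and in Table 1; the variable names and the helper `hit_rate` are ours, and treating the three USP10 motifs as single-program predictions is an assumption based on the description above.

```python
# Candidates selected for testing, as described in the text
multi_program = 63
netnes_only = 16
elm_only = 16
nes_finder_only = 16
usp10_extra = 3        # assumed here to be NES Finder-only predictions
not_cloned = 4

selected = multi_program + netnes_only + elm_only + nes_finder_only + usp10_extra
tested = selected - not_cloned

# Outcome of the export assay
active, inactive = 32, 78

# Table 1: positive/tested per category
positive_multi, tested_multi = 24, 61
positive_single, tested_single = 8, 49


def hit_rate(positive: int, n_tested: int) -> float:
    """Percentage of tested sequences that were positive, to one decimal place."""
    return round(100 * positive / n_tested, 1)


print(selected, tested)
print(hit_rate(positive_multi, tested_multi), hit_rate(positive_single, tested_single))
```

Under these assumptions, the four sequences that could not be cloned split evenly between the multiple-program category (63 selected, 61 tested) and the single-program category (51 selected, 49 tested).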
Experimental testing of DUB cNESs using the Rev(1.4)–GFP nuclear export assay in HeLa cells
Accuracy of NES prediction programs
Table 1 summarizes the results of the export assay in relation to the program(s) that predicted each cNES. Of note, the percentage of positive hits among the 110 cNESs tested was considerably higher for those cNESs predicted by multiple programs than for those predicted by a single program (39.3% compared with 16.3%).
| Program(s) | Number of cNESs predicted | Export assay results: positive/tested (%) | Estimated total positive |
| --- | --- | --- | --- |
| Total multiple programs | 63 | 24/61 (39.3) | 24 |
| Total single program | 365 | 8/49 (16.3) | 32 |
In order to use these results to compare the accuracy of the prediction programs, we first extrapolated the percentage of experimentally validated NESs in each category to estimate the number of positive hits expected if the 428 predicted cNESs had been tested (‘Estimated total positive’ column in Table 1). Next, a two-by-two contingency matrix was generated for each program, with four possible outcomes for a given sequence (Figure 3A). A sequence predicted as a cNES by the program constitutes a TP (true positive) if it was positive in the export assay, and a FP (false positive) if it tested negative in the export assay. A sequence not predicted as a cNES by the program (but predicted as a cNES by at least one of the other two programs) constitutes a TN (true negative) if it was negative in the export assay and a FN (false negative) if it tested positive in the export assay. The sensitivity, specificity and PPV (positive predictive value) of each program were calculated as indicated in Figure 3(A). It must be noted that, in order to calculate these accuracy metrics, we needed to make the assumption that every potential NES in the amino acid sequence of human DUBs was predicted as a cNES by at least one of the programs (i.e. that all of the possible TNs and FPs were taken into account).
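The accuracy metrics named above can be written out as a small sketch. It assumes the standard definitions of sensitivity, specificity and PPV, which we take to be those indicated in Figure 3(A); the class name `ContingencyMatrix` and the example counts are hypothetical and do not correspond to any of the matrices in Figure 3(B).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyMatrix:
    """Two-by-two outcome counts for one prediction program."""
    tp: int  # predicted by the program, positive in the export assay
    fp: int  # predicted by the program, negative in the export assay
    tn: int  # not predicted by the program, negative in the export assay
    fn: int  # not predicted by the program, positive in the export assay

    def sensitivity(self) -> float:
        # Fraction of functional NESs that the program predicted
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Fraction of non-functional motifs that the program did not predict
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Fraction of the program's predictions that were functional
        return self.tp / (self.tp + self.fp)


# Hypothetical counts, for illustration only
example = ContingencyMatrix(tp=20, fp=30, tn=300, fn=10)
print(round(example.sensitivity(), 3))
print(round(example.specificity(), 3))
print(round(example.ppv(), 3))
```

Because TNs and FNs are only defined here among sequences predicted by at least one of the other programs, the resulting values are estimates that depend on the assumption stated above.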
Accuracy of the NetNES, ELM and NES Finder programs for NES prediction
The contingency matrices for the NetNES, ELM and NES Finder programs are shown in Figure 3(B). The results included in these matrices are derived from the ‘cNESs predicted’ and ‘Estimated total positive’ columns of Table 1. The accuracy metrics of the three programs calculated from these matrices are shown in Table 2. According to our estimation, the ELM program demonstrated the highest PPV (38%), whereas the lowest PPV (10.8%) corresponded with the NES Finder program (Table 2).
Amino acid sequence features of DUB cNESs
In an attempt to gain further insight into the sequence determinants of CRM1-dependent NESs, we used sequence alignment methodology to examine the amino acid features of the 110 DUB cNESs tested in the light of the NES consensus redefined recently on the basis of novel structural information.
Alignment of those sequences that tested positive in the nuclear export assay (Figure 4A) highlighted some of the common features of CRM1-binding motifs described previously [27,30,35], including the presence of hydrophobic (frequently leucine) residues at four or five positions (Φ0–Φ4), and a prevalence of acidic residues in the vicinity of Φ0. Interestingly, our analysis revealed a further common characteristic of many DUB NESs: the hydrophobic nature of the residue preceding Φ2 in several of these sequences. We also noted that, although the presence of two residues between the Φ2 and Φ3 hydrophobic residues (Φ2-X2-Φ3 spacing) has been reported to be preferred, a significant proportion of validated DUB NESs showed a Φ2-X3-Φ3 spacing. In spite of sharing some common characteristics, five different subsets of positive DUB NESs with distinguishing features could be established. Thus 11 out of 32 positive NESs (P-I group in Figure 4A) strictly followed the Φ0–Φ4 consensus defined by Güttler et al., whereas a second group of ten sequences, also with a Φ0–Φ4 pattern (P-II), was characterized by an alternative Φ0-X3-Φ1 spacing instead of the reported Φ0-X2-Φ1. Seven other sequences formed a group of NESs lacking the leading hydrophobic residue Φ0 and therefore having a Φ1–Φ4 pattern (P-III). A single sequence, NET8, conformed to the Rev-type consensus (P-IV). Finally, two experimentally validated sequences, predicted only by the NetNES program, did not fit any of the consensus patterns mentioned above (P-V).
Multiple sequence alignment of DUB cNES motifs showing amino acid sequence features.
Not surprisingly, many of the predicted cNESs that tested negative in the export assay displayed several common NES features. In fact, 20 of the 78 non-functional NES-like motifs followed the Φ0–Φ4 consensus pattern, with either the Φ0-X2-Φ1 or the alternative Φ0-X3-Φ1 spacings (Figure 4B). Twelve of these sequences (N-I group) had proline residues between Φ1 and Φ3, which would prevent them from adopting the proper conformation for CRM1 binding [27,30], and would therefore explain their lack of export activity. However, there is no such obvious reason for the lack of activity of the other eight cNESs with a Φ0–Φ4 pattern (N-II).
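The kind of pattern inspection described in this section can be approximated with a regular expression. The sketch below is not the alignment procedure used in the study: it scans only for a Φ1-X3-Φ2-X2-Φ3-X-Φ4 core, where the Φ1-X3-Φ2 and Φ3-X-Φ4 spacings and the hydrophobic set (L, I, V, M, F) are our assumptions, and it ignores Φ0 and the alternative Φ2-X3-Φ3 spacing. It then flags any proline between Φ1 and Φ3. The function name `scan_nes_core` and the proline-containing test sequence are hypothetical; the two USP21 motifs are those quoted later in the text.

```python
import re
from typing import Dict, List

HYDROPHOBIC = "LIVMF"  # assumed hydrophobic set, not taken from the text

# Lookahead so that overlapping candidate cores are all reported
CORE = re.compile(
    rf"(?=([{HYDROPHOBIC}].{{3}}[{HYDROPHOBIC}].{{2}}[{HYDROPHOBIC}].[{HYDROPHOBIC}]))"
)


def scan_nes_core(sequence: str, first_residue: int = 1) -> List[Dict]:
    """Report every Phi1-X3-Phi2-X2-Phi3-X-Phi4 core found in the sequence."""
    hits = []
    for match in CORE.finditer(sequence):
        i = match.start()
        offsets = (0, 4, 7, 9)  # Phi1, Phi2, Phi3, Phi4 relative to the core start
        hits.append({
            "core": match.group(1),
            "phi_positions": tuple(first_residue + i + o for o in offsets),
            # Residues strictly between Phi1 and Phi3
            "proline_between_phi1_phi3": "P" in sequence[i + 1:i + 7],
        })
    return hits


elm20 = scan_nes_core("ELGAALSRLALRPEPPTLR", first_residue=134)
net20 = scan_nes_core("DAQEFLKLLMERLHLEINR", first_residue=301)
with_proline = scan_nes_core("LAAPLSRLAL")  # hypothetical sequence
print(elm20)
print(net20)
print(with_proline)
```

For the two USP21 motifs, the Φ3 and Φ4 positions reported by this sketch coincide with the residues mutated in the L142/144A and L313/315A constructs, and the Φ2 position found in NET20 is Met310.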
A subset of human DUBs bears active NESs
The 32 experimentally validated cNESs identified in our assay reveal a subset of 22 human DUBs bearing amino acid motifs with the potential to mediate their binding to the export receptor CRM1. Figure 5 shows the location of these motifs in the corresponding DUBs. A single NES was identified in most DUBs, but two large proteins, USP9X and USP24, were found to bear four and five NESs respectively. This observation is in line with previous findings that several independent NESs may contribute to the nuclear export of large proteins such as FANCA (Fanconi anaemia protein A) or pericentrin [36,37].
Human DUBs bearing active NESs
Several DUB NESs were located near the N-terminal (e.g. USP1 and USP15) or C-terminal (e.g. USP7 and OTUD4) ends of the protein. Other NESs, such as those identified in OTUD7A and OTUD7B, were located in the middle of the protein. The location of these sequences within the full-length protein may be a relevant feature, as it has been proposed that NESs located near the N- or C-terminal ends of a protein are more likely to be operational in their physiological context [28,30].
The presence of one or more active NESs in their primary sequence suggests that some of these DUBs may undergo active CRM1-mediated nuclear export. It is important to note, however, that the activity of the reported NESs, thus far validated in the context of the Rev(1.4)–GFP chimaeric protein, needs to be assessed in the context of the full-length DUB, to determine their physiological importance.
Effect of CRM1 inhibition on the localization of NES-bearing DUBs
In order to begin addressing the potential role of these sequences in the context of their cognate full-length proteins, we investigated the effect that inhibition of the CRM1-mediated export pathway has on the nucleocytoplasmic distribution of several NES-bearing human DUBs.
Given the limited availability of DUB-specific commercial antibodies validated for immunofluorescence, we decided to use GFP- or YFP-tagged versions of USP1, USP3, USP7, USP21, CYLD and OTUD7B to determine their nucleocytoplasmic distribution in transfected cells. As shown in Figure 6(A), GFP–USP1, YFP–USP3 and YFP–USP7 showed a prominent nuclear localization, whereas YFP–USP21, GFP–CYLD and GFP–OTUD7B were located predominantly in the cytoplasm. These results are consistent with previous reports describing the steady-state localization of some of these proteins [38–42]. Incubation with the specific CRM1 inhibitor LMB for 3 h did not alter the distribution of the nuclear DUBs or that of GFP–CYLD, which remained in the cytoplasm. In contrast, LMB induced a clear relocation of the other two cytoplasmic DUBs, YFP–USP21 and GFP–OTUD7B, to the nucleus, revealing their ability to shuttle between nucleus and cytoplasm, and indicating that CRM1-mediated export plays a role in maintaining the cytoplasmic localization of these two DUBs. A time-course, semi-quantitative analysis (Figure 6B) showed that YFP–USP21 displayed a more rapid and pronounced response to LMB than GFP–OTUD7B. This observation, and the findings reported previously that USP21 may deubiquitinate both nuclear and cytoplasmic substrates [13,14] prompted us to select this protein for further analysis.
Effect of CRM1 inhibition on the localization of NES-bearing human DUBs
CRM1-dependent nucleocytoplasmic shuttling of USP21 is mediated by its N-terminal NES
We aimed to identify the sequence determinants of USP21 nuclear export, focusing on the two active NES motifs identified in this protein by our global survey. As shown in Figure 7(A), one of these motifs (ELM20) is located close to the N-terminal end of the protein, whereas the second one (NET20) is located in the middle of the protein, embedded within its catalytic USP domain. The activity score of these motifs in the nuclear export assay was 4+ and 9+ respectively. Mutant versions of these motifs were generated, where the Φ3 and Φ4 residues were changed to alanine, and tested in the export assay (Figure 7B). The results confirmed that these mutations effectively abrogate the export activity of ELM20 and NET20 motifs (Figure 7B). Next, the corresponding point mutations (L142/144A to inactivate ELM20, and L313/315A to inactivate NET20) were separately introduced into full-length YFP–USP21 using site-directed mutagenesis (Figure 7C), and the nucleocytoplasmic localization of the mutant proteins was examined. As shown in Figure 7(C), the YFP–USP21(L142/144A) mutant displayed a predominantly nuclear localization, whereas the YFP–USP21(L313/315A) mutant was located in the cytoplasm, like the wild-type protein.
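The effect of the two double mutations on the motif sequences can be written out explicitly. The sketch below is illustrative only: the helper `apply_substitutions` is hypothetical, the shorthand L142/144A is expanded into the individual substitutions L142A and L144A, and the motif sequences and numbering are those given in the text (ELM20, residues 134–152; NET20, residues 301–319).

```python
import re
from typing import List


def apply_substitutions(motif: str, first_residue: int, substitutions: List[str]) -> str:
    """Apply point substitutions such as 'L142A' to a motif whose first
    residue carries the given protein residue number."""
    residues = list(motif)
    for sub in substitutions:
        parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if parsed is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, number, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
        index = number - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"residue {number} lies outside the motif")
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at {number}, found {residues[index]}")
        residues[index] = replacement
    return "".join(residues)


elm20_mutant = apply_substitutions("ELGAALSRLALRPEPPTLR", 134, ["L142A", "L144A"])
net20_mutant = apply_substitutions("DAQEFLKLLMERLHLEINR", 301, ["L313A", "L315A"])
print(elm20_mutant)
print(net20_mutant)
```

The wild-type check confirms that positions 142, 144, 313 and 315 are all leucines in the quoted motifs, as expected for the Φ3 and Φ4 residues targeted by the mutagenesis.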
The N-terminal NES is necessary and sufficient for YFP–USP21 export
These results indicate that the N-terminal NES motif of USP21 (E134LGAALSRLALRPEPPTLR152) is both necessary and sufficient to ensure CRM1-mediated export of the protein, and represents a physiologically relevant NES. To also seek potential determinants of its nuclear entry, we examined the USP21 sequence for cNLSs (candidate nuclear localization signals) using the PSORTII and cNLS Mapper programs. Each program identified one cNLS near the N-terminal end of USP21 (Supplementary Figure S1 at http://www.BiochemJ.org/bj/441/bj4410209add.htm). However, deletion of a 70-amino-acid region encompassing both cNLSs reduced, but did not abrogate, LMB-induced nuclear entry of YFP–USP21 (71–565). This result suggests that, unlike its NES-mediated nuclear export, nuclear import of USP21 may be mediated by multiple sequence determinants. These may include N-terminal basic NLSs, but also potential binding domains for other NLS-containing proteins that may contribute to USP21 nuclear entry through a ‘piggyback’ mechanism.
Human DUBs are emerging as critical regulators of several important cellular processes [3–6], and potential therapeutic targets [11,12]. However, many aspects of DUB biology remain largely unknown, including the mechanisms that regulate their subcellular localization. As a first step towards characterizing these mechanisms, we report here the results of a family-wide survey of active CRM1-dependent NESs in human DUBs.
Using three web-based prediction programs to analyse the primary amino acid sequence of 85 human DUBs, we identified 428 sequence motifs as candidate NESs, 110 of which were subsequently tested using a nuclear export assay. To our knowledge, the present study represents the first functional analysis of a relatively large series of candidate NESs in a protein family. Of the cNESs tested, 32 of the 110 were active in the assay, whereas 78 sequences were inactive. Three of the candidate NES motifs found to be inactive (ELM4, FIN15 and NET37) encompassed sequences tested previously by other groups [17,18]. Remarkably, whereas the lack of activity of ELM4 and NET37 reported in the present paper was consistent with previous findings, a short amino acid sequence (VEVYLLELKL) included within the FIN15 motif was reported previously to be a weak, but active, NES.
Only 63 of the 428 cNES motifs identified in silico were predicted by more than one program, illustrating the serious challenge of bioinformatic NES identification. In this regard, the performance of currently available NES prediction programs (NetNES, ELM and NES Finder) has never been, as far as we know, prospectively compared. We carried out such a comparison by calculating the sensitivity, specificity and PPV of these programs. It is important to note that these accuracy metrics, although useful, have intrinsic limitations, and also that several assumptions needed to be made in order to calculate them. Therefore the values obtained should be regarded as estimates. With these caveats in mind, our analysis points to ELM as the best-performing tool for NES prediction among the three tested in the present paper. Even when using ELM, however, our estimation is that fewer than 40% of the predicted cNESs are active in a functional assay. Interestingly, we noticed a higher percentage of positive hits among the cNESs predicted by multiple programs than among those predicted by a single program. Thus combining the results of different prediction programs might be a simple approach to prioritize for testing those cNESs most likely to be functional.
The difficulty in identifying CRM1-dependent NESs in silico stems largely from the ability of the receptor to accommodate a large variety of sequences into its NES-binding hydrophobic groove. A recent study has led to the revision of the NES consensus motif to include five hydrophobic residues (Φ0–Φ4 pattern). To gain further understanding of the characteristics of CRM1-dependent NESs, we examined the amino acid features of the novel set of 32 active NESs and 78 inactive NES-like motifs identified in human DUBs.
Consistent with the redefined consensus, most experimentally validated DUB NESs (21 out of 32) fit a Φ0–Φ4 pattern. Importantly, nearly half of the active NESs with this pattern showed a three-amino-acid spacing between residues Φ0 and Φ1. This Φ0-X3-Φ1 spacing would be compatible with the conformation adopted by the NES sequences from snurportin and PKI, and might even favour a more canonical geometry of the first turn of the α-helix. Therefore we propose that the possibility of a Φ0-X3-Φ1 spacing should be included in the NES consensus. Further illustrating the ability of CRM1 to bind a wide range of sequences, 11 experimentally validated NESs failed to fit the Φ0–Φ4 consensus, having instead a Φ1–Φ4 pattern or a Rev-type pattern, or failing to resemble any of these patterns. Conversely, several cNESs bearing a Φ0–Φ4 pattern tested negative. The inactivity of some of these motifs could be ascribed to the presence of proline residues between Φ1 and Φ3, as reported previously [27,30]. We speculate that the lack of activity of other NES-like motifs could be due to a combination of several ‘unfavourable’ features, such as having a suboptimal set of hydrophobic residues, and/or lacking acidic amino acids in the vicinity of Φ0. The 78 experimentally confirmed inactive NES-like motifs may be useful as a ‘negative control group’ in the future refining of NES prediction algorithms.
The identification, in a subset of human DUBs, of sequence motifs that might bridge their interaction with CRM1 raises the possibility that these enzymes may be actively exported to the cytoplasm by this receptor. To begin addressing this possibility, we evaluated the nucleocytoplasmic distribution of six GFP- or YFP-tagged human DUBs in the presence or absence of the specific CRM1 inhibitor LMB. USP1, USP3 and USP7 showed a prominently nuclear localization in both untreated and LMB-treated cells, whereas CYLD remained in the cytoplasm even in the presence of LMB. Although these observations do not necessarily rule out the possibility that these DUBs are exported by CRM1, they suggest that other mechanisms may be more relevant in determining their final nucleocytoplasmic distribution. These mechanisms may include rapid nuclear import of USP1, USP3 and USP7, and cytoplasmic retention of CYLD, probably mediated by its B box domain. Since human DUBs are known to be part of multiprotein complexes, retention in the nucleus or cytoplasm by interacting partners might be a common factor modulating their localization.
Importantly, LMB induced the nuclear relocation of YFP–USP21 and, to a lesser extent, of GFP–OTUD7B, thus identifying them as novel CRM1-dependent shuttling proteins. In particular, the rapid and pronounced nuclear accumulation of YFP–USP21 suggests that CRM1-dependent shuttling may be a relevant regulatory mechanism for this protein, facilitating the dynamic access of this enzyme to its nuclear and cytoplasmic substrates [13,14]. Two USP21 sequence motifs (ELM20, E134LGAALSRLALRPEPPTLR152 and NET20, D301AQEFLKLLMERLHLEINR319) tested positive in the nuclear export assay. By using site-directed mutagenesis in the context of the full-length protein, we show that the ELM20 motif constitutes a physiologically relevant NES, whose mutational inactivation disrupts USP21 nuclear export. In contrast, mutation of the NET20 motif, which is embedded within the USP21 catalytic USP domain, does not alter USP21 localization. The finding that the NET20 motif appeared to be irrelevant for the export of full-length USP21 was surprising, given the high activity score of this motif in the export assay. This experimental observation, however, can be easily reconciled with the available crystal structure of the USP21 USP domain bound to ubiquitin (PDB code 3I3T). As shown in Figure 7(D), the region corresponding to NET20 (residues 301–319) is relatively buried in the structure and partly occluded by the ubiquitin moiety. More importantly, as illustrated in Figure 7(E), several residues that should make contacts with CRM1 (such as Met310, corresponding to the Φ2 position of the NES consensus) are engaged in packing interactions within the hydrophobic core of the protein. Furthermore, interaction between USP21 Glu304 and ubiquitin Arg72 has recently been shown to be essential for USP21-mediated deubiquitylation. These interactions would, therefore, render the NET20 motif unable to mediate USP21/CRM1 binding.
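The residue numbering used above can be checked against the motif sequences with a short Python sketch. The sequences and the first and last residue numbers are taken from the text; the helper names are hypothetical, and the set of residues treated as hydrophobic (L, I, V, M and F) is an assumption made for illustration rather than the definition used in the present study.

```python
# Motif sequences and flanking residue numbers as quoted in the text.
MOTIFS = {
    "ELM20": ("ELGAALSRLALRPEPPTLR", 134, 152),
    "NET20": ("DAQEFLKLLMERLHLEINR", 301, 319),
}

# Assumption: residues counted as hydrophobic for this illustration only.
HYDROPHOBIC = set("LIVMF")


def residue_at(sequence: str, first_residue: int, position: int) -> str:
    """Return the one-letter code at a given residue number of the protein."""
    index = position - first_residue
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} lies outside the motif")
    return sequence[index]


def hydrophobic_positions(sequence: str, first_residue: int) -> list:
    """List (residue, number) pairs for the hydrophobic residues in a motif."""
    return [
        (aa, first_residue + i)
        for i, aa in enumerate(sequence)
        if aa in HYDROPHOBIC
    ]


net20_seq, net20_start, net20_end = MOTIFS["NET20"]

# The two NET20 residues discussed in the text.
phi2_residue = residue_at(net20_seq, net20_start, 310)       # Met310
ubiquitin_contact = residue_at(net20_seq, net20_start, 304)  # Glu304

net20_hydrophobic = hydrophobic_positions(net20_seq, net20_start)
```

The sketch only confirms that the quoted sequences are consistent with the residue numbers given in the text (for example, that position 310 of NET20 is a methionine); it does not assign the remaining Φ positions, which would require the full consensus definition.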
Although there is no structural information available for the N-terminal region of USP21 where the ELM20 sequence resides, the results of the present study suggest that this NES is exposed and available for CRM1 binding. These observations underscore the need to experimentally validate, on a case-by-case basis, the role that the NES motifs identified by our survey play in the localization of each DUB. It is conceivable that some of these motifs, like NET20, may be masked or fail to acquire the proper conformation for CRM1 binding in their original protein context. Others, like ELM20, will constitute physiologically relevant NESs.
In summary, the results of the present study provide novel basic information relevant to the field of NES definition and prediction, and offer a starting point for the analysis of the mechanisms that regulate the nucleocytoplasmic distribution of human DUBs.
GFP, green fluorescent protein
NES, nuclear export sequence
NLS, nuclear localization signal
PPV, positive predictive value
YFP, yellow fluorescent protein
Iraia García-Santisteban participated in the design of the study and performed experiments. Sonia Bañuelos analysed and interpreted the data, and contributed to the writing of the paper. Jose Antonio Rodríguez participated in the design of the study, performed experiments, analysed and interpreted the data, and wrote the paper.
We thank Ana Zubiaga for her continuous support and encouragement, and J.L. Zugaza for critical comments on the paper. We appreciate the generous gift of plasmids from the following investigators: Dr Beric Henderson (University of Sydney, Sydney, Australia), Dr Rene Bernards (Netherlands Cancer Institute, Amsterdam, The Netherlands), Dr Pier Paolo di Fiore (University of Milan, Milan, Italy), Dr Roger Everet (University of Glasgow, Glasgow, U.K.), Dr David Barford (Institute of Cancer Research, London, U.K.), Dr Paul C. Evans (Imperial College London, London, U.K.) and Dr John W. Harper (Harvard Medical School, Boston, U.S.A.). We appreciate the technical support by the staff from the High Resolution Microscopy Facility (Sgiker-UPV/EHU) and the DNA Sequencing Facility (Sgiker-UPV/EHU).
This work was supported by the Basque Government Department of Industry [grant numbers SAIOTEK S-PE07UN17, S-PE09UN65 and ETORTEK BioGUNE2010 (to J.A.R.)] and the Spanish Government MICINN [grant number BFU2009-13245 (to J.A.R.)], and a fellowship from the Department of Education of the Basque Government (to I.G.-S.).