Post-translational modifications (PTMs) add regulatory features to proteins that help establish the complex functional networks that make up higher organisms. Advances in analytical detection methods have led to the identification of more than 200 types of PTMs. However, some modifications are unstable under the present detection methods, anticipating the existence of further modifications and a much more complex map of PTMs. An example is the recently discovered protein modification polyphosphorylation. Polyphosphorylation is mediated by inorganic polyphosphate (polyP) and represents the covalent attachment of this linear polymer of orthophosphate to lysine residues in target proteins. This modification has eluded MS analysis as both polyP itself and the phosphoramidate bonds created upon its reaction with lysine residues are highly unstable in acidic conditions. Polyphosphorylation detection was only possible through extensive biochemical characterization. Two targets have been identified: nuclear signal recognition 1 (Nsr1) and its interacting partner, topoisomerase 1 (Top1). Polyphosphorylation occurs within a conserved N-terminal polyacidic serine (S) and lysine (K) rich (PASK) cluster. It negatively regulates Nsr1–Top1 interaction and impairs Top1 enzymatic activity, namely relaxing supercoiled DNA. Modulation of cellular levels of polyP regulates Top1 activity by modifying its polyphosphorylation status. Here we discuss the significance of the recently identified new role of inorganic polyP.
Biological systems often use simple building blocks to create new functions. The two major macromolecular components of any living organism are the polymeric nucleic acids (DNA and RNA) and proteins. The most structurally simple but surprisingly also the least characterized polymer in biological systems is inorganic polyphosphate (polyP) [2–4]. This polymer is composed of a linear chain of tens to hundreds of orthophosphate residues linked by high-energy phosphoanhydride bonds. PolyP has been found in extreme conditions such as volcanic condensate and therefore may have played a role in prebiotic evolution. Perhaps for this reason it was considered for many years a ‘molecular fossil’ with no obvious functions. However, the presence of this polymer in all three kingdoms of life (bacteria, archaea and eukaryotes), together with the diverse enzymology regulating its metabolism, supports its biological importance. Owing to its polymeric nature, polyP acts as a phosphate buffer that can be elongated or degraded to balance the cell's need for free phosphate. Furthermore, the chelating property of polyP regulates cation homeostasis. Today, polyP's physiological importance is well documented. In mammals, polyP seems to play a role in blood coagulation [6–8]. In bacteria, polyP has chaperone-like activity, and regulates stress response, ion channels [10,11] and infectivity, as demonstrated for Pseudomonas aeruginosa [10,11,12] and Helicobacter pylori. In trypanosomes, polyP metabolism has recently been characterized and, as in pathogenic bacteria, trypanosomes with reduced polyP levels are significantly less virulent. In Saccharomyces cerevisiae, polyP levels are regulated by the levels of inositol pyrophosphates through an unknown mechanism, and polyP was recently shown to regulate cellular functions by covalently attaching to proteins in a post-translational mechanism called protein polyphosphorylation.
From pyrophosphorylation to polyphosphorylation
Protein polyphosphorylation was discovered serendipitously by us while studying protein pyrophosphorylation. Previously, we demonstrated in vitro that the inositol pyrophosphate IP7 (or PP-IP5; diphosphoinositol pentakisphosphate) could transfer its β-phosphate to a pre-phosphorylated serine residue, generating a serine attached to two phosphate groups [16,17], hence the term protein pyrophosphorylation. When heterologously expressed in yeast, human target proteins of pyrophosphorylation showed a gel mobility shift on SDS/PAGE, but no direct evidence of this modification was obtained in vivo. In an attempt to directly detect pyrophosphorylation, we focused on the yeast target nuclear signal recognition 1 (Nsr1), which is involved in rRNA biogenesis and shuttles between the cytoplasm and the nucleus [19,20]. It contains a polyacidic serine (S) and lysine (K) rich (PASK) cluster possessing a total of 65 serine residues. When yeast protein extracts were analysed by SDS/PAGE, an interesting scenario emerged: in a mutant depleted of IP7 (kcs1Δ), Nsr1 migrated as a sharp band of similar size to its predicted molecular mass of 67 kDa; however, in wild-type (WT) yeast extracts, it had a smeary appearance, migrating from ∼110 kDa to ∼180 kDa. Moreover, in a yeast mutant with elevated levels of IP7 (vip1Δ), Nsr1 showed an even more retarded mobility. This result was rather puzzling considering that the mobility shift observed for phosphorylated targets is normally modest. Even imagining all 65 serine residues within the PASK domain to be pyrophosphorylated, it was still difficult to envisage such a large shift in Nsr1 mobility. As mentioned above, there is a direct correlation between the metabolic levels of inositol pyrophosphate and polyP: kcs1Δ has reduced polyP levels, whereas vip1Δ has increased polyP compared to WT. With this in mind, we hypothesized that the dramatic mobility shift observed for these proteins might not be mediated by IP7 but by polyP.
In this case, instead of Nsr1 being simply pyrophosphorylated, chains of inorganic phosphate would be added to the protein (Figure 1). To investigate this hypothesis, we looked at Nsr1 mobility in polyP metabolism mutants. Vtc4 is the yeast enzyme responsible for polyP synthesis. The vtc4Δ yeast knockout is devoid of polyP, but its IP7 levels are the same as in WT. Nsr1 mobility was even slightly faster in vtc4Δ than in kcs1Δ, which provided the first evidence that polyP, not IP7 directly, was the molecule responsible for the Nsr1 mobility shift. Indeed, when polyP was added to vtc4Δ protein extracts or to purified unmodified Nsr1, a mobility shift became evident. The highly charged nature of polyP could induce the formation of tightly bound ionic complexes with proteins. Given the polyacidic, negatively charged nature of the N-terminus of the PASK domain, this type of ionic interaction with polyP is unlikely; in fact, protein extraction under harsh denaturing conditions preserved the mobility shift, suggesting that polyphosphorylation is a covalent PTM. However, it is possible that not all polyP–protein interactions are covalent; proteins with a high isoelectric point or containing basic domains could establish electrostatic intermolecular interactions with the negatively charged polyP.
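The reasoning above, that pyrophosphorylating all 65 serine residues could not account for the observed shift, can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes each phosphate unit adds the mass of HPO3 (about 79.98 Da) and it takes the SDS/PAGE apparent masses at face value, which is a simplification because highly charged modifications migrate anomalously. The function and variable names are hypothetical.

```python
# Order-of-magnitude check only; an SDS/PAGE apparent mass is not a true mass.
HPO3_DA = 79.98  # assumed mass added per phosphate unit (HPO3), in daltons

PREDICTED_KDA = 67.0            # predicted Nsr1 mass (from the text)
APPARENT_KDA = (110.0, 180.0)   # apparent range in WT extracts (from the text)
PASK_SERINES = 65               # serine residues in the PASK cluster (from the text)


def added_mass_kda(n_sites: int, phosphates_per_site: int) -> float:
    """Mass in kDa added by attaching phosphate units to n_sites residues."""
    return n_sites * phosphates_per_site * HPO3_DA / 1000.0


# Upper bound for pyrophosphorylation: every serine carries two phosphates.
pyro_kda = added_mass_kda(PASK_SERINES, 2)

# Apparent shift that needs explaining, taken at face value.
shift_kda = tuple(a - PREDICTED_KDA for a in APPARENT_KDA)

# Phosphate units implied if the shift were literally added mass.
units_needed = tuple(round(s * 1000.0 / HPO3_DA) for s in shift_kda)

print(f"maximal pyrophosphorylation mass: {pyro_kda:.1f} kDa")
print(f"apparent shift: {shift_kda[0]:.0f} to {shift_kda[1]:.0f} kDa")
print(f"phosphate units implied: {units_needed[0]} to {units_needed[1]}")
```

Under these assumptions, full pyrophosphorylation adds roughly 10 kDa, well short of the apparent shift of more than 40 kDa, which is consistent with the attachment of long phosphate chains rather than single extra phosphates.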
Figure 1. Model of two independent PTMs, serine-pyrophosphorylation and lysine-polyphosphorylation
Polyphosphorylation of topoisomerase affects its activity
Nsr1 is involved in the processing and maturation of ribosomal subunits, though its exact molecular role is not fully understood. However, some of its interacting partners are known. Nucleolin, the human orthologue of Nsr1, has been shown to bind to topoisomerase 1 (Top1), an enzyme that relaxes supercoiled DNA by creating a single-strand nick, enabling resolution of supercoils that arise during DNA transcription and replication. This interaction is conserved in yeast; Nsr1 and Top1 interact through their N-terminal regions, which in Top1, as in Nsr1, is highly acidic. In fact, the Top1 PASK domain was shown to be a target of polyphosphorylation. When both proteins are polyphosphorylated, they fail to interact. Under WT conditions both proteins show a dispersed nuclear localization, which becomes more nucleolar in conditions where polyphosphorylation does not occur.
Charge-wise, DNA and RNA, like polyP, are highly negatively charged molecules. Mechanistically, Top1 transiently breaks the phosphodiester linkage in one DNA strand, covalently attaching a tyrosine residue from its active site to a DNA phosphate. The mechanism by which polyP is covalently attached to Top1 seems to bear some similarity to how Top1 binds to DNA. Polyphosphorylation results from a nucleophilic attack, in this case by a lysine residue, on an internal phosphate ester linkage of polyP. For this reason it is not difficult to envisage polyphosphorylation acting as a molecular switch, whereby nucleic acid interacting proteins would be prevented from binding to DNA/RNA when polyphosphorylated. When required, proteins would be depolyphosphorylated and associate with nucleic acids. In fact, we demonstrated that polyphosphorylated Top1 has no capacity to relax supercoiled DNA. What was not determined was whether the binding to DNA itself was also compromised. Interestingly, it has been known since the early 1940s that the nuclei of both yeast and mammalian cells are particularly enriched in non-histone proteins with polyacidic regions that contain serine and lysine residues and whose function is yet to be determined [24–28]. These regions seem to fall into the spectrum of putative polyphosphorylation targets containing the PASK cluster. Some examples are members of the nucleoplasmin family of molecular chaperones; the chromatin components HMG1 and 2; prothymosin; the transcription factor hUBF; and Nsr1 itself. Moreover, polyP has been shown to induce the destabilization of chromatin–protein complexes and the subsequent activation of transcription [30,31], establishing a link between these complexes and polyP. The molecular mechanism by which polyP does this is not understood, but it could involve polyphosphorylation of specific acidic protein complexes associated with chromatin.
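One way to shortlist putative targets of this kind is to scan protein sequences for windows that are simultaneously rich in acidic residues (D/E), serine and lysine. The following is a minimal sketch of such a scan. The window size and composition thresholds are arbitrary assumptions chosen for illustration, not a published definition of the PASK cluster; the function name and the toy sequence are hypothetical.

```python
from typing import List, Tuple

ACIDIC = set("DE")  # aspartate and glutamate, one-letter codes


def pask_like_windows(
    seq: str,
    window: int = 40,          # assumed window length
    min_acidic: float = 0.3,   # assumed minimum fraction of D/E
    min_ser: float = 0.2,      # assumed minimum fraction of S
    min_lys: float = 0.1,      # assumed minimum fraction of K
) -> List[Tuple[int, int]]:
    """Return merged half-open (start, end) spans of windows passing all thresholds."""
    seq = seq.upper()
    spans: List[Tuple[int, int]] = []
    for start in range(0, len(seq) - window + 1):
        w = seq[start:start + window]
        acidic = sum(c in ACIDIC for c in w) / window
        ser = w.count("S") / window
        lys = w.count("K") / window
        if acidic >= min_acidic and ser >= min_ser and lys >= min_lys:
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlaps or touches the previous hit: extend that span.
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((start, end))
    return spans


# Toy sequence (not a real protein): an acidic S/K-rich repeat flanked by alanines.
toy = "A" * 30 + "SDEKSDEESK" * 4 + "A" * 30
print(pask_like_windows(toy, window=20))
```

A scan like this only flags composition; whether a flagged region is actually polyphosphorylated would still need to be tested biochemically, for example by the mobility-shift assays described above.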
Polyphosphorylation targets lysine residues
Phosphorylation is mostly associated with serine, threonine and tyrosine residues, forming a P–O ester bond (O-phosphorylation). However, there are a few reports on the existence of P–N bonds, formed through the phosphorylation of basic residues such as lysine (N-phosphorylation). This is not surprising considering the physicochemical properties of lysine residues, which make lysine the amino acid subject to the greatest variety of PTMs. A few reported examples of phospholysine-containing proteins are a 100-kDa phosphoprotein identified in bovine liver cell extracts, histone H1, and non-histone acidic, highly phosphorylated and acid-resistant proteins detected in regenerating rat liver [34,35]. Histone H3, from quiescent rat endothelial cells, has also been shown to be phosphorylated at basic residues, but the exact chemical nature of these was not determined. Likewise, Nsr1 and Top1 were shown to be targeted on lysine residues by a modification that adds inorganic polyP (N-polyphosphorylation). The low number of known N-phosphorylation or polyphosphorylation targets reflects the fact that P–N bonds are acid-labile and therefore prone to spontaneous hydrolysis. This means that they are often overlooked under conventional phospho-analysis, and new methodologies need to be applied to understand the extent and real scope of these modifications.
N-phosphorylation generates high-energy species that could act as catalysis intermediates. In this case one could hypothesize that polyphosphorylated lysine residues could also act as intermediates for the phosphorylation of serine, glutamic or aspartic acid residues known to be abundant within the PASK domain. Alternatively, considering that the interacting partners Nsr1 and Top1 are both polyphosphorylated, one of the proteins could trans-polyphosphorylate the other.
Polyphosphorylation was shown to be non-enzymatic, resulting from a nucleophilic attack on an internal phosphate ester linkage rather than from attachment of the terminal phosphate. Its non-enzymatic nature raises questions of how this modification might be regulated. Although they are less well understood than enzymatic modifications, many non-enzymatic modifications exist. Important examples are protein nitrosylation and the covalent modification of specific cysteine or methionine residues by reactive oxygen species (ROS). Until a few years ago, ROS were considered dangerous by-products of the mitochondrial respiratory chain, but they are now known to be signalling molecules that non-enzymatically modify several amino acid side chains [39,40]. Lysine residues are also non-enzymatically acetylated by reactive acetyl–thiol compounds. Such non-enzymatic modifications are unlikely to be the consequence of random events, as they often depend on certain features: the structural determinants and half-lives of the target proteins, and the local environment. An example of how the local environment is essential is seen during S-nitrosylation, where the target proteins need to be in close proximity to the NO-producing enzyme. Similarly, polyphosphorylation could be directly influenced by the local environment, through the presence of enzymes that synthesize or hydrolyse polyP, or that depolyphosphorylate the proteins. Nsr1 and Top1, being nuclear proteins, could be modified by polyP in their vicinity. This nuclear polyP could either be synthesized by the vacuolar transport complex (VTC) known to be present at the nuclear membrane and/or could be acquired from the vacuole through nucleus–vacuole contact sites. Regulation could also occur at the hydrolysis level. PolyP phosphatases may regulate protein polyphosphorylation.
There are two classes of phosphatases able to hydrolyse polyP itself: the exopolyphosphatases, which remove one orthophosphate group from the polyP terminus; and the endopolyphosphatases, which cleave the polymer chain internally, generating smaller polyP species. The S. cerevisiae protein Ppx1, of which Prune is the mammalian homologue, represents the prototypical exopolyphosphatase and is considered the central enzyme of polyP metabolism. Yeast possesses at least two endopolyphosphatases: Ppn1, which requires proteolytic processing to become active, and Ddp1, a diphosphoinositol polyphosphate phosphohydrolase (DIPP) homologous to the mammalian Nudix hydrolases. Both recombinant Ppx1 and Ddp1 showed direct activity towards polyphosphorylated Nsr1 and Top1. Ppn1 seems to be a major polyphosphatase, but there must be other polyphosphatases since the SDS/PAGE mobility shift can be easily lost during the extraction procedures, even in the absence of Ppn1. All these phosphatases are also active against free polyP and it remains to be seen if there are specific lysine polyphosphatases. The existence of specific phospholysine phosphatases has been reported, the majority of which show general affinity for P–N bonds, with only a few showing specificity for phospholysine. One of the broad-spectrum lysine phosphatases is a 56-kDa protein from bovine liver and its human homologue LHPP (phospholysine phosphohistidine inorganic pyrophosphate phosphatase). These were shown to have inorganic pyrophosphatase, phosphohistidine phosphatase and phospholysine phosphatase activities [47–49]. Two further broad-spectrum phospholysine phosphatases, of 30 and 150 kDa, have been isolated from rat brain [50,51]. The 30-kDa phosphatase, unlike the 150-kDa one, shows greater specificity towards phospholysine, not being capable of catalysing hydrolysis of phosphohistidine, phosphoarginine, phosphocreatine or any phosphoesters.
Other more specific phospholysine phosphatases have been identified in different rat tissues, with the highest activity in the brain and spleen. In the future, it will be essential to determine whether any of these or other phosphatases are capable of hydrolysing polyphosphorylated proteins, and to study their localization and cellular regulation. Non-enzymatic hydrolysis of the polyP on target proteins should also be considered as an alternative model for how polyphosphorylation is regulated. The local environment could change rapidly in response to changes in basic metabolism. Since P–N bonds are acid-labile, localized acidification could contribute to the removal of polyP from polyphosphorylated proteins.
We propose that polyphosphorylation represents a widespread new regulatory mechanism for acidic proteins. Although polyphosphorylation is not confined to nuclear proteins (Azevedo and Saiardi, unpublished), the presence in the nucleus of a great number of acidic proteins with PASK-like domains suggests that it might add a further layer of regulation to nuclear signalling. In particular, polyphosphorylation might affect one of the most energy-consuming and tightly regulated processes in eukaryotic cells, rRNA biogenesis, where Nsr1 and Top1 play roles. Cell division or mass increase requires a great number of ribosomes. In eukaryotic cells, ribosome biogenesis adapts very rapidly to changes in the intracellular energy status and the environmental conditions. It involves the co-ordinated regulation of three RNA polymerases to produce four ribosomal RNAs and ∼80 ribosomal proteins, so it is not surprising that this is a tightly regulated process. Since polyP synthesis requires a large amount of ATP, the extent of protein polyphosphorylation might directly reflect the intracellular energetic levels. Despite the known roles of polyP in metazoans, the synthesis of polyP in these organisms remains a mystery, since no orthologues of either the bacterial or the yeast genes of polyP metabolism have been identified in their genomes. In the future, it will be important to identify this metabolic pathway in mammals and consequently establish the extent of protein polyphosphorylation in higher eukaryotes to further understand its biological role.
We thank Miranda Wilson and Thomas Livermore for careful reading of the manuscript.
This work was supported by the Medical Research Council (MRC) core support to the MRC/UCL Laboratory for Molecular Cell Biology University Unit [grant number MC_UU_1201814].
Inorganic Polyphosphate (polyP) Physiology: Held at Charles Darwin House, London, U.K., 7 September 2015.