The Tat (twin-arginine transport) pathway is a protein-targeting system dedicated to the transmembrane translocation of fully folded proteins. This system is highly prevalent in the cytoplasmic membranes of bacteria and archaea, and is also found in the thylakoid membranes of plant chloroplasts and possibly also in the inner membrane of plant mitochondria. Proteins are targeted to a membrane-embedded Tat translocase by specialized N-terminal twin-arginine signal peptides bearing an SRRXFLK amino acid motif. The genes encoding components of the Tat translocase were discovered approx. 10 years ago, and, since then, research in this area has expanded on a global scale. In this review, the key discoveries in this field are summarized, and recent studies of bacterial twin-arginine signal-peptide-binding proteins are discussed.
The efficient targeting of proteins to their sites of physiological function is an essential feature of all biological systems. The Tat (twin-arginine transport) system is a protein-targeting pathway found in all kingdoms of life, and the transport mechanism has been explored at the molecular level most extensively in bacteria and the chloroplasts of higher plants. The Tat system is also active in some archaea, and genetic analysis suggests that it is present in plant mitochondria. Moreover, although the Tat system is apparently not a feature of human physiology, recent genomic analysis points to the presence of this protein-targeting system in the mitochondria of at least one animal, albeit a sponge . Approx. 10 years ago, the discovery of the genes encoding this novel transport system [2,3], which was quickly followed by the verification, characterization, and naming of the ‘Tat pathway’ [4,5], generated much excitement because, unlike any other general protein transport pathway, this system has evolved for the transmembrane translocation of fully folded proteins across energy-transducing membranes.
In bacteria, the Tat pathway is central to a range of important cellular functions, since proteins targeted by this system have been shown to be involved in respiratory and photosynthetic electron-transfer chains, cell division and biogenesis of the cell envelope, quorum sensing and motility, symbiosis and pathogenesis [6,7]. In plant chloroplasts, the Tat system is equally important and is principally involved in the assembly of the oxygen-evolving complex and the cytochrome b6f complex within the thylakoid membrane [8,9]. The Tat pathway is clearly of prokaryotic origin, and recent breakthroughs have come from microbial research focused on the Gram-negative bacterium Escherichia coli, the Gram-positive bacterium Bacillus subtilis and members of the industrially and medically important actinomycetes. This review will attempt to integrate the key discoveries unearthed using these micro-organisms with the invaluable biochemical insights that are still being gained by studying Tat transport in plants, the original studies of which laid the foundations for what has become a huge global research effort.
The Tat protein transport system
In principle, all protein-targeting systems must rely on an amino acid ‘signal’ displayed by the passenger protein. Proteins exported by the Tat pathway are synthesized as precursors with N-terminal signal peptides bearing conserved SRRXFLK ‘twin-arginine’ amino acid sequence motifs. Some examples of twin-arginine signal peptides are given in Figure 1. All twin-arginine signal peptides have a common tripartite structure that includes a polar N-terminal (n-) region of variable length, a moderately hydrophobic (h-) region of 12–20 amino acids and a C-terminal (c-) region that often contains basic residues [10,11] (Figure 1A). The consensus motif is always located at the junction between the n- and h-regions. Although there are a handful of exceptions (see, e.g., [9,12,13]), the twin-arginine dipeptide is extremely well conserved and is central to the efficient operation of Tat signal peptides. This sequence conservation, when put in context with the more general tripartite structure of the signal peptide, has enabled bioinformaticians to develop useful tools to help identify twin-arginine signal peptides from genomic data [14,15]. Of course, in silico predictions should always be coupled with a reliable in vivo transport-activity assay, such as those engineered by Tullman-Ercek et al. or Widdick et al. The E. coli genome, for example, is predicted to encode 29 proteins bearing twin-arginine signal peptides, and 27 of these signal peptides are genuine Tat-dependent export signals.
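The motif-based screening idea described above can be illustrated with a very small pattern search. This is only a sketch and not a reimplementation of any published predictor: the relaxed pattern, the 35-residue search window, the function name and the toy sequences are all assumptions made for illustration, and a hit says nothing about the h-region or about real transport activity.

```python
import re

# Strict consensus as written in the text: S-R-R-x-F-L-K (x = any residue).
STRICT = re.compile(r"SRR.FLK")
# Relaxed pattern (an assumption for illustration): keep the arginine pair
# and the phenylalanine, and allow L or I at the following position.
RELAXED = re.compile(r"RR.F[LI]")

def find_tat_motif(seq, window=35, pattern=RELAXED):
    """Return (start, matched_text) for the first hit in the N-terminal window, else None."""
    match = pattern.search(seq[:window].upper())
    return (match.start(), match.group()) if match else None

# Hypothetical toy sequences, not real precursor proteins.
toy_hit = "MSEKSRRDFIKAAGLAVAGLAAPAALA"   # carries an SRRDFIK-style motif
toy_miss = "MKKTAIAIAVALAGFATVAQA"        # no arginine pair at all

print(find_tat_motif(toy_hit))                  # relaxed pattern finds the RR core
print(find_tat_motif(toy_hit, pattern=STRICT))  # strict consensus rejects I in place of L
print(find_tat_motif(toy_miss))
```

The contrast between the two patterns shows why a literal reading of the consensus is too strict: a motif such as SRRDFIK departs from SRRXFLK at one position and would be missed. Any candidate found this way would still need the in vivo transport assay mentioned above.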
Twin-arginine signal peptides
All precursors synthesized with bona fide twin-arginine signal peptides are ultimately transported across a membrane by the twin-arginine translocation apparatus: a membrane-embedded ‘translocase’ that catalyses protein transport. Although the crystal structure of even a fragment of this large transport machinery is not yet forthcoming, 10 years of research has provided a wealth of biochemical, biophysical and genetic data. In general, three types of integral membrane protein have been identified as components of the Tat translocase (Figure 2): TatA (also called Tha4 and Tha9 in the plant chloroplast system), TatB (Hcf106 in plants), and TatC (for a more detailed discussion of Tat components, see other recent reviews, e.g. [18–20]). TatA- and TatB-type proteins are sequence-related and are predicted to share a common structure comprising an N-terminal transmembrane α-helix, followed by an adjacent (basic) amphipathic helix and an unstructured C-terminal region of variable length [21,22]. The TatC protein is the largest, most hydrophobic and most highly sequence-conserved Tat component. Its most common manifestation is as a polytopic protein with six membrane-spanning domains and both N- and C-termini on the cis side of the membrane [5,23,24]. However, gene fusions have been observed in some archaea that give rise to TatC proteins with 14 transmembrane helices (see Figure 2C), which reinforces genetic and biochemical evidence that the minimum TatC unit present in membranes is a homodimer [25,26]. Of these known Tat components, the TatC-type protein is ubiquitous in all prokaryotes and eukaryotes known or predicted to utilize the Tat transport system. In all biological systems proven experimentally to undergo Tat transport, a TatA-type protein is also essential for protein transport, and mutations in the genes encoding either TatC or TatA will impair transport of all associated traffic on the Tat pathway [4,5,27,28].
Components of the twin-arginine translocase
So how and what do each of these types of protein contribute to the transport process? The key biochemical studies have been carried out in bacteria (primarily in E. coli) and in chloroplasts. Two distinct high-molecular-mass complexes have been identified: a ‘transport channel module’ containing multiple TatA protomers [29,30] and a ‘signal-recognition module’ comprising a complex of TatB and TatC [31–33].
Early in the transport process, twin-arginine signal peptides are recognized by the TatBC (Hcf106-TatC in thylakoids) ‘signal-recognition complex’ [31,32]. Cross-linking studies of both E. coli and thylakoidal Tat systems have shown that the important twin-arginine dipeptide is close to the TatC protein, while TatB cross-links most readily to the hydrophobic h-region of the signal peptide [31,32,34]. TatBC varies in size between 370 and ∼650 kDa, depending on the experimental conditions, and comprises equimolar amounts of TatB and TatC [31,33,35–37]. Cysteine-scanning mutagenesis and disulfide cross-linking have shown that, within the TatBC complex, TatB is arranged as a higher-order homo-oligomer, probably at least a tetramer. Consistent with this, a similar study of the E. coli TatC protein has shown that TatC is organized at least as a homodimer and that each of the six transmembrane helices makes self-contacts. Taken together, these findings suggest a tertiary structure in which TatC is located at the periphery of a central TatB bundle [26,38]. Negative-stain electron microscopy has also been used to study TatBC complexes from the Gram-negative bacteria E. coli, Salmonella enterica serotype Typhimurium and Agrobacterium tumefaciens. In each case, oval-shaped particles were observed that were ‘lobed’, and the analysis of the area covered by each lobe suggested that each could represent a bundle of seven or eight transmembrane helices, enough to correspond to a single TatBC unit.
Under resting conditions, the TatA-comprised ‘transport channel module’ of the Tat translocase exists as a separate complex from TatBC. Low-resolution structural analysis utilizing electron microscopy shows toroid-shaped particles (quite distinct from the oval TatBC particles) with thick walls, channels through the middle and asymmetric ‘lids’ at one of the ends. These isolated TatA complexes were a heterogeneous mix of sizes. Although the width of the thick walls and the ‘height’ were essentially identical for each size of particle, the key differences lay in the capacity of the central cavities. This structural analysis suggested strongly that different numbers of TatA protomers could combine to generate the different-sized channels needed to transport a diverse array of folded proteins across a membrane.
Clearly, the signal-binding module and the channel module must interact efficiently, and probably extensively, during the protein-export process. Indeed, some preparations of isolated TatBC do contain traces of TatA, suggesting strongly that there is a TatA-binding site within the TatBC complex. The most compelling current model for a ‘Tat-transport cycle’ is based on elegant in vitro work using thylakoid membranes [30,31,34,40,41]. Initial signal peptide binding to TatBC, which is dependent on the transmembrane protonmotive force, induces a conformational change in TatBC that exposes a TatA-binding site [31,41]. Thus signal peptide binding, in conjunction with the protonmotive force, induces assembly of the complete translocase. The binding pocket for TatA may be on the TatC protein, since pre-incubation of thylakoid membranes with a specific anti-TatC antibody prevented the formation of the TatABC translocase. At this point, the signal peptide is extended so that the c-region is now exposed on the trans side of the membrane, and some researchers believe this action is the ‘power-stroke’ that propels the passenger domain through the channel, although this would probably limit the size of passenger that could be ‘pulled’. The n-region of the signal peptide remains on the cis side of the membrane throughout this process, and indeed can be covalently bonded to TatC before translocation with no effect on the transport of the substrate [32,43]. This ultimate transport step is probably also energized by the transmembrane proton electrochemical gradient [44,45], although this is controversial. Following transport, the TatABC complex dissociates again, perhaps because the signal peptide somehow disengages itself from its binding site on TatC.
Although the scheme described above fits well with the current data coming from E. coli and thylakoid studies, there are numerous alternative models that can be constructed based on other studies of Tat subunit structure and function. The roles of members of the wider TatA and TatB families are hotly debated in the literature. For instance, TatA-type proteins are less than 100 amino acids in length, but, despite their small size, they have generated a disproportionate amount of controversy. In E. coli, the TatA protein is fully integrated into the inner membrane [29,46,47]. However, the model Gram-positive bacterium B. subtilis contains three copies of TatA, and studies of one of these suggest that a fraction may be soluble in the cell cytoplasm [48,49]. Similarly, the Gram-positive actinomycete Streptomyces lividans has also been shown to contain a proportion of water-soluble cytoplasmic TatA and TatB. This has led to the alternative hypothesis, which resonates with the original ideas of Chanal et al., that TatA may be a cycling ‘receptor’ subunit. Indeed, peptide array experiments and surface plasmon resonance data suggest that TatA has affinity for signal peptide-bearing precursor proteins in these biological systems.
The topological organization of E. coli TatA is also not without controversy. A recent study using thiol-specific cross-linking suggested that this protein had a topology where the extreme N-terminus was exposed to the cytoplasm . This was a very surprising find, since it contradicts the ‘positive-inside rule’ postulated by Sipos and von Heijne . The evidence for an N-out topology for membrane-bound TatA proteins was strengthened recently by a study of TatA operation in the opportunistic bacterial pathogen Providencia stuartii . The TatA protein in P. stuartii is synthesized as an inactive ‘zymogen’ bearing an N-terminal extension of seven amino acids which is sufficient to completely annul cellular Tat transport activity . Why TatA should be inactivated by this addition remains unclear; however, the extended protein is membrane-bound , which suggests that impairment of membrane integration is not the reason. Very interestingly, Tat translocation in P. stuartii is actually instigated by proteolytic removal of the seven-amino-acid extension from TatA, which activates the protein . It is known that the TatA transmembrane helix is responsible for oligomerization of TatA in the membrane , probably initially into bundles of at least three TatA protomers [29,46] and ultimately as large ring-shaped channels . It is possible that the N-terminally extended inactive TatA is in monomeric form in the membrane and that proteolytic cleavage enables the essential oligomerization of the TatA transmembrane helices to occur [29,46]. This process is useful in inferring the topology of TatA, since the proteolytic event must occur when TatA is already in the membrane. The protease responsible has been unequivocally identified as a member of the GlpG family of rhomboid serine proteases . 
Crystal structures of these integral membrane proteins are available (see, e.g., [54,55]), and it can be clearly seen that the active site of GlpG family proteases is located closer to the periplasmic side of the membrane [54,55], suggesting strongly that TatA must adopt an N-out orientation in order to be efficiently cleaved and thus activated (Figure 3).
Proteolytic activation of the TatA protein from Providencia stuartii
The TatB protein is also interesting. Sequence analysis clearly shows that TatB and TatA are distantly related and thus probably arose from a common ancestor [21,22]. The Gram-negative E. coli TatB protein and the Hcf106 equivalent in plants are essential for Tat transport of physiological substrates and clearly have physiological roles distinct from their TatA-like cousins [2,56]. However, in most Gram-positive bacteria, with the notable exception of the actinomycetes, the Tat translocation system comprises only TatA-like and TatC-like proteins [27,28]. In these cases, it seems most likely that the Gram-positive TatA proteins retain both biological activities. Indeed, a study using an in vivo genetic screen for successful Tat transport has isolated variant bifunctional E. coli TatA proteins that can bypass the requirement for TatB. This work suggests that most Gram-positive bacteria utilize a Tat system component most closely related to the ancient progenitor of the functionally distinct TatA-like and TatB-like proteins.
Ante-transport events: protein folding and transport quality-control systems
The evidence that the prokaryotic and thylakoidal Tat translocase transports folded proteins is extremely strong [58–64]. The central dogma of the Tat research field has therefore come to be that this system is specifically dedicated to the transmembrane translocation of fully folded proteins. Does this mean that the Tat transporter is physically unable to transport unfolded polypeptides? DeLisa et al. suggest that the bacterial Tat translocase may be able to sense regions of unfolded protein, much like a housekeeping chaperonin senses the exposed hydrophobic core of a misfolded protein, and thus reject any immature proteins attempting export (designated folding ‘quality control’). However, it has been shown for the thylakoid Tat system that at least some malfolded protein can be successfully translocated.
A number of reports have indicated that general housekeeping chaperones such as DnaK and GroEL interact with Tat-dependent substrates [66–69]. Though chaperonin contamination is often seen in recombinant protein and protein–protein interaction studies (see, e.g., [70,71]), Graubner et al.  have demonstrated convincingly that DnaK binds to the CueO full-length precursor and was essential for the export of CueO by the Tat pathway. The latter may be good evidence that the mature domain of CueO cannot reach a stable tertiary fold without the chaperone activity of DnaK and is thus rejected by the Tat translocase quality-control system; more evidence that the Tat system transports folded proteins. The former, however, suggests that either CueO is not completely folded prior to translocation, which raises the questions of when is it correctly folded and when is DnaK released, or that DnaK interacts specifically with the CueO signal peptide and that this interaction is essential for Tat transport. It is known that general chaperones such as DnaK do bind to isolated twin-arginine signal peptides [68,69,72], most probably through interaction with the h-region. This is not altogether surprising, however, since the physiological role of the DnaK chaperonin system is to recognize and bind non-specifically to exposed, unstructured, hydrophobic regions of polypeptide  and this is exactly what Tat signal peptides possess [74,75]. It is unlikely, however, that the interaction between signal peptide h-regions and DnaK is involved directly in the Tat-export pathway, since covalent attachment of the CueO signal peptide on to the mature domain of HiPIP (high potential iron protein) did not confer DnaK-dependence on translocation of the CueO::HiPIP fusion . However, such signal-swap experiments may also muddy the waters somewhat, since the nature of the passenger domain has a direct effect on the function of the Tat signal peptide. 
For example, one of the most heavily exploited bacterial twin-arginine signal peptides is that of the TMAO (trimethylamine N-oxide) reductase (TorA) from E. coli. The TorA signal peptide has been attached to GFP (green fluorescent protein) [62,63,76], colicin V , PhoA , OEC23 , LepB , MalE , DMSO reductase , hydrogenase-2  and GFOR (glucose/fructose oxidoreductase) . Interestingly, the nature of the passenger protein to which the TorA signal is attached seems to have an influence on its operation at the molecular level. For example, a mutant SRKRFLA TorA Tat motif was incapable of transporting the native TMAO reductase to the periplasm ; however, TorA signals carrying this very sequence still directed export of both GFP  and colicin V . Clearly, understanding and harnessing this effect is of paramount importance if the Tat pathway is to be used for industrial-scale protein production; however, currently, the molecular basis by which passenger proteins exert influence on signal peptide activity remains unknown.
Ante-transport events: cofactor loading and proofreading systems
The E. coli TMAO reductase enzyme is a water-soluble periplasmic protein that contains a single redox cofactor and is synthesized with a twin-arginine signal peptide. The availability of crystal structures for the TorA homologues [83–86], together with its high abundance under certain growth conditions, and its tractability to various in vitro and in-gel activity assays [4,60], has led to this particular periplasmic protein enjoying the attention of Tat researchers from the outset. E. coli TMAO reductase is encoded by the torA gene within the torCAD operon  and is a member of a vast family of molybdenum cofactor-containing redox enzymes [88,89]. Correct loading of the molybdenum cofactor into the TorA apoprotein is essential before Tat transport can proceed , and this suggests either that the Tat translocase simply cannot transport a non-globular/unfolded substrate or that something is repressing signal peptide activity until the cofactor is loaded . The reality is probably a combination of both.
In the case of TorA, a dedicated signal-peptide-binding cytoplasmic protein has been positively identified. E. coli TorD is a 199-amino-acid cytoplasmic protein that has been shown to bind specifically to the TorA apoenzyme and maintain it in a conformation suitable for cofactor loading [90–95]. Although it seems likely that TorD binds to TorA at more than one locale [80,92], in vivo two-hybrid analysis and in vitro calorimetry have established unequivocally that TorD binds directly to the TorA twin-arginine signal peptide. Optimum binding by TorD appears to involve a 27-residue core region of the TorA signal peptide stretching from the twin-arginine motif, through the h-region, to another arginine dipeptide at the C-terminus. The twin-arginine dipeptide itself is not critical for the binding, since a transport-inactive ‘twin-lysine’ variant signal peptide is also recognized by TorD. In addition, calorimetry clearly showed that TorD binds to the TorA signal peptide in an energetically favourable reaction with an apparent dissociation constant (Kd) for the 27-residue synthetic peptide of ∼1 μM. Similar calorimetry experiments using a fusion of the full-length TorA signal peptide (39 amino acids) to the C-terminus of maltose-binding protein point to an apparent Kd of ∼60 nM for TorD binding (J. Maillard and F. Sargent, unpublished work). Having established that TorD does bind to the TorA signal, the next pressing questions are why does TorD attach to the signal peptide, and when and how is the signal peptide ‘handed-over’ to the Tat translocase?
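The two apparent dissociation constants quoted above can be put on a common energy scale with the standard relation ΔG° = RT ln(Kd / 1 M). The following is a minimal sketch under stated assumptions: simple 1:1 binding, a 1 M standard state and a temperature of 298.15 K (the calorimetry temperature is not given here). The function name is illustrative only.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1

def binding_free_energy_kj(kd_molar, temp_k=298.15):
    """Standard binding free energy in kJ/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    # ln(Kd / 1 M): the 1 M standard state makes the argument dimensionless.
    return R * temp_k * math.log(kd_molar) / 1000.0

dg_peptide = binding_free_energy_kj(1e-6)   # ~1 uM, 27-residue synthetic peptide
dg_fusion = binding_free_energy_kj(60e-9)   # ~60 nM, full-length signal peptide fusion

print(f"27-residue peptide: {dg_peptide:.1f} kJ/mol")
print(f"full-length fusion: {dg_fusion:.1f} kJ/mol")
print(f"difference:         {dg_fusion - dg_peptide:.1f} kJ/mol")
```

Under these assumptions, the tighter binding of the full-length fusion corresponds to roughly 7 kJ/mol of additional binding free energy. The figure is approximate because both Kd values are themselves approximate.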
Biophysical studies of the twin-arginine signal peptides of the Allochromatium vinosum HiPIP  and the E. coli SufI protein  suggest that Tat signal peptides are largely unstructured in aqueous solution. Such peptides are undoubtedly susceptible to opportunistic (or otherwise) binding by housekeeping chaperonins (see previous section), and probably also proteolytic enzymes. Indeed, when either the TorA apoenzyme or a fusion of the TorA signal peptide to GFP was synthesized in vivo, the signal peptide was rapidly degraded unless excess TorD was provided [94,97]. It is therefore conceivable that the role of TorD is to ‘protect’ the signal peptide from the molecular environment of the cell cytoplasm before transport [94,97]. However, experiments designed to unravel the physiological activity of TorD suggest that it has other functions in addition to protecting TorA from proteolysis .
E. coli hydrogenase-2 is a heterodimer of a catalytic α-subunit (HybC) and an electron-transferring β-subunit (HybO), which also bears the twin-arginine signal peptide . The αβ dimer is transported to the periplasm as a fully active folded complex by the Tat translocase [4,99]. Because HybC has no signal sequences whatsoever, it is clearly important that assembly of the HybOC dimer is carefully co-ordinated in order to prevent premature export of HybO without its HybC partner. The key to this co-ordination lies in the primary structure of the native HybO signal peptide, since swapping it for the TorA signal peptide impaired assembly of the hydrogenase and resulted in premature targeting of HybO to the membrane without its HybC partner . The removal of the native signal peptide probably also removed the activity of a biosynthetic accessory protein (HybE) whose job was to assist in the assembly of HybOC [80,100]. Most surprising, however, was that the introduction of the TorA signal peptide to the hydrogenase system also introduced a requirement for TorD for the correct assembly of this otherwise completely alien enzyme . Overproduction of TorD in the strain expressing the TorA::HybO fusion protein led to the return of cellular hydrogenase activity because the HybOC dimer was now correctly assembled before export .
The ability of the TorD/TorA signal peptide couple to direct assembly of the hydrogenase provided a facile in vivo assay for TorD activity. Moreover, using this assay, it was possible to concentrate solely on any activity associated specifically with signal peptide binding. TorD family proteins contain a highly conserved EPXDH amino acid motif where the aspartate residue is invariable and the histidine residue is only very occasionally replaced by tyrosine [92,101,102]. Variant TorD proteins in which alanine had been substituted for either the histidine or the aspartate residue were no longer able to rescue hydrogenase activity in the strain producing a TorA::HybO fusion. However, the in vitro kinetics and thermodynamics of signal peptide binding by the variant proteins remained similar to those of the native TorD protein. Thus the TorD DH dipeptide is required for a biochemical activity linked to the signal binding event, suggesting that TorD is not just sterically hindering non-specific attack by cytoplasmic proteases.
The work of Jack et al. [80,103], taken together with the previous hypotheses of others [60,104–106], has addressed the why and when questions posed above and led to the suggestion that E. coli TorD is the paradigm representative of a family of proteins acting as ‘Tat-proofreading chaperones’. The term ‘proofreading’ when used in the context of Tat transport relates to a quality-control process distinct from that exerted by the Tat translocase itself , in which soluble binding proteins prevent protein export by shielding the signal peptide from the Tat translocase and suppressing transport activity until all assembly processes are complete [80,103]. It is not known whether the chaperone-mediated Tat-proofreading process is exclusive to prokaryotes; however, it is interesting to note that a heterodimeric Tat substrate has been identified in algae in which one signal-less substrate probably ‘piggy-backs’ on another signal-peptide-bearing protein . By analogy with broadly similar prokaryotic Tat substrates, signal-peptide-binding chaperones may well be at work in this system.
An answer to how TorD recognizes and binds the TorA signal peptide is expected from the structural biologists. The sequence databases contain hundreds of members of the wider TorD family, and the crystal structures of three have been solved. The TorD homologues from Shewanella massilia, Salmonella Typhimurium (PDB code 1S9U) and Archaeoglobus fulgidus exhibit essentially identical novel all-helical folds that are completely devoid of β-strands. The A. fulgidus protein has been termed NarJ, since the genetics at least suggest that it is required for the assembly of a Tat-dependent nitrate reductase of the NarG-type [109,110]. The crystal structure of A. fulgidus NarJ may provide a tantalizing glimpse of the mode of peptide recognition by this family of proteins, since the N-terminal His-tag of one protomer appears to be tightly bound within a hydrophobic cavity of a second protomer in the crystal structure. The binding site is part of a funnel-shaped cavity on the surface of NarJ (Figure 4). The N-terminus of the A. fulgidus NarG contains an SRRDFIK motif [109–111], which closely follows the SRRXFLK consensus for prokaryotic Tat signal peptides. Kirillova et al. speculate that the key interacting His-tag residues (a YFQ tripeptide) may be adopting a conformation that mimics that of the twin-arginine motif displayed by NarJ's cognate binding partner NarG. The bound YFQ tripeptide is suggested to correspond to the DFI tripeptide (XFL of the consensus sequence) of the A. fulgidus NarG signal peptide twin-arginine motif [91,93]. However, if A. fulgidus NarJ binds as specifically to A. fulgidus NarG as E. coli TorD binds to the E. coli TorA signal peptide, then the twin-arginine motif, which is common to all Tat signals, is probably not the dominant driver of the interaction. Paradoxically, if A. fulgidus NarJ is able to recognize a YFQ motif in place of the native DFI tripeptide, then the extreme sequence selectivity shown by E. coli TorD, which will not bind to any Tat signal peptide tested other than that of E. coli TorA (J. Maillard and F. Sargent, unpublished work), must not be a universal feature of the TorD family.
The TorD family protein Af0173 (NarJ) from Archaeoglobus fulgidus
Salmonella DmsD is essentially identical with E. coli DmsD, which was the first twin-arginine signal-peptide-binding protein to be identified . The E. coli DMSO reductase is a trimeric complex comprising the DmsA catalytic subunit, which contains a molybdenum cofactor identical with that bound by TorA and an additional FeS cluster as cofactors, the DmsB protein, which contains four FeS clusters, and an integral membrane protein termed DmsC , which anchors the DmsAB dimer to the periplasmic side of the membrane . DmsA is closely related to TorA and can also use TMAO as a substrate . Recent in vitro calorimetric experiments have shown E. coli DmsD to interact with an N-terminal fusion of the E. coli DmsA Tat signal peptide to GST (glutathione S-transferase) with an apparent Kd of 200 nM . Similar experiments have demonstrated that Salmonella DmsD interacts with the full-length Salmonella DmsA Tat signal peptide when synthesized as a C-terminal fusion to maltose-binding protein with an apparent Kd of ∼100 nM (J. Maillard and F. Sargent, unpublished work).
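To give a feel for what the two apparent Kd values above imply, the fraction of signal peptide occupied by DmsD can be estimated with the simple 1:1 binding isotherm, θ = [L] / ([L] + Kd). This is a rough sketch only: the free DmsD concentration used below is a hypothetical number chosen for illustration, not a measured cellular value, and the model assumes the binding protein is in excess over the peptide.

```python
def fraction_bound(free_ligand_nm, kd_nm):
    """Fractional occupancy for 1:1 binding, with both values in nM."""
    if free_ligand_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ligand_nm / (free_ligand_nm + kd_nm)

FREE_DMSD_NM = 500.0  # hypothetical free DmsD concentration, not taken from the text

occupancy_ecoli = fraction_bound(FREE_DMSD_NM, 200.0)       # E. coli pair, Kd ~200 nM
occupancy_salmonella = fraction_bound(FREE_DMSD_NM, 100.0)  # Salmonella pair, Kd ~100 nM

print(f"E. coli DmsD/DmsA signal:    {occupancy_ecoli:.2f}")
print(f"Salmonella DmsD/DmsA signal: {occupancy_salmonella:.2f}")
```

Because occupancy is exactly one half when the free concentration equals Kd, a twofold difference in Kd shifts the occupancy only modestly at concentrations well above either value.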
Given the ever increasing body of evidence that demonstrates that TorD family proteins bind to particular twin-arginine signal peptides, it would be easy to hold up TorD as the mechanistic and structural paradigm for all Tat signal-recognition proteins. However, emerging evidence from enzyme systems both closely related to and completely distinct from the molybdenum-dependent TMAO/DMSO reductase systems, suggests that this is clearly not the case. The most striking proof of structural and mechanistic diversity in twin-arginine signal-peptide-binding proteins comes from studies of another molybdenum-containing enzyme in E. coli.
The E. coli napFDAGHBC operon encodes a periplasmic nitrate reductase . The napA gene encodes the catalytic subunit of the reductase, a periplasmic protein that binds a molybdenum cofactor identical with that found in TorA and DmsA, together with an FeS cluster . Consistent with this cofactor content and subcellular location, NapA is synthesized as a precursor with an N-terminal twin-arginine signal peptide . The other proteins encoded by the nap operon have roles either in electron transfer from quinone to NapA  or in NapA biogenesis [120,121]. The similar cofactor requirements of NapA, TorA and DmsA  suggest that NapA should also be subject to Tat proofreading during assembly. However, even with the vast amount of prokaryotic genomic data available, a torD gene has never been reported to be genetically linked to a nap operon. Indeed, primary-sequence analysis suggests that NapA followed a different evolutionary line from TorA and DmsA [89,122], which could suggest that any Tat-proofreading or other biosynthetic mechanisms are also different.
The E. coli napD gene encodes a small cytoplasmic protein that is essential for NapA activity [116,120,123,124], and two-hybrid experiments have shown that NapD binds to the full-length NapA precursor in the bacterial cytoplasm. Recently, Maillard et al. used calorimetry to demonstrate direct binding of NapD to the NapA twin-arginine signal peptide in vitro. This interaction was very tight, with an apparent Kd of ∼7 nM. The physiological role of the signal-peptide-binding activity displayed by NapD is probably similar to that of the TorD family of peptide-binding proteins. Indeed, overproduction of NapD can suppress the targeting activity of the NapA signal peptide, suggesting that NapD is able to retard translocation of NapA to prevent premature export: the very definition of ‘Tat proofreading’. Very importantly, however, Maillard et al. went on to solve the three-dimensional structure of NapD by NMR methods, and this type of Tat signal-peptide-binding protein was revealed to adopt a β-α-β-β-α-β ‘ferredoxin-type’ fold (Figure 5), which is completely distinct from the all-helical fold of TorD family proteins [110,123]. The four β-strands assemble together to generate a single antiparallel β-sheet that forms one face of the protein, while the two α-helices together form the opposite face (Figure 5). Clearly, the huge structural differences between E. coli NapD and A. fulgidus NarJ mean that there is no analogous signal-peptide-binding funnel identifiable on NapD. Further NMR experiments instead identified a patch of residues within the β-sheet face of NapD where the local environment was disrupted upon peptide binding, suggesting strongly that the peptide-binding site lies within the β-sheet (Figure 5).
This is consistent with what is known about the functions of other members of the ferredoxin-like family; indeed, so often is the β-sheet region of such proteins implicated in mediating protein–protein or protein–ligand interactions that it has been described as a ‘supersite’ for binding.
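To put the apparent Kd of ∼7 nM into perspective, the short sketch below computes how much NapD would be occupied at a given free peptide concentration. It is an illustration only and rests on assumptions that are not taken from the study itself: simple 1:1 equilibrium binding, and a free peptide concentration that is not significantly depleted by binding. The function name `fraction_bound` and the concentrations tried are hypothetical.

```python
def fraction_bound(free_peptide_nM, kd_nM=7.0):
    """Fractional occupancy for simple 1:1 equilibrium binding.

    Assumes the free peptide concentration is known (i.e. not depleted
    by binding). The default Kd is the apparent value quoted in the text.
    """
    if free_peptide_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_peptide_nM / (kd_nM + free_peptide_nM)


if __name__ == "__main__":
    # Illustrative concentrations only; not experimental values.
    for conc in (0.7, 7.0, 70.0, 700.0):
        print(f"{conc:7.1f} nM free peptide: {fraction_bound(conc):.2f} of NapD occupied")
```

Under these assumptions, half of the binding sites are occupied when the free peptide concentration equals the Kd, which is why a nanomolar Kd is described as very tight.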
Figure 5. The NapD signal-peptide-binding protein from E. coli
E. coli TorD and E. coli NapD are therefore the paradigm members of two large and different families of twin-arginine signal-peptide-binding proteins. The utterly dissimilar structures of these two families probably point to unrelated molecular mechanisms of signal peptide recognition and demonstrate convergent evolution at work in a key biological process in the most dramatic fashion. Indeed, it is very likely that further families of specific Tat signal-binding proteins exist in the prokaryotic world. Genomic analysis has predicted possible Tat-proofreading chaperones for a number of complex bacterial Tat substrates [6,18]. Although hard biochemical data on signal peptide binding are not yet available, structural genomics consortia are beginning to release structural information on some of these proteins. For example, the NMR solution structure of E. coli HyaE reveals a thioredoxin-like fold (PDB code 2HFD), unlike that of TorD or NapD. There is some genetic, but as yet no biochemical, evidence that this protein interacts with the precursor form of a Tat-dependent hydrogenase. In addition, the crystal structure of the FdhE protein from Pseudomonas aeruginosa has been solved (PDB code 2FIY). FdhE is a rubredoxin homodimer and has a fold unrelated to that of NapD, TorD or HyaE. In E. coli, the fdhE gene is essential for the biosynthesis of Tat-dependent formate dehydrogenases (multi-subunit molybdenum-containing enzymes), but is not required for the assembly of non-exported formate dehydrogenases. There is currently no evidence that this protein binds directly to a twin-arginine signal peptide or even to a Tat-dependent precursor protein.
Taken together, these studies highlight the amazing structural diversity of chaperone-mediated Tat-proofreading systems either known or predicted to be operating on the bacterial Tat pathway. It is worth reiterating that all twin-arginine signal peptides must interact efficiently with the Tat machinery to facilitate export, and the twin-arginine motif is one of the keys to the transport activity. Where, then, does the very strict chaperone specificity come from? How are the essential structural features for transport balanced with the equally essential structural features for chaperone binding? One future challenge will be to isolate each activity for a selection of signal peptides and attempt to unravel the molecular basis of this dual functionality. Indeed, these questions also overlap with understanding how a Tat signal peptide is handed over from a cytoplasmic chaperone to the Tat translocase.
Post-transport events: peptide processing
The vast majority of known and predicted Tat substrates are water-soluble globular proteins. Although most, if not all, will have adopted their native folds before the export event, there are still biosynthetic hurdles to overcome after translocation. Some Tat-dependent bacterial proteins, in particular those that contain certain types of copper cofactor [127,128], do not bind their prosthetic groups until after translocation. In addition, some Tat-dependent prokaryotic proteins may undergo a significant covalent modification by the addition of a lipid group following transport [17,129]. The Shewanella DMSO reductases, for example, are Tat-dependent molybdenum-containing enzymes that are predicted to be lipidated following export across the cytoplasmic membrane and then must be secreted across the outer membrane (probably via the Type II apparatus) to the cell surface. However, far and away the most common, and the most significant, post-translational event in the biosynthesis of the vast majority of Tat substrates is proteolytic cleavage of the twin-arginine signal peptide.
Signal peptide cleavage probably occurs at a late stage in the transport process, most likely well after complete transport of the globular domain. What is the evidence for this? First, not all Tat substrates have cleaved signal peptides. The Rieske FeS proteins, which form part of the cytochrome b6f complex in plant thylakoids, and the cytochrome bc1 and some nitrate reductase complexes in prokaryotes, are synthesized with functional N-terminal Tat signals that then act as transmembrane signal anchors [9,132–135]. Clearly, if signal removal were a critical early step in the export process, biosynthesis of this type of protein would be impossible. Secondly, both prokaryotic and thylakoidal Tat substrates that are known to possess cleavable signal peptides contain obvious protease-recognition sequences within their c-regions. In the case of prokaryotes, this is either an AXA ‘type-I’ signal peptidase motif or an [L/I/G/A][A/G/S]C ‘lipo box’ motif for ‘type-II’ signal peptidases (reviewed in ), and in the case of thylakoid Tat substrates there is usually an AXA TPP (thylakoidal-processing protease) recognition motif. The active site of each of these proteases is located on the trans side of the membrane [137,138], which means cleavable signal peptides must be fully extended, either through the Tat channel or across the bilayer, to be accessible. Again, given that the signal peptide first interacts with the Tat apparatus in a loop conformation, full extension of the signal peptide would only be expected near the end of the transport process. Indeed, recent evidence suggests that signal processing occurs at such a late stage that the signal has probably been released from the Tat apparatus following translocation of the passenger domain. In the study of Gérard and Cline, the N-terminus of a twin-arginine signal peptide was first covalently cross-linked to thylakoidal TatC before transport was initiated.
Despite tethering the N-terminus of the signal peptide at the cis side of the membrane, the passenger domain could still be translocated. Significantly, however, the signal peptide remained unprocessed, suggesting strongly that TPP had no access to this very-late-stage complex between the Tat machinery and a Tat-dependent precursor.
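The protease-recognition motifs described above are simple enough to express as sequence patterns. The following is a minimal sketch, assuming one-letter amino acid codes, of how a candidate signal peptide could be scanned for the AXA ‘type-I’ motif and the [L/I/G/A][A/G/S]C ‘lipo box’. The function name `find_cleavage_motifs` and the example sequence are hypothetical and invented for illustration. The scan covers the whole input string, so only hits that fall within the c-region would be meaningful, and a pattern match is not in itself evidence of processing.

```python
import re

# Patterns transcribed from the text; X means any residue.
# A zero-width lookahead is used so that overlapping matches
# (for example in "AAAA") are all reported.
MOTIFS = {
    "type-I AXA": re.compile(r"(?=(A[A-Z]A))"),
    "type-II lipo box": re.compile(r"(?=([LIGA][AGS]C))"),
}


def find_cleavage_motifs(sequence):
    """Map each motif name to a list of (0-based start, matched residues)."""
    seq = sequence.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    # Hypothetical signal peptide, invented for illustration only.
    hypothetical_signal = "MSRRDFLKGSLLAGATAAHAAD"
    for name, hits in find_cleavage_motifs(hypothetical_signal).items():
        print(name, hits)
```

A real analysis would normally restrict the search to the residues immediately preceding the mature domain and would use a dedicated prediction tool rather than a bare pattern match.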
Taken together, these observations make it conceivable that all signal peptides behave, albeit fleetingly, as N-terminal signal anchors at a late stage of translocation. In this model, once fully extended, the signal peptide would escape laterally from the Tat complex through a designated ‘side-gate’ in either the Tat channel or the signal recognition complex, or perhaps be released to lipid if dissipation of the Tat subunits occurs at the end of translocation. The hydrophobicity of the lipid would induce spontaneous α-helix formation in the Tat signal peptide, which would then be subjected to proteolysis if appropriate. The escape of the signal peptide must follow on very quickly after full extension of the signal peptide, since overloading of the Tat pathway with substrate can lead to processing of the signal before complete translocation of the passenger domain, resulting in back-sliding of processed protein. This model also suggests that any mutagenesis of the signal peptidase-recognition motifs within the signal peptides, or inactivation of the signal peptidases themselves, could result in signal-anchored membrane-bound Tat substrates. In studies of the thylakoid Tat system using isolated chloroplasts, removal of the TPP-recognition sequence from the signal peptide of a heterologous Tat substrate (GFP) prevented proteolytic processing but did not impair Tat transport of the reporter protein. In the E. coli system, mutagenesis of the AXA motif of the DmsA Tat signal peptide had no effect on the assembly or physiological activity of the DmsABC DMSO reductase complex. In this case, however, it appeared that the signal peptide was either being opportunistically cleaved at a different location or the NXN substitution used was not sufficient to block processing by LepB.
Why is it so important to know what happens to the signal peptide at the end of translocation? As well as possible medical and biotechnological applications (discussed below), the route of escape taken by Tat signals from the Tat apparatus may be related to the molecular mechanism of membrane protein integration adopted by the Tat system. In both bacteria and chloroplasts, a subset of Tat-targeted proteins are genuine integral membrane proteins synthesized with single internal or C-terminal transmembrane helices [140,141]. In all cases, precursors are synthesized with N-terminal twin-arginine signal peptides that are immediately linked to globular water-soluble domains, which are followed by hydrophobic transmembrane helices. Integration of the transmembrane helices into the lipid bilayer is strictly Tat-dependent [140,141]. However, the very existence of Tat-targeted integral membrane proteins poses a number of tricky questions. For instance, early on in the biosynthetic life of these types of Tat substrates, there are clear issues of solubility: how, for example, is aggregation of such exposed hydrophobic helices prevented? Indeed, in the case of the chloroplast proteins, how is such a protein successfully negotiated through the envelope membranes before being integrated into the thylakoid membrane? There appears to be nothing unusual in terms of overall hydrophobicity or amino acid content about Tat-targeted transmembrane helices when compared with those of other membrane proteins. It is possible that cytoplasmic chaperonins or dedicated accessory proteins could mask such helices before transport, in much the same way that a Tat-proofreading chaperone would mask an exposed signal peptide. However, such binding proteins are yet to be found.
The successful Tat-dependent translocation of an otherwise folded protein bearing a long exposed hydrophobic helix raises questions, again, about the ability of the Tat translocase to recognize and actively reject unfolded polypeptides: the ‘quality-control’ mechanism. Clearly, Tat-dependent membrane proteins are not rejected. Unfortunately, very little is known about the molecular mechanisms of either process to enable further comment.
Medical and biotechnological applications
Invading bacterial plant and animal pathogens often evade host cell responses by the expression of numerous and diverse ‘virulence factors’ (reviewed by Finlay and Falkow). The bacterial Tat system itself has been found to be an important virulence factor in plant [143–146] and animal [147–152] pathogens. Since the Tat translocase is not a feature of human or other higher animal physiologies, the role of the Tat pathway in the virulence of animal pathogens has attracted some interest from the biomedical sector as a possible target for novel drug development. Why does the loss of Tat transport compromise bacterial virulence? There may well turn out to be a different explanation for each bacterium under investigation. For example, disruption of Tat transport in E. coli can have gross pleiotropic effects on outer membrane integrity and cell division [153,154], which would undoubtedly lead to increased susceptibility to host defences. Alternatively, the non-export of a single important Tat-dependent virulence factor, such as the Tat-dependent hydrogenase of Helicobacter pylori, would also be sufficient to compromise virulence. Furthermore, in Ps. aeruginosa, the Tat translocase has been shown to operate as the initial export machinery in the two-step secretion of phospholipase virulence factors.
Because most Tat mutant strains that have been engineered in bacteria are generally still viable, a drug that could specifically inhibit the Tat translocase would probably not kill a pathogenic bacterium outright, but may serve to prevent infection and/or proliferation. In addition, commensal bacteria that can survive under fermentative conditions, or that do not utilize Tat-dependent transport (e.g. lactobacilli), would, in principle, not be adversely affected by such an anti-infective agent. A drug that would specifically inhibit Tat transport could act in a number of ways, for example to affect the mechanism of energy transduction (which is currently unknown, but may be related to that used by the mitochondrial F1Fo ATPase [136,156]), the mechanism of twin-arginine signal peptide recognition, or perhaps the mechanism of disengaging signal peptides or transmembrane helices from the TatA channel. Since these latter two key activities involve peptide recognition, a peptide antimicrobial could be the answer to providing specific inhibition of the bacterial Tat pathway. Antimicrobial peptides are usually less than 40 amino acids long and are produced naturally by all animals and plants. It is conceivable that, by combining research into the physiological role of the Tat pathway in model laboratory organisms such as E. coli, Salmonella or Pseudomonas with the latest peptide synthesis and rapid screening technologies, a powerful selection protocol could be developed for identifying anti-Tat transport peptides that would mimic a Tat-dependent signal or C-tail.
In addition to attracting interest as an antibacterial target, the Tat system is also gaining increasing attention from the biotechnology sector, principally because of its remarkable ability to move folded proteins across biological membranes. The application of this system for biotechnological purposes has recently been extensively reviewed by Brüser; however, there are a few interesting advances worth reiterating here. Harnessing the ability of the Tat system to actively select for folded substrates would undoubtedly increase the efficiency of large-scale protein production protocols. Fisher et al. addressed this very issue and, using an ingenious tripartite fusion with a Tat signal at the N-terminus, a β-lactamase at the C-terminus and the protein of interest sandwiched in the middle, were able to actively select for variant fusion proteins with increased water solubility. The Tat system has also been exploited in several other protein-engineering projects, including a novel two-hybrid system for screening for protein–protein interactions, production and engineering of single-chain antibodies [64,161] and for phage display [162,163]. Some bacteria, in particular Gram-positive Streptomyces spp., export large numbers of diverse substrates by the Tat system and might be ideal hosts for the heterologous production of pharmaceutically important proteins that are incompatible with secretion by the Sec pathway.
The Tat protein transport system, which was given its name just over 9 years ago, has flourished into a large and dynamic research field, as is evident from the number of papers cited in this review, and the equal number of papers regrettably left out owing to space constraints. There are emerging signs that scientists may be beginning to agree on the basic principles by which the Tat system operates. The future acquisition of high-resolution structural information will pave the way for molecular understanding of Tat transport and for further exploitation of this remarkable protein transport pathway.
Colworth Medal Lecture Delivered at the SECC, Glasgow, on 12 July 2007 Frank Sargent
I dedicate the 2007 Colworth Medal Lecture to my wife Tracy Palmer and our boys James and Jack. I thank David J. Richardson (University of East Anglia) and Ben C. Berks (University of Oxford) for guidance, support and many useful discussions. I acknowledge the hard work and dedication of the numerous postdocs, students and technical staff who have worked with me in Norwich. You have all contributed something important to this prize award. Thanks also to members of Tracy Palmer's and Ben Berks's research groups, past and present, for their valuable input over the years. In addition to Tracy, Ben and David, I also thank Stuart Ferguson (Oxford), Tony Pugsley (Paris), Ray Dixon (Norwich), George Georgiou (Austin) and Dave Kelly (Sheffield) for supporting me in my early research career. I also acknowledge The Royal Society, the BBSRC (Biotechnology and Biological Sciences Research Council), and the John and Pamela Salter Charitable Trust for financial support.