LD motifs (leucine–aspartic acid motifs) are short helical protein–protein interaction motifs that have emerged as key players in connecting cell adhesion with cell motility and survival. LD motifs are required for embryogenesis, wound healing and the evolution of multicellularity. LD motifs also play roles in disease, such as in cancer metastasis or viral infection. First described in the paxillin family of scaffolding proteins, LD motifs and similar acidic LXXLL interaction motifs have been discovered in several other proteins, whereas 16 proteins have been reported to contain LDBDs (LD motif-binding domains). Collectively, structural and functional analyses have revealed a surprising multivalency in LD motif interactions and a wide diversity in LDBD architectures. In the present review, we summarize the molecular basis for function, regulation and selectivity of LD motif interactions that has emerged from more than a decade of research. This overview highlights the intricate multi-level regulation and the inherently noisy and heterogeneous nature of signalling through short protein–protein interaction motifs.
LD motifs (leucine–aspartic acid motifs) are short α-helical amphipathic protein–protein interaction motifs with a degenerate sequence consensus, as are NR (nuclear receptor) box motifs, NES (nuclear export signals), dileucine endocytosis motifs and many others. Moreover, the same LD motifs can be recognized by LDBDs (LD motif-binding domains) with different 3D structures. This promiscuity and diversity raise the question of how LD motifs are selectively identified by their target molecules. Answering this question is important for understanding the biological processes in which these motifs are involved, and for understanding how dysfunctions in LD motif-mediated networks can result from pathogens or contribute to diseases such as cancer.
In the present review, we summarize more than 10 years of research on the molecular basis for LD motif recognition, regulation and function. The data presented show that the intrinsic promiscuity and cross-reactivity of LD motifs require control by multiple layers of regulation and selection.
PAXILLIN LD MOTIFS
Paxillin is a non-catalytic scaffolding protein that consists of a flexible N-terminal region and four C-terminal double zinc finger LIM (lin-11/isl-1/mec-3) domains [1,2]. Through its many interactions, paxillin is capable of localizing to the intracellular side of cell adhesion structures. At these sites, paxillin and bound molecules orchestrate the dynamic changes in signalling, cytoskeletal reorganization and gene expression that result in control of cell migration and survival . LD motifs were first discovered in paxillin and the homologous Hic-5 [hydrogen peroxide-inducible clone 5; also known as TGFB1I1 (transforming growth factor β-1-induced transcript 1) and ARA55 (androgen receptor-associated protein of 55 kDa)] through their capacity to bind to FAK [FA (focal adhesion) kinase] and vinculin, two FA proteins . These LD motifs, named after their core consensus sequence LDXLLXXL, are conserved regions in the N-terminal 300 amino acid part of paxillin (Figures 1 and 2). Subsequently, a number of cellular proteins that interact with the LD motifs in paxillin were identified (Figure 1 and Table 1). Moreover, the E6 protein from PV (papillomavirus) binds to paxillin, suggesting that pathogens have learned to exploit LD-mediated signalling for their purposes [5,6]. All paxillin family members (in humans, paxillin, leupaxin and Hic-5) contain multiple conserved LD motifs, although their number and spacing are not necessarily the same. For example, whereas paxillin has five motifs (LD1–5), Hic-5 and leupaxin lack LD3, and leupaxin LD2 appears non-functional or functionally divergent due to non-homologous substitutions of key residues (Figures 1 and 2) . The multiplicity of LD motifs in all members, and the specific loss of one motif in most members, raise the question of whether and how a particular number and sequence of motifs is linked to a particular biological task.
LD motif proteins
LD motif sequences
|Paxillin LD motif|
|FAK||9 (SPR)1*||2 (SPR)1||1 (SPR)1||CD4 (4.5/16) (ITC)2|
|9/12 (ITC)3‡||24/4 (ITC)2‡||DLC1|
|CCM3||17 (SPR)5||39 (SPR)5||23 (SPR)5|
|GIT||25 (ITC)7||7,10 (ITC)7,8|
|α-Parvin||96 (NMR)9,10||204 (NMR)9,10||2300 (NMR)9,10||140 (NMR)9,10||mM (NMR)9,10|
|53 (SPR)||76 (SPR)||55 (SPR)|
|β-Parvin||27 (SPR)10||42 (SPR)10||73 (SPR)10|
|PV E6 (BE6 and 16E6)||X11||X11||X11||E6AP11|
|Direct LD interaction to be determined|
|Direct LD interaction controversial|
Binding not observed by all authors.
The exact mode of interaction is controversial and involves more than just an LD motif of paxillin. Value shown is for LD1 binding to RRM2.
Binding affinities for sites 1/4 and 2/3 respectively.
I. Barsukov and G. Roberts, personal communication.
The N-terminal part of paxillin family members, which is predicted to be unstructured, also contains other protein–protein interaction sites besides LD motifs, including a proline-rich SH3 (Src homology 3)-binding motif and multiple sites for serine, threonine and tyrosine phosphorylation . With this N-terminus, paxillin interacts directly or indirectly with a large number of cytoplasmic signalling and cytoskeletal proteins [1,2]. The C-terminal LIM domains are required to target paxillin to FAs, reportedly through an association with the cytoplasmic tail of β-integrin , although this interaction remains controversial. LIM domains also mediate protein interactions with several structural and regulatory proteins, including tubulin and the tyrosine phosphatase PTP (protein-tyrosine phosphatase)-PEST (Pro-Glu-Ser-Thr).
The zinc-binding LIM domain region also localizes paxillin family members to the nucleus. Hic-5 exhibits zinc-dependent direct DNA binding, whereas the leupaxin and Hic-5 LIM domains are required for association with NRs [leupaxin, AR (androgen receptor); Hic-5, AR and GR (glucocorticoid receptor)] [10,11]. In addition, paxillin was shown to bind to the AR. By interacting with NRs and with other co-regulators, paxillin-family members can affect gene expression [10,12,13]. Selected LD motifs of paxillin (LD2 and LD4) [14,15], leupaxin (LD4 and LD5) and Hic-5 (LD4) have been shown to function as a NES, allowing these proteins to shuttle between the nucleus, cytoplasm and FAs (LD numbering as in Figure 1; different from some authors).
By combining a FA/nuclear localization LIM region with a signalling scaffold and NES, paxillin family proteins have become important links between the extracellular environment (sensed through integrins and various classes of growth factor receptors), cell adhesion and gene expression (through NRs and/or DNA interactions). As such, paxillin family proteins belong to a set of ancestral key molecules that enable multicellularity . Indeed, PaxB, a paxillin homologue from the amoeba Dictyostelium, is required for cell-substrate adhesion, cell migration and multicellular aggregation behaviour . In higher eukaryotes, the interactions between paxillin family proteins and their many ligands regulate processes such as embryogenesis, wound healing and cancer invasiveness [19–22].
LD MOTIFS FROM OTHER PROTEIN FAMILIES
Owing to the importance of LD motifs in evolution, biology and disease, there is great interest in identifying such motifs in proteins not related to paxillin. However, a sequence search using a degenerate consensus appears insufficient for finding LD motifs with good confidence, because the amphipathic LD motif pattern is frequently found in helices of all sorts of protein domains, leading to a prohibitively high number of false positives in genome-wide searches. Moreover, the LD motif consensus has features similar to those of other short amphipathic helical interaction motifs, such as LXXLL motifs implicated in transcriptional regulation and the NES. In the absence of a reliable genome-wide detection algorithm for LD motifs, the currently known non-paxillin LD motifs were established one by one using cell biology and biochemical methods. Potential LD motif consensus sequences were only confirmed in three cellular proteins, namely gelsolin (an actin binding, severing and capping protein mediating osteoclastic actin cytoskeletal organization), the DLC1 (deleted in liver cancer 1) tumour suppressor and the RoXaN (rotavirus ‘X’-associated non-structural) protein (Figures 1 and 2). Of these, only DLC1 and RoXaN adhere to the strictest LD motif consensus, because the gelsolin LD motif has an alanine instead of the bulky hydrophobic leucine/methionine at position +3 (Figure 2). These proteins are functionally related to paxillin; both DLC1 and gelsolin are important in functions of cellular adhesion structures, whereas RoXaN interacts with a major paxillin ligand, PABP1 [poly(A)-binding protein 1], and thus may contribute to targeted delivery of mRNA to adhesion sites (detailed below) [14,25–27].
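The false-positive problem described above can be illustrated with a minimal sketch. It assumes the core consensus LDXLLXXL as the strict pattern and a hypothetical loosened pattern (D or E at position +1, L or M at the hydrophobic positions); the loosening rule, the function name `scan` and both toy sequences are assumptions made for illustration and do not come from any published search tool or real protein.

```python
import re

# Strict core consensus as named in the text: L-D-X-L-L-X-X-L.
STRICT = re.compile(r"(?=(LD.LL..L))")
# Hypothetical degenerate variant (an assumption for illustration only):
# D or E at +1, and L or M at the other hydrophobic positions.
DEGENERATE = re.compile(r"(?=(L[DE].[LM][LM]..[LM]))")


def scan(pattern, sequence):
    """Return (0-based start, 8-residue match) for all overlapping hits."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence)]


# Toy sequences invented for this sketch; they are not real protein sequences.
toy_ld_like = "GSELDELLAALSGS"
toy_generic_helix = "AKLEALMKKLAE"

print(scan(STRICT, toy_ld_like))            # strict pattern on the LD-like toy
print(scan(STRICT, toy_generic_helix))      # strict pattern on a generic helix
print(scan(DEGENERATE, toy_generic_helix))  # loosened pattern on the same helix
```

In this toy case the strict pattern ignores the generic amphipathic stretch, whereas the loosened pattern reports it as a hit. This is the kind of spurious match that would be expected to accumulate when a degenerate consensus is applied to a whole proteome.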
The ubiquitin ligase E6AP (human PV E6-associated protein) and the calcium-binding E6BP [E6-binding protein, also known as RCN2 (reticulocalbin 2, EF-hand calcium-binding domain) and ERC55] are often mentioned in the context of LD motifs [6,28–30]; however, strictly speaking, the motifs of these proteins do not conform to the LD motif consensus. The E6AP motif lacks a negative charge at position +1 and has a glutamic acid instead of the bulky hydrophobic residue at position +7, whereas the E6BP motif has unusually large hydrophobic residues at positions +3 (phenylalanine) and +7 (tyrosine) and is part of a functional calcium-binding EF-hand structure . This structural context is unusual, because all other currently known LD motifs are not part of a folded 3D domain. Rather, the LD motifs of the paxillin family, RoXaN, DLC1 and E6AP are localized in regions that are most probably largely unstructured, according to computational sequence analysis (results not shown). The gelsolin LD motif is localized within a folded structure, but this LD motif forms the protein's C-terminal helix, placed at the end of a ten-residue linker . As such, the gelsolin LD motif helix is only weakly and reversibly bound to other independently folded domains of gelsolin. All of the structural data available to date confirm that LD motifs form helices upon binding to their ligands (Figures 3 and 4). In the absence of ligands, the stability of the helix conformation may vary, because paxillin LD2, but not LD4, forms a stable helix in solution [32,33]. However, all other LD motifs have a propensity to form helices, as indicated by secondary structure predictions.
Molecular recognition of LD motifs
Several different ligand domains have evolved the capacity to recognize LD motifs (Figures 3 and 4). These ligands bind LD motifs with affinities in the micromolar range, typical for many dynamic peptide–ligand interactions in signalling (Table 1). LDBDs use structurally diverse architectures to bind LD motifs. The majority of known LDBDs use a C-terminally located domain that is similar to the FAT (FA-targeting) domain of FAK. In the present review we refer to these domains as FAH (FAT-homology) domains. However, C-terminal FAH domains do not necessarily bind LD motifs [35–37].
The FAK FAT domain
FAK is composed of a central tyrosine kinase domain, flanked by an N-terminal FERM (4.1/ezrin/radixin/moesin) domain and a C-terminal FAT domain, which is attached to the kinase domain by a 220-residue-long flexible linker. The scaffolding and kinase functions of FAK are central for FA assembly, disassembly and signalling. FAK also has numerous kinase-independent functions in other subcellular sites, including the nucleus [21,38,39]. The interaction between FAT and paxillin LD motifs is instrumental for recruiting FAK to FAs, although FAT also has paxillin-independent ways of localizing to FAs, possibly through an interaction with talin [40,41]. Beyond recruiting FAK, paxillin also stimulates kinase-dependent functions of FAK at FAs by enhancing FAK dimerization via clustering and intradomain interactions. The crystal structures of paxillin LD2 and LD4 peptides bound to the FAT domain of FAK were the first reported atomic structures of paxillin–ligand interactions. These crystal structures were soon complemented by NMR analyses of the FAT–paxillin LD motif interaction [33,44].
FAT forms a four-helix bundle structure that offers two LD motif-binding sites: one formed between helices 1 and 4 (site 1/4) and one between helices 2 and 3 (site 2/3) [33,43–45] (Figure 3). Structural and biophysical measurements show that both LD2 and LD4 are, in principle, capable of binding very similarly to both sites on FAT. A preference of LD4 for site 2/3 has been reported, whereas LD2 binds either preferentially to site 1/4, or indiscriminately to both sites, according to two studies that used different experimental setups and different peptide sequences [33,46]. On each site, the hydrophobic surface of the helical LD motif peptides docks on to a shallow hydrophobic patch formed between two FAT helices. The acidic residues at positions −1 and +1 (ELD) interact with the basic charges on FAT that surround the hydrophobic patch. Additionally, polar or basic FAT residues interact with the peptide. In the absence of ligands, site 1/4 is occupied by an N-terminal extension forming a short PPII (polyproline II) helix. Apart from displacing this N-terminal extension upon binding, LD peptides do not trigger large structural rearrangements in FAT, although binding of LD peptides globally stabilizes the otherwise very loose four-helix structure of FAT [33,43,44].
FAT also uses the same two binding sites to associate with the endocytosis motif of CD4. The FAK–CD4 complex represents an alternative route for eliciting T-cell-specific signals and links gp120 engagement to distinctive T-cell signalling during HIV infection. Although this CD4 motif forms a short amphipathic helix, it is not an LD motif in a narrow sense because it lacks acidic charges in positions −1 and +1 and it lacks a bulky hydrophobic residue at position +7 (Figure 2). To fit to the LD motif-binding sites, CD4 orients itself differently on FAT. Compared with LD motifs, the CD4 helix axis is rotated by 180° (site 1/4) or by 30° (site 2/3) to allow ionic interactions between the positive charges on FAT and the only negative charge on CD4, while maintaining a similar hydrophobic interface. The capacity of FAT to accommodate two different types of short helical motifs shows the plasticity of this LDBD and suggests that FAT might be capable of interacting with a range of other motifs.
The PYK2 FAT domain
PYK2 [proline-rich tyrosine kinase 2; also called FAK2, CAKβ (cell adhesion kinase β) and RAFTK (related adhesion focal tyrosine kinase)] is a close homologue of FAK, but with a unique mechanism for calcium regulation [48,49]. Although PYK2 can locate to FAs using its FAT domain, PYK2 is not strongly localized at FAs in many cell types, suggesting altered regulation of FA targeting [49,50]. The PYK2 FAT four-helix bundle domain is closely homologous to FAT (~60% sequence identity), and also binds two LD4 peptides in the same position and orientation as FAT. The main difference from FAT comes from the PYK2 FAT N-terminal extension, which lines the LD motif-binding site 1/4 and interacts with the LD motif peptide (Figure 3). As a result, the interaction between PYK2 FAT and the LD motif position −3 is no longer possible, and this residue of the LD motif appears to become flexible. It is therefore probable that the N-terminal extension of PYK2 contributes to ligand selection.
The CCM3 FAH
Mutations in the CCM3 [cerebral cavernous malformation 3; also known as PDCD10 (programmed cell death 10)] gene lead to cerebral cavernous malformation, a dysplasia of the central nervous system. The non-catalytic CCM3 interacts with several proteins; however, the function of CCM3 remains poorly defined. CCM3 probably promotes Golgi assembly and polarization, but also has functions at the cell membrane, to which it is recruited by interactions with phosphatidylinositol, VEGFR2 (vascular endothelial growth factor receptor 2) and paxillin. CCM3 consists of an N-terminal dimerization domain and a C-terminal LDBD. Despite a lack of obvious sequence similarity, the CCM3 C-terminus forms a four-helix bundle similar to FAT (with an RMSD over 110 residues of less than 2 Å between them despite only 14% identity). This CCM3 FAH binds only one LD motif on its site 2/3, whereas CCM3 site 1/4 is occluded by an N-terminal extension, partly loop, partly helical, connecting the FAH to the CCM3 dimerization domain (Figure 3). Although CCM3 site 2/3 is very similar to FAT site 2/3, the orientation of the LD2 and LD4 peptides on CCM3 is inverted compared with FAT. Interestingly, LD1 binds CCM3 FAH in the opposite direction to LD2 and LD4 [hereafter, we define the (+) orientation as the one adopted by the LD2 and LD4 motifs on LDBDs, and the opposite direction as the (−) orientation]. The D+6LE motif of LD1 functions as an inverted E−1LD motif, presumably because the D−1LD sequence in the LD motif is less optimal in this position than an (inverted) E−1LD motif. The hydrophobic interactions LDXLLXXL are maintained, despite the (−) orientation, due to the pseudo-palindromic nature of the hydrophobic patch in helix conformation.
Vinculin is another important component for linking FAs to the actin cytoskeleton. Vinculin associates with many cytoskeletal and signalling proteins, and the C-terminal Vt (vinculin tail domain) binds paxillin and also other molecules, such as actin and raver1. Vt forms a five-helix bundle with similarity to the FAH domains when considering the Vt N-terminal helix as a structural element similar to the N-terminal extensions of FAT and the CCM3 FAH . Vinculin binds to LD motifs 1, 2 and 4, but mutational analysis suggests that Vt only binds one LD motif at a time [55,56]. The binding site of LD motifs on Vt remains to be determined, but given that Vt offers sites with characteristics similar to other FAHs (Figure 3), it is expected that Vt binds LD motifs similarly between two helices. The affinity of single LD motifs for Vt is in the same range as for other FAHs, but given that Vt forms homodimers in vitro  and that vinculin dimerizes and becomes clustered upon binding to actin filaments , it can be speculated that multiple interactions tether one paxillin molecule to two or several vinculin molecules with increased affinity.
The GIT (G-protein-coupled receptor kinase-interacting) proteins GIT1 and GIT2 (also known as p95-PKL), are adaptor proteins that recruit signalling proteins to distinct cellular locations . In particular, by interacting with paxillin, GIT proteins link the PIX (p21-interacting exchange factor) family of Rho guanine-nucleotide-exchange factors and their binding partners, the p21-activated protein kinases, to FAs [56,60]. GIT1 and GIT2 contain an N-terminal Arf GTPase-activating protein domain, an SHD (Spa2 homology domain), a CC domain (coiled–coil domain) and, finally, a C-terminal domain. Computational followed by experimental approaches identified a four-helix FAH domain in the C-terminus (Figure 3) [61,62]. GIT proteins bind to paxillin and Hic-5 LD motifs 2 and 4 [61,62]. The experimental structure of a GIT–LD complex remains to be determined, but computational and experimental evidence suggests that the GIT FAH binds only one LD motif at site 1/4. Multiple simultaneous GIT–LD interactions might be possible, because GIT1 dimerizes through the CC domain and GIT dimers bind trimeric PIX through the SHD [63,64]. GIT SHD also binds FAK , thus providing the possibility of forming various complexes with multiple heterologous paxillin-binding sites.
In addition to FAHs, other molecular architectures have also been shown to bind LD motifs.
Besides vinculin, parvins constitute another paxillin- and actin-binding protein family that contributes directly to the initial nucleation of actin at FAs. Mammals have three parvins: α-parvin (also called actopaxin), β-parvin (or affixin) and γ-parvin. Parvins form a tight complex with ILK (integrin-linked kinase), and paxillin binding recruits the complex to FAs. Structural and biochemical analyses show that α- and β-parvin bind to paxillin LD motifs 1, 2 and 4 [67–69]. Parvins contain two CH (calponin homology) domains (CH1 and CH2), linked by a 60-residue linker. The LD motif-binding surface is situated on the C-terminal CH domain (CH2). An N-terminal helix, atypical for canonical CH domains, is required for paxillin binding, and its presence may thus distinguish potential LD motif-binding CH domains from others. Parvin CH2 domains offer only one LD motif-binding site that is formed by three non-parallel helices and is therefore less shallow than in FAHs (Figure 4). The affinities of LD motifs for α- and β-parvins (30–200 μM) are a little lower than those for FAHs (5–30 μM) (Table 1). This could in part result from an entropic penalty for recruiting and pinning down the N-terminal helix, which appears mobile in the absence of LD motifs [67–69]. LD motif binding of γ-parvin, which is a less well-characterized isoform specific to haemopoietic tissues, remains to be determined. γ-Parvin shows non-conservative substitutions of amino acids in the LD motif-binding site, which may negatively affect paxillin interactions (see the legend to Figure 4). The α-parvin CH1 domain was reported to form dimers in solution and LD motif-bound parvin CH2 domains form the same dimers in the different crystal lattices (results not shown), raising the possibility that two paxillin LD motifs may interact with a parvin dimer.
As described for CCM3, LD1 binds in the (−) orientation compared with LD2 and LD4, with the D+6LE motif occupying the position of the E−1LD motif in LD2 and LD4 [67,69]. Thus, despite having a different architecture than FAHs, the ELD/DLE motif directs its binding orientation on the α- and β-parvin LDBD.
Mammalian PV E6 proteins, which consist of two zinc-binding domains (E6N and E6C), are key players in epithelial tumours induced by PV, including cervical cancer in humans. E6 are adaptor proteins that recognize some of their target proteins, including E6AP, through acidic motifs containing the LXXLL consensus sequence. E6 from bovine PV1 (BE6) and human PV16 (16E6) have specifically evolved to recognize paxillin LD1, LD2 and LD4 [28,70]. The interaction of E6 with paxillin may disrupt the actin cytoskeleton, characteristic of transformed cells . Human 16E6 also binds LXXLL motifs that are not true LD motifs, whereas E6 from other PVs fail to bind paxillin [28,70]. The structure of the BE6–LD1 complex shows that E6 binds the LD1 helix akin to the E6AP LXXLL motif in a deep cleft formed between the two zinc-binding domains (Figure 3) . To bind E6 in the same mode as the shorter LXXLL motif (Figure 2), the last helical turn of LD1 is required to unfold. Collectively the data suggest that the interconversion of the E6 site between an LDBD and a binding site for other helical LXXLL motifs requires only minor changes, due to the structural malleability of LD motifs.
The anti-apoptotic Bcl-2 protein promotes cell survival in response to a range of stimuli, including those from integrins and growth factor receptors. Bcl-2 interacts with paxillin LD4, thus linking cell survival and cell adhesion [71,72]. The Bcl-2–LD4 association is unusually specific, because Bcl-2 only binds to paxillin LD4 and not even to LD4 from leupaxin or Hic-5 . Bcl-2 achieves this specificity by selecting for LD motif residues E+2 and S+6. On the LD motif helix, these residues are located on the side opposite to the hydrophobic patch and are not required for other known LDBDs. The atomic structure of the paxillin LD4–Bcl-2 complex is unknown; however, the structure of Bcl-2 bound to the BH3 (Bcl-2 homology 3) domain of the pro-apoptotic Bcl-2 family member BAX might provide clues for the specific molecular recognition of paxillin LD4 by Bcl-2 (Figure 4) . BAX BH3 forms a long amphipathic helix that is enclosed in a deep trench of the all-helical Bcl-2. If Bcl-2 bound paxillin LD4 similarly, then LD4 positions E+2 and S+6 would be part of the interface and could play a role in LD motif selectivity.
The interaction between paxillin and the PABP1 is required to deliver mRNA transcripts from the nucleus to the leading edge of the migratory cell where proteins such as actin are synthesized [27,74]. The high affinity of the paxillin–PABP1 association (~10 nM) combined with the abundance of PABP1 result in the majority of paxillin being associated with PABP1 most of the time . The atomic structure of the paxillin–PABP1 complex has not been established, but it is known that the tight interaction involves PABP1 RRM (RNA recognition motif) domains 1–4 and the paxillin N-terminus . PABP1 RRM domains are small globular domains composed of a 4–5-stranded β-sheet and two α-helices (Figure 4) . LD1 alone was shown to bind PABP1 only weakly, suggesting an extended interaction surface  (I. Barsukov and G. Roberts, personal communication). The paxillin LD2 appears to function as a NES in this context, to mediate nuclear export via a CRM1 (chromosome region maintenance 1)-dependent pathway (see below) [14,27,74]. Thus paxillin might use LD1 and other regions to bind mRNA-loaded PABP1, LD2 to mediate nuclear export of the complex and the C-terminal LIM domains to localize the complex to FAs.
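The argument that a ~10 nM affinity combined with an abundant partner leaves most paxillin in the bound state can be made concrete with a simple 1:1 equilibrium binding estimate. The sketch below is a back-of-the-envelope illustration only: it assumes a single binding site, a free partner concentration that is not depleted by binding, and hypothetical PABP1 concentrations, none of which are given in the text.

```python
def fraction_bound(free_partner_nM, kd_nM):
    """Fraction of paxillin bound, assuming simple 1:1 binding at equilibrium."""
    return free_partner_nM / (kd_nM + free_partner_nM)


KD_NM = 10.0  # approximate paxillin-PABP1 affinity quoted in the text

# Hypothetical free PABP1 concentrations, chosen only to span 1x to 100x Kd.
for partner_nM in (10.0, 100.0, 1000.0):
    bound = fraction_bound(partner_nM, KD_NM)
    print(f"{partner_nM:7.0f} nM free partner: fraction bound = {bound:.2f}")
```

Under these assumptions, half of the paxillin is bound when the free partner concentration equals the Kd, and the bound fraction exceeds 90% once the partner is present at ten times the Kd or more.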
During rotavirus infection, the viral non-structural protein NSP3 displaces PABP1 from eIF4G (eukaryotic translation initiation factor 4γ), promoting the translation of viral mRNAs while inhibiting the translation of cellular poly(A) mRNAs. NSP3 also recruits the cellular protein RoXaN into the complex with eIF4G. RoXaN is an adaptor protein containing TPRs (tetratricopeptide repeats) and an LD motif in its N-terminus and several repeats of RNA-binding zinc fingers in the C-terminal half. NSP3 uses its coiled–coil dimerization domain to bind to the LD motif of RoXaN. For the virus, the function of the NSP3–RoXaN interaction is to confine PABP1 to the nucleus and to use eIF4G to export viral RNA. PABP1 and RoXaN, which bind mRNA with different sequence specificity, also bind each other, although not through the RoXaN LD motif, but probably through an RNA intermediate. The RoXaN LD motif also functions as a CRM1-dependent NES. Thus, in uninfected cells, RoXaN might assist in the export of a subset of mRNAs through the CRM1 pathway, and RoXaN sequestration by NSP3 in the cytoplasm would deplete the nucleus of RoXaN and stop the nuclear export of PABP1. An experimental structure of the NSP3–RoXaN LD motif interaction is currently missing, but mutational analysis combined with homology modelling suggests that the two parallel NSP3 helices provide an environment similar to FAHs, namely a shallow hydrophobic patch lined by positive charges. The dimer symmetry of the NSP3 coiled–coil suggests that an NSP3 dimer binds to two RoXaN LD motifs simultaneously, although this requires experimental confirmation.
The CRM1 protein [also known as Xpo1 (exportin 1)] is an adaptor protein that enables the nuclear export of more than 100 proteins through the recognition of α-helical NES. CRM1 contains 20 HEAT repeats that form a ring-shaped structure. NES form short helical motifs that bind to a hydrophobic groove formed between two parallel HEAT helices [77,78]. This hydrophobic binding site is surrounded by positive charges and hence is similar to LD motif-binding sites offered by FAHs. The structures of the dual LD/NES motifs from RoXaN and paxillin family proteins in complexes with CRM1 remain to be determined, but it is expected that these motifs bind CRM1 akin to NES motifs from other proteins. NES motif recognition is enhanced by supplementary contacts between CRM1 and cargo proteins outside the NES sequence. An interesting question is therefore whether such additional contacts also exist for complexes between CRM1 and RoXaN or paxillin family members.
Clathrin is a major component for the formation of coated vesicles. Three clathrin heavy chains and three clathrin light chains create a triskelion shape. These triskelia can assemble into a polyhedral coat for vesicles. Using proteomics approaches, two groups observed an interaction between paxillin LD4 and the clathrin heavy chain [56,74]. The function and structure of this interaction, which may be indirect, are unknown.
ILK is implicated in several cellular functions, including cell migration and adhesion. ILK contains four ankyrin repeats and a C-terminal pseudokinase domain that functions as a protein–protein interaction domain. ILK forms a complex with PINCH (particularly interesting Cys-His-rich protein; through the ankyrin repeats) and parvin (which binds to the pseudokinase active site). ILK was initially reported to bind to paxillin LD1 directly, using its pseudokinase domain . However, based on current structural and functional data, it is most probable that the paxillin LD1–ILK interaction is indirect, via ILK-associated parvins. Indeed, the ILK–parvin interaction is compatible with binding of paxillin LD1, 2 and 4 motifs to parvin (see above) , the ILK mutations that disrupted paxillin-binding affect the kinase fold and stability, and hence parvin binding, and LD1 binds significantly tighter to parvins than do LD2 and LD4 [67,69], reflecting the apparent LD1 specificity of ILK  (Table 1).
PAKs (p21-activated kinases) are implicated in a number of different intracellular processes, including integrin signalling and cytoskeletal regulation. Although Turner et al.  showed that paxillin binds PAK1/3 only indirectly via GIT1/2 (see above), Hashimoto et al.  have published data in support of a direct interaction between the N-terminal 62 residues of PAK3 and paxillin LD4. Intriguingly, computational analysis (results not shown) strongly supports that this region is mostly unstructured and therefore constitutes an unexpected interaction partner for a short helical LD motif. Further experiments are required to characterize this interaction.
Summary of LD–LDBD interactions
Most LD motif-binding proteins fall into two major classes. The first includes proteins that are involved in remodelling the actin cytoskeleton, either by directly connecting adhesion sites with actin (parvin and vinculin) or by regulating these interactions (FAK and GIT). The second class comprises molecules that regulate delivery of mRNA from the nucleus to adhesion sites (PABP1 and, by interference, NSP3). The data we summarized above show that different, or even the same, domain architectures have independently evolved to recognize LD motifs. All known LD motif-binding sites, and most of the predicted ones, consist of a mostly helical framework that forms an elongated hydrophobic patch, flanked by one or two positive charges on each side, towards the edges. However, the depth of the binding site ranges from shallow (FAHs) to deep (E6 and, possibly, Bcl-2).
Initially, a sequence analysis based on mutational and biochemical evidence led to the suggestion that so-called PBS (paxillin-binding subdomains) mediate LD motif binding in many ligands (including FAK , ILK , vinculin , β-parvin , GIT2  and PABP1 ). However, the currently available structural analysis clearly shows that the predicted PBS are not directly involved in any of the known LD–LDBD complexes [43,61,62,67,68]. The fact that some mutations in PBS affect LD motif interactions can be explained by these mutations disrupting the 3D fold of the LDBD, rather than affecting the LD motif-binding site directly. PBS sequences appear to be enriched in helix–loop structural elements, and helical domains are often used for LD motif recognition. As a consequence, although structural analysis has made the PBS obsolete, it nonetheless needs to be credited for helping biologists identify a probable helical domain before any structural information was available.
THE MOLECULAR BASIS FOR THE ADAPTABILITY AND SELECTIVITY OF LD MOTIFS
The present review highlights that the interactions between single LD motifs and LDBDs are generally only poorly selective. Many LD motifs bind to LDBDs with different architectures, whereas some LDBDs, such as FAT or E6, also bind motifs that are not strict LD motifs (CD4, E6AP and E6BP). Moreover, highly similar LD motifs bind the same LDBDs in opposite directions (LD1 compared with LD2 or LD4). Collectively, the available data suggest how LD motifs can structurally adapt to diverse LDBDs. This adaptation results from a malleability of the LD motif helix conformation, which can even partly melt (see E6–LD1  or compare sites 1/4 and 2/3 in PYK2–LD4 ). Within this malleable LD motif helix, the hydrophobic side chains can adapt to differently shaped hydrophobic surfaces, whereas the long-range ionic interactions between the long and flexible residues of the LDBD (lysines and/or arginines) and the LD motif (E−1LD+1) tolerate a large difference in their respective Cα positions. In the helix conformation, the LXXLLXXL hydrophobic patch of LD motifs is pseudopalindromic, explaining how LD motifs can bind in opposing directions to the same LDBD (using D+6LE+8 instead of E−1LD+1) if this allows optimizing ionic interactions.
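The pseudopalindromic arrangement of the hydrophobic positions can be checked with a short script. The following is a minimal sketch, not a motif-detection method: the peptide is a hypothetical sequence invented for illustration, and the regular expression encodes only the LXXLLXXL leucine spacing described above. It shows that the same spacing is found when the peptide is read in the opposite direction, with acidic residues flanking the first leucine in both orientations.

```python
import re

# Leucines at relative positions 0, 3, 4 and 7, any residue elsewhere.
# The lookahead form also reports overlapping matches.
HYDROPHOBIC_CORE = re.compile(r"(?=L..LL..L)")

def core_starts(peptide: str) -> list[int]:
    """Start indices of every LXXLLXXL match in the peptide."""
    return [match.start() for match in HYDROPHOBIC_CORE.finditer(peptide)]

def flanks(peptide: str, start: int) -> tuple[str, str]:
    """Residues at positions -1 and +1 relative to the first leucine.

    Assumes start >= 1, i.e. at least one residue precedes the core.
    """
    return peptide[start - 1], peptide[start + 1]

# Hypothetical peptide carrying both E(-1)-L-D(+1) and D(+6)-L-E(+8).
peptide = "ELDRLLQDLES"
reversed_peptide = peptide[::-1]

forward = core_starts(peptide)
backward = core_starts(reversed_peptide)

print("forward:", forward, flanks(peptide, forward[0]))
print("reversed:", backward, flanks(reversed_peptide, backward[0]))
```

Because leucine positions 0, 3, 4 and 7 map on to 7, 4, 3 and 0 when the helix is reversed, the pattern matches in both reading directions. The sketch ignores helix geometry and side-chain chirality, which also shape the real binding surface.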
Many LDBDs bind to paxillin LD motifs 1, 2 and 4 with moderate to low affinity, and with only little difference in affinity for different LD motifs. Even LD motifs 3 and 5, which have so far not been shown to bind to a particular ligand, still retain some affinity for parvin  and possibly for other LDBDs (Table 1). This low affinity may result from the relatively small binding surface (<1000 Å2 for FAHs), but also from the entropic cost for LD motifs to form a stable α-helix upon binding, or for the LDBD to become more rigid (e.g. the FAT domain or N-terminal helix of the parvin CH2 domain [33,43,44,67]). Therefore the apparent selectivity for single LD motifs may often only be a matter of (experimentally) defining a certain Kd value cut-off. For example the binding of paxillin LD1 to FAT was observed by Li et al.  using surface plasmon resonance, but not by Turner et al.  using precipitation binding assays (Table 1). The in vivo relevance of affinity differences and low-affinity interactions remains to be determined, especially in the context of multiple simultaneous interactions within multiprotein complexes.
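The dependence of apparent selectivity on a detection cut-off can be illustrated with a simple equilibrium calculation. The sketch below assumes 1:1 binding with the ligand in excess, so that the occupied fraction is [L]/([L] + Kd). The Kd values, the ligand concentration and the thresholds are hypothetical numbers chosen for illustration and are not taken from Table 1.

```python
def fraction_bound(free_ligand_um: float, kd_um: float) -> float:
    """Occupied fraction of binding sites for 1:1 binding at equilibrium."""
    return free_ligand_um / (free_ligand_um + kd_um)

# Hypothetical dissociation constants in micromolar (not measured values).
example_kds_um = {"motif A": 5.0, "motif B": 20.0, "motif C": 200.0}
ligand_um = 10.0  # assumed free ligand concentration in the assay

def detected_motifs(threshold: float) -> list[str]:
    """Motifs whose occupancy reaches the assay's detection threshold."""
    return [
        name
        for name, kd_um in example_kds_um.items()
        if fraction_bound(ligand_um, kd_um) >= threshold
    ]

for threshold in (0.5, 0.25, 0.04):
    print(threshold, detected_motifs(threshold))
```

With these assumed numbers, a stringent assay reports a single 'selective' interaction, whereas a more sensitive assay reports all three, although the underlying affinities are unchanged.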
Within the limitations imposed by the available detection methods, some LD–LDBD interactions appear highly selective, suggesting that LDBDs can evolve high specificity if required. Thus Bcl-2 binds LD4 only from paxillin, not from Hic-5 or leupaxin, whereas gelsolin binds to the FAH only from PYK2 and not from FAK (Table 1).
Given that LDBD–LD interactions can be highly specific, their apparent promiscuity and low affinity appear functionally important, by allowing, for example, LD motif-containing proteins to serve as scaffolds for different cellular partners in different cellular contexts. The low affinity of a single LD motif allows multiple simultaneous interactions to occur without reaching a combined affinity that is too high to be dissociated in a timely manner. Thus the low affinity of single LD motifs allows for selection through coincidence detection, where multiple simultaneous interactions are required to achieve stable complexes, whereas sporadic associations of single LD motifs with non-specific partners might be too weak to elicit a cellular response.
Multiple binding sites can occur on the same LDBD, on an LDBD homodimer or on an LDBD heterodimer. For example, it has been shown that a paxillin fragment comprising LD motifs 2–4 binds 5–10 times more tightly to FAT than fragments containing only LD2 or LD4, and CCM3 forms homodimers that are compatible with a simultaneous LD motif interaction. In addition, interactions outside the LD motifs can increase the specificity and affinity for a specific ligand, as exemplified by the paxillin–PABP1 association. Unfortunately, experimental structures of extended and/or multiple simultaneous LD motif–ligand interactions have not been reported.
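The argument that weak single-site affinities permit coincidence detection without kinetically trapping the complex can be made concrete with a back-of-the-envelope calculation. The sketch below rests on two simplifying assumptions that do not come from the studies cited here: (i) a textbook bivalent model in which the combined Kd is approximated as Kd1 × Kd2 / Ceff, where Ceff is the effective local concentration of the second motif once the first is bound; and (ii) a fixed association rate constant, so that the mean lifetime of a complex is 1/(Kd × kon). All numerical values are hypothetical.

```python
def bivalent_kd(kd_site1_m: float, kd_site2_m: float, effective_conc_m: float) -> float:
    """Approximate combined Kd (M) for two linked sites (simplified model)."""
    return kd_site1_m * kd_site2_m / effective_conc_m

def mean_lifetime_s(kd_m: float, k_on_per_m_s: float = 1e6) -> float:
    """Mean complex lifetime (s), assuming k_off = Kd * k_on."""
    k_off_per_s = kd_m * k_on_per_m_s
    return 1.0 / k_off_per_s

single_site_kd = 10e-6    # hypothetical 10 micromolar single LD motif
effective_conc = 100e-6   # hypothetical effective concentration of the linked motif
tight_single_kd = 1e-9    # hypothetical nanomolar single-site binder, for contrast

combined_kd = bivalent_kd(single_site_kd, single_site_kd, effective_conc)

print("single weak site: ", mean_lifetime_s(single_site_kd), "s")
print("two weak sites:   ", mean_lifetime_s(combined_kd), "s")
print("one tight site:   ", mean_lifetime_s(tight_single_kd), "s")
```

Under these assumptions, two weak sites give a 10-fold gain in affinity and a complex that still dissociates within about a second, whereas a single nanomolar site would persist for many minutes. The effective concentration was picked so that the gain falls at the upper end of the 5–10-fold range quoted above; it is not a fitted parameter.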
REGULATION OF LD MOTIFS
Considering that LD motifs show poor selectivity and non-specific stickiness and that binding of the same LD motif to different partners can trigger different cellular effects, it is expected that regulatory mechanisms are in place to allow the spatial and time-controlled binding of LD motifs to specific partners and to avoid excessive interactions with non-specific partners.
One type of regulation results indirectly from the low affinity of a single LD–LDBD interaction, by only allowing stable complexes to form when all components of multiprotein complexes co-localize, or when high local concentrations of a particular LD motif or LDBD-containing protein are reached. Thus spatial and temporal control of LD motif interactions can result indirectly from co-expression and co-localization of the LD motifs and LDBD-containing proteins, or from clustering of their recruiting factors (e.g. integrin clusters at FAs).
Conversely, caspase cleavage of paxillin, which separates the N-terminal part from the LIM domains, blocks LD motif interactions at FAs and constitutes another important indirect regulatory element in FA disassembly. Similarly, LIM domain-only paxillin family proteins [such as DALP (death-associated LIM-only protein)] might displace paxillin from specific sites, and hence block LDBD interactions.
Indirect regulation can also be achieved through concealing LD motifs from ligands through a strong interaction with one particular LDBD. For example the interaction between paxillin and PABP1 has been described to be in the nanomolar range . This tight interaction requires large parts of the paxillin N-terminal region and seems to sequester the majority of paxillin in the cytosol or nucleus.
In addition, the affinity of LD motifs can be regulated directly on or adjacent to the LD motif. For instance paxillin LD4 can be phosphorylated on two serine residues (positions +6 and +8) [87,88]. This phosphorylation event affects binding to LDBDs, although the exact outcome remains controversial. Bertolucci et al.  observed in vitro that phosphorylation weakens the propensity of LD4 to form the helix conformation required for binding, explaining why phosphorylation of Ser+6 decreased the affinity for FAT by a factor of ~2.5. We  and Dong et al.  have reported that the phosphorylation of Ser+6 selectively reduces binding to GIT FAH, but not FAT, in vitro and in cells, because of a cluster of acidic residues, which is present in the GIT1 LD-binding site and absent from FAT. Conversely, Nayal et al.  observed that Ser+6 phosphorylation augmented interaction with GIT1. Although the exact mechanism requires clarification, serine phosphorylation within LD4, and possibly within other LD motifs, appears to be an important modulator of binding. Of note, Ser+6 phosphorylation also blocked the NES activity of paxillin LD4 . Conversely, the Hic-5 LD4 NES function was affected by an allosteric mechanism through oxidation of cysteine residues located close to LD2 .
Akin to the allosteric oxidative regulation of the Hic-5 NES, phosphorylation events located outside the LD motifs can modulate LD–LDBD interactions. For example, upon the ligation of an integrin to either a fibronectin or collagen substrate, paxillin becomes tyrosine phosphorylated, primarily on Tyr31 and Tyr118. Phosphorylation of Tyr31 and Tyr118 increases the association of FAK with the adjacent LD motifs of paxillin by an unknown allosteric mechanism, introducing a switch for FA maturation.
Autoinhibitory interactions can regulate the exposure of LD motifs. For example, the gelsolin LD motif is located at the C-terminus of a compact structure formed by six calcium-binding domains (G1–G6). The LD motif is separated from G6 by a linker of approximately 10 residues. In the calcium-free form, the gelsolin LD motif forms a helix and hides its potential ligand-binding site by docking on to the G2 domain. Structural studies suggest a regulatory mechanism where co-operative binding of calcium to sites of G2 and G6 triggers the release of the LD motif and initiates the gross conformational changes required for activation. Calcium might also regulate the interaction between the atypical LD motif of E6BP and the viral E6 proteins. The suggested E6BP LD motif is part of an EF-hand domain. Ligand interactions of the potential LD motif of E6BP would require unfurling of the EF-hand structure. E6 binding would hence be incompatible with EF-hand–calcium interactions, suggesting a possibility for calcium regulation of this potential autoinhibitory mechanism.
REGULATION OF LDBDs
Cellular control of LD binding also occurs on the LDBDs. Currently known regulatory mechanisms include post-translational modifications, regulation of subcellular localization, autoinhibitory interactions, and structural and dynamic changes.
The FAK FAT domain possesses an astonishing variety of interlinked regulatory mechanisms. Phosphorylation of Tyr925, located in the FAT helix 1, is required to link FAK to MAPK (mitogen-activated protein kinase) pathways, initiating microtubule-induced disassembly of FAs. Tyr925 is phosphorylated by the Src kinase, and paxillin triggers this phosphorylation event by promoting FAK clustering and dimerization [42,94]. FAK dimerization is required for FAK autophosphorylation on Tyr397, which creates a site for Src binding and activation [21,95,96]. Phosphorylation of Tyr925 by Src and the subsequent interaction of Tyr925 with the SH2 domain of Grb2 (growth factor receptor-bound protein 2) require dissociation of helix 1 from the FAT core [36,97–100]. This dissociation is promoted by a PXPP motif in the loop between helices 1 and 2 that functions like a molecular spring [36,98]. However, the opening of helix 1 is incompatible with the binding of paxillin LD motifs to FAT, because opening disrupts FAT site 1/4 and probably also site 2/3, whereas LD binding to FAT stabilizes the four-helix bundle structure of FAT, which is incompatible with Tyr925 phosphorylation [36,98].
In the absence of a ligand, the N-terminal extension of FAT covers site 1/4. This extension needs to be displaced to allow LD motif binding to this site. However, in full-length FAK, FAT binds to the FERM domain. The FERM–FAT interaction requires the N-terminal extension of FAT and is reinforced by paxillin LD motifs. The molecular basis for the link between these interactions is unknown, but may involve stabilization of the FAT domain, and/or exposure of the N-terminal extension. The N-terminal extension of FAT carries the S910PPP motif, which binds to FAT site 1/4. Ser910 is phosphorylated by ERK (extracellular-signal-regulated kinase), and pSPPP recruits PIN1 (protein interacting with NIMA 1) and PTP-PEST, leading to dephosphorylation of FAK Tyr397. FAK inhibition by this mechanism mediates cell migration, invasion and metastasis. Another level of regulation is provided by caspase or calpain cleavage, which separates the C-terminal FAK region, including FAT, from the N-terminal FERM kinase fragment [102,103]. Expression of FRNK (FAK-related non-kinase; a FAK isoform lacking the FERM and kinase domains) provides a means for disrupting the FAK–paxillin interactions through competition with this truncated and catalytically inactive LDBD. Thus the LD motif–FAT association is part of an intricate control mechanism that allows FAK to sense and act in an environment-dependent manner.
Several, but not all, of the regulatory mechanisms established for FAK are present in PYK2. PYK2 has the same domain structure as FAK and is recruited to FAs by an interaction with its FAH domain [49,50]. The FERM dimerization site and the Src binding and activation motif are conserved in PYK2, as are Tyr925 and the helix 1-opening PXPP motif. PYK2 is regulated through calpain cleavage sites [104,105] and can be negatively regulated through a PRNK (PYK2-related non-kinase) isoform, the PYK2 analogue of FRNK. However, PYK2 dimerization requires interaction with calmodulin, and regulation of PYK2 through the N-terminal extension appears to be different, since the extension assists, rather than competes with, LD motif binding. The N-terminal extension also lacks an SPPP motif. PYK2 is generally more cytoplasmic than FAK and less efficiently recruited to FAs [49,50], suggesting that the accessibility of the FA targeting sites is more restricted on PYK2 than on FAK.
Akin to FAT, Vt undergoes a large structural opening. This opening, which is triggered by binding of phospholipids, almost certainly influences the capacity of Vt to bind paxillin LD motifs. In the inactive form of full-length vinculin, Vt contributes to a compact assembled conformation of vinculin through interactions with the Vh (vinculin head). This interaction involves large surfaces of Vt, thus potentially providing a means for functionally linking LD motif binding with disruption of Vh–Vt interactions, and hence with activation of vinculin's scaffolding activity.
In adherent cells, most of the GIT–PIX complex is cytosolic, and only a small fraction is bound to paxillin at FAs. Regulation of paxillin binding through GIT FAH shows different characteristics from those of FAT. In contrast with FAT, which is highly dynamic in the absence of ligands, apo-GIT FAH is dynamically stable and does not show a tendency to unfold, as explained by the absence of a PXPP motif. Also differing from FAT, the GIT1/2 FAH is not regulated through tyrosine residue phosphorylation, but possesses a particular acidic cluster that senses phosphorylation of LD4. Paxillin binding is influenced by phosphorylation of GIT1 Ser709 through an unknown allosteric mechanism, because Ser709 lies on the side of the FAH opposite to the LD motif-binding site. Several GIT mutations were shown to increase LD motif binding. Of these, E695T is situated within the LD motif-binding site. E667R lies within the GIT1 FAH, but away from the putative LD motif-binding site, whereas Y563F is N-terminal to the GIT1 FAH. A Y563F/E667R/E695T triple mutant synergistically and substantially increased the interaction with paxillin from cell lysates. Collectively, these data suggest that a tight autoregulatory control mechanism is in place for paxillin binding by GIT1.
CONCLUSIONS AND OPEN QUESTIONS
By mediating protein–protein interactions, LD motifs and their cognate LDBDs link cell motility, cell survival and communication with the extracellular environment. Thus LD motif-mediated interactions play important roles in embryogenesis, wound healing, cancer metastasis and the evolution of multicellularity. Since their first discovery almost 20 years ago in the paxillin protein family, strict LD motifs have been confirmed in only three non-paxillin proteins (DLC1, RoXaN and gelsolin), whereas less strictly related acidic LXXLL motifs have been described in two additional proteins, both binding to the viral E6 proteins (E6AP and E6BP) (Figure 2). Currently at least 13 proteins (11 when not counting close isoforms separately) have been proven to associate directly with LD motifs (Figures 3 and 4, and Table 1). Although these proteins use LDBDs with different types of domain architectures to recognize LD motifs, most LDBDs share some common features such as a helical framework for the binding site, and the distribution of 2–4 basic residues around an elongated hydrophobic groove.
The adaptability of both the LD helix conformation and the LD–LDBD interactions, combined with the pseudopalindromic nature of the LD motif sequence and structure, entails that neither the LD motifs nor LDBDs have sufficiently stringent common features to distinguish them clearly from other helical motifs or other ligand-binding domains. This observation has several important repercussions: (i) the discovery of LD motifs and LDBDs has been hampered by the high number of false positives obtained when using only sequence pattern searches; (ii) in cells, several layers of orthogonal and complementary mechanisms have to be combined to allow sufficient selectivity of LD–LDBD interactions and block untimely and misplaced interactions; (iii) the versatility of possible interactions of an LD or LDBD contributes to multifunctionality, robustness and adaptability of cellular signalling networks; (iv) the significant cross-reactivity provides means for embedding LD motifs or LDBDs within other signalling networks, such as seen for the FAK–CD4 interaction or the dual function of LD motifs as NES; this promiscuity may also provide opportunities to evolve new functions by establishing connections between signalling pathways, for example by connecting nuclear export with membrane-proximal protein–protein interactions; and (v) the cross-reactivity, promiscuity and adaptability of LD–LDBD interactions also create opportunities for pathogens to re-route cellular signalling networks for their own benefit.
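The scale of the false-positive problem in point (i) can be gauged with a rough estimate of how often the bare LXXLLXXL spacing would arise by chance. The sketch below assumes that residues occur independently, that leucine makes up roughly 10% of residues, and that a proteome contains on the order of 10^7 residues. These are round-number assumptions for illustration only, not values from this review, and real sequences are not random.

```python
def expected_random_hits(
    total_residues: int,
    leucine_frequency: float,
    fixed_leucines: int = 4,
) -> float:
    """Expected chance matches to a pattern with a given number of fixed leucines.

    Treats every residue as a possible start position (protein ends ignored)
    and assumes independent residue composition.
    """
    per_position_probability = leucine_frequency ** fixed_leucines
    return total_residues * per_position_probability

# Round-number assumptions, labelled as such.
assumed_proteome_residues = 11_000_000
assumed_leucine_frequency = 0.10

hits = expected_random_hits(assumed_proteome_residues, assumed_leucine_frequency)
print(f"expected chance matches: {hits:.0f}")
```

Even under this crude model, the leucine spacing alone would be expected to match on the order of a thousand sites by chance, far more than the handful of confirmed LD motifs, which is why additional criteria (acidic flanking residues, helical propensity, accessibility) are needed to reduce false positives.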
Our understanding of how LD motifs contribute to an organism's function and disease would be greatly enhanced by the following developments. First, development of more sophisticated computational algorithms that allow high-confidence detection of LD motifs and LDBDs on a genome-wide level to assess their prevalence more comprehensively in different species, including pathogens. Secondly, development of structural biology methods that allow discovery of LD motif–LDBD interactions within biologically important multiprotein complexes to assess how ligand recognition occurs in these signalosomes. Thirdly, development of conceptual and mathematical frameworks for signal transduction that correctly take into account the heterogeneity and noisiness of ligand interactions within the cell. Given the similarities of LD motif-mediated interactions with many other interactions based on sequence recognition by interaction domains, these developments would be generally useful for understanding, predicting and manipulating cellular signalling networks.
Abbreviations: BH3, Bcl-2 homology 3; CC domain, coiled-coil domain; CCM3, cerebral cavernous malformation 3; CRM1, chromosome region maintenance 1; DLC1, deleted in liver cancer 1; E6AP, human PV E6-associated protein; eIF4G, eukaryotic translation initiation factor 4γ; GIT, G-protein-coupled receptor kinase-interacting; Hic-5, hydrogen peroxide-inducible clone 5; LD motif, leucine–aspartic acid motif; LDBD, LD motif-binding domain; NES, nuclear export signal; PABP1, poly(A)-binding protein 1; PIX, p21-interacting exchange factor; PYK2, proline-rich tyrosine kinase 2; RoXaN, rotavirus X-associated non-structural; RRM, RNA recognition motif; SHD, Spa2 homology domain; Vt, vinculin tail domain.
We thank Maria K. Höllerer, Sonja Lorenz, Jean-Antoine Girault, Martin E.M. Noble, Richard Premont and Igor Barsukov for comments and discussions, the anonymous reviewer for their comments and suggestions, and Virginia A. Unkefer for editorial help before submission.
This work was supported by funding from the King Abdullah University of Science and Technology (KAUST). T.A. was supported by a KAUST Baseline fund to Vladimir B. Bajic.