Mutations in the leucine-rich repeat kinase 2 (LRRK2) gene are the most common cause of autosomal dominant Mendelian forms of Parkinson's disease. Its gene product, LRRK2, is a large multidomain protein of the Roco family that exhibits both GTPase and kinase activity, the latter being increased by pathogenic mutations. To allow rational drug design against LRRK2 and to understand the cross-regulation of the G-domain and the kinase domain at a molecular level, it is key to solve the three-dimensional structure of the protein. Here, we review our recent work to build the first structural model of dimeric LRRK2 by an integrative modeling approach.
Parkinson's disease (PD) is a slowly progressive neurodegenerative disorder that affects neuromelanin-containing neurons, mainly dopaminergic neurons in the substantia nigra pars compacta. The loss of dopamine content in this region leads to a combination of motor-related symptoms, such as bradykinesia, rigidity, resting tremor, flexed posture, freezing and loss of postural reflexes.
Ten years ago, mutations in the leucine-rich repeat kinase 2 (LRRK2) gene were identified as a major cause of Mendelian PD with autosomal dominant inheritance and as a risk factor for developing idiopathic forms of the disease [2–4]. Its gene product, LRRK2, is a large, multidomain Roco protein containing two catalytic domains: a Ras of complex proteins (Roc) G-domain, followed by the dimerization domain COR (C-terminal of Roc), and a kinase domain (Figure 1A) [5,6]. Mutations with confirmed disease co-segregation reside mostly in the catalytic core, i.e. the RocCOR and kinase domains, and are linked to increased kinase and decreased GTPase activity [7–11]. In addition, several PD-associated variants, including risk variants, have been described for the terminal, putatively regulatory domains, including the N-terminal solenoid repeat domains as well as the C-terminal WD40 repeat.
Figure 1. Structural model of LRRK2.
Here, we review our recent work to obtain the first structural model of full-length LRRK2 and discuss its implications for the complex LRRK2 activation mechanism.
Advances in the structural analysis of LRRK2 towards a structural model of the full-length protein
Commonly, modern treatment development projects start with the identification of a target protein and validation of its involvement in the disease progression. For a successful rational drug design, however, the three-dimensional (3D) structure of the disease-related protein in question needs to be determined. Despite the importance of LRRK2 in the pathogenesis of PD, its structural determination by common experimental techniques (i.e. X-ray crystallography, NMR and cryo-EM) has remained elusive so far, also hampering the understanding of how PD-linked mutations alter its function.
So far, the majority of high-resolution structures come from orthologous bacterial Roco proteins. These structures give valuable insight into the central RocCOR interface and the dimerization mechanism, which are very likely conserved among the different Roco proteins. They also show that Roco proteins have functionally and structurally unique features compared with Ras-like small G-proteins. In addition, the kinase domain of Roco4, an LRRK2 ortholog from Dictyostelium discoideum, has served as an excellent structural model. The same work showed that a chimeric Roco4 protein containing the LRRK2 kinase domain can partially rescue null phenotypes in D. discoideum, demonstrating a high degree of functional conservation in the Roco protein family across species. Furthermore, the Roco4 G1179S kinase structure revealed a mechanism for the activating G2019S mutation in LRRK2, and humanized versions of this structure support the design and assessment of LRRK2-specific inhibitors [14,15].
To date, only one high-resolution structure is available for LRRK2 itself, which illustrates the challenges in the structural analysis of this large protein. It was solved from a construct covering the Roc G-domain. However, the rather unusual dimerization mode of the isolated Roc domain, in the absence of the surrounding domains, fueled debate about the extent to which this structure reflects the physiological situation of the large multidomain protein LRRK2. Before our study, and despite biochemical data showing physical interactions between different LRRK2 domains, the domain–domain interactions in the intact LRRK2 protein (Figure 1A) and their regulatory function remained largely elusive at a structural level.
To completely understand the activation mechanism of LRRK2, the domain–domain interactions and the underlying defect in LRRK2-associated PD, it is, however, essential to solve the full-length structure. Given the difficulties of obtaining crystal structures for LRRK2, we chose an integrative structural biology approach. We combined computational modeling, negative-stain EM, SAXS, X-ray crystallography and chemical cross-linking together with mass spectrometry (CL-MS) to obtain the first 3D model of full-length LRRK2, albeit at low resolution (22 Å). Recently, CL-MS has emerged as a powerful tool to provide spatial constraints for computational modeling of complex protein structures, mainly due to improvements in the sensitivity of modern mass spectrometers and the availability of suitable software tools [19,20]. The low-resolution negative-stain EM map of full-length, dimeric LRRK2 served as a guide for fitting individual domains, which were modeled on X-ray structures of homologous proteins.
Both the degrees of freedom and the computational complexity have to be reduced to be able to model such a flexible multidomain system. For this reason, the distance information provided by chemical cross-links was used to filter for the most realistic domain–domain docking solutions, which were subsequently fitted to the EM envelope to obtain a model of the full-length LRRK2 protein.
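The cross-link filtering step described above can be illustrated with a minimal sketch. It is not the pipeline used in the original study: the class and function names, the residue labels, the use of Cα–Cα distances and the 30 Å default cutoff are all assumptions made for illustration, and the appropriate cutoff depends on the cross-linker chemistry. The sketch keeps only those docking poses in which a required fraction of the cross-linked residue pairs lie within the cutoff distance.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


@dataclass(frozen=True)
class CrossLink:
    """One experimentally observed cross-link (hypothetical residue labels)."""
    residue_a: str
    residue_b: str


@dataclass
class DockingPose:
    """One candidate domain-domain arrangement from a docking run."""
    name: str
    ca_coords: Dict[str, Coord]  # residue label -> C-alpha position in angstroms


def satisfied_fraction(pose: DockingPose, links: List[CrossLink],
                       max_distance: float) -> float:
    """Fraction of cross-links whose C-alpha distance is within the cutoff."""
    if not links:
        return 0.0
    satisfied = sum(
        1 for link in links
        if dist(pose.ca_coords[link.residue_a],
                pose.ca_coords[link.residue_b]) <= max_distance
    )
    return satisfied / len(links)


def filter_poses(poses: List[DockingPose], links: List[CrossLink],
                 max_distance: float = 30.0,   # assumed cutoff, not from the source
                 min_fraction: float = 1.0) -> List[DockingPose]:
    """Keep poses satisfying at least min_fraction of the links, best first."""
    scored = [(satisfied_fraction(p, links, max_distance), p) for p in poses]
    kept = [(frac, p) for frac, p in scored if frac >= min_fraction]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [p for _, p in kept]
```

In such a scheme, the poses that survive the filter would then be passed on to the next stage, here the fit into the EM envelope. Lowering `min_fraction` tolerates a few violated cross-links, which may be reasonable for a flexible protein that samples more than one conformation.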
In addition, SAXS data from an orthologous bacterial Roco protein (Chlorobium tepidum Roco) served as an important additional experimental restraint to model the orientation of the leucine-rich repeats with respect to the dimeric RocCOR domain with high confidence.
Although the C. tepidum Roco protein lacks a kinase domain, the substantial agreement between the models generated from the SAXS and EM volume densities indicated that the quaternary structures of the dimeric RocCOR–LRR domains (see scheme in Figure 1B) of orthologous proteins are conserved.
The resulting model shows that LRRK2 dimerizes in a head-to-head orientation with the COR domain as the central part (Figure 1B). A closer look at the COR domain reveals that it consists of two subdomains: an N-terminal part, which contains the PD mutation Y1699C and is most probably involved in GTP regulation, and a C-terminal part that facilitates dimerization (see scheme in Figure 1B). Dimeric LRRK2 appears quite globular (Figure 1C): the C-terminus, i.e. the kinase–WD40 module, folds back onto the N-terminal domains, so that both parts could interact with each other. The model is in good agreement with biochemical data showing the interaction of different LRRK2 domains and with identified autophosphorylation sites [18,21–23]. Interestingly, the kinase domain in the final model lies in close proximity to the autophosphorylation site Serine 1292 (Figure 1D). This site, originally mapped by mass spectrometry, has emerged as an important biomarker for LRRK2 activity, and pathogenic LRRK2 variants show increased phosphorylation levels at this site. It is noteworthy that increased phosphorylation at S1292 has recently been shown in patient-derived biomaterials, such as exosomes isolated from the urine of idiopathic PD patients. This supports the outcome of a recent meta-analysis of genome-wide association studies, which suggests a role of LRRK2 as a risk factor in idiopathic forms of PD.
A question that has been raised in the field is whether LRRK2 autophosphorylation occurs in cis or in trans, in other words within a monomer or between the two monomers of the LRRK2 dimer. Previous work, based on biochemical methods, suggests that LRRK2 autophosphorylation predominantly occurs in cis. However, there is no straightforward analytical strategy to discriminate between cis and trans cross-links. The experimental data could, therefore, not provide an exact assignment of the kinase domain positions in the dimeric model to a specific monomer. Given that the LRRK2 COR–kinase linker regions are in close spatial proximity, the positions of the C-terminal domains in the final models are also compatible with an intertwined homodimer, as found for other dimeric proteins. As a result, the final model is in agreement with both cis and trans mechanisms for LRRK2 autophosphorylation. Considering biochemical data demonstrating a cross-regulation between the LRRK2 GTPase and kinase activities, our structural model suggests that regulation of the LRRK2 kinase domain by the Roc domain is mediated by the N-terminal part of the protein. The model is in agreement with a scenario in which opening and closing of the Roc domains leads to a rearrangement of the N-terminus and thereby increases kinase activity, either by releasing an autoinhibition or by enhancing the binding of substrates.
Structural impact of LRRK2 mutations
Another key question in the field is how PD-associated mutations act at a structural level. In past years, knowledge about the impact of LRRK2 mutations at a molecular level came mainly from biochemical analysis combined with biophysical assessment and modeling of single domains, e.g. the leucine-rich repeats or the Roc and kinase domains [27–30], or from structural and biophysical assessment of bacterial orthologs [11,14,17]. These studies have already contributed significantly to a better understanding of the molecular pathomechanisms of mutations at a local level and within the highly conserved enzymatic core of the protein. In contrast, structural models of the full-length protein can provide a more comprehensive view of how various mutations, especially the less well-studied mutations within the scaffolding domains, might affect the domain–domain interactions of LRRK2. Although these models provide good starting points for forming hypotheses, high-resolution structures at the atomic level are clearly necessary to fully understand the molecular pathomechanisms underlying PD-associated LRRK2 mutations. Additionally, genome sequencing of healthy individuals is revealing new single-nucleotide polymorphisms of the LRRK2 gene at various allele frequencies within the world population, which also require functional characterization to establish whether they are causally related to the disease [31,32].
Activity-state- and interaction-dependent, functionally distinct conformations
An additional aspect that has to be taken into account is that LRRK2 exists in multiple conformations, which likely reflect different activity states. On the one hand, it has been shown that LRRK2 is predominantly monomeric in the cytosol while forming dimers at the membrane [33,34]. The membrane-bound form shows higher kinase activity, which is in good agreement with the recent identification of a subset of Rab proteins, i.e. Rab3a, Rab8a, Rab10 and Rab12, key players in vesicular trafficking, as physiological substrates of LRRK2. On the other hand, via its different scaffolding domains, such as its C-terminal WD40 domain, LRRK2 is part of larger protein complexes and interacts with various proteins, such as different chaperones and cytoskeletal and vesicular proteins, including Rabs [36–42]. A binding epitope for the latter, i.e. for Rab7L1 (Rab29), Rab32 and Rab38, has been identified in the N-terminal Armadillo domain [39,40]. In addition, LRRK2 has been shown to interact with microtubules via its WD40 domain in a well-ordered periodic fashion. Notably, LRRK2 has been found to be a target of dynamic phosphorylation and dephosphorylation by upstream kinases and phosphatases, which affects its cellular localization and also regulates the interaction with its binding partners, as demonstrated for 14-3-3 or ARHGEF7 [44–47]. Probably the best-characterized protein–protein interaction of LRRK2 is its interaction with 14-3-3. It has been shown that this interaction critically depends on phosphorylation of LRRK2 at multiple sites, including the N-terminal residues Serine 910 and Serine 935, which also regulates its cellular localization. Furthermore, the LRRK2 mutant S910A/S935A, which is unable to bind 14-3-3 at its N-terminus, also fails to phosphorylate Rab proteins within intact cells, suggesting that 14-3-3 binding is necessary for proper recruitment of LRRK2 to endosomal membranes.
This is also in good agreement with previous results demonstrating that LRRK2 secretion in exosomes is regulated by 14-3-3. Unfortunately, due to the lack of homology to existing protein structures, no suitable templates exist to model the interdomain space containing the residues S910/S935. For this reason, the current full-length LRRK2 model does not cover this functionally relevant part. However, as this region is located between the C-terminus of the Ankyrin domain and the N-terminus of the LRR domain, the model implies that the interdomain space lies in close proximity to the N-terminal COR and the Roc G-domains within the LRRK2 dimer (for visualization, see Figure 1E). This suggests that N-terminal phosphorylation and 14-3-3 binding might also regulate the LRRK2 G-domain/GTPase activity. Interestingly, an additional 14-3-3-binding site has been identified in the Roc domain, containing the R1441 residue. The PD-associated R1441C/G/H mutation prevents PKA-dependent phosphorylation, thus disrupting 14-3-3 binding at this site. PKA-dependent phosphorylation and subsequent 14-3-3 binding at this site have been shown to negatively regulate LRRK2 kinase activity in vitro.
Therefore, the analysis of specific structural conformations of LRRK2 as a result of dynamic phosphorylation, localization and complex formation, including the interaction of well-established binding partners, such as Rabs or 14-3-3, represents an interesting target for future structural investigation.
Conclusions and outlook
Taken together, further work is clearly needed to identify the different conformations of LRRK2 and to correlate them with its activity states, defined protein complexes and cellular localization. Given that LRRK2 is a G-protein, the structural analysis of G-nucleotide-specific conformational states in particular will be an attractive aim of future work.
Nevertheless, given the challenge of crystallizing a huge, partially flexible protein, our structural model of dimeric LRRK2 already provides a robust basis for new testable hypotheses on intramolecular regulation mechanisms. In our view, this is of great importance for the rational design of specific compounds that precisely modulate the pathogenic activities of disease-associated LRRK2 variants.
This work was supported by The Michael J. Fox Foundation for Parkinson's Research and iMed — the Helmholtz Initiative on Personalized Medicine.
The Authors declare that there are no competing interests associated with the manuscript.