Regulatory spine RS3 residue of protein kinases: a lipophilic bystander or a decisive element in the small-molecule kinase inhibitor binding?

In recent years, protein kinases have been one of the most pursued drug targets. These determined efforts have resulted in ever increasing numbers of small-molecule kinase inhibitors reaching the market, offering novel treatment options for patients with distinct diseases. One essential component related to the activation and normal functionality of a protein kinase is the regulatory spine (R-spine). The R-spine is formed of four conserved residues named RS1–RS4. One of these residues, RS3, located in the C-terminal part of the αC-helix, is usually accessible to inhibitors from the ATP-binding cavity, as its side chain lines the hydrophobic back pocket in many protein kinases. Although the role of RS3 in protein kinase function is well acknowledged, this residue has not been actively considered in inhibitor design, even though many small-molecule kinase inhibitors display interactions with it. In this minireview, we cover the current knowledge of RS3, its relationship with the gatekeeper, and the role of RS3 in kinase inhibitor interactions. Finally, we comment on future perspectives on how this residue could be utilized in kinase inhibitor design.


Introduction
Protein kinases are dynamic proteins that regulate a multitude of cellular signalling processes. They control the activity of their downstream targets mainly by phosphorylation, and their own activity is usually controlled in the same manner. The human kinome comprises more than 500 protein kinases [1], and nearly 500 proteins contain a typical kinase domain [2]. Still, the biological role of many protein kinases is largely unknown, and there are ongoing efforts to characterize these poorly understood protein kinases [3]. Although protein kinases display high similarity in their kinase domain, their overall structures are more diverse: while some kinases consist (almost) solely of the kinase domain (e.g. MAPK14, GSK3B), others are larger multidomain proteins (e.g. LRRK2 [4]). The structure and function of the protein kinase domain are well established (Figure 1A). For a comprehensive view of the structural history of protein kinases, the reader is referred to a recent review by Taylor et al. [5].
In the protein kinase domain, one of the key dynamic elements in regulating protein kinase function is the hydrophobic regulatory spine (R-spine), which was first described in 2006 [6]. The R-spine consists of four residues, named RS1–RS4, which connect the two lobes of the kinase domain (Figure 1A,B). Two of these residues, RS1 and RS2, belong to the C-lobe. RS1 is the His (sometimes Tyr) residue of the HRD (or YRD) motif [7]. RS2 is the Phe (or Leu) of the DFG-motif, which is part of the activation loop of a protein kinase. The other two R-spine residues belong to the N-lobe of the protein kinase. RS4 is a residue of the β4-strand; it is less conserved, but Leu is frequently found in this position. Finally, RS3 is located four residues C-terminal to the αC-helix Glu that forms a salt bridge with the Lys of the β3-strand. RS3 is usually (but not always) accessible from the ATP-binding site, as its side chain lines the active site cleft. Overall, the αC-helix, where RS3 is located, has a central role in the kinase activation process [8]. In the catalytically active form of a protein kinase the R-spine is assembled, and in the inactive state it is disassembled (Figure 1C). In the active state, the location of RS2 as part of the assembled R-spine results in an open and extended conformation of the activation loop (A-loop), while a closed A-loop configuration is preferred in the inactive state. Notably, additional stabilization of the R-spine, such as via in-frame insertions or RS3 mutations [9,10], may result in increased catalytic activity of the protein kinase.
Next to the R-spine in the N-lobe are three conserved residues, named Shell (SH) residues (Figure 1C) [9]. These residues, which are usually hydrophobic, support the R-spine and are therefore important for kinase activity. One of these residues, SH2, is found close to RS3. This SH2 residue is more commonly known as the gatekeeper residue, named for its role in controlling access to the hydrophobic pocket [11]. This shell residue participates in regulating R-spine dynamics, and gatekeeper mutations may stabilize the R-spine, promoting kinase activation [12].
Figure 1. (A) A typical structure of a protein kinase domain. The ATP-binding cleft is located between the N- and C-lobes of the kinase. In the figure, the structure of cAMP-dependent protein kinase catalytic subunit alpha is depicted (PDB ID: 4wb5 [20]; inhibitory peptide is hidden). R-spine residues are illustrated with black surface. ATP, β3-Lys and αC-Glu are shown with stick model. (B) Shell residues SH1-SH3 (grey surface) are located next to the R-spine RS3 and RS4 residues. Shell residue SH2 (gatekeeper) is located close to RS3 (yellow surface). (C) The R-spine of a protein kinase is assembled in the active conformation and disassembled in the inactive conformation. In the figure, active and inactive configurations are illustrated with BRAF (PDB IDs: 4e26 [21] and 1uwh [22]). (D) Several small-molecule kinase inhibitors are already in clinical use, and dozens are in clinical trials.
In recent years, the pharmaceutical industry has made ever increasing efforts to target protein kinases [13]. These efforts have resulted in numerous small-molecule kinase inhibitors, now totalling over 70 FDA-approved small-molecule inhibitors (Figure 1D). According to the Protein Kinase Inhibitors in Clinical Trials database (PKIDB) [14,15], approximately 300 small-molecule kinase inhibitors are either in clinical trials or already approved. Comprehensive reviews of kinase inhibitor drug discovery and development are available [16–18]. Currently, oncology is the dominating indication for kinase inhibitors, but there is also potential in other therapeutic areas such as autoimmune and inflammatory diseases, and degenerative disorders [19].
Figure 2. (A) Occurrence of RS3 residues in human protein kinases with publicly available structural data (289 kinases). The location of RS3 is highlighted in the IGFR1 kinase domain (PDB ID: 3qqu [28]). The shown frequencies are rounded up to the nearest %; for residues with <1% frequency, the percentage is not shown. (B) RS3 residue distribution in the human kinome. Colours of the residue types are as in A. Eight structures with KLIFS annotation errors that were manually curated (RS3 was properly assigned) are indicated with an asterisk. Data in A and B consist of human protein kinases with publicly available structures (lipid kinases excluded). The human kinome tree illustration was made with the help of KinMap [29].
Here, we review the characteristics of RS3 and its relationship with the neighbouring gatekeeper (SH2) based on the publicly available structural data. We also examine RS3 interactions with small-molecule kinase inhibitors, including the approved drugs. Finally, we end the review with the available mutational data on RS3.

RS3 in the human kinome
A majority of the human protein kinases with publicly disclosed structures display a nonpolar aliphatic RS3 residue (Figure 2). Nearly half of these kinases exhibit Leu in RS3, and almost a third have Met (Figure 2A). In the overall human proteome Leu is also the most abundant residue (9.97%), while Met has the second lowest frequency (2.13%) of all amino acids [23]. After the abundant Leu and Met, the next preferred RS3 residues are aromatic: Tyr, His and Phe appear with frequencies of 3-6%. Cys and Gln each occur in 2%, and Ile and Ser each in 1%. Even rarer residues observed in this location are Val, Thr, Asn and Ala. The charged residues Asp, Glu, Lys and Arg, as well as the structurally more unique Trp, Gly and Pro, are not present in RS3 in the analysed set of human protein kinases. Based on the sequence alignment of a larger set of eukaryotic protein kinases, these residues have been suggested to exist as RS3 residues, although rarely [7]. Despite the prevalence of hydrophobic residues in RS3, there is no clear trend between the hydrophobicity ranking of the residues and their observed frequencies [24,25].
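To illustrate how residue frequencies of this kind can be tallied, the following minimal Python sketch counts RS3 residue types over a small, hand-picked subset of kinases whose RS3 residue is stated in this review. The names `RS3_EXAMPLES` and `rs3_frequencies` are hypothetical and introduced only for this illustration; the resulting percentages describe this toy subset and not the 289-kinase dataset of Figure 2.

```python
from collections import Counter

# Toy subset of RS3 assignments quoted in this review.
# Assumption: this is NOT the full 289-kinase dataset behind Figure 2.
RS3_EXAMPLES = {
    "JAK1": "Leu", "KIT": "Leu", "TTBK1": "Leu", "TTBK2": "Leu",
    "ALK": "Ile", "ERBB3": "Ile",
    "PTK7": "Phe", "LMTK3": "Phe", "VRK3": "Phe",
    "AURKA": "Gln", "CHK1": "Asn", "ULK4": "Thr", "COQ8A": "Ala",
}


def rs3_frequencies(assignments):
    """Return {residue: percentage}, ordered from most to least frequent."""
    counts = Counter(assignments.values())
    total = sum(counts.values())
    return {res: 100.0 * n / total for res, n in counts.most_common()}


if __name__ == "__main__":
    for residue, pct in rs3_frequencies(RS3_EXAMPLES).items():
        print(f"{residue}: {pct:.0f}%")
```

Applied to the complete set of RS3 assignments (for example, from the deposited data set), the same tally would be expected to reproduce the distribution summarized in Figure 2A.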
Protein kinases of different groups and families display distinct preferences for RS3 residues (Figure 2B). The majority of the kinases (72%) belonging to the TK group display Met in RS3. More than a fifth (22%) of this group present Leu in this position, including protein kinases belonging to the JakA family (JAK1-3; TYK2) and the Trk family (NTRK1-3, also known as TRKA-C). Four protein kinases have either Ile (ALK; ERBB3 (ErbB3)) or Phe (PTK7 (CCK4); LMTK3 (LMR3)). In the EGFR family Met is preferred in RS3, except for ErbB3, which has Ile. Interestingly, ErbB3 has been found to display considerably lower kinase activity [26,27]. However, ErbB3 also displays other unique characteristics that differ from other EGFRs (for instance, instead of αC-Glu ErbB3 has a His in this location). In the PDGFR family, KIT displays Leu instead of the Met that is observed in other family members (FLT, PDGFRA, CSF1R (FMS)).
In the CK1 group, Tyr dominates in RS3 (73%). VRK3 displays an aromatic Phe in this position. The TTBK family kinases (TTBK1, TTBK2) stand out in this group with their aliphatic Leu in RS3.
The majority of the AGC and CAMK group kinases exhibit Leu in RS3, followed by Met at lower frequencies. Two kinases with an aromatic RS3 (Phe) are observed in the AGC group: PKN2 and PRKCI (PKCi). In the CAMK group, kinases of the CAMK2 family (CAMK2A, CAMK2B, CAMK2D, CAMK2G), as well as CASK, display Cys in RS3. His is observed in the MAPKAP family (MAPKAP2, MAPKAP3). TRIB1 (Trb1) displays Ile in RS3; however, in this pseudokinase the neighbouring Tyr may actually occupy the canonical RS3 position [34,35].
The protein kinases that do not belong to any specific group also display family-specific preferences. For instance, protein kinases of the WEE family and the PLK family exhibit His in RS3 (located above and below the CK1 group in the kinome tree). Of the Atypical kinases, COQ8A, also known as ADCK3 [36], is the only structure in the dataset that displays Ala in RS3.

Polar RS3 are rare
Although hydrophobic residues dominate, polar residues are also observed in RS3. AURKA is an example of a widely studied kinase that has a polar RS3 (Gln). Levinson et al. showed that this polar residue has a specific role in AURKA activation via a water network [37]. Similarly, AURKB and AURKC also have Gln as their RS3. In addition, Gln is observed in MAP3K8 (COT, TPL2) and ATM. MAP3K8 controls inflammation [38] and ATR DNA damage responses [39].
In the available data, Ser is observed in three kinases. While in STRADA (STLK5, STE20) this residue appears unreachable from the binding cleft (PDB IDs: 3gni [40], 2wtk [41]), in Haspin Ser is accessible and participates in water coordination next to αC-Glu (PDB ID: 4ouc [42]). RS3 Ser may also be accessible in MAP2K6 when it is not in its autoinhibited state (PDB ID: 3fme). In the autoinhibited state its neighbouring Met appears to take the regular RS3 position (PDB ID: 3vn9 [32]).
Two unique polar RS3 residues are present in the data. Asn is observed in the RS3 of CHK1, while all the other kinases belonging to the same CAMKL family have either a lipophilic Leu or Met in this position. CHK1 inhibition could be useful in the treatment of KRAS-driven pancreatic ductal adenocarcinoma [43]. Thr is observed in ULK4, while the other members of the ULK family (ULK1-ULK3) have Leu in the respective position. ULK4 is a pseudokinase and has an unusual structural characteristic in its αC-helix: it exhibits a Trp residue in the location of αC-Glu. This Trp appears to participate in its R-spine formation [44].

RS3 relationship with gatekeeper
Access to the hydrophobic pocket (and to RS3) is controlled by the gatekeeper, also known as the SH2 residue. This residue may also influence R-spine dynamics, and it is found in close contact with RS3 (Figure 3A). Generally, protein kinases prefer Met, Leu, Phe and Thr gatekeepers (Figure 3B,C). In the available structures, Met is the most abundant gatekeeper (40%), followed by Leu (18%), Phe (16%) and Thr (15%). Less frequent gatekeepers, each nevertheless present in more than eight kinase domains, are Ile (4%), Tyr (3%) and Val (3%).

RS3 and small-molecule kinase inhibitors
We searched the KLIFS database [52] and complemented our search using the Protein-Ligand Database (PLDB) tool of Maestro (Schrödinger LLC) to map all the existing protein kinase-ligand complexes that have contacts between RS3 and the ligand (Figure 4). In KLIFS, RS3 is residue #28 (αC-Glu is #24) [53]. Over 100 protein kinases have structures where RS3-ligand interactions are observed (Figure 4B). In total, more than 1000 structures with RS3-ligand interactions are available.
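The KLIFS numbering quoted above can be turned into a simple lookup. The sketch below assumes that the binding pocket is supplied as a one-letter string aligned to the 85 pocket positions used by KLIFS, with "-" marking a gap; the function names and the placeholder pocket (filled with "X" apart from the two positions of interest) are hypothetical and do not correspond to any real kinase.

```python
KLIFS_POCKET_LENGTH = 85  # KLIFS aligns every kinase pocket to 85 positions
ALPHA_C_GLU_POS = 24      # 1-based KLIFS number of the αC-Glu (from the text)
RS3_POS = 28              # 1-based KLIFS number of RS3 (from the text)


def residue_at(pocket, klifs_pos):
    """Return the one-letter residue at a 1-based KLIFS pocket position."""
    if len(pocket) != KLIFS_POCKET_LENGTH:
        raise ValueError("expected an 85-residue KLIFS-aligned pocket string")
    return pocket[klifs_pos - 1]


def rs3_from_pocket(pocket):
    """Return the RS3 residue, or None if that position is an alignment gap."""
    residue = residue_at(pocket, RS3_POS)
    return None if residue == "-" else residue


def make_synthetic_pocket(glu="E", rs3="M"):
    """Build a placeholder pocket (not a real kinase) for demonstration."""
    pocket = ["X"] * KLIFS_POCKET_LENGTH
    pocket[ALPHA_C_GLU_POS - 1] = glu
    pocket[RS3_POS - 1] = rs3
    return "".join(pocket)


if __name__ == "__main__":
    demo = make_synthetic_pocket()
    print("αC-Glu position holds:", residue_at(demo, ALPHA_C_GLU_POS))
    print("RS3 position holds:", rs3_from_pocket(demo))
```

The offset between the two constants (28 - 24 = 4) mirrors the structural definition of RS3 as the residue four positions C-terminal to the αC-Glu. As noted in the legend of Figure 2, a few KLIFS annotations required manual curation, so a lookup of this kind should be checked against the structure where the αC-helix is unusual.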
Interactions with RS3 appear independent of the kinase conformation (Figure 4C,D) [55]. Based on KinaMetrix [56], 'αC-helix in' conformations dominate in the structures. CIDI is the most populated with 459 structures, and CIDO appears in 323 structures. CODI and CODO structures with the αC-helix out configuration exist in 148 and 42 structures, respectively. The ambiguous ωCD occurs in 47 structures. Structures with Leu and Met RS3-ligand interactions display all configurations, although fewer structures exist for the CODI (16%), CODO (4%) and ωCD (4%) conformations. For RS3 residues with less structural information, the conformational representation does not cover all configurations. Nevertheless, all conformations are present in the structures with RS3-ligand interactions. As distinct protein kinase conformation classifications exist, we also analysed the conformational distribution of RS3 contact structures with Kincore [58]. Table 1 shows the conformational distribution of these structures as assigned by Kincore. Overall, the majority of the available compounds with RS3 interactions exist in the DFGin and DFGout spatial classes, covering different conformational classes (dihedrals).
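The KinaMetrix counts quoted above can be converted into shares with a few lines of arithmetic. The following Python sketch uses only the five structure counts given in the text; the function names are hypothetical, and reading the prefix "CI" as 'αC-helix in' follows the description of CODI and CODO as αC-helix out classes.

```python
# Structure counts per KinaMetrix conformation class, as quoted in the text.
CONFORMATION_COUNTS = {
    "CIDI": 459,
    "CIDO": 323,
    "CODI": 148,
    "CODO": 42,
    "ωCD": 47,
}


def conformation_shares(counts):
    """Return the percentage of structures in each conformation class."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def helix_in_share(counts):
    """Percentage of structures in an 'αC-helix in' class (labels CI..)."""
    total = sum(counts.values())
    helix_in = sum(n for label, n in counts.items() if label.startswith("CI"))
    return 100.0 * helix_in / total


if __name__ == "__main__":
    print("total structures:", sum(CONFORMATION_COUNTS.values()))
    for label, pct in conformation_shares(CONFORMATION_COUNTS).items():
        print(f"{label}: {pct:.1f}%")
    print(f"αC-helix in: {helix_in_share(CONFORMATION_COUNTS):.1f}%")
```

The five counts sum to 1019 structures, consistent with the more than 1000 RS3-ligand interaction structures reported, and roughly 77% of them fall into the two 'αC-helix in' classes.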

Approved small-molecule kinase inhibitors and RS3
A publicly available structure exists for 49 out of the 71 FDA-approved small-molecule kinase inhibitors. Of these, interaction with RS3 is displayed by 26 inhibitors (55%). These structures include targets with Met (Figure 5) and Leu RS3 residues (Figure 6). Inhibitors that exhibit RS3 interactions represent all types of kinase inhibitors that bind to the ATP-binding cleft. Of note, inhibitors of different types may engage the RS3 site in different ways. For instance, type II inhibitors, which bind the kinase in its inactive conformation, reach beyond RS3 on the αC-helix side and can thereby interact with the 'side' or the 'head' of its side chain, or both. Conversely, type I inhibitors, which bind to the active conformation of the kinase, interact mainly with the head of the RS3 side chain.
RS3 interaction is kinase dependent. Approved drugs that interact with RS3 in one kinase do not necessarily contact RS3 in other kinases that they bind to. For instance, gefitinib has been co-crystallized with GAK (Met), where it displays no contact with RS3 (PDB ID: 5y7z [69]). Ibrutinib has also been co-crystallized with MAP2K7, which has Val in RS3, but it does not display any contacts with this residue in this complex (PDB ID: 6yg2 [70]).

Mutations in RS3 exist rarely
According to the Catalogue of Somatic Mutations In Cancer (COSMIC; v.95) database [71], no clear tendency for mutations in RS3 exists. In total, 82 kinases display at least one mutation (missense or silent) (Table 2). Only for ALK do several mutations at this location appear in the data. These mutations include I1171N, I1171T and I1171S. For BRAF, the L505H mutation is found in eight samples. Perhaps the low number of observed RS3 mutations is not surprising, given the crucial role of this residue in kinase function. In comparison, RS2 mutations are also rare, with BRAF F595L (13 samples in COSMIC v.95) being the most frequent in the analysed kinases. Meanwhile, residues flanking RS2 are common oncogenic drivers; for instance, BRAF V600E is found in 52 733 samples and EGFR L858R is present in 10 642 samples. Mutations in the αC-helix may activate the kinase by destabilizing its inactive conformation [72], but they are mainly found at αC-helix locations other than RS3 [73].

Conclusions
The function of the R-spine and the role of RS3 are well conserved among typical protein kinases. A couple of residues dominate RS3 in the available protein kinase domain structures. Nevertheless, unique RS3 residues are also observed, and in combination with the gatekeeper even more kinase-specific profiles emerge. Even with identical RS3-gatekeeper combinations, the 3D environment within this region can be quite different between two kinases. Kinase-specific angles and absolute positions of these residues may provide important opportunities for selective targeting. This must be considered carefully case by case, as RS3 is not targetable in all kinases. With pseudokinases [79–81], which can vary more in their structure in this region than regular protein kinases, the role of RS3 and its targeting would require further research. The general understanding of RS3-ligand interactions is quite limited, even though numerous structures that contain these mainly hydrophobic interactions are available. Currently, no studies exist that investigate the specific effect of RS3 on ligand binding affinity by directly comparing a set of ligands against selected mutations of this residue. Further research is needed to disclose the influence of the RS3 residue on ligand binding, and it should also be extended to cases where no direct contact between the residue and the inhibitor exists. Of note, hydrophobic interactions (in the case of a hydrophobic RS3) should not be overlooked, as these interactions may be crucial for inhibitor binding [82]. For example, non-canonical interactions play a decisive role in the binding affinity of the ultra-potent small molecule biotin [83]. There may be good possibilities to optimize RS3-specific interactions, for instance, with enhanced interactions with the sulfur atom of Met [84].
The infrequency of mutations in RS3 may indicate that this position is unlikely to acquire point mutations that could cause drug resistance [85]. Perhaps the somewhat buried location of RS3 in the quite rigid αC-helix, which offers limited flexibility, renders mutations in this location (at least in most cases) unable to drive kinase activation. This further motivates the optimization of protein-ligand interactions with RS3. However, the data at hand may not necessarily cover potential drug therapy-induced mutations in cancer patients. We believe that in the near future, as such data accumulate, this information will become more accessible and a better estimate can be provided.
The full data presented in this review are freely available at https://doi.org/10.5281/zenodo.5796550

Perspectives
• The role of the conserved R-spine and RS3 residue in protein kinase function is well established.
• Protein kinases display diversity in their RS3 residue and in its surroundings. Many small-molecule protein kinase inhibitors, including approved drugs, display contacts with RS3.
• Considering the RS3 residue more carefully in the design of small-molecule kinase inhibitors may offer important advantage for the inhibitor binding and selectivity.

Competing Interests
The authors declare that there are no competing interests associated with the manuscript.

Table 2 notes. Hydrophobic interaction between ligand and RS3: + at least one structure with RS3-ligand contact available; - no contacts observed in available structures; 2 Silent mutation.