The first protein structures revealed a complex web of weak interactions stabilising the three-dimensional shape of the molecule. Small molecule ligands were then found to exploit these same weak binding events to modulate protein function or act as substrates in enzymatic reactions. As the understanding of ligand–protein binding grew, it became possible first to predict how and where a particular small molecule might interact with a protein, and then to identify putative ligands for a specific protein site. Computer-aided drug discovery, based on the structure of target proteins, is now a well-established technique that has produced several marketed drugs. We present here an overview of the various methodologies being used for structure-based computer-aided drug discovery and comment on possible future developments in the field.

Introduction

Computational methods of drug discovery have a basis in the very earliest structural studies of protein molecules. The first structure of a protein, that of sperm whale myoglobin initially at 6 Å resolution [1] then refined to 2 Å resolution [2], revealed a complex network of interactions between the amino acids far more difficult to understand than the elegant simplicity seen in the base-pairing of DNA [3]. Understanding how proteins interact with ligands had to wait for several more years; the analysis of the structure of lysozyme [4] and comparison with inhibitor–lysozyme complexes [5] in 1965 immediately revealed that there is structural specificity in ligand–protein binding: known lysozyme inhibitors all bound to the same site on the protein, whereas closely related molecules that could not inhibit the enzyme did not bind to a particular site. Details at the atomic level of ligand–protein complexes soon followed, demonstrating that the theoretical models for how enzymes bound substrates and allosteric modulators were an accurate picture of complex formation [6].

These early structural insights made it clear that an understanding of ligand–protein interactions could also provide the basis for the rational design of new molecules with an increased affinity for their binding site. Knowledge of the atomic-level contacts between protein residues and ligand atoms should allow selective modification of the chemical structure of the ligand, perhaps improving the complementarity of the shape of the ligand to the binding site of the protein, introducing additional favourable interactions or removing interactions that would inhibit binding. One of the first attempts to do this was the design of novel compounds intended to mimic the action of 2,3-diphosphoglycerate (DPG, Figure 1) on haemoglobin [7], based on the crystal structure of the DPG–deoxyhaemoglobin complex [8]. Carried out using a hand-made physical model of the protein, this involved measuring distances between amino acid side-chains and identifying possible electrostatic interactions in the DPG pocket. A set of three compounds, with chemical structures unrelated to that of DPG, was designed, synthesised and shown to induce a shift in the oxygen dissociation curve of haemoglobin in a manner closely following that of the original ligand [7]. The success of this first attempt at using a protein structure to design novel ligands indicated the great promise of this approach for drug design.

Figure 1.
Chemical structures of the compounds listed in this review.

(A) DPG, (B) Relenza™ (Zanamivir), (C) Sabril™ (Vigabatrin, γ-vinyl GABA), (D) DADMe-Immucillin-H (BCX4208, Ulodesine), (E) Isentress™ (Raltegravir), (F) Incivek™ (Telaprevir), (G) Tekturna™ (Rasilez™, Aliskiren), (H) Glivec™ (Gleevec™, Imatinib, STI-571) and (I) Asciminib (ABL001). Alternative names are given in brackets. Chemical structures were drawn using MarvinSketch 6.0.2 (ChemAxon, https://chemaxon.com/).

With improvements in computer graphics, such as the release of the Evans and Sutherland Picture Systems in the 1970s, the translation of these early manual approaches to drug design into practical computational processes for the rational development of new drugs was a logical step [9]. Among the first programs for protein structure-based drug design was GRID [10], created by one of the authors (Dr Peter Goodford) of the DPG analogue study [7]. GRID calculates the energy of interaction for a series of probes (water, methyl, amine nitrogen, carboxyl oxygen and hydroxyl, representing a subset of likely atom types in a ligand) with the atoms of a protein structure, providing a three-dimensional potential map for each probe in the volume of space around the protein. Graphical representations of the potential maps for each probe are then displayed overlaid on the structure of the protein, ‘…so that energy and shape can be considered simultaneously when designing drugs.’ [10]. Commercialisation of molecular modelling and drug design occurred at almost the same time as the publication of GRID, with companies such as Biosym Incorporated, Tripos Incorporated and Molecular Discovery Ltd. coming into existence in the mid-1980s.
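
The idea of a probe potential map can be illustrated with a short Python sketch. This is not the published GRID energy function, which is more elaborate; it is a minimal stand-in that sums only a Lennard-Jones and a Coulomb term between one probe and each protein atom on a regular grid. The two-atom "protein", the probe parameters, the dielectric constant and the function names are all invented for illustration.

```python
import math

# Toy "protein": (x, y, z, partial charge, LJ epsilon, LJ sigma).
# All values are illustrative placeholders, not from any real force field.
ATOMS = [
    (0.0, 0.0, 0.0, -0.5, 0.15, 3.2),
    (3.0, 0.0, 0.0, 0.3, 0.10, 3.5),
]

def probe_energy(point, atoms, probe_charge, probe_eps, probe_sigma, dielectric=4.0):
    """Sum pairwise Lennard-Jones and Coulomb terms between one probe and all atoms."""
    total = 0.0
    for x, y, z, q, eps, sigma in atoms:
        r = max(math.dist(point, (x, y, z)), 0.5)  # clamp to avoid r = 0 on an atom
        eps_ij = math.sqrt(eps * probe_eps)        # geometric-mean combining rule
        sigma_ij = 0.5 * (sigma + probe_sigma)     # arithmetic-mean combining rule
        lj = 4.0 * eps_ij * ((sigma_ij / r) ** 12 - (sigma_ij / r) ** 6)
        coulomb = 332.06 * q * probe_charge / (dielectric * r)  # kcal/mol, e, Å
        total += lj + coulomb
    return total

def probe_map(atoms, probe, lo=-6.0, hi=6.0, step=1.0):
    """Evaluate the probe energy on a regular 3D grid; returns {(x, y, z): energy}."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return {(x, y, z): probe_energy((x, y, z), atoms, *probe)
            for x in axis for y in axis for z in axis}

# Hypothetical positively charged probe: (charge, epsilon, sigma).
grid = probe_map(ATOMS, probe=(0.5, 0.12, 3.3))
best_point = min(grid, key=grid.get)  # most favourable grid point for this probe
```

In this toy case the most favourable grid point lies on the side of the negatively charged atom, which is the kind of information a contoured probe map conveys graphically.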

The field of computational drug design has expanded enormously since these early exploratory efforts. Several marketed drugs now exist that were invented using structure-based drug design techniques, the first being the neuraminidase inhibitor Relenza™ (Figure 1), a treatment for influenza modelled on the structure of the sialic acid–neuraminidase complex [11,12]. Protein structures are now used routinely at several points in the drug development process (Figure 2), from assessing the ‘druggability’ of a target through initial hit identification and design, to checking for potential off-target effects. It should be noted, however, that information arising from structure-based computer-aided drug discovery is essentially a prediction and remains as such until confirmed using appropriate experimental techniques (e.g. biological screening of compounds identified in an in silico screen or chemical synthesis and assay of de novo designed molecules). Before we outline how protein structures are employed for drug discovery and some of the computational approaches used at each point in the process, we will touch very briefly on the experimental methods used to obtain atomic level protein structures and how to access them.

Figure 2.
Typical computational drug discovery workflow.

Schematic illustrating where atomic level protein structures fit into the computational drug discovery workflow. High-throughput chemical screening assays (HTS) encompass both affinity and in vitro activity assays and include techniques such as surface plasmon resonance, microscale thermophoresis, isothermal calorimetry, NMR, fluorescence polarisation, Förster resonance energy transfer (FRET), enzyme-linked immunosorbent assay (ELISA) and radioligand binding. Fragment screening utilises many of the HTS experimental techniques but specifically applied to small molecules, typically 200–300 Da.

The worldwide repository for protein structures is the Protein Data Bank (PDB, www.rcsb.org, www.wwpdb.org, [13–15]) and at the time of preparing this mini review (8 June 2018), there were 141 010 data entries, covering 44 278 unique protein sequences. Originally housed at the Brookhaven National Laboratory (Long Island, New York, U.S.A.), the PDB was created in 1971, and free access to the data it contains has increasingly driven the drug discovery process. The conceptualisation of a common file format to store protein atomic coordinate data, combined with the development of the Brookhaven Raster Display (BRAD) molecular graphics system to visualise proteins in three dimensions and the SEARCH program to remotely access the database, ultimately resulted in the birth of the PDB as we know it today. The overwhelming majority of the protein structures are determined by X-ray crystallography; the database currently contains 124 299 protein structures determined using this experimental technique. A relatively minor proportion of the protein structures (∼8%) arise from nuclear magnetic resonance (NMR) techniques, with 400–600 structures being added annually since the peak in 2007 when 965 structures were deposited. Since the first electron microscopy (EM) protein structure was deposited in the database in 1977, the number of structures determined by EM, and more recently cryo-EM (cryogenic electron microscopy), has increased exponentially each year, with 555 structures deposited in 2017 bringing the total to 2123 (∼1.5% of entries in the PDB). While cryo-EM has been successfully employed to study soluble proteins down to the size of haemoglobin (64 kDa) [16], it is in its increasingly successful application to challenging targets such as membrane receptors [17–19] that the contribution of this technique to the field of protein structure-based computer-aided drug design is set to sky-rocket. It is, of course, not always possible to start a computational drug discovery project with an experimentally derived protein structure, but the wealth of data in the PDB means that useful homology models of many proteins of interest can be prepared from a suitable template protein structure [20–24].

It must always be remembered that the protein structures in the PDB are three-dimensional model representations of the experimental data. The quality of the model depends upon the resolution, accuracy, completeness and interpretation of the experimental dataset. For example, in a protein structure obtained by X-ray crystallography, there may be disordered side-chains and the interpretation of the electron density in these regions is largely dependent upon the crystallographer building the model. Submission of the experimental data to the PDB was initially voluntary; however, by 2008 it was compulsory to deposit crystallographic structure factors and NMR restraints along with the atomic coordinates [25]. Thus, it is prudent to always read the PDB file header information and download the experimental data to evaluate the quality of the protein model prior to commencing drug design work.
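
As a small illustration of reading the header before starting design work, the Python sketch below pulls the experimental method and the reported resolution out of a legacy PDB-format file, using the `EXPDTA` and `REMARK   2` records. It assumes the legacy fixed-column format rather than mmCIF, the function name is our own, and the example header is fabricated rather than taken from a real entry.

```python
import re

def summarise_pdb_header(pdb_text):
    """Extract the experimental method and resolution from PDB-format header records."""
    method, resolution = None, None
    for line in pdb_text.splitlines():
        if line.startswith("EXPDTA") and method is None:
            method = line[10:].strip()  # technique starts at column 11
        elif line.startswith("REMARK   2 RESOLUTION."):
            match = re.search(r"(\d+\.\d+)\s+ANGSTROMS", line)
            if match:  # NMR entries report "NOT APPLICABLE" and leave this as None
                resolution = float(match.group(1))
        elif line.startswith("ATOM"):
            break  # coordinate section reached; the header is finished
    return {"method": method, "resolution": resolution}

# Fabricated header fragment, for illustration only.
example = """\
HEADER    OXYGEN STORAGE                          01-JAN-00   XXXX
EXPDTA    X-RAY DIFFRACTION
REMARK   2 RESOLUTION.    2.00 ANGSTROMS.
ATOM      1  N   VAL A   1      -2.900  17.600  15.500  1.00  0.00           N
"""
summary = summarise_pdb_header(example)
```

A check of this kind only flags the headline numbers; judging the quality of a model around a specific binding site still requires inspecting the experimental data for that region.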

Site identification and validation

Too often it is forgotten that a protein crystal structure is not only a model, but is also simply a snapshot of the protein, a low-energy conformation trapped at a single time point. Proteins are dynamic: whole domains rotate and translate, loops flap around and residue side-chains move. Proteins have active and inactive states (or open and closed states in the case of channels and transporters), and the structures of these low-energy states (or conformations) can usually be captured by X-ray crystallography, cryo-EM or NMR. Some proteins also have clearly identifiable intermediate states which can be trapped experimentally; a text search for the key words ‘intermediate state’ identified ∼430 available X-ray crystallography and cryo-EM structures in the PDB. Increasingly, molecular dynamics simulations are being used, with varying degrees of success, to predict intermediate protein states on conformational landscapes connecting stable, low-energy end states [26–28]. Ligand (or protein)-binding sites can be targeted by small molecules using a variety of computational techniques (described below). One of the first steps is to identify which protein state is appropriate to target; for example, targeting the ‘DFG-out’ or inactive conformation of the c-Abl kinase domain of the Bcr-Abl oncogenic fusion protein resulted in the highly successful drug Glivec™ (also known as Gleevec™, imatinib and STI-571, Figure 1, [29]), approved for treating chronic myelogenous leukaemia (CML), gastrointestinal stromal tumours and other cancers.

Having decided on the protein conformation or state to use, the next step is to identify ‘druggable’ sites on the protein, i.e. binding site(s) to target with small molecules. If ligand–protein complex structure(s) already exist for a protein target with biologically relevant molecules or approved drugs, then the binding site is not only identified but also validated. Drug-/ligand-binding sites come in a variety of shapes, with different characteristics and locations (Figure 3); the ideal site is a concave pocket lined by many hydrogen bond donors and acceptors, and a few hydrophobic side-chains [30,31]. Some binding sites are relatively flat and seemingly featureless, as in the case of many protein–protein interaction (PPI) interfaces. However, at such interfaces we often find ‘hotspot’ or key residues that account for a high percentage of the interaction energy, and it is possible to target these residues with small molecules [32–34]. Detailed descriptions of computational methods or tools to characterise binding sites have recently been published [31,35,36], and comprehensive lists (with web links) are available on many structure-based drug design websites (e.g. https://www.click2drug.org/).

Figure 3.
Different types of druggable ligand-binding sites in proteins.

(A) The substrate binding site of phosphodiesterase 4 (PDE4, PDB ID: 3SL4, [124]) is an example of an ideal ligand-binding site. The concave pocket contains hydrogen bond donors and acceptors, as well as hydrophobic and aromatic residues and the catalytic zinc (grey sphere in inset panel). The thiophene inhibitor is shown as cyan sticks and hydrogen bonds are depicted as black dashed lines. (B) Thanos and co-workers exploited two hot spot residues (E62 and F42, coloured red) on the relatively flat interleukin-2 (IL-2, PDB ID: 1PY2, [125]) surface that are critical for the interaction with its receptor α-subunit. By targeting these two IL-2 residues, and using a fragment-based approach to ligand design, they identified the PPI inhibitor SP4206 (shown as yellow sticks). (C) Binding of asciminib (cyan coloured sticks) in the allosteric site of Bcr-Abl is specific and maintains the kinase catalytic domain in an inactive conformation. The inhibitor, nilotinib (orange sticks), targets this conformation of Bcr-Abl and binds into the adenosine triphosphate-binding pocket (PDB ID: 5MO4, [126]). (D) The antibiotic target, TEM-1 β-lactamase, is a prime example of a protein with a cryptic ligand-binding site. In the ligand-free state (top), the catalytic residue Ser70 (coloured orange) is solvent accessible but only upon binding of a non-competitive, reversible inhibitor (cyan sticks, bottom protein structure) is the location of a cryptic binding site revealed (PDB IDs: 1JWP, 1PZO, [127,128]). In each panel, the protein is depicted as a grey molecular surface.

In the majority of cases, the choice of binding site may be an obvious one; for example, the orthosteric agonist site of a receptor or the catalytic site of an enzyme. However, sometimes greater selectivity and/or potency is achieved by targeting a unique allosteric or cryptic binding site. A cryptic binding site (or pocket) is a site on a protein that is only revealed upon the binding of a ligand (or another protein). Such sites can be identified experimentally (e.g. by analysing ligand-bound and unbound structures of the same protein) and computationally (e.g. by molecular dynamics methods, flexible ligand docking and hot spot residue prediction) [37–40]. The interest in targeting cryptic binding sites has led to the development of more automated web-based computational tools to identify or predict their presence in unbound protein structures, such as the TRAPP webserver [41] and CryptoSite [42]. A recent example of a compound designed to target an allosteric pocket is the clinical candidate asciminib (ABL001, Figure 1), currently in Phase I and III clinical trials for the treatment of CML, Philadelphia chromosome-positive acute lymphoblastic leukaemia and advanced solid tumours (NCT03106779, NCT02081378, NCT03292783, [43,44]). Asciminib binds into a small pocket on the Bcr-Abl kinase N-terminal domain usually occupied by myristate (Figure 3). The exploitation of a cryptic pocket in HIV integrase led to the development of Isentress™ (raltegravir, Figure 1, [45]), an antiretroviral drug used in the treatment of HIV/AIDS.

In silico screening and ligand docking

Over the last ∼30 years, the techniques have moved from manually (or interactively) docking single compounds into a pharmaceutically validated protein binding site to docking in silico libraries containing millions of compounds. The development of the individual technologies that are distilled into what is referred to as in silico or virtual screening (new algorithms, parallelisation of algorithms, improvements in molecular mechanics force fields, incorporation of quantum mechanical features, increased CPU power and the use of graphics processing units for computation, the ability to handle extremely large datasets electronically, high-resolution graphic displays etc. [26,46–51]), combined with ever-decreasing costs for computer hardware, has resulted in the methodology becoming entrenched in the drug discovery process in both industry and academia. In silico screening is now routinely used to enrich compound libraries prior to high-throughput screening campaigns [21,33,47,52–57]. Owing to the number of in-depth reviews on this subject published over the last ten years (e.g. [21,49,51,58–62]), the method is only described briefly here.

The key steps in protein structure-based in silico screening are: (1) preparation of the protein structure(s); (2) compound library preparation; (3) compound docking strategy selection and (4) analysis of results. Some of the factors to consider, or tasks to complete, when preparing the protein structure are to add in missing residues or atoms, decide whether structurally or functionally important water molecules should be included in the target site, and assign protonation/tautomer states to amino acid side-chains. When preparing an in silico compound library, it is critical that the nature of the protein target is taken into consideration. For example, when targeting a PPI some of the traditional drug-like physicochemical compound selection parameters may be less relevant [33,63–67]. For a central nervous system protein target, a compound library with a bias towards physicochemical properties enabling the compounds to cross the blood–brain barrier would be essential [68]. Other factors to consider during the compound library preparation are whether to generate stereoisomers, which protonation and tautomer states to include, and whether to filter out compounds known to show non-specific inhibition or those with other unwanted chemical properties [69–71]. In the case of natural product compound libraries being used to explore new chemical scaffolds, it may be that none of the standard compound filters are appropriate [72]. Any confirmed ligands for a protein target should be included in the compound library as positive controls, while any compounds known to be non-binders can be included as negative controls. Once the compound library is prepared, different docking strategies must be considered: whether to target the selected site in a single protein structure or, if available, multiple structures; what level of protein and compound flexibility to use; which docking algorithm and scoring functions are most appropriate; whether the presence of specific interactions seen for known inhibitors should be enforced, etc. Finally, once the library screen is complete, the task of analysis begins: typically the compounds are ranked by docking score and the highest ranking compounds are clustered by chemical similarity to provide a chemotype ranking rather than simply a compound ranking. This is particularly helpful if a docked library contains several closely related compounds that all score well against a target site, swamping the compound ranking list but providing a single entry in a chemotype ranking list. Compounds are then purchased, or medicinal chemistry undertaken, and the affinity and/or activity of the in silico ligands evaluated in in vitro and possibly in vivo assays. Examples of protein targets, software, methodology used and hit compounds can be found in the reviews cited above.
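
The step from a compound ranking to a chemotype ranking can be sketched in Python as follows. This is one simple way of doing it, not a description of any particular screening package: hits are sorted by docking score and grouped by leader clustering on Tanimoto similarity. The fingerprints are stand-in sets of integer feature bits, the compound names and scores are invented, and the convention that a lower score is better is an assumption (some programs report higher-is-better fitness values).

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def chemotype_ranking(hits, threshold=0.7):
    """Group score-ranked hits into chemotypes by leader clustering.

    hits: list of (name, docking_score, fingerprint_bits); lower score = better.
    Returns clusters in rank order of their best-scoring member (the leader).
    """
    clusters = []  # each cluster: {"leader": hit, "members": [names]}
    for hit in sorted(hits, key=lambda h: h[1]):
        for cluster in clusters:
            if tanimoto(hit[2], cluster["leader"][2]) >= threshold:
                cluster["members"].append(hit[0])
                break
        else:
            # No existing chemotype is similar enough: start a new one.
            clusters.append({"leader": hit, "members": [hit[0]]})
    return clusters

# Invented hit list: three close analogues followed by two unrelated compounds.
hits = [
    ("cpd-1", -10.2, {1, 2, 3, 4, 5}),
    ("cpd-2", -10.0, {1, 2, 3, 4, 6}),
    ("cpd-3", -9.8, {1, 2, 3, 4, 5, 6}),
    ("cpd-4", -9.1, {10, 11, 12}),
    ("cpd-5", -8.7, {20, 21, 22, 23}),
]
ranking = chemotype_ranking(hits, threshold=0.6)
```

Here the three analogues that would occupy the top of a compound ranking collapse into one chemotype, so the two structurally distinct compounds move up to second and third place.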

The initial focus of in silico screening was the identification of non-covalently bound compounds. However, with the FDA approval of several covalent drugs (e.g. Sabril™ in the treatment of epilepsy [73] and Incivek™ in the treatment of hepatitis C virus infection [74], Figure 1), there is renewed interest in applying the process to the discovery of covalently bound compounds. Modifications have been made to existing docking programs, such as GOLD and AutoDock, in an attempt to account for the covalent linkage between compound and protein, while new algorithms, such as DOCKTITE and CovalentDock, have also been developed [75–78]. The reader is directed to an excellent review on the subject by De Cesco et al. [79].

De novo ligand design

Identification of potential ligands for a target site on a protein through in silico screening has one drawback: the process can only identify compounds that already exist in the library used for screening. However, the number of potential drug-like compounds, a ‘chemical space’ estimated to be of the order of 10⁶⁰ distinct molecules, is vastly greater than the ∼10⁸ molecules that have been synthesised [80], let alone the standard in silico library size in the 10⁶ range. As an alternative, de novo ligand design combines information about the target site on the protein with computational chemistry to build new ligands in situ, selecting functional groups on the basis of their interaction with the target site and the geometry and chemistry of the compound as it is assembled [81,82]. This means that a much wider region of drug-like chemical space can be sampled directly, potentially leap-frogging some of the initial hit to lead development process while also identifying unique, and therefore valuable, compound structures.

While the principle in each case is the same, a range of different approaches exists for the derivation of ligand-binding sites during de novo design. New molecules are assembled in the target site on an atom-by-atom or fragment-by-fragment basis, where each fragment is a larger subsection of a drug-like molecule, such as an aromatic ring or methyl group (Figure 4). A method of scoring the addition of each new group to the growing molecule is used to rank a potential change. Depending on the software used, the scoring process may be rule-based, rely on spatial probe maps derived from a program such as GRID [10], employ a force field or make use of knowledge-based or empirical scoring functions [80]. The de novo design process can be interactive, allowing a medicinal chemist to design new compounds by hand, or automated such that the program generates a list of potential ligands independently. In both cases, ensuring synthetic tractability of the designed compounds is key; this comes from the medicinal chemist's direct input during manual design and from increasingly sophisticated computational synthesis for automated design. A recent review of chemistry-driven de novo design [82] summarises the software packages available with examples of their application, such as the successful development of both BACE-1 [83] and dihydroorotate dehydrogenase [84] inhibitors using SPROUT [85].
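
The fragment-by-fragment assembly with a score at each addition can be sketched, very loosely, as a beam search in Python. This is a toy under strong assumptions: a "molecule" is just an ordered tuple of fragment names with no geometry or chemistry, the per-fragment scores and the penalty for repeated fragments are invented stand-ins for a real scoring function, and no actual de novo design program is being reproduced.

```python
# Hypothetical per-fragment contributions (lower = better complementarity).
FRAGMENT_SCORES = {"phenyl": -1.5, "methyl": -0.4, "hydroxyl": -1.0, "amine": -0.8}

def score(molecule):
    """Toy score: summed fragment contributions plus a penalty for repeats."""
    base = sum(FRAGMENT_SCORES[f] for f in molecule)
    repeats = len(molecule) - len(set(molecule))
    return base + 1.0 * repeats

def grow(seed, steps, beam_width=2):
    """Beam search: extend every kept molecule by one fragment, keep the best few."""
    beam = [tuple(seed)]
    for _ in range(steps):
        candidates = {mol + (frag,) for mol in beam for frag in FRAGMENT_SCORES}
        # Rank all one-fragment elaborations; ties are broken alphabetically.
        beam = sorted(candidates, key=lambda m: (score(m), m))[:beam_width]
    return beam

# Start from a single seed fragment and elaborate twice.
designs = grow(("phenyl",), steps=2)
```

Keeping more than one candidate at each step (the beam width) is what lets different starting elaborations lead to different final compounds, rather than committing to the single best addition each time.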

Figure 4.
Overview of the process of de novo drug design.

In de novo design (A), the structure of a target site on a protein is selected. (B) The site is then used to identify sets of molecular fragments that are complementary to the site. Each starting fragment is then further elaborated and the process repeated several times until a set of end-point compounds are identified (C). The various starting fragments will end up identifying different elaborations, resulting in a wide diversity of final compounds. Shown here is a much simplified process with only three initial fragments, a single round of elaboration and a single product compound, with the dashed arrows indicating elaborations that are not detailed in the figure. In actual de novo design campaigns, there will be tens of fragments elaborated, refined and ranked, producing hundreds or thousands of potential ligands.

As well as building new compounds from scratch, the de novo design approach can be applied to an existing ligand–protein complex with the known ligand used as a seed structure for the design process. The software then identifies bioisosteres [86] for replaced sections of the molecule (Figure 4). Retaining part of a bound compound in its crystallographically determined position guarantees a starting point for de novo design that is located correctly, while maintaining the advantage of novel chemical space [87]. A single compound structure can potentially provide multiple substructural starting fragments to which de novo design rebuilding steps can be applied [87].

De novo design can also be used to remove undesirable features, such as off-target effects, from a ligand. Comparison of the structure of a developmental drug in complex with both its target and an off-target protein can potentially identify vectors of modification that will reduce binding to the undesired protein. An excellent example of this use of structure-based design is in the development of Bcl-2 family inhibitors as cancer therapeutics, where a co-crystal of the lead compound with human serum albumin (HSA) allowed the design of new molecules with reduced HSA affinity but unchanged binding to their true target [88].

Enzyme transition state inhibitors

Distinct from intermediate states, all enzymes have transition states that exist on a femtosecond (10⁻¹⁵ s) to picosecond (10⁻¹² s) timescale [89,90]. Given that this timescale is of the same order as bond vibrations, it is extremely difficult to study these transient states experimentally. They have been observed in the gas phase using femtosecond laser technologies, but they are more commonly studied using a combination of biochemical kinetic isotope effects and computational chemistry [89,90]. NMR and mass spectrometry techniques have been developed to analyse kinetic isotope effect data [91–93]. X-ray absorption spectroscopy currently operates at the femtosecond timescale and is moving towards the attosecond (10⁻¹⁸ s) realm; application of this powerful technique to enzyme reactions could lead to major advances in the observation of transition states [94,95]. Currently enzyme transition states can be modelled using computational chemistry methods, such as quantum mechanical calculations to interpret the kinetic isotope effect data [96,97], transition path sampling to find transition states in complex systems in a more unbiased way [98,99] or molecular dynamics simulations [100,101]. Compounds can then be designed to mimic the geometry and electrostatics of the transition state species. Such compounds are known as transition state analogues or inhibitors and they typically display a slow onset of enzyme inhibition and then slow inhibitor release [90]. It is only possible to develop transition state inhibitors for enzymes where the features of the transition state can be mimicked by stable chemistry; for example, the hydride transfer transition state of dehydrogenases would be extremely difficult to exploit.

The first X-ray crystal structures of transition state analogues in complex with their enzymes were solved in 1970 [102,103]. Over the ensuing 48 years, this number has increased by several hundred and a text search of the PDB identified ∼360 structure entries with the key words ‘transition state inhibitor’. A variety of computational methods have been applied to transition state inhibitor–enzyme crystal complexes to design new inhibitors with improved potency and selectivity, many of which are in the clinic or undergoing clinical trials. One example is the first-in-class renin inhibitor Tekturna™ (Figure 1), approved by the FDA in 2007 for primary hypertension. Structural analysis of transition state inhibitor–renin crystal complexes combined with interactive molecular modelling led to the development of Tekturna™ (also known as Rasilez™ and aliskiren, [104]). A more recent example is DADMe-Immucillin-H (Figure 1), a transition state inhibitor of human purine nucleoside phosphorylase. Immucillin-H–purine nucleoside phosphorylase crystal complexes were used as the starting point to design achiral inhibitors utilising molecular dynamics and quantum chemistry in the design process [89,90]. Under the names BCX4208 and Ulodesine, DADMe-Immucillin-H completed Phase II clinical trials in 2012 for gout (NCT01407874, NCT01265264).

Problem targets

While each technique has its unique shortcomings, there are some proteins that remain difficult to target through computational drug design regardless of the method used. Clearly, where a protein has a poorly defined structure, as is the case for intrinsically disordered proteins that may only adopt transitory structures in complex with a physiologically relevant partner [105], structure-based design is problematic. Computational approaches are being developed to tackle such proteins [106], but they remain a challenging target. As mentioned above, PPIs are another hard target for computational drug discovery. The typical PPI interface is a relatively large, flat, featureless surface; the exact opposite of a desirable target site for ligand design (see above and Figure 3). The importance of PPIs in many physiological processes means that efforts are still made to target them for drug design, but particular care needs to be taken [33].

Future directions

Computational techniques for structure-based drug discovery have come a long way from the initial attempts using manual fitting of wood, wire and plastic models. The field, however, is not static and there is a range of areas in which active development promises additional ways in which ligand–protein complexes will provide insights for the process of drug discovery, from initial hit discovery to preclinical candidate selection.

One area where there has been substantial progress is the determination of accurate binding energies for the formation of ligand–protein complexes [107–109]. To date, the binding affinity of ligands has been calculated either through some form of scoring function, whether knowledge-based or empirical, or through a relatively simple force field. While fine-tuning of these over time has meant that useful ranking information can be generated from these simplistic tools, an accurate measure of affinity with an error of less than 2 kcal mol⁻¹ is much harder to achieve. Using all-atom molecular dynamics simulations, an estimate of the total energy of the system, including protein, ligand, solvent etc., can be calculated as a function of the position of all the atoms. In simple terms, repetition of this calculation for extended simulations with the ligand both bound to the protein and free in solution allows an estimate of the energy of binding to be computed. Accuracy in the calculation of binding affinity is limited by the length of the simulation, and hence by the computational power available and the patience of the researcher. Even more accurate predictions can now be made where there is a known ligand with an experimentally determined affinity. The ligand structure can be computationally morphed into a new ligand of interest during the simulation [110]. This has particular applicability in medicinal chemistry campaigns where a compound core may be decorated with a range of functional groups by a team of chemists; experimental binding data need only be determined for one of the class of compounds, allowing the affinities of the rest to be accurately (±1 kcal mol⁻¹) determined computationally [110].
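The bookkeeping behind such relative binding free energy calculations can be sketched in a few lines of Python. The example below is a minimal illustration, assuming the standard relation ΔG = RT ln Kd (1 M standard state) and the usual thermodynamic cycle; the function names and all numerical inputs are hypothetical and are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
TEMP_K = 298.15       # assumed temperature


def dg_from_kd(kd_molar, temp_k=TEMP_K):
    """Standard binding free energy (kcal/mol) from a dissociation constant."""
    return R_KCAL * temp_k * math.log(kd_molar)


def kd_from_dg(dg_kcal, temp_k=TEMP_K):
    """Dissociation constant (M) from a binding free energy (kcal/mol)."""
    return math.exp(dg_kcal / (R_KCAL * temp_k))


def predict_dg(dg_reference, dg_morph_bound, dg_morph_free):
    """Thermodynamic cycle for morphing a reference ligand into a new one.

    ddG = dG_morph(bound to protein) - dG_morph(free in solvent), and
    dG_bind(new) = dG_bind(reference, experimental) + ddG.
    """
    return dg_reference + (dg_morph_bound - dg_morph_free)


# Hypothetical inputs: a 100 nM reference ligand and two made-up
# free energies for the alchemical morph in each environment.
dg_ref = dg_from_kd(100e-9)
dg_new = predict_dg(dg_ref, dg_morph_bound=-3.2, dg_morph_free=-2.0)
kd_new = kd_from_dg(dg_new)

# How much uncertainty in Kd does a 1 or 2 kcal/mol error imply?
fold_1kcal = math.exp(1.0 / (R_KCAL * TEMP_K))
fold_2kcal = math.exp(2.0 / (R_KCAL * TEMP_K))

print(f"Reference dG: {dg_ref:.2f} kcal/mol")
print(f"Predicted dG: {dg_new:.2f} kcal/mol (Kd about {kd_new * 1e9:.0f} nM)")
print(f"1 kcal/mol error: about {fold_1kcal:.1f}-fold in Kd")
print(f"2 kcal/mol error: about {fold_2kcal:.0f}-fold in Kd")
```

Because affinity depends exponentially on free energy, an error of 1 kcal mol⁻¹ corresponds to roughly a five-fold uncertainty in Kd at room temperature, whereas 2 kcal mol⁻¹ corresponds to nearly thirty-fold.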

Another area of active development is in silico prediction of absorption, metabolism, distribution, elimination and toxicity (ADME-Tox). Prior to research compounds moving into the clinic and being tested in people, they go through a process of preclinical development, where off-target effects and toxicity are explored in tissue culture and non-human animals [111]. Animal testing is a major ethical concern, so computational models that can replace some of the live animal testing are attractive. Computer models based on the chemical structure of potential drugs can be used to predict many of the important ADME-Tox properties of the compounds [112–116]. With the availability of more and more structures for important proteins in human metabolism, it is also now possible to explore the interaction of a compound with a potential metabolic or toxicity target directly. For example, plasma protein binding is an important factor in the distribution and elimination of drugs. As noted previously for Bcl-2 family inhibitors [88], knowledge of the details of ligand–plasma protein complex structures can be exploited to alter the affinity of the complex, thus altering the ADME parameters [117]. Recently, the cryo-EM structure of the potassium channel protein hERG became available, allowing rationalisation of years of toxicity models for this important cardiac ion channel [118]. As further structures of relevant human proteins become available, the ability to make ADME-Tox decisions on compound design based on specific ligand–protein complex models will continue to grow.

With the growth in computing power, there has also been a growth in the power of artificial intelligence (AI) techniques for problem solving. Machine learning algorithms, in which a training set of known results is used to optimise decision-making processes independently of direct human intervention, can produce systems that solve extremely complex problems very efficiently, including some for which an algorithmic solution once seemed impossible, such as the game of Go [119,120]. These deep learning approaches can be applied to drug discovery as well [121]; they are currently being applied to all aspects of computational chemistry and in silico ligand identification, including de novo design [122], docking and in silico screening [123]. While such techniques hold great promise, they are currently restricted by the inherent multi-feature optimisation required in drug design and the limited availability of appropriate training data [121]. As more work is put into machine learning and AI techniques for computational drug design, it can be expected that appropriate datasets for training will be developed in parallel.
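The idea of optimising a decision rule from a training set of known results can be illustrated with a deliberately simple Python sketch. It uses logistic regression trained by gradient descent, which is far simpler than the deep learning methods cited above; the two "descriptors" per compound and the active/inactive labels are invented toy values, not real screening data.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Predicted probability that a compound is active."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic(train_x, train_y, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples = len(train_x)
    weights = [0.0] * len(train_x[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(train_x, train_y):
            error = predict(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


# Toy training set (made up): two scaled descriptors per compound,
# label 1 = active, 0 = inactive.
train_x = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.1],
           [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(train_x, train_y)
p_active_like = predict(weights, bias, [0.85, 0.15])
p_inactive_like = predict(weights, bias, [0.15, 0.85])
print(f"P(active) for an active-like compound:   {p_active_like:.2f}")
print(f"P(active) for an inactive-like compound: {p_inactive_like:.2f}")
```

With only six training compounds the model is easily fitted, which mirrors the point made above: the quality of any learned model is bounded by the quantity and relevance of the training data available.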

Perhaps, as such techniques are perfected, a drug discovery programme will one day consist of simply handing a protein structure to a dedicated AI and collecting the clinical candidate from the automated medicinal chemistry laboratory next door. The inherent complexity of both the computational problem in identifying a potential drug and the synthetic challenge in creating it means that such an idyllic process is many years from being realised — there will be jobs for human drug discovery and development teams for the foreseeable future!

Abbreviations

     
  • ADME-Tox: absorption, metabolism, distribution, elimination and toxicity
  • AI: artificial intelligence
  • CML: chronic myelogenous leukaemia
  • cryo-EM: cryogenic electron microscopy
  • DPG: 2,3-diphosphoglycerate
  • EM: electron microscopy
  • ELISA: enzyme-linked immunosorbent assay
  • FRET: Förster resonance energy transfer
  • HSA: human serum albumin
  • HTS: high-throughput chemical screening assays
  • IL-2: interleukin-2
  • NMR: nuclear magnetic resonance
  • PDB: Protein Data Bank
  • PDE4: phosphodiesterase 4
  • PPI: protein–protein interaction

Author Contribution

T.L.N. and C.J.M. wrote the paper. All authors refined the manuscript. M.W.P. supervised the work.

Funding

This work was partly supported by a grant from the Australian Cancer Research Foundation to M.W.P. Funding from the Victorian Government Operational Infrastructure Support Scheme to St Vincent's Institute is acknowledged. M.W.P. is a National Health and Medical Research Council of Australia Research Fellow.

Acknowledgments

We thank all current and past members of the Parker laboratory and our collaborators for their contributions to our structure-based drug discovery efforts. We particularly thank Biota Pharmaceuticals for providing us with the opportunity to develop a structure-based drug discovery laboratory.

Competing Interests

The Authors declare that there are no competing interests associated with the manuscript.

References

1 Kendrew, J.C., Bodo, G., Dintzis, H.M., Parrish, R.G., Wyckoff, H. and Phillips, D.C. (1958) A three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature 181, 662–666
2 Kendrew, J.C., Dickerson, R.E., Strandberg, B.E., Hart, R.G., Davies, D.R., Phillips, D.C. et al. (1960) Structure of myoglobin: a three-dimensional Fourier synthesis at 2 Å resolution. Nature 185, 422–427
3 Watson, J.D. and Crick, F.H. (1953) Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid. Nature 171, 737–738
4 Blake, C.C., Koenig, D.F., Mair, G.A., North, A.C., Phillips, D.C. and Sarma, V.R. (1965) Structure of hen egg-white lysozyme: a three-dimensional Fourier synthesis at 2 Å resolution. Nature 206, 757–761
5 Johnson, L.N. and Phillips, D.C. (1965) Structure of some crystalline lysozyme-inhibitor complexes determined by X-ray analysis at 6 Å resolution. Nature 206, 761–763
6 Koshland, Jr, D.E. (1963) Correlation of structure and function in enzyme action. Science 142, 1533–1541
7 Beddell, C.R., Goodford, P.J., Norrington, F.E., Wilkinson, S. and Wootton, R. (1976) Compounds designed to fit a site of known structure in human haemoglobin. Br. J. Pharmacol. 57, 201–209
8 Arnone, A. (1972) X-ray diffraction study of binding of 2,3-diphosphoglycerate to human deoxyhaemoglobin. Nature 237, 146–149
9 Goodford, P.J. (1984) Drug design by the method of receptor fit. J. Med. Chem. 27, 557–564
10 Goodford, P.J. (1985) A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J. Med. Chem. 28, 849–857
11 Varghese, J.N., Laver, W.G. and Colman, P.M. (1983) Structure of the influenza virus glycoprotein antigen neuraminidase at 2.9 Å resolution. Nature 303, 35–40
12 von Itzstein, M., Wu, W.Y., Kok, G.B., Pegg, M.S., Dyason, J.C., Jin, B. et al. (1993) Rational design of potent sialidase-based inhibitors of influenza virus replication. Nature 363, 418–423
13 Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H. et al. (2000) The Protein Data Bank. Nucleic Acids Res. 28, 235–242
14 Berman, H., Henrick, K. and Nakamura, H. (2003) Announcing the worldwide Protein Data Bank. Nat. Struct. Biol. 10, 980
15 Rose, P.W., Prlic, A., Altunkaya, A., Bi, C., Bradley, A.R., Christie, C.H. et al. (2017) The RCSB protein data bank: integrative view of protein, gene and 3D structural information. Nucleic Acids Res. 45, D271–D281
16 Khoshouei, M., Radjainia, M., Baumeister, W. and Danev, R. (2017) Cryo-EM structure of haemoglobin at 3.2 Å determined with the Volta phase plate. Nat. Commun. 8, 16099
17 Liang, Y.L., Khoshouei, M., Glukhova, A., Furness, S.G.B., Zhao, P., Clydesdale, L. et al. (2018) Phase-plate cryo-EM structure of a biased agonist-bound human GLP-1 receptor-Gs complex. Nature 555, 121–125
18 Liang, Y.L., Khoshouei, M., Radjainia, M., Zhang, Y., Glukhova, A., Tarrasch, J. et al. (2017) Phase-plate cryo-EM structure of a class B GPCR-G-protein complex. Nature 546, 118–123
19 Renaud, J.P., Chari, A., Ciferri, C., Liu, W.T., Remigy, H.W., Stark, H. et al. (2018) Cryo-EM in drug discovery: achievements, limitations and prospects. Nat. Rev. Drug Discov. 17, 471–492
20 Albiston, A.L., Pham, V., Ye, S., Ng, L., Lew, R.A., Thompson, P.E. et al. (2010) Phenylalanine-544 plays a key role in substrate and inhibitor binding by providing a hydrophobic packing point at the active site of insulin-regulated aminopeptidase. Mol. Pharmacol. 78, 600–607
21 Leelananda, S.P. and Lindert, S. (2016) Computational methods in drug discovery. Beilstein J. Org. Chem. 12, 2694–2718
22 Lohning, A.E., Levonis, S.M., Williams-Noonan, B. and Schweiker, S.S. (2017) A practical guide to molecular docking and homology modelling for medicinal chemists. Curr. Top. Med. Chem. 17, 2023–2040
23 Schmidt, T., Bergner, A. and Schwede, T. (2014) Modelling three-dimensional protein structures for applications in drug design. Drug Discov. Today 19, 890–897
24 Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R. et al. (2018) SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296–W303
25 Berman, H.M., Kleywegt, G.J., Nakamura, H. and Markley, J.L. (2012) The Protein Data Bank at 40: reflecting on the past to prepare for the future. Structure 20, 391–396
26 Harpole, T.J. and Delemotte, L. (2018) Conformational landscapes of membrane proteins delineated by enhanced sampling molecular dynamics simulations. Biochim. Biophys. Acta 1860, 909–926
27 Orellana, L., Yoluk, O., Carrillo, O., Orozco, M. and Lindahl, E. (2016) Prediction and validation of protein intermediate states from structurally rich ensembles and coarse-grained simulations. Nat. Commun. 7, 12575
28 Pisani, P., Caporuscio, F., Carlino, L. and Rastelli, G. (2016) Molecular dynamics simulations and classical multidimensional scaling unveil new metastable states in the conformational landscape of CDK2. PLoS ONE 11, e0154066
29 Waller, C.F. (2014) Imatinib mesylate. Recent Results Cancer Res. 201, 1–25
30 Lionta, E., Spyrou, G., Vassilatis, D.K. and Cournia, Z. (2014) Structure-based virtual screening for drug discovery: principles, applications and recent advances. Curr. Top. Med. Chem. 14, 1923–1938
31 Pérot, S., Sperandio, O., Miteva, M.A., Camproux, A.C. and Villoutreix, B.O. (2010) Druggable pockets and binding site centric chemical space: a paradigm shift in drug discovery. Drug Discov. Today 15, 656–667
32 Fry, D.C. (2015) Targeting protein-protein interactions for drug discovery. Methods Mol. Biol. 1278, 93–106
33 Nero, T.L., Morton, C.J., Holien, J.K., Wielens, J. and Parker, M.W. (2014) Oncogenic protein interfaces: small molecules, big challenges. Nat. Rev. Cancer 14, 248–262
34 Xue, W., Wang, P., Li, B., Li, Y., Xu, X., Yang, F. et al. (2016) Identification of the inhibitory mechanism of FDA approved selective serotonin reuptake inhibitors: an insight from molecular dynamics simulation study. Phys. Chem. Chem. Phys. 18, 3260–3271
35 Abi Hussein, H., Geneix, C., Petitjean, M., Borrel, A., Flatters, D. and Camproux, A.C. (2017) Global vision of druggability issues: applications and perspectives. Drug Discov. Today 22, 404–415
36 Xie, Z.R. and Hwang, M.J. (2015) Methods for predicting protein-ligand binding sites. Methods Mol. Biol. 1215, 383–398
37 Beglov, D., Hall, D.R., Wakefield, A.E., Luo, L., Allen, K.N., Kozakov, D. et al. (2018) Exploring the structural origins of cryptic sites on proteins. Proc. Natl Acad. Sci. U.S.A. 115, E3416–E3425
38 Hart, K.M., Moeder, K.E., Ho, C.M.W., Zimmerman, M.I., Frederick, T.E. and Bowman, G.R. (2017) Designing small molecules to target cryptic pockets yields both positive and negative allosteric modulators. PLoS ONE 12, e0178678
39 Lu, S., Ji, M., Ni, D. and Zhang, J. (2018) Discovery of hidden allosteric sites as novel targets for allosteric drug design. Drug Discov. Today 23, 359–365
40 Oleinikovas, V., Saladino, G., Cossins, B.P. and Gervasio, F.L. (2016) Understanding cryptic pocket formation in protein targets by enhanced sampling simulations. J. Am. Chem. Soc. 138, 14257–14263
41 Stank, A., Kokh, D.B., Horn, M., Sizikova, E., Neil, R., Panecka, J. et al. (2017) TRAPP webserver: predicting protein binding site flexibility and detecting transient binding pockets. Nucleic Acids Res. 45, W325–W330
42 Cimermancic, P., Weinkam, P., Rettenmaier, T.J., Bichmann, L., Keedy, D.A., Woldeyes, R.A. et al. (2016) Cryptosite: expanding the druggable proteome by characterization and prediction of cryptic binding sites. J. Mol. Biol. 428, 709–719
43 Adrian, F.J., Ding, Q., Sim, T., Velentza, A., Sloan, C., Liu, Y. et al. (2006) Allosteric inhibitors of Bcr-abl-dependent cell proliferation. Nat. Chem. Biol. 2, 95–102
44 Fabbro, D., Manley, P.W., Jahnke, W., Liebetanz, J., Szyttenholm, A., Fendrich, G. et al. (2010) Inhibitors of the Abl kinase directed at either the ATP- or myristate-binding site. Biochim. Biophys. Acta 1804, 454–462
45 Summa, V., Petrocchi, A., Bonelli, F., Crescenzi, B., Donghi, M., Ferrara, M. et al. (2008) Discovery of raltegravir, a potent, selective orally bioavailable HIV-integrase inhibitor for the treatment of HIV-AIDS infection. J. Med. Chem. 51, 5843–5855
46 Cavasotto, C.N., Adler, N.S. and Aucar, M.G. (2018) Quantum chemical approaches in structure-based virtual screening and lead optimization. Front. Chem. 6, 188
47 Fradera, X. and Babaoglu, K. (2017) Overview of methods and strategies for conducting virtual small molecule screening. Curr. Protoc. Chem. Biol. 9, 196–212
48 Korb, O., Finn, P.W. and Jones, G. (2014) The cloud and other new computational methods to improve molecular modelling. Expert Opin. Drug Discov. 9, 1121–1131
49 Śledź, P. and Caflisch, A. (2018) Protein structure-based drug design: from docking to molecular dynamics. Curr. Opin. Struct. Biol. 48, 93–102
50 Wingert, B.M. and Camacho, C.J. (2018) Improving small molecule virtual screening strategies for the next generation of therapeutics. Curr. Opin. Chem. Biol. 44, 87–92
51 Yuriev, E., Holien, J. and Ramsland, P.A. (2015) Improvements, trends, and new ideas in molecular docking: 2012–2013 in review. J. Mol. Recognit. 28, 581–604
52 Chen, H., Kogej, T. and Engkvist, O. (2018) Cheminformatics in drug discovery, an industrial perspective. Mol. Inform. 53, 4830
53 Glaab, E. (2016) Building a virtual ligand screening pipeline using free software: a survey. Brief. Bioinform. 17, 352–366
54 Macalino, S.J., Gosu, V., Hong, S. and Choi, S. (2015) Role of computer-aided drug design in modern drug discovery. Arch. Pharm. Res. 38, 1686–1701
55 Roth, B.L., Irwin, J.J. and Shoichet, B.K. (2017) Discovery of new GPCR ligands to illuminate new biology. Nat. Chem. Biol. 13, 1143–1151
56 Xu, D., Wang, B. and Meroueh, S.O. (2015) Structure-based computational approaches for small-molecule modulation of protein–protein interactions. Methods Mol. Biol. 1278, 77–92
57 Zheng, M., Zhao, J., Cui, C., Fu, Z., Li, X., Liu, X. et al. (2018) Computational chemical biology and drug design: facilitating protein structure, function, and modulation studies. Med. Res. Rev. 38, 914–950
58 Cerqueira, N.M., Gesto, D., Oliveira, E.F., Santos-Martins, D., Bras, N.F., Sousa, S.F. et al. (2015) Receptor-based virtual screening protocol for drug discovery. Arch. Biochem. Biophys. 582, 56–67
59 Forli, S. (2015) Charting a path to success in virtual screening. Molecules 20, 18732–18758
60 Katsila, T., Spyroulias, G.A., Patrinos, G.P. and Matsoukas, M.T. (2016) Computational approaches in target identification and drug discovery. Comput. Struct. Biotechnol. J. 14, 177–184
61 Spyrakis, F. and Cavasotto, C.N. (2015) Open challenges in structure-based virtual screening: receptor modeling, target flexibility consideration and active site water molecules description. Arch. Biochem. Biophys. 583, 105–119
62 Tanrikulu, Y., Krüger, B. and Proschak, E. (2013) The holistic integration of virtual screening in drug discovery. Drug Discov. Today 18, 358–364
63 Koes, D.R., Dömling, A. and Camacho, C.J. (2018) Anchorquery: rapid online virtual screening for small-molecule protein–protein interaction inhibitors. Protein Sci. 27, 229–232
64 Kuenemann, M.A., Sperandio, O., Labbé, C.M., Lagorce, D., Miteva, M.A. and Villoutreix, B.O. (2015) In silico design of low molecular weight protein–protein interaction inhibitors: overall concept and recent advances. Prog. Biophys. Mol. Biol. 119, 20–32
65 Sable, R. and Jois, S. (2015) Surfing the protein-protein interaction surface using docking methods: application to the design of PPI inhibitors. Molecules 20, 11569–11603
66 Whitby, L.R. and Boger, D.L. (2012) Comprehensive peptidomimetic libraries targeting protein-protein interactions. Acc. Chem. Res. 45, 1698–1709
67 Zhang, X., Betzi, S., Morelli, X. and Roche, P. (2014) Focused chemical libraries – design and enrichment: an example of protein–protein interaction chemical space. Future Med. Chem. 6, 1291–1307
68 Ghose, A.K., Herbertz, T., Hudkins, R.L., Dorsey, B.D. and Mallamo, J.P. (2012) Knowledge-based, central nervous system (CNS) lead selection and lead optimization for CNS drug discovery. ACS Chem. Neurosci. 3, 50–68
69 Baell, J.B. and Holloway, G.A. (2010) New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J. Med. Chem. 53, 2719–2740
70 Capuzzi, S.J., Muratov, E.N. and Tropsha, A. (2017) Phantom PAINS: problems with the utility of alerts for Pan-Assay INterference CompoundS. J. Chem. Inf. Model. 57, 417–427
71 Yang, J.J., Ursu, O., Lipinski, C.A., Sklar, L.A., Oprea, T.I. and Bologa, C.G. (2016) Badapple: promiscuity patterns from noisy evidence. J. Cheminform. 8, 29
72 Sheppard, D.W., Lipkin, M.J., Harris, C.J., Catana, C. and Stouten, P.F. (2014) Strategies for small molecule library design. Curr. Pharm. Des. 20, 3314–3322
73 Tolman, J.A. and Faulkner, M.A. (2009) Vigabatrin: a comprehensive review of drug properties including clinical updates following recent FDA approval. Expert Opin. Pharmacother. 10, 3077–3089
74 Kwong, A.D., Kauffman, R.S., Hurter, P. and Mueller, P. (2011) Discovery and development of telaprevir: an NS3-4A protease inhibitor for treating genotype 1 chronic hepatitis C virus. Nat. Biotechnol. 29, 993–1003
75 Bianco, G., Forli, S., Goodsell, D.S. and Olson, A.J. (2016) Covalent docking using autodock: two-point attractor and flexible side chain methods. Protein Sci. 25, 295–301
76 Jones, G., Willett, P., Glen, R.C., Leach, A.R. and Taylor, R. (1997) Development and validation of a genetic algorithm for flexible docking. J. Mol. Biol. 267, 727–748
77 Ouyang, X., Zhou, S., Su, C.T., Ge, Z., Li, R. and Kwoh, C.K. (2013) Covalentdock: automated covalent docking with parameterized covalent linkage energy estimation and molecular geometry constraints. J. Comput. Chem. 34, 326–336
78 Scholz, C., Knorr, S., Hamacher, K. and Schmidt, B. (2015) DOCKTITE – a highly versatile step-by-step workflow for covalent docking and virtual screening in the molecular operating environment. J. Chem. Inf. Model. 55, 398–406
79 De Cesco, S., Kurian, J., Dufresne, C., Mittermaier, A.K. and Moitessier, N. (2017) Covalent inhibitors design and discovery. Eur. J. Med. Chem. 138, 96–114
80 Schneider, G. and Fechner, U. (2005) Computer-based de novo design of drug-like molecules. Nat. Rev. Drug Discov. 4, 649–663
81 Congreve, M., Murray, C.W. and Blundell, T.L. (2005) Structural biology and drug discovery. Drug Discov. Today 10, 895–907
82 Hoffer, L., Muller, C., Roche, P. and Morelli, X. (2018) Chemistry-driven hit-to-lead optimization guided by structure-based approaches. Mol. Inform. 55, 91
83 Mok, N.Y., Chadwick, J., Kellett, K.A., Casas-Arce, E., Hooper, N.M., Johnson, A.P. et al. (2013) Discovery of biphenylacetamide-derived inhibitors of BACE1 using de novo structure-based molecular design. J. Med. Chem. 56, 1843–1852
84 Davies, M., Heikkila, T., McConkey, G.A., Fishwick, C.W., Parsons, M.R. and Johnson, A.P. (2009) Structure-based design, synthesis, and characterization of inhibitors of human and Plasmodium falciparum dihydroorotate dehydrogenases. J. Med. Chem. 52, 2683–2693
85 Gillet, V., Johnson, A.P., Mata, P., Sike, S. and Williams, P. (1993) SPROUT: a program for structure generation. J. Comput. Aided Mol. Des. 7, 127–153
86 Brown, N. (2014) Bioisosteres and scaffold hopping in medicinal chemistry. Mol. Inform. 33, 458–462
87 Evers, A., Hessler, G., Wang, L.H., Werrel, S., Monecke, P. and Matter, H. (2013) CROSS: an efficient workflow for reaction-driven rescaffolding and side-chain optimization using robust chemical reactions and available reagents. J. Med. Chem. 56, 4656–4670
88 Oltersdorf, T., Elmore, S.W., Shoemaker, A.R., Armstrong, R.C., Augeri, D.J., Belli, B.A. et al. (2005) An inhibitor of Bcl-2 family proteins induces regression of solid tumours. Nature 435, 677–681
89 Schramm, V.L. (2013) Transition states, analogues, and drug development. ACS Chem. Biol. 8, 71–81
90 Schramm, V.L. (2015) Transition states and transition state analogue interactions with enzymes. Acc. Chem. Res. 48, 1032–1039
91 Ducati, R.G., Firestone, R.S. and Schramm, V.L. (2017) Kinetic isotope effects and transition state structure for hypoxanthine-guanine-xanthine phosphoribosyltransferase from Plasmodium falciparum. Biochemistry 56, 6368–6376
92 Harris, M.E., York, D.M., Piccirilli, J.A. and Anderson, V.E. (2017) Kinetic isotope effect analysis of RNA 2′-O-transphosphorylation. Methods Enzymol. 596, 433–457
93 Mercedes-Camacho, A.Y., Mullins, A.B., Mason, M.D., Xu, G.G., Mahoney, B.J., Wang, X. et al. (2013) Kinetic isotope effects support the twisted amide mechanism of Pin1 peptidyl-prolyl isomerase. Biochemistry 52, 7707–7713
94 Bressler, C. and Chergui, M. (2010) Molecular structural dynamics probed by ultrafast X-ray absorption spectroscopy. Annu. Rev. Phys. Chem. 61, 263–282
95 Kraus, P.M., Zürch, M., Cushing, S.K., Neumark, D.M. and Leone, S.R. (2018) The ultrafast X-ray spectroscopic revolution in chemical dynamics. Nat. Rev. Chem. 2, 82–94
96 Bagdassarian, C.K., Schramm, V.L. and Schwartz, S.D. (1996) Molecular electrostatic potential analysis for enzymatic substrates, competitive inhibitors, and transition-state inhibitors. J. Am. Chem. Soc. 118, 8825–8836
97 Kline, P.C. and Schramm, V.L. (1993) Purine nucleoside phosphorylase. Catalytic mechanism and transition-state analysis of the arsenolysis reaction. Biochemistry 32, 13212–9
98 Basner, J.E. and Schwartz, S.D. (2005) How enzyme dynamics helps catalyze a reaction in atomic detail: a transition path sampling study. J. Am. Chem. Soc. 127, 13822–13831
99 Bolhuis, P.G., Chandler, D., Dellago, C. and Geissler, P.L. (2002) Transition path sampling: throwing ropes over rough mountain passes, in the dark. Annu. Rev. Phys. Chem. 53, 291–318
100 Hirschi, J.S., Arora, K., Brooks, III, C.L. and Schramm, V.L. (2010) Conformational dynamics in human purine nucleoside phosphorylase with reactants and transition-state analogues. J. Phys. Chem. B 114, 16263–16272
101 Lotz, S.D. and Dickson, A. (2018) Unbiased molecular dynamics of 11 min timescale drug unbinding reveals transition state stabilizing interactions. J. Am. Chem. Soc. 140, 618–628
102 Johnson, L.N. and Wolfenden, R. (1970) Changes in absorption spectrum and crystal structure of triose phosphate isomerase brought about by 2-phosphoglycollate, a potential transition state analogue. J. Mol. Biol. 47, 93–100
103 Wolfenden, R. (1999) Conformational aspects of inhibitor design: enzyme-substrate interactions in the transition state. Bioorg. Med. Chem. 7, 647–652
104 Wood, J.M., Maibaum, J., Rahuel, J., Grutter, M.G., Cohen, N.C., Rasetti, V. et al. (2003) Structure-based design of aliskiren, a novel orally effective renin inhibitor. Biochem. Biophys. Res. Commun. 308, 698–705
105 Tsafou, K., Tiwari, P.B., Forman-Kay, J.D., Metallo, S.J. and Toretsky, J.A. (2018) Targeting intrinsically disordered transcription factors: changing the paradigm. J. Mol. Biol. 430, 2321–2341
106 Ambadipudi, S. and Zweckstetter, M. (2016) Targeting intrinsically disordered proteins in rational drug discovery. Expert Opin. Drug Discov. 11, 65–77
107 Mobley, D.L. and Gilson, M.K. (2017) Predicting binding free energies: frontiers and benchmarks. Annu. Rev. Biophys. 46, 531–558
108 Abel, R., Wang, L., Mobley, D.L. and Friesner, R.A. (2017) A critical review of validation, blind testing, and real-world use of alchemical protein-ligand binding free energy calculations. Curr. Top. Med. Chem. 17, 2577–2585
109 Bruce, N.J., Ganotra, G.K., Kokh, D.B., Sadiq, S.K. and Wade, R.C. (2018) New approaches for computing ligand-receptor binding kinetics. Curr. Opin. Struct. Biol. 49, 1–10
110 Wang, L., Wu, Y., Deng, Y., Kim, B., Pierce, L., Krilov, G. et al. (2015) Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field. J. Am. Chem. Soc. 137, 2695–2703
111 Colerangle, J.B. (2013) Preclinical development of non-oncogenic drugs (small and large molecules). In A Comprehensive Guide to Toxicology in Preclinical Drug Development (Faqi, A.S., ed.). Academic Press, Cambridge, Massachusetts
112 Alqahtani, S. (2017) In silico ADME-Tox modeling: progress and prospects. Expert Opin. Drug Metab. Toxicol. 13, 1147–1158
113 Lagunin, A.A., Dubovskaja, V.I., Rudik, A.V., Pogodin, P.V., Druzhilovskiy, D.S., Gloriozova, T.A. et al. (2018) CLC-Pred: a freely available web-service for in silico prediction of human cell line cytotoxicity for drug-like compounds. PLoS ONE 13, e0191838
114 Mervin, L.H., Cao, Q., Barrett, I.P., Firth, M.A., Murray, D., McWilliams, L. et al. (2016) Understanding cytotoxicity and cytostaticity in a high-throughput screening collection. ACS Chem. Biol. 11, 3007–3023
115 Miteva, M.A. and Villoutreix, B.O. (2017) Computational biology and chemistry in MTi: emphasis on the prediction of some ADMET properties. Mol. Inform. 36, 1700008
116 Rognan, D. (2017) The impact of in silico screening in the discovery of novel and safer drug candidates. Pharmacol. Ther. 175, 47–66
117 Lambrinidis, G., Vallianatou, T. and Tsantili-Kakoulidou, A. (2015) In vitro, in silico and integrated strategies for the estimation of plasma protein binding. A review. Adv. Drug Deliv. Rev. 86, 27–45
118 Wang, W. and MacKinnon, R. (2017) Cryo-EM structure of the open human ether-à-go-go-related K(+) channel hERG. Cell 169, 422–430.e10
119 Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G. et al. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489
120 Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A. et al. (2017) Mastering the game of Go without human knowledge. Nature 550, 354–359
121 Jing, Y., Bian, Y., Hu, Z., Wang, L. and Xie, X.S. (2018) Deep learning for drug design: an artificial intelligence paradigm for drug discovery in the big data era. AAPS J. 20, 58
122 Olivecrona, M., Blaschke, T., Engkvist, O. and Chen, H. (2017) Molecular de-novo design through deep reinforcement learning. J. Cheminform. 9, 48
123 Pereira, J.C., Caffarena, E.R. and Dos Santos, C.N. (2016) Boosting docking-based virtual screening with deep learning. J. Chem. Inf. Model. 56, 2495–2506
124 Nankervis, J.L., Feil, S.C., Hancock, N.C., Zheng, Z., Ng, H.L., Morton, C.J. et al. (2011) Thiophene inhibitors of PDE4: crystal structures show a second binding mode at the catalytic domain of PDE4D2. Bioorg. Med. Chem. Lett. 21, 7089–7093
125 Thanos, C.D., Randal, M. and Wells, J.A. (2003) Potent small-molecule binding to a dynamic hot spot on IL-2. J. Am. Chem. Soc. 125, 15280–1
126 Wylie, A.A., Schoepfer, J., Jahnke, W., Cowan-Jacob, S.W., Loo, A., Furet, P. et al. (2017) The allosteric inhibitor ABL001 enables dual targeting of BCR-ABL1. Nature 543, 733–737
127 Horn, J.R. and Shoichet, B.K. (2004) Allosteric inhibition through core disruption. J. Mol. Biol. 336, 1283–1291
128 Wang, X., Minasov, G. and Shoichet, B.K. (2002) Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J. Mol. Biol. 320, 85–95