It is over 20 years since the first fragment-based discovery projects were disclosed. The methods are now mature for most ‘conventional’ targets in drug discovery such as enzymes (kinases and proteases), but there has also been growing success on more challenging targets, such as disruption of protein–protein interactions. The main application is to identify tractable chemical startpoints that non-covalently modulate the activity of a biological molecule. In this essay, we give an overview of current practice in the methods and discuss how they have had an impact in lead discovery – generating a large number of fragment-derived compounds that are in clinical trials and two medicines treating patients. In addition, we discuss some of the more recent applications of the methods in chemical biology – providing chemical tools to investigate biological molecules, mechanisms and systems.
The key feature of fragment methods is that discovery begins by identification of low molecular weight (MW) compounds or fragments (typically approximately 200 Da) that bind to the biological molecule of interest. Because they are small, the fragments are less likely to contain portions that prevent binding and so are more likely to identify functional motifs that match the requirements of the target. The fragment hits can then be optimized, usually through structure-based design, into higher affinity compounds that can be used to probe the biology of the target or as a starting point (lead) for drug discovery.
Typically, fragments bind to most target binding sites with an equilibrium dissociation constant (KD) in the 100s µM to low mM range. This places constraints on the assays that can reliably detect such weak binding (usually biophysical methods) and the design of the fragment library (solubility for the high concentrations of fragments needed in assays). Although there are some reports of using fragment methods against nucleic acids, the overwhelming majority of the published work is on proteins – which will be the type of target covered in this essay.
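To illustrate why such weak binding constrains assay design, the following minimal Python sketch estimates the fraction of protein occupied by a fragment at a given concentration. It assumes simple 1:1 binding with the fragment in large excess over the protein, so that occupancy = [L] / ([L] + KD); the function name and the example values are illustrative and are not taken from the essay.

```python
def fraction_bound(ligand_conc_m: float, kd_m: float) -> float:
    """Fractional occupancy of a single site for 1:1 binding.

    Assumes the ligand is in large excess over the protein, so the free
    ligand concentration is approximately the total concentration.
    Both arguments are in mol/L.
    """
    if ligand_conc_m < 0 or kd_m <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ligand_conc_m / (ligand_conc_m + kd_m)


if __name__ == "__main__":
    kd = 1e-3  # hypothetical fragment with KD = 1 mM
    for conc in (1e-5, 1e-4, 1e-3, 1e-2):
        occupancy = fraction_bound(conc, kd)
        print(f"[L] = {conc:.0e} M -> occupancy = {occupancy:.2f}")
```

Under these assumptions, a fragment with a KD of 1 mM occupies only about 1% of sites at 10 µM but half of them at 1 mM, which is why fragments have to be soluble at high concentrations.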
The first report of fragment-based discovery was in 1996 (discussed later) and there was rapid development of the methods and their application in the late 1990s and early 2000s, particularly in small, structure-based discovery companies such as Astex, Sunesis and Vernalis. What was vital was that progress could be made with a small (approximately 1000 compounds) fragment library providing novel chemical matter to start a discovery project, and the companies were able to exploit high-throughput crystallographic methods to generate candidate compounds. The successes of the mid-2000s led to the widespread adoption of the approach across pharmaceutical companies and academia. There are now many compounds in the clinic (http://practicalfragments.blogspot.co.uk/2016/07/fragments-in-clinic-2016-edition.html) and two fragment-derived compounds treating patients [6,7].
There have been many reviews and articles summarizing the methods in more detail than will be covered here. Some representative ones are: a review by some of the leading practitioners summarizing the developments of the last 20 years; a review of the application of biophysical methods in drug discovery; a recent summary of fragment to lead campaigns; an excellent book with chapters from leading figures in fragment-based discovery; and a chapter written by one of the authors of this essay in the RSC Handbook of Medicinal Chemistry. In addition, there is an excellent blog that contains précis and comments on recent developments, conferences and publications on topics in fragment-based lead discovery (FBLD) (http://practicalfragments.blogspot.co.uk/).
There are variations in the phrase and acronym used to describe the field. We prefer FBLD – fragment-based lead discovery – because the key step in either a drug discovery or a chemical biology project is identifying a compound good enough to answer questions about the target (a lead). Discovery of a drug does depend on the properties introduced as that lead is optimized, and the fragment discovery process can have an impact. However, it is also heavily dependent on whether binding to the target has the required effect on the disease or condition – a reason why some projects fail in clinical trials – which often has little to do with the compound itself.
Here, we will provide a brief overview of the key features of FBLD and give a perspective on some of the more recent developments – in particular the use of fragments in chemical biology, exploring features of protein structure and activity.
Development of the key ideas and methods
In most areas of science, the development of a field is built on a set of ideas established by many researchers. In the case of FBLD, there are a number of contributions that should be highlighted.
The first is from the work of Jencks in the early 1970s. He pointed out that for a molecule made up of two interacting parts, there are contributions to free energy from the interactions made by each of the two parts and additionally from the rotational and translational entropy of the whole molecule. This highlights that although the binding of an individual part (as in the case of a fragment) may have quite low affinity because it must overcome the entropic cost of binding, all the interaction energy of another binding part is added when it is incorporated (or linked). He also reminded us that ΔG = RT ln KD (equivalently, −RT ln KA, where KA = 1/KD) – i.e. doubling the free energy of binding squares the binding constant.
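As a worked illustration of the last point, the relation can be written out explicitly. This sketch uses the convention ΔG° = RT ln K_D (equivalent to −RT ln K_A with K_A = 1/K_D, and K_D expressed relative to a 1 M standard state). It also assumes an idealized case in which the second part contributes the same free energy as the first, with no additional entropic penalty and a strain-free linker; real linked compounds rarely achieve this.

```latex
\begin{align}
\Delta G^{\circ}_{1} &= RT \ln K_{D,1} \\
\Delta G^{\circ}_{\mathrm{linked}} &= 2\,\Delta G^{\circ}_{1} = RT \ln K_{D,1}^{2}
  \quad\Rightarrow\quad K_{D,\mathrm{linked}} = K_{D,1}^{2} \\
K_{D,1} = 10^{-3}\,\mathrm{M} &\quad\Rightarrow\quad K_{D,\mathrm{linked}} = 10^{-6}\,\mathrm{M}
\end{align}
```

In other words, under these assumptions two parts that each bind at 1 mM would give a linked compound binding at 1 µM.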
The second set of ideas came from computational methods. Goodford developed the GRID method that visualizes the calculated energy of interaction between a probe atom and a protein surface, highlighting where particular types of interactions could be made. This approach was extended to mapping where functional groups preferentially bind in the multiple copy simultaneous search (MCSS) method of Miranker and Karplus. This idea of functional groups (i.e. very small fragments) binding was explored experimentally by Ringe and Mattos and by English et al. through the determination of the structure of protein crystals soaked in high concentrations of various solvents (so-called multiple solvent crystal structures, MSCS). This combination of studies inspired the idea that small molecules binding to a protein surface can be constructed from small pieces.
It was the work of Fesik and co-workers at Abbott that first reduced these ideas to practice for lead discovery, in a set of papers using a method they termed structure–activity relationship (SAR) by nuclear magnetic resonance (NMR). They used protein-observed NMR (see later) to identify fragments binding to two independent binding sites and linked these fragments together to make potent inhibitors. The approach was extended, first by Abbott, to use X-ray crystallography to identify the fragments, an approach that was rapidly exploited and developed by Astex during the early 2000s, along with developments at Sunesis and Vernalis. Although a few larger companies did experiment with the approaches, it was the small companies that primarily developed the methods and demonstrated success, which by the mid-2000s led to the wider take-up of the methods across the pharmaceutical industry.
Two other developments made a contribution to the thinking in FBLD. The first was from Hann et al., who developed models around molecular complexity – i.e. that a molecule needs to have sufficient features to make interactions with a binding site, but not so many features (or such a size) that binding is prevented. This provided a framework for thinking about the size of compounds in a fragment library. The second was the idea of ligand efficiency – which in its simplest form is the amount of binding free energy per non-hydrogen atom in a compound. This had a major impact on medicinal chemistry thinking, providing a guide as to whether atoms added to a compound during a cycle of optimization really are making optimal interactions with the protein.
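The simplest form of ligand efficiency described above can be sketched in a few lines of Python. This is an illustrative calculation only: it assumes the binding free energy is estimated from a measured KD as ΔG = RT ln KD at 298.15 K, reports values in kcal/mol per heavy atom, and uses hypothetical compounds; the function and constant names are not from the essay.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ligand_efficiency(kd_m: float, heavy_atoms: int, temp_k: float = 298.15) -> float:
    """Ligand efficiency in kcal/mol per non-hydrogen atom.

    Computed as -dG / N, with dG = RT ln(KD) and KD in mol/L.
    """
    if kd_m <= 0 or heavy_atoms <= 0:
        raise ValueError("KD and heavy atom count must be positive")
    delta_g = R_KCAL * temp_k * math.log(kd_m)  # negative when KD < 1 M
    return -delta_g / heavy_atoms


if __name__ == "__main__":
    # Hypothetical fragment: KD = 1 mM, 14 heavy atoms
    print(f"fragment LE = {ligand_efficiency(1e-3, 14):.2f}")
    # Hypothetical optimized lead: KD = 10 nM, 30 heavy atoms
    print(f"lead LE     = {ligand_efficiency(1e-8, 30):.2f}")
```

With these assumed numbers the fragment has a ligand efficiency of about 0.29 and the lead about 0.36, so in this hypothetical case the added atoms more than paid their way during optimization.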
Current practice in FBLD
Figure 1 summarizes the main features of an FBLD platform; its five main components – fragment libraries, target enablement, fragment screening, generating a model for fragment binding and fragment to lead optimization – are described in the sections that follow.
Figure 1. The FBLD process
There are three main considerations in assembling a fragment library. Firstly, the properties of the fragments – they should not contain groups known to be reactive to proteins, should be soluble at the high concentrations used for screening and should be as diverse as possible. Secondly, the MW is a major consideration for the number of fragments in a library. Work by the group of Reymond [23,24] has estimated that the size of chemical space increases by approximately 8-fold for each heavy atom in a molecule. There are many approximations, but this means that a truly diverse library of 1000 fragments with MW of 190 Da covers the available chemical space in an equivalent way to 10⁸ molecules of MW 280 Da or 10¹⁸ molecules of MW 440 Da. Hence, many practitioners work successfully with libraries of 1000–2000 compounds of average MW 200 Da. Finally, it is important that the library is regularly checked for stability or precipitation/aggregation of the fragments to avoid false positives in the high concentration assays. Some of the other issues to be considered are discussed in a recent review of fragment library design and in a description of the development of a new fragment library at Pfizer.
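The scaling argument can be made concrete with a short Python sketch. It takes only the approximately 8-fold-per-heavy-atom estimate from the text; the function names are illustrative, and no particular mass per heavy atom is assumed, so the calculation works in heavy-atom counts rather than in daltons.

```python
import math


def equivalent_library_size(base_size: float, extra_heavy_atoms: float,
                            fold_per_atom: float = 8.0) -> float:
    """Library size needed to sample chemical space as densely as a base
    library, for molecules that are larger by `extra_heavy_atoms`."""
    return base_size * fold_per_atom ** extra_heavy_atoms


def implied_extra_heavy_atoms(base_size: float, target_size: float,
                              fold_per_atom: float = 8.0) -> float:
    """Inverse relation: how many additional heavy atoms correspond to a
    given increase in the required library size."""
    return math.log(target_size / base_size) / math.log(fold_per_atom)


if __name__ == "__main__":
    # One extra heavy atom turns 1000 fragments into 8000 equivalents.
    print(equivalent_library_size(1000, 1))
    # Growth in size implied by going from 10^3 to 10^8 or 10^18 compounds.
    print(round(implied_extra_heavy_atoms(1e3, 1e8), 1))
    print(round(implied_extra_heavy_atoms(1e3, 1e18), 1))
```

On this estimate, going from 10³ to 10⁸ compounds corresponds to roughly 5.5 additional heavy atoms, and going to 10¹⁸ compounds to roughly 16.6, which shows how quickly equivalent coverage becomes impractical as molecules grow.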
It is important to remember that this is screening and that small differences in the structure of the compounds can have a big impact on binding affinity. The hope is that if the library is diverse enough then some fragments will bind to the target – they will not be the best fragments but may give a starting point for identifying fragments more suitable for optimization.
In thinking about the issues and opportunities for FBLD, it is possible to define two classes of target: (a) a conventional target, where a crystal structure and a robust binding or functional assay are readily available and there is precedent for obtaining drug-like compounds (such as a protein kinase) and (b) a challenging target, such as disruption of a protein–protein interaction or inhibition of the activity of a multi-protein complex, where there is no precedent for lead discovery and where there are many problems to solve in establishing a platform for lead discovery – such as establishing a robust assay or generating a model of compound binding. The difficulty and time taken for target enablement will clearly be different for these two classes of target.
Fragment and structure-based lead discovery relies upon the production of sufficient quantities of pure, functional, homogeneous protein suitable for screening and structure determination. Although this has usually been relatively routine for conventional targets, it can be an issue for challenging targets, particularly knowing which post-translational modifications affect the target binding site or which additional proteins are involved in affecting activity in the cell. Finally, it is vital that robust assays (both binding and functional) are available – this can be a real bottleneck for challenging targets as a compound is usually required to validate activity assays – and the assays are required to identify the compound. These issues should be self-evident, but they can be a major barrier to initiating a successful fragment-based discovery project.
The main methods used to identify fragments that bind to a protein are listed in Figure 2, where the figure legend summarizes the main features of each technique. Taking into account the experimental conditions and nature of the assay method, all of the techniques should give the same hits when screening a library against a target [26,27]. What is sometimes forgotten is the relative sensitivity of each method, and whether the fragments (and the target) are still in solution and not aggregated under the conditions of the assay (concentration required, pH, buffer etc). Some practitioners describe screening protocols using a number of different techniques and then taking the intersection of the hits as the true hits. Although such hits will probably be valid, there is a danger that good fragment hits are discarded because of the least robust assay.
Figure 2. Fragment screening methods
The three most widely used methods (http://practicalfragments.blogspot.co.uk/) for fragment screening are ligand-observed NMR, surface plasmon resonance (SPR) and X-ray crystallography. The different characteristics of the techniques are discussed in more detail in the legend of Figure 2, but the distinctive features can be summarized as follows: (a) the main advantage of ligand-observed NMR is that it is label free and a measurement in free solution. Furthermore, the solubility and stability of both the protein and the fragment can be evaluated for each measurement. This is particularly important for the more challenging target proteins where stability can be an issue – aggregation or precipitation of the protein is the most usual cause of false positives or false negatives in many assay formats; (b) SPR is chosen by many practitioners as the central screening platform as it uses small amounts of protein and can be used to screen relatively rapidly – the main issue is whether the protein can be immobilized to a surface and retain its binding integrity. What can also be important for both SPR and NMR is to assess the effect of a competitive ligand on putative hit fragments to check for non-specific binding; (c) X-ray crystallography provides an immediate model of the fragment binding to the protein. Because it is usually performed with a very high concentration of fragment (tens of mM), it usually gives a higher hit rate than other techniques. However, it requires a crystal system suitable for soaking of fragments, which can be a major issue for some targets.
Generating a model for fragment binding
It is relatively straightforward to find fragments that bind to most binding sites on most proteins [39,40]. What is a challenge is knowing what to do with them. For most targets with only weak binding fragments, it is very difficult to generate useful SAR for a fragment by trial and error synthesis. Most of the modifications result in a loss or a change of binding affinity that is difficult to differentiate across compounds. All successful fragment optimization reported to date has relied on some model of how the fragment binds to the target. The most robust model comes from X-ray crystallography, and for well-behaved conventional targets it can be possible to determine many hundreds of crystal structures in the early stages of a project to enable the medicinal chemist to make informed decisions about optimization. Where crystallography is not possible, then NMR methods can provide sufficient insight to generate a model, although it can take a number of weeks to generate each model.
Fragment to lead optimization
Figure 3. Strategies for fragment optimization
Figure 4. Fragment to lead (to drug) optimization
‘Fragment linking’ is the initial SAR by NMR approach from Fesik and co-workers and consists of five steps – identification of the first fragment, optimization of the first fragment (following structure determination), screening a library of smaller fragments in the presence of the first fragment, optimization of the second fragment and then linking of the two fragments. This was successful for a number of targets for this group but has not been repeated by many others. There are two challenges: the first is whether the binding site has appropriate pockets to accommodate separate fragments; the second is the challenge of the chemistry to link without affecting the detail of the binding orientation and position of the two fragments. The combinatorics of this type of approach are attractive – if each screen is of 1000 fragments and chemistry is possible for 100 ways of linking, then some 1000 × 1000 × 100 (100 million) possible molecules have been assessed in just 1000 + 1000 + 100 (2100) experiments. Figure 4A shows the most advanced example from the SAR by NMR approach – the discovery of the compound ‘7’, venetoclax, now treating patients with certain forms of chronic lymphocytic leukaemia (CLL).
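The combinatorial argument can be checked with a minimal Python sketch. It simply contrasts the product of the three library sizes with their sum, using the numbers quoted in the text; the function name is illustrative.

```python
def linking_coverage(n_first: int, n_second: int, n_linkers: int) -> tuple[int, int]:
    """Return (molecules implicitly assessed, experiments actually run)
    for a two-site fragment screen followed by linker exploration."""
    molecules = n_first * n_second * n_linkers
    experiments = n_first + n_second + n_linkers
    return molecules, experiments


if __name__ == "__main__":
    molecules, experiments = linking_coverage(1000, 1000, 100)
    print(f"{molecules:,} possible molecules from {experiments:,} experiments")
```

The gain comes from screening each site and the linkers separately, so the experimental effort grows additively while the number of compounds covered grows multiplicatively.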
‘Fragment growing’ is the most widely applied strategy in FBLD. Figure 4B summarizes the discovery of ‘11’, vemurafenib – a selective inhibitor of V600E mutant B-Raf kinase, a strong driver of melanoma. The initial fragment was optimized by careful structure-based design to introduce the desired selectivity and drug-like properties. There are many similar examples of such structure-guided fragment optimization, such as the Aurora A inhibitor from Astex. A variant of this approach is to search available chemicals for compounds containing the central binding motif of the initial fragment hit – so-called SAR by catalogue. An example is its contribution to the rapid discovery, by Vernalis and the Institute of Cancer Research, of AUY-922, an inhibitor of the molecular chaperone Hsp90, which entered Phase II trials.
‘Fragment merging’ is an approach that combines information from multiple chemical hits. It relies on multiple crystal structures and careful structure-based design. Figure 4C summarizes the steps in the discovery of a cell-active selective inhibitor of the kinase PDPK1 at Vernalis, generating a compound that suggested that inhibition of this kinase would not have the desired therapeutic effect. There are fewer examples of this type of approach, as it relies on multiple crystal structures of many chemotypes and a more ambitious approach to chemical design. Another striking example from Vernalis is the identification of an orally active Hsp90 inhibitor, BEP-800, which entered pre-clinical trials. An extension of the merging approach is particularly powerful – i.e. combining the information from the fragments with information about inhibitors identified from the literature, from high-throughput screening (HTS) or from natural ligands. One very nice example is work on the enzyme BCATm by a team from GSK, where the fragment was combined with an HTS hit to give a potent lead.
One of the common mistakes in fragments to hits to leads chemistry is beginning the process of fragment growth before the core of the fragment itself is explored and optimized or before the full set of fragment hits has been characterized. There are two aspects to this. Firstly, as discussed earlier, screening is a numbers game and it is unlikely that the optimal fragment for a particular binding site will be in the fragment library. It is therefore important to explore closely related chemotypes to a fragment – by compound purchase (SAR by catalogue) or by limited synthesis, if at all possible probing how small changes affect binding (methyl walk, moving nitrogens around a heterocycle etc). Secondly, exploratory evolution around fragments is a powerful way of mapping a binding site for particular features that add to the affinity and selectivity of ligands – having a marked impact on the success of optimization.
So far in this essay, we have summarized current practice in using the now mature fragment-based methods for lead discovery. Here, we briefly summarize some of the recent developments and applications of fragment-based approaches and thinking.
Fragments and chemical biology
As already mentioned, some of the early research used either computational [14,15] or experimental [16,17] methods to explore binding of small functional groups to a protein. Early analyses by a number of groups [39,40] demonstrated how fragments can be found for essentially all binding sites on all proteins – and the number of fragments gives some indication of the ligandability of the binding site, i.e. how chemically attractive it is for a small molecule. An extension of this has been the realization that fragments can reveal potential binding sites on proteins, although binding to many of these secondary sites might not affect function.
There have been some recent publications highlighting how such fragment binding can explore a protein or a biological system.
Mutations in the protein K-Ras have long been identified as a key driver of many different cancers. It is a relatively small G-protein, where a switch from binding of GDP to GTP induces a conformational change which triggers multiple signalling cascades. It has proven intractable to drug discovery efforts over the decades. Various groups have recently used fragments to identify novel binding sites on the protein and this has reignited interest in the target [48–50].
There has been considerable interest in covalent fragments – i.e. fragments that contain a reactive group that will form a covalent bond with the protein. The origins of these ideas were in the tethering work of the company Sunesis, who developed a library of fragments which would react with surface-exposed cysteines. An extension of this approach is to introduce a cysteine mutation close to the functional site – an interesting example of this type of approach was also demonstrated for K-Ras, where a particular mutation seen in some tumours (G12C) was exploited. A particularly impressive extension of these ideas is the work of the Cravatt group, who designed a fragment library where each fragment contained a photoactivatable diazirine and an alkyne group. The fragments were incubated with cells and exposed to light so that the fragments were covalently linked to their binding partner protein. Click chemistry was then used to tag the labelled protein with a fluorescent moiety and subsequent MS identified the targets labelled.
Fragments as enzyme activators
There are well-established targets such as nuclear receptors or G-protein coupled receptors, where the aim in lead discovery is to identify agonists – i.e. compounds that modulate the conformation of the protein to generate biological function. However, there are very few examples where small molecules activate an enzyme activity. Recent work has demonstrated that non-covalent binding of a fragment can increase enzyme activity, and covalent attachment of the fragment is being explored to provide improved industrial enzymes (unpublished). There are a variety of physiological processes where activation of activity could be beneficial – and this could be attractive for therapeutic intervention, as it may be possible to achieve some benefit with just a small increase of activity.
There have been massive increases in the speed of computation over recent years. This has helped advance the use of computational chemistry methods in FBLD but, arguably, there have not been any major advances in the underlying methods or computational models that are used. If a structure is available for the protein target, and the ligand binds without major conformational changes, then computational docking of ligands will generate a number of poses (positions and orientations) for the ligand that includes the actual pose, but it remains a challenge to correctly calculate the binding energy and thus recognize that pose. The situation is even more challenging for fragments, as the number of interactions made (and the binding affinity) is much smaller. One interesting new idea is that it is the activation energy for dissociation (off-rate) that dominates the energetics of binding. This has been explored in a dynamic undocking approach called DUCK, where the work required to break a key interaction is estimated as a surrogate for the binding affinity.
One area that has received considerable attention in recent years is the use of molecular simulation methods to map which functional groups will bind where on the protein surface, using programs such as FTMAP and MDmix. These can be seen as an extension of the earlier GRID and MCSS methods, but incorporating flexibility into the protein target and exploiting the massive increase in available computational power. It is arguable how predictive these methods can be, but they do provide tools to explore flexibility and generate ideas about which functional groups could be useful to incorporate as a fragment is optimized.
In many ways, a fragment is no different to any other small molecule that binds to a protein – the main advantage is that the chance of a fragment binding is much greater than a larger molecule. The main challenges have been in developing the methods and experience in both identifying binding of such weak compounds and the methods for optimizing the fragments to lead compounds. The main advantage for the medicinal chemist is that because the chemistry is starting small and there are usually many fragments to consider, there are more opportunities to make better quality decisions in generating a lead compound with optimal properties. This has provided novel, selective lead compounds for conventional targets beyond those identified through more conventional, HTS approaches. More striking are the opportunities offered for challenging targets. Here, the biophysical methods, in particular NMR, are able to identify and validate binding of fragments to the target – often providing starting points for lead discovery when other techniques, such as HTS, have failed.
For drug discovery, a major impact of FBLD methods has been driving development of tools (particularly biophysical methods) and experience in characterizing binding of ligands to proteins. This has developed a mindset that enables problem solving in early lead discovery – in particular characterizing whether and how a ligand is interacting with a protein. This can be particularly important when working with new classes of target where establishing the relationship between ligand binding and functional activity can be a challenge.
For the academic researcher, the methods are attractive since binders can be identified using a small library, not requiring the massive outlay in compounds and automation that is required for HTS; most of the biophysical techniques required for fragment screening are also available in most institutions. As stressed earlier, it is relatively easy to identify fragments that bind to most binding sites on most targets; what is difficult is knowing what to do with them. What recent research has shown, however, is that imaginative use of fragment-based methods can deliver new approaches and results in chemical biology – i.e. chemical tools to probe, understand and modulate biological systems and function.
There is great experience in design of fragment libraries and methods for finding fragments that bind.
Fragments can be found for most targets including those that fail in HTS.
Fragments explore the chemical space of what can bind to a target, providing new chemical entities even for well-characterized targets such as protein kinases.
Fragments that bind with millimolar affinity can be evolved into potent lead compounds.
There are now many compounds in clinical trials that were discovered from fragment methods – and two products on the market treating patients.
There are emerging ideas of using fragments to probe biological systems – as an additional tool in chemical biology.
B.L. is supported by the European Union’s Horizon2020 MSCA Programme under grant agreement 675899 (FRAGNET); research in the R.E.H. group is additionally supported by research grants from the BBSRC and institutional infrastructure support provided by the Wellcome Trust and EPSRC.
R.E.H. conceived the general structure of the essay, B.L. constructed the figures, both authors contributed to the writing of the essay.
The authors declare that there are no competing interests associated with the manuscript.
CLL, chronic lymphocytic leukaemia
FBLD, fragment-based lead discovery
HSQC, heteronuclear single quantum correlation spectroscopy
LOGSY, ligand-observed gradient spectroscopy
MCSS, multiple copy simultaneous search
MSCS, multiple solvent crystal structure
NMR, nuclear magnetic resonance
SPR, surface plasmon resonance
STD, saturation transfer difference