Catalysts are a vital part of synthetic chemistry. However, there are still many important reactions for which catalysts have not been developed. The use of enzymes as biocatalysts for synthetic chemistry is growing in importance due to the drive towards sustainable methods for producing both bulk chemicals and high value compounds such as pharmaceuticals, and due to the ability of enzymes to catalyse chemical reactions with excellent stereoselectivity and regioselectivity. Such challenging transformations are a common feature of natural product biosynthetic pathways. In this mini-review, we discuss the potential to use biosynthetic pathways as a starting point for biocatalyst discovery. We introduce the reader to natural product assembly and tailoring, then focus on four classes of enzyme that catalyse C─H bond activation reactions to functionalize biosynthetic precursors. Finally, we briefly discuss the challenges involved in novel enzyme discovery.
Catalysts are a vital part of the synthetic chemist's toolkit. However, they are often based on expensive or toxic metals and there are many transformations for which there are no available catalysts, particularly those involving selective C─H bond activation. The ability of enzymes to catalyse chemically challenging transformations in a regio- and stereoselective fashion makes the development of a biocatalytic toolkit desirable to fill gaps in traditional catalysis and develop sustainable synthetic methods. Enzymes are already being successfully applied in the production of bulk chemicals and complex pharmaceuticals. The advent of synthetic biology, including whole pathway and protein engineering, has rapidly increased academic and industrial research activity in this area [1,2].
Microbial natural products have traditionally been of interest due to their important biological activities and have found application as pharmaceuticals and in the agrochemical and food industries. Microbial natural products are structurally diverse, frequently complex molecules, produced via biosynthetic pathways comprising a carefully choreographed series of biochemical steps, many of which involve enzymes that catalyse extremely challenging chemical reactions. The increase in available microbial DNA sequence data over the last decade and the parallel evolution of bioinformatics tools has facilitated the identification of many new biosynthetic pathways through genome mining as well as revealing a wealth of information about biosynthetic enzymes [3–6]. Understanding the function and mechanisms of biosynthetic enzymes is a major topic of research in natural product biosynthesis. This work facilitates the utilization of biosynthetic enzymes as ‘parts’ in natural product pathway engineering and de novo pathway design [6–9]. However, as biosynthetic enzymes catalyse reactions which would be highly desirable in synthetic chemistry, they also represent a potential source of biocatalysts. This mini-review thus focuses on a sample of some important classes of enzymes involved in natural product biosynthesis, their unique chemistry and what challenges are still associated with enzyme discovery and predicting enzyme function.
Natural product assembly and introducing diversity
Microbial natural products result from secondary metabolic processes. They are structurally diverse and complex, usually containing multiple stereocentres and ring systems. Non-ribosomal peptides (NR-peptides), polyketides, terpenes and ribosomal peptides are four key families of microbial natural products. NR-peptides and polyketides are biosynthesized by analogous large, modular multi-enzymes dubbed non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), respectively. Both function as enzymatic assembly lines that build complex structures while the intermediates remain covalently bound to the enzyme [9–11]. The assembly lines are divided into modules, with each module responsible for a specific round of chain extension. The domains within each module have predictable functions and the growing peptide or polyketide chain is covalently bound to a carrier protein via a flexible phosphopantetheinyl arm, which delivers the growing chain to the catalytic sites of other domains [9–11].
In polyketide biosynthesis the starting monomer is typically acetyl-CoA and the extender units are malonyl- or methylmalonyl-CoA, which extend the polyketide chain via sequential Claisen-type condensations. The domain architecture within each module then determines the level of reduction during each extension, e.g. the stereoselective reduction of the β-keto group to an alcohol catalysed by a ketoreductase domain. This process serves to increase structural diversity and complexity.
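The relationship between domain architecture and reduction level can be sketched as a small lookup. This is a toy model, not a predictive tool: the ketoreductase (KR) is named in the text, whereas the dehydratase (DH) and enoylreductase (ER) domains, the fixed KR, DH, ER order and the three-module assembly line are assumptions added here for illustration.

```python
from dataclasses import dataclass

# Optional reductive domains act in this order on the beta-keto group
# produced by each Claisen-type condensation (DH and ER are assumptions
# added for illustration; only KR is named in the text).
REDUCTION_ORDER = ("KR", "DH", "ER")

# Functional group left at the beta-carbon after n consecutive reductive steps.
STATE_AFTER = {
    0: "beta-keto",
    1: "beta-hydroxy",       # ketoreductase only
    2: "alpha,beta-alkene",  # ketoreductase + dehydratase
    3: "methylene",          # ketoreductase + dehydratase + enoylreductase
}


@dataclass(frozen=True)
class Module:
    extender: str                      # e.g. "malonyl-CoA"
    domains: frozenset = frozenset()   # optional reductive domains present


def beta_carbon_state(module: Module) -> str:
    """Return the beta-carbon functionality left by one extension module."""
    steps = 0
    for domain in REDUCTION_ORDER:
        if domain not in module.domains:
            break  # later domains have no substrate without the earlier step
        steps += 1
    return STATE_AFTER[steps]


# Hypothetical three-module assembly line.
assembly_line = [
    Module("malonyl-CoA", frozenset({"KR"})),
    Module("methylmalonyl-CoA", frozenset({"KR", "DH"})),
    Module("malonyl-CoA", frozenset({"KR", "DH", "ER"})),
]

for number, module in enumerate(assembly_line, start=1):
    print(number, module.extender, beta_carbon_state(module))
```

The model ignores stereochemistry, inactive domains and iterative module use, all of which matter in real systems.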
NR-peptides are primarily biosynthesized from amino acids. The adenylation domain in each module is selective for a specific amino acid, as determined by active site residues. However, the range of amino acids used by NRPS enzymes extends far beyond the 20 proteinogenic amino acids. Frequently, amino acids are biosynthesized specifically for a given natural product pathway, as in hormaomycin biosynthesis (Figure 1). The peptide chain is extended through peptide bond-forming reactions catalysed by the condensation domain in each module. Other catalytic domains, such as N-methylation and epimerization domains, introduce further diversity during chain extension. The final peptide is cleaved from the NRPS, and often cyclized, by a thioesterase domain.
Figure 1. Introducing structural diversity to microbial natural products
Terpenes are an increasingly prominent family of microbial natural products and it is now apparent that there is significant terpene biosynthetic potential in bacteria. Although terpenes are not biosynthesized via an assembly line, there is a key ‘assembly’ step: cyclization of an acyclic precursor, which may be farnesyl diphosphate (FPP), geranyl diphosphate or geranylgeranyl diphosphate (Figure 1). The terpene synthase responsible for this step determines the structure of the terpene scaffold. Thus the same precursor chain gives rise to a diverse range of products depending on the nature of the terpene synthase. Following cyclization, the terpene scaffold is often further modified by other enzymes in the pathway, for example the extensive oxidation in pentalenolactone biosynthesis, which includes an unprecedented cytochrome P450-catalysed oxidative rearrangement (Figure 1).
Finally, there is an array of ribosomally produced peptide natural products including the microcins, lanthipeptides and thiopeptides. The scaffold peptide is encoded by a single structural gene and the ‘assembly’ step occurs primarily on the ribosome using proteinogenic amino acids. Diversity is introduced through extensive post-translational modification by enzymes encoded in the gene cluster. For example, bottromycin A2 biosynthesis includes C-methylation, thiazole formation and macrolactamidination (Figure 1) [17,18].
In the four natural product families described above, a key biosynthetic machine (an NRPS, a PKS, a terpene synthase or the ribosome) is required to assemble the natural product scaffold. These biosynthetic systems are the focus of efforts by synthetic biologists to engineer and design pathways [6–9]. The enzymes required to modify natural product scaffolds (post-assembly), or to create dedicated biosynthetic precursors such as non-proteinogenic amino acids in the pre-assembly steps, can be collectively referred to as ‘tailoring enzymes’ (Figure 1). They introduce a variety of functional groups and increase the diversity of natural product structures. In NR-peptide and polyketide biosynthesis, there are also tailoring enzymes which act during assembly by modifying biosynthetic intermediates that are covalently tethered to the NRPS and PKS assembly enzymes. An unusual example is β-lactam formation in nocardicin biosynthesis (Figure 1). The substrates of tailoring enzymes acting at the three different assembly stages differ significantly in size and complexity, which has an impact on any potential application of the enzymes as biocatalysts.
In the next section we will focus on enzymes involved in pre-assembly biochemical steps that create unusual biosynthetic precursors. Many of these chemical transformations have no analogue in synthetic chemistry. Such enzymes have long inspired researchers who develop small molecule catalysts for chemical synthesis; however, there is increasing interest in developing the enzymes themselves as biocatalytic reagents. Pre-assembly tailoring enzymes could represent a good starting point for biocatalyst discovery from natural product pathways.
Biosynthesis of pathway specific small molecule precursors
Simple biomolecules such as amino acids, fatty acids and sugars are often modified to generate pathway-specific precursors for natural product biosynthesis. This pre-assembly tailoring is a common feature of NR-peptide biosynthesis, in which non-proteinogenic amino acids are frequently used as building blocks. Unusual pathway-specific starter and extender units are also found in many PKS pathways, particularly in hybrid PKS–NRPS systems. Thiopeptide biosynthetic pathways often involve the biosynthesis of dedicated small molecule moieties which are subsequently added to the ribosomally produced peptide as a post-translational modification. In the following sections we discuss a sample of important transformations in the biosynthesis of dedicated small molecule precursors, to highlight the potential of biosynthetic enzymes as a source of novel chemistry and of biocatalysts that generate functionalized small molecules such as enantiomerically pure amino acid derivatives. Here, we focus on transformations which involve selective C─H bond activation, as this remains a significant challenge in chemical synthesis. The enzyme families highlighted below have evolved hugely successful strategies to activate such centres [22,23].
Oxygenation refers specifically to the incorporation of one or both atoms of molecular oxygen into a given substrate, a tailoring reaction involving an array of enzyme families including cytochrome P450s [24,25], flavin-dependent oxidases and non-haem iron-dependent enzymes [26,27]. The range of reactions is extensive and includes hydroxylation, epoxidation and oxidative ring cleavage. Phenylalanine hydroxylase, for example, is a non-haem iron-dependent enzyme that catalyses the regiospecific hydroxylation of L-phenylalanine to give meta-tyrosine in pacidamycin biosynthesis. Oxygenases typically take an abundant oxidizing agent, oxygen, and transform it into a highly reactive oxidant using a transition metal (e.g. haem or non-haem iron) or a flavin cofactor. Although significant progress has been made towards engineering oxygenases such as cytochrome P450s as biocatalysts, a number of factors still hamper their biocatalytic use, including the air sensitivity of many non-haem enzymes and the requirement for electron transport proteins for P450s and certain non-haem iron oxidases. The peroxidase SfmD, which oxidizes a tyrosine derivative in saframycin biosynthesis, is therefore of interest. SfmD is a haem-dependent peroxidase which uses H2O2 as an oxidant and thus, unlike dioxygen-utilizing enzymes, does not require electron transport protein partners or nicotinamide adenine dinucleotide phosphate (NADPH) to reduce oxygen [24–26]. In the pathway, methylation to generate 5-methyltyrosine is followed by hydroxylation by SfmD to give 3-hydroxy-5-methyltyrosine, in sequential regioselective C─H bond activating reactions (Figure 2A).
Figure 2. Regioselective aromatic oxygenation and halogenation
There are a large number of known halogenated natural products, which have a variety of biological activities. Halogenation is a key transformation in organic synthesis but maintaining regioselectivity and stereoselectivity is difficult, and thus the enzymes involved in natural product halogenation could be part of a biocatalytic solution. There are at least four halogenase families, including non-haem iron halogenases and haem-dependent haloperoxidases [30,31]. The best studied, and perhaps the most useful from a biocatalytic perspective, are the FADH2-dependent enzymes. There has been significant interest in engineering these enzymes as biocatalytic agents or for whole cell transformations (Figure 2B) [31,32]. FADH2-dependent halogenases typically catalyse chlorination but are often also capable of accepting bromide as a substrate and, in rare cases, iodide [30,31]. Fluorination requires a dedicated fluorinase such as the extensively studied enzyme 5′-fluorodeoxyadenosine (5′FDA) synthase from Streptomyces cattleya [33,34]. This fluorinase naturally produces a fluorinated derivative of S-adenosylmethionine (SAM) through SN2 attack by fluoride at C5′ of SAM. An analogous SAM-dependent chlorinase is involved in the biosynthesis of an unusual chlorinated PKS extender unit in salinosporamide biosynthesis. Although 5′FDA synthase has applications in biotechnology, a fluorination enzyme with broad substrate tolerance is yet to be discovered. Fluorination is an important transformation in the production of pharmaceuticals but the reagents are often toxic, and so a biocatalytic process is desirable.
Although there are many known nitro group-containing natural products, few enzymes involved in the biosynthetic introduction of nitro groups have been characterized. The predominant mechanism involves the sequential N-oxidation of amines. For example, the polyketide aureothin requires the biosynthesis of an unusual nitroaromatic PKS starter unit, p-nitrobenzoic acid, which is produced by oxidation of the corresponding aromatic amine catalysed by the non-haem di-iron enzyme AurF (Figure 1). Synthetic nitration of aromatic compounds is difficult because it requires harsh conditions and lacks selectivity, thus there is interest in a nitration biocatalyst [38,39]. Recently, the first example of direct regiospecific aromatic nitration was characterized in the thaxtomin A biosynthetic pathway (Figure 3A). Thaxtomin A is an NR-peptide phytotoxin produced by Streptomyces scabies. The precursor of thaxtomin A, L-4-nitrotryptophan, is biosynthesized by TxtD, a nitric oxide synthase, and TxtE, a cytochrome P450 [40,41]. TxtD generates nitric oxide (NO) through the oxidation of L-arginine, and TxtE utilizes this NO in the presence of oxygen to catalyse regiospecific aromatic nitration (Figure 3A) [40,41]. This is an unusual transformation for a cytochrome P450. The proposed TxtE mechanism involves reaction of the reduced oxy complex [Fe3+-O2•] formed during the P450 catalytic cycle with NO to produce a peroxynitrite complex [Fe3+-ONO2]. Cleavage of the O─O bond would produce a nitronium cation or NO2 radical to nitrate the substrate. Although the structure of TxtE has been elucidated, there is still little insight as to why TxtE carries out nitration and is incapable of oxygenation [40,42,43].
Figure 3. Two very different modifications of L-tryptophan
SAM-dependent methyltransferases are a regular feature of natural product biosynthesis, both as domains commonly found in NRPS enzymes and as stand-alone enzymes catalysing N-, O- and C-methylation. However, another family of methyltransferases, the radical SAM enzymes, is emerging as a fascinating and widespread family in natural product biosynthesis. There are three classes of radical SAM methyltransferases (A, B and C) based on primary sequence. Examples of classes B and C have been found in natural product biosynthesis. Class B enzymes contain a [4Fe-4S] cluster and cobalamin. TsrM, a recently characterized radical SAM enzyme, catalyses the first committed step in the biosynthesis of the quinaldic acid moiety of the ribosomal peptide thiostrepton (Figure 3B). TsrM regiospecifically methylates L-tryptophan at the 2 position. This is a challenging transformation, as illustrated by a recent regioselective synthesis of 2-methyltryptophan, which requires a Pd-norbornene-based system. The TsrM mechanism is proposed to involve transfer of the CH3 group from SAM to reduced cobalamin followed by methylation of tryptophan via radical addition (Figure 3B). This differs from canonical radical SAM mechanisms, which require radical cleavage of SAM and hydrogen abstraction from the substrate. The enzyme also appears to be catalytically more efficient than previously investigated radical SAM enzymes. Chemically and mechanistically these enzymes are fascinating, although their inherent air sensitivity makes their study and application challenging.
Challenges for biocatalyst discovery from biosynthetic pathways
There have been many exciting recent discoveries in natural product enzymology. Most of the examples above were discovered from the study of particular pathways rather than mining genomes for enzymes that catalyse a given reaction. There are a number of challenges associated with both genome mining for enzymes and realizing the potential of natural product pathways in the discovery of biocatalytic tools.
Genome mining of available sequence data can be problematic. Although gene synthesis is now inexpensive and avoids amplifying genes from genomic DNA, synthesizing correct genes relies on accurate sequence data and correctly annotated genes. Even a single-base sequencing error can introduce a point mutation in an enzyme active site, with significant effects on activity.
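The effect of a single-base error can be illustrated with a minimal sketch that translates a reference and an erroneous sequence and reports the residues that differ. The nine-base sequences are hypothetical and are not taken from any enzyme discussed here; the standard genetic code is assumed, and the two sequences are assumed to be in frame and of equal length.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def protein_changes(reference: str, observed: str) -> list:
    """List (residue number, reference residue, observed residue) mismatches."""
    pairs = zip(translate(reference), translate(observed))
    return [(n, r, o) for n, (r, o) in enumerate(pairs, start=1) if r != o]


reference = "ATGCATGAA"   # hypothetical fragment encoding Met-His-Glu
miscalled = "ATGCGTGAA"   # one base miscalled in the second codon (A to G)
silent = "ATGCACGAA"      # one base miscalled at a wobble position (T to C)

print(protein_changes(reference, miscalled))
print(protein_changes(reference, silent))
```

In this toy case the first miscall replaces a histidine, a residue type that often acts as a metal ligand or catalytic base, with arginine, whereas the second leaves the protein unchanged. Not every sequencing error matters, but those that do cannot be spotted from the DNA alone.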
Predicting enzyme function
Fewer than 1% of annotated proteins have been characterized experimentally, thus accurately predicting substrate and function from the primary sequence, or even the structure, of an enzyme is still a significant challenge. The advantage of studying enzymes involved in natural product biosynthesis results from the clustering of biosynthetic genes in microbes. It is possible, based on the structure of a natural product, to carry out a bio-retrosynthetic analysis and predict a reasonable pathway. This prediction can be correlated with the annotated genes in the cluster, thus aiding substrate and function prediction for pathway enzymes. Our understanding of the biosynthesis of many natural product families is such that we can also do the reverse and predict natural product structures from annotated gene clusters, an approach central to genome mining [3–6,49]. However, as we have seen from some of the examples above, an enzyme may be a member of an extremely well-studied family yet still display previously unknown chemistry. Our inability to predict enzyme function reflects the limitations of our understanding of the subtleties that affect enzyme chemistry and catalysis, and it is also a barrier to the rational engineering of biocatalysts. Improvements in predictive models and a deeper understanding of enzyme mechanism, underpinned by experimental evidence, are required.
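Correlating a bio-retrosynthetic prediction with cluster annotations can be pictured as a simple lookup, as in the hedged sketch below. The family-to-reaction table is a deliberately naive assumption assembled from the enzyme families mentioned in this review, and the step names are illustrative labels rather than a controlled vocabulary. The thaxtomin example shows where such a lookup fails.

```python
# Naive expectations: reaction classes usually associated with an annotated
# enzyme family (an assumption for illustration, far from exhaustive).
FAMILY_TO_REACTIONS = {
    "cytochrome P450": {"oxygenation"},
    "non-haem iron-dependent enzyme": {"oxygenation", "halogenation"},
    "FADH2-dependent halogenase": {"halogenation"},
    "SAM-dependent methyltransferase": {"methylation"},
    "nitric oxide synthase": {"NO generation"},
}


def assign_steps(predicted_steps, cluster_annotations):
    """Map each predicted pathway step to cluster proteins that could perform it."""
    assignment = {}
    for step in predicted_steps:
        assignment[step] = sorted(
            protein
            for protein, family in cluster_annotations.items()
            if step in FAMILY_TO_REACTIONS.get(family, set())
        )
    return assignment


# Pre-assembly steps proposed for L-4-nitrotryptophan in thaxtomin A biosynthesis.
predicted_steps = ["NO generation", "aromatic nitration"]
thaxtomin_cluster = {
    "TxtD": "nitric oxide synthase",
    "TxtE": "cytochrome P450",
}

for step, proteins in assign_steps(predicted_steps, thaxtomin_cluster).items():
    print(step, "->", proteins or "no candidate from annotation alone")
```

The lookup assigns TxtD correctly but leaves the nitration step unassigned, because a P450 annotation suggests oxygenation. An unassigned step next to an unused enzyme is exactly the kind of mismatch that can point to unexpected chemistry worth testing experimentally.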
Determining enzyme function
The process of characterizing a new enzyme and determining product structure is still work-intensive. Given the number of enzymes that remain to be experimentally characterized, high-throughput analytical methods are needed. Current high-throughput biochemical methods are often based on spectroscopic assays which measure enzyme activity but give no information on the three-dimensional structure of the product. For example, two enzymes may carry out the same chemical transformation but produce different regioisomers, or an enzyme may catalyse unexpected chemistry which is not detected by the assay. Products of single enzyme reactions can be characterized by mass spectrometry and nuclear magnetic resonance (NMR), but this is also a time-consuming process. Access to more sophisticated high-throughput techniques such as liquid chromatography tandem mass spectrometry (LC–MS/MS) and LC–MS–NMR (LC–MS coupled to NMR), which are used in metabolite discovery, could facilitate small- to large-scale enzyme screens.
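As an illustration of what a mass-based screen can and cannot reveal, the sketch below matches the mass difference between substrate and product against the net elemental change of a few tailoring reactions discussed above. It assumes neutral monoisotopic masses (values rounded to six decimal places), an arbitrary tolerance of 0.005 Da and a short, hand-picked list of transformations; none of these choices comes from the source.

```python
# Monoisotopic masses of the most abundant isotopes, in Da (rounded).
MONOISOTOPIC = {
    "H": 1.007825,
    "C": 12.000000,
    "N": 14.003074,
    "O": 15.994915,
    "Cl": 34.968853,
    "Br": 78.918338,
}

# Net elemental change for each transformation (atoms gained minus atoms lost).
TRANSFORMATIONS = {
    "hydroxylation": {"O": 1},
    "methylation": {"C": 1, "H": 2},
    "chlorination": {"Cl": 1, "H": -1},
    "bromination": {"Br": 1, "H": -1},
    "nitration": {"N": 1, "O": 2, "H": -1},
}


def mass(formula: dict) -> float:
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def candidate_transformations(substrate_mass, product_mass, tolerance=0.005):
    """Return transformations whose mass shift matches the observed difference."""
    delta = product_mass - substrate_mass
    return [
        name
        for name, change in TRANSFORMATIONS.items()
        if abs(delta - mass(change)) <= tolerance
    ]


tryptophan = mass({"C": 11, "H": 12, "N": 2, "O": 2})
nitrotryptophan = mass({"C": 11, "H": 11, "N": 3, "O": 4})
methyltryptophan = mass({"C": 12, "H": 14, "N": 2, "O": 2})

print(candidate_transformations(tryptophan, nitrotryptophan))
print(candidate_transformations(tryptophan, methyltryptophan))
```

The match identifies the type of transformation but not its position: 4-nitrotryptophan and any other nitrotryptophan regioisomer have identical masses, which is why NMR or a comparable structural method remains necessary.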
Following enzyme discovery, there are of course many challenges associated with developing enzymes into biocatalysts. Practical obstacles include air sensitivity (e.g. radical SAM enzymes), dependence on rare or expensive cofactors (e.g. PAPS-dependent sulfotransferases) and the need for accessory proteins (e.g. cytochrome P450s, which require electron transport proteins that are not usually encoded in gene clusters). Identifying optimal electron transfer partners is a major challenge. However, significant progress is being made in tackling these issues; for example, cofactor regeneration systems have been developed for flavin adenine dinucleotide (FAD) and NADH, among others. For more complex and air-sensitive systems, an alternative is to use enzymes as parts in synthetic or modified biosynthetic pathways and to produce molecules via whole cell transformations. Recently, for example, a combination of protein and pathway engineering resulted in novel chlorinated derivatives of a plant alkaloid.
In spite of the challenges, there is huge potential to develop biosynthetic enzymes as biocatalysts or as parts for engineered whole cell systems, so that they ultimately become important catalytic tools for synthetic chemistry. Given the number of uncharacterized enzymes, it is likely that a wealth of chemistry remains to be discovered and exploited.
We thank King's College London for a PhD studentship for Catherine B. Hubert.
This work was supported by the Royal Society [grant number RG120526]; and the KHP Challenge award (MRC confidence in concept, grant number MC_PC_14105 v.2).
Synthetic Biology UK 2015: Held at Kingsway Hall Hotel, London, U.K., 1–3 September 2015