Abstract
Glucuronoyl esterases (GEs) are microbial enzymes able to cleave covalent linkages between lignin and carbohydrates in the plant cell wall. GEs are serine hydrolases found in carbohydrate esterase family 15 (CE15), which belongs to the large α/β hydrolase superfamily. GEs have been shown to reduce plant cell wall recalcitrance by hydrolysing the ester bonds found between glucuronic acid moieties on xylan polysaccharides and lignin. In recent years, the exploration of CE15 has broadened significantly and focused more on bacterial enzymes, which are more diverse in terms of sequence and structure than their fungal counterparts. Similar to fungal GEs, the bacterial enzymes are able to improve overall biomass deconstruction, but they also appear to have less strict substrate preferences for the uronic acid moiety. The structures of bacterial GEs reveal that they often have large inserts close to the active site, with implications for more extensive substrate interactions than in the fungal GEs, which have more open active sites. In this review, we highlight the recent work on GEs, which has predominantly concerned bacterial enzymes, and discuss similarities and differences between bacterial and fungal enzymes in terms of the biochemical properties, diversity in sequence and modularity, and structural variations that have been discovered thus far in CE15.
Introduction
Lignin–carbohydrate complexes and plant cell wall recalcitrance
The plant cell wall is a complex network mainly consisting of polysaccharides. Cellulose is typically the most abundant polysaccharide and it coalesces via strong hydrogen bonding into crystalline fibres that in turn are coated and cross-linked by different heteropolysaccharides [1]. The cell wall can be further reinforced by lignin, where lignin monomers polymerize in radical coupling reactions. In order to use the plant cell wall polymers as a nutrient source, microorganisms need to produce an arsenal of different hydrolytic and oxidative enzymes to tackle this recalcitrant matrix [2]. Likewise, full utilization of renewable biomass by enzymatic processing in biorefineries requires a variety of enzymatic activities, usually formulated in so-called enzyme cocktails. A feature that is often overlooked is that during lignification, covalent bonds are formed not only between the lignin monomers but also between lignin and the exposed carbohydrates of the cell wall, particularly xylan, forming so-called lignin–carbohydrate complexes (LCCs) [1,3]. It has been estimated that virtually all lignin in softwood, and a major fraction of lignin also in hardwood, can be found covalently bound to carbohydrates [4]. Correspondingly, a significant proportion of the plant cell wall carbohydrates can be bound to lignin, as in beechwood, where around a third of the glucuronic acid (GlcA) moieties are estimated to participate in LCCs [5]. LCCs greatly add to the recalcitrance of the cell wall, and while these structures are difficult to study, various lignin–carbohydrate bonds have been proposed, including benzyl ethers, benzyl esters, γ-esters, phenyl glycosides and ferulate esters (Figure 1) [3]. 
While ferulate-mediated cross-links to lignin can indirectly be cleaved by feruloyl esterases that cleave bonds between feruloyl and arabinofuranosyl moieties, when it comes to direct enzymatic cleavage of LCCs, the only known enzymes with this ability to date are the glucuronoyl esterases (GEs) which hydrolyse the ester bonds between GlcA moieties in glucuronoxylan and lignin. The importance of GlcA moieties for cell wall recalcitrance has recently been highlighted [6] and indeed points to these residues as being key in forming the LCC. Here, we summarize the current literature on GEs, with a focus on their activities, modularity and structure.
Figure 1. Lignin–carbohydrate complexes involving xylan
Discovery of glucuronoyl esterases and characterization of fungal enzymes
The history of the discovery of GEs has been covered in past reviews [7,8] and will only be briefly introduced here. The first GE was discovered in 2006 in a serendipitous study of the fungus Schizophyllum commune and was found to be able to cleave the methyl ester of 4-O-methyl-D-GlcA [9]. This pointed toward a biological role in cleaving LCCs and, together with follow-up characterization of the GE Cip2 from Trichoderma reesei (anamorph of Hypocrea jecorina, [10]), led to the establishment of the new carbohydrate esterase family 15 (CE15) in the Carbohydrate-Active Enzymes database (www.cazy.org; [11]). Characterization of several fungal CE15 enzymes followed and began shedding light on the substrate preferences and structures of fungal GEs.
Fungal GEs have been shown to be specific for glucuronate esters and show no activity on galacturonate esters [12,13] or other esterase substrates such as feruloyl- or acetyl esters [9,12,14,15]. The standard substrates used have been various GlcA-based esters, with or without 4-O-methylation and with variation of the alcohol substituent that represents the ‘lignin side’ of LCCs (Figure 1). As a general observation, GEs tend to prefer bulkier substrates that more closely resemble proposed LCC structures that include aromatic moieties, that is, activity on benzyl glucuronate or larger substrates is clearly preferred to activity on smaller methyl or allyl esters [16–18]. 4-O-methylation has been shown to be important for activity in several cases, with faster hydrolysis rates of substrates containing this moiety compared with ones with unsubstituted hydroxyl groups at the 4 position [14,19]. On more complex substrates that may better mimic the LCCs found in the cell wall, GE activity has been demonstrated on chemically methyl esterified 4-O-methylglucuronoxylan [20], as well as on LCC-rich substrates. These include the first demonstration of activity by the GE from Sodiomyces alcalophilus (Acremonium alcalophilum) on LCCs extracted from spruce- and birchwood, observed as increased amounts of carboxylic acids using phosphorus NMR [18]. GEs have also been shown to improve sugar release from corn fibre by enzyme cocktails [17], to release xylooligosaccharides from LCC extractions from birchwood and to act synergistically with a xylanase [21,22]. Interestingly, expression of fungal GEs in plants has shown clear phenotypes, including thinner cell walls with reduced cross-linking, changes in cell wall composition, and improved saccharification, which further supports the role of GEs in cleaving LCCs in muro [23,24].
As such, GEs are potentially important components of enzyme cocktails in biorefineries, and indeed have been included in several patents related to utilization of renewable lignocellulose [25,26].
Bacterial glucuronoyl esterases
Until recently, bacterial GEs had not been studied apart from two cases. The first was the C-terminal domain from the multidomain CesA enzyme from Ruminococcus flavefaciens [20], a cellulosomal protein also containing an acetyl xylan esterase and a dockerin domain, and the second was a GE from a marine metagenome [27]. Broader and more in-depth studies followed in 2018, with the detailed characterization of ten GEs from the three bacterial species Opitutus terrae, Solibacter usitatus (Candidatus Solibacter usitatus Ellin6076) and Spirosoma linguale [28], each having been found in soil environments. Similar to fungal GEs, several of the bacterial enzymes displayed a strong preference for bulkier substrates rather than methyl derivatives, but interestingly some showed little or no discrimination toward the alcohol moiety. Of note, the previously held view that CE15 enzymes are strictly limited to glucuronate esters was shown not to hold, as several of the bacterial enzymes, such as OtCE15A-C, SuCE15A and SlCE15C, had comparable kinetic parameters for galacturonate esters [28]. Additionally, significantly increased sugar release from corncob biomass was shown when GE-lacking enzyme cocktails were supplemented with bacterial GEs, echoing the earlier results using fungal enzymes and showing that bacterial enzymes can also greatly boost plant cell wall degradation, presumably by loosening up the cell wall matrix. A correlation between boosting ability and activity on model substrates could be observed, indicating that the model substrates may serve as relevant screening candidates to identify industrially relevant enzymes [28]. Expression analysis of the S. linguale CE15 genes revealed different responses for the three genes, with corncob biomass eliciting the strongest up-regulation relative to growth on glucose. This agrees with studies of fungi such as Phanerochaete carnosa and Malbranchea cinnamomea, whose GEs are mainly up-regulated during growth on complex biomass substrates compared with glucose or yeast extract and peptides [29,30].
The bacterial enzymes do not appear to have a strong need for 4-O-methylation on the substrate uronic acid, as they rapidly hydrolyse unmethylated substrates. One of the fastest GEs on these comes from Teredinibacter turnerae, a bacterium that has been isolated from the gills of a wood-boring shipworm. The TtCE15A enzyme clearly preferred bulkier substrates, and was also shown to be competitively inhibited by both xylooligosaccharides and aromatic compounds, indicating that it interacts intimately with both sides of the substrate [31]. In a follow-up study, the enzyme was shown to have the capacity to liberate sugars from extracted birchwood LCCs, without being hindered by the main chain xylan polysaccharide [32]. These results indicate that GEs are likely enzymes acting early during plant cell wall degradation, as they can clearly act on GlcA esters even when these are attached to xylan, rather than being used in later degradation stages to remove residual smaller lignin-bound sugars. TtCE15A was also found to be less prone than other proteins to permanently adsorb to biomass and LCCs [32], which suggests that the enzyme and perhaps GEs in general have the ability to avoid unproductive binding to or denaturation by lignin.
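Competitive inhibition, as reported above for TtCE15A, has a simple kinetic signature: the inhibitor raises the apparent KM while leaving the maximal rate unchanged. The following minimal Python sketch illustrates this with the standard Michaelis–Menten rate law. The function name and all numerical values are illustrative assumptions, not measurements from the cited studies.

```python
def competitive_rate(s, i, vmax, km, ki):
    """Initial rate under competitive inhibition (standard Michaelis-Menten form).

    s, i   : substrate and inhibitor concentrations (same units as km and ki)
    vmax   : maximal rate at saturating substrate
    km, ki : Michaelis constant and inhibition constant
    """
    km_apparent = km * (1.0 + i / ki)  # the inhibitor scales KM only
    return vmax * s / (km_apparent + s)


# Hypothetical values chosen for illustration only.
VMAX, KM, KI = 10.0, 2.0, 1.0

uninhibited = competitive_rate(s=2.0, i=0.0, vmax=VMAX, km=KM, ki=KI)
inhibited = competitive_rate(s=2.0, i=1.0, vmax=VMAX, km=KM, ki=KI)
# At very high substrate the inhibitor is outcompeted and the rate approaches VMAX.
saturated = competitive_rate(s=2000.0, i=1.0, vmax=VMAX, km=KM, ki=KI)
```

In this sketch the inhibited rate is lower at moderate substrate concentrations, but the difference vanishes at saturating substrate, which is the behaviour that distinguishes competitive from other modes of inhibition.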
Another recently characterized bacterial GE was CkGE15A from the hyperthermophilic bacterium Caldicellulosiruptor kristjanssonii [33], a species with a growth optimum of 78°C [34]. As expected, the enzyme showed a high thermal stability, with a melting temperature (Tm) of 72°C [33]. That the Tm was clearly below the optimal growth temperature of the bacterium could be attributed to either missing glycosylation, which has been shown in related species [35], or that the catalytic domain was studied in isolation (see section on multimodularity below). While most fungal GEs characterized to date are unstable above ∼50°C, CuGE from Cerrena unicolor was shown to have a Tm of 70°C [16], which is very close to that of CkGE15A. The activity of CkGE15A on model substrates was comparatively low [33], though its full potential has not been evaluated as synthetic substrates auto-hydrolyse rapidly above ∼40°C. While this enzyme might show much higher activities at a temperature closer to its Tm, other enzymes only have very weak activity even when assayed close to their expected physiological temperature, such as OtCE15B [28] and the enzyme BeCE15A from the gut symbiont Bacteroides eggerthii [36]. The latter enzyme would be expected to act optimally at 37°C, though at this temperature no substrate saturation could be achieved, and only kcat/KM values were determined, which could point toward yet undiscovered LCC structures as true substrates.
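The situation described for BeCE15A, where saturation is not reached and only kcat/KM can be reported, follows from the Michaelis–Menten equation: when the substrate concentration is far below KM, the rate is approximately (kcat/KM)[E][S], so the slope of rate against substrate concentration yields kcat/KM but not the individual constants. A minimal Python sketch under that assumption is given below; the function names and numbers are hypothetical and are not data from the cited work.

```python
def mm_rate(s, e_total, kcat, km):
    """Michaelis-Menten initial rate for substrate s and total enzyme e_total."""
    return kcat * e_total * s / (km + s)


def slope_through_origin(xs, ys):
    """Least-squares slope of a line constrained to pass through the origin."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical enzyme with a KM far above the accessible substrate range.
KCAT, KM, E_TOTAL = 5.0, 500.0, 0.01   # s^-1, mM, mM (illustrative values)
substrate = [0.5, 1.0, 2.0, 4.0]        # mM, all well below KM

rates = [mm_rate(s, E_TOTAL, KCAT, KM) for s in substrate]

# Dividing the slope by the enzyme concentration estimates kcat/KM.
specificity_estimate = slope_through_origin(substrate, rates) / E_TOTAL
specificity_true = KCAT / KM
```

Because the simulated rates remain nearly linear in substrate concentration, the slope recovers kcat/KM to within about one percent, while kcat and KM separately cannot be determined from such data.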
GE diversity
The wider substrate range of bacterial enzymes is also reflected in their protein sequences. Several phylogenetic trees of CE15 have been constructed to visualize this diversity [28], sometimes including enzymes not incorporated in CAZy [8], and peptide pattern recognition has also been used to map the family [37]. Figure 2 shows the phylogenetic tree of CE15 and its current members, showcasing the wider sequence diversity among bacterial enzymes compared with their fungal counterparts within CAZy. Interestingly, for most species that encode more than one CE15 member, these are found in different parts of the tree, which suggests that they could have different activities. The biochemical characterization that has been performed further supports this notion, as different enzymes from the same species tend to have significantly different specificity profiles toward model and natural substrates [28].
Figure 2. Phylogenetic tree of CE15
The family is also quite diverse in terms of multimodularity. While most GEs are found as single catalytic domains, fusion to other protein modules is not uncommon. Many fungal GEs have been shown to have appended modules from carbohydrate-binding module family 1 (CBM1) [8], and CBM-mediated binding to cellulose has been demonstrated for TrCip2 and CuGE [10,38]. For bacterial enzymes, the most common CBM partners come from CBM2, a family including cellulose-, chitin-, and xylan-binding proteins. No binding data are currently available for these, however, and it is an open question what part(s) of the cell wall these CBM2 modules target. One bacterial GE that is coupled to several CBMs and that has been thoroughly investigated is CkGE15A [33,39]. The full-length CkXyn10C-GE15A enzyme comprises not only the GE but also two N-terminal CBM22 modules linked to a glycoside hydrolase family 10 (GH10) xylanase, followed by three additional CBM9 modules before the GE domain, which in turn is followed by cadherin and surface-layer homology domains thought to anchor the large enzyme to the outer Gram-positive cell wall. Interestingly, the CBMs display various binding properties, and collectively enable the protein to bind virtually all major plant cell wall glycans [33,39]. Other CBM-appended GEs likely also benefit from CBMs that help guide them to their complex LCC target substrate, or to its close proximity.
The combination of more than one catalytic domain, as seen in the C. kristjanssonii enzyme, is fairly uncommon but not unique within CE15. Such multicatalytic architectures suggest a common substrate, and not surprisingly the majority of multicatalytic GEs consist of fusions to putative xylan-active enzymes such as xylanases, as in CkXyn10C-GE15A. The first studied bacterial GE, CesA from R. flavefaciens, is linked to a CE3 acetyl xylan esterase [20,40], and likely both domains benefit from each other’s action as well as from other enzymes within a larger multi-enzyme cellulosome. BeCE15A is also found linked to a glycoside hydrolase domain, from GH8. GH8 is a polyspecific family, and the C-terminal domain of the full-length BeCE15A-Rex8A was found to be a reducing-end-specific xylose-releasing exo-oligoxylanase (Rex) [36], which as the name implies hydrolyses xylooligosaccharides but not xylan. While a xylanase-GE fusion appears logical to enable a tandem attack on the cell wall, the fusion of an oligosaccharide-targeting domain to a GE is more puzzling. The weak activity of BeCE15A on GE substrates could, however, indicate that the natural substrate for this two-domain enzyme is yet to be discovered, especially as several more such CE15-GH8 fusions exist in the family [11]. The characterized OtCE15C is also fused to another catalytic domain (GH106), though the latter has not been investigated either as a single enzyme or in synergy with the OtCE15C GE domain. Deeper investigation of such fusions of different catalytic domains may be a useful strategy to find both new activities and new LCC targets in the future.
Structural comparisons
Among CE families in the CAZy database, CE15, along with CE1, CE5, CE7 and CE19, belongs to the α/β hydrolase (ABH) superfamily (http://www.ebi.ac.uk/interpro/entry/InterPro/IPR029058/), while CE2, CE3, CE6, CE12, CE16, CE17 and CE20 belong to the SGNH-hydrolase superfamily (http://www.ebi.ac.uk/interpro/entry/InterPro/IPR036514/), and the remaining CE families have other folds (or have been deleted from CAZy) [11]. Despite some structural similarities (a β-sheet sandwiched by helical elements) and catalytic triads consisting of a Ser nucleophile, a His, and an acidic residue, the two superfamilies are clearly distinct, though they are sometimes mixed up in the literature. A key difference, aside from differences in the arrangement and number of secondary structure elements, is that in SGNH-hydrolases the His and acidic residues reside on the same loop, while in ABHs all catalytic residues reside on distinct loops (Figure 3) [41]. In canonical ABHs (according to the secondary structure numbering in Figure 3), the Ser nucleophile is positioned at the end of strand β5, the acid at the end of strand β7 (Acid1 in the figure) and the catalytic His in a long loop following strand β8. In contrast, the fungal GEs whose structures were determined first have the catalytic Glu at the end of strand β6 (Acid2 in the figure), though other fungal sequences and most bacterial sequences have an Asp at the canonical ABH position (Acid1). A subdivision of the family based on the position of the acid has recently been suggested, leading to the CE15-A (Acid1) and CE15-B (Acid2) groups of sequences [38]. Some bacterial enzymes, exemplified in Figure 3 by OtCE15A, have acids in both positions and might be evolutionary intermediates, as mutagenesis shows some functional redundancy, though both residues are important for maximum catalytic turnover [42].
Figure 3. The overall fold of GEs
As already shown by initial crystallographic ligand binding studies (Figure 4) [43], the GlcA moiety of the esters cleaved by GEs binds in a pocket located above the main parallel β-sheet of the ABH fold (Figure 3), which is located on a fairly open surface in the structurally characterized fungal members of CE15 [38,43,44]. In structurally characterized bacterial enzymes, sequence insertions create high ridges on one or two sides of the substrate binding surface (Figure 3), indicating that the enzymes might interact more intimately with their complex substrate than their fungal counterparts [28,31,39,42,45,46].
Figure 4. Comparison of fungal and bacterial GE structures
The initial structure of StGE2 in complex with a small substrate highlighted extensive hydrogen bonding interactions (Figure 5A) involving all GlcA hydroxyls [43]. More recently, interactions with glucuronoxylooligosaccharides have been crystallographically characterized with OtCE15A and later CuGE (Figure 5B,C) [38,42], showing that GEs also interact directly with the xylan chain through van der Waals interactions (including through a conserved Trp residue) and thus, in contrast to some other CEs in CAZy, are true carbohydrate-active enzymes. However, hydrogen bond interactions with the xylan chain are scarce, which perhaps allows the enzymes to accommodate diverse substrates, as would be expected in LCCs. The interaction with the lignin portion of LCC substrates has been more difficult to assess, and thus far no structures of Michaelis complexes of GEs have been solved. However, the large inserts close to the active site in bacterial enzymes have been proposed to interact with lignin, as in several cases they have a surface rich in aromatic residues facing the presumed ‘lignin side’ of the substrate (Figure 6) [28,31,39]. Possibly, these regions could be involved in LCC interactions, either productively to access the complex lignin–carbohydrate substrate, or to help avoid unproductive binding to lignin as described for TtCE15A above [32]. Structure-guided inhibition studies probing the interactions between TtCE15A and small aromatic molecules or glucuronoxylooligosaccharides have also highlighted residues in the active site presumed to interact with lignin, as well as the conserved Trp that interacts with the GlcA moiety and the xylan chain [31].
Figure 5. Close-up of the active site structures of GEs, with bound ligands shown as green sticks and binding/active site residues as purple sticks
Figure 6. Residues in bacterial insert regions putatively interacting with lignin
In addition to the aforementioned product complexes, covalent intermediates have been trapped in crystals with catalytically impaired versions of the same enzymes, where either the catalytic His or a conserved active site Arg had been mutated to Ala for OtCE15A [42,46], or using the wild-type CuGE [38]. The highly conserved Arg in the catalytic cleft is important for stabilization of the oxyanion hole during catalysis, a function which in many ABHs is fulfilled by main chain nitrogen atoms. Puzzlingly, some natural GEs are also devoid of this conserved residue (OtCE15B and BeCE15A) and as a result display low activity on model substrates [28,36]. Other ‘inactive’ GEs have been reported (with different sequence features), which may suggest yet undiscovered natural substrates in the family [47,48]. The accumulated structural information has been the basis for the first computational investigations of the catalytic mechanism of the model GE OtCE15A using QM/MM [46]. The study confirmed that for this enzyme, the deacylation step is rate-limiting, though only by a small margin. Experimentally, this is reflected by very little dependence of activity on the nature of the leaving group, which is characteristic of OtCE15A, while as stated above, many GEs strongly prefer bulky, lignin-like leaving groups. For these GEs, the rate-limiting step is likely to be acylation. Furthermore, the same study investigated in depth the effect of substituting the two catalytic acids and the conserved Arg, the latter being shown to be particularly important, as deacylation could not occur when this residue was mutated in silico and the covalent intermediate could be experimentally observed [46]. Another QM/MM study on the Thermothelomyces thermophila glucuronoyl esterase (TtGE) later confirmed the importance of the conserved Arg for stabilization of the oxyanion hole but found that for this specific GE the acylation step is rate-limiting [49]. 
Finally, MD simulations with OtCE15A highlighted the importance of considering dissociation from substrate when investigating overall performance on real biomass, as for a complex substrate this step had a similar energy barrier as catalysis itself [46].
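The connection drawn above between the rate-limiting step and leaving-group dependence can be made explicit with the textbook two-step acyl-enzyme scheme for serine hydrolases, in which acylation (rate constant k2) releases the alcohol and deacylation (k3) releases the acid, giving kcat = k2·k3/(k2 + k3). The minimal Python sketch below assumes this simplified scheme and uses hypothetical rate constants, not values from the QM/MM studies; it shows the two limiting cases only.

```python
def kcat_acyl_enzyme(k2, k3):
    """Turnover number for a two-step acyl-enzyme mechanism.

    k2: acylation rate constant (depends on the leaving group)
    k3: deacylation rate constant (independent of the leaving group)
    """
    return k2 * k3 / (k2 + k3)


# Hypothetical constants (s^-1). A better leaving group raises k2 tenfold.
K2_POOR, K2_GOOD = 1000.0, 10000.0

# Deacylation-limited case (k3 << k2): kcat barely responds to the leaving group.
deacyl_limited = [kcat_acyl_enzyme(k2, k3=10.0) for k2 in (K2_POOR, K2_GOOD)]

# Acylation-limited case (k2 << k3): kcat tracks the leaving group closely.
acyl_limited = [kcat_acyl_enzyme(k2, k3=1e6) for k2 in (K2_POOR, K2_GOOD)]

ratio_deacyl = deacyl_limited[1] / deacyl_limited[0]
ratio_acyl = acyl_limited[1] / acyl_limited[0]
```

In the deacylation-limited case the tenfold change in k2 alters kcat by about one percent, whereas in the acylation-limited case kcat changes almost tenfold. An enzyme in which deacylation is limiting only by a small margin, as reported for OtCE15A, would fall between these extremes.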
As described in the ‘GE diversity’ section above, some GEs are part of modular enzymes, the most extreme case characterized so far being CkXyn10C-GE15A [33,39]. Attempts to structurally investigate the full-length enzyme (not including C-terminal cell wall-binding domains) have failed due to production issues. However, the GE domain and one of the associated CBM9 modules could be crystallized, and small-angle X-ray scattering (SAXS) was carried out on the smaller N-terminal CBM22-CBM22-Xyn10C construct [39]. The CE15 module of CkXyn10C-GE15A lacks two of the inserts commonly found in bacterial sequences but has a more protruding and aromatic-rich middle insert (Figure 6), which was speculated to be involved in lignin interaction. Intriguingly, the SAXS study shows that the CBM22-CBM22-Xyn10C construct is not in an extended conformation at room temperature, but the biological significance is difficult to extrapolate to the full-length construct at elevated temperatures.
Outlook
GEs are intriguing enzymes that simultaneously need to interact with both carbohydrates and lignin, and by cleaving lignin–carbohydrate bonds, they can greatly reduce plant cell wall recalcitrance. Their ability to boost the action of other plant cell wall-degrading enzymes has been demonstrated for both fungal and bacterial enzymes, but more studies are needed to systematically compare the action of different GEs on different types of biomass. Furthermore, as lignin and polysaccharides are coupled through different bonds in the LCC, additional enzyme activities, yet to be discovered, might be needed to fully assess the action of GEs. The large inserts in bacterial GEs relative to fungal GEs may have roles in steering their interactions with lignin, which could also be clarified in future work. Lastly, the association of GEs with other catalytic domains and CBMs, as well as their co-localization in genomes with enzymes targeting polysaccharides other than xylan, may point toward new activities remaining to be unravelled within CE15.
Summary
Glucuronoyl esterases (GEs) are enzymes belonging to carbohydrate esterase family 15 (CE15) that are able to cleave bonds between lignin and carbohydrates in the plant cell wall.
GEs interact both with the carbohydrate and lignin portions of their substrates, though the molecular basis for the interaction with the ‘lignin side’ remains elusive.
Bacterial GEs appear to be more diverse than fungal counterparts, both in terms of sequence and structure, where bacterial enzymes often have large inserts relative to the fungal enzymes.
Better-defined substrates that mimic the lignin–carbohydrate complex (LCC) structures are needed to further characterize GEs, as several known CE15 enzymes have little activity on model substrates.
Though the importance of GEs for LCC breakdown is amply demonstrated, additional enzyme activities are probably needed to fully decouple lignin and polysaccharides in future biorefineries.
Competing Interests
The authors declare that there are no competing interests associated with the manuscript.
Acknowledgements
The authors would like to acknowledge the Novo Nordisk Foundation (project numbers NNF17OC0027698 and NNF21OC0071611).