Structure and function of microbial α-l-fucosidases: a mini review

Abstract
Fucose is a monosaccharide commonly found in mammalian, insect, microbial and plant glycans. The removal of terminal α-l-fucosyl residues from oligosaccharides and glycoconjugates is catalysed by α-l-fucosidases. To date, glycoside hydrolases (GHs) with exo-fucosidase activity on α-l-fucosylated substrates (EC 3.2.1.51, EC 3.2.1.-) have been reported in the GH29, GH95, GH139, GH141 and GH151 families of the Carbohydrate Active Enzymes (CAZy) database. Microbes generally encode several fucosidases in their genomes, often from more than one GH family, reflecting the high diversity of naturally occurring fucosylated structures they encounter. Functionally characterised microbial α-l-fucosidases have been shown to act on a range of substrates with α-1,2, α-1,3, α-1,4 or α-1,6 fucosylated linkages depending on the GH family and microorganism. Fucosidases show a modular organisation, with catalytic domains of GH29 and GH151 displaying a (β/α)8-barrel fold while GH95 and GH141 show an (α/α)6-barrel and a parallel β-helix fold, respectively. A number of crystal structures have been solved in complex with ligands, providing a structural basis for their substrate specificity. Fucosidases can also be used in transglycosylation reactions to synthesise oligosaccharides. This mini review provides an overview of the enzymatic and structural properties of microbial α-l-fucosidases and some insights into their biological function and biotechnological applications.


Introduction
Fucose (Fuc) is a 6-deoxy sugar that can be present in nature as the d or l enantiomer. d-Fucose (6-deoxy-d-galactose) is frequently found in plant glycosides such as convolvulin from Convolvulaceae plants and in antimicrobials including curamycin produced by Streptomyces curacoi [1]. l-Fucose (6-deoxy-l-galactose) is ubiquitously found in mammals, plants, insects and microbes as part of oligosaccharides, glycoproteins such as mucins, or lipids, forming glycoconjugates via α-linkage [1], whilst β-l-fucose is rare and only seldom reported in bacteria [2]. These structures are involved in a myriad of physiological processes, including immune recognition [3], development and neural functions [4,5], plant immunity [6,7] or host-microbe interactions (for a review see [8]). For example, Fuc has been implicated in bacterial colonisation by modulating chemotaxis [9], swimming motility [10] and pathogenesis [11], or by acting as a nutrient source for commensal or pathogenic bacteria [12-14]. In nature, Fuc can be linked to other sugar residues via various linkages at the non-reducing end through the action of fucosyltransferases [15,16]. Core Fuc, Le-type Fuc and O-Fuc have different biological functions and are associated with different diseases [17]. Terminal Fuc can be α-1,2-linked to β-galactose (Gal) from lactose (Lac) or N-acetyllactosamine (LacNAc) in human milk oligosaccharides (HMOs) [18] and blood group antigens [14]. Terminal Fuc can also be α-1,3-linked to β-glucose (Glc) and β-N-acetylglucosamine (GlcNAc) from HMOs [18], to β-GlcNAc from Lewis antigens [14] and β-Gal from HMOs [19] and

Figure 2. Catalytic modules are shown in green and β-sandwich domains that may have carbohydrate-binding properties in light brown and yellow. Catalytic nucleophile residues are coloured magenta and catalytic acid/base residues are coloured orange. Where possible, WT apo crystal structures (grey) have been aligned to their corresponding inactive mutant crystal structures (green) to highlight residue movements upon binding to a substrate-like ligand. The N- and C-termini are indicated with blue and red spheres, respectively. Surface representation views are related by a 90° rotation around the y-axis. If a substrate complex is not available, the location of the active site is indicated with a black sphere. (A) GH29 fucosidase (SpGH29, apo PDB = 6ORG; D171N/E215Q mutant in complex with LeX, PDB = 6ORF). The catalytic domain comprises residues 11-317 and the C-terminal β-sandwich module comprises residues 318-451. The bound ligand is shown with Fuc (light red), Gal (yellow) and GlcNAc (light blue). (B) GH95 fucosidase (AfcA, apo PDB = 2EAB; E566A mutant in complex with substrate, PDB = 2EAD). The catalytic domain comprises residues 80-133 and 387-778, the N-terminal domain (in light brown) residues 9-79 and 134-293, and the C-terminal β-sandwich module (in yellow) residues 779-896. There is a helical barrel protruding from the N-terminal domain, residues 80-133. The substrate is shown with Gal (yellow), Fuc (light red) and Glc (light blue). (C) GH141 fucosidase (BT1002, apo PDB = 5MQP). The catalytic domain comprises residues 1-108 and 296-619, and the ancillary β-sandwich domain residues 109-295 (in yellow for residues 151-251 and in wheat for residues 109-251 and 252-295, according to visual separation into subdomains). (D) GH151 fucosidase (ALfuk2, apo PDB = 6TVK). The catalytic domain covers residues 1-336, the C-terminal domain (in wheat) residues 560-660 and the Rossmann fold domain (in teal) residues 341-558.
(Fibrobacteres-Chlorobi-Bacteroidetes super phylum) group (24%) and Proteobacteria group (27%) in agreement with previous analyses [44]. Compared with GH29 fucosidases, about half as many sequences (4890) are assigned to the GH95 family, 97% of which are from bacteria, with a distribution similar to that of the GH29 family between the Terrabacteria (46%), FCB (29%) and Proteobacteria (18%) groups. In contrast, GH139, GH141 and GH151 are smaller families comprising 254, 1043 and 203 members, respectively, mostly of bacterial origin (95% of GH139, 98% of GH141 and 99% of GH151). Altogether, these data indicate that about 96.5% of known fucosidase sequences are of bacterial origin [45]. There is also high variation and redundancy in the number of putative fucosidase-encoding genes within a given bacterial genome, with up to 21 GH29-encoding genes and up to 10 GH95-encoding genes found per genome, while the reported number of genes encoding GH139, GH141 and GH151 does not exceed two per genome (see Supplementary Table S1). In this mini review, we will describe the enzymatic and structural properties of α-l-fucosidases produced by microbes and provide an overview of their biological function and biotechnological applications.

Enzymatic and structural properties of fucosidases

GH29 fucosidases
Based on sequence analysis, GH29 fucosidases are predicted to be extracellular (secreted, membrane-attached or periplasmic) or intracellular, depending on the metabolic pathways of microbes inhabiting various environments. However, this is rarely validated experimentally and the presence or absence of a signal peptide does not always accurately reflect their location [46]. Functionally characterised GH29 fucosidases from microbes are active within a broad pH range, from 3.3 to 9, with a majority of enzymes showing a preference for neutral conditions (Table 1). The optimum temperature for GH29 fucosidases from gut microbes is around 37 °C, while the optimum temperatures of marine-derived microbial fucosidases are normally below 30 °C (Table 1). The highest optimal temperature reported so far for a microbial GH29 fucosidase is 95 °C, for Ssα-fuc from Sulfolobus solfataricus P2, isolated from hot springs (Table 1).
GH29 enzymes display broad substrate specificities covering α-1,2, α-1,3, α-1,4 and α-1,6 fucosylated linkages. Based on sequence homology and substrate specificity, GH29 enzymes are divided into two subfamilies, GH29A and GH29B [47]. In general, GH29A enzymes show higher activity than GH29B enzymes towards synthetic aryl substrates such as 4-nitrophenyl α-l-fucopyranoside (pNP-Fuc) or 2-chloro-4-nitrophenyl α-l-fucopyranoside (CNP-Fuc), and it is common for GH29B enzymes not to be active on these chromogenic substrates [46,48-51]. The Km values against aryl-Fuc for functionally characterised GH29 enzymes are in the μM to mM range, and the kcat values vary from 10⁻³ to 10² s⁻¹. Their catalytic efficiency, estimated as kcat/Km, varies from 10⁻⁶ to 10² s⁻¹ μM⁻¹ (Table 1). In addition, GH29B enzymes usually act on α-1,3/4 fucosylated linkages rather than α-1,2, whereas members of the GH29A subfamily show a more relaxed linkage specificity (Figure 3). To date, crystal structures are available for 16 microbial GH29 enzymes originating from 12 different microorganisms. Among them, BT2192 from B. thetaiotaomicron VPI-5482 [29] and BpGH29 from Bacteroides plebeius DSM 17135 [38] have α-galactosidase activities, while ClAgl29A and ClAgl29B from Cecembia lonarensis LW9 were shown to be α-glucosidases [52]. GH29 enzymes are characterised by the lack of an α-helix (α5) between β5 and β6 of the TIM barrel [30,36]. The catalytic nucleophile and acid/base residues are located at the end of the β4 and β6 strands, respectively. While the catalytic nucleophile in GH29 is a conserved Asp, the general acid/base residue is subfamily-dependent. In GH29B enzymes, the acid/base residue is generally conserved, based on sequence alignment with the experimentally validated E249 of BT4136 and BT1625 from B. thetaiotaomicron VPI-5482. In SpGH29 from Streptococcus pneumoniae TIGR4, the assignment of E215 as acid/base was also confirmed by X-ray crystallography [34] (Figure 2A).
Here, the D171 (nucleophile) and E215 (acid/base) of SpGH29 are located between the Fuc and GlcNAc residues, corresponding to the -1 and pseudo +1 subsites, respectively. The Gal within the +2 subsite makes hydrophobic interactions with W211 and hydrogen bonds to the nucleophile and D257, which, together with the -1 subsite, contributes to the α-1,3/4 fucosidase activity [34]. In contrast to GH29B fucosidases, the acid/base residues of GH29A enzymes show poor alignment across primary sequences, although they can be spatially overlapped with the acid/base residues from GH29B enzymes in their substrate-bound states but not in their free states [53]. However, the GH29A/B classification does not always accurately predict linkage preferences [46,54,55], as enzymes from the same subfamily can show various substrate specificities (Table 1 and Figure 3). Some functionally characterised bacterial GH29 fucosidases have only been reported to be active against artificial substrates, such as BF0810 from Bacteroides fragilis NCTC 9343 [56] and Fp240 and Fp251 from Paraglaciecola sp. [57]. Further investigation is required to determine their specificity towards natural substrates. GH29 fucosidases often present limited activity towards Lewis antigen glycan epitopes decorated with a sialic acid [48,58,59], which is ubiquitously found in antennary human N- and O-glycans. In contrast, the GH29 fucosidase E1_10125 from the gut symbiont Ruminococcus gnavus E1 was found to be active towards Lewis antigen glycan epitopes irrespective of the presence of terminal sialic acid [33]. Interestingly, E1_10125 showed stronger binding affinity and catalytic efficiency towards sialyl-Lewis X (sLeX) than Lewis X (LeX), as shown by isothermal titration calorimetry, saturation transfer difference NMR and kinetic assays [33].
X-ray crystallography, molecular dynamics simulation and docking showed that sLeX could be accommodated within the binding site of the E1_10125 fucosidase. Other microbial fucosidases may also be able to accommodate a terminal sialic acid in their binding pocket, although this remains to be demonstrated experimentally [50]. In addition, microbial GH29 fucosidases have been reported to carry out transglycosylation reactions due to their retaining mechanism of action, as recently reviewed elsewhere [60,61].

Note to Table 1: kinetic parameters were obtained using aryl-Fuc substrates. 1 Estimated from reported Vmax (μmol/L/min/mg) and molecular weight (MW, g/mol) using kcat (s⁻¹) = Vmax × MW/1000/60.
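The conversion given in the Table 1 footnote, kcat (s⁻¹) = Vmax × MW/1000/60, and the catalytic efficiency kcat/Km can be illustrated with a short worked example. The sketch below is only an illustration under stated assumptions: Vmax is treated as a specific activity in μmol of product per minute per mg of enzyme, the function names are hypothetical, and the numerical inputs are invented for the arithmetic and do not correspond to any enzyme in Table 1.

```python
def kcat_from_vmax(vmax_umol_per_min_per_mg: float, mw_g_per_mol: float) -> float:
    """Turnover number in s^-1, using k_cat = V_max x MW / 1000 / 60.

    Assumes V_max is a specific activity (umol product per min per mg enzyme).
    Dividing by 1000 converts umol/mg x g/mol into mol product per mol enzyme
    per minute; dividing by 60 converts minutes to seconds.
    """
    return vmax_umol_per_min_per_mg * mw_g_per_mol / 1000.0 / 60.0


def catalytic_efficiency(kcat_per_s: float, km_micromolar: float) -> float:
    """Catalytic efficiency k_cat/K_m in s^-1 uM^-1."""
    return kcat_per_s / km_micromolar


# Hypothetical values chosen only to make the arithmetic easy to follow.
vmax = 12.0        # umol/min/mg (assumed specific activity)
mw = 50_000.0      # g/mol (assumed molecular weight)
km = 200.0         # uM (assumed K_m against an aryl-Fuc substrate)

kcat = kcat_from_vmax(vmax, mw)        # 12 * 50000 / 1000 / 60
eff = catalytic_efficiency(kcat, km)   # kcat / 200

print(f"kcat = {kcat:.2f} s^-1, kcat/Km = {eff:.3f} s^-1 uM^-1")
```

With these invented inputs the turnover number is 10 s⁻¹ and the efficiency 0.05 s⁻¹ μM⁻¹, both inside the ranges quoted above for characterised GH29 enzymes.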

GH139 and GH141 fucosidases
Currently, there are two functionally characterised GH141 enzymes in the CAZy database. BT1002 from B. thetaiotaomicron VPI-5482, the founding member of the GH141 family, is an endo-acting enzyme releasing the 2-O-methyl-d-xylose-α-1,3-l-fucose disaccharide from chain A of the complex pectin rhamnogalacturonan-II (RG-II) [42]. The catalytic domain of BT1002 folds into a right-handed parallel β-helix (Figure 2C). The solvent-exposed surface representation of the catalytic centre of BT1002 reveals an extended catalytic pocket that may assist the accommodation of the disaccharide containing xylose and Fuc. Site-directed mutagenesis revealed that the putative nucleophile D523 and general acid/base D564, located in the binding pocket, were critical for l-Rhap-α-1,3-d-Apif-α-1,4-d-MeXylp-l-Fucp hydrolysis [42]. The second member of the GH141 family is in fact a xylanase, Cthe_2195 from Acetivibrio thermocellus ATCC 27405 (previously known as Clostridium thermocellum) [72], which showed no activity on aryl-Fuc substrate.

Insights into the biological role of microbial fucosidases
Gut microbes such as Bifidobacterium species [63], B. thetaiotaomicron [47], R. gnavus [33] or Akkermansia muciniphila [75] have been shown to produce multiple fucosidases that cleave Fuc from host glycans, underscoring their importance for the fitness and adaptation of these bacteria to the gut environment (Supplementary Table S1). The capability of removing α-l-fucosyl residues from free oligosaccharides and glycoconjugates confers on fucosidase-possessing microbes a competitive advantage in mucin glycan foraging [14] and, in turn, helps maintain intestinal homeostasis [76,77]. Fucosidases from commensal bacteria also play a role in cross-feeding with other members of the gut microbiota [78,79] or with enteric pathogens such as Salmonella enterica serovar Typhimurium, Clostridium difficile [80], Campylobacter jejuni [81,82] and other pathogens [83], facilitating their infection. Recently, α-l-fucosidases from the GH29 family were identified and characterised from the metagenome of faecal samples of breastfed infants. This analysis revealed a remarkably high number of GH29 α-l-fucosidases present in the infant intestinal environment with high sequence identity (above 98%) to α-l-fucosidases from B. thetaiotaomicron, Bacteroides caccae, Phocaeicola vulgatus, Phocaeicola dorei, R. gnavus and Streptococcus parasanguinis (Supplementary Table S1). These enzymes showed different substrate specificities towards HMOs, blood group antigens and glycoproteins [51]. GH95 fucosidases were also identified in the infant faecal microbiome from B. longum subsp. infantis, B. thetaiotaomicron, B. caccae, R. gnavus, P. vulgatus and P. dorei (Supplementary Table S1). The variety of α-l-fucosidases may provide these species with an advantage in colonising the gut of infants and adults. Novel tools have been developed to further investigate the biological roles of microbial fucosidases.
For example, activity-based probes (ABPs) have been used to identify their functional state and their spatial and temporal distribution [84]. Cyclophellitol epoxides/aziridines, 2-deoxy-2-fluoro glycosides and quinone methides have been employed to design covalent inhibitors of glycosidases [85]. Fucopyranose-configured cyclophellitol aziridines have been applied for in vitro and in vivo labelling of bacterial and mammalian GH29 fucosidases [86]. More recently, a 2-deoxy-2-fluoro fucosyl fluoride derivative named YL209 has been developed to match the versatile linkage specificity of GH29 enzymes, potentially extending its application to the identification of gut microbial fucosidases [87]. Lately, an ortho-quinone methide-based probe with an azide mini-tag has been developed to label both retaining and inverting bacterial fucosidases [88].

Biotechnological applications of microbial fucosidases
With the development of glycan analytical tools, glycan profiling has gained momentum in the last decade as a potential strategy to monitor the state of diseases [89]. Some of the main glycan biomarker targets are human serum N-glycans containing two types of fucosylation, antennary LeX or sLeX epitopes and Fuc-α-1,6-GlcNAc (6FN). The fucosylation patterns of human serum N-glycans are indicators of immunological responses to diseases including cancer [90], diabetes [91] and Helicobacter pylori infection [92]. Fucosidases with distinct substrate specificities have been employed among the exoglycosidases used to validate and monitor these glycan biomarkers in a number of human studies [72,93-98].
Another application of fucosidases is the modulation of core fucosylation status in glycoproteins such as antibodies, which is crucial for functions such as antigen recognition [99]. So far, only the human fucosidase FucA1 has been shown to release core fucose from intact glycoproteins, albeit with low enzymatic activity [100]. No bacterial α-l-fucosidase has been described with the capability to remove the core Fuc from intact glycosylated IgG. However, recent work characterised four fucosidases showing a high capacity to hydrolyse α-1,6-linked Fuc from the disaccharide 6FN [51]. These α-l-fucosidases might have applications in the development of therapeutic proteins with modified core fucosylation, although their capacity to act on core fucosylation in glycosylated antibodies needs further analysis. Recently, glycosidase and glycoligase tools based on the site-specific GH29 core α-1,6-l-fucosidase AlfC from L. casei have been developed to aid glycoengineering of antibodies for core fucosylation of the Fab and Fc fragments [23,101,102].

Conclusions and perspectives
Fucosylated glycans influence a wide range of biological processes in health and disease. Despite recent advances in understanding the structure-function relationships of GH29 enzymes, our biochemical and structural understanding of the range of microbial α-l-fucosidases and of their natural substrates remains limited compared to the wealth of sequencing data available in metagenomic databases. Further enzymatic investigations of bacterial fucosidases should shed light on the type of fucosylated structures accessible to microbes and the specificity of α-l-fucosidases towards substrates with different modifications and linkages. A combination of metagenomics and glycomics approaches is warranted to advance our knowledge of the biological roles of microbial α-l-fucosidases. Harnessing the diversity of microbial α-l-fucosidases will provide powerful tools that can be exploited for glycan analysis, biomarker detection or new glycan-targeted therapies.

Summary
• Microbial α-L-fucosidases from soil, marine or gut origin are of great biological and biotechnological importance.
• Enzymatic investigations of GH29 α-L-fucosidases have advanced our knowledge of the range of substrates and glycan utilisation strategies used by microbes to adapt to their environment, while α-L-fucosidases from other GH families have been under-studied.
• α-L-Fucosidases have been developed as glycoenzyme tools for glycan analysis, biomarkers for diagnosis or glycan-targeted therapies as well as oligosaccharide synthesis and glycoengineering on glycoproteins.
• Further biochemical and structural characterisation of the variety of α-L-fucosidases produced by microbes is required to enhance our understanding of the mechanisms underpinning host-microbe interactions and harness the potential of these enzymes for biotechnological and biomedical applications.

Competing Interests
The authors declare that there are no competing interests associated with the manuscript.

Open Access
Open access for this article was enabled by the participation of John Innes Centre in an all-inclusive Read & Publish agreement with Portland Press and the Biochemical Society under a transformative agreement with JISC.