Biological carbohydrate polymers represent some of the most complex molecules in life, enabling their participation in a huge range of physiological functions. The complexity of biological carbohydrates arises from an extensive enzymatic repertoire involved in their construction, deconstruction and modification. Over the past decades, structural studies of carbohydrate processing enzymes have driven major insights into their mechanisms, supporting associated applications across medicine and biotechnology. Despite these successes, our understanding of how multienzyme networks function to create complex polysaccharides is still limited. Emerging techniques such as super-resolution microscopy and cryo-electron tomography are now enabling the investigation of native biological systems at near molecular resolutions. Here, we review insights from classical in vitro studies of carbohydrate processing, alongside recent in situ studies of glycosylation-related processes. While considerable technical challenges remain, the integration of molecular mechanisms with true biological context promises to transform our understanding of carbohydrate regulation, shining light upon the processes driving functional complexity in these essential biomolecules.
Introduction
Carbohydrate (glycan/saccharide/sugar) polymers are among the most ubiquitous and important macromolecules in nature, used without exception across all kingdoms of life. Complex biological carbohydrates, either in free form or as conjugates of proteins, lipids, and metabolites, mediate countless structural [1-3], signalling [4-9], and metabolic functions [10-12], well beyond their colloquial designations as dietary energy sources. The wide-ranging roles of biological carbohydrates arise from their astonishing molecular diversity, enabling interactions with a huge range of partners [13]. Carbohydrate polymers are constructed from monosaccharide building blocks, which can link together in multiple regio- and stereoisomeric variations, producing combinatorial complexity far exceeding that of other biomolecules (Figure 1). Such complexity reflects an equally extensive enzymatic network tasked with regulating carbohydrate structures.
Molecular diversity of biological carbohydrates.
(A) Regio- and stereo-isomeric combinations for a theoretical dimer of D-glucose. Eleven unique permutations are possible [Glc(α1→1)βGlc and Glc(β1→1)αGlc are equivalent], far exceeding the complexity of nucleic acids and proteins. (B) Selected glycan types found in eukaryotes, depicted in symbol nomenclature for glycans (SNFG) format [14]. Note that glycosaminoglycans are highly heterogenous and variable polymers – indicative sequences shown may not fully reflect natural complexity.
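The count of eleven glucose dimers quoted in the Figure 1 legend can be reproduced with a short enumeration. The sketch below is illustrative only and rests on stated assumptions: both residues are pyranoses, linkage runs from the donor anomeric carbon (C1) to a hydroxyl at position 1, 2, 3, 4 or 6 of the second residue, and the anomer of a free reducing end is not counted as a separate isomer. The function name `glucose_dimers` is hypothetical.

```python
from itertools import product

ANOMERS = ("α", "β")
# Hydroxyl positions on the second glucopyranose that can accept a linkage
# (assumption: pyranose ring, so O5 is in the ring and is not available).
ACCEPTOR_POSITIONS = (1, 2, 3, 4, 6)


def glucose_dimers():
    """Return the set of distinct Glc-Glc linkage isomers as strings."""
    dimers = set()
    for donor_anomer, position in product(ANOMERS, ACCEPTOR_POSITIONS):
        if position == 1:
            # A 1→1 linkage joins two anomeric centres. The two residues are
            # interchangeable, so the anomer pair is stored in sorted order,
            # which merges Glc(α1→1)βGlc with Glc(β1→1)αGlc.
            for acceptor_anomer in ANOMERS:
                first, second = sorted((donor_anomer, acceptor_anomer))
                dimers.add(f"Glc({first}1→1){second}Glc")
        else:
            dimers.add(f"Glc({donor_anomer}1→{position})Glc")
    return dimers


if __name__ == "__main__":
    for name in sorted(glucose_dimers()):
        print(name)
    print(len(glucose_dimers()), "unique dimers")
```

Under these assumptions, three 1→1 isomers plus two anomers at each of the four remaining positions give the eleven permutations stated in the legend.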
Precise enzymatic control of carbohydrates is of paramount importance in biology. Glycosidic linkages between monosaccharides must be made or broken with exquisite selectivity, within the context of larger oligo/polymeric assemblies. In turn, glycans containing multiple linkage types must be constructed by networks of enzymes acting in concert. Unlike nucleic acids or proteins, carbohydrates are not directly templated by genetics, meaning that multiple related structures can be produced by each pathway, depending on the functional interactions arising between enzymes and their substrates. Despite this lack of templating, cells nevertheless exert fine control over their glycosylation repertoire, enabling distinct motifs to be presented in a tissue- and context-specific manner [15-22].
Here, we review advances in structural glycobiology that have shaped our understanding of carbohydrate processing, particularly in eukaryotes. We cover mechanistic insights gained from in vitro studies of glycosylation and examine how these link to new in situ approaches that enable the examination of glycosylation within native biological systems.
Carbohydrate processing enzymes – mechanistic considerations
Underscoring their importance, carbohydrate processing enzymes form a substantial part of the metabolic repertoire of all organisms, with 1–3% of the gene-coding content of any species typically dedicated to these functions [23,24]. The CAZy consortium has undertaken to classify all known Carbohydrate Active enZymes (CAZymes) within sequence-based families, providing a powerful framework on which to place mechanistic insights [25-27]. Within the CAZy classification, the glycoside hydrolases (GHs) and glycosyltransferases (GTs) make up the largest enzyme groupings, with 189 GH and 137 GT families annotated as of December 2024. These enzymes are responsible for the degradation and construction of biological carbohydrates, and accordingly, have spurred decades-long interest in understanding their functions. Structural techniques have contributed centrally to the mechanistic study of GHs and GTs, from Phillips’ groundbreaking elucidation of lysozyme by X-ray crystallography [28,29], to recent cryo-electron microscopy (cryo-EM) investigations of multicomponent GT complexes [30,31]. Mechanistic insights have, in turn, spurred the development of strategies for manipulation [32-35], supporting diverse applications across fundamental science, biomedicine and biotechnology [36-45].
GH mechanisms
GHs are highly optimised molecular machines that hydrolyse glycosidic bonds [46,47], and represent the primary route for glycan catabolism across life. Mechanistically, GHs can be classified as inverting or retaining, based on the relative configurations of the glycosidic anomeric centre from substrate to product. Retaining GHs mediate hydrolytic cleavage with net retention of anomeric stereochemistry (i.e., α substrate to α product, or β to β) while inverting GHs flip anomeric stereochemistry (α to β, β to α). These GH activities are accommodated within a diverse range of protein folds, broadly conserved across ‘clans’ (i.e., superfamilies) within the CAZy scheme [48], (Figure 2a).
Structures and mechanisms of glycoside hydrolase enzymes.
(A) Protein folds adopted by known GHs, with corresponding clans listed. One representative structure for each fold is shown [49-55]. PDB codes highlighted in red. (B) Schematic of the single SN2 displacement mechanism used by inverting GHs. (C) Schematic of the double-displacement mechanism used by retaining GHs, which results in net retention of anomeric stereochemistry after two SN2 attacks.
Most GHs utilise nucleophilic displacement to process their carbohydrate substrates, aided by conserved carboxylate (Asp/Glu) residues within the enzyme active site. Inverting GHs mediate hydrolysis using a single SN2-type attack, initiated by catalytic base-mediated deprotonation of an active site water, with loss of aglycone aided by protonation from a catalytic acid (Figure 2b). Conversely, retaining GHs typically employ double displacement mechanisms, in which initial SN2 attack on the substrate by a nucleophilic residue produces a transient glycosyl-enzyme intermediate, which is released by water after a second SN2 attack (Figure 2c). A smaller number of non-canonical mechanisms are also known, involving alternative nucleophiles [56], neighbouring group participation [57,58] or NAD+ cofactors [59-62].
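The stereochemical bookkeeping behind the inverting and retaining classes can be stated compactly: each SN2-type displacement at the anomeric centre inverts its configuration, so one displacement gives net inversion and two give net retention. The following minimal sketch encodes only that rule. The names `product_anomer` and `SN2_STEPS` are hypothetical, and the non-canonical mechanisms mentioned above are deliberately not modelled.

```python
# Number of SN2-type displacements at the anomeric centre for the two
# canonical GH mechanism classes described in the text.
SN2_STEPS = {"inverting": 1, "retaining": 2}


def flip(anomer: str) -> str:
    """Invert an anomeric configuration (α becomes β, and vice versa)."""
    if anomer not in ("α", "β"):
        raise ValueError(f"unknown anomer: {anomer!r}")
    return "β" if anomer == "α" else "α"


def product_anomer(substrate_anomer: str, mechanism: str) -> str:
    """Apply one inversion per SN2 step and return the product anomer."""
    anomer = substrate_anomer
    for _ in range(SN2_STEPS[mechanism]):
        anomer = flip(anomer)
    return anomer


if __name__ == "__main__":
    for mechanism in SN2_STEPS:
        for substrate in ("α", "β"):
            print(mechanism, substrate, "->", product_anomer(substrate, mechanism))
```

For a retaining enzyme, the intermediate state after the first step corresponds to the glycosyl-enzyme intermediate, which carries the opposite configuration to both substrate and product.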
Glycosyltransferase mechanisms
GTs catalyse the formation of new glycosidic linkages, transferring carbohydrate units from activated glycosyl donors onto specific acceptor molecules. In contrast to GHs, GT enzymes display much less structural diversity, with only three clans currently classified. The GT-A and GT-B clans largely comprise soluble proteins adopting α/β/α sandwich or paired-Rossmann folds, respectively, and use sugar-nucleotide (‘Leloir’) substrates as glycosyl donors [63]. In contrast, clan GT-C exclusively comprises integral multipass transmembrane (TM) proteins, which use sugar-phosphate linked lipid (‘non-Leloir’) substrates as donors [64] (Figure 3a). Note that while GT-A and GT-B domains are themselves typically soluble, they can still link to membrane domains as part of broader protein structures, as observed for cellulose [69-71] and hyaluronan synthases (both GT-A) [72], wherein enzyme products are conjugated to closely abutting TM channels. Many GTs also display allosteric shifts upon the binding of either glycosyl donor [68,73-76] or acceptor substrate [77], such that reactive binding modes only occur upon the formation of a complete ternary complex. Care must, therefore, be exercised in the generation and interpretation of GT structures for mechanistic study (Figure 3b).
Structures and mechanisms of glycosyltransferase enzymes.
(A) Protein folds adopted by known GTs. One representative structure for each fold is shown [65-67]. PDB codes highlighted in red. (B) GTs and allosteric modulation: MGAT5 in complex with acceptor oligosaccharide alone or acceptor + UDP. In the absence of UDP, the acceptor oligosaccharide adopts an inert pose with no interaction to the catalytic base. A loop organised upon UDP binding (red) alters the acceptor binding pose to induce catalytically productive interactions [68]. (C) Schematic of canonical SN2 displacement mechanism in inverting GTs. Note the similarity to inverting GHs. (D) Schematic of SNi-like mechanism for retaining GTs involving front-face transfer.
Like GHs, GTs can be classified as retaining or inverting, depending on the stereochemical relationship between the substrate and product. Inverting GTs operate analogously to inverting GHs, using a single SN2-type displacement to form glycosidic bonds, driven by base-mediated deprotonation of the acceptor substrate (Figure 3c). In contrast, most retaining GTs are proposed to operate via non-nucleophilic SNi-type front-face transfer, involving the constrained approach of the acceptor from the same ‘face’ as the glycosyl donor bond [78-81], (Figure 3d). While some unusual double displacement GT reactions have recently been identified for retaining Kdo-transferases involved in bacterial cell wall construction [82,83], the extent to which these mechanisms operate across other GT families remains to be determined.
Special consideration must be given to GT-C enzymes, which historically had fewer structures reported prior to the wider adoption of cryo-EM for studying membrane proteins. Classical GT-C folds comprise a core region containing seven conserved TM helices, followed by a variable number of additional helices, with the essential catalytic base residing in an extended loop following TM helix 1 (all known GT-C enzymes are inverting) [30,64,65,77,84-86]. Interestingly, an alternative arrangement of ten TM helices has recently been noted for bacterial GT-Cs RodA [87] and WaaL [88], highlighting potentially undiscovered structural diversity within this poorly characterised clan.
For all GT-Cs, binding of lipid-linked glycosyl donor substrates occurs within hydrophobic TM cavities, leading to placement of sugar headgroups near the glycosyl acceptor and catalytic base, in line with canonical inverting GT transfer, with occasional variations, e.g., a His base in WaaL rather than Asp/Glu [88]. Perhaps more intriguing is the ability of some GT-C enzymes to glycosylate poorly nucleophilic acceptors. The eukaryotic oligosaccharyltransferases OST-A/B [30,65], and bacterial homologues such as PglB, transfer glycans onto the unreactive amide nitrogen of Asn sidechains, which is putatively achieved via H-bond-mediated twisting of the amido nitrogen, breaking the conjugated π-system that otherwise dampens nucleophilicity [89]. Separately, CMTs, which catalyse tryptophan C-mannosylation, are proposed to function via an SEAr-type (electrophilic aromatic substitution) mechanism, with initial electrophilic substitution at tryptophan C2 closely followed by base-mediated deprotonation to restore aromaticity [77].
Understanding glycosylation networks in vitro
Although many simple glycan motifs, such as protein modification by O-GlcNAc [90], C-mannose [91] or O-fucose [92], perform important physiological functions (e.g., regulation of transcription and epigenetics by nuclear O-GlcNAc [93-97]), most biological carbohydrates are considerably more complex (Figure 1). These complex glycans are built by multi-step biosynthesis pathways, wherein the product of one enzyme forms the substrate of one or more downstream enzymes. Structural studies have now made considerable headway into glycosylation processes, with major pathways such as N-glycosylation and heparan sulfate (HS) biosynthesis nearing full characterisation, enabling understanding of how enzyme activities evolve as glycan products mature (Figure 4). As an example, recent structures of EXTL3 and EXT1/2 inform upon the initial construction of the HS backbone by non-processive polymerisation [106,112], with further HS modification by deacetylation, epimerisation and sulfation rationalised by the structures of NDST1 [107], GLCE [108] and O-sulfotransferases [110,113-115], respectively.
Near-complete structural characterisation of selected glycosylation pathways.
(A) Protein N-glycan processing within the ER and Golgi to form complex-type biantennary glycans, with structures of responsible enzymes shown [98-105]. For clarity, processing pathways to form hybrid, bifurcated or tri-/tetra-antennary complex glycans have been omitted. (B) Heparan sulfate biosynthesis within the Golgi, from the first committed step after formation of the core tetrasaccharide, with structures of responsible enzymes shown [106-111]. PDB codes highlighted in red.
Despite our considerable understanding of individual glycosylation enzymes, many aspects of overall carbohydrate regulation remain unclear. In particular, in vitro characterisation cannot explain the diverse but regulated heterogeneity of complex biological carbohydrates. Without direct templating, glycosylation is likely controlled by the coordinated transfer of substrates between successive enzymes, directing certain motifs to be constructed whilst minimising off-target reactions. Understanding this functional interplay between glycosylation enzymes, and the mechanisms that link catalysis with substrate transfer, can only be achieved by studying native systems.
Towards understanding (eukaryotic) glycosylation in situ
For eukaryotes, a major proportion of carbohydrate processing occurs within the secretory ER-Golgi network, involving the action of resident GHs and GTs upon nascent glycoproteins as they transit towards their destinations (Figure 4). Partitioning of glycosylation enzymes along the Golgi stack has long been understood, with most early-acting enzymes residing in the cis-Golgi and later enzymes in the medial- and trans-Golgi, providing some hierarchical control over their successive activities [116-118].
Considerable evidence also implicates ‘kin-recognition’ as another mechanism for regulating glycosylation, whereby enzymes within a pathway form homo- or heteromeric complexes within the ER-Golgi to coordinate their functions. Early work by Nilsson et al. used relocalisation tags to probe physical relationships between human glycosylation enzymes. By grafting the ER-directing sequence of p33 onto medial-Golgi enzymes MGAT1 or MAN2A1, which act successively during N-glycosylation, the non-tagged partner could also be relocated. Conversely, p33 tagging of the trans-Golgi resident β-galactosyltransferase failed to relocate either MGAT1 or MAN2A1, suggesting specific coordination between the former pair [119]. Many similar relationships have now been identified across yeast [120], plants [121] and human [122] cells. Notably, several examples of kin-recognition have been reported within the biosynthesis pathway of the glycosaminoglycan HS, with EXT1 and EXT2 known to form a stable heterodimer [106,112,123], EXTL3 forming a homodimer [111], and EXT2 and NDST1 [124], and GLCE and HS2ST1 [125,126] also postulated to interact (Figure 4b). Based on the rapidity of HS construction in mouse mastocytoma fractions (minutes) [127], it has been theorised that HS biosynthesis enzymes may form a functional supercomplex, dubbed the GAGosome, responsible for coordinated construction of this polysaccharide [128]. However, no clear evidence for such an assembly has yet emerged. Indeed, with the exception of some obligate dimers [106,111,112], almost nothing is known about how kin-recognition in general may operate within the ER-Golgi, reflecting the likely transient nature of these interactions [122,129] and their lability outside of native environments.
Recently, super-resolution microscopy studies have started to tackle the complexities of Golgi function, enabling prior observations of interactions to be contextualised within living cells [130-134]. From imaging, the main organisational component of the Golgi complex appears to be the so-called ‘Golgi unit’, which broadly corresponds to the cisternae structures commonly associated with this organelle. Individual Golgi units are bounded by markers including GPP130, Golgin84 and Giantin, and connect to adjacent units via tubules, forming a broader Golgi ribbon superstructure. Within each Golgi unit, glycosylation enzymes are distributed to punctate zones, with those catalysing earlier-stage reactions occupying more ‘peripheral’ locations compared with later-stage enzymes, indicating that lateral as well as transverse enzyme partitioning may operate to regulate carbohydrate biosynthesis. Golgi units are also observed to be highly dynamic, with both splitting and fusion occurring over timeframes of minutes, potentially facilitating the even distribution of enzymes [134]. It is thus clear that subcellular imaging can deliver powerful insights into Golgi glycosylation processes. However, the spatial resolutions offered by super-resolution light microscopy still fall below those required for directly visualising enzyme interactions. More precise molecular identification requires resolutions currently only afforded by electron imaging.
Cryo-ET – direct visualisation of native biology
The recent emergence of cryo-electron tomography (cryo-ET) and associated techniques for biological imaging has provided a powerful toolkit for the study of intracellular organisation. Like the application of cryo-EM for single particle analysis, cryo-ET uses transmission electron microscopy (TEM) to image vitrified samples at near-atomic resolutions. Whereas single-particle data typically comprise many thousands of images of a given molecule for averaging, cryo-ET focuses on imaging a single site at multiple tilts, enabling subsequent three-dimensional reconstruction of a volume of interest.
Although cryo-ET projects remain significant undertakings, several recent advances have greatly increased the accessibility of this technique for interrogating biological environments. One of the most fundamental limitations of cryo-ET arises from the low mean free path (i.e., penetrating ability) of electrons, which limits the thickness of samples that can be studied to ~200 nm or less [135]. Thus, while viruses [136-139], vesicles [140], small bacteria [141,142] and cell peripheries [143-145] are readily imaged by cryo-ET, thicker specimens such as cell bodies or tissues must first be thinned to electron transparency. Diamond knife cryo-sectioning has historically been used to process samples to required thicknesses but can cause sample compression and crevassing artefacts that distort resulting images [146]. The advent of gallium [147], and more recently plasma-based [148,149] focussed ion beam (FIB) milling, coupled to scanning electron microscopy (FIB-SEM), has now largely replaced cryo-sectioning for biological samples, enabling cryo-ET-compatible lamellae to be rapidly prepared from cells with high throughput and minimal distortion [150]. By combining FIB milling with cryo-lift out strategies, sections of tissue or even whole organisms are now within reach [151-155]. Further integration of fluorescence microscopy also enables correlated light and electron microscopy (CLEM) approaches, whereby regions of interest can be targeted for FIB milling and cryo-ET based on incorporated fluorescent markers (e.g. an ER or Golgi stain to study glycosylation), thus bridging the imaging scales traditionally occupied by cellular and molecular biology [156-158].
A second major challenge for cryo-ET is the poor contrast arising from low-dose imaging of sensitive biological samples [159], compounded by the geometric increase in lamella thickness that occurs at higher tilts [160]. As with single-particle cryo-EM, recent developments in field emission guns, direct electron detectors, energy filters [161] and phase plates [162,163] have now substantially improved the data quality that can be obtained from vitrified biological samples. New processing methods have also streamlined workflows [164-166], with recent machine-learning tools significantly enhancing the speed and accuracy of cryo-ET annotations [167-171]. In favourable cases, with abundant well-resolved particles, near-atomic structures are now achievable using subtomogram averaging, wherein multiple copies of a particle are extracted from tomograms, aligned, and averaged to higher resolution. Stunning recent reconstructions of ribosomes at ~3–4 Å directly demonstrate the potential of in situ elucidation [141-143]. Where high resolutions are not possible, lower resolution volumes can still be used to dock coordinates from cryo-EM or X-ray crystallography, linking molecular models to in situ context. We direct interested readers to excellent recent reviews of this rapidly emerging field, including technical considerations not possible to cover here [172-174].
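The geometric thickening at high tilt noted above follows from simple trigonometry: for a flat lamella of thickness t tilted by angle θ, the electron path length through the sample is t / cos θ. The sketch below tabulates this relationship. The 200 nm starting thickness is taken from the practical limit quoted earlier, while the ±60° tilt range and 20° step are assumed purely for illustration, and the function name `effective_thickness_nm` is hypothetical.

```python
import math


def effective_thickness_nm(lamella_nm: float, tilt_deg: float) -> float:
    """Path length of the beam through a flat lamella tilted by tilt_deg."""
    if not -90.0 < tilt_deg < 90.0:
        raise ValueError("tilt must lie strictly between -90 and +90 degrees")
    return lamella_nm / math.cos(math.radians(tilt_deg))


if __name__ == "__main__":
    lamella_nm = 200.0  # nominal lamella thickness (from the text)
    # Assumed illustrative tilt scheme: -60 to +60 degrees in 20 degree steps.
    for tilt in range(-60, 61, 20):
        path = effective_thickness_nm(lamella_nm, tilt)
        print(f"tilt {tilt:+4d} deg: {path:6.1f} nm")
```

At 60° the path length is double the nominal thickness, which is why high-tilt images contribute disproportionately little signal for a given electron dose.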
Cryo-ET studies of the ER-Golgi pathway
Due to substantial challenges (see below), studies of carbohydrate processing by cryo-ET are still in their infancy. However, several recent reports have highlighted the use of this technique to investigate eukaryotic ER-Golgi networks, shedding light on some key processes that impact glycosylation (Figure 5).
In situ glycosylation pathways and their examination by cryo-ET.
(A) Schematic of eukaryotic glycosylation within the ER-Golgi pathway. Nascent protein substrates are modified by GHs and GTs as they transit through the ER and successive Golgi cisternae. ‘Kin-recognition’ between enzymes may facilitate control, driving the production of certain glycan motifs, despite the overall non-templated nature of glycosylation. Boxes reference recent cryo-ET studies of systems related to glycan processing. (B) Tomogram of Golgi apparatus from C. reinhardtii. Stack morphology of the organelle is clearly apparent, alongside COP vesicles that distribute cargo between cisternae. Adapted from EMD-3977 [175]. (C) Top – tomogram of HeLa ER membrane and surroundings, showing attachment of ribosomes (red arrows), alongside detached cytosolic ribosomes (pink). Bottom – tomogram of HeLa Golgi at higher magnification. Defined densities are visible on the luminal membrane face, which may (in part) correspond to enzymes involved in glycosylation. From authors’ own work. Black bars – 100 nm. White bars – 20 nm.
Native studies of the ER
A central function of the eukaryotic ER is the initiation of N-glycosylation, whereby a Glc3Man9GlcNAc2 oligosaccharide is transferred from dolichol-linked donors onto NXS/T sequons in nascent protein chains. This process is carried out by the oligosaccharyltransferase complexes OST-A/B, which associate with the ribosome, SEC61 translocon and translocon-associated protein (TRAP) complex to co-translationally (OST-A) or post-translationally (OST-B) modify peptides as they enter the ER lumen [65,176]. Seminal studies by Förster et al. have used cryo-ET to study the ribosome-translocon supercomplex in semi-purified ER microsomes, which provide a simpler system for analysis compared with whole cells while still maintaining membrane context. Although early efforts from 2011 achieved structures at only modest (~31 Å) resolutions, spatial relationships between ribosomes and the TRAP and OST complexes could still be identified, enabling both the stoichiometry and organisation of this macromolecular assembly to be verified [177-179]. With modern technical advances, the same team has recently achieved cryo-ET reconstructions of ER ribosomes to 4–10 Å, enabling at least ten decoding states across the ribosomal translation cycle to be classified, and polysomal networks to be traced [180]. Importantly for glycosylation, four ribosome-bound translocon states could also be classified, with 69% representing the SEC61-TRAP-OST-A supercomplex, enabling elucidation of a ~4.2 Å structure. The resulting model of the native translocon has enabled hypotheses regarding the essential role of TRAP in glycosylating proteins with weaker signal peptides [181]. Putatively, nascent polypeptides entering the ER encounter and push upon the TRAP α-subunit, which can interact with SEC61α to allosterically open its hydrophobic lateral gate, easing the entry of peptides into the SEC61 channel and towards glycosylation by OST-A.
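As a practical illustration of the NXS/T sequon targeted by OST, candidate N-glycosylation sites can be located in a protein sequence with a simple pattern search. The sketch below is a minimal example rather than a predictor of actual site occupancy. It assumes the widely used convention that X may be any residue except proline, which is not stated in the text above, and both the function name `find_sequons` and the example peptide are hypothetical.

```python
import re

# N-X-S/T with X != Pro (assumed convention). The pattern is wrapped in a
# lookahead so that overlapping sequons, e.g. in "NNSS", are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    peptide = "MNGTANPSNAS"  # hypothetical sequence for illustration only
    for position in find_sequons(peptide):
        print(f"Asn{position}: {peptide[position - 1:position + 2]}")
```

In the example peptide, the Asn-Pro-Ser motif is skipped because of the proline exclusion, while the Asn-Gly-Thr and Asn-Ala-Ser motifs are reported. Whether a matched sequon is actually glycosylated depends on factors such as translocon access, which a sequence search cannot capture.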
Native studies of the Golgi
Electron micrographs of fixed Golgi have been reported since at least the 1950s [182-184], and vitrified cryo-sectioned Golgi since the 2000s [185], unveiling broad aspects of organellar morphology. Detailed cryo-ET imaging of Golgi from FIB-milled lamellae was first reported by Engel et al. in 2015, using the unicellular green alga Chlamydomonas reinhardtii, revealing a classical stack-like organisation, with progressively narrowing cisternae from cis to trans (Figure 5b) [186]. Intriguingly, close examination of the Chlamydomonas trans-Golgi revealed regularly arrayed densities, which were hypothesised to correspond to GTs, based on similarities in size to Golgi-resident FUT6, GMII and α3GalT. Whilst spatial organisation of enzymes within the Golgi arrays is a compelling hypothesis, the extent to which these densities truly correspond to GTs remains unverified, as does the generality of these observations beyond Chlamydomonas. Notably, intracisternal arrays have not been observed in other Golgi tomograms, such as those from mammalian HeLa or INS-1E cells, which also exhibit different organellar morphologies (Figure 5c) [187]. It is possible that the arrays observed in Chlamydomonas Golgi represent structural proteins, rather than metabolic enzymes. Characterisation of the Golgi across more species and cell types will be key to establishing general vs. specific features of this organelle, including the nature of macromolecular organisation within its cisternae.
Native studies of vesicular transport
Another major factor affecting eukaryotic glycosylation is the distribution of enzymes within Golgi [133], which depends on the dynamic network of COPI (retrograde) and COPII (anterograde) vesicles that move cargo throughout the organelle [188]. In situ reconstructions of Chlamydomonas COPI vesicles were reported in 2017 by Briggs et al. [175], highlighting the trimeric structures of their coat proteins, which closely matched vesicles previously generated in vitro using mouse proteins [189]. Interestingly, in situ COPI vesicles contained several additional densities on their luminal faces not seen in vitro, likely corresponding to bound cargo or cargo receptors, enabling some insights into trafficking. Further integration with functional studies will be needed to resolve the mechanisms of vesicular transport and understand how these influence enzyme distribution throughout the Golgi.
Challenges for studying glycosylation by cryo-ET
Size limitations
A major challenge in studying glycosylation by cryo-ET is the small size of most relevant enzymes (typically 50–200 kDa; ~10 nm), which lies towards the lower bound of what can be easily identified within a crowded intracellular milieu. Consequently, relatively few insights have been made into glycosylation-specific processes, despite detailed tomograms of the ER-Golgi being available for nearly a decade. Improved tools and methodologies are clearly needed to annotate complex cryo-ET datasets.
The principal strategy for locating smaller proteins by cryo-ET is to colocalise a recognisable marker at or near the site of interest. Historical efforts have widely employed Au nanoparticles to label tomograms, due to the inertness of Au and its high atomic number compared with biological elements, which creates strong signals under TEM imaging regimes. Au nanoparticles conjugated to proteins or antibodies have been used to study diverse phenomena by cryo-ET, including transport across the nuclear pore [190], viral biogenesis [191,192], thylakoid photosystems [193] and growth factor trafficking [194]. Although large Au nanoparticles can obscure biological features of interest in tomograms, Young et al. recently demonstrated that 1.4 nm Au clusters are sufficiently non-intrusive to support sub-tomogram averaging of ribosomes and nucleosomes [195]. Combined with ever-improving technologies to raise custom binders, e.g., nanobodies [196], the ability to tag any cellular component is now theoretically within reach, although delivery of abiotic nanoparticles into cells remains non-trivial. Reported strategies include permeabilisation of cell membranes using streptolysin O [195] or the uptake of BSA-conjugated Au via the endo-lysosomal pathway [197]. Both approaches are likely limited by toxicity and can access only certain cellular compartments, restricting their general applicability.
Ultimately, the development of universal genetically encodable tags is likely to prove transformational for cryo-ET, similar to how fluorescent proteins revolutionised light microscopy. Encodable metallothionein [198] or ferritin-based [199,200] tags, fused to proteins of interest, have been demonstrated to improve cryo-ET contrast via their ability to sequester multiple transition metal ions with high affinity. However, biological systems studied using these tags must be enriched with relevant metals to enable loading, causing potential issues with toxicity or incompatibility. DNA origami labels have also shown promise as distinctive markers, which can readily be targeted to proteins of interest via hybridisation to a suitable RNA aptamer [201]. Although these DNA labels are not expressible in a strict biological sense and are limited to use on cell surfaces, the use of distinctive nucleic acid shapes highlights a potentially powerful strategy that may be further developed. Another elegant approach was recently reported by Fung et al., employing genetically encoded encapsulin multimers that conditionally bind GFP in the presence of a rapamycin analogue [202]. Use of these ‘GEM-tags’ drives colocalisation of a 25–45 nm icosahedral particle next to GFP-labelled proteins of interest, supporting CLEM strategies, as well as facile target detection in tomograms by template-picking [203]. The main drawback of the GEM system appears to be the multimeric nature of encapsulin itself, which may induce multimerisation of targets, perturbing them from their native locations.
Temporal limitations
The dynamic nature of glycosylation represents another significant hurdle for cryo-ET, which can only provide static snapshots of vitrified biological samples. Thus, while many relevant enzyme–substrate and enzyme–enzyme interactions may be identified from ER-Golgi tomograms, understanding how these link together over the course of glycan biosynthesis requires methods such as super-resolution microscopy, which has already delivered important insights into Golgi dynamics [31,32,35]. CLEM approaches that combine cryo-ET with live-cell imaging can provide powerful additional context that enriches both techniques, enabling molecular-level characterisations of key events to be placed within their correct temporal order [204]. One potential strategy to achieve this goal is the coupling of (super-resolution) light microscopy platforms with rapid cryo-fixation, enabling vitrification at exact time points of interest identified by fluorescence, supporting subsequent high-resolution dissection by cryo-ET [205].
Outlook
Although studies of glycosylation to date have largely focussed on the mechanisms of individual enzymes, the frontier increasingly lies in resolving how these enzymes operate together within their native environments. The ability of cryo-ET to visualise in situ biology at high resolution renders it a powerful tool to study multicomponent systems such as enzyme networks. Despite this promise, the study of carbohydrate regulation by cryo-ET remains in its infancy, largely owing to the difficulties of identifying small enzymes and dynamic processes within the ER-Golgi network. Most cryo-ET studies of the ER-Golgi to date have examined either aspects of organellar or vesicular morphology [175,186-188], or larger entities such as ribosomes on ER membranes [180]. While underlying technical challenges are likely to persist for some time, the advent of improved imaging and processing methods, coupled with labelling and CLEM strategies, represents an exciting development for the future. Resolving dynamic patterns of enzyme organisation within the ER-Golgi by live-cell and cryo-ET imaging, even at modest resolutions, will shed considerable light on how these organelles operate to regulate complex glycans. In time, elucidating in situ structures will also reveal the molecular nature of kin interactions between enzymes, allowing us to see how their mechanisms fit within the context of broader functional assemblies.
The rich history of structural glycobiology has provided the field with many powerful insights into the function of carbohydrate-processing enzymes. We envision that the advent of in situ methods will enable classical insights to be placed within the complex nature of the cell, bringing our understanding of carbohydrate regulation from the molecular scale to organelles and beyond.
Perspectives
Complex carbohydrates represent some of the most important biological macromolecules across life. Precise enzymatic regulation of carbohydrates is a critical process that underpins myriad biological functions, with an impact across health and disease.
Over the past decades, the mechanisms of carbohydrate processing enzymes have stimulated intense discussion and investigation. The identification of conserved molecular mechanisms underpinning carbohydrate construction and deconstruction, by glycosyltransferases and glycoside hydrolases respectively, highlights the fruits of such efforts.
Complex carbohydrates are constructed by networks of enzymes with no direct template. The frontier increasingly lies in understanding how these many enzymes coordinate within a complex intracellular milieu in a fashion that enables control over their products.
Conflicts of Interest
The authors declare no competing interests.
Funding
The Rosalind Franklin Institute is funded by UK Research and Innovation through the Engineering and Physical Sciences Research Council. L.W. and C.M.W. are supported by a Wellcome Trust Sir Henry Dale Fellowship (218579/Z/19/Z to L.W.).
CRediT Author Contribution
Conceptualization - L.W.; Writing - C.J.M.W., M.A.L. and L.W.; Figures - M.A.L. and L.W.
Abbreviations
- CAZymes: Carbohydrate Active enZymes
- CLEM: Correlated Light and Electron Microscopy
- FIB: focussed ion beam
- FIB-SEM: FIB milling coupled with scanning electron microscopy
- GHs: glycoside hydrolases
- GTs: glycosyltransferases
- HS: heparan sulfates
- TEM: transmission electron microscopy
- TM: transmembrane
- TRAP: translocon-associated protein
- cryo-ET: cryo-electron tomography