Trypanosomatids are protozoan parasites that cause human and animal disease. Trypanosoma brucei telomeric ESs (expression sites) contain genes that are critical for parasite survival in the bloodstream, including the VSG (variant surface glycoprotein) genes, used for antigenic variation, and the SRA (serum-resistance-associated) gene, which confers resistance to lysis by human serum. In addition, ESs contain ESAGs (expression-site-associated genes), whose functions, with few exceptions, have remained elusive. A bioinformatic analysis of the ESAG5 gene of T. brucei showed that it encodes a protein with two BPI (bactericidal/permeability-increasing protein)/LBP (lipopolysaccharide-binding protein)/PLUNC (palate, lung and nasal epithelium clone)-like domains and that it belongs to a multigene family termed (GR)ESAG5 (gene related to ESAG5). Members of this family are found in varying copy numbers in different members of the Trypanosomatidae family. T. brucei has an expanded repertoire, with multiple ESAG5 copies and at least five GRESAG5 genes. In contrast, parasites of the genus Leishmania, which are intracellular, have only a single GRESAG5 gene. Although the amino acid sequence identity between (GR)ESAG5 gene products of different species is as low as 15–25%, the BPI/LBP/PLUNC-like domain organization and the length of the proteins are highly conserved, and the proteins are predicted to be membrane-anchored or secreted. Current work focuses on the elucidation of possible roles for this gene family in infection. This is likely to provide novel insights into the evolution of the BPI/LBP/PLUNC-like domains.
Protists of the family Trypanosomatidae represent one of the most divergent groups of eukaryotes and include several clinically relevant pathogens transmitted by insect vectors. Trypanosoma brucei (Figure 1A) causes the cattle disease nagana and the subspecies T. b. gambiense and T. b. rhodesiense cause HAT (human African trypanosomiasis) (also known as sleeping sickness) in geographically distinct regions of sub-Saharan Africa. Trypanosomatid genomes encode a family of proteins with BPI (bactericidal/permeability-increasing protein)/LBP [LPS (lipopolysaccharide)-binding protein]/PLUNC (palate, lung and nasal epithelium clone)-like domains. This gene family was termed (GR)ESAG5 [gene related to ESAG5 (expression-site-associated gene 5)]. To date, the only functionally characterized proteins with BPI/LBP/PLUNC-like domains are from metazoans, where they play important and diverse roles in lipid transfer and immune defence [4–6]. Protein sequences with homology to BPI/LBP/PLUNC-like domains are, however, present in a broader range of eukaryotic taxa (E. Gluenz, A.R. Barker and K. Gull, unpublished work). Elucidating the function of the (GR)ESAG5 proteins in trypanosomes is of interest for two reasons: (i) information about the biochemistry of these proteins in this protist may provide insights into the ancestral functions of the BPI/LBP/PLUNC-like domain and the evolution of the superfamily; and (ii) the founding member of the (GR)ESAG5 family, ESAG5, is located in tightly regulated polycistronic transcription units at telomeric ESs (expression sites) associated with immune evasion mechanisms. In the present review, we discuss recent progress in the characterization of the (GR)ESAG5 family in the context of the biology of the bloodstream-form parasite and its interactions with its host environment.
ESAG5 is located in T. brucei telomeric ESs
Telomere-associated contingency genes in African trypanosomes
In the human bloodstream, trypanosomes encounter potentially destructive serum proteins, most importantly immunoglobulins and the trypanolytic factor (discussed below). A number of important contingency genes located at telomeres provide the parasite with means of escaping these host defences.
African trypanosomes evade the adaptive antibody-mediated immune response by a mechanism of antigenic variation of their surface coat. The surface of the T. brucei bloodstream-form parasite is fully covered in a coat of 5×10⁶ dimers of a single VSG (variant surface glycoprotein). There are at least 1500 VSG genes in the genome, and periodic switching to a different VSG, at a rate of between 10⁻² and 10⁻⁷ per cell per generation, allows a small proportion of parasites to escape detection by the antibody response mounted against the predominant VSG type in the population. Multiple levels of regulation, including nuclear positioning and epigenetic factors, control monoallelic expression of only one VSG from a single ES. Successive expansions of parasite populations carrying a different VSG type on their surface give rise to the characteristic waves of parasitaemia first described in a human patient a century ago. If untreated, African sleeping sickness is fatal.
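To give a feel for what the quoted switching rates (between 10⁻² and 10⁻⁷ per cell per generation) mean at the population level, the arithmetic can be sketched as follows. This is a minimal illustration, not a model from the literature: it assumes that cells switch independently of one another, and the population size of 10⁶ cells is a hypothetical value chosen only for the example.

```python
import math


def expected_switchers(population: float, switch_rate: float) -> float:
    """Expected number of cells that switch VSG in one generation."""
    return population * switch_rate


def prob_at_least_one_switch(population: float, switch_rate: float) -> float:
    """Probability that at least one cell switches in one generation.

    Assumes independent switching, i.e. 1 - (1 - r)^N, evaluated with
    log1p/expm1 so that very small rates do not lose precision.
    """
    return -math.expm1(population * math.log1p(-switch_rate))


if __name__ == "__main__":
    population = 1e6  # hypothetical population size, for illustration only
    for rate in (1e-2, 1e-7):  # the two ends of the range quoted in the text
        mean = expected_switchers(population, rate)
        p_any = prob_at_least_one_switch(population, rate)
        print(f"rate={rate:g}: expected switchers={mean:g}, P(>=1 switch)={p_any:.4f}")
```

Under these assumptions, the upper end of the range produces many switched cells in every generation, whereas at the lower end a switch in any given generation is an infrequent event for a population of this size.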
Humans and some non-human primates are resistant to T. b. brucei infections because their serum contains a potent TLF (trypanolytic factor) (reviewed in [15,16]). TLF is associated with the densest subfraction of HDL (high-density lipoprotein) particles (HDL3) containing APOA1 (apolipoprotein A1), APOL1 (apolipoprotein L1) and HPR (haptoglobin-related protein). These HDL particles are taken up by the trypanosome through receptor-mediated endocytosis, via the haptoglobin–haemoglobin receptor, TbHpHbR. Within the acidic environment of the lysosome, the HDL particle dissociates and a conformational change in APOL1 causes its N-terminal domain to form an anion-selective channel in the lysosome membrane, leading to swelling of the lysosome and eventual trypanosome lysis [18,19]. Expression of the SRA (serum-resistance-associated) protein renders the human-infective subspecies T. b. rhodesiense resistant to TLF lysis; SRA binds to APOL1 within the lysosome, thereby preventing its insertion into the membrane [18,19].
Active VSG genes and the SRA gene (which is thought to have evolved from a VSG gene) are exclusively expressed from telomeric ESs (Figure 1B). At least 14 ESs have been fully sequenced in the most widely used laboratory strain of T. brucei (Lister 427) [22,23]. Each ES contains one telomere-proximal VSG and several ESAGs, which form a polycistronic transcription unit of 40–70 kb transcribed from an RNA Pol I (polymerase I) promoter. ESAG5 is one of 12 distinct ESAGs that have been identified to date. Most of the different ESs in T. b. brucei 427 contain polymorphic variants of most of the known ESAGs in a largely conserved order [22,23]. Most ESAGs are predicted to encode membrane-associated or secreted proteins with possible roles in host–parasite interactions, but their specific functions have remained frustratingly elusive [9,24]. The best-characterized ESAGs are ESAG6 and ESAG7, which encode the two subunits of a heterodimeric transferrin receptor. Limited information is available about the function of most of the other ESAGs: ESAG4 encodes a receptor-type adenylate cyclase [26,27], ESAG8 encodes a nucleolar protein, ESAG9 encodes an N-glycosylated secreted protein up-regulated in the short stumpy bloodstream form, and ESAG10 encodes a protein homologous with the Leishmania biopterin transporter BT1.
(GR)ESAG5s belong to a superfamily of proteins with LBP/BPI/PLUNC-like domains
Bioinformatic characterization of ESAG5 showed that it belongs to a large and diverse multigene family termed (GR)ESAG5, comprising a set of genes from different trypanosomatid species. All of the predicted (GR)ESAG5 proteins are of similar length (~480 amino acids) with a predicted N-terminal signal peptide. Classification of (GR)ESAGs as BPI/LBP/PLUNC-like domain proteins was based on sequence homology, established through sensitive iterative BLAST searches and hidden Markov models. This was independently confirmed in a cluster analysis of a wide range of BPI/LBP/PLUNC-like domain homologues. Prediction of two-dimensional (GR)ESAG5 structures showed high similarity to human BPI, and homology modelling using the BPI crystal structure showed that ESAG5 could adopt a similar three-dimensional structure, the characteristic quasi-symmetric fold consisting of two barrels linked by a β-sheet.
The (GR)ESAG5 family in trypanosomatids
With recent advances in the sequencing of trypanosomatid genomes and addition of the important and previously underrepresented telomeric sequences of T. brucei, we can begin to build a fuller picture of the complexities of the (GR)ESAG5 family, which comprises the following three groups of genes.
(i) ESAG5 genes
ESAG5 genes are found in ESs of T. brucei and closely related species of African trypanosomes. TAR (transformation-associated recombination) cloning of T. b. brucei, T. b. gambiense and T. equiperdum ESs allowed for the first time a characterization of their full repertoire of ESAG sequences, yielding at least 13 ESAG5 genes in T. b. brucei 427, 23 in T. b. brucei EATRO 2340, 14 in T. b. gambiense and 13 in T. equiperdum. In phylogenetic analyses of (GR)ESAG5 sequences, the ESAG5s form a distinct clade with two main groups [3,32]. At the level of protein sequence, ESAG5 sequences are >80% identical. Estimation of rates of non-synonymous and synonymous substitutions provided evidence for adaptive evolution of ESAG5 genes, and 34 positively selected codons were found distributed across the protein. The C-terminal domains tend to exhibit slightly more variability than the N-terminal domains, but variations of single amino acid residues were found throughout primary sequences of ESAG5.
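Identity figures such as the >80% quoted for ESAG5 protein sequences are derived from pairwise alignments. The sketch below shows one common way of computing percent identity from two sequences that have already been aligned. It is an illustration only: the text does not state which definition of identity was used, the choice to ignore gap-containing columns is an assumption, and the short sequences in the usage example are hypothetical toy strings, not ESAG5 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence has a gap are skipped (an assumed
    convention; other definitions divide by the full alignment length or
    by the length of the shorter sequence).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap column: not counted
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    print(percent_identity("MKVLA", "MKILA"))
```

Because the result depends on the denominator chosen, identity values from different studies are only directly comparable when the same convention and alignment method were used.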
The reason for the location of ESAGs in ESs has remained a mystery. ES location might be expected to confer bloodstream-stage-specific expression and these sites are subject to frequent recombination; whether these factors are important for the function of ESAGs is unknown. ESAG5 transcripts were detected in both procyclic and bloodstream-form parasites in a recent large-scale microarray-based analysis of gene expression across developmental stages of T. brucei. The functional significance of this is not known, and it remains to be shown whether ESAG5 protein expression is stage-specific. The attractive hypothesis that a large and diverse ESAG repertoire may be adaptive to a large host range has remained controversial, and recent comparisons between T. b. brucei, T. b. gambiense and T. equiperdum did not show a positive correlation between host range and ESAG sequence diversity. The R-ES of T. b. rhodesiense, which contains the SRA gene that enables survival in human serum, is the only known example where adaptation to a specific host is evident. Compared with the canonical T. b. brucei ES, the T. b. rhodesiense R-ES is truncated, containing, in addition to the SRA and VSG genes, only ESAG5, ESAG6 and ESAG7 (Figure 1C). In the study of T. brucei 427 ESs, ESAG5, ESAG6 and ESAG7 were shown to belong to one of three linkage blocks of ESAGs that have only infrequently been split by recombination, and the linkage of ESAG5 to SRA is conserved in T. b. rhodesiense isolates from geographically distinct locations.
(ii) GRESAG5 genes in chromosome-internal positions in the T. brucei genome
Several genes related to ESAG5 were identified outside ESs, in chromosome-internal positions (genes Tb927.2.1920, Tb927.4.810, Tb927.5.340, Tb927.7.6860, Tb09.244.2120, Tb09.v4.0016 and Tb09.v4.0016). Transcripts for these T. brucei GRESAG5 genes were detected by RT (reverse transcription)–PCR in procyclic (insect-stage) and bloodstream forms (A.R. Barker, E. Gluenz and K. Gull, unpublished work). Quantitative mRNA expression profiling studies showed modest (no more than 2-fold) down-regulation of Tb927.2.1920 and Tb927.4.810 during the differentiation from bloodstream to procyclic forms, whereas Tb927.5.340 showed the reverse pattern [33,35,36]. Trypanosomatids control gene expression post-transcriptionally, and therefore developmental regulation of the GRESAG5 genes needs to be verified by measurement of protein levels.
(iii) GRESAG5 genes in chromosome-internal positions in other trypanosomatids
GRESAG5 genes have now been identified in every trypanosomatid genome that has been sequenced to date. This includes T. brucei, two other species of African trypanosomes, T. congolense and T. vivax, the American trypanosome T. cruzi and the Leishmania species L. major, L. infantum, L. mexicana and L. braziliensis. African trypanosomes have the largest repertoire of (GR)ESAG5 genes. Leishmania species and T. cruzi have only a single GRESAG5 gene per haploid genome (genes Tc00.1047053508257.220 and LmjF34.3930). There is considerable conservation of synteny between the genomes of T. brucei, T. cruzi and L. major, and the T. cruzi and Leishmania GRESAG5 genes are in a genomic location syntenic with Tb927.4.810.
Is ESAG5 required for parasite survival in the human bloodstream?
The conservation of ESAG5 in the R-ES of T. b. rhodesiense and its homology with proteins known to interact with human HDL and LDL (low-density lipoprotein) particles make this the most interesting gene within the (GR)ESAG5 family. Does ESAG5 support parasite survival in the human bloodstream, and, if so, by what mechanism? Insight into the biochemical properties of ESAG5 may come from application of methods that defined the functions of well-characterized homologues. Mammalian CETP (cholesteryl ester-transfer protein) is a key component of reverse cholesterol transport mediating bidirectional transfer of cholesteryl esters and triacylglycerols among plasma lipoproteins [6,37], whereas PLTP (plasma lipid-transfer protein) mediates phospholipid transfer and HDL conversion. Lipid-binding activity has also been demonstrated for mammalian BPI and LBP, which function in antimicrobial defence and immune regulation. BPI is produced by myeloid precursors of polymorphonuclear leucocytes and has strong binding affinity for the lipid A portion of LPS. It exhibits selective toxicity for Gram-negative bacteria, neutralizes endotoxic activity and has opsonizing activity. LBP is an acute-phase plasma protein, produced by hepatocytes, which mediates transfer of LPS to MD2-CD14 and binds to a variety of amphiphilic bacterial compounds in addition to LPS. On the basis of sequence homology with these lipid-transfer proteins, one would predict a lipid-binding or -transfer function for the (GR)ESAG5s, yet what specific biological functions they might serve and whether putative ligands are of host or parasite origin are entirely open questions.
Lipid metabolism in trypanosomatids has been studied extensively, and the experimental evidence combined with genome analysis shows that all major phospholipid classes are present in trypanosomes, as is the capability for de novo synthesis of all phospholipids and glycolipids required. Lipids scavenged from the environment provide components for de novo lipid biosynthesis, and serum lipoproteins are essential for growth of T. brucei bloodstream forms in axenic culture. Uptake of plasma proteins in T. brucei bloodstream forms occurs via receptor-mediated endocytosis. Bloodstream forms acquire cholesterol from LDL particles, and a candidate LDL receptor has been characterized biochemically [43,44]. Biochemical studies also suggest the existence of another lipoprotein receptor that mediates uptake of HDL, LDL and TLF, but the genes encoding these receptors have not been identified. HDLs are taken up via the TbHpHbR, which was shown to mediate delivery of the trypanolytic APOL1 protein into the trypanosome. It is noteworthy that human PLTP, a regulator of phospholipid transfer and HDL conversion, was recently isolated in a complex with 28 proteins implicated in immunity and inflammation, including APOA1 and APOL1. Future biochemical studies combined with genetic manipulation of trypanosomes will reveal whether (GR)ESAG5 proteins are surface-exposed or secreted proteins, as predicted by protein sequence analysis, and whether ESAG5 plays a role in the survival of T. b. rhodesiense in the human bloodstream. We raise the idea that (GR)ESAG5 proteins may be intimately linked to the mechanism of serum component recognition, uptake and (in some species) resistance in trypanosomes. The new telomere sequence data obtained by TAR cloning [22,23] will greatly facilitate the design and interpretation of relevant experiments.
Testing whether ESAG5 or any of the GRESAG5 proteins could interact with host lipoprotein particles in a manner similar to PLTP might be a first step towards establishing whether they play any role in lipid scavenging, intracellular lipid metabolism or modulation of host innate immunity.
In many organisms, functional studies of BPI/LBP/PLUNC-like domain proteins are complicated by the presence of multiple copies of a gene or of paralogues that may compensate for the loss of function of a single gene. Investigation of ESAG5 function faces a similar challenge. However, studying the function of the related GRESAG5 proteins in organisms with a single family member, such as Leishmania, might provide new insights into the function of ESAG5 in T. brucei. Finally, dissection of the evolutionary relationships between proteins with a BPI/LBP/PLUNC-like domain and measurement of biochemical activities of hitherto uncharacterized family members will no doubt contribute to a deeper understanding of their biology from trypanosomes to humans.
Proteins with a BPI/LBP/PLUNC-Like Domain: Revisiting the Old and Characterizing the New: A Biochemical Society Focused Meeting held at New Business School, University of Nottingham, U.K., 5–7 January 2011. Organized and Edited by Colin Bingle (Sheffield, U.K.) and Sven-Ulrik Gorr (University of Minnesota School of Dentistry, Minneapolis, MN, U.S.A.).
GRESAG5, gene related to ESAG5
PLTP, plasma lipid-transfer protein
PLUNC, palate, lung and nasal epithelium clone
Pol I, polymerase I
TbHpHbR, Trypanosoma brucei haptoglobin–haemoglobin receptor
VSG, variant surface glycoprotein
We thank Samantha Griffiths for the image in Figure 1(A) and Athina Paterou for comments on the paper.
Work in K.G.'s laboratory is supported by the Wellcome Trust and EP Abraham Trust. A.R.B. was the recipient of a studentship from the Biotechnology and Biological Sciences Research Council.