Architectural proteins play an important role in compacting and organizing the chromosomal DNA in all three kingdoms of life (Eukarya, Bacteria and Archaea). These proteins are generally not conserved at the amino acid sequence level, but the mechanisms by which they modulate the genome do seem to be functionally conserved across kingdoms. On a generic level, architectural proteins can be classified based on their structural effect as DNA benders, DNA bridgers or DNA wrappers. Although chromatin organization in archaea has not been studied extensively, quite a number of architectural proteins have been identified. In the present paper, we summarize the knowledge currently available on these proteins in Crenarchaea. In terms of the types of architectural proteins present, the crenarchaeal nucleoid shows similarities with that of Bacteria. It relies on the action of a large set of small, abundant and generally basic proteins to compact and organize the genome and to modulate its activity.
Organisms of all three kingdoms of life (Bacteria, Archaea and Eukarya) need to organize and compact their genomic DNA to reduce its dimensions below that of a cell or a nucleus. Yet the genetic material needs to be flexibly and dynamically organized to permit DNA-based processes such as transcription, replication and repair.
Genome compaction in archaea is currently poorly understood. Archaeal genomes are in the order of megabases in size, comparable with those of bacteria, and archaeal cells share the micrometre-sized dimensions of bacterial cells. Archaea compact their genomic DNA into a structure generally referred to as the nucleoid, which is not membrane-enclosed. Compaction is achieved by a diverse set of architectural proteins associated with the nucleoid. None of these proteins is conserved among all archaeal species, and most of the ones identified to date have a limited phylogenetic distribution (Figure 1).
Figure 1. Phylogenetic distribution of the main chromatin proteins in Archaea
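To put the scale of the compaction problem in numbers, the following sketch estimates the contour length of a megabase-sized genome and the size of the unconstrained random coil it would form. The genome size (2 Mbp), the helical rise of B-form DNA (0.34 nm per bp) and the persistence length (50 nm) are illustrative textbook values assumed here, not figures taken from the studies reviewed, and the function names are hypothetical. The chromosome is treated as a linear ideal chain, which ignores circularity and excluded volume.

```python
import math

BP_RISE_NM = 0.34             # assumed rise per base pair of B-form DNA (nm)
PERSISTENCE_LENGTH_NM = 50.0  # assumed persistence length of bare DNA (nm)


def contour_length_um(genome_bp: int) -> float:
    """Contour length of the fully stretched genome in micrometres."""
    return genome_bp * BP_RISE_NM / 1000.0


def coil_radius_um(genome_bp: int) -> float:
    """Radius of gyration of an ideal worm-like chain, Rg = sqrt(L * lp / 3)."""
    length_nm = genome_bp * BP_RISE_NM
    return math.sqrt(length_nm * PERSISTENCE_LENGTH_NM / 3.0) / 1000.0


if __name__ == "__main__":
    genome_bp = 2_000_000  # illustrative genome size, not a measured value
    print(f"contour length: {contour_length_um(genome_bp):.0f} um")
    print(f"random-coil radius of gyration: {coil_radius_um(genome_bp):.1f} um")
```

Under these assumptions, both the contour length (hundreds of micrometres) and the radius of the unconstrained coil (a few micrometres) exceed the micrometre-sized dimensions of the cell, which illustrates why compaction by architectural proteins is required.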
The two main archaeal phyla, Euryarchaea and Crenarchaea, employ mechanistically different approaches to organize and compact their genomes. Euryarchaea synthesize histone homologues, which, by wrapping DNA, form nucleosomal beads-on-a-string-like structures in vitro. Therefore compaction of the genome in species of this phylum is considered to be similar to that in eukaryotes. In eukaryotes, these nucleosomal filaments arrange into 30 nm fibres, which are subject to further higher-order compaction. Crenarchaea do not synthesize histone homologues (with rare exceptions), but instead employ a wide array of small NAPs (nucleoid-associated proteins). Genome organization in these species is considered to be similar to that of bacteria. In bacteria, NAPs organize the genome predominantly by bending or bridging DNA. The DNA-bridging activity is likely to be involved in forming higher-order looped structures representing topologically isolated domains. The DNA within these domains is additionally compacted because of the presence of supercoils. Hardly anything is known about higher-order organization in archaea. However, considering the functional conservation of chromatin proteins with homologues from both bacteria and eukaryotes, archaeal genome organization is likely to show similarities in higher-order structures as well.
Architectural proteins in general modulate the spatial path of the DNA by wrapping, bridging or bending it. These properties, rather than sequence homology, appear to be conserved among kingdoms. Not only do the architectural properties of these proteins explain compaction, they also provide a framework for understanding how different proteins can work together in providing a dynamic genome. The delicate interplay between architectural proteins (e.g. related to their differential expression in different conditions) permits them to modulate the degree of compaction and thereby regulate the accessibility of the genomic DNA for different DNA-based processes.
A few words on euryarchaeal histones
Although a detailed treatment of euryarchaeal nucleoid organization and compaction is beyond the scope of the present review, some key aspects are given in this section. This information will aid the reader in grasping the differences within the archaeal kingdom and the similarities of the euryarchaeal phylum to eukaryotes in genome compaction. Euryarchaea encode two homologues of eukaryotic core histones. Whereas eukaryotic histones assemble as octamers, archaeal histones arrange themselves into tetrameric structures, composed of two dimers. The tetramer assembled on the DNA wraps approx. 90 bp of DNA around its surface, in contrast with the ~150 bp found in eukaryotic nucleosomes. Archaeal histones lack the N- and C-terminal tails that are subject to post-translational modifications in their eukaryotic counterparts [7,8]. Such modifications are key to modulating chromatin compaction in eukaryotes in relation to gene expression. In Euryarchaea, the same could be accomplished by varying the stoichiometry of the two expressed histone proteins in the tetramer. As the composition of the tetramers affects the DNA-binding affinity, varying expression levels of individual histone monomers provides a mechanism to regulate transcription. Indeed, in some species, the expression level of the different histone proteins is dependent on growth conditions.
NAPs in Crenarchaea
Crenarchaea encode a variety of small, abundant and generally basic architectural proteins that modulate their genome. Although these proteins are not universally conserved among all Crenarchaea, each species contains at least two NAPs (generally a bender and a bridger) to jointly compact the genomic DNA and modulate the accessibility of the genome. In this section, we give an overview of the proteins identified and, if known, of their architectural and biochemical properties. Unfortunately, data regarding the in vivo functionality of these proteins are scarce.
The most widely distributed NAP in Crenarchaea is Alba (acetylation lowers binding affinity). It has been identified in all crenarchaeal species sequenced so far. In addition, homologues of the Alba family are found in most euryarchaeal species. This protein was originally identified in Sulfolobus acidocaldarius and Sulfolobus solfataricus as Sac10b and Sso10b respectively, but was later renamed Alba. Alba is a small (10 kDa) basic protein that is highly abundant in these species, comprising ~5% of cellular proteins, which corresponds roughly to one Alba dimer per 5 bp. Alba exists as a dimer (Figure 2A) in solution and binds co-operatively without apparent specificity to DNA. Models of Alba binding to DNA predict interaction of the body of the dimer with the major groove, flanked by additional interactions of the individual β-hairpins of the monomers with the minor groove. EM (electron microscopy) studies have shown the formation of two structurally different types of Alba–DNA complexes depending on the concentration of the protein. At low and intermediate concentrations, Alba bridges DNA duplexes, resulting in compaction because of the formation of looped structures. At high protein concentrations, the protein coats DNA fully, giving rise to stiff filaments [12,13], which, at first sight, do not seem to be involved in compaction.
Figure 2. Structures of crenarchaeal nucleoid-associated proteins
Binding of Alba to DNA can be modulated by its acetylation/deacetylation at a single residue, Lys16, situated at the DNA-binding surface (hence the name Alba). Acetylation of Alba by the acetyltransferase Pat, which reduces the positive charge of the binding surface, decreases the binding affinity to DNA approx. 30-fold [11,15]. The action of Sir2 leads to deacetylation of this residue and restores the DNA-binding affinity of Alba. It has been suggested that binding of Alba represses transcription. Switching of the acetylation status at Lys16 could provide a mechanism to regulate the binding affinity of the protein and thus its ability to repress transcription. Indeed, acetylation of Alba relieves repression of transcription and deacetylation restores it in vitro [14,15]. In addition to hindering transcription, binding of Alba to DNA probably affects other DNA-based processes. For instance, Alba possibly interferes with replication. It has been shown that progression of the helicase MCM (minichromosome maintenance) is impeded on Alba–DNA complexes in vitro. In this case too, acetylation of Alba reduces its binding to DNA and therefore allows MCM to unwind DNA. The switching of the acetylation status of Alba thus emerges as an important mechanism in modulating DNA accessibility.
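The consequence of a 30-fold change in affinity for DNA occupancy can be illustrated with a simple binding isotherm. The sketch below is a deliberately minimal model that assumes independent, non-cooperative binding to identical sites, which Alba does not strictly obey given its co-operative binding. The dissociation constant and protein concentration are hypothetical placeholders, the function names are illustrative, and only the 30-fold factor comes from the text.

```python
ACETYLATION_FOLD_CHANGE = 30.0  # approx. fold decrease in affinity (from the text)


def fractional_occupancy(protein_nm: float, kd_nm: float) -> float:
    """Fraction of DNA sites bound for simple non-cooperative binding."""
    return protein_nm / (protein_nm + kd_nm)


def occupancy_after_acetylation(protein_nm: float, kd_nm: float) -> float:
    """Occupancy when acetylation raises the apparent Kd 30-fold."""
    return fractional_occupancy(protein_nm, kd_nm * ACETYLATION_FOLD_CHANGE)


if __name__ == "__main__":
    kd_nm = 100.0       # hypothetical Kd of deacetylated Alba (nM)
    protein_nm = 100.0  # hypothetical free protein concentration (nM)
    before = fractional_occupancy(protein_nm, kd_nm)
    after = occupancy_after_acetylation(protein_nm, kd_nm)
    print(f"deacetylated: {before:.2f}, acetylated: {after:.3f}")
```

In this simplified picture, a protein concentration that half-saturates the DNA in the deacetylated state leaves only a few per cent of sites occupied after acetylation, which is consistent with the relief of repression described above.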
A number of archaeal species encode two Alba homologues. For instance, S. solfataricus expresses Alba1, which is highly conserved, and Alba2, which is less conserved. Alba2 is expressed at levels 20-fold lower than those of Alba1. The exact role of Alba2 is not known, but the data currently available suggest that it modulates the activity of Alba1. Under physiological conditions, Alba2 is found exclusively as a heterodimer with Alba1. These heterodimers only form bridged DNA–protein complexes, and even at high concentrations do not assemble into the coated DNA filaments seen with Alba1 homodimers. These differences in DNA binding observed for the Alba heterodimer and homodimer can probably be attributed to changes in dimer–dimer interactions. The likely interaction surface is highly conserved in Alba1, but not in Alba2. Similar to the effects seen with Alba2, a substitution in the Alba1 dimer–dimer interface (Phe60 to Ala) leads to a decrease in effective DNA-binding affinity. These results underline the important role of dimer–dimer interactions in the formation of densely packed protein–DNA filaments. Changing the ratio of Alba1 to Alba2 might be another mechanism, in addition to Alba1 acetylation, by which to regulate gene expression.
Several Alba homologues that lack the key residue in modulating Alba binding to DNA (Lys16) have been identified. Ape10b2 from Aeropyrum pernix and Mma10b from the euryarchaeal Methanococcus maripaludis are the only ones characterized [18,19]. Owing to the absence of Lys16, the DNA-binding affinity of these Alba homologues cannot be modulated by changing the acetylation status at this site. Interestingly, Mma10b exhibits a preference for sequence-specific binding to DNA; its binding site corresponds to an AT-rich palindrome of 18 bp. It is less abundant (~0.01% of total cellular proteins) than other architectural proteins and has a low affinity for generic DNA. ChIP-chip (chromatin immunoprecipitation on chip) studies have shown that Mma10b-binding sites are located within genes. Deletion mutants of Mma10b in this species show up- or down-regulation of some 15 genes, similar to Mvo10b-deletion mutants in Methanococcus voltae. The relatively low expression levels of Mma10b compared with Alba, together with its sequence-specific binding, suggest that Mma10b has evolved towards a more specific role in transcription regulation, deviating from the primary role of Alba proteins in genome compaction.
Sac10a homologues are widely distributed among Crenarchaea and are also encoded in some euryarchaeal species. Similarly to Alba, this protein has been shown to facilitate DNA bridging in EM studies. Little is known about Sac10a at the biochemical level and about how it contributes to genome compaction. However, some characteristic features are evident in the structure of the protein. The protein exists in solution as a homodimer and its dimerization occurs via an antiparallel coiled coil (Figure 2B). The DNA-binding winged helix structures are on opposite sides of the coiled coil [21,22]. This arrangement of the dimer probably facilitates the DNA bridging observed.
Sulfolobus species express a number of highly conserved architectural proteins of ~7 kDa, which are collectively referred to as Sul7 proteins. Every species encodes one to three genes (resulting from gene duplications) for Sul7. Sul7 proteins are expressed at high levels (up to 5% of the total cellular protein). They are highly basic and bind to DNA without apparent sequence specificity.
Sul7 is a simple protein, consisting of two antiparallel β-sheets and a C-terminal α-helix. It exists as a monomer in solution and the monomer is also the form in which it binds to DNA (Figure 2C). Binding of Sul7 to DNA is facilitated by insertion of one of the β-sheets into the minor groove with concomitant intercalation of Val26 and Met29. The binding of the protein results in a DNA bend of approx. 66° [23,24]. The binding of Sul7 also affects local DNA topology, as the distortion of the DNA results in the constraint of negative supercoiling.
Sul7 is subject to post-translational modifications. It is methylated on several lysine residues. Unlike the acetylation of Alba, these modifications do not change the DNA-binding affinity of the protein. Instead, methylation might help to protect the protein from thermal denaturation, as Sul7 methylation is found to increase after heat shock. Sul7 can also protect DNA from thermal denaturation, as DNA–Sul7 complexes have an increased melting temperature compared with bare DNA. This protection is relevant in the light of the hyperthermophilic natural habitat of Sulfolobus species.
Some additional functions have been proposed for Sul7. In addition to protecting itself from denaturation, Sul7 has been shown to ensure integrity and activity of other proteins in vitro. Sul7 is able to disaggregate proteins that have been denatured by high temperatures in an ATP-hydrolysis-dependent manner [28,29]. Sul7 has also been shown to be able to repair UV-induced damage. Upon exposure to UV, a conserved tryptophan residue at the DNA-binding interface is oxidized and acts as an electron donor. This electron can subsequently be transferred to a thymidine dimer, which leads to the reversal and thus repair of the thymidine dimer. It remains to be shown whether either of these activities is relevant in vivo.
Cren7 is an architectural protein found in almost all crenarchaeal species. Interestingly, Cren7 is absent only from the few crenarchaeal species that encode histone proteins (e.g. Thermofilum pendens [31,32]). It was discovered only recently as a gene product co-purifying with Sul7. The two proteins are completely different at the level of amino acid sequence, yet very similar in structure and, not unexpectedly, in biochemical properties. Like many other architectural proteins, Cren7 is abundantly expressed (1% of cellular protein) and has no apparent sequence specificity. Since Cren7 is highly conserved in Crenarchaea, it is likely to play a key role in chromatin organization and gene regulation.
As mentioned, the structure of Cren7 resembles that of Sul7. It contains two antiparallel β-sheets and an extended flexible loop (Figure 2D). The major differences between Cren7 and Sul7 are the C-terminal α-helix unique to Sul7 and the aforementioned loop present only in Cren7.
Cren7 binds to DNA through one of the β-sheets and the flexible loop. Its DNA binding is similar to that of Sul7, with the β-sheet inserted into the minor groove and Leu28 and Val36 intercalating. As a consequence, binding of Cren7 to DNA induces a bend of approx. 53°. Binding of Cren7 also constrains negative supercoils, and does so twice as efficiently as Sul7.
CC1 (crenarchaeal chromatin protein 1)
CC1 is yet another small (6 kDa) basic DNA-binding protein. It was discovered in a search for single-stranded-DNA-binding proteins in archaeal species and found to bind single-stranded and double-stranded DNA with equal affinities. CC1 and its homologues have been found in a limited number of species: Thermoproteus tenax, Pyrobaculum aerophilum and Aeropyrum pernix. It exists in solution and binds to DNA as a monomer. The only structural information available is that CC1 is rich in β-sheet structure. It is unknown how this protein binds to DNA, but it constrains negative supercoiling and increases the melting temperature of duplex DNA.
It is fair to say that our understanding of the archaeal nucleoid is limited. It is based mostly on data from in vitro experiments and extrapolation from our knowledge of the bacterial nucleoid and eukaryotic chromatin. Nevertheless, taken together, this information provides a good framework for building structural models of archaeal nucleoid organization. Currently, such models cannot go much further than incorporating the activities of individual NAPs on DNA. Evidently, there is a need for more information regarding the action of these proteins in vivo, in vitro and in silico. The questions that await an answer in the near future relate to the interplay of different NAPs in modulating genome structure and accessibility, and the possible existence of higher-order structures.
A thorough understanding can only be reached if an integrated approach is taken where knowledge obtained at different scales is combined. With the advent of single-molecule techniques, such as optical or magnetic tweezers, it is now possible to identify the general architectural properties of a chromatin protein, such as bending, bridging or wrapping the DNA, in a straightforward manner. For instance, the force–extension curve of a protein–DNA complex (the typical outcome of a micromanipulation experiment) provides a ‘signature’ of such generic properties (Figure 3) [36–38]. In addition, the data obtained from such studies can give detailed quantitative kinetic information about protein–DNA interactions. A first level of added complexity, and the first step to understanding in vivo genome dynamics, is to study the joint action of multiple architectural proteins using these methods. The detailed knowledge obtained at the single-molecule level alone is not sufficient. It is essential to obtain direct information on the situation in vivo. This can be achieved by using microarray or deep-sequencing-based techniques that probe genome-wide binding and gene activity. Modern high-resolution imaging techniques can yield important information on nucleoid structure at the single-cell level. Evidently, each of these techniques relies heavily on the availability of tools to manipulate the genetic material in the archaeal species of interest. Structural and kinetic information, integrated with two-dimensional and three-dimensional position information of (sets of) architectural proteins, will form the basis of accurate predictive models of spatial genome organization.
Figure 3. Simulations of force–extension curves: signatures of DNA–protein complexes corresponding to ‘model representatives’ from the three classes of architectural proteins
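As a rough illustration of what such a force–extension ‘signature’ looks like, the sketch below evaluates the Marko–Siggia worm-like-chain interpolation formula, a standard description of bare double-stranded DNA that is not taken from the present paper. Modelling a DNA bender as a reduced apparent persistence length and a stiff protein–DNA filament as an increased one is a common simplification assumed here. The persistence-length values and function name are hypothetical, and bridging or wrapping proteins require additional terms that this sketch does not include.

```python
KT_PN_NM = 4.11  # thermal energy at about 298 K, in pN*nm


def wlc_force_pn(relative_extension: float, persistence_length_nm: float) -> float:
    """Marko-Siggia interpolation for the worm-like chain.

    relative_extension is the end-to-end extension divided by the contour
    length and must lie in [0, 1). Returns the stretching force in pN.
    """
    if not 0.0 <= relative_extension < 1.0:
        raise ValueError("relative extension must be in [0, 1)")
    x = relative_extension
    return (KT_PN_NM / persistence_length_nm) * (
        0.25 / (1.0 - x) ** 2 - 0.25 + x
    )


if __name__ == "__main__":
    # Hypothetical apparent persistence lengths (nm), for illustration only.
    scenarios = {
        "bare DNA": 50.0,
        "bender-bound DNA": 25.0,
        "stiff filament": 100.0,
    }
    for label, lp_nm in scenarios.items():
        forces = [wlc_force_pn(step / 10, lp_nm) for step in range(1, 10)]
        print(label, [f"{force:.2f}" for force in forces])
```

In this simplified model, a lower apparent persistence length shifts the curve to higher forces at a given extension, whereas a stiffer filament shifts it to lower forces; such shifts are the kind of generic signature that a micromanipulation experiment can reveal.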
Molecular Biology of Archaea II: A Biochemical Society Focused Meeting held at Robinson College, Cambridge, U.K., 16–18 August 2010. Organized and Edited by Stephen Bell (Oxford, U.K.) and Finn Werner (University College London, U.K.).
We are grateful to Mariliis Tark-Dame and Nora Goosen for comments and a critical reading of the paper. We apologize to those people whose interesting publications could not be cited owing to the defined scope of this work and size constraints.
This work was financially supported by NWO (Netherlands Organization for Scientific Research) through a Vidi grant to R.T.D. [grant number 864.08.001].