Archaeal viruses, or archaeoviruses, display a wide range of virion morphotypes. Whereas the majority of those morphotypes are unique to archaeal viruses, some are more widely distributed across different cellular domains. Tailed double-stranded DNA archaeoviruses are remarkably similar to viruses of the same morphology (order Caudovirales) that infect many bacterial hosts. They have, so far, only been found in one phylum of the archaea, the Euryarchaeota, which has led to controversial hypotheses about their origin. In the present paper, we describe the identification and analysis of a putative provirus present in the genome of a mesophilic thaumarchaeon. We show that the provirus is related to tailed bacterial and euryarchaeal viruses and encodes a full complement of proteins that are required to build a tailed virion. The recently discovered wide distribution of tailed viruses in Euryarchaeota and the identification of a related provirus in Thaumarchaeota, an archaeal phylum which might have branched off before the separation of Crenarchaeota and Euryarchaeota, suggest that an association of these viruses with Archaea might be more ancient than previously anticipated.

Introduction

Viruses of Archaea, the third domain of life, exhibit an impressive range of diversity [1,2]. On the basis of their various morphologies and genomes, these archaeoviruses have been classified into many different, often novel, families [1,3]. Their unique nature has inspired speculation on the evolution and diversification of viruses in general [1,4]. Whereas many viruses isolated from thermophilic archaea are not reminiscent of any known bacterial or eukaryotic virus, some morphotypes of archaeoviruses are found in the other two domains. In particular, tailed ds (double-stranded) DNA archaeoviruses display morphologies that are indistinguishable from the well-characterized head-and-tail viruses infecting bacteria [5–7]. Until recently, these tailed viruses were found to be associated with archaeal species belonging to only two classes of the phylum Euryarchaeota. This was in contrast with the dominance of tailed viruses in bacteria. The narrow phylogenetic distribution as well as the relatively high sequence similarity to tailed bacterial viruses led to the suggestion that these archaeoviruses emerged in archaea as a result of a recent interdomain transfer from bacteria [1]. However, with the accumulation of complete archaeal genome sequences, it has recently become clear that tailed viruses are associated with organisms from the majority of euryarchaeal classes and that they co-evolved with their hosts [8].

Since tailed dsDNA viruses have only been found in Euryarchaeota, but not in the second major phylum of archaea, the Crenarchaeota, the study of archaeoviruses from additional phyla might provide clues about the ancestry of tailed viruses. Recently, a novel archaeal phylum, the Thaumarchaeota, has been proposed on the basis of comparative genomics and phylogenomic analyses of three genomes of ammonia-oxidizing archaea residing in marine and thermophilic environments [9,10]. They are representatives of an abundant and widespread group of organisms in terrestrial and marine environments [11,12]. Although these ammonia-oxidizers were long considered to be affiliated with Crenarchaeota, the detailed analysis of information-processing genes revealed that they constitute a separate phylum that might have emerged before the divergence of Crenarchaeota and Euryarchaeota [9,10]. Interestingly, no proviruses can be identified in the two available thaumarchaeal genomes. In the present paper, we describe the first archaeal provirus identified in the genome of a thaumarchaeote that has recently been obtained in laboratory culture. Its structure and similarity to known tailed bacterial and archaeal viruses shed light on the evolution and distribution of these entities in the three domains of life.

A draft genome sequence of the ammonia-oxidizing thaumarchaeon Candidatus ‘Nitrososphaera viennensis’ strain EN76 (M. Tourna, M. Stieglmeier, A. Spang, T. Rattei and C. Schleper, unpublished work) was obtained by 454 pyrosequencing (Genome Sequencer FLX system with GS FLX Titanium Methods, Roche) and subsequent assembly [13]. The investigation of this draft genome sequence of Ca. ‘Nitrososphaera viennensis’ revealed a contig of ~24 kb encoding a relatively high proportion (9/30; 30%) of putative proteins with counterparts in bacterial and archaeal viruses (Figure 1A and see Supplementary Table S1 at http://www.biochemsoctrans.org/bst/039/bst0390082add.htm). This suggested that the respective region might represent a provirus integrated into the host chromosome, which we designate as Nvie-Pro1.

Figure 1
Nvie-Pro1 is related to tailed bacterial and archaeal viruses

(A) Genomic alignment of the putative thaumarchaeal provirus Nvie-Pro1 (bottom map) with the bacterial myovirus Mu (upper map). For convenience, only the morphogenetic module of Mu is shown (GenBank® accession number: AF083977; nucleotide co-ordinates 12316–33502). Open reading frames of Nvie-Pro1 that are present in other tailed bacterial and archaeal viruses are in blue, and those that are considered to be of cellular origin are in green. Functionally equivalent genes of myovirus Mu and Nvie-Pro1 are connected via shadings. MTP, major tail protein; TMP, tape-measure protein. (B) Multiple sequence alignment of the conserved motifs characteristic of the large subunit of the terminase complex. The ATPase and the nuclease domains are boxed. The groups of bacterial, archaeal and eukaryotic (pro)viruses are indicated with letters B, A and E respectively. GenBank® identifiers and abbreviations: T4 (Enterobacterial myovirus T4; GI:9632591), S-PM2 (Synechococcus myovirus S-PM2; GI:58532911), P74–26 (Thermus siphovirus P74–26; GI:157265496), phiEa21–4 (Erwinia myovirus phiEa21–4; GI:219681305), phiE12–2 (Burkholderia myovirus phiE12–2; GI:134288710), Nvie-Pro1 (Ca. ‘Nitrososphaera viennensis’ provirus Nvie-Pro1), ψM2 (Methanobacterium siphovirus ψM2; GI:3249594), Mace-Pro1 (Methanosarcina provirus Mace-Pro1; GI:20092622), HF1 (halobacterial myovirus HF1; GI:32453919), Mvul-Pro1 (Methanocaldococcus provirus Mvul-Pro1; GI:261402679), HSV-1 (herpes simplex virus 1; GI:9629397), HCMV (human cytomegalovirus; GI:270355841), VZV (varicella zoster virus; GI:9625919), HHV-6 (human herpesvirus 6; GI:9633135), EBV (Epstein–Barr virus; GI:23893636). The alignment was constructed using PROMALS3D [36] and MUSCLE [37] and then adjusted manually.

The large subunit of the terminase

Nvie-Pro1 was found to encode a homologue of the large subunit of the terminase (TerL) (protein Nvie-2; Supplementary Table S1). TerL is one of the hallmark proteins exclusively encoded by tailed bacterial and euryarchaeal dsDNA viruses of the order Caudovirales as well as eukaryotic herpesviruses [8,14]. TerL proteins are composed of two functionally distinct domains: the ATPase domain and the nuclease domain [14]. The ATPase domain powers the translocation of the viral genomic DNA into empty procapsids, whereas the nuclease domain is responsible for cutting the concatemeric viral DNA into genome-length units. The two domains display a set of conserved motifs [14,15]. Alignment of the putative TerL from Nvie-Pro1 with homologues encoded by diverse bacterial, euryarchaeal and eukaryotic viruses showed that all of the motifs characteristic of TerL proteins are also conserved in the proviral sequence (Figure 1B and see Supplementary Figure S1 at http://www.biochemsoctrans.org/bst/039/bst0390082add.htm).

Additionally, a BLAST search against the environmental sequence database at NCBI revealed several TerL homologues in the marine metagenome. Notably, the Nvie-Pro1 TerL displays considerably higher sequence identity with proteins from the environmental database (e.g. ECU80075, 45% identity over 362 amino acids) than it does with proteins from known bacterial and euryarchaeal (pro)viruses (Supplementary Table S1). It is therefore possible that marine thaumarchaea are also infected by head-and-tail viruses, although it cannot be excluded that these homologues represent marine bacterial (pro)viruses.
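To make the identity figure quoted above concrete, the following is a minimal sketch of how percentage identity can be computed from a gapped pairwise alignment. It assumes one common convention (identical columns divided by the total number of alignment columns, gaps included); the toy sequences are hypothetical and are not fragments of any real TerL protein.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return identical columns as a percentage of all alignment columns.

    Both inputs are rows of the same pairwise alignment, with '-' for gaps.
    A column counts as identical only if both residues match and neither
    is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical toy alignment, for illustration only.
toy_a = "MKT-LLVAG"
toy_b = "MRTQLL-AG"
toy_identity = percent_identity(toy_a, toy_b)

# 45% identity over 362 aligned positions corresponds to roughly this many
# identical residues.
approx_identical = round(0.45 * 362)
```

Under this convention, the reported 45% over 362 amino acids amounts to about 163 identical positions.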

A multifunctional MCP (major capsid protein)

In tailed dsDNA viruses and herpesviruses, virion assembly starts with a scaffolding-protein-dependent construction of an empty procapsid which subsequently undergoes maturation by proteolytic cleavage of the scaffolding protein [16]. The protease responsible for this maturation step is usually encoded immediately upstream of the genes for the scaffolding protein and the MCP [17]. The majority of tailed dsDNA viruses of the order Caudovirales and herpesviruses encode a specific capsid maturation protease, which is structurally distinct from the known cellular proteases [18,19]. Interestingly, the herpesvirus-like protease genes have been found to be rather frequently displaced in tailed bacterial viruses by genes encoding ClpP-like serine proteases, while their location in the viral genome and their role in the capsid maturation process are preserved [19]. Besides the TerL homologue, Nvie-Pro1 also encodes a putative MCP related to those encoded by viruses of the order Caudovirales and herpesviruses [8,20,21], another hallmark protein unique to this viral lineage [22]. This putative MCP (Nvie-8; Supplementary Table S1) is 576 amino acids long and appears to be composed of at least two distinct domains.

Protease domain

The N-terminal part of the protein (residues 1–202) shares significant sequence similarity with chymotrypsin-like serine proteases (Figure 2). Notably, the three residues constituting the catalytic triad of chymotrypsin-like proteases are perfectly conserved in the Nvie-Pro1 protein (His44, Asp81 and Ser178; Figure 2). The position of the nvie-8 gene in the proviral genome as well as its fusion to the gene for the MCP (see below) strongly suggest that it might have been involved in the capsid maturation of the virus that gave rise to Nvie-Pro1. Notably, even though chymotrypsin-, ClpP- and herpesvirus-like proteases all belong to the serine protease superfamily, they have distinct structural folds and are believed to have originated independently [23]. To the best of our knowledge, Nvie-8 represents the first example of a chymotrypsin-like maturation protease encoded by a (putative) head-and-tail (pro)virus.

Figure 2
Domain organization of the putative protease–MCP encoded by Nvie-Pro1

The protease, scaffolding and MCP domains are in blue, grey and green respectively. Yellow bars denote the predicted catalytic triad residues characteristic of chymotrypsin-like serine proteases. Sequence alignment of the N-terminal domain of the putative Nvie-Pro1 protease–MCP fusion with chymotrypsin-like proteases is shown below the schematic diagram. Aligned chymotrypsin-like proteases are designated by their PDB codes: 3RP2 (rat mast cell protease II), 2RDL (hamster chymase-2) and 3FZZ (granzyme C). Amino acid residues constituting the catalytic triad are indicated with red arrowheads. The alignment was constructed using PROMALS3D [36] and then adjusted manually. Above the MCP domain of Nvie-Pro1 is a three-dimensional model of the indicated protein region. The model is based on the X-ray structure of the MCP gp5 of siphovirus HK97 (PDB code 3E8K). The modelled structure is coloured according to the sequence similarity (BLOSUM30 matrix) to the corresponding protein of HK97.

MCP domain

The C-terminal half of the protein is occupied by a putative MCP domain. The identity of this domain could not be deduced using conventional BLAST searches. We therefore exploited a more sensitive structural-fold-recognition-based approach for distant homology prediction. For this purpose, the protein sequence was submitted to the Structure Prediction Meta Server [24]. Using this approach, the C-terminal domain (residues 283–576) was recognized as a homologue of the MCP of siphovirus HK97 (PDB code 3E8K) with a highly significant score of 177.75 (scores above 50 are considered to be significant) [24]. In order to verify the validity of this prediction, we performed a homology-based structural modelling experiment. The three-dimensional model of the putative MCP of Nvie-Pro1 was generated with the MODELLER program [25] using the X-ray structure of the MCP gp5 of HK97 [21] as a template (Figure 2). The stereochemical quality of the model was then assessed using ProSA-web [26] and compared with that of the template X-ray structure. The ProSA-web quality score (Z) for the Nvie-Pro1 model (Z=−5.24) was similar to that calculated for the template structure (Z=−5.88) and was well within the score range calculated for other experimentally determined structures, which is from −2 to −11.2 for proteins of ~300 amino acids in length [26]. The good quality of the model indicates that the C-terminal half of the protein Nvie-8 can adopt the HK97 gp5-like topology without extensively violating the known protein folding rules and is therefore likely to represent the MCP of Nvie-Pro1. Using a similar bioinformatic approach, we have shown previously that euryarchaeal tailed viruses also utilize the HK97-like structural fold for capsid construction [8]. The observation that Nvie-Pro1 encodes both the TerL homologue and the HK97-like MCP strongly suggests that this thaumarchaeal provirus is related to tailed bacterial and euryarchaeal viruses.
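The two acceptance criteria used in this section (a fold-recognition score above the significance threshold, and a ProSA-web Z-score inside the range seen for experimentally determined structures of similar size) can be written down as a short check. This is only an illustrative sketch: the numbers are those quoted in the text, whereas the function and variable names are assumptions introduced here and do not belong to the Meta Server or to ProSA-web.

```python
# Thresholds as quoted in the text (names are illustrative assumptions).
FOLD_SCORE_THRESHOLD = 50.0      # scores above this are considered significant
PROSA_Z_RANGE = (-11.2, -2.0)    # range for structures of ~300 amino acids


def fold_hit_is_significant(score: float,
                            threshold: float = FOLD_SCORE_THRESHOLD) -> bool:
    """True if a fold-recognition score exceeds the significance threshold."""
    return score > threshold


def z_score_in_range(z: float, z_range=PROSA_Z_RANGE) -> bool:
    """True if a Z-score lies within the range of experimental structures."""
    low, high = z_range
    return low <= z <= high


# Values reported for the Nvie-8 C-terminal domain and its HK97 template.
nvie_fold_score = 177.75
model_z = -5.24
template_z = -5.88

summary = {
    "fold_hit_significant": fold_hit_is_significant(nvie_fold_score),
    "model_z_in_range": z_score_in_range(model_z),
    "template_z_in_range": z_score_in_range(template_z),
    "z_difference": round(abs(model_z - template_z), 2),
}
```

With the reported values, all three checks pass and the model and template Z-scores differ by 0.64.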

Putative scaffolding domain

The scaffolding protein of bacterial tailed viruses is often fused to either the maturation protease (e.g. myovirus P2) [27] or the MCP, as in the case of siphovirus HK97 [28]. The presence of the protease and the MCP domains in the same polypeptide is infrequent, but not unprecedented. A similar domain organization has recently been reported for the protein encoded by bacterial virus Gifsy-2 [29]. In the latter case, however, the protease domain is related to ClpP-like serine proteases. Interestingly, the region between the protease and MCP domains in the Gifsy-2 protein was suggested to play the role of a scaffolding protein. It is therefore tempting to speculate that the linker region (residues 202–282) between the N-terminal protease domain and the C-terminal MCP domain in Nvie-8 (Figure 2) might also perform a scaffolding function.
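The proposed three-part layout of Nvie-8 can be summarized as a small data structure. The sketch below uses only the residue ranges and catalytic triad positions quoted in the text; the class and helper names are assumptions made for illustration. Note that the ranges as quoted share residue 202 between the protease domain and the linker.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    """A protein region given by 1-based, inclusive residue coordinates."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


NVIE8_LENGTH = 576
NVIE8_DOMAINS = [
    Domain("protease", 1, 202),
    Domain("linker", 202, 282),   # putative scaffolding region
    Domain("MCP", 283, 576),
]
CATALYTIC_TRIAD = {"His": 44, "Asp": 81, "Ser": 178}


def shared_residues(a: Domain, b: Domain) -> int:
    """Number of residues that belong to both regions."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


domain_lengths = {d.name: d.length for d in NVIE8_DOMAINS}
protease = NVIE8_DOMAINS[0]
triad_in_protease = all(
    protease.start <= pos <= protease.end for pos in CATALYTIC_TRIAD.values()
)
covers_whole_protein = (
    NVIE8_DOMAINS[0].start == 1 and NVIE8_DOMAINS[-1].end == NVIE8_LENGTH
)
```

On these coordinates the protease, linker and MCP regions span 202, 81 and 294 residues respectively, and all three triad residues fall inside the protease domain.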

Genome synteny with tailed bacterial and euryarchaeal viruses

We have defined previously the set of genes conserved in tailed (pro)viruses infecting bacteria and euryarchaea [8]. Careful examination of the putative gene product sequences of Nvie-Pro1 revealed that the provirus encodes an entire protein complement required to build a functional head-and-tail virion (see Supplementary Table S1). We were able to identify genes for capsid assembly (MCP, portal) and maturation (putative prohead protease and a homologue of the Mu protein gpG), genome packaging (TerL) as well as tail formation (major tail protein, tail tape measure protein, baseplate and tail fibres). The sequence similarity of these gene products to their counterparts in bacterial and archaeal (pro)viruses was in the range 23–35% (see Supplementary Table S1). The low pairwise sequence similarity indicates that Nvie-Pro1 is not closely related to any tailed (pro)viruses characterized to date. Nevertheless, the organization of these morphogenetic genes is remarkably syntenic when compared with those of tailed bacterial viruses (Figure 1A). These observations suggest that the virus at the origin of Nvie-Pro1 provirus also relied on similar strategies for capsid and tail assembly, maturation and genome packaging as tailed dsDNA viruses infecting bacteria and euryarchaea.
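One simple way to quantify the kind of gene-order conservation described above is to take the longest common subsequence of the shared functional labels in two genomes. The sketch below is an illustration under stated assumptions, not the procedure used to produce Figure 1A: the two gene orders are hypothetical toy examples, each label is assumed to occur once per genome, and the function names are introduced here.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two sequences of labels."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Trace back through the table to recover the subsequence itself.
    result = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


def synteny_fraction(order_a, order_b):
    """Fraction of shared genes that keep the same relative order."""
    shared = set(order_a) & set(order_b)
    if not shared:
        return 0.0
    a = [g for g in order_a if g in shared]
    b = [g for g in order_b if g in shared]
    return len(longest_common_subsequence(a, b)) / len(shared)


# Hypothetical gene orders, for illustration only.
toy_order_x = ["terminase", "portal", "protease", "capsid",
               "tail tube", "tape measure", "baseplate"]
toy_order_y = ["terminase", "portal", "capsid", "protease",
               "tail tube", "hypothetical", "tape measure", "baseplate"]
toy_synteny = synteny_fraction(toy_order_x, toy_order_y)
```

A value of 1.0 means that every shared gene appears in the same relative order in both genomes; in the toy example a single swap of two neighbouring genes lowers the value to 6/7.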

Defective or not?

In addition to the morphogenetic module necessary for building a head-and-tail virion, tailed euryarchaeal viruses usually encode modules for genome replication and, in the case of temperate viruses, also modules for integration into the host chromosome [8]. However, not all euryarchaeal tailed viruses encode apparent genome-replication proteins. For example, the complete genome sequence of siphovirus ψM2, infecting the euryarchaeon Methanothermobacter marburgensis, did not reveal any candidate proteins for genome replication [5]. Similarly, we were not able to identify any putative genome-replication module in Nvie-Pro1.

Tailed dsDNA viruses generally integrate their genomes into the cellular chromosome with the aid of serine or tyrosine recombinases, which are usually encoded by the virus. In fact, all tailed euryarchaeal (pro)viruses for which complete genome sequences are available, even those that are considered to be lytic (e.g. HF1 and HF2) [6], encode identifiable tyrosine recombinases of the phage integrase family [8]. Nvie-Pro1, on the other hand, does not possess an apparent integrase gene. Furthermore, the putative attachment sites (such as direct repeats flanking the provirus), signatures of tyrosine integrase-mediated recombination reaction, could not be identified. Consequently, it is not possible to define the precise borders of Nvie-Pro1. We considered that open reading frames with numerous homologues in other archaea (green arrows in Figure 1A) signify the termini of the provirus.

At the moment, it is not possible to predict with confidence whether the putative provirus is defective or not. On one hand, the absence of directly associated modules for genome replication and integration would argue against the possibility that Nvie-Pro1 is inducible. On the other hand, the lack of apparently disrupted genes and the presence of genes for all major structural proteins suggest that, in principle, Nvie-Pro1 might be capable of producing tailed virions. Yet another possibility is that Nvie-Pro1 represents a GTA (gene-transfer agent) rather than a provirus. GTAs resemble head-and-tail viruses in their appearance, but, unlike viruses, they do not encapsidate the genomic sequence that encodes their virus-like particles. Instead, GTAs carry random cellular DNA and transfer it horizontally from one cell to another [30]. Morphogenetic GTA proteins are encoded on a cellular chromosome where they are under the control of cellular promoters and transcriptional regulators [31]. Nevertheless, the structural proteins are homologous with those of tailed dsDNA viruses [30]. Notably, GTAs were documented not only in bacteria, but also in the methanogenic euryarchaeon Methanococcus voltae [32]. We have identified previously a cryptic provirus in the M. voltae genome and suggested that it might represent the genomic region encoding the GTA [8] observed by Bertani [32]. As in the case of Nvie-Pro1, the cryptic provirus contains all major virion structural-protein-coding genes, but no genes for an integrase or genome-replication proteins.

Evolutionary considerations

Whatever the nature and function of Nvie-Pro1, its evolutionary relationship to tailed viruses of the order Caudovirales is hardly questionable. The identification of this provirus in a thaumarchaeal genome sheds more light on the evolution of tailed archaeal viruses and revives the question of their origin in Archaea. Given the high degree of morphological and genomic similarity between bacterial and archaeal tailed dsDNA viruses [8], the possibility of independent origins for these viruses in the two cellular domains can be ruled out with certainty. Consequently, two alternative routes for the origin of tailed archaeal viruses have been proposed [1]. The first possibility is that the ancestor of tailed viruses predated the divergence of bacteria and archaea and, as cellular organisms diversified into distinct domains of life, tailed viruses co-evolved and diversified with their hosts. The second scenario posits that tailed viruses emerged in archaea as a result of horizontal transfer across the domain boundary from bacteria [1]. The latter hypothesis was in part based on the fact that, at the time, tailed viruses had been isolated only from archaeal species belonging to the classes Halobacteria and Methanobacteria, as opposed to the global distribution of tailed dsDNA viruses in bacteria. The observation that this morphotype corresponded to less than 1% of virus-like particles in a hypersaline environment was also considered to be supporting evidence for the horizontal-transfer hypothesis [33]. However, our recent survey of archaeal proviruses related to tailed dsDNA viruses of the order Caudovirales has indicated that these viruses are also in contact with members of the archaeal classes Methanococci and Methanomicrobia [8], as well as Archaeoglobi (M. Krupovic, unpublished work). It therefore appears that organisms belonging to the majority of classes of the phylum Euryarchaeota possess tailed viruses associated with them.
Furthermore, comparative genome analysis indicated that tailed archaeal (pro)viruses tend to form groups that follow the taxonomic grouping of the cellular organisms that they infect, suggesting co-evolution of these tailed viruses with their hosts [8]. The wide distribution of head-and-tail viruses in Euryarchaeota and the identification of Nvie-Pro1 in Thaumarchaeota, an archaeal phylum which might have branched off before the separation of Crenarchaeota and Euryarchaeota [9,10], suggest that the association of these viruses with archaea might be more ancient than anticipated previously (Figure 3). Although no relatives of Nvie-Pro1 are found in the two available genomes from marine Thaumarchaeota, the presence of several close homologues in the environmental GOS (Global Ocean Sampling Expedition) marine sequence database (Supplementary Table S1) suggests a high and still unexplored diversity.

Figure 3
Distribution of tailed dsDNA viruses in the domain Archaea

Schematic representation of a consensus archaeal phylogeny based on recent phylogenomic analyses [9]. Archaeal phyla are written in capital letters. The eight recognized taxonomic classes of Euryarchaeota are boxed. Tree branches corresponding to the taxonomic units of Archaea in which organisms are known to be infected by tailed viruses or contain related proviruses are shown in black.

Conclusions

It has been suggested previously that structurally related viruses infecting hosts from different domains of life descend from a common ancestor that existed before the divergence of cellular organisms [22,34,35]. Such structurally related viruses were suggested to be grouped into viral lineages, and one of these lineages unites tailed viruses of the order Caudovirales and eukaryotic herpesviruses [20,35]. The identification of Nvie-Pro1 in the thaumarchaeon Ca. ‘Nitrososphaera viennensis’ suggests that tailed viruses might have been present in Archaea from the very emergence of this cellular domain (Figure 3). If this is the case, at the time of the last common ancestor of bacteria and archaea, the population of tailed viruses already consisted of individuals with different tail structures. Indeed, viruses with contractile and non-contractile tails (families Myoviridae and Siphoviridae respectively) have been isolated from both cellular domains [1]. It is obvious that the evolutionary history of tailed viruses in Archaea is far from simple. It appears to consist of an element of vertical descent from a common ancestor with bacterial tailed viruses, but also of horizontal gene exchange between bacterial and archaeal viruses. Multiple instances of interdomain transfer of tailed viruses from bacteria to archaea are highly unlikely, owing to the fundamentally different transcription and replication machineries in the two domains. It is perhaps more reasonable to envisage that transfer of bacterial tailed virus genes has occurred as a result of recombination between archaeal tailed virus genomes and provirus-containing exogenous bacterial DNA, which could have been acquired by archaeal cells from the environment in the course of natural transformation. Obviously, more genomic sequences of tailed (pro)viruses from archaeal species covering a wider phylogenetic range are required in order to understand the relationship of these viruses to their bacterial relatives.

Molecular Biology of Archaea II: A Biochemical Society Focused Meeting held at Robinson College, Cambridge, U.K., 16–18 August 2010. Organized and Edited by Stephen Bell (Oxford, U.K.) and Finn Werner (University College London, U.K.).

Abbreviations

     
  • ds

    double-stranded

  •  
  • GTA

    gene-transfer agent

  •  
  • MCP

    major capsid protein

  •  
  • TerL

    large subunit of the terminase

We thank Marion Engel and Michael Schloter for support in genome sequencing and assembly and Thomas Rattei for bioinformatic help on genome annotation.

Funding

This work was supported by the European Molecular Biology Organization [Long-Term Fellowship ALTF 347–2010 to M.K.] and the Austrian Academy of Sciences (DOC-fForte fellowship to A.S.).

References

1. Prangishvili, D., Forterre, P. and Garrett, R.A. (2006) Viruses of the Archaea: a unifying view. Nat. Rev. Microbiol. 4, 837–848
2. Zillig, W., Prangishvili, D., Schleper, C., Elferink, M., Holz, I., Albers, S., Janekovic, D. and Götz, D. (1996) Viruses, plasmids and other genetic elements of thermophilic and hyperthermophilic Archaea. FEMS Microbiol. Rev. 18, 225–236
3. Prangishvili, D., Garrett, R.A. and Koonin, E.V. (2006) Evolutionary genomics of archaeal viruses: unique viral genomes in the third domain of life. Virus Res. 117, 52–67
4. Forterre, P. and Prangishvili, D. (2009) The great billion-year war between ribosome- and capsid-encoding organisms (cells and viruses) as the major source of evolutionary novelties. Ann. N.Y. Acad. Sci. 1178, 65–77
5. Pfister, P., Wasserfallen, A., Stettler, R. and Leisinger, T. (1998) Molecular analysis of Methanobacterium phage psiM2. Mol. Microbiol. 30, 233–244
6. Tang, S.L., Nuttall, S. and Dyall-Smith, M. (2004) Haloviruses HF1 and HF2: evidence for a recent and large recombination event. J. Bacteriol. 186, 2810–2817
7. Torsvik, T. and Dundas, I.D. (1974) Bacteriophage of Halobacterium salinarium. Nature 248, 680–681
8. Krupovič, M., Forterre, P. and Bamford, D.H. (2010) Comparative analysis of the mosaic genomes of tailed archaeal viruses and proviruses suggests common themes for virion architecture and assembly with tailed viruses of bacteria. J. Mol. Biol. 397, 144–160
9. Brochier-Armanet, C., Boussau, B., Gribaldo, S. and Forterre, P. (2008) Mesophilic crenarchaeota: proposal for a third archaeal phylum, the Thaumarchaeota. Nat. Rev. Microbiol. 6, 245–252
10. Spang, A., Hatzenpichler, R., Brochier-Armanet, C., Rattei, T., Tischler, P., Spieck, E., Streit, W., Stahl, D.A., Wagner, M. and Schleper, C. (2010) Distinct gene set in two different lineages of ammonia-oxidizing archaea supports the phylum Thaumarchaeota. Trends Microbiol. 18, 331–340
11. Karner, M.B., DeLong, E.F. and Karl, D.M. (2001) Archaeal dominance in the mesopelagic zone of the Pacific Ocean. Nature 409, 507–510
12. Leininger, S., Urich, T., Schloter, M., Schwark, L., Qi, J., Nicol, G.W., Prosser, J.I., Schuster, S.C. and Schleper, C. (2006) Archaea predominate among ammonia-oxidizing prokaryotes in soils. Nature 442, 806–809
13. Walter, M.C., Rattei, T., Arnold, R., Güldener, U., Münsterkötter, M., Nenova, K., Kastenmüller, G., Tischler, P., Wölling, A., Volz, A. et al. (2009) PEDANT covers all complete RefSeq genomes. Nucleic Acids Res. 37, D408–D411
14. Rao, V.B. and Feiss, M. (2008) The bacteriophage DNA packaging motor. Annu. Rev. Genet. 42, 647–681
15. Mitchell, M.S., Matsuzaki, S., Imai, S. and Rao, V.B. (2002) Sequence analysis of bacteriophage T4 DNA packaging/terminase genes 16 and 17 reveals a common ATPase center in the large subunit of viral terminases. Nucleic Acids Res. 30, 4009–4021
16. Dokland, T. (1999) Scaffolding proteins and their role in viral assembly. Cell. Mol. Life Sci. 56, 580–603
17. Casjens, S.R. (2005) Comparative genomics and evolution of the tailed-bacteriophages. Curr. Opin. Microbiol. 8, 451–458
18. Chen, P., Tsuge, H., Almassy, R.J., Gribskov, C.L., Katoh, S., Vanderpool, D.L., Margosiak, S.A., Pinko, C., Matthews, D.A. and Kan, C.C. (1996) Structure of the human cytomegalovirus protease catalytic domain reveals a novel serine protease fold and catalytic triad. Cell 86, 835–843
19. Liu, J. and Mushegian, A. (2004) Displacements of prohead protease genes in the late operons of double-stranded-DNA bacteriophages. J. Bacteriol. 186, 4369–4375
20. Baker, M.L., Jiang, W., Rixon, F.J. and Chiu, W. (2005) Common ancestry of herpesviruses and tailed DNA bacteriophages. J. Virol. 79, 14967–14970
21. Wikoff, W.R., Liljas, L., Duda, R.L., Tsuruta, H., Hendrix, R.W. and Johnson, J.E. (2000) Topologically linked protein rings in the bacteriophage HK97 capsid. Science 289, 2129–2133
22. Bamford, D.H., Grimes, J.M. and Stuart, D.I. (2005) What does structure tell us about virus evolution? Curr. Opin. Struct. Biol. 15, 655–663
23. Wang, J., Hartling, J.A. and Flanagan, J.M. (1997) The structure of ClpP at 2.3 Å resolution suggests a model for ATP-dependent proteolysis. Cell 91, 447–456
24. Ginalski, K., Elofsson, A., Fischer, D. and Rychlewski, L. (2003) 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics 19, 1015–1018
25. Marti-Renom, M.A., Stuart, A.C., Fiser, A., Sanchez, R., Melo, F. and Sali, A. (2000) Comparative protein structure modeling of genes and genomes. Annu. Rev. Biophys. Biomol. Struct. 29, 291–325
26. Wiederstein, M. and Sippl, M.J. (2007) ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 35, W407–W410
27. Chang, J.R., Spilman, M.S., Rodenburg, C.M. and Dokland, T. (2009) Functional domains of the bacteriophage P2 scaffolding protein: identification of residues involved in assembly and protease activity. Virology 384, 144–150
28. Duda, R.L., Martincic, K. and Hendrix, R.W. (1995) Genetic basis of bacteriophage HK97 prohead assembly. J. Mol. Biol. 247, 636–647
29. Effantin, G., Figueroa-Bossi, N., Schoehn, G., Bossi, L. and Conway, J.F. (2010) The tripartite capsid gene of Salmonella phage Gifsy-2 yields a capsid assembly pathway engaging features from HK97 and lambda. Virology 402, 355–365
30. Stanton, T.B. (2007) Prophage-like gene transfer agents-novel mechanisms of gene exchange for Methanococcus, Desulfovibrio, Brachyspira, and Rhodobacter species. Anaerobe 13, 43–49
31. Lang, A.S. and Beatty, J.T. (2000) Genetic analysis of a bacterial genetic exchange element: the gene transfer agent of Rhodobacter capsulatus. Proc. Natl. Acad. Sci. U.S.A. 97, 859–864
32. Bertani, G. (1999) Transduction-like gene transfer in the methanogen Methanococcus voltae. J. Bacteriol. 181, 2992–3002
33. Sime-Ngando, T., Lucas, S., Robin, A., Tucker, K.P., Colombet, J., Bettarel, Y., Desmond, E., Gribaldo, S., Forterre, P., Breitbart, M. and Prangishvili, D. (2010) Diversity of virus-host systems in hypersaline Lake Retba, Senegal. Environ. Microbiol. doi:10.1111/j.1462-2920.2010.02323.x
34. Krupovič, M. and Bamford, D.H. (2008) Virus evolution: how far does the double β-barrel viral lineage extend? Nat. Rev. Microbiol. 6, 941–948
35. Krupovič, M. and Bamford, D.H. (2010) Order to the viral universe. J. Virol. 84, 12476–12479
36. Pei, J., Kim, B.H. and Grishin, N.V. (2008) PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res. 36, 2295–2300
37. Edgar, R.C. (2004) MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113