Recombinases of the RecA family are essential for homologous recombination and underpin genome stability, by promoting the repair of double-stranded DNA breaks and the rescue of collapsed DNA replication forks. Until now, our understanding of homologous recombination has relied on studies of bacterial and eukaryotic model organisms. Archaea provide new opportunities to study how recombination operates in a lineage distinct from bacteria and eukaryotes. In the present paper, we focus on RadA, the archaeal RecA family recombinase, and its homologues in archaea and other domains. On the basis of phylogenetic analysis, we propose that a family of archaeal proteins with a single RecA domain, which are currently annotated as KaiC, be renamed aRadC.
Recombinases are enzymes that are essential for HR (homologous recombination), since they catalyse homologous base-pairing and strand exchange (Figure 1). Recombinases are ubiquitous throughout bacteria (RecA), eukaryotes (Rad51) and archaea (RadA). Although they differ in their N- and C-terminal domains (Figure 2), all recombinases possess a well-conserved core region known as the RecA fold. This fold contains the Walker A and B motifs required for ATP binding and hydrolysis. Recombinases form helical filaments on ssDNA (single-stranded DNA), binding three nucleotides per recombinase monomer.
Figure 1. Double-strand break repair model for HR
Figure 2. Conserved structural domains in RecA family proteins
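The Walker A motif mentioned above is commonly summarized by the P-loop consensus G-x(4)-G-K-[S/T]. As a minimal sketch, and not a substitute for profile-based domain searches, the following Python snippet scans a protein sequence for that consensus. The function name `find_walker_a` and the toy sequence are illustrative assumptions, not taken from any real RecA family protein.

```python
import re

# Classical Walker A (P-loop) consensus: G, any four residues, G, K, then S or T.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(protein_seq):
    """Return (start_index, matched_text) for each non-overlapping consensus match."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in WALKER_A.finditer(seq)]


# Hypothetical toy fragment used only to show the call.
toy_fragment = "AAGEFGSGKTQLAA"
print(find_walker_a(toy_fragment))
```

A simple pattern match of this kind will miss divergent motifs and can report chance matches, so a hit should be treated as a prompt for closer inspection of the alignment rather than as evidence of ATPase activity.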
RecA was discovered in 1965 by screening for recombination-deficient Escherichia coli mutants. Mutants of recA were shown to be sensitive to UV irradiation and deficient in homologous recombination. RecA binds to ssDNA in a sequence-non-specific manner, polymerizing in a 5′→3′ orientation to form right-handed helical filaments. A contiguous RecA nucleoprotein filament is required for efficient strand exchange with homologous duplexes, forming regions of triplex DNA. In vivo, RecA-mediated strand exchange requires a minimum of 23–40 bp of homology. ATP increases the binding affinity of RecA for DNA and stabilizes the nucleoprotein filament, but ATP hydrolysis is not required for DNA binding or strand exchange. Instead, it is required for strand exchange to proceed in a unidirectional manner, and for the dissociation of RecA from DNA.
Rad51 was discovered in Saccharomyces cerevisiae, on the basis of sequence homology with E. coli RecA, and has been found subsequently in all sequenced eukaryotic genomes. Yeast Rad51 mutants exhibit defects in HR and meiosis, and are highly sensitive to DNA-damaging agents. Depletion of Rad51 in chicken DT40 cells results in the accumulation of spontaneous chromosomal breaks, and knockout of Rad51 in mice is embryonic lethal.
Rad51 binds DNA in nucleoprotein filaments analogous to those of bacterial RecA. However, Rad51 filaments form with the opposite polarity to RecA filaments, in a 3′→5′ direction. RecA and Rad51 share homology in the core of the protein containing the Walker A and B motifs, but there are key differences elsewhere (Figure 2). RecA has a C-terminal extension that is important for DNA binding, which is absent from Rad51. Conversely, the N-terminus of Rad51, required for DNA binding, does not have a corresponding region in RecA.
RadA is the archaeal homologue of RecA and Rad51. Deletion of Haloferax volcanii radA results in increased sensitivity to chemical mutagens and UV light, and mutant cells are deficient in HR. RadA shows greater similarity to Rad51 than to RecA (∼40 and ∼20% identity respectively), and RadA and Rad51 have strikingly similar structures. For example, the polymerization motif in both RadA and Rad51 features an invariant phenylalanine residue that is inserted into a conserved hydrophobic socket of an adjacent Rad51/RadA monomer (Figure 3). This motif is not seen in bacterial RecA.
Figure 3. Crystal structure of RadA recombinase
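Percentage identity values such as those quoted above for RadA depend on the alignment and on the choice of denominator. The sketch below shows one common convention, counting identical residues over the aligned columns in which neither sequence has a gap. The function name `percent_identity` and the short aligned strings are hypothetical, and the convention is an assumption rather than the method used to obtain the figures in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        # Skip any column where either sequence has a gap.
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical pre-aligned fragments (one gap in the first sequence).
print(percent_identity("GEFG-SGKT", "GEYGASGRT"))
```

Percentage similarity would additionally require a substitution matrix to decide which non-identical residues count as conservative replacements, which is not attempted here.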
RadA from both Crenarchaeota and Euryarchaeota has DNA-dependent ATPase activity, forms nucleoprotein filaments and catalyses homologous pairing and strand exchange [16,17]. Like Rad51 and RecA, RadA forms right-handed filaments on DNA, although, in the absence of DNA, left-handed filaments have been observed. The stoichiometry of RadA nucleoprotein filaments is the same as that of Rad51 and RecA filaments, with one RadA monomer per three nucleotides.
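As a rough illustration of this stoichiometry, the following Python sketch estimates how many recombinase monomers can bind a stretch of ssDNA of a given length. It assumes an idealized, contiguous and gap-free filament; the function name `monomers_for_ssdna` and the constant `NT_PER_MONOMER` are illustrative only.

```python
NT_PER_MONOMER = 3  # one recombinase monomer per three nucleotides, as stated in the text


def monomers_for_ssdna(length_nt):
    """Number of monomers that can fully bind an ssDNA tract of length_nt nucleotides."""
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    # Integer division: a trailing stretch shorter than three nucleotides binds no monomer.
    return length_nt // NT_PER_MONOMER


print(monomers_for_ssdna(90))
```

Real filaments may be discontinuous or compete with ssDNA-binding proteins, so the value is best read as an upper bound for a fully coated tract.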
Many organisms possess paralogues of their respective RecA family recombinases. Although they are functionally different from one another, these paralogues all appear to play a role in HR. In many cases, paralogues function as recombination mediators that help to load their respective recombinase on to ssDNA.
Bacteria contain a RecA paralogue called Sms (or bacterial RadA). Sms shares homology with both RecA and Lon protease and is essentially a RecA domain fused to a zinc ribbon and a predicted serine protease domain. Sms is found only in bacteria, and, although its function is poorly characterized, mutants are sensitive to X-rays, UV and methyl methanesulfonate, suggesting a role in DNA repair. A more recent study has suggested that Sms is required to stabilize the strand-invasion intermediate during HR.
Fungal genomes contain two Rad51 paralogues: Rad55 and Rad57. These two proteins form a stable heterodimeric complex that has been shown to interact with Rad51 and stimulate strand exchange. The Rad55–Rad57 complex acts as a recombination mediator by stabilizing Rad51 nucleoprotein filaments, specifically by overcoming the inhibition of Rad51 ssDNA binding by the ssDNA-binding protein RPA (replication protein A).
Rad51B/Rad51C/Rad51D and XRCC2/XRCC3
Higher eukaryotic genomes encode a total of five Rad51 paralogues, which are distinct from fungal Rad55/57. XRCC2 and XRCC3 were identified on the basis of complementation of X-ray irradiation sensitivity of mutant hamster cell lines, and three further paralogues were identified on the basis of their sequence similarity to Rad51, namely Rad51B, Rad51C and Rad51D. Cell lines mutated for any of the five paralogues are viable, but show a reduction in DNA-damage-induced Rad51 foci and increased sensitivity to DNA-cross-linking agents and ionizing radiation, suggesting a role as Rad51 accessory proteins [27–29].
Caenorhabditis elegans encodes only one Rad51 paralogue, Rfs-1. Rfs-1 is dispensable for Rad51 recruitment to dsDNA (double-stranded DNA) breaks, but is essential for recruitment of Rad51 to replication fork blocks induced by DNA-cross-linking agents. A recent study suggests that Rfs-1 functions in a manner similar to that of Rad51D.
In addition to RadA, the archaea possess a second Rad51 homologue, RadB. This protein was discovered on the basis of sequence similarity to Rad51. RadB is found throughout Euryarchaeota, but is absent from Crenarchaeota. Deletion of radB from Haloferax volcanii results in slow growth, increased sensitivity to UV light and recombination defects, consistent with the role of RadB as a recombination mediator (S. Haldenby, H.P. Ngo and T. Allers, unpublished work). The primary structure of RadB shares considerable similarity to the RecA-like core of Rad51 and RadA, but RadB lacks the N-terminal domain found in RadA and Rad51 (Figure 2). Pyrococcus furiosus RadB binds both ssDNA and dsDNA with a higher affinity than RadA. RadB binds ATP, but has an extremely weak ATPase activity, in contrast with RadA. However, mutation of the Walker A motif of H. volcanii RadB results in a null phenotype, suggesting that ATP binding is required for RadB activity.
Like other Rad51 paralogues, RadB does not catalyse strand exchange and is therefore not a recombinase. Biochemical analyses have suggested that RadB acts as a recombination mediator, analogous to the fungal Rad55–Rad57 complex, since RadA and RadB have been demonstrated to interact in vitro. Yeast two-hybrid and immunoprecipitation analyses have also shown that RadB interacts with Hjc, an archaeal Holliday junction resolvase, and DP1, the proofreading subunit of archaeal DNA polymerase, PolD [35,36]. RadB features a patch of basic residues near its C-terminus (K/RHR motif) that has been implicated in DNA binding. The crystal structure of Thermococcus kodakaraensis RadB has been published, and the protein was shown to crystallize as a dimer.
In addition to RadA and its paralogue RadB, a number of other RecA family proteins are found in archaea (Figure 4). The majority of archaeal RecA family proteins that are neither RadA nor RadB belong to the clade we have termed aRadC (archaeal RadC). These proteins are similar to the KaiC circadian clock protein found in cyanobacteria such as Synechococcus elongatus. KaiC-like proteins are found throughout archaea and bacteria, and occur in two forms. The prototypic KaiC from cyanobacteria is ∼500 amino acids in length and is composed of two RecA-like domains joined head to tail (Figure 2). It is found also in archaea and bacteria that do not possess a circadian rhythm, since they lack the other circadian clock proteins KaiA and KaiB. The second form of KaiC is a ∼250-residue protein that contains a single RecA-like domain. These KaiC-like proteins are exclusive to archaea (with the sole exception of an apparent lateral gene transfer to the thermophilic bacterium Thermotoga maritima) and form a phylogenetic clade that is distinct from the two-domain KaiC proteins (results not shown).
Figure 4. Phylogeny of RecA family proteins from archaea
It has been suggested by Koonin and colleagues that ancestral KaiC was a single-domain protein that was laterally transferred from bacteria to archaea, and that two-domain KaiC originated by gene duplication and fusion within the archaea. Since the single-domain protein is found only in archaea, we respectfully suggest that it is more parsimonious to suppose that KaiC originated in archaea as an ancient paralogue of the RecA/RadA precursor. However, we concur with the proposal that the two-domain KaiC developed in archaea. Apart from the exception of Hyperthermus butylicus, archaeal two-domain KaiC proteins are found only in Euryarchaeota and are particularly common in Haloarchaea, which are prone to frequent lateral gene transfer.
If the archaeal single-domain protein is the ancestral form, then referring to it as KaiC is clearly putting the cart before the horse. It is improbable that this protein functions in circadian rhythm, since there are no photosynthetic archaea. The role of KaiC as a circadian pacemaker is likely to be a derived function, which required the evolution of KaiA and KaiB in the cyanobacterial lineage. This leaves open the question as to the original function of (single-domain) KaiC. Recent work has shown that Sto0579, an archaeal single-domain protein, interacts with both RadA and the ssDNA-binding protein SSB, and helps to overcome the barrier to RadA-catalysed strand exchange due to excess SSB. Furthermore, expression of Sto0579 is induced by UV light, and the equivalent proteins from Sulfolobus solfataricus and Thermoproteus tenax bind tightly to ssDNA [40,41]. These results suggest that archaeal single-domain proteins of the KaiC family function in DNA repair, possibly as recombination mediators similar to RadB. For this reason, we propose that they be renamed aRadC (archaeal RadC). We are aware that an unrelated protein found in bacteria is annotated as RadC; however, it has been suggested that this name is misleading since it is unclear whether it functions in DNA repair.
Other archaeal paralogues
In the crenarchaeal genera Pyrobaculum and Thermoproteus, there is a protein with significant homology with RadA (26% identity, 43% similarity). Since it contains the N-terminal domain found in RadA (and eukaryotic Rad51), we have termed it RadA2 (Figure 4). However, RadA2 proteins do not have the conserved RadA/Rad51 polymerization motif shown in Figure 3, suggesting that they do not function as recombinases. Interestingly, species that contain RadA2 are found in a distinct early-branching clade of the RadA tree, suggesting that radA gene duplication occurred early in crenarchaeal evolution.
Sulfolobales contain RecA family proteins that cannot be placed reliably on a dendrogram with the other major groups shown in Figure 4. Three of these proteins, which include Sso0777, are closely related (43% identity, indicated in light blue on Figure 4). All three are comparatively small (∼180 residues), since they are truncated at the C-terminus (relative to RadA). In their core domain they show limited similarity to bacterial Sms, a paralogue of RecA.
Sto2522 and Sto0838 from Sulfolobus tokodaii are more problematic, since in our phylogenetic analysis neither protein shows consistent similarity to aRadC (or any other RecA family protein). Inspection of the core domain sequence of Sto2522 and Sto0838 suggests that they do not belong to the aRadC family, but might instead be distant relatives.
Structural analysis of the archaeal recombinase RadA has proven invaluable for studies on eukaryotic Rad51, and work on the RadA paralogue RadB has the potential to provide novel insights into recombination mediator proteins. However, the function of archaeal RecA family proteins that are neither RadA nor RadB, which we have termed aRadC, remains a mystery. Biochemical analysis of aRadC proteins is in its infancy, and genetic analysis is required urgently to confirm whether they act in HR and DNA repair.
We thank the Royal Society [grant number 516002.KS687], Biotechnology and Biological Sciences Research Council [grant number BB/C501641/1] and The Leverhulme Trust [grant number F/00 114/AT] for support.
Molecular Biology of Archaea: Biochemical Society Focused Meeting held at University of St Andrews, U.K., 19–21 August 2008. Organized and Edited by Stephen Bell (Oxford, U.K.) and Malcolm White (St Andrews, U.K.).