Branched DNA structures that occur during DNA repair and recombination must be efficiently processed by structure-specific endonucleases in order to avoid cell death. In the present paper, we summarize our screen for new interaction partners for the archaeal replication clamp that led to the functional characterization of a novel endonuclease family, dubbed NucS. Structural analyses of Pyrococcus abyssi NucS revealed an unexpected binding site for ssDNA (single-stranded DNA) that directs, together with the replication clamp, the nuclease activity of this protein towards ssDNA–dsDNA (double-stranded DNA) junctions. Our studies suggest that understanding the detailed architecture and dynamic behaviour of the NucS (nuclease specific for ssDNA)–PCNA (proliferating-cell nuclear antigen) complex with DNA will be crucial for identification of its physiologically relevant activities.
Branched DNA structures are formed by the direct action of DNA-damaging agents or occur as intermediates during DNA replication, repair and recombination. Since even a single DNA structure containing unpaired 5′- or 3′-extremities can signal cell death, these branched structures are highly toxic. Thus several molecular machines have evolved independently in the three domains of life to efficiently recognize and repair these abnormal DNA molecules to ensure faithful duplication of genetic material before cell division.
In Archaea, the best characterized nucleases that recognize and cleave DNA structures carrying 5′- and/or 3′-flaps and/or splayed arm substrates without apparent sequence specificity are Fen-1 (flap endonuclease 1) [1,2] and XPF (xeroderma pigmentosum complementation group F) [3–5] family nucleases (Figure 1). Although Eukarya contain many Fen-1 and XPF family members, archaeal genomes have revealed the presence of a single homologue. Archaeal Fen-1 is thought to participate in archaeal DNA metabolism by eliminating the primers of Okazaki fragments during DNA replication or the damaged DNA during repair processes [2,6–8]. Recent genetic studies in yeast have also suggested that archaeal Fen-1 proteins may participate in 5′-end processing during base excision repair. The preferential substrate for Fen-1 contains a single unpaired 3′-nucleotide [3′-flap (indicated with the filled oval in Figure 1)] and a longer region of 5′-ssDNA (single-stranded DNA). This DNA structure is presumably created by the strand-displacement activity of DNA polymerases. The structural and site-directed mutagenesis analyses of Fen-1 proteins have revealed that the unpaired 3′-flap binds to the extrahelical pocket that is evolutionarily conserved in archaeal and human Fen-1 proteins. This binding event is thought to regulate DNA substrate specificity by opening and kinking the DNA. Fen-1 can generate nicked DNA structures that can be directly ligated to form a continuous DNA strand. It is important to note that the cleavage activity of archaeal Fen-1 proteins is also regulated through highly specific interactions with the sliding clamp PCNA (proliferating-cell nuclear antigen), as exemplified by the observation that PCNA can stimulate Fen-1 activity by increasing the affinity of the enzyme for its substrates.
Indeed, this functional interaction is mediated by the so-called PIP-motif (PCNA-interacting peptide motif), a relatively short peptide motif found in a large number of PCNA-interaction partners [13,14].
Figure 1. Branched DNA structures processed by the XPF, Fen-1 and NucS families of nucleases
XPF-family nucleases [e.g. XPF–ERCC1 (excision repair cross-complementing 1) and Mus81–Eme1 complexes in Eukarya] play important roles in the repair of DNA damage caused by UV light or DNA cross-linking agents by acting on a range of 3′-flap or forked DNA structures. Archaea contain a single family member, but the domain structures of crenarchaeal and euryarchaeal XPF proteins differ drastically [15,16]. Whereas the crenarchaeota Sulfolobus solfataricus and Aeropyrum pernix carry a ‘short-form XPF’ homologue consisting only of a nuclease domain, hyperthermophilic Pyrococcus species and other euryarchaeota contain a ‘long-form XPF’ dubbed Hef [3,18,19]. Hef proteins have a functional N-terminal helicase domain, followed by the nuclease domain. In Sulfolobales, the heterotrimeric replication clamp PCNA activates XPF by increasing its catalytic rate (kcat) by four orders of magnitude. Interestingly, archaeal XPF homologues may have rather broad substrate specificities when compared with eukaryotic XPF–ERCC1 and/or Mus81 proteins [4,5,20].
Although archaeal genomes seem to contain only one Fen-1 and XPF homologue, both fen1 and xpf/hef genes can readily be deleted in Haloferax volcanii [8,21]. This could simply reflect the functional redundancy between already identified DNA repair/replication enzymes, as exemplified by the recent findings revealing that archaeal Hef and Holliday junction resolvases may provide an alternative means to restart stalled DNA replication forks. In the present paper, we summarize a new approach that used a combination of in silico predictions and combinatorial peptide synthesis to identify new interaction partners for the P. abyssi replication clamp. Unexpectedly, this approach led to functional and structural characterization of a novel DNA endonuclease that acts on branched DNA substrates.
Identification of novel interaction partners for Pab (P. abyssi) PCNA
Like replication clamps in general, PabPCNA is a ring-shaped processivity factor. The outer surface of this PCNA homotrimer is negatively charged, whereas the central cavity accommodating DNA carries positive charges. In order to be functionally active, several distinct PCNA-interacting proteins must bind and dissociate in a highly specific and ordered manner. In view of this fact, it is surprising that the identified PIP-motifs are quite short. We thus designed a functional screen for identification of PCNA-binding peptide sequences naturally encoded by the P. abyssi genome. In particular, we immobilized a large number of peptides designed using genome information on to a solid support and tested their binding to PabPCNA experimentally. This approach led to identification of naturally occurring peptides that contained the sequence motif QX2LX2[WFT][LFT] and showed high affinity for the replication clamp. These peptides were often flanked by positively charged residues. Many of these ‘high-affinity’ peptides are present in P. abyssi proteins such as RFC (replication factor C), Fen-1, PolB (DNA polymerase B) and DNA ligase that are known to interact with PCNA. However, we also found that peptides from PabRNase HII and PAB2263, two proteins that have not previously been identified as interacting with PCNA, contain a PIP-motif with a strong PCNA-binding activity. In agreement with the previous mutagenesis studies, our approach indicated that the PIP-motif tolerates a considerable number of substitutions without any inhibitory effect on the association with PCNA. This observation is in agreement with earlier structural observations indicating that PIP-motif peptides interact with PCNA mainly through peptide backbone hydrogen bonds.
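A consensus such as QX2LX2[WFT][LFT] lends itself to a simple in silico scan of protein sequences. The following Python sketch is only an illustration of that idea and is not the screening procedure described above, which relied on immobilized peptides and experimental binding measurements. The function name, the three-residue flank window used to count basic residues, and the example sequence are all hypothetical.

```python
import re

# Consensus reported in the text: Q-X2-L-X2-[WFT]-[LFT], where X is any residue.
# The lookahead lets overlapping occurrences be reported as well.
PIP_PATTERN = re.compile(r"(?=(Q.{2}L.{2}[WFT][LFT]))")

FLANK = 3  # hypothetical window: residues inspected on each side of a hit


def find_pip_motifs(sequence):
    """Return (1-based start, matched peptide, number of flanking K/R residues)."""
    seq = sequence.upper()
    hits = []
    for match in PIP_PATTERN.finditer(seq):
        start = match.start()
        peptide = match.group(1)
        end = start + len(peptide)
        # Count positively charged residues (Lys/Arg) just outside the motif.
        flanks = seq[max(0, start - FLANK):start] + seq[end:end + FLANK]
        n_basic = sum(residue in "KR" for residue in flanks)
        hits.append((start + 1, peptide, n_basic))
    return hits


if __name__ == "__main__":
    # Made-up sequence for illustration only; not a real P. abyssi protein.
    print(find_pip_motifs("MKRQAALDKWFKK"))
```

A pattern match of this kind only nominates candidate peptides; whether a candidate actually binds PCNA still has to be tested experimentally, as was done in the screen.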
P. abyssi and human RNase HII interact with the replication clamp
Using SPR (surface plasmon resonance), we demonstrated that the aforementioned RNase HII peptide mediates interactions with PCNA in its natural context. In particular, our results indicated that PabRNase HII (as well as PabFen-1 and PabDNA ligase) physically interacted with PabPCNA at nanomolar concentrations. This is of interest, as it has been proposed recently that PabRNase HII may act as a dual-function enzyme in DNA replication and repair. An RNase HII variant in which the PIP-motif was removed by site-directed mutagenesis failed to interact with PCNA and, in competition experiments, a 10-fold excess of a peptide carrying a PIP-motif decreased the rate of complex formation and/or accelerated dissociation of RNase HII from PCNA. These experiments therefore established that the PIP-motif was necessary and sufficient for the formation of the PCNA–RNase HII complex. Although we have also noticed that RNase HII and Fen-1 precipitated together with PCNA from cell-free extracts, the physiological relevance of the PCNA–RNase HII complex formation remains unclear. In this respect, it is noteworthy that biochemical and genetic studies have revealed that a key physiological function of yeast RNase HII, possibly together with Fen-1, is to remove misincorporated ribonucleotides from chromosomal DNA [23,24]. As the human PCNA–RNase HII complex formation has also been detected, PCNA may recruit RNase HII to chromatin sites where ribonucleotides impede DNA replication not only in archaea, but also in human cells.
Identification of PAB2263 as a novel endonuclease interacting with PCNA
Our peptide screening experiments also suggested the formation of a high-affinity PAB2263–PabPCNA complex. PAB2263, like other representatives of the DUF91 (domain of unknown function 91) family, contains a C-terminal domain that carries the characteristic residues of the RecB-family nucleases, followed by the putative PIP-motif. This predicted nuclease domain is found in a variety of endonucleases and DNA-repair enzymes. The latest UniProtKB version lists 335 annotated members of the DUF91 family that are mainly found in euryarchaeota (54 homologues), crenarchaeota (29 homologues), actinobacteria (178 homologues) and proteobacteria (38 homologues).
Using SPR and pull-down experiments, we have demonstrated the formation of a stable PAB2263–PCNA complex in vitro and in P. abyssi cell-free extracts. Our early activity measurements indicated that PAB2263 possesses a potent nuclease activity specific for ssDNA, leading us to name this protein NucS (nuclease specific for ssDNA). It is nevertheless of note that later studies indicated that NucS activity on ssDNA is modulated by interactions with dsDNA (double-stranded DNA) (see below).
In Pyrococcus species, NucS is encoded near the evolutionarily conserved replication origin, and the genome sequence of Thermococcus kodakarensis KOD1_4 indicated that the NucS orthologue is likely to be translationally coupled with RadA recombinase. These observations suggest that NucS homologues may function in recombinational repair of DNA. Although direct evidence for this notion is still missing, we have shown that PabNucS co-precipitates in cell-free extracts with PabHef (a long form of XPF) that may act in resolving recombination intermediates in Archaea. Although the activity profiles of archaeal XPF and NucS proteins seem to overlap at least partially, it is of note that, in Sulfolobales, XPF and NucS are differentially transcribed, thus providing a feasible way of controlling their nuclease activities.
Structural description of NucS proteins
In order to gain insight into structure–function relationships of NucS protein, we have solved the crystal structure of the P. abyssi representative to 2.6 Å (1 Å=0.1 nm) resolution. As indicated in Figure 2, this protein is composed of two independent domains (N- and C-terminal domains), separated by a small polypeptide linker (12 residues). The C-terminal domain (coloured blue in Figure 2) possesses an α/β structure composed of a five-stranded central β-sheet and four flanking α-helices. This is the minimal endonuclease fold described previously. Structural alignment with the Escherichia coli RecB nuclease domain gives an RMSD (root mean square deviation) of 3.2 Å for 82 Cα atoms. This domain hosts the active site with a sequence motif conserved in the RecB-like nucleases, including residues Asp160, Glu174, Lys176, Gln187 and Tyr191. A conserved cluster of basic active-site residues (Lys176, Arg177, Arg178 and Lys179) flanks the cleft on one side and may help in the binding of nucleic acids. On the other hand, the N-terminal domain (coloured cyan in Figure 2) displays a half-closed β-barrel. Eight β-strands are arranged in two antiparallel β-sheets that are packed orthogonally on to each other. This fold has not been described previously, although it can be considered distantly related to the Sm-fold found in some RNA-binding proteins and/or the OB (oligonucleotide/oligosaccharide)-fold found in many ssDNA-binding proteins. Structural comparison of these distantly related proteins raised the possibility that all these proteins bind to nucleic acids in an analogous way. In PabNucS, this potential binding site for ssDNA involves two patches of basic residues (Lys68-Arg69-Arg70 and Arg93-Arg94-Arg95), two conserved aromatic residues (Tyr39 and Trp75) and a conserved arginine residue (Arg42).
The validity of this prediction was confirmed using site-directed mutagenesis experiments on residues 42, 70 and 75, thus revealing a high-affinity (non-catalytic) ssDNA-binding site. Finally, a stretch of the interdomain polypeptide (residues 120–125) is not visible in the crystallographic structure, indicating that it is non-structured and flexible, which is corroborated by the large interdomain distance (~28 Å) and other residues displaying a stretched conformation.
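For readers less familiar with the RMSD value quoted for the comparison with the RecB nuclease domain, the quantity is the square root of the mean squared distance between paired atoms. The minimal Python sketch below assumes that the Cα atoms have already been paired and superimposed by a structural alignment program (that step is not shown), and the coordinates in the usage example are invented for illustration, not taken from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z) points.

    Assumes the points are already paired one-to-one and superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Invented coordinates (in Å): every atom is displaced by 3 Å along z.
    model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model_b = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
    print(rmsd(model_a, model_b))
```

Because the value depends on which atoms are paired, an RMSD is only meaningful together with the number of atoms compared, which is why the text reports both figures.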
Figure 2. The structure of PabNucS
PabNucS possesses a large hydrophobic patch exposed on the six-stranded N-terminal β-sheet, and orientated to the C-terminal domain. This patch is used by the protein to form a dimer by domain swapping (see Figure 2). Dimer formation seems critical for the folding and the stabilization of NucS structure, burying the two patches away from water in a large hydrophobic core (20% of the surface area of the subunit is involved in the dimerization). As a consequence of dimerization, the N-terminal domain of one molecule of the dimer brings an extra basic residue (Lys44) to the catalytic site of the other molecule, and the flexible linker between the two domains caps the active site. As a result, the catalytic site becomes a ‘closed’ channel, which indicates that the substrate for the enzyme must have a free end, such as a broken DNA strand. Note also that NucS dimers aggregate further in the crystals to form a dimer of dimers (or tetramer), resembling a ring. However, in agreement with the dimeric solution structure, weak interdimer interactions indicate that tetramer formation only reflects the formation of crystal contacts.
ssDNA binding and formation of a PCNA–NucS complex direct the cleavage towards the ssDNA–dsDNA junctions
Considering that the active-site channel of NucS proteins is too narrow to accommodate dsDNA, we investigated the substrate preference of PabNucS using a large variety of substrates carrying 5′- or 3′-single-stranded regions. Surprisingly, in a manner analogous to eukaryotic Mus81–Mms4 and Dna2, PabNucS protein cleaves both 5′- and 3′-flaps. This result implies that the active-site channel of NucS proteins does not specifically recognize DNA extremities that slide to the active site before cleavage. It is nevertheless of note that NucS cleaves long ssDNA substrates to regularly spaced products, suggesting that the protein can somehow ‘measure’ the distance from the DNA end to the cleavage site. When a single-stranded oligonucleotide was hybridized to form the so-called splayed-arm structure that carries both the 3′- and 5′-protruding ends, different results were observed, suggesting that the binding of NucS to the dsDNA and/or ssDNA–dsDNA junctions modulates the cleavage specificity. In agreement with this notion, we showed that the non-catalytic ssDNA-binding site (see above) is required for the specific cleavage at the ssDNA–dsDNA junction of the splayed-arm substrate. Finally, we also found that the addition of PCNA, which is loaded on to dsDNA in the cellular context, directs the cleavage of PabNucS towards the ssDNA–dsDNA junction.
Conclusion and future challenges
Recent studies have indicated that archaea may contain many more DNA-specific nucleases than expected previously on the basis of homology searches [27,34]. Our studies revealed that PabNucS protein is a founding member of a new family of structure-specific DNA endonucleases that, together with the replication clamp, can act on branched DNA structures. Determining how NucS proteins recognize specific DNA substrates and identifying their physiologically relevant activities are important challenges for future studies. Towards this goal, understanding the detailed architecture and dynamic behaviour of the NucS–PCNA complex with DNA will be crucial. Investigating the functional and physical interplay between NucS and RPA (replication protein A), the factor that presumably coats the ssDNA substrate of NucS, will also be important to fully characterize the enzymatic activity and the specificity of this new nuclease.
Molecular Biology of Archaea II: A Biochemical Society Focused Meeting held at Robinson College, Cambridge, U.K., 16–18 August 2010. Organized and Edited by Stephen Bell (Oxford, U.K.) and Finn Werner (University College London, U.K.).
Abbreviations used: DUF91, domain of unknown function 91; ERCC1, excision repair cross-complementing 1; Fen-1, flap endonuclease 1; NucS, nuclease specific for single-stranded DNA; PCNA, proliferating-cell nuclear antigen; PIP-motif, PCNA-interacting peptide motif; SPR, surface plasmon resonance; XPF, xeroderma pigmentosum complementation group F.
We thank R. Landenstein, B. Ren and G. Hennecke for helpful discussions.
Our work on NucS proteins is supported by Agence Nationale de la Recherche [grant number ANR-07-BLAN-0371 CSD 8].