CRISPR–Cas systems are adaptive immune systems in prokaryotes that provide protection against viruses and other foreign DNA. In the adaptation stage, foreign DNA is integrated into CRISPR (clustered regularly interspaced short palindromic repeat) arrays as new spacers. These spacers are used in the interference stage to guide effector CRISPR associated (Cas) protein(s) to target complementary foreign invading DNA. Cas1 is the integrase enzyme that is central to the catalysis of spacer integration. There are many diverse types of CRISPR–Cas systems, including type I-F systems, which are typified by a unique Cas1–Cas2–3 adaptation complex. In the present study we characterize the Cas1 protein of the potato phytopathogen Pectobacterium atrosepticum, an important model organism for understanding spacer acquisition in type I-F CRISPR–Cas systems. We demonstrate by mutagenesis that Cas1 is essential for adaptation in vivo and requires a conserved aspartic acid residue. By X-ray crystallography, we show that although P. atrosepticum Cas1 adopts a fold conserved among other Cas1 proteins, it possesses remarkable asymmetry as a result of structural plasticity. In particular, we resolve for the first time a flexible, asymmetric loop that may be unique to type I-F Cas1 proteins, and we discuss the implications of these structural features for DNA binding and enzymatic activity.

INTRODUCTION

Prokaryotes are constantly faced with the threat of viral infection. Viruses that infect bacteria are called bacteriophages (or phages) and are the most abundant biological entities on earth, outnumbering their hosts by at least an order of magnitude [1–3]. The high rate of phage infections means there is a rapid turnover of prokaryotes due to cell lysis [4] and therefore, diverse phage resistance mechanisms have evolved that combat phage infection [5,6].

Clustered regularly interspaced short palindromic repeats (CRISPRs) and their associated proteins (Cas) constitute defence systems against phages and other mobile genetic elements (reviewed by [7,8]) and have been found in most archaeal and half of bacterial genomes [9]. These systems make use of the complementary binding of small CRISPR RNAs (crRNAs) to identify invading nucleic acids, which results in invader degradation mediated by Cas proteins. CRISPR–Cas systems have recently been re-classified into two classes consisting of at least five types (I to V) based on the presence of proteins of characteristic sequence (Cas3, Cas9, Cas10, Csf1 and Cpf1 respectively), with further subtypes defined by the presence of subtype-specific proteins [10]. However, new bioinformatic searches of ever-increasing genomic data are resulting in the discovery of novel systems of new types [11].

CRISPR arrays encode the guide crRNA components of the system and are composed of a succession of ∼30 bp repeats separated by unique spacer sequences derived from foreign genetic elements such as phages and plasmids. In the expression stage of defence, a promoter present within an A/T-rich leader sequence drives the expression of the CRISPR array into a long pre-crRNA that is processed by Cas nucleases into individual crRNAs. Each crRNA contains one spacer sequence and, depending on the system, remnants of either one or both of the adjacent repeats. In the subsequent interference stage, each crRNA is loaded into a ribonucleoprotein complex (called Cascade in type I systems), guiding it to complementary invading nucleic acids, which are subsequently destroyed, either by the ribonucleoprotein complex or by a recruited effector nuclease (reviewed by [12,13]).

The first and critical stage of defence in CRISPR–Cas systems is adaptation (or acquisition), where invading viral or other foreign DNA is recognized and a fragment of it incorporated into a CRISPR array as a new spacer [8]. This spacer forms part of the CRISPR ‘memory bank’ that protects the bacterium against further invasion by elements bearing a complementary sequence [14]. The universal Cas proteins, Cas1 and Cas2, are found across most CRISPR–Cas system types and are essential for adaptation [10,15,16]. Separately, these proteins have been shown to have non-specific nuclease activity against a variety of double-stranded and single-stranded substrates [17–20]. We previously provided the first evidence that Cas1 can interact with a Cas2 domain-containing protein (Cas2–3) to form a complex predicted to be involved in adaptation in the type I-F system of Pectobacterium atrosepticum [21]. The type I-E Cas1 and Cas2 proteins from Escherichia coli were subsequently shown to form a complex that can catalyse spacer integration in vitro [22,23].

The mechanism of spacer acquisition can be divided into two phases. In the first phase, ‘capture’, a sequence destined to become a new spacer (termed a protospacer) is selected from the invading DNA and incorporated into the Cas1–Cas2 complex. Two settings have been identified for this step. In the first setting, called ‘naïve acquisition’, foreign DNA is recognized de novo by the action of the RecBCD recombination machinery on stalled replication forks [24]. In the second setting, called ‘primed acquisition’, targets that have mutated to evade interference are recognized by Cas proteins that recruit Cas1–Cas2 to stimulate rapid spacer acquisition [16,25–28]. In both settings, capture is dependent on a short protospacer adjacent motif (PAM) which seems to be a requirement for excision by Cas1–Cas2 [25,29,30]. Following capture, recent evidence suggests the protospacer consists of a central dsDNA region bound by Cas2, the 3′ ssDNA flanks of which are each bound by one Cas1 homodimer [29,31].

In the second phase of acquisition, ‘integration’, the captured protospacer is inserted into the CRISPR array as a new spacer. This integration is usually at the repeat adjacent to the leader sequence and results in duplication of the repeat [15,32]. Integration proceeds via two trans-esterification reactions, where the 3′-OH ends of the protospacer attack each end of the CRISPR repeat [23,33]. Host factors, including DNA polymerase I, are then involved in duplicating each repeat [34]. The intermediates of this process have been detected in vivo in E. coli [35].

Although the mechanism of spacer integration is gradually emerging for type I-E systems, our understanding of other types is lacking. Adaptation in type I-F systems is of particular interest, since in this subtype the Cas2 protein is fused with Cas3, the effector nuclease that mediates target degradation [10,21]. This Cas2–3 protein interacts with Cas1 [21], creating an adaptation complex unique to type I-F systems that may help promote the primed mode of spacer acquisition [27].

In the present study we investigate the type I-F Cas1 protein from P. atrosepticum, an important model organism for studying type I-F CRISPR–Cas adaptation [21,27]. Structure determination by X-ray crystallography showed the characteristic Cas1 fold, but also revealed remarkable structural plasticity in a conserved loop unique to type I-F Cas1 proteins, leading to a highly asymmetrical Cas1 dimer. We further show that Cas1 is essential for adaptation in type I-F CRISPR–Cas systems and relies on a conserved aspartic acid residue.

MATERIALS AND METHODS

Culture conditions

All strains and plasmids used in the present study are provided in Table 1 and all oligonucleotides used are shown in Table 2. Unless otherwise stated, P. atrosepticum strains were grown at 25°C and E. coli strains at 37°C in Lysogeny Broth (LB) at 180 rpm or on LB-agar (LBA) plates containing 1.5% (w/v) agar. As required, media were supplemented with antibiotics as follows: ampicillin (Ap; 100 μg/ml), kanamycin (Km; 50 μg/ml), streptomycin (Sm; 50 μg/ml) and tetracycline (Tc; 10 μg/ml).

Table 1
Bacterial strains and plasmids used in the present study
Strain/plasmid Genotype/phenotype Reference
Escherichia coli
CC118 λpir araD, Δ(ara, leu), ΔlacZ74, phoA20, galK, thi-1, rspE, rpoB, argE, recA1, λpir [40]
DH5α F−, ϕ80ΔdlacZM15, Δ(lacZYA–argF)U169, endA1, recA1, hsdR17 (rK−mK+), deoR, thi-1, supE44, λ−, gyrA96, relA1 Gibco/BRL
HH26 Marker exchange mobilization strain for conjugal transfer [39]
Pectobacterium atrosepticum
PCF80 Δcas::cat derivative of SCRI1043 [42]
PCF196 Δcas1, markerless derivative of SCRI1043 This study
SCRI1043 Wild type (WT) [61]
Plasmids
pBAD30 Arabinose inducible vector, p15a replicon, ApR [44]
pBluescript II KS+ Cloning vector, ColE1 replicon, ApR Stratagene
pKNG101 Suicide vector, sacBR, mobRK2, R6K ori, SmR [36]
pNJ5000 Mobilizing plasmid for marker exchange, TcR [39]
pPF170 pTRB30 with N-His6-tagged PatCas1, KmR (aka pJSC1) [42]
pPF571 pQE-80LoriT-mCherry-derivative, TcR [27]
pPF574 pQE-80LoriT-mCherry-derivative, CRISPR1spacer 1 F priming plasmid, TcR [27]
pPF565 Δcas1 knockout intermediate, pBluescript II KS+, ApR This study
pPF568 Δcas1 knockout in pKNG101, SmR This study
pPF626 pTRB30 with N-His6-tagged PatCas1 D269A, KmR This study
pPF692 pBAD30 with N-His6-tagged PatCas1, ApR This study
pPF693 pBAD30 with N-His6-tagged PatCas1 D269A, ApR This study
pTRB30 pQE-80L (Qiagen) based expression vector, ApR replaced by KmR, KmR [42]
Table 2
Oligonucleotides used in this study
Name Sequence (5′–3′) Notes Restriction site(s) (underlined) 
PF138 CACACTTTGCTATGCCATAG F for pBAD30 MCS  
PF139 GCTACTGCCGCCAGG R for pBAD30 MCS  
PF209 TCGTCTTCACCTCGAGAAATC F for pQE-80L (and derivatives) MCS  
PF210 GTCATTACTGGATCTATCAACAGG R for pQE-80L (and derivatives) MCS  
PF213 CAACTTAACGTAAAAACAACTTCAGA F for pKNG101 MCS  
PF214 TACACTTCCGCTCAGGTCCTTGTCCT R for pKNG101 MCS  
PF217 CGACGTAAAACGACGGCCAGT F for pBluescript II KS+ MCS  
PF218 GGAAACAGCTATGACCATG R for pBluescript II KS+ MCS  
PF247 CGTCCTGCTCACCGAC R for Δcas1 confirmation  
PF292 GCTGGCCACTGTACGATTC F for Δcas1 confirmation  
PF390 AGGTGGATCCATGGATAACGCCTTTAGCC F for His-tagged cas1 BamHI 
PF391 AGGTCTGCAGCAGAATGTTCATCGCACTAC R for cas1 PstI 
PF427 CATAATGTATTTTCTTCCGTAA R cas1 500 bp upstream  
PF441 TTTCTCGAGGGATCCCTCTGTTATTCCCCAACTG F cas1 500 bp upstream XhoI/BamHI 
PF671 AGTTGCATGTGAAAGATGATG F for CRISPR1 leader  
PF741 GCTCTAGAGCTCACCGACAAAGGCCTG R cas1 500 bp downstream XbaI 
PF1342 TTTACGGAAGAAAATACATTATGAACATTCTGCTGATTTC F cas1 500 bp downstream  
PF1435 GCTGGTGTTTGATGTCGCCGCTTTAATTAAAGATGCGCTCGTGC F cas1 D269A mutagenesis  
PF1436 GGCGACATCAAACACCAGC Overlapping R cas1 D269A mutagenesis  
PF1450 GATCCGGCAAACAAACC R for frequently acquired spacer from pPF574  
PF1479 TGGCATCGTTAGAGTGATCGGGCTAC F for CRISPR1 leader  
PF1488 AGGAGCGGGATTCTACAACCCTAATTTC R for CRISPR1 spacer 2  

Construction of a cas1 deletion mutant

The Δcas1 markerless deletion mutant was created using allelic exchange mutagenesis based on a method described previously [36,37]. Overlapping flanks of cas1 were created by overlap-extension PCR [38] using PF441 + PF427 and PF1342 + PF741 for upstream and downstream fragments respectively. The overlap product was generated using primers PF441 + PF741 and the two PCR products as template. The resulting product was digested with BamHI and XbaI and ligated into pBluescript II KS+. E. coli DH5α was transformed with this construct (designated as pPF565). The sequence was confirmed using primers PF217 + PF218 before transferring the overlap-product into the suicide vector pKNG101 [36] at the same restriction enzyme sites (BamHI and XbaI). E. coli CC118λpir was transformed with the pKNG101 ligation construct, giving pPF568 and its sequence was confirmed using primers PF213 + PF214. Tri-parental mating using the helper strain E. coli HH26 (pNJ5000) [39] was performed to conjugate pPF568 from the E. coli CC118λpir donor [40] to the wild type (WT) P. atrosepticum recipient. After mating on LBA, single cross-over integrants were isolated on minimal agar containing 0.2% w/v glucose and Sm. Integrants were then grown in LB without selection before plating on 10% w/v sucrose minimal medium to select for the second cross-over event. Mutants were confirmed by PCR using primer pairs PF292 + PF247 and PF390 + PF391 as shown in Figure 1(a) and by sequencing.
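The cloning steps above rely on restriction sites carried in the 5′ tails of the outer primers. As a minimal sketch (not part of the original methods), the following locates those sites in the Table 2 sequences of PF441 and PF741. The recognition sequences are the standard ones for these enzymes; the function and dictionary names are illustrative.

```python
# Recognition sequences for the enzymes named in the text and in Table 2.
SITES = {
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
    "PstI": "CTGCAG",
}

# Outer primers for the cas1 flanks, copied from Table 2.
PRIMERS = {
    "PF441": "TTTCTCGAGGGATCCCTCTGTTATTCCCCAACTG",
    "PF741": "GCTCTAGAGCTCACCGACAAAGGCCTG",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = [
            i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
        ]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

PF441 carries both XhoI and BamHI sites and PF741 carries an XbaI site, consistent with the BamHI/XbaI digest used to clone the overlap product.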

Figure 1
PatCas1 is required for spacer acquisition in vivo

(a) Creation of the markerless Δcas1 mutant (PCF196) by allelic exchange mutagenesis. The mutant was confirmed by PCR and sequencing using primers external to cas1 (PF247 and PF292) and internal to cas1 (PF390 and PF391), as indicated on the schematics of the cas operon. (b) Loss of the primed plasmid (pPrimed; pPF574) containing an mCherry gene and a protospacer with a mutant PAM was measured in the WT and Δcas1 mutant as described previously [27]. Data shown are the means ± S.D. (n=3). (c) Spacer acquisition in the WT and Δcas1 mutants shown in (b) was detected by PCR amplification of the CRISPR1 array as depicted schematically and described previously [27]. As a control, spacer acquisition from a plasmid (pNaïve; pPF571) lacking any protospacer was also measured in the WT.

In vivo plasmid loss assays

The P. atrosepticum WT strain was transformed with the control naïve plasmid (pPF571). In addition, both the WT and the Δcas1 strains were transformed with the primed plasmid (pPF574) as described previously [41]. Following growth in LB with Tc, the transformed strains were frozen at −80°C in 25% (v/v) glycerol then used as inoculum for the plasmid loss assay. Five millilitres of non-selective LB was inoculated with each strain and grown for 24 h at 25°C with shaking (160 rpm). After 24 h incubation, 5 μl of each culture was transferred to 5 ml fresh LB and −80°C glycerol culture stocks were prepared. In addition, a dilution of each culture was spread on LBA containing 1 mM IPTG to induce mCherry expression and plates were incubated for 2 days at 25°C until colonies were visible. Data were collected at 0, 7 and 14 days. The percentage plasmid loss was calculated by determining the ratio of white (no mCherry plasmid) colonies compared with total number of colonies. Experiments were performed in triplicate.
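The percentage calculation described above can be sketched as follows. The colony counts are hypothetical placeholders rather than data from this study, and the use of the sample standard deviation (n − 1 denominator) for the triplicates is an assumption, since the text does not state which form was used.

```python
import statistics


def percent_plasmid_loss(white_colonies, total_colonies):
    """Percentage of colonies that are white, i.e. have lost the mCherry plasmid."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    if not 0 <= white_colonies <= total_colonies:
        raise ValueError("white_colonies must lie between 0 and total_colonies")
    return 100.0 * white_colonies / total_colonies


def mean_and_sd(values):
    """Mean and sample standard deviation (assumed form) of replicate values."""
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Hypothetical (white, total) counts for three replicate plates.
    replicate_counts = [(12, 100), (15, 100), (9, 100)]
    losses = [percent_plasmid_loss(w, t) for w, t in replicate_counts]
    mean, sd = mean_and_sd(losses)
    print(f"plasmid loss: {mean:.1f} ± {sd:.1f} % (n={len(losses)})")
```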

Detection of general CRISPR1 expansion using PCR

Spacer acquisition was assessed by PCR using primers specific for CRISPR1. The OD600 of the −80°C stocks was measured and a volume corresponding to 1 ml of an OD600=2.6 was pooled, pelleted by centrifugation and resuspended in 30 μl water. Five microlitres were used as a template for the subsequent PCR using primers PF1479 + PF1488 and Phusion High-Fidelity DNA Polymerase (ThermoScientific). The resulting products were separated on 3% TAE agarose gels stained with ethidium bromide.
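One reading of the normalization step is that each stock contributes the volume that holds as many cells as 1 ml of culture at OD600 = 2.6. The sketch below follows that interpretation and assumes OD600 scales linearly with cell density over the range measured. The function name and the example reading are illustrative.

```python
def volume_for_target_od(measured_od, target_od=2.6, target_volume_ml=1.0):
    """Volume (ml) of a stock at measured_od equivalent to target_volume_ml at target_od.

    Uses C1 * V1 = C2 * V2 with OD600 as a proxy for cell concentration.
    """
    if measured_od <= 0:
        raise ValueError("measured_od must be positive")
    return target_od * target_volume_ml / measured_od


if __name__ == "__main__":
    # Hypothetical stock reading: an OD600 of 5.2 needs half the target volume.
    print(f"{volume_for_target_od(5.2):.2f} ml")
```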

Site-directed mutagenesis of Cas1

The construction of pPF170 (previously named pJSC1) has been described [42]. A single D269A point mutation was introduced into pPF170 by overlap-extension PCR [43]. Firstly, the plasmid pPF170 was used as a template in a PCR using primers PF391 and PF1435, the latter containing a single mismatch to introduce the desired mutation. A second product was amplified with primers PF390 and PF1436. The two purified products were used as template and amplified by overlap-extension PCR with primers PF390 and PF391, giving a full length Cas1 insert (1009 bp) containing the desired D269A mutation. This was cloned into the pTRB30 vector using PstI and BamHI and introduced into E. coli DH5α by transformation. The plasmid (pPF626) was sequenced using primers PF209 + PF210, and introduced into P. atrosepticum PCF80 by electroporation for protein expression [41].
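Overlap-extension PCR depends on the two inner primers sharing a complementary region so that the first-round products can anneal to each other. The sketch below checks this for the Table 2 sequences of PF1435 and PF1436 by comparing PF1435 with the reverse complement of PF1436. The helper names are illustrative, and the check says nothing about reading frame or the position of the mutated codon.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Inner mutagenesis primers, copied from Table 2.
PF1435 = "GCTGGTGTTTGATGTCGCCGCTTTAATTAAAGATGCGCTCGTGC"
PF1436 = "GGCGACATCAAACACCAGC"


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_prefix_length(a, b):
    """Number of leading positions at which a and b are identical."""
    length = 0
    for base_a, base_b in zip(a, b):
        if base_a != base_b:
            break
        length += 1
    return length


if __name__ == "__main__":
    overlap = shared_prefix_length(PF1435, reverse_complement(PF1436))
    print(f"PF1435 and PF1436 overlap over {overlap} nt")
```

The whole of PF1436 is complementary to the 5′ end of PF1435, so the two first-round products share a 19 nt overlap.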

Detection of specific spacer acquisition into CRISPR1 using PCR

To perform PatCas1 complementation and PatCas1 D269A in vivo assays, a dual plasmid priming assay was required. The existing expression plasmids for PatCas1 (pPF170) and the D269A mutant (pPF626) were incompatible with the priming plasmid, pPF574. Therefore, the genes encoding His-tagged PatCas1 and PatCas1 D269A were subcloned from pPF170 and pPF626 respectively, following digestion with EcoRI and HindIII and ligated into pBAD30 [44]. The resulting PatCas1 (pPF692) and PatCas1 D269A (pPF693) plasmids were confirmed with primers PF138 + PF139. P. atrosepticum WT and Δcas1 strains harbouring the primed plasmid (pPF574) were then transformed with pBAD30, pPF692 or pPF693 as described previously [41]. Transformants were confirmed by PCR, inoculated into 5 ml LB with Ap, Tc, and 0.2% (w/v) glucose and incubated at 25°C with shaking (180 rpm). After 24 h incubation, genomic DNA (gDNA) was prepared from 0.5 ml of culture (Day 0 sample). The culture was also diluted into 5 ml fresh LB with Ap and 0.1% (w/v) arabinose to give a starting OD600 of 0.01 and incubated at 25°C for 24 h. The culture was passaged similarly until Day 5, then gDNA was prepared similarly to the Day 0 sample. To detect spacer acquisition, 100 ng of gDNA from Days 0 and 5 were analysed by PCR with primers PF671 and PF1450 and Taq DNA polymerase (Roche). The resulting products were separated on 2% TAE agarose gels stained with ethidium bromide.

Cas1 expression and purification

His-tagged Cas1 was expressed from P. atrosepticum PCF80 (Δcas) harbouring pPF170. Bacteria were grown at 25°C at 180 rpm in LB supplemented with 50 μg/ml kanamycin. At an OD600 of 0.5, cells were induced with 1 mM (final concentration) IPTG, incubated overnight at 16°C at 180 rpm, and then harvested by centrifugation (4°C, 3840 g for 10 min). Cell pellets were resuspended in buffer A (300 mM NaCl, 5% (v/v) glycerol, 10 mM Tris/HCl pH 8.0) supplemented with 20 mM imidazole, DNase I, lysozyme, 0.1 mM phenylmethylsulfonyl fluoride, and a protease inhibitor cocktail (Roche). Cells were lysed by sonication and the lysate clarified by centrifugation at 9820 g for 30 min. Cell lysate was loaded on to a HisTrap FF Crude column (GE Healthcare) equilibrated with buffer A and a linear gradient of 20–500 mM imidazole was applied to elute Cas1. Fractions found to contain the purified protein were pooled and further purified by size-exclusion chromatography on a Superdex 200 column (GE Healthcare) before storage at 4°C in buffer A.

Crystallization

Purified His-tagged Cas1 was concentrated to 10 mg/ml using a 10 kDa cut-off Vivaspin 20 (GE Healthcare). Hanging drops were set up against a reservoir containing 800 μl of 100 mM Tris/HCl pH 8.6, 10% (w/v) polyethylene glycol 6000, 0.1 mM SrCl2 by mixing 1 μl of protein solution with 1 μl of reservoir solution and incubated at 18°C. Crystals grew within 2 days and were cryoprotected by soaking in mother liquor supplemented with 15% (v/v) glycerol for 1–2 min. Crystals were flash-frozen in liquid nitrogen for data collection. Incubation of the protein at 4°C for approximately 3 weeks prior to creation of hanging drops was required for optimal crystallization. Analysis of the crystallized protein by SDS-PAGE suggested that limited truncation had taken place (results not shown); this was supported by the lack of observable electron density for residues 1–11 in the final structure.

Data collection and processing

A data set was collected at the Australian Synchrotron on the MX1 beamline equipped with an ADSC Quantum 210r detector. Data were collected at 13 keV with an oscillation angle of 0.5° and an exposure time of 2 s per image. Diffraction images were integrated using XDS [45]. The integrated dataset was merged and scaled using Aimless within the CCP4 suite [46].

Structure determination and refinement

The Cas1 structure was solved by molecular replacement with Phaser [47] using a dimer of Pseudomonas aeruginosa Cas1 (PDB ID: 3GOD) as the search model, to which it has a sequence identity of 65%. The model was refined by cycles of manual rebuilding in Coot [48] and maximum-likelihood refinement in REFMAC5 [49] and PHENIX [50] with non-crystallographic symmetry restraints. Solvent was modelled into spherical density if hydrogen bonds could be formed to atoms between 2.2 and 3.5 Å (1 Å=0.1 nm) away. Flexible loop regions were initially omitted from the model while the structure was refined, and were then built conservatively over many cycles of subsequent refinement. Structural figures were prepared using PyMOL [51] and UCSF Chimera [52].

Accession codes

Coordinates and structure factors for PatCas1 have been deposited in the Protein Data Bank under accession code 5FCL.

RESULTS

Cas1 is essential for spacer acquisition in vivo in type I-F CRISPR–Cas systems

Although Cas1 has been shown to function in spacer acquisition in type I-B, I-E, and II-A CRISPR–Cas systems [15,53–55], an analogous role has not been shown for type I-F systems. In P. atrosepticum we constructed a markerless deletion mutant of cas1 by allelic exchange mutagenesis using a double recombination strategy and confirmed the mutant by PCR and sequencing (Figure 1a). To assess whether this mutant was impaired for adaptation, we used a high-efficiency assay to measure spacer acquisition during priming [27]. In this assay, P. atrosepticum is supplied with a plasmid containing a sequence complementary to an existing spacer (spacer 1) in the CRISPR1 array (the most adaptive of three I-F arrays in this strain), but with a single-nucleotide polymorphism in the PAM. We have shown previously that this plasmid promotes rapid primed acquisition of new spacers from the plasmid that lead to interference and plasmid loss [27]. In the wild type strain, 10–15% plasmid loss was observed by 7 days, whereas less than 5% was detected in the Δcas1 mutant (Figure 1b). By 14 days there was still a substantial decrease in plasmid loss in the Δcas1 mutant compared with the wild type. When a PCR of CRISPR1 was performed, expanded CRISPR arrays were detected in the wild type, representing multiple spacer acquisition events (Figure 1c). In contrast, no spacer acquisition was observed in the Δcas1 mutant (Figure 1c). As a further control, a plasmid lacking a priming protospacer (pNaïve) was also tested in the WT and no spacer acquisition was detected (Figure 1c). Therefore, cas1 is essential for adaptation in the type I-F system of P. atrosepticum in vivo.

Crystal structure of P. atrosepticum Cas1

The structure of a type I-F Cas1 protein from P. aeruginosa has been solved previously, but is missing density in a conserved loop region (residues 101–112, detailed below) [17]. To increase our structural understanding of spacer acquisition by type I-F CRISPR–Cas systems, we therefore solved the structure of P. atrosepticum Cas1 (PatCas1) by X-ray crystallography to 2.7 Å resolution. Data collection and refinement statistics are shown in Table 3. The asymmetric unit contains six Cas1 molecules, with generally clear electron density observed for residues 12–325 of each 326 residue chain and some missing side chains. In some chains, two loops (residues 41–47 and 215–219) could not be confidently modelled and one region (residues 98–114) was also less clear (described below). The six chains were assembled into three homodimers. Since these dimers superimpose well (Cα RMSD of 0.20–0.34 Å), the dimer corresponding to chains A and B will be presented, as its main chain could be modelled without breaks from residues 12 to 325.

Table 3
X-ray data collection and refinement statistics
 PatCas1 
Data collection*  
Space group C
Cell dimensions  
   a, b, c (Å) 261.4, 164.1, 78.7 
   β (°) 90.1 
Resolution range (Å) 46.3–2.70 (2.75–2.70) 
Total observations 290,958 (15,274) 
Unique reflections 86,942 (4658) 
Rmerge (%) 12.7 (61.9) 
Rpim (%) 8.1 (40.0) 
CC1/2 0.989 (0.651) 
I / σI 6.0 (1.6) 
Completeness (%) 95.8 (97.4) 
Redundancy 3.3 (3.3) 
Refinement 
Resolution (Å) 44.3–2.70 
Rwork / Rfree (%) 18.8 / 23.2 
No. atoms  
   Protein (1861 residues) 14,133 
   Water 209 
<B-factors> (Å²) 
   Protein 52.6 
   Water 45.2 
r.m.s. deviations  
   Bond lengths (Å) 0.003 
   Bond angles (°) 0.52 
Ramachandran plot  
   Favoured region (%) 98.2 
   Outliers (%) 0.1 

*A single crystal was used for this structure. †Values in parentheses are for highest-resolution shell.
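The redundancy row in Table 3 can be cross-checked against the observation counts, since redundancy (multiplicity) is the ratio of total observations to unique reflections. The sketch below performs that arithmetic with the Table 3 values; the function name is illustrative.

```python
def redundancy(total_observations, unique_reflections):
    """Average number of times each unique reflection was measured."""
    if unique_reflections <= 0:
        raise ValueError("unique_reflections must be positive")
    return total_observations / unique_reflections


if __name__ == "__main__":
    # Values from Table 3: overall, then highest-resolution shell.
    overall = redundancy(290_958, 86_942)
    outer_shell = redundancy(15_274, 4_658)
    print(f"overall: {overall:.1f}, highest-resolution shell: {outer_shell:.1f}")
```

Both ratios round to 3.3, in agreement with the tabulated redundancy.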

The PatCas1 monomers have the same characteristic fold as other known Cas1 structures [17,18,56], consisting of an N-terminal dimerization domain (NTD) and a C-terminal integrase domain (Figures 2a and 2b, Supplementary Figure S1A). The NTD consists of two beta sheets spaced by an alpha helix, enclosing a hydrophobic core. The integrase domain forms a large lobe of seven helices connected to the NTD by a flexible linker. The monomers dimerize along a pseudo two-fold symmetry axis by beta augmentation, with the two beta sheets in each NTD forming two continuous beta sheets across the whole dimer. The orientation of the integrase domain relative to the NTD differs by 36° between the two monomers in the dimer (Figure 2c). Additionally, a channel between the integrase domain and NTD is present in only one monomer (Figure 2d). This channel has recently been shown in the E. coli type I-E Cas1 to bind the ssDNA flank of the protospacer and presumably cleave it next to the PAM sequence [29,31]. The structural conservation suggests that this channel in the type I-F proteins is performing a similar ssDNA-binding and catalytic role. Therefore, we term the channel-forming monomer ‘catalytic’ (chain A) and the other ‘non-catalytic’ (chain B).
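The rotation between the two integrase domains is reported as a single angle derived from a rotation matrix (see the Figure 2 legend). As a minimal sketch of that last step only, the angle θ of any proper 3×3 rotation matrix R satisfies trace(R) = 1 + 2 cos θ. The code below builds a test matrix for a 36° rotation about an arbitrary axis and recovers the angle. The superposition itself is not shown, and the axis used is a hypothetical placeholder, not the axis in the structure.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rotation matrix for angle_deg about axis, using Rodrigues' formula."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def rotation_angle_deg(matrix):
    """Rotation angle in degrees from the trace of a proper rotation matrix."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    test_matrix = rotation_matrix((1.0, 2.0, 3.0), 36.0)  # hypothetical axis
    print(f"recovered angle: {rotation_angle_deg(test_matrix):.1f} degrees")
```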

Figure 2
Overall fold and assembly of PatCas1

(a) The domain structure of PatCas1. (b) A PatCas1 dimer (chains A and B), coloured according to (a). (c) Chains A and B superimposed by their N-terminal dimerization domains (rear of figure) with the rigid-body rotation of the integrase domains visible. The rotation angle was calculated from the rotation matrix following least-squares superposition of the integrase domains. (d) Surface representation of the PatCas1 dimer coloured according to (a), rotated with respect to (b) as indicated.

The PatCas1 active site is essential for spacer acquisition in vivo

In the catalytic and non-catalytic integrase domain structures we observed a similar arrangement of active site residues to other Cas1 enzymes, consisting of a conserved triad of Glu191, His255 and Asp269 residues (Figures 3a–3c, Supplementary Figure S1B). The analogous residues in the E. coli type I-E Cas1 are known to be essential for the integrase activity of the Cas1–Cas2 complex in vitro and in vivo [15,23,33]. In vitro, this integrase reaction is metal-dependent [23,33], and structural data suggest these residues coordinate manganese or magnesium ions [17,31].

Figure 3
The conserved active site of PatCas1

(a) A PatCas1 dimer with the location of the active sites indicated. The conserved active site residues for the non-catalytic (b) and catalytic (c) monomers are shown in stick form. The putative metal-binding site (purple circle M2+) is based on the structure of P. aeruginosa Cas1 [17]. Distances between the modelled metal and active site residues are indicated by dotted lines. The protospacer (c, light blue) is modelled in by least-squares superposition of the integrase domain of E. coli Cas1 bound to DNA [29], with no adjustment of the DNA. (d) In vivo acquisition of a highly-acquired spacer from a primed plasmid (pPrimed; pPF574) [27] was measured in the WT and Δcas1 mutant by PCR. The strains were complemented with pBAD30 (vector) or pCas1 (Cas1 in pBAD30; pPF692) or pCas1 D269A (Cas1 D269A in pBAD30; pPF693).

To determine if the active site is involved in spacer acquisition in P. atrosepticum, we used the same priming plasmid loss assay described above, but introduced a plasmid expressing either wild type PatCas1 or a D269A PatCas1 mutant into the Δcas1 P. atrosepticum strain. PCR screening of the P. atrosepticum CRISPR array 1 with primers specific for a known highly-acquired spacer showed that the Δcas1 mutant was defective for spacer acquisition (Figure 3d). This is consistent with the related results in Figure 1(c) where generic CRISPR array expansion was assessed. Complementation with a vector expressing wild type PatCas1 restored primed spacer acquisition activity, demonstrating that the lack of Cas1 in the Δcas1 mutant was responsible for the defective adaptation. In contrast, the D269A PatCas1-expressing vector did not restore spacer acquisition. This result demonstrates that Asp269 of Cas1 is essential for spacer acquisition during adaptation in the type I-F CRISPR–Cas system.

Type I-F Cas1 proteins have unique asymmetries

A striking feature of the PatCas1 structure is the asymmetry of the NTDs: the topology of each beta sheet is different between dimeric partners. The first example of this is the region from residues 41 to 57. In the catalytic monomer, this region forms an extended loop connecting the two beta sheets between beta strands three and four, whereas in the non-catalytic monomer the region forms a fourth beta strand (β3′) before connecting to the second beta sheet (Figure 4a and Supplementary Figure S2). The second example of asymmetric topology in the NTDs involves the seventh beta strand, which forms the dimer interface in the upper beta sheet, and its connected linker to the integrase domain. Remarkably, the seventh beta strand is formed by residues 103–107 in the non-catalytic monomer and by residues 93–95 in the catalytic monomer. As a result, the linker between the NTD and integrase domain is much more extended in the catalytic monomer, whereas the loop on the other side of the NTD, between the sixth and seventh beta strands, is more extended in the non-catalytic monomer (Figure 4b and Supplementary Figure S2). The net result of this arrangement is that one side of the PatCas1 dimer has two prominent protruding loops (the linker between the NTD and the integrase domain, and the loop between the sixth and seventh beta strands), whereas the other side has no protruding loops (Figures 4c–4f). The NTD-integrase domain linker is thus capable of remarkable structural plasticity, in which different residues in each monomer contribute to the same secondary structure. In addition, the same numbered residues contribute to different secondary structures in each monomer.

Figure 4
Asymmetrical elements of PatCas1

(a) The bottom of the Cas1 N-terminal dimerization domain. Residues 41–57 for each monomer are coloured red. (b) The NTDs of the catalytic (light blue) and non-catalytic (light orange) monomers were superimposed. The NTDs are shown along with helix 1 of the integrase domain. The residues of loops displaying structural plasticity are labelled. Details of the molecular contacts of the loop from residues 108 to 115 or from 96 to 115 are shown in (c) and (d) respectively. (e, f) Two views separated by 180° of the PatCas1 dimer with asymmetrical loops highlighted in red. The location of the active site and of the views shown in (c) and (d) are indicated. (g, h) The electrostatic potential of the PatCas1 dimer was calculated using the Adaptive Poisson–Boltzmann Solver [62] using the default settings and then used to colour the surface as indicated. Protospacer DNA (orange) was modelled based on least-squares superposition of the structure of E. coli Cas1 bound to DNA [29].

The topological difference is associated with the rotation of the integrase domain. In the non-catalytic monomer, Trp243 from the integrase domain is accommodated by a pocket formed by Pro107 and Arg112 of the linker and Met81 of the NTD (Figure 4c). In the catalytic monomer, the integrase domain is rotated (see Figure 2c) and Trp243 is no longer within this pocket, releasing the constraints on Pro107 and Arg112 of the linker. The linker is therefore able to form an extended loop instead, in which Tyr111 is oriented to fill the gap left by Trp243 (Figure 4d). Intriguingly, these linker residues form part of a PQSEYR motif found exclusively in type I-F Cas1 proteins (Supplementary Figure S3). Density for this conserved linker region was not observed in the P. aeruginosa type I-F Cas1 structure [17], suggesting this loop is flexible. Consistent with this, we observed weaker (but traceable) density for the loop in the second dimer of the asymmetric unit, whereas in the third dimer density for the loop was absent (Supplementary Figure S4). This structure of PatCas1 thus provides the first insight into the details of the asymmetry in this conserved loop.

Interestingly, this asymmetric dimer arrangement involving the two protruding loops results in a difference in electrostatic surface charge between the two large faces of the homodimer (Figures 4g and 4h). Whereas the face without the loops has a positively charged groove on the catalytic monomer leading into the active site, the face with the loops has a significantly negatively charged surface, with occluded access to the active site cavity of the non-catalytic monomer. The positively charged groove is similar in size and shape to the DNA-binding groove of E. coli type I-E Cas1 [29,31]. By structural superposition we were therefore able to model protospacer ssDNA into this groove (Figures 3c and 4g). Apo-PatCas1 may therefore have a pre-formed DNA-binding site. Conversely, the negatively charged face of PatCas1 (Figure 4h) could not accommodate modelled ssDNA without severe steric hindrance. This surface charge polarization seems to be unique to type I-F Cas1 enzymes, as all other known Cas1 structures are uniformly positively charged (Supplementary Figure S5).

DISCUSSION

In the present study, we showed that Cas1 is essential for spacer acquisition in the type I-F CRISPR–Cas system of P. atrosepticum and revealed characteristic structural features of Cas1 by X-ray crystallography. The finding that Cas1 is essential for adaptation is consistent with previous studies in other CRISPR types [15,16,53–55]. To our knowledge, we have provided the first demonstration of the essentiality of type I-F Cas1 for adaptation in a native host. In agreement, during the preparation of our manuscript, Cas1 was shown to be required for adaptation in a type I-F system using a heterologous overexpression system in E. coli [57]. In type I-E systems Cas1 is known to be the integrase in the Cas1–Cas2 integration complex [23], and our finding probably reflects an analogous role in the type I-F Cas1–Cas2–3 complex [21].

Consistent with Cas1 acting as an integrase during adaptation, we demonstrated that a highly conserved Asp269 residue was essential for spacer acquisition. Alanine mutagenesis of the equivalent residue in other Cas1 enzymes abolished various Cas1 enzymatic activities in vitro and in vivo [15,17,18,22,23,57]. The residue is part of a triad of residues that in P. aeruginosa Cas1 were shown to chelate manganese ions [17], and spacer integration is known to be metal-dependent [23,33]. Although our structure did not have metals modelled into the active site, we noted that in the ‘non-catalytic’ monomer the catalytic residues are in more favourable proximity for chelation of magnesium or manganese [58,59], whereas in the catalytic monomer the residues are considerably further apart (Figures 3b and 3c). These distances are similar to those in the structure of P. aeruginosa Cas1. Interestingly, in the P. aeruginosa Cas1 structure, which was solved after soaking crystals with manganese ions, metal density was observed in both catalytic and non-catalytic active sites, but was considerably weaker in the catalytic active site. This suggests that the more open catalytic active site may bind metals more transiently. In agreement, in the structure of the type I-E Cas1–Cas2 complex bound to protospacer DNA, magnesium was only observed in the non-catalytic monomer [31].

In type I-F systems, spacers are typically 32 bp [27,30,60], compared with the 33 bp spacers in type I-E systems [25]. Protospacer size selection in type I-E systems was recently shown to operate via a ruler mechanism, with Cas2 measuring the 23 bp dsDNA core and Cas1 measuring the 5 nt ssDNA flanks of the protospacer [29,31]. In our structure, we found that the ssDNA-binding channel was very similar in size to that of type I-E Cas1 (Figures 2d and 4g). The conserved channel size suggests that the type I-F protospacer has 5 nt ssDNA flanks like I-E, and that a 22 bp dsDNA core is measured by the more divergent Cas2–3 protein. However, there is likely to be some flexibility, since 33 nt spacers are acquired ∼10% of the time in the P. atrosepticum I-F system [27].
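
The arithmetic behind this ruler argument can be written out explicitly. The sketch below assumes the simple model described above, in which the protospacer consists of a dsDNA core with one ssDNA flank on each side. The function name is hypothetical, and the type I-F core length is the value proposed in the text, not a measured one.

```python
def protospacer_length(core_bp: int, flank_nt: int) -> int:
    """Total length of a protospacer with a dsDNA core and two ssDNA flanks."""
    # One single-stranded flank on each side of the double-stranded core.
    return core_bp + 2 * flank_nt


# Type I-E values reported in the text: 23 bp core, 5 nt flanks.
type_ie = protospacer_length(core_bp=23, flank_nt=5)
# Proposed type I-F values: 22 bp core, 5 nt flanks.
type_if = protospacer_length(core_bp=22, flank_nt=5)
print(type_ie, type_if)  # prints: 33 32
```

Under this model, keeping the Cas1-measured flanks fixed at 5 nt means that the one-nucleotide difference in spacer length between the two systems is attributed entirely to the core measured by Cas2 or Cas2–3.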

Structurally, the most notable feature of the PatCas1 homodimer is its asymmetry. The asymmetry is due to structural plasticity, where identical polypeptide sequences in the NTDs are capable of forming different secondary structures in each monomer (Figure 4). Of the seven Cas1 structures in the Protein Data Bank, only the type I-F Cas1 from P. aeruginosa [17] and a distantly-related derivative encoded by a phage (PDB ID 4W8K) showed such plasticity, suggesting this is a characteristic feature of type I-F Cas1 proteins. In support of this, the residues with the most dramatic plasticity in our structure (residues 41–57 and 86–115) are conserved in type I-F Cas1 enzymes and absent from other Cas1 families (Supplementary Figure S3). Significantly, in our study we were able to resolve the loop consisting of residues 98–115 for the first time, showing the structural basis for this plasticity (Figures 4c and 4d) and revealing its profound effects on the electrostatic surface potential of the enzyme.

These unique structural features could reflect unique aspects of adaptation in type I-F CRISPR–Cas systems. In particular, in most CRISPR–Cas systems Cas1 probably interacts with the small ∼12 kDa Cas2 protein, whereas in I-F systems it interacts with the much larger 125 kDa Cas2–3 fusion protein [21]. It is possible that the polarized electrostatic surface of type I-F Cas1 could help mediate interactions with this large partner protein. Alternatively, the charge polarization could be involved in guiding negatively-charged DNA to the correct (protospacer-binding) monomer of the Cas1 dimer (Figures 4g and 4h). It is also feasible that this surface is involved in proposed interactions of the whole Cas1–Cas2–3 complex with the Csy–crRNA CRISPR interference complex during primed spacer acquisition [21,27]. Further biochemical and structural studies of the Cas1–Cas2–3 complex will help resolve the mechanism of spacer acquisition in type I-F CRISPR–Cas systems.

AUTHOR CONTRIBUTION

Max Wilkinson, Raymond Staals, Sebastian Kieper, Rebecca McKenzie, Peter Fineran and Kurt Krause designed the experiments. Max Wilkinson, Yoshio Nakatani, Sebastian Kieper, Helen Opel-Reading and Rebecca McKenzie performed the experiments. Max Wilkinson, Yoshio Nakatani, Raymond Staals, Sebastian Kieper, Rebecca McKenzie, Peter Fineran and Kurt Krause analysed the data. Max Wilkinson drafted the manuscript with input from Yoshio Nakatani, Raymond Staals, Peter Fineran and Kurt Krause.

ACKNOWLEDGEMENTS

We thank Yanli Wang for providing structural data for the E. coli Cas1–Cas2 protospacer complex, Corinna Richter and Sylvia Luckner for initial Cas1 expression and purification studies, Sinothai Poen for important input into the structure, and other members of the Krause and Fineran laboratories for helpful discussions. This research was undertaken in part on the MX1 beamline at the Australian Synchrotron, Victoria, Australia.

FUNDING

This work was supported by the University of Otago (to P.C.F. and K.L.K.); the Royal Society of New Zealand (to P.C.F. and K.L.K.); the Health Research Council (to Y.N.); the Maurice Wilkins Centre (to Y.N.); the University of Otago Division of Health Sciences (to R.H.J.S.); and the Otago School of Medical Sciences (to M.E.W.).

Abbreviations

     
  • Cas: CRISPR associated
  • Cascade: CRISPR associated complex for antiviral defence
  • CRISPR: clustered regularly interspaced short palindromic repeat
  • crRNA: CRISPR RNA
  • LB: Lysogeny Broth
  • PAM: protospacer adjacent motif
  • NTD: N-terminal dimerization domain
  • WT: wild type

References
1. Bergh, O., Borsheim, K.Y., Bratbak, G. and Heldal, M. (1989) High abundance of viruses found in aquatic environments. Nature 340, 467–468
2. Wommack, K.E. and Colwell, R.R. (2000) Virioplankton: viruses in aquatic ecosystems. Microbiol. Mol. Biol. Rev. 64, 69–114
3. Salmond, G.P. and Fineran, P.C. (2015) A century of the phage: past, present and future. Nat. Rev. Microbiol. 13, 777–786
4. Lima-Mendez, G., Toussaint, A. and Leplae, R. (2007) Analysis of the phage sequence space: The benefit of structured information. Virology 365, 241–249
5. Dy, R.L., Richter, C., Salmond, G.P.C. and Fineran, P.C. (2014) Remarkable mechanisms in microbes to resist phage infections. Ann. Rev. Virol. 1, 307–331
6. Samson, J.E., Magadan, A.H., Sabri, M. and Moineau, S. (2013) Revenge of the phages: defeating bacterial defences. Nat. Rev. Microbiol. 11, 675–687
7. van der Oost, J., Westra, E.R., Jackson, R.N. and Wiedenheft, B. (2014) Unravelling the structural and mechanistic basis of CRISPR-Cas systems. Nat. Rev. Microbiol. 12, 479–492
8. Fineran, P.C. and Charpentier, E. (2012) Memory of viral infections by CRISPR-Cas adaptive immune systems: acquisition of new information. Virology 434, 202–209
9. Grissa, I., Vergnaud, G. and Pourcel, C. (2007) The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinformatics 8, 172
10. Makarova, K.S., Wolf, Y.I., Alkhnbashi, O.S., Costa, F., Shah, S.A., Saunders, S.J., Barrangou, R., Brouns, S.J., Charpentier, E., Haft, D.H. et al. (2015) An updated evolutionary classification of CRISPR-Cas systems. Nat. Rev. Microbiol. 13, 722–736
11. Shmakov, S., Abudayyeh, O.O., Makarova, K.S., Wolf, Y.I., Gootenberg, J.S., Semenova, E., Minakhin, L., Joung, J., Konermann, S., Severinov, K. et al. (2015) Discovery and functional characterization of diverse class 2 CRISPR-Cas systems. Mol. Cell 60, 385–397
12. Plagens, A., Richter, H., Charpentier, E. and Randau, L. (2015) DNA and RNA interference mechanisms by CRISPR-Cas surveillance complexes. FEMS Microbiol. Rev. 39, 442–463
13. Jackson, R.N. and Wiedenheft, B. (2015) A conserved structural chassis for mounting versatile CRISPR RNA-Guided immune responses. Mol. Cell 58, 722–728
14. Barrangou, R., Fremaux, C., Deveau, H., Richards, M., Boyaval, P., Moineau, S., Romero, D.A. and Horvath, P. (2007) CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712
15. Yosef, I., Goren, M.G. and Qimron, U. (2012) Proteins and DNA elements essential for the CRISPR adaptation process in Escherichia coli. Nucleic Acids Res. 40, 5569–5576
16. Datsenko, K.A., Pougach, K., Tikhonov, A., Wanner, B.L., Severinov, K. and Semenova, E. (2012) Molecular memory of prior infections activates the CRISPR/Cas adaptive bacterial immunity system. Nat. Commun. 3, 945
17. Wiedenheft, B., Zhou, K., Jinek, M., Coyle, S.M., Ma, W. and Doudna, J.A. (2009) Structural basis for DNase activity of a conserved protein implicated in CRISPR-mediated genome defense. Structure 17, 904–912
18. Babu, M., Beloglazova, N., Flick, R., Graham, C., Skarina, T., Nocek, B., Gagarinova, A., Pogoutse, O., Brown, G., Binkowski, A. et al. (2011) A dual function of the CRISPR-Cas system in bacterial antivirus immunity and DNA repair. Mol. Microbiol. 79, 484–502
19. Beloglazova, N., Brown, G., Zimmerman, M.D., Proudfoot, M., Makarova, K.S., Kudritska, M., Kochinyan, S., Wang, S., Chruszcz, M., Minor, W. et al. (2008) A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J. Biol. Chem. 283, 20361–20371
20. Nam, K.H., Ding, F., Haitjema, C., Huang, Q., DeLisa, M.P. and Ke, A. (2012) Double-stranded endonuclease activity in Bacillus halodurans clustered regularly interspaced short palindromic repeats (CRISPR)-associated Cas2 protein. J. Biol. Chem. 287, 35943–35952
21. Richter, C., Gristwood, T., Clulow, J.S. and Fineran, P.C. (2012) In vivo protein interactions and complex formation in the Pectobacterium atrosepticum subtype I-F CRISPR/Cas System. PloS One 7, e49549
22. Nuñez, J.K., Kranzusch, P.J., Noeske, J., Wright, A.V., Davies, C.W. and Doudna, J.A. (2014) Cas1-Cas2 complex formation mediates spacer acquisition during CRISPR-Cas adaptive immunity. Nat. Struct. Mol. Biol. 21, 528–534
23. Nuñez, J.K., Lee, A.S., Engelman, A. and Doudna, J.A. (2015) Integrase-mediated spacer acquisition during CRISPR-Cas adaptive immunity. Nature 519, 193–198
24. Levy, A., Goren, M.G., Yosef, I., Auster, O., Manor, M., Amitai, G., Edgar, R., Qimron, U. and Sorek, R. (2015) CRISPR adaptation biases explain preference for acquisition of foreign DNA. Nature 520, 505–510
25. Swarts, D.C., Mosterd, C., van Passel, M.W. and Brouns, S.J. (2012) CRISPR interference directs strand specific spacer acquisition. PloS One 7, e35888
26. Fineran, P.C., Gerritzen, M.J., Suárez-Diez, M., Künne, T., Boekhorst, J., van Hijum, S.A., Staals, R.H. and Brouns, S.J. (2014) Degenerate target sites mediate rapid primed CRISPR adaptation. Proc. Natl. Acad. Sci. U.S.A. 111, E1629–E1638
27. Richter, C., Dy, R.L., McKenzie, R.E., Watson, B.N., Taylor, C., Chang, J.T., McNeil, M.B., Staals, R.H. and Fineran, P.C. (2014) Priming in the Type I-F CRISPR-Cas system triggers strand-independent spacer acquisition, bi-directionally from the primed protospacer. Nucleic Acids Res. 42, 8516–8526
28. Redding, S., Sternberg, S.H., Marshall, M., Gibb, B., Bhat, P., Guegler, C.K., Wiedenheft, B., Doudna, J.A. and Greene, E.C. (2015) Surveillance and processing of foreign DNA by the Escherichia coli CRISPR-Cas system. Cell 163, 854–865
29. Wang, J., Li, J., Zhao, H., Sheng, G., Wang, M., Yin, M. and Wang, Y. (2015) Structural and mechanistic basis of PAM-dependent spacer acquisition in CRISPR-Cas systems. Cell 163, 840–853
30. Mojica, F.J., Diez-Villasenor, C., Garcia-Martinez, J. and Almendros, C. (2009) Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733–740
31. Nuñez, J.K., Harrington, L.B., Kranzusch, P.J., Engelman, A.N. and Doudna, J.A. (2015) Foreign DNA capture during CRISPR-Cas adaptive immunity. Nature 527, 535–538
32. Goren, M.G., Yosef, I., Auster, O. and Qimron, U. (2012) Experimental definition of a clustered regularly interspaced short palindromic duplicon in Escherichia coli. J. Mol. Biol. 423, 14–16
33. Rollie, C., Schneider, S., Brinkmann, A.S., Bolt, E.L. and White, M.F. (2015) Intrinsic sequence specificity of the Cas1 integrase directs new spacer acquisition. Elife 4, e08716
34. Ivančić-Baće, I., Cass, S.D., Wearne, S.J. and Bolt, E.L. (2015) Different genome stability proteins underpin primed and naïve adaptation in E. coli CRISPR-Cas immunity. Nucleic Acids Res. 43, 10821–10830
35. Arslan, Z., Hermanns, V., Wurm, R., Wagner, R. and Pul, U. (2014) Detection and characterization of spacer integration intermediates in type I-E CRISPR-Cas system. Nucleic Acids Res. 42, 7884–7893
36. Kaniga, K., Delor, I. and Cornelis, G.R. (1991) A wide-host-range suicide vector for improving reverse genetics in gram-negative bacteria: inactivation of the blaA gene of Yersinia enterocolitica. Gene 109, 137–141
37. Fineran, P.C., Everson, L., Slater, H. and Salmond, G.P.C. (2005) A GntR family transcriptional regulator (PigT) controls gluconate-mediated repression and defines a new, independent pathway for regulation of the tripyrrole antibiotic, prodigiosin, in Serratia. Microbiology 151, 3833–3845
38. Ho, S.N., Hunt, H.D., Horton, R.M., Pullen, J.K. and Pease, L.R. (1989) Site-directed mutagenesis by overlap extension using the polymerase chain-reaction. Gene 77, 51–59
39. Grinter, N.J. (1983) A Broad-Host-Range cloning vector transposable to various replicons. Gene 21, 133–143
40. Herrero, M., Delorenzo, V. and Timmis, K.N. (1990) Transposon vectors containing non-antibiotic resistance selection markers for cloning and stable chromosomal insertion of foreign genes in Gram-Negative Bacteria. J. Bacteriol. 172, 6557–6567
41. Vercoe, R.B., Chang, J.T., Dy, R.L., Taylor, C., Gristwood, T., Clulow, J.S., Richter, C., Przybilski, R., Pitman, A.R. and Fineran, P.C. (2013) Cytotoxic chromosomal targeting by CRISPR/Cas systems can reshape bacterial genomes and expel or remodel pathogenicity islands. PLoS Genet. 9, e1003454
42. Przybilski, R., Richter, C., Gristwood, T., Clulow, J.S., Vercoe, R.B. and Fineran, P.C. (2011) Csy4 is responsible for CRISPR RNA processing in Pectobacterium atrosepticum. RNA Biol. 8, 517–528
43. Heckman, K.L. and Pease, L.R. (2007) Gene splicing and mutagenesis by PCR-driven overlap extension. Nat. Protoc. 2, 924–932
44. Guzman, L.M., Belin, D., Carson, M.J. and Beckwith, J. (1995) Tight regulation, modulation, and High-level expression by vectors containing the Arabinose P-Bad promoter. J. Bacteriol. 177, 4121–4130
45. Kabsch, W. (2010) XDS. Acta Crystallogr. Sect. D-Biol. Crystallogr. 66, 125–132
46. Winn, M.D., Ballard, C.C., Cowtan, K.D., Dodson, E.J., Emsley, P., Evans, P.R., Keegan, R.M., Krissinel, E.B., Leslie, A.G.W., McCoy, A. et al. (2011) Overview of the CCP4 suite and current developments. Acta Crystallogr. Sect. D-Biol. Crystallogr. 67, 235–242
47. McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C. and Read, R.J. (2007) Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674
48. Emsley, P., Lohkamp, B., Scott, W.G. and Cowtan, K. (2010) Features and development of Coot. Acta Crystallogr. Sect. D-Biol. Crystallogr. 66, 486–501
49. Murshudov, G.N., Skubak, P., Lebedev, A.A., Pannu, N.S., Steiner, R.A., Nicholls, R.A., Winn, M.D., Long, F. and Vagin, A.A. (2011) REFMAC5 for the refinement of macromolecular crystal structures. Acta Crystallogr. Sect. D-Biol. Crystallogr. 67, 355–367
50. Adams, P.D., Afonine, P.V., Bunkoczi, G., Chen, V.B., Davis, I.W., Echols, N., Headd, J.J., Hung, L.W., Kapral, G.J., Grosse-Kunstleve, R.W. et al. (2010) PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. Sect. D-Biol. Crystallogr. 66, 213–221
51. Schrödinger, L. (2010) The PyMOL Molecular Graphics System, Version 1.7.6.2
52. Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C. and Ferrin, T.E. (2004) UCSF Chimera–a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612
53. Li, M., Wang, R., Zhao, D. and Xiang, H. (2014) Adaptation of the Haloarcula hispanica CRISPR-Cas system to a purified virus strictly requires a priming process. Nucleic Acids Res. 42, 2483–2492
54. Wei, Y., Terns, R.M. and Terns, M.P. (2015) Cas9 function and host genome sampling in Type II-A CRISPR-Cas adaptation. Genes Dev. 29, 356–361
55. Heler, R., Samai, P., Modell, J.W., Weiner, C., Goldberg, G.W., Bikard, D. and Marraffini, L.A. (2015) Cas9 specifies functional viral targets during CRISPR-Cas adaptation. Nature 519, 199–202
56. Kim, T.Y., Shin, M., Huynh Thi Yen, L. and Kim, J.S. (2013) Crystal structure of Cas1 from Archaeoglobus fulgidus and characterization of its nucleolytic activity. Biochem. Biophys. Res. Commun. 441, 720–725
57. Vorontsova, D., Datsenko, K.A., Medvedeva, S., Bondy-Denomy, J., Savitskaya, E.E., Pougach, K., Logacheva, M., Wiedenheft, B., Davidson, A.R., Severinov, K. and Semenova, E. (2015) Foreign DNA acquisition by the I-F CRISPR-Cas system requires all components of the interference machinery. Nucleic Acids Res. 43, 10848–10860
58. Harding, M.M. (2004) The architecture of metal coordination groups in proteins. Acta Crystallogr. Sect. D-Biol. Crystallogr. 60, 849–859
59. Miller, M.D., Cai, J. and Krause, K.L. (1999) The active site of Serratia endonuclease contains a conserved magnesium-water cluster. J. Mol. Biol. 288, 975–987
60. Pourcel, C., Salvignol, G. and Vergnaud, G. (2005) CRISPR elements in Yersinia pestis acquire new repeats by preferential uptake of bacteriophage DNA, and provide additional tools for evolutionary studies. Microbiology 151, 653–663
61. Bell, K.S., Sebaihia, M., Pritchard, L., Holden, M.T.G., Hyman, L.J., Holeva, M.C., Thomson, N.R., Bentley, S.D., Churcher, L.J.C., Mungall, K. et al. (2004) Genome sequence of the enterobacterial phytopathogen Erwinia carotovora subsp atroseptica and characterization of virulence factors. Proc. Natl. Acad. Sci. U.S.A. 101, 11105–11110
62. Baker, N.A., Sept, D., Joseph, S., Holst, M.J. and McCammon, J.A. (2001) Electrostatics of nanosystems: application to microtubules and the ribosome. Proc. Natl. Acad. Sci. U.S.A. 98, 10037–10041

Author notes

2 Present address: MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, U.K.

3 Present address: Laboratory of Microbiology, Wageningen University, 6703 HB Wageningen, Netherlands.

Supplementary data