CRISPR (clustered regularly interspaced short palindromic repeats) arrays and Cas (CRISPR-associated) proteins confer acquired resistance against mobile genetic elements in a wide range of bacteria and archaea. The phytopathogen Pectobacterium atrosepticum SCRI1043 encodes a single subtype I-F CRISPR system, which is composed of three CRISPR arrays and the cas operon encoding Cas1, Cas3 (a Cas2–Cas3 fusion), Csy1, Csy2, Csy3 and Cas6f (Csy4). The CRISPR arrays are transcribed into pre-crRNA (CRISPR RNA) and then processed by Cas6f to generate crRNAs. Furthermore, the formation of Cas protein complexes has been implicated in both the interference and acquisition stages of defence. In the present paper, we discuss the development of tightly controlled ‘programmable’ CRISPR arrays as tools to investigate CRISPR–Cas function and the effects of chromosomal targeting. Finally, we address how chromosomal targeting by CRISPR–Cas can cause large-scale genome deletions, which can ultimately influence bacterial evolution and pathogenicity.
Being constantly challenged by foreign genetic elements, such as phages or plasmids, prokaryotes have evolved numerous mechanisms to avoid invasion [1,2]. One group of systems, composed of CRISPR (clustered regularly interspaced short palindromic repeats) arrays and Cas (CRISPR-associated) proteins, is widespread among bacteria and archaea. The CRISPR–Cas defence differs from other known defence mechanisms in that it provides acquired sequence-specific resistance by utilizing siRNAs derived from short sequences of the invaders. There is considerable variation between different CRISPR–Cas systems; however, the main characteristics are similar. CRISPR arrays are composed of near-identical repeats, which alternate with unique spacers derived from extrachromosomal elements, and an upstream leader sequence of several hundred base pairs that contains the promoter and sequences required for the acquisition of new spacers [5–7]. Genes encoding the Cas proteins are typically located in close proximity to the CRISPR array(s). The Cas proteins provide the enzymatic machinery involved in all steps of the resistance mechanism [8,9]. Resistance is accomplished in three steps. First, upon entry of an invading element, short sequences from its genome (protospacers or precursor spacers) are integrated into the CRISPR array as new spacers [9–12]. In many systems, short nucleotide motifs next to the protospacer sequence [PAMs (protospacer-adjacent motifs)] are important for spacer acquisition as well as for interference [13,14]. Next, the CRISPR array is transcribed into a pre-crRNA (CRISPR RNA), which is processed into guide RNAs (crRNAs) consisting of the spacer sequence flanked by remnants of one or both adjacent repeats. Finally, the crRNAs are incorporated into a ribonucleoprotein complex, which identifies complementary invader sequences and promotes their degradation, either alone or upon recruitment of another Cas protein [9,11,12].
Currently, CRISPR–Cas systems are classified into three major types and various subtypes, based on the phylogeny and composition of their Cas proteins. All CRISPR–Cas systems possess Cas1, the universal marker, and Cas2, which can be either separate or present as a fusion to another protein. The main types are characterized by a signature protein (e.g. Cas3, Cas9 or Cas10 for Types I–III), and the subtypes are distinguished by their unique subset of proteins. Unsurprisingly, the mechanisms by which resistance is conferred differ between the different CRISPR–Cas types (see [9,15,16] for reviews).
Pectobacterium atrosepticum SCRI1043 (formerly Erwinia carotovora subsp. atroseptica) is a member of the Enterobacteriaceae and is a major cause of soft rot and blackleg disease in potato. Virulence is mainly determined by a large group of quorum-sensing-regulated PCWDEs (plant cell wall-degrading enzymes), including 20 predicted pectinases and further putative cellulases. A total of 183 phage genes and 17 putative HAIs (horizontally acquired islands) have been predicted in the genome, of which some have important roles in virulence. HAI2, for example, encodes a cluster of cfa genes involved in the production of the phytotoxin component coronafacic acid, the deletion of which resulted in drastically reduced virulence. Therefore encounters with mobile genetic elements have played an important role in the evolution of P. atrosepticum. At the same time, both ‘innate’ and ‘adaptive’ immune systems exist to control phage infection. As part of the ‘innate’ immune arsenal, P. atrosepticum possesses an abortive infection/toxin–antitoxin system that triggers cell suicide upon phage infection, limiting viral spread and resulting in altruistic population-level protection [20–22]. P. atrosepticum also harbours the CRISPR–Cas ‘adaptive’ immune system, which we review in the following sections.
The CRISPR–Cas system of P. atrosepticum
P. atrosepticum SCRI1043 encodes a subtype I-F (formerly Ypest) CRISPR–Cas system, which includes three CRISPR arrays and the cas operon encoding Cas1, Cas3 (a Cas2–Cas3 fusion) and the subtype-specific proteins Csy1, Csy2, Csy3 and Cas6f (formerly Csy4) (Figure 1A). CRISPR1 is divergently transcribed from the cas operon, whereas CRISPR2–CRISPR3 are located downstream. CRISPR2 and CRISPR3 are separated by a putative toxin–antitoxin system (eca3686–eca3687) composed of a VagC family antitoxin and a VapC toxin. CRISPR arrays 1–3 contain 28, ten and three spacers respectively, which are 32–33 nt in length (predominantly 32 nt). Spacers 2 and 19 in CRISPR1 match, albeit imperfectly, targets in a predicted prophage, ΦPCC21_1, in the genome of Pectobacterium carotovorum subsp. carotovorum strain PCC21. Additionally, spacer 6 in CRISPR2 shows a 100% match to a target in eca0560, a gene on HAI2 within the P. atrosepticum chromosome; however, a single nucleotide deviation from the consensus PAM prevents this spacer from being functional for interference [23,26]. As observed for many other CRISPR–Cas systems, the subtype I-F repeats are partly palindromic and have the ability to form stem–loops, which is important for the mechanism of pre-crRNA processing to crRNAs [9,27]. The repeats in CRISPR1–CRISPR3 of P. atrosepticum are 28 nt and exhibit the consensus sequence (5′-GTTCACTCGCGTACAGGCAGCTTAGAAA-3′) [23,27]; however, variations from the consensus sequence do occur in some repeats throughout all three arrays (Figure 1B). Promoter activity of the leader region was shown for all three arrays, and the activity of CRISPR3, the shortest array, was lower than that observed for CRISPR1 and CRISPR2, suggesting that this array might be less active. Furthermore, crRNA production from all three CRISPR arrays was detected.
Figure 1. The CRISPR–Cas system of P. atrosepticum SCRI1043
The cas genes are transcribed as a polycistronic mRNA, with the transcriptional start ~140 nt upstream of cas1 (Figure 1A). It is not yet clear whether this long untranslated region is important for regulation of cas expression. Promoter activity was confirmed by cloning ~500 bp upstream of cas1 into a promoterless lacZ transcriptional fusion vector. Furthermore, a translational lacZ fusion to the cas1 ATG in the chromosome revealed increasing expression throughout growth, reaching maximal expression in stationary phase [23,26]. Over the last few years, several studies in P. atrosepticum have led to a model of the function of this subtype I-F CRISPR–Cas system, which is summarized in Figure 2.
Figure 2. Model of CRISPR–Cas function in P. atrosepticum
Cas6f is required for generation of crRNAs
The generation of guide RNAs (crRNAs) is a key element within CRISPR–Cas defence. We observed crRNAs for all CRISPR arrays in the wild-type with sizes of approximately 55–60 nt, consistent with a single endonucleolytic processing event in each repeat. No crRNA was detected in a cas operon deletion mutant, from which all six cas genes were absent, demonstrating the involvement of Cas proteins. The generation of crRNAs was restored by expressing Cas6f, but not by any of the other Cas proteins. Additionally, no crRNAs were detected in a Δcas6f deletion mutant, suggesting that Cas6f alone is the endoribonuclease responsible for processing the pre-crRNAs into crRNAs (Figure 2C). In contrast, in Pseudomonas aeruginosa, csy1, csy2, csy3, cas6f (csy4) and cas3 have been suggested to have a role in vivo for crRNA generation. Our results are consistent with in vitro observations for Cas6f from P. aeruginosa, which cleaves the pre-crRNA between bases 20 and 21 of the repeat, leaving an 8 nt repeat handle at the 5′ end of the crRNA and a 20 nt handle with the stem–loop structure derived from the downstream repeat at the 3′ end. Indeed, G20A repeat mutations in P. atrosepticum abolished targeting, presumably by disrupting processing. P. aeruginosa Cas6f pre-crRNA binding and cleavage are strongly dependent on both the correct secondary structure and the sequence of the crRNA stem–loop. In agreement, a C18A mutation, which would introduce a bulge in the P. atrosepticum repeat stem, abolished targeting. Interestingly, introduction of a second compensatory mutation (G8U) restored the ability to target, indicating that the stem–loop structure was important for P. atrosepticum Cas6f function and that sequence changes were tolerated. In P. aeruginosa, the same mutation caused a 2-fold reduction in repeat cleavage by Cas6f [26,31].
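The size arithmetic in this section (an 8 nt 5′ handle, a 32 nt spacer and a 20 nt 3′ handle, giving a crRNA of about 60 nt) can be illustrated with a minimal Python sketch. This is a string model only and not the analysis used in the cited studies: the function names `build_pre_crrna` and `cleave` are hypothetical, the repeat is the 28 nt consensus as printed in the text, and the placeholder spacers are invented for illustration.

```python
# Toy model of Cas6f processing: cut every repeat between bases 20 and 21.
# REPEAT is the 28 nt consensus quoted in the text (DNA alphabet);
# the spacers used below are made-up placeholders, not real P. atrosepticum spacers.
REPEAT = "GTTCACTCGCGTACAGGCAGCTTAGAAA"
CUT_AFTER = 20  # cleavage between repeat positions 20 and 21 (1-based)


def to_rna(seq):
    """Convert a DNA string to its RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def build_pre_crrna(spacers, repeat=REPEAT):
    """Assemble repeat-spacer-repeat-...-repeat as a single RNA string."""
    parts = [to_rna(repeat)]
    for spacer in spacers:
        parts.append(to_rna(spacer))
        parts.append(to_rna(repeat))
    return "".join(parts)


def cleave(pre_crrna, repeat=REPEAT, cut_after=CUT_AFTER):
    """Return the fragments lying between consecutive repeat cut sites."""
    rna_repeat = to_rna(repeat)
    cuts = []
    start = pre_crrna.find(rna_repeat)
    while start != -1:
        cuts.append(start + cut_after)
        start = pre_crrna.find(rna_repeat, start + 1)
    # Each fragment: 8 nt 5' handle + spacer + 20 nt 3' handle.
    return [pre_crrna[a:b] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    spacers = ["A" * 32, "C" * 32]  # placeholder 32 nt spacers
    for crrna in cleave(build_pre_crrna(spacers)):
        print(len(crrna), crrna)
```

With 32 nt spacers each fragment is 8 + 32 + 20 = 60 nt long, matching the upper end of the observed 55–60 nt range; the leader-proximal and trailer fragments outside the first and last cut are deliberately discarded in this sketch.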
Formation of a Csy protein complex for interference
Cas6f, together with Csy1, Csy2 and Csy3, forms the P. atrosepticum subtype I-F-specific targeting complex, named the Csy complex (Figure 2, inset). It is likely that a Cas6f–crRNA complex is the prerequisite for the formation of this complex, as was observed in the homologous system in P. aeruginosa. Directed pairwise co-immunoprecipitation experiments in vivo revealed the protein–protein interaction network of the P. atrosepticum Csy complex. The endoribonuclease Cas6f interacted with Csy3, but only in the presence of the other Cas proteins. Indeed, no interaction between Cas6f and Csy3 was detected in a Δcas strain, which is deficient in other Cas proteins, but generates crRNAs. Csy3 formed homo-multimers in both the presence and absence of other CRISPR–Cas components, and was predicted to form the backbone of the Csy complex in P. atrosepticum as well as in P. aeruginosa [32,34]. A homomultimeric backbone is a common feature of the Type I targeting complexes [35–37] and serves to bind the spacer portion of the crRNA to facilitate binding to the complementary target sequence on the invading genome. Csy3 also interacted with Csy1, again only in a crRNA and/or Cas protein-dependent manner. Finally, Csy1 and Csy2 interacted independently of the presence of crRNA or other Cas proteins. In summary, this is consistent with the findings of an in vitro study in P. aeruginosa and suggests that Cas6f forms the head of the Csy complex, in which Csy3 represents the neck (or backbone) connecting Cas6f to the Csy1–Csy2 heterodimer, which forms the ‘feet’ (Figure 2D). Cas3, which is both a nuclease and helicase, was shown to be essential for degrading invader DNA in the I-E subtype [38,39]. Since Cas3 is the Type I signature protein, it was hypothesized that it would have a similar role in the subtype I-F system. Indeed, we demonstrated that Cas3 interacted with the Csy complex, indicating an involvement in the targeting and interference process (Figures 2D and 2E).
Cas1 and Cas2–Cas3 form a complex possibly required for acquisition
Cas1 and Cas2 have recently been shown to be involved in spacer acquisition in the subtype I-E system [5,40]. We hypothesized that Cas1 and Cas2 proteins might interact directly as part of the spacer-acquisition mechanism [10,32]. An interesting feature of the subtype I-F system is the absence of a separate Cas2 protein. Instead, a Cas2 domain at the N-terminus of Cas3 was proposed, but not unanimously accepted. Therefore we performed structural homology searches and alignments of characterized Cas2 homologues. These analyses confirmed that critical residues found in Cas2 homologues are conserved, and therefore that a Cas2-like domain is present at the N-terminus of Cas3 of P. atrosepticum. The presence of a Cas2 domain at the N-termini of the subtype I-F Cas3 proteins led to the question of whether Cas2–Cas3 could interact with Cas1. Native Cas3 co-purified strongly with His6-tagged Cas1, demonstrating formation of a Cas1–Cas2–Cas3 subcomplex. We propose, on the basis of the role of Cas1 and Cas2 in acquisition in the subtype I-E systems, that this Cas1–Cas2–Cas3 complex is involved in the incorporation of new spacers (Figure 2B). The requirement, if any, for Cas3 in this acquisition complex is unclear. Cas3 in the subtype I-E system of E. coli is important for a positive-feedback loop during spacer acquisition that enables an accelerated, or ‘priming’, mechanism for the incorporation of new spacers [40,42]. The Cas3 domains in the Cas1–Cas2–Cas3 complex may facilitate similar ‘primed’ spacer integration in the subtype I-F systems. Therefore P. atrosepticum Cas2–Cas3 is likely to act as a double agent, being involved in both spacer acquisition and interference.
Engineered CRISPRs as a tool to investigate chromosomal targeting
A large number of spacers match sequences in the genome of the organism encoding the respective CRISPR array. It has been proposed that the incorporation of self-spacers is accidental, but some CRISPR arrays can be regulatory [43,44]. In order to test the effect of chromosomal targeting, a method to engineer CRISPR arrays was developed (Figure 3A). The following strategy is applicable for generating arrays containing any number of spacers of interest for any CRISPR system to study interference (i.e. plasmid or phage invasion, or chromosomal targeting). Entry plasmids were generated that contained a CRISPR1 leader sequence and a single repeat with a BbsI site 3′ of the repeat. The BbsI restriction enzyme generates 4 nt sticky overhangs after cutting 2 nt (top strand) and 6 nt (lower strand) 3′ from the recognition site (5′-GAAGAC-3′) (Figure 3B). Oligonucleotides are designed with an appropriate repeat and BbsI sequence to amplify a user-defined spacer sequence (Figure 3C). Following BbsI digestion, unique ends are generated, enabling sticky-ended directional cloning of each new spacer into the entry vector (Figure 3D). Since the BbsI site remains at the trailer end of the array following each ligation, cloning can be repeated multiple times, with each new spacer added at the leader-distal end (Figure 3E).
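The BbsI cut geometry described above (cleavage 2 nt downstream of the site on the top strand and 6 nt downstream on the lower strand, leaving a 4 nt overhang) can be expressed as a short Python sketch. The function name `bbsi_cut` and the example sequence are hypothetical, coordinates are given as 0-based indices on the top strand, and only a site in the forward orientation is considered; this is an illustration of the arithmetic, not the cloning design used in the cited work.

```python
# Minimal sketch of BbsI cut-site arithmetic on the top strand.
# Recognition site and cut offsets (2 nt top, 6 nt lower strand) are taken from the text;
# the function name and test sequence are illustrative assumptions.
SITE = "GAAGAC"
TOP_OFFSET = 2     # top strand is cut 2 nt 3' of the site
BOTTOM_OFFSET = 6  # lower strand is cut 6 nt 3' of the site (top-strand coordinates)


def bbsi_cut(top_strand):
    """Return (top_cut, bottom_cut, overhang) for the first forward BbsI site.

    Cut positions are 0-based indices into the top strand marking where the
    backbone is broken; overhang is the 4 nt stretch left single-stranded.
    Returns None if no site is found or the sequence is too short downstream.
    """
    seq = top_strand.upper()
    start = seq.find(SITE)
    if start == -1:
        return None
    site_end = start + len(SITE)
    top_cut = site_end + TOP_OFFSET
    bottom_cut = site_end + BOTTOM_OFFSET
    if bottom_cut > len(seq):
        return None
    return top_cut, bottom_cut, seq[top_cut:bottom_cut]


if __name__ == "__main__":
    print(bbsi_cut("AAGAAGACTTGCATCCCC"))
```

Because the overhang sequence lies outside the recognition site, it can be chosen freely when designing the oligonucleotides, which is what allows each spacer to be cloned directionally.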
Figure 3. Development of engineered CRISPRs
Plasmids were designed with various lengths of leader (780, 180 or 52 bp) 5′ from the first repeat, which resulted in expression under the native CRISPR promoter. To enable tight control of the engineered CRISPRs using the araBAD promoter, plasmids containing only 16 bp of leader sequence were required, which led to the identification of a putative CRISPR1 promoter within 52 bp of the leader. Expression of engineered CRISPRs that targeted the lacZ (β-galactosidase) or the expI genes (involved in the production of quorum-sensing signals) led to growth inhibition and a 10^5-fold reduction in viable count in the wild-type, but not in a Cas-protein-deficient strain. Targeting of expI caused cellular elongation from ~2 μm in the control to a mean of ~10 μm, which is indicative of DNA damage and consistent with DNA, rather than RNA, being the target in subtype I-F systems. Notably, a single spacer was sufficient to cause toxicity and deletion of the target gene abolished the toxic phenotype. These experiments demonstrated the functionality of these ‘programmable’ CRISPR plasmids and showed that chromosomal targeting by engineered CRISPRs is toxic and CRISPR–Cas-induced.
As mentioned above, spacer 6 in CRISPR2 has 100% complementarity to eca0560 in the HAI2 pathogenicity island in P. atrosepticum. However, the adjacent repeats contain one (downstream repeat) or two (upstream repeat) mutations. Additionally, the protospacer target has a 5′-protospacer-TG-3′ PAM, in contrast with the 5′-protospacer-GG-3′ proposed PAM consensus for subtype I-F systems [13,45]. The mutated repeats could be cleaved in vitro by Cas6f and cloning spacer 6 between two correct repeats did not cause toxicity, suggesting that the repeat mutations were not the reason for the inability of this native spacer to target HAI2. Thus it was hypothesized that the single nucleotide mutation in the PAM was the reason that spacer 6 does not confer toxicity. To test this, a CRISPR with a spacer targeting an eca0560 protospacer with a GG PAM was constructed. Indeed, the engineered spacer caused a toxic effect, confirming that a correct PAM is required for targeting and that non-optimal PAM sequences can lead to evasion from interference in subtype I-F.
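The PAM logic in this paragraph can be sketched in Python as a simple scan: locate each exact protospacer match and inspect the two nucleotides immediately 3′ of it. This is a simplified illustration under stated assumptions, not the authors' method: the function name `find_protospacers`, the example spacer and the example target are all invented, only exact matches on the strand as written are considered, and the PAM is read in the 5′-protospacer-GG-3′ orientation used in the text.

```python
# Toy PAM check: report every exact protospacer match and whether it is
# followed by the proposed subtype I-F PAM (GG) at its 3' end.
# Sequences in the example are invented placeholders, not eca0560 sequence.

def find_protospacers(target, spacer, pam="GG"):
    """Return a list of (position, flank, has_pam) for exact spacer matches.

    position is the 0-based start of the match in target, flank is the
    sequence immediately 3' of the match (same length as pam, or shorter
    at the end of target), and has_pam is True when flank equals pam.
    """
    target, spacer, pam = target.upper(), spacer.upper(), pam.upper()
    hits = []
    pos = target.find(spacer)
    while pos != -1:
        end = pos + len(spacer)
        flank = target[end:end + len(pam)]
        hits.append((pos, flank, flank == pam))
        pos = target.find(spacer, pos + 1)
    return hits


if __name__ == "__main__":
    spacer = "ATGCCGTTAGCATCGGATTACCGATAGCTTAA"  # hypothetical 32 nt spacer
    target = "TT" + spacer + "GG" + "CCC" + spacer + "TG" + "A"
    for pos, flank, ok in find_protospacers(target, spacer):
        print(pos, flank, "targetable" if ok else "PAM mismatch")
```

In this model the second match, flanked by TG, corresponds to the situation of native spacer 6: full complementarity to the protospacer but no interference because the PAM deviates from GG by a single nucleotide.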
Previously, in the E. coli subtype I-E system, an 8 nt seed sequence at the 5′ spacer end was shown to be essential for initial target binding. Additionally, in P. aeruginosa subtype I-F, short 8 nt ssDNA fragments complementary to nucleotides 1–8 of the spacer bound to the Csy complex with higher affinity than fragments complementary to other parts of the spacer. Introduction of a C3T mutation into the expI protospacer in P. atrosepticum resulted in a partial escape from targeting and a C6T mutation had no effect, indicating that certain mismatches between spacer and protospacer are tolerated. This is consistent with a discontinuous seed sequence in subtype I-E and a previous study in P. aeruginosa subtype I-F, which showed that spacer:protospacer mismatches are tolerated in certain positions [29,46]. Further experiments are necessary to determine the exact base-pairing requirements of interference in subtype I-F systems.
Chromosomal targeting effects include CRISPR–Cas-based reshaping of the genome
Mechanisms to evade the detrimental effects of targeting can include deletions within the CRISPR loci or the cas genes as well as changes of the target locus, e.g. mutations or absence of the PAM [26,43,47]. After initial growth inhibition upon targeting eca0560 on HAI2, we positively selected for suppressor mutants that survived toxicity. Investigation of a subpopulation of 20 of those suppressor mutants revealed that 13 had lost the entire pathogenicity island by precise excision, with the bacterial insertion site attB remaining. The other seven suppressor mutants showed partial deletions of HAI2, all of which included the removal of the region with the target protospacer. In a similar approach, suppressor mutants that arose after targeting lacZ, and appeared white when grown on X-Gal blue/white selection plates, had acquired deletions of lacZ and an additional >50 kb 5′ from lacZ. This region contains genes encoding non-ribosomal peptide synthetases, which might have a role in pathogenicity. Therefore CRISPR–Cas-mediated targeting of the chromosome can direct large-scale rearrangements within the bacterial genome, including the modification or deletion/excision of the target pathogenicity region. These cases of ‘aberrant’ chromosomal targeting by CRISPR–Cas systems might influence the evolution of bacterial genomes and, indeed, there is bioinformatic evidence to support this view [48–50].
P. atrosepticum SCRI1043 is a phytopathogen of economic importance, in which a number of horizontally acquired islands contribute significantly to virulence. P. atrosepticum encodes a subtype I-F CRISPR–Cas system with three actively transcribed CRISPR arrays. The pre-crRNAs are processed into crRNAs by Cas6f, and a targeting Csy complex forms, composed of the subtype-specific proteins (Csy1, Csy2, Csy3 and Cas6f) and Cas2–Cas3. Furthermore, we revealed formation of a Cas1–Cas2–Cas3 complex, which is proposed to be involved in spacer acquisition. We have developed a unique, widely applicable strategy for generating engineered CRISPR arrays. In addition, chromosomal targeting led to substantial rearrangements within the genome that can have significant evolutionary effects on pathogenicity. Despite our increasing knowledge about subtype I-F CRISPR–Cas systems, many questions remain unanswered and possible applications await development. For example, little is known about spacer acquisition or how the CRISPR–Cas defence systems overlap and function with other resistance mechanisms in P. atrosepticum.
CRISPR Evolution, Mechanisms and Infection: A Biochemical Society Focused Meeting held at the University of St Andrews, U.K., 17–19 June 2013. Organized and Edited by Emmanuelle Charpentier (Laboratory for Molecular Infection Medicine Sweden, Sweden), John van der Oost (Wageningen University, The Netherlands) and Malcolm White (University of St Andrews, U.K.).
CRISPR work in the Fineran Laboratory is supported by a Rutherford Discovery Fellowship (to P.C.F.). C.R. was supported by a University of Otago Doctoral Scholarship, a Deutscher Akademischer Austauschdienst (DAAD) Doktorandenstipendium and by a University of Otago Postgraduate Publishing Bursary (Ph.D.).
The authors thank past and present members of the Fineran laboratory for helpful discussions and Ron Dy for critically reading the paper prior to submission.