Uptake of foreign mobile genetic elements is often detrimental and can result in cell death. For protection against invasion, prokaryotes have developed several defence mechanisms, which take effect at all stages of infection; an example is the recently discovered CRISPR (clustered regularly interspaced short palindromic repeats)–Cas (CRISPR-associated) immune system. This defence system directly degrades invading genetic material and is present in almost all archaea and many bacteria. Current data indicate a large variety of mechanistic molecular approaches. Although almost all archaea carry this defence weapon, only a few archaeal systems have been fully characterized. In the present paper, we summarize the prerequisites for the detection and degradation of invaders in the halophilic archaeon Haloferax volcanii. H. volcanii encodes a subtype I-B CRISPR–Cas system and the defence can be triggered by a plasmid-based invader. Six different target-interference motifs are recognized by the Haloferax defence and a 9-nt non-contiguous seed sequence is essential. The repeat sequence has the potential to fold into a minimal stem–loop structure, which is conserved in haloarchaea and might be recognized by the Cas6 endoribonuclease during the processing of CRISPR loci into mature crRNA (CRISPR RNA). Individual crRNA species were present in very different concentrations according to an RNA-Seq analysis and many were unable to trigger a successful defence reaction. Recognition of the plasmid invader does not depend on its copy number, but instead results indicate a dependency on the type of origin present on the plasmid.
The diversity of CRISPR (clustered regularly interspaced short palindromic repeats)–Cas (CRISPR-associated) systems
A recently discovered defence strategy of prokaryotes against foreign genetic elements is the CRISPR–Cas system [1–6]. Central elements of this system are the Cas proteins and the crRNAs (CRISPR RNAs). The system detects invading foreign nucleic acid and degrades it. The CRISPR–Cas system has been identified in almost all archaea and in approximately half of bacteria. Comparison of the Cas proteins from different organisms revealed that they can be sorted into 45 Cas protein families that have been grouped into three major types (I–III), which differ considerably from one another. The major types have been further divided into ten subtypes (I-A–I-F, II-A and II-B, and III-A and III-B), which again vary significantly. Type I contains the most subtypes, and its six subtypes are not equally distributed between the Bacteria and Archaea domains. Archaea contain mainly subtypes I-A, I-B and I-D, whereas bacteria contain mainly subtypes I-C, I-E and I-F. Comparatively little is known about the archaeal CRISPR–Cas systems, especially about subtypes I-B and I-D. In the present paper, we summarize our investigations concerning subtype I-B in the halophilic archaeon Haloferax volcanii.
Characteristics of the Haloferax CRISPR–Cas subtype I-B system
H. volcanii is a model archaeon, representative of the euryarchaeal class of haloarchaea. It is halophilic, requiring high salt concentrations for growth, and maintains similar salt concentrations intracellularly to cope with the high salt concentration in the medium. H. volcanii (H119) encodes a subtype I-B system with eight Cas proteins (Cas1–Cas8b) and three CRISPR loci [12,13]. We have shown previously that the CRISPR–Cas system is active: all three CRISPR loci are constitutively expressed and processed into mature crRNAs. The spacers encoded in the three CRISPR loci show only two matches to sequences in the public sequence databases: one with an overall sequence identity of 76% with the Haloferax genome itself and a second one with a sequence identity of 88% with an environmental sequence from a sample isolated from Lake Tyrrell, an Australian salt lake. The low number of matches may be due to the fact that only a few virus sequences are present in the databases and that the H. volcanii strain was isolated 30 years ago and virus populations have evolved since then. Analysis of the repeat structure of the three Haloferax CRISPR loci showed that the repeats are 30 nt in length and the sequences are identical between the three CRISPR loci except for a single nucleotide (Figure 1A). The CRISPR loci are processed into crRNAs, which are the central elements of the CRISPR–Cas defence since they direct the degradation complex to the invader in a sequence-specific manner. Each crRNA contains a spacer sequence flanked by two repeat-derived sequence tags (Figure 1B); the spacer is derived from a previously encountered invader and is used to detect that invader during a repeated invasion. The different CRISPR–Cas types have developed various mechanisms to generate the functional crRNA molecule from the crRNA precursor. In Type I and III systems the crRNA is processed from the crRNA precursor by members of the Cas6 protein family.
Cas6 proteins have been analysed in detail from subtype I-B, I-E, I-F and III-B systems. Although they catalyse the same reaction, they show very little sequence similarity and their catalytic sites are composed differently. The Cas6 protein in H. volcanii is also essential for crRNA production, since the deletion of the cas6 gene resulted in loss of mature crRNAs (J. Brendel, B. Stoll, S.J. Lange, L.-K. Maier, R. Backofen and A. Marchfelder, unpublished work).
The repeat sequences and crRNAs of Haloferax
crRNA repeats and crRNA characteristics
Analysis of mature crRNAs in Haloferax showed that they contain an 8-nt 5′ handle, similar to crRNAs in other organisms and systems (Figure 1B). According to the crRNA sizes detected in Northern blots, Haloferax crRNAs have an overall length of approximately 65 nt, suggesting that they contain 22 nt of the repeat at the 3′ end (Figure 1B). This type of structure is similar to the crRNA structure found in subtypes I-A, I-E and I-F, but, interestingly, is different from the crRNA structure found in two other subtype I-B systems, those of Methanococcus maripaludis and Clostridium thermocellum. M. maripaludis and C. thermocellum crRNAs are further trimmed at the 3′ end, as observed in subtype III-B systems, leaving only a few nucleotides of the downstream repeat. Our observations, together with those from the other subtype I-B systems, suggest that the molecular details of the reaction can differ even within a subtype. A detailed analysis of the nature of the crRNA 3′ end in Haloferax will show how much the crRNA characteristics in the subtype I-B systems differ.
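The length bookkeeping behind this inference can be made explicit. The following is a minimal sketch, assuming that a mature crRNA consists only of the 5′ handle, the spacer and the remaining repeat portion at the 3′ end, with no further trimming. The function names are hypothetical, and the spacer length it yields is implied by the reported numbers rather than stated in the text.

```python
# Lengths reported in the text (in nt).
REPEAT_LEN = 30          # length of one CRISPR repeat
FIVE_PRIME_HANDLE = 8    # repeat-derived 5' handle of the mature crRNA
CRRNA_LEN_APPROX = 65    # approximate crRNA size seen in Northern blots


def three_prime_repeat_len(repeat_len: int, handle_len: int) -> int:
    """Repeat nucleotides left at the 3' end if each repeat is cut once
    and the last `handle_len` nt become the 5' handle of the next crRNA."""
    return repeat_len - handle_len


def implied_spacer_len(crrna_len: int, handle_len: int, tail_len: int) -> int:
    """Spacer length implied by the total length (assumption: no 3' trimming)."""
    return crrna_len - handle_len - tail_len


tail = three_prime_repeat_len(REPEAT_LEN, FIVE_PRIME_HANDLE)
spacer = implied_spacer_len(CRRNA_LEN_APPROX, FIVE_PRIME_HANDLE, tail)
print(tail, spacer)
```

Because the 65-nt figure is an estimate from Northern blots, the implied spacer length is approximate as well.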
Analysis of repeat sequences of the different CRISPR–Cas types showed that they have highly variable structures [20,21] (CRISPRmap web server: http://rna.informatik.uni-freiburg.de/CRISPRmap). The crRNAs from the subtypes I-E and I-F form a stable hairpin structure which is a critical feature for recognition and cleavage by the Cas6 protein. Similarity searches with the Haloferax repeat sequence against microbial databases showed that the repeat sequence is conserved in haloarchaea with the potential to form a minimal stem–loop structure with a 3-nt stem and a 4-nt loop (Figure 1B). This structure is conserved throughout the subtype I-B-containing haloarchaea, suggesting that it is important for function. Binding by proteins might stabilize this minimal structure and facilitate cleavage at the 3′ end of the stem by the Cas6b protein. Such a stabilization of a minimal stem–loop structure by the Cas6 protein has recently been shown for one of the Sulfolobus solfataricus Cas6 proteins and its respective repeat.
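To illustrate what a "minimal stem–loop" with a 3-nt stem and a 4-nt loop means at the sequence level, the following sketch scans an RNA sequence for such a motif. It is not the method used for the analysis described above: it considers only Watson–Crick pairs (no G–U wobble pairs), ignores thermodynamic stability, and the example sequence is hypothetical rather than the Haloferax repeat.

```python
# Watson-Crick partners only; G-U wobble pairs are deliberately ignored here.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def find_minimal_stem_loops(rna, stem_len=3, loop_len=4):
    """Return (start, stem5, loop, stem3) for each window in which the
    5' stem arm can pair fully with the 3' stem arm across the loop."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    span = 2 * stem_len + loop_len
    hits = []
    for start in range(len(rna) - span + 1):
        stem5 = rna[start:start + stem_len]
        loop = rna[start + stem_len:start + stem_len + loop_len]
        stem3 = rna[start + stem_len + loop_len:start + span]
        # The arms pair antiparallel: first base of stem5 with last of stem3.
        if all(COMPLEMENT.get(a) == b for a, b in zip(stem5, reversed(stem3))):
            hits.append((start, stem5, loop, stem3))
    return hits


# Hypothetical sequence for illustration only (not the Haloferax repeat).
example = "AAGCCUUUAGGCAA"
print(find_minimal_stem_loops(example))
```

A window that passes this check only has the potential to fold; whether the structure forms in the cell may depend on protein binding, as discussed above.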
A plasmid invader triggers the immune system and reveals essential requirements for interference
A plasmid invader for triggering the immune defence
To determine the requirements for a successful defence reaction we employed a plasmid-based invader system [14,17,23] (Figure 2A). To trigger a defence reaction, the invader must match one of the spacers in the crRNAs and contain a distinct sequence motif of approximately 2–5 nt in length, the so-called PAM (protospacer-adjacent motif) [8,24,25] (Figure 3). This motif is vital for two stages of the defence reaction: (i) during adaptation, when a piece of the invader DNA is selected for integration into the CRISPR locus; and (ii) in interference, when the invader DNA is recognized and degraded. Studies with different organisms and different CRISPR–Cas types have shown that the motifs for these two stages are not identical. The motif required for selection during adaptation is more strictly conserved than the one required for interference; the less stringent interference motif allows more invader sequences to be recognized. The two motifs have therefore been given separate names: SAM for spacer-acquisition motif and TIM for target-interference motif (Figure 3).
Different recognition motifs for different steps of the defence: PAM, SAM and TIM
Using the plasmid invader system, we determined the TIMs for efficient recognition by the Haloferax defence system. In Haloferax the TIM sequences are 3 nt long and located upstream of the protospacer, and we found six different TIM sequences which were effective in triggering the defence reaction: ACT, TTC, TAA, TAT, TAG and CAC.
Investigation of the prerequisites for invader detection revealed that the spacer sequence in the crRNA must form base pairs with the corresponding target sequence at a seed sequence spanning 10 nt at its 5′ end, within which only a single mismatch, at position six, is tolerated (Figure 2B). This is similar to the seed sequences detected in Escherichia coli and Pseudomonas aeruginosa, where they are an essential prerequisite for the interference reaction [26,27]. In E. coli, the required sequence is only 7 nt long with a mismatch tolerated at position six.
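The two sequence requirements described above (a valid TIM upstream of the protospacer and a matching seed) can be expressed as a simple check. This is a minimal sketch under stated assumptions: spacer and protospacer are written as DNA on the same strand in the 5′ to 3′ direction, so that base pairing is modelled as string identity; the seed is taken to start at the protospacer position adjacent to the TIM; and all function names are hypothetical. Passing this check is necessary but not sufficient for interference.

```python
# The six TIMs reported in the text for the Haloferax system.
TIMS = {"ACT", "TTC", "TAA", "TAT", "TAG", "CAC"}
SEED_LEN = 10                 # seed spans 10 nt at the 5' end of the spacer
TOLERATED_MISMATCH_POS = 6    # 1-based position where a mismatch is tolerated


def has_valid_tim(upstream_flank):
    """True if the 3 nt directly upstream of the protospacer form a known TIM."""
    return upstream_flank[-3:].upper() in TIMS


def seed_matches(spacer, protospacer):
    """True if the first SEED_LEN positions agree, ignoring position six.
    Assumption: both sequences are given on the same strand, 5' to 3'."""
    s = spacer[:SEED_LEN].upper()
    p = protospacer[:SEED_LEN].upper()
    if len(s) < SEED_LEN or len(p) < SEED_LEN:
        return False
    for pos, (a, b) in enumerate(zip(s, p), start=1):
        if a != b and pos != TOLERATED_MISMATCH_POS:
            return False
    return True


def passes_sequence_checks(upstream_flank, protospacer, spacer):
    """Combine both prerequisites; other factors may still prevent interference."""
    return has_valid_tim(upstream_flank) and seed_matches(spacer, protospacer)


# Hypothetical sequences for illustration only.
spacer = "ACGTACGTACGTAAA"
print(passes_sequence_checks("GGTTC", "ACGTAGGTACGTAAA", spacer))  # mismatch at position 6
print(passes_sequence_checks("GGTTC", "ACCTACGTACGTAAA", spacer))  # mismatch at position 3
```

Checking nine of the ten seed positions in this way corresponds to the 9-nt non-contiguous seed described in the text.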
An RNA-Seq analysis of the quantities of individual crRNA species in Haloferax showed that they are not present in equal concentrations; similar observations were previously made in other organisms [18,28–30]. This might be due to technical biases (e.g. some crRNAs are ligated and amplified more efficiently in preparation for deep sequencing), but could also be due to different stabilities of the crRNA molecules depending on the spacer sequences they contain. To investigate whether all crRNAs are equally effective in triggering invader degradation (independent of crRNA concentration), several invaders with sequences matching different spacers were tested in the plasmid invader assay. Only some of the selected crRNAs were active in triggering a defence reaction, whereas many were not. It is unclear at this point which factors influence the effectiveness of a crRNA: the concentration of the crRNA, the length of the spacer, the percentage of G/C nucleotides or other additional factors that might influence the stability of the crRNA or the crRNA–target interaction.
In initial plasmid invader assays, different vectors were used, revealing that a plasmid with a high copy number was not active in triggering the defence reaction. Further analysis showed that it was not the copy number, but the nature of the origin of replication, that caused the loss of defence. Whereas plasmids with an ORC (origin-recognition complex)-based mode of replication [31,32] activated the defence and were degraded, plasmids with a distinct replication mode (presumably Rep-dependent [33,34]) were not degraded. Further analyses will show whether these observations are due to steric hindrance (the origin is located close to the protospacer sequence on the invader plasmid) or to interactions of the CRISPR–Cas system with the replication of the invader.
Using a plasmid-based invader we have determined that the Haloferax subtype I-B system recognizes six different TIMs and requires a 9-nt-long non-contiguous seed interaction between crRNA and target. The repeat can fold into a minimal stem–loop structure, which is conserved in haloarchaea and might be important for processing by the Cas6 protein. The crRNAs in Haloferax are not all active in triggering the defence reaction and they appear to be present in different concentrations. Results indicate that the type of origin present on the invader plasmid may be important for a successful defence against the invader.
CRISPR Evolution, Mechanisms and Infection: A Biochemical Society Focused Meeting held at the University of St Andrews, U.K., 17–19 June 2013. Organized and Edited by Emmanuelle Charpentier (Laboratory for Molecular Infection Medicine Sweden, Sweden), John van der Oost (Wageningen University, The Netherlands) and Malcolm White (University of St Andrews, U.K.).
We thank all of the members of the DFG Research Unit FOR1680 for helpful discussions.
This work was supported by the Deutsche Forschungsgemeinschaft in the frame of the Research Unit ‘Unravelling the prokaryotic immune system’ [number FOR1680].