CRISPR (clustered regularly interspaced short palindromic repeats) together with cas (CRISPR-associated) genes form the CRISPR–Cas immune system, which provides sequence-specific adaptive immunity against foreign genetic elements in bacteria and archaea. Immunity is acquired by the integration of short stretches of invasive DNA as novel ‘spacers’ into CRISPR loci. Subsequently, these immune markers are transcribed and generate small non-coding interfering RNAs that specifically guide nucleases for sequence-specific cleavage of complementary sequences. Among the four CRISPR–Cas systems present in Streptococcus thermophilus, CRISPR1 and CRISPR3 have the ability to readily acquire new spacers following bacteriophage or plasmid exposure. In order to investigate the impact of building CRISPR-encoded immunity on the host chromosome, we determined the genome sequence of a BIM (bacteriophage-insensitive mutant) derived from the DGCC7710 model organism, after four consecutive rounds of bacteriophage challenge. As expected, active CRISPR loci evolved via polarized addition of several novel spacers following exposure to bacteriophages. Although analysis of the draft genome sequence revealed a variety of SNPs (single nucleotide polymorphisms) and INDELs (insertions/deletions), most of the in silico differences were not validated by Sanger re-sequencing. In addition, two SNPs and two small INDELs were identified and tracked in the intermediate variants. Overall, building CRISPR-encoded immunity does not significantly affect the genome, which allows the maintenance of important functional properties in isogenic CRISPR mutants. This is critical for the development and formulation of sustainable and robust next-generation starter cultures with increased industrial lifespans.

Streptococcus thermophilus and dairy products

S. thermophilus is a domesticated lactic acid bacterium widely used in the formulation of industrial dairy starter cultures for the fermentation of milk into yogurt and cheese. This important food-grade species is notably used in the production of cheddar and mozzarella, as well as hard cooked cheeses such as emmental. It is a critical ingredient in the ~US$50 billion annual dairy products business, and it is estimated that annual human consumption of S. thermophilus exceeds 10²¹ cells [1,2]. The important functional features inherent to S. thermophilus include rapid acidification of milk through the production of lactic acid, as well as the ability to enhance the textural and organoleptic properties of fermented dairy products through the production of exopolysaccharides, acetaldehyde and diacetyl [2]. The inclusion of this organism in many food products probably explains its relatively high abundance in the gastrointestinal tract of humans [3], where it may play additional roles yet to be determined. This is consistent with the extensive historical use of S. thermophilus in the fermentation of dairy products for hundreds of years, if not in the conservation of milk by humans for millennia [2,4]. Recent advances in food science and microbiology have furthered the rigorous selection of highly desirable strains for their exploitation on an industrial scale, their formulation in complex and high-performance starter cultures, and their functional and genomic characterization. Accordingly, S. thermophilus has been the subject of extensive research efforts over time. Genomic analyses of this industrially relevant bacterium have provided insights into the genetic basis for several of its physiological functions carried out throughout the fermentation processes, as shown in genomic studies of CNRZ1066 [1], LMG 18311 [1,2,5] and LMD-9 [6,7], and the recent genome sequencing of additional strains including ND03 [8], JIM8232 [9] and MN-ZLW-002 [10].

Industrial phages and defence systems

Of critical importance, the ubiquitous presence of bacteriophages (phages) in the environment has negatively and repeatedly affected fermentation processes in industrial settings [11,12], as phage predation can interfere with the acidification of milk by starter cultures in (very) large fermentation tanks. Phages can actually persist in manufacturing facilities given their resistance to pasteurization, airborne dissemination and the practical challenges inherent to sanitation strategies in food-grade manufacturing settings. Consequently, dairy manufacturers and providers of starter cultures have relied on the exploitation of bacterial phage-resistance systems to safeguard the production of important and traditional food products. The formulation of starter cultures often relies on combining multiple phage-resistance mechanisms and strains in rotation strategies that allow sustainable use of the most effective and resistant strains that carry critical functional properties [4,13].

A plethora of phage defence systems occur in lactic acid bacteria, including prevention of adsorption, blocking of injection, abortive infection, R-M (restriction–modification), toxin–antitoxin systems and the recently characterized CRISPR (clustered regularly interspaced short palindromic repeats) loci [13,14]. These defence systems may be combined and occasionally engineered to rein in phage populations in industrial settings. Whereas traditional defence strategies relied originally on systems such as the prevention of adsorption and blocking of DNA injection, and subsequently on R-M systems and abortive infection, the importance of the newly described CRISPR–Cas (CRISPR-associated) systems in providing phage resistance in S. thermophilus has had a dramatic impact on the management of phage-related issues [14–16].

CRISPR–Cas immune systems

CRISPR loci, together with cas genes, form the CRISPR–Cas immune system, which is present in ~46% of bacteria and almost 90% of archaea [17,18]. CRISPR loci typically consist of several non-contiguous short DNA repeats separated by stretches of non-repeated elements, called spacers, that derive from invasive nucleic acids such as viruses and plasmids [14–16,19–21]. These hypervariable loci differ widely across genera and species in terms of number of CRISPR–Cas systems and core elements such as the cas genes, repeat sequence and spacer content [16]. Overall, three distinct types of CRISPR–Cas systems have been established on the basis of the sequence of universal cas1 and cas2 genes, as well as the occurrence of signature genes, namely cas3, cas9 and cas10 for Types I, II and III respectively [18]. Mechanistically, CRISPR–Cas systems provide immunity in three distinct steps: (i) adaptation, where immunity is acquired by integration of new spacers from invasive elements into CRISPR loci; (ii) expression, where CRISPR loci are transcribed and processed into mature non-coding interfering crRNAs (CRISPR RNAs); and (iii) interference, where crRNAs guide Cas proteins for sequence-specific cleavage of complementary nucleic acids [22–29]. Originally, CRISPR–Cas systems were used for genotyping purposes [6,30], on the basis of their hypervariable nature, which has been exploited for epidemiological studies and for studying the interplay between hosts and viruses in natural systems [16,31–35]. Subsequently, their ability to interfere with foreign genetic elements has been used for building immunity against phages and to preclude plasmid uptake. More recently, the ability to reprogramme the Cas9 endonuclease using small synthetic guide RNAs has revolutionized genome editing [36–39].
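The repeat-spacer organization described above can be illustrated with a minimal sketch that splits an array into its spacers, assuming the repeat sequence is known and every repeat copy is exact. The function name `extract_spacers` and the toy sequences are hypothetical; real arrays often carry degenerate repeats, notably at the trailer end, which this sketch would miss.

```python
import re

def extract_spacers(array_seq, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    # Start position of every exact, non-overlapping repeat copy.
    positions = [m.start() for m in re.finditer(re.escape(repeat), array_seq)]
    spacers = []
    for left, right in zip(positions, positions[1:]):
        # A spacer runs from the end of one repeat to the start of the next.
        spacers.append(array_seq[left + len(repeat):right])
    return spacers

# Toy array: hypothetical short repeat and spacers, not real S. thermophilus sequences.
REPEAT = "GTTTTAGA"
toy_array = REPEAT + "AAACCC" + REPEAT + "TTGGCA" + REPEAT
print(extract_spacers(toy_array, REPEAT))
```

An array with n exact repeats yields n - 1 spacers under this scheme, which is why repeat counts and spacer counts differ by one in intact loci.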

Four distinct CRISPR–Cas systems have been identified in S. thermophilus, including CRISPR1 and CRISPR3, both of which have the ability to rapidly acquire novel spacers in response to phage challenge [15,22,40,41] or plasmid exposure [21]. Although the occurrence of these four loci can vary widely in this species [30], the model strain DGCC7710, in which CRISPR–Cas systems were first shown to provide adaptive immunity in prokaryotes [15], carries all four [22]. An overview of CRISPR–Cas systems present in S. thermophilus genomes is provided in Table 1. CRISPR1 and CRISPR3 are both subtype II-A CRISPR–Cas systems; however, each locus is associated with a distinct set of four cas genes, including the universal cas1 and cas2 genes, the Type II cas9 signature gene involved in target DNA cleavage [21,42–44], and csn2, a gene exclusively found in this particular subtype. The subtype III-A CRISPR2–Cas system encodes several cas genes including the universal cas1 and cas2 genes, the Type III signature gene cas10 and cas6, which is important in crRNA biogenesis. Lastly, the subtype I-E CRISPR4–Cas system is associated with eight cas genes including the cas1 and cas2 universal genes, the Type I signature gene cas3 which is involved in target nucleic acid degradation [45,46], and the Cascade (CRISPR-associated complex for antiviral defence)-encoding genes. Among the S. thermophilus strains sequenced to date, this locus uniquely occurs in DGCC7710 (Table 1).

Table 1
Features of CRISPR–Cas systems occurring in S. thermophilus genomes

Each of the four CRISPR–Cas systems identified in S. thermophilus genomes is characterized by its type and subtype, on the basis of cas gene content [18], and by the typical repeat sequence found within the CRISPR array [30], namely 5′-GTTTTTGTACTCTCAAGATTTAAGTAACTGTACAAC-3′ for CRISPR1, 5′-GATATAAACCTAATTACCTCGAGAGGGGACGGAAAC-3′ for CRISPR2, 5′-GTTTTAGAGCTGTGTTGTTTCGAATGGTTCCAAAAC-3′ for CRISPR3 and 5′-GTTTTTCCCGCACACGCGGGGGTGATCC-3′ for CRISPR4. For each strain, the number of repeats (including possibly degenerated repeat sequences, notably at the trailer end of the array) and of cas genes adjacent to the array are provided. An asterisk (*) indicates the presence of mutations (frameshift, premature stop, INDEL) that alter the number and/or integrity of cas genes. Accession numbers for complete chromosome sequences: CP000024 (CNRZ1066), CP000023 (LMG 18311), FR875178 (JIM 8232), CP000419 (LMD-9), CP003499 (MN-ZLW-002) and CP002340 (ND03).

 CRISPR1 (subtype II-A) CRISPR2 (subtype III-A) CRISPR3 (subtype II-A) CRISPR4 (subtype I-E) 
Strain Repeats cas genes Repeats cas genes Repeats cas genes Repeats cas genes 
DGCC7710 33 10* 13 13 
CNRZ1066 42 7* 
LMG 18311 34 11* 
JIM 8232 43 18 10 1* 
LMD-9 17 9* 
MN-ZLW-002 31 9* 27 
ND03 37 10* 21 

The S. thermophilus genome

The genome sequence of the industrial S. thermophilus DGCC7710 (DuPont Global Culture Collection) strain was determined by 454 pyrosequencing using Roche GS FLX technology. A total of 220377 reads totalling 52211361 nt was used for assembly, representing approximately a 29× coverage of the genome. Primary assembly of raw sequencing data was performed using Newbler's gsAssembler program (Roche), which generated 155 contigs ranging in size between 100 and 194234 bp. Large contigs above 500 bp were subsequently ordered with the ProgressiveMauve software [47] using the LMD-9 complete genome as a template. Several gaps were closed by Sanger sequencing of PCR amplicons, and the assembly was validated further using optical mapping (OpGen). The assembled DGCC7710 draft genome (accession number AWVZ00000000) consists of 17 contigs totalling 1798341 bp, which encode at least 2124 ORFs. The relatively high synteny observed across all the S. thermophilus genomes sequenced to date is illustrated by the small number of blocks provided by the ProgressiveMauve alignment, with notably two large blocks spanning two-thirds of the genome (Figure 1). Apart from a small number of genes missing in some genomes, it is noteworthy to highlight that most genes are widely conserved across genetically distinct strains. With ‘only’ five blocks spanning the other one-third of the genome, it is also fairly obvious that, overall, gene content and organization are highly conserved in the S. thermophilus species (Figure 1).
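The coverage figure quoted above can be checked with simple arithmetic. This is a minimal sketch that assumes coverage is the total read length divided by the draft assembly size; the authors may have used a different denominator (for example, an expected genome size). The variable names are illustrative only.

```python
# Values taken from the text above.
READS = 220_377              # number of pyrosequencing reads
TOTAL_READ_NT = 52_211_361   # total nucleotides across all reads
DRAFT_GENOME_BP = 1_798_341  # size of the assembled DGCC7710 draft genome

# Assumption: coverage = total sequenced nucleotides / assembly size.
coverage = TOTAL_READ_NT / DRAFT_GENOME_BP
mean_read_length = TOTAL_READ_NT / READS

print(f"coverage ~ {coverage:.1f}x, mean read length ~ {mean_read_length:.0f} nt")
```

Under this assumption the ratio rounds to 29, in line with the reported value.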

Figure 1
Comparative analysis of S. thermophilus genomes

Alignment of the DGCC7710 draft genome to complete S. thermophilus genomes CNRZ1066, LMG 18311, JIM 8232, LMD-9, MN-ZLW-002 and ND03 using ProgressiveMauve [47], which defines conserved and locally co-linear blocks of genes coloured to reveal sequence identity and synteny. Conversely, unique regions and genomic islands are represented as white areas (with low sequence conservation), annotated according to the functions of the genes they encode. The boxed region within MN-ZLW-002 indicates a putative genomic rearrangement (inversion). Vertical red lines indicate contig boundaries. IS, insertion sequence.

Consistent with the functional properties of the strain, the genome of DGCC7710 includes numerous important genes involved in the fermentation of milk into dairy products such as yogurt. Specifically, given that milk acidification relies on lactose fermentation, the presence of a dedicated lacS permease associated with the lacZ β-galactosidase, which feeds into the glycolysis and Leloir pathways for eventual lactic acid production, is a staple of the S. thermophilus genome [4,7]. Additionally, the ability to generate texture during yogurt fermentation is a key functional feature of selected S. thermophilus starter strains. The eps gene cluster within the DGCC7710 genome illustrates the ability of this strain to generate exopolysaccharides that yield desirable viscosity and mouth feel [48]. An important feature encoded within the DGCC7710 genome is the molecular machinery involved in natural competence, which has been described as a new molecular biology tool instrumental in manipulating and investigating this organism [49–51].

Another important feature of S. thermophilus genomes is the widespread presence of transposases and IS (insertion sequence) elements that reflect genomic plasticity and further complicate the assembly and genome closing process. Nevertheless, a careful analysis of the differential content between genomes, and especially the unique content in the genome of DGCC7710, reveals the presence of multiple genomic islands that encode functionally important genes (Figure 1). The identified diverse and unique genomic islands are consistent with previous reports showing that differential content between various strains of S. thermophilus primarily consists of mobile genetic recombinases, exopolysaccharide biosynthesis enzymes, bacteriocins and phage-resistance mechanisms including R-M and CRISPR [1]. Diversity between strains in the eps gene cluster has repeatedly been shown in S. thermophilus, and may involve horizontal transfer of genes in connection with transposable elements such as IS3, IS6 and ISL3 families [48,52]. Similar lateral gene transfer events may also be mediated by ICEs (integrative conjugative elements), and the spread of the important prtS cell-wall proteinase likewise exemplifies the diversity, plasticity and adaptability of S. thermophilus genomes [9,51,53].

Overall, the genome sequence of DGCC7710 illustrates further the evolutionary path of the S. thermophilus species towards adaptation to milk and highlights the importance of phage-resistance systems, notably R-M and CRISPR–Cas.

Building CRISPR immunity through iterative phage challenges

In order to enhance the phage resistance in DGCC7710, four iterative cycles of phage exposure followed by CRISPR BIM (bacteriophage-insensitive mutant) selection were performed as outlined previously [15,21], using a set of biodiverse lytic phages chosen for their virulence spectra. The iterative rounds of phage exposure chronologically generated DGCC9705 (following challenge with phage 2972), DGCC9726 (following challenge with phage 3821), DGCC9733 (following challenge with phage 3288) and, eventually, the fourth-generation BIM DGCC9836 (following challenge with phage 4753) which was insensitive to all four phages (Figure 2).

Figure 2
CRISPR immunization using iterative phage challenges

(A) Iterative phage challenges scheme and BIM screening and cultivation. (B) Overview of the CRISPR spacer content across the four CRISPR loci (represented in numerical order in each BIM), with unique square colour combinations representing a particular spacer sequence, and novel CRISPR spacer-acquisition events tracked iteratively, ultimately yielding five new spacers in CRISPR1 and three new spacers in CRISPR3. (C) Overview of phage sensitivity as defined by lysotypes, with the phage-resistance spectrum increasing iteratively with each challenge. R, resistant; S, sensitive.

A detailed sequence analysis of CRISPR loci revealed polarized insertion of phage protospacer sequences at the leader end of both CRISPR1 and CRISPR3. All acquired protospacers were systematically associated with a PAM (protospacer-adjacent motif) [30,40]. This is consistent with previous reports characterizing the PAMs in active S. thermophilus CRISPR–Cas systems, and their involvement in both spacer acquisition and invasive nucleic acid interference [21,44,54]. In contrast, no spacer acquisition was detected in CRISPR2 or in CRISPR4. Notwithstanding the propensity of inactive CRISPR loci for evolution by internal spacer deletion through homologous recombination between identical CRISPR repeats, we observed strict conservation of CRISPR spacer content and CRISPR repeat sequences, suggesting that even inactive CRISPR–Cas systems in S. thermophilus can be relatively stable, at least within the timeframe that generated these BIMs. We cannot rule out the possibility that these CRISPR–Cas systems are active or that they may be involved in biological roles that go beyond immunity against foreign genetic elements, which would be consistent with the detection and occasional induction of cas genes and the proteins they encode in this strain [55]. The extent and rapidity of novel spacer acquisition observed here in a laboratory system correspond to similar findings derived from natural microbial communities exposed to phages where only the most recently acquired spacers match coexisting phages [32], which is consistent with modelling predictions [56].
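Polarized acquisition with conservation of the ancestral spacers can be checked by comparing ordered spacer lists from a parental strain and a BIM. The following is a minimal sketch, assuming spacers have already been extracted and are listed from leader end to trailer end; the function name `new_leader_spacers` and the spacer labels are hypothetical and do not correspond to actual DGCC7710 spacers.

```python
def new_leader_spacers(parent, bim):
    """Return the spacers added at the leader end of `bim`.

    Both lists are ordered from leader end to trailer end. Returns None when
    the parental array is not conserved intact at the trailer-proximal side,
    for example after an internal spacer deletion.
    """
    n_new = len(bim) - len(parent)
    if n_new < 0 or bim[n_new:] != parent:
        return None
    return bim[:n_new]

# Hypothetical spacer labels for illustration.
wild_type = ["s1", "s2", "s3"]
polarized_bim = ["n2", "n1", "s1", "s2", "s3"]  # two spacers added at the leader end
deletion_variant = ["n1", "s1", "s3"]           # gained n1 but lost internal spacer s2

print(new_leader_spacers(wild_type, polarized_bim))     # ['n2', 'n1']
print(new_leader_spacers(wild_type, deletion_variant))  # None
```

In the returned list, the spacer closest to the leader is the most recently acquired one, so reading it right to left gives the order of acquisition.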

The propensity of both active CRISPR–Cas systems to readily acquire novel spacers following phage exposure provides a convenient and efficient option to develop strains with a broad range of phage resistance through iterative challenges. Furthermore, the sequential addition of spacers that collectively confer increasing levels of resistance provides a molecular basis for enhancing phage resistance depth. Accordingly, we anticipate that building CRISPR immunity iteratively using a diversity of phages will allow the biogenesis of novel strains with increased phage resistance in terms of both spectrum breadth and resistance depth, rapidly extending the lifespan of commercial cultures for perennial use in industrial settings where phages are ubiquitously problematic.

A practical advantage of generating BIMs with multiple spacers is that they provide a unique set of chromosome-encoded sequences that can be used as natural genetic tags theoretically rare and unique to selected, thus proprietary, strains. Likewise, these hypervariable sequences and dynamic genetic loci provide a molecular basis for high-resolution genotyping of even very closely related isolates [16,30]. Genomically, the argument could be made that the biogenesis of a diverse population of CRISPR genotypes derived from the exposure of a single original wild-type strain establishes novel genetic biodiversity or strains. This natural approach can be readily replicated or synthetically implemented in laboratory settings using molecular biology techniques. Furthermore, the recent development of small guide RNAs that drive Cas9-mediated interference provides a molecular basis for CRISPR-mediated, but CRISPR spacer-independent, (re)programmable immunity [36,57].

Impact of CRISPR immunization on the host genome

A comparison of the terminal BIM (DGCC9836) draft genome (1762882 bp) with that of the wild-type revealed the presence of 552 bp of differential content. It was readily determined that 526 bp were associated with CRISPR-related insertions resulting from the addition of novel CRISPR spacers into the two active CRISPR loci, namely CRISPR1 and CRISPR3 (Figure 2). This is consistent with the experimental design whereby intermediates were specifically selected following PCR screening of the CRISPR1 and/or CRISPR3 locus, with a size increase reflecting CRISPR adaptation events. Overall, the differences consisted of 16 putative INDELs (insertions/deletions) and three SNPs (single nucleotide polymorphisms) (Table 2). Each difference predicted in silico was subjected to Sanger re-sequencing in the wild-type and final BIM, as well as in the three intermediate BIMs (Table 2). Given the anticipated difference in impact between INDELs (loss and frameshifts) and SNPs (non-synonymous at worst), we were initially surprised by the high ratio of predicted INDELs/SNPs. Intriguingly, a majority of the INDELs (ten of 16) actually corresponded to single nucleotide deletions. Careful analysis of the genomic context of these sequences revealed that these single nucleotide INDELs occurred primarily within homopolynucleotidic sequences, a known limitation of next-generation pyrosequencing technologies. Likewise, sequencing results revealed that SNP2 was derived from a pyrosequencing error. In contrast, INDEL2 was detected within the first round of phage challenge (Table 2). Similarly, SNP3 was also validated and documented to occur during the initial phage exposure. The anticipated insertions of novel spacers within the active CRISPR1–Cas and CRISPR3–Cas systems, INDEL8 and INDEL14 respectively, were validated. These results confirm the high rate of incorrect SNP and INDEL predictions arising from next-generation sequencing technologies, and highlight the need to systematically re-sequence putative mutations using Sanger sequencing, so as not to overestimate mutation rates.
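The observation that single-nucleotide INDELs cluster in homopolymer runs suggests a simple triage step before re-sequencing. The sketch below is not the authors' pipeline: the function names, the toy sequence and the `min_run` threshold of 4 are assumptions chosen for illustration. It only flags candidates for closer inspection, since the text also describes INDEL13 as a single-nucleotide deletion within a poly(A) stretch.

```python
def homopolymer_run_length(seq, pos):
    """Length of the run of identical bases in `seq` containing index `pos` (0-based)."""
    base = seq[pos]
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    while end < len(seq) - 1 and seq[end + 1] == base:
        end += 1
    return end - start + 1

def likely_homopolymer_artefact(seq, pos, min_run=4):
    """Flag a single-nucleotide INDEL at `pos` that falls inside a long homopolymer.

    `min_run` is an arbitrary illustrative threshold, not a published cut-off.
    """
    return homopolymer_run_length(seq, pos) >= min_run

# Toy reference sequence (hypothetical), with a run of six A residues.
toy = "ACGTAAAAAAGCTC"
print(likely_homopolymer_artefact(toy, 6))   # position inside the A run
print(likely_homopolymer_artefact(toy, 11))  # isolated C
```

A flagged site is a priority for Sanger confirmation rather than an automatic rejection, which matches the validation strategy described above.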

Table 2
Impact of iterative phage exposures on the S. thermophilus genome

DNA sequence of the chromosomal regions showing differences between DGCC7710 (parental strain) and DGCC9836 (final BIM) draft genomes. All of these regions were Sanger re-sequenced in both strains and in the three intermediate BIMs. Nucleotides in blue correspond to alleles found in DGCC7710, whereas nucleotides in red correspond to alleles found in DGCC9836. Nucleotides in green were initially absent from both draft genomes.

 
 

Analysis of INDEL2 revealed the in-frame insertion of three nucleotides in the first generation BIM DGCC9705, leading to the insertion of an aspartic acid residue within the predicted protein sequence of a putative bacterial capsule synthesis protein/poly-γ-glutamate biosynthesis enzyme. Analysis of the occurrence of this gene in other S. thermophilus genomes, namely STER_0153 in LMD-9 or stu0110 in LMG 18311, revealed annotations as uncharacterized, unknown or hypothetical proteins. This sequence seems to be highly conserved across Streptococcus spp., with up to 96% similarity in most species, including S. pneumoniae, S. suis, S. equi, S. mutans, S. pyogenes and S. agalactiae.

In silico analysis of INDEL13 indicated the deletion of a single nucleotide (A) within a poly(A) stretch during the fourth phage challenge, generating the DGCC9836 BIM. This mutation yields a premature stop codon, leading to a predicted truncated protein (36 instead of 666 amino acids). This mutation is located towards the 5′ end of a gene encoding a putative ABC (ATP-binding cassette) transporter/permease. This gene is also present in CNRZ1066 (str1333) and LMG 18311 (stu1333), predicted to encode a peptide-4 ABC exporter, whereas it is annotated as an antimicrobial peptide transporter in LMD-9 (STER_1307). Likewise, this sequence has orthologues in many Streptococcus spp., including S. pneumoniae, S. sanguinis, S. salivarius, S. suis and S. gordonii.

The first SNP (SNP1, Table 2) is a T>A silent mutation in strain DGCC9705, within the STER_0096 (LMD-9) orthologue, predicted to encode a leucyl aminopeptidase (aminopeptidase T).

The third SNP (SNP3, Table 2) is a C>T mutation in strain DGCC9836, at the 3′ end of the STER_1849 (LMD-9) orthologue, predicted to encode the small regulatory subunit of an acetolactate synthase. This non-synonymous mutation is predicted to change the last residue of the 158-amino-acid protein, replacing an asparagine residue with aspartic acid.

Overall, the comparative analysis of the draft genome sequences of the parental strain with that of a fourth-generation CRISPR BIM initially suggested the presence of 16 INDELs and three SNPs (Table 2), of which two were selected for (spacer acquisitions in the active CRISPR1 and CRISPR3 loci), and ‘only’ four validated (two SNPs and two INDELs), since the majority of predicted differences were sequencing artefacts. We are mindful that these sequences may ‘only’ represent 97% of the complete genome size, and that other mutations may have occurred in the remaining gaps, which consist primarily of ribosomal DNA sequences and transposons. Altogether, these results indicate that multiple iterative phage challenges primarily give rise to novel spacer insertions within active CRISPR loci, and that there are occasional mutations consistent with natural evolutionary events giving rise to SNPs and small INDELs. Surprisingly, the stress inherent to phage exposure does not seem to significantly affect the mutation rate or evolutionary pattern of S. thermophilus genomes, other than CRISPR immunization events, indicating that CRISPR-mediated processes arguably generate isogenic variants.

Outlook

Although the two primary forces driving the overall genome evolution of S. thermophilus consist of genome reduction by iterative gene losses in combination with occasional acquisition of beneficial genes through horizontal gene transfer for adaptation to a rich environment (primarily milk) [6,53,58,59], we show in the present article that CRISPR plays a major role in genome evolution following exposure to phages. Indeed, regressive genome evolution by extensive gene loss has been a key driving force shaping the adaptation of S. thermophilus to the rich milk environment, illustrated by the loss of virulence genes widely distributed in most streptococci. Overall, the DGCC7710 genome shares a high degree of synteny with other S. thermophilus genomes, with a few unique genomic islands and hypervariable loci that include the eps operon, the gp operon and CRISPR–Cas systems. Focusing on genome interplay within host–virus dynamics, we propose that the impact of the virus on host genome evolution is relatively limited, primarily consisting of CRISPR immunity build-up, whereas the effects of the resistant host on viral genome evolution are conversely relatively widespread, consisting of extensive protospacer mutations, PAM mutations and occasionally recombination.

The development of model laboratory spacer-acquisition systems, together with knowledge inferred from CRISPR adaptation in natural systems, and the development of mathematical models for CRISPR locus evolution is rapidly expanding our limited understanding of the adaptation/acquisition phase [25,34,56]. This makes DGCC7710 an appropriate model organism to fundamentally investigate the balance between CRISPR spacer acquisition and occasional loss, and investigate the impact of CRISPR immunity build-up on genome evolution. Notwithstanding the expanding understanding of the organization, content, mechanistic and molecular underpinnings of CRISPR-mediated targeting of complementary nucleic acids, relatively little is known about the short- and long-term interplay with viruses. The present article is the first report of iterative build-up of CRISPR-encoded immunity against a series of genetically distinct phages, and shows that CRISPR immunization does not have a significant impact on the host genome.

Practically, the work described in the present article provides a proof of concept for the development of next-generation starter cultures with naturally generated CRISPR immunity, optimally developed through multiple iterative rounds of exposure to a diversity of industrially relevant phages. Overall, these results confirm that active CRISPR loci are subject to rapid evolution by acquisition of novel spacers derived from protospacers associated with PAMs. Moreover, the data indicate that both active and inactive loci are stable with regard to spacer loss, which is consistent with a previous metagenomic study of this system on a similar timeframe [60]. Eventually, iterative addition of multiple spacers derived from different phages provides increased immunity in terms of both level and spectrum of resistance. This natural process relies on the availability of virulent phages, and on the ability to readily screen CRISPR BIMs using PCR to monitor the polarized insertion of novel spacers in active CRISPR loci. The concurrent use of multiple CRISPR–Cas systems that recognize and target different PAMs should provide additional pressure on phage genomes.

Exploiting CRISPR-based strategies to enhance phage resistance in various S. thermophilus strains and using multiple isogenic variants in rotation schemes will ensure sustainable and perennial use of the most efficient and desirable strains with extended lifespans. Harnessing CRISPR immunity may be enhanced further by combination with other efficient and compatible phage-resistance mechanisms [61]. These results further our understanding of virus–host dynamics, especially with regard to the impact of CRISPR immunity, and provide an evolutionary framework for the analysis of the interactions between bacteria and their phages, and their ecological impact.

CRISPR Evolution, Mechanisms and Infection: A Biochemical Society Focused Meeting held at the University of St Andrews, U.K., 17–19 June 2013. Organized and Edited by Emmanuelle Charpentier (Laboratory for Molecular Infection Medicine Sweden, Sweden), John van der Oost (Wageningen University, The Netherlands) and Malcolm White (University of St Andrews, U.K.).

Abbreviations

     
ABC, ATP-binding cassette
BIM, bacteriophage-insensitive mutant
Cas, CRISPR-associated
CRISPR, clustered regularly interspaced short palindromic repeats
crRNA, CRISPR RNA
INDEL, insertion/deletion
PAM, protospacer-adjacent motif
R-M, restriction–modification
SNP, single nucleotide polymorphism

We are thankful for the many insightful conversations we have had with Sylvain Moineau and team members in his laboratory at Université Laval (Québec, Canada). We also thank Mickaël Charron, DuPont Nutrition and Health, for his technical support in gap closure.

Funding

This work was supported by funding from DuPont Nutrition and Health.

References

1. Bolotin A., Quinquis B., Renault P., Sorokin A., Ehrlich S.D., Kulakauskas S., Lapidus A., Goltsman E., Mazur M., Pusch G.D., et al. Complete sequence and comparative genome analysis of the dairy bacterium Streptococcus thermophilus. Nat. Biotechnol. 2004, vol. 22 (pg. 1554–1558)
2. Hols P., Hancy F., Fontaine L., Grossiord B., Prozzi D., Leblond-Bourget N., Decaris B., Bolotin A., Delorme C., Ehrlich S.D., et al. New insights in the molecular biology and physiology of Streptococcus thermophilus revealed by comparative genomics. FEMS Microbiol. Rev. 2005, vol. 29 (pg. 435–463)
3. Qin J., Li R., Raes J., Arumugam M., Burgdorf K.S., Manichanh C., Nielsen T., Pons N., Levenez F., Yamada T., et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature 2010, vol. 464 (pg. 59–65)
4. Mills S., O’Sullivan O., Hill C., Fitzgerald G., Ross R.P. The changing face of dairy starter culture research: from genomics to economics. Int. J. Dairy Technol. 2010, vol. 63 (pg. 149–170)
5. Pastink M.I., Teusink B., Hols P., Visser S., de Vos W.M., Hugenholtz J. Genome-scale model of Streptococcus thermophilus LMG18311 for metabolic comparison of lactic acid bacteria. Appl. Environ. Microbiol. 2009, vol. 75 (pg. 3627–3633)
6. Makarova K., Slesarev A., Wolf Y., Sorokin A., Mirkin B., Koonin E., Pavlov A., Pavlova N., Karamychev V., Polouchine N., et al. Comparative genomics of the lactic acid bacteria. Proc. Natl. Acad. Sci. U.S.A. 2006, vol. 103 (pg. 15611–15616)
7. Goh Y.J., Goin C., O’Flaherty A., Altermann E., Hutkins R. Specialized adaptation of a lactic acid bacterium to the milk environment: the comparative genomics of Streptococcus thermophilus LMD-9. Microb. Cell Fact. 2011, vol. 10 Suppl. 1 pg. S22
8. Sun Z., Chen X., Wang J., Zhao W., Shao Y., Wu L., Zhou Z., Sun T., Wang L., Meng H., Zhang H., Chen W. Complete genome sequence of Streptococcus thermophilus strain ND03. J. Bacteriol. 2011, vol. 193 (pg. 793–794)
9. Delorme C., Bartholini C., Luraschi M., Pons N., Loux V., Almeida M., Guédon E., Gibrat J.F., Renault P. Complete genome sequence of the pigmented Streptococcus thermophilus strain JIM8232. J. Bacteriol. 2011, vol. 193 (pg. 5581–5582)
10. Kang X., Ling N., Sun G., Zhou Q., Zhang L., Sheng Q. Complete genome sequence of Streptococcus thermophilus strain MN-ZLW-002. J. Bacteriol. 2012, vol. 194 (pg. 4428–4429)
11. Brüssow H. Phages of dairy bacteria. Annu. Rev. Microbiol. 2001, vol. 55 (pg. 283–303)
12. Quiberoni A., Moineau S., Rousseau G.M., Reinheimer J., Ackermann H.-W. Streptococcus thermophilus bacteriophages. Int. Dairy J. 2010, vol. 20 (pg. 657–664)
13. Labrie S.J., Samson J.E., Moineau S. Bacteriophage resistance mechanisms. Nat. Rev. Microbiol. 2010, vol. 8 (pg. 317–327)
14. Deveau H., Garneau J.E., Moineau S. CRISPR/Cas system and its role in phage–bacteria interactions. Annu. Rev. Microbiol. 2010, vol. 64 (pg. 475–493)
15. Barrangou R., Fremaux C., Boyaval P., Richards M., Deveau H., Moineau S., Romero D.A., Horvath P. CRISPR provides acquired resistance against viruses in prokaryotes. Science 2007, vol. 315 (pg. 1709–1712)
16. Barrangou R., Horvath P. CRISPR: new horizons in phage resistance and strain identification. Annu. Rev. Food Sci. Technol. 2012, vol. 3 (pg. 143–162)
17. Grissa I., Vergnaud G., Pourcel C. The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats. BMC Bioinfor. 2007, vol. 8 (pg. 172–182)
18. Makarova K.S., Haft D.H., Barrangou R., Brouns S.J., Charpentier E., Horvath P., Moineau S., Mojica F.J., Wolf Y.I., Yakunin A.F., et al. Evolution and classification of the CRISPR–Cas systems. Nat. Rev. Microbiol. 2011, vol. 9 (pg. 467–477)
19. Mojica F.J., Diez-Villasenor C., Garcia-Martinez J., Soria E. Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements. J. Mol. Evol. 2005, vol. 60 (pg. 174–182)
20. Mills S., Griffin C., Coffey A., Meijer W.C., Hafkamp B., Ross R.P. CRISPR analysis of bacteriophage insensitive mutants (BIMs) of industrial Streptococcus thermophilus: implications for starter design. J. Appl. Microbiol. 2010, vol. 108 (pg. 945–955)
21. Garneau J.E., Dupuis M.-È., Villion M., Romero D.A., Barrangou R., Boyaval P., Fremaux C., Horvath P., Magadán A.H., Moineau S. The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA. Nature 2010, vol. 468 (pg. 67–71)
22. Horvath P., Barrangou R. CRISPR/Cas, the immune system of Bacteria and Archaea. Science 2010, vol. 327 (pg. 167–170)
23. Bhaya D., Davison M., Barrangou R. CRISPR–Cas systems in bacteria and archaea: versatile small RNAs for adaptive defense and regulation. Annu. Rev. Genet. 2011, vol. 45 (pg. 273–297)
24. Barrangou R. CRISPR–Cas systems and RNA-guided interference. Wiley Interdiscip. Rev.: RNA 2013, vol. 4 (pg. 267–278)
25. Fineran P.C., Charpentier E. Memory of viral infections by CRISPR–Cas adaptive immune systems: acquisition of new information. Virology 2013, vol. 434 (pg. 202–209)
26. Marraffini L.A., Sontheimer E.J. CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea. Nat. Rev. Genet. 2010, vol. 11 (pg. 181–190)
27. Westra E.R., Swarts D.C., Staals R.H., Jore M.M., Brouns S.J., van der Oost J. The CRISPRs, they are a-changin’: how prokaryotes generate adaptive immunity. Annu. Rev. Genet. 2012, vol. 46 (pg. 311–339)
28. Jore M.M., Brouns S.J., van der Oost J. RNA in defense: CRISPRs protect prokaryotes against mobile genetic elements. Cold Spring Harbor Perspect. Biol. 2012, vol. 4 pg. a003657
29. Wiedenheft B., Sternberg S.H., Doudna J.A. RNA-guided genetic silencing systems in bacteria and archaea. Nature 2012, vol. 482 (pg. 331–338)
30. Horvath P., Romero D.A., Coûté-Monvoisin A.-C., Richards M., Deveau H., Moineau S., Boyaval P., Fremaux C., Barrangou R. Diversity, activity, and evolution of CRISPR loci in Streptococcus thermophilus. J. Bacteriol. 2008, vol. 190 (pg. 1401–1412)
31. Tyson G.W., Banfield J.F. Rapidly evolving CRISPRs implicated in acquired resistance of microorganisms to viruses. Environ. Microbiol. 2008, vol. 10 (pg. 200–207)
32. Andersson A.F., Banfield J.F. Virus population dynamics and acquired virus resistance in natural microbial communities. Science 2008, vol. 320 (pg. 1047–1050)
33. Held N.L., Whitaker R.J. Viral biogeography revealed by signatures in Sulfolobus islandicus genomes. Environ. Microbiol. 2009, vol. 11 (pg. 457–466)
34. Levin B.R., Moineau S., Bushman M., Barrangou R. The population and evolutionary dynamics of phage and bacteria with CRISPR-mediated immunity. PLoS Genet. 2013, vol. 9 pg. e1003312
35. Pride D.T., Sun C.L., Salzman J., Rao N., Loomer P., Armitage G.C., Banfield J.F., Relman D.A. Analysis of streptococcal CRISPRs from human saliva reveals substantial sequence diversity within and between subjects over time. Genome Res. 2011, vol. 21 (pg. 126–136)
36. Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J.A., Charpentier E. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 2012, vol. 337 (pg. 816–821)
37. Mali P., Yang L., Esvelt K.M., Aach J., Guell M., DiCarlo J.E., Norville J.E., Church G.M. RNA-guided human genome engineering via Cas9. Science 2013, vol. 339 (pg. 823–826)
38. Cong L., Ran F.A., Cox D., Lin S., Barretto R., Habib N., Hsu P.D., Wu X., Jiang W., Marraffini L.A., Zhang F. Multiplex genome engineering using CRISPR/Cas systems. Science 2013, vol. 339 (pg. 819–823)
39. Jiang W., Bikard D., Cox D., Zhang F., Marraffini L.A. RNA-guided editing of bacterial genomes using CRISPR–Cas systems. Nat. Biotechnol. 2013, vol. 31 (pg. 233–239)
40. Deveau H., Barrangou R., Garneau J.E., Labonté J., Fremaux C., Boyaval P., Romero D.A., Horvath P., Moineau S. Phage response to CRISPR-encoded resistance in Streptococcus thermophilus. J. Bacteriol. 2008, vol. 190 (pg. 1390–1400)
41. Horvath P., Coûté-Monvoisin A.-C., Romero D.A., Boyaval P., Fremaux C., Barrangou R. Comparative analysis of CRISPR loci in lactic acid bacteria genomes. Int. J. Food Microbiol. 2009, vol. 131 (pg. 62–70)
42. Gasiunas G., Barrangou R., Horvath P., Siksnys V. Cas9–crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl. Acad. Sci. U.S.A. 2012, vol. 109 (pg. E2579–E2586)
43. Sapranauskas R., Gasiunas G., Fremaux C., Barrangou R., Horvath P., Siksnys V. The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli. Nucleic Acids Res. 2011, vol. 39 (pg. 9275–9282)
44. Magadán A.H., Dupuis M.-È., Villion M., Moineau S. Cleavage of phage DNA by the Streptococcus thermophilus CRISPR3–Cas system. PLoS ONE 2012, vol. 7 pg. e40913
45. Sinkunas T., Gasiunas G., Fremaux C., Barrangou R., Horvath P., Siksnys V. Cas3 is a single-stranded DNA nuclease and ATP-dependent helicase in the CRISPR/Cas immune system. EMBO J. 2011, vol. 30 (pg. 1335–1342)
46. Sinkunas T., Gasiunas G., Waghmare A.P., Dickman M.J., Barrangou R., Horvath P., Siksnys V. In vitro reconstitution of Cascade-mediated CRISPR immunity in Streptococcus thermophilus. EMBO J. 2013, vol. 32 (pg. 385–394)
47. Darling A.E., Mau B., Perna N.T. ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE 2010, vol. 5 pg. e11147
48. Broadbent J.R., McMahon D.J., Welker D.L., Oberg C.J., Moineau S. Biochemistry, genetics and applications of exopolysaccharide production in Streptococcus thermophilus. J. Dairy Sci. 2003, vol. 86 (pg. 407–423)
49. Fontaine L., Boutry C., de Frahan M.H., Delplace B., Fremaux C., Horvath P., Boyaval P., Hols P. A novel pheromone quorum-sensing system controls the development of natural competence in Streptococcus thermophilus and Streptococcus salivarius. J. Bacteriol. 2010, vol. 192 (pg. 1444–1454)
50. Fontaine L., Dandoy D., Boutry C., Delplace B., de Frahan M.H., Fremaux C., Horvath P., Boyaval P., Hols P. Development of a versatile procedure based on natural transformation for marker-free targeted genetic modification in Streptococcus thermophilus. Appl. Environ. Microbiol. 2010, vol. 76 (pg. 7870–7877)
51. Dandoy D., Fremaux C., de Frahan M.H., Horvath P., Boyaval P., Hols P., Fontaine L. The fast milk acidifying phenotype of Streptococcus thermophilus can be acquired by natural transformation of the genomic island encoding the cell-envelope proteinase PrtS. Microb. Cell Fact. 2011, vol. 10 Suppl. 1 pg. S21
52. Bourgoin F., Pluvinet A., Gintz B., Decaris B., Guédon G. Are horizontal transfers involved in the evolution of the Streptococcus thermophilus exopolysaccharide synthesis loci? Gene 1999, vol. 233 (pg. 151–161)
53. Delorme C., Bartholini C., Bolotin A., Ehrlich S.D., Renault P. Emergence of a cell wall protease in the Streptococcus thermophilus population. Appl. Environ. Microbiol. 2010, vol. 76 (pg. 451–460)
54. Paez-Espino D., Morovic W., Sun C.L., Thomas B.C., Ueda K.I., Stahl B., Barrangou R., Banfield J.F. Strong bias in the bacterial CRISPR elements that confer immunity to phage. Nat. Commun. 2013, vol. 4 pg. 1430
55. Young J.C., Dill B.D., Pan C., Hettich R.L., Banfield J.F., Shah M., Fremaux C., Horvath P., Barrangou R., Verberkmoes N.C. Phage-induced expression of CRISPR-associated proteins is revealed by shotgun proteomics in Streptococcus thermophilus. PLoS ONE 2012, vol. 7 pg. e38077
56. Weinberger A.D., Sun C.L., Pluciński M.M., Denef V.J., Thomas B.C., Horvath P., Barrangou R., Gilmore M.S., Getz W.M., Banfield J.F. Persisting viral sequences shape microbial CRISPR-based immunity. PLoS Comput. Biol. 2012, vol. 8 pg. e1002475
57. Karvelis T., Gasiunas G., Miksys A., Barrangou R., Horvath P., Siksnys V. crRNA and tracrRNA guide Cas9-mediated DNA interference in Streptococcus thermophilus. RNA Biol. 2013, vol. 10 (pg. 841–851)
58. Eng C., Thibessard A., Danielsen M., Rasmussen T.B., Mari J.F., Leblond P. In silico prediction of horizontal gene transfer in Streptococcus thermophilus. Arch. Microbiol. 2011, vol. 193 (pg. 287–297)
59. Rasmussen T.B., Danielsen M., Valina O., Garrigues C., Johansen E., Pedersen M.B. Streptococcus thermophilus core genome: comparative genome hybridization study of 47 strains. Appl. Environ. Microbiol. 2008, vol. 74 (pg. 4703–4710)
60. Sun C.L., Barrangou R., Thomas B.C., Horvath P., Fremaux C., Banfield J.F. Phage mutations in response to CRISPR diversification in a bacterial population. Environ. Microbiol. 2013, vol. 15 (pg. 463–470)
61. Dupuis M.-È., Villion M., Magadán A.H., Moineau S. CRISPR–Cas and restriction-modification systems are compatible and increase phage resistance. Nat. Commun. 2013, vol. 4 pg. 2087