The CRISPR (clustered regularly interspaced short palindromic repeat)/Cas9 adaptive immunity system has been harnessed for genome editing applications across eukaryotic species, but major drawbacks, such as the inefficiency of precise base editing and off-target activities, remain. A catalytically inactive Cas9 variant (dead Cas9, dCas9) has been fused to diverse functional domains for targeting genetic and epigenetic modifications, including base editing, to specific DNA sequences. As base editing does not require the generation of double-strand breaks, dCas9 and Cas9 nickase have been used to target deaminase domains to edit specific loci. Adenine and cytidine deaminases convert their respective nucleotides into other DNA bases, thereby offering many possibilities for DNA editing. Such base-editing enzymes hold great promise for applications in basic biology, trait development in crops, and treatment of genetic diseases. Here, we discuss recent advances in precise gene editing using different platforms as well as their potential applications in basic biology and biotechnology.
Genome engineering via the CRISPR (clustered regularly interspaced short palindromic repeat)/Cas9 system has revolutionized biology, therapeutics, and biotechnology. CRISPR/Cas9 is an adaptive immune system used to protect bacterial and archaeal species against invasion by foreign DNA and phages [1,2]. The CRISPR/Cas9 machinery acquires short fragments of such foreign DNA within an array of CRISPRs [3–6]. These DNA fragments function as molecular records of previous invasions and are transcribed together with CRISPR repeats as CRISPR RNA (crRNA). The crRNA and trans-activating crRNA (tracrRNA) form a complex with the Cas9 endonuclease and guide Cas9 to complementary sequences. Cas9 activity depends on the presence of a protospacer adjacent motif (PAM) sequence in the target DNA, thereby enabling the CRISPR/Cas9 machinery to recognize self from non-self DNA [7–9].
For genome engineering using CRISPR/Cas9, the crRNA and tracrRNA were combined into a single-guide RNA (sgRNA) capable of guiding Cas9 to complementary sequences. This resulted in a two-component system composed of the Cas9 endonuclease and sgRNA that can be engineered to target virtually any genomic locus and generate double-strand breaks (DSBs). DSBs are subsequently repaired via either the imprecise non-homologous end-joining (NHEJ) repair pathway or the precise homology-directed repair (HDR) pathway. NHEJ can be harnessed to generate gene knockouts, and HDR can be harnessed for precise editing of DNA sequences. NHEJ is much more efficient than HDR across eukaryotic cells, especially in non-dividing cells, which has made precise gene editing challenging and limited its application in gene therapy. Furthermore, precision gene editing requires a homologous template containing the desired change for HDR. The need to deliver this template to the DSB in the target cells has further limited gene editing applications.
A catalytically inactive Cas9 endonuclease (dead Cas9, dCas9) has been used as a DNA targeting module to tether different enzymatic activities to specific DNA sequences for a variety of applications, including transcriptional regulation, epigenetic modification, and fluorescent genome tracking [11–14]. Similarly, dCas9 has been used to tether DNA deaminases for gene editing purposes. Several recent reports have employed a variety of chimeric dCas9–DNA deaminase fusions for precision gene editing and the generation of site-specific protein variants, which were then used to select for gain-of-function drug resistance phenotypes [15–20]. DNA editing is a key technology for engineering novel protein functions and increasing trait diversity, for example, in crop species that are important for food security. Genetic variation is essential for evolution and organismal survival. In plants, modern crops have undergone extensive genetic changes through domestication from their wild relatives. Previously, radiation and chemical mutagenesis were used for forward genetic screens and to generate mutants with increased yield or other desirable traits, such as the semi-dwarf varieties that were key to the Green Revolution in the 1960s.
Recently, several groups have developed different platforms for efficient and precise DNA base editing. This has opened myriad possibilities for engineering single-base changes, diversifying a localized sequence, generating novel protein variants, and accelerating the evolution of specific proteins to generate crop cultivars that can cope with biotic or abiotic stresses [18,19,23,24]. These efforts would accelerate trait development and expand the range of traits in agriculture and, in humans, could lead to gene therapies for genetic diseases. Here, we discuss the development of DNA and RNA base editors, along with their applications and limitations. Undoubtedly, these base-editing approaches expand the molecular toolbox to engineer genomes for functional biology, biotechnology, and gene therapy, and therefore represent an important chapter in the effort to improve the quality of human life.
Development of DNA base editors
The use of CRISPR/Cas9 to generate gene knockouts is feasible and straightforward in transformable eukaryotic species [1,25]. However, making precise single-base changes or substitutions (base editing) remains challenging, mainly because HDR is very inefficient across eukaryotes [26–28]. Moreover, HDR requires a repair template to precisely repair the genomic sequence across the DSB. Single-base substitutions and localized sequence diversification are needed for directed evolution and for a variety of applications, including the generation of functional variants of proteins for basic studies or for gene therapy applications to treat genetic diseases [15,24].
In contrast with DSB–HDR-mediated genome editing, base editing involves site-specific modification of the DNA base along with manipulation of the DNA repair machinery to avoid faithful repair of the modified base. Base editors are chimeric proteins composed of a DNA targeting module and a catalytic domain capable of deaminating a cytidine or adenine base (Figure 1B). There is no need to generate DSBs to edit DNA bases, thereby limiting the generation of insertions and deletions (indels) at target and off-target sites [17,32]. Several groups have developed different base-editing systems with different architectures, catalytic activities, and potential modifications. In most such systems, the DNA targeting module is based on dCas9 guided by an sgRNA molecule. Cas9 nickase can also be used as the targeting module, resulting in high frequencies of base editing (Figure 1B) [33,34].
Figure 1. The enzymatic activity, subsequent cellular repair events, and molecular modules of base editors.
Cytidine deaminase-based DNA base editors
Cytidine deaminase-based base editors have been developed by two groups (the Liu and Kondo groups); these enzymes catalyze the conversion of cytosine into uracil [17,19]. The first base editor was developed by the Liu group in 2016. This base editor, composed of dCas9 and the APOBEC deaminase, successfully converted cytidine into thymidine with a catalytic window of activity of −16 to −12 bp from the PAM sequence (Figure 1B). In vivo, the APOBEC family of cytidine deaminases prevents HIV infection by base editing of the viral genome: APOBEC3G deaminates the HIV genome, rendering it incapable of replication [20,35]. In the base-editing system, APOBEC, guided by dCas9, deaminates a specific cytidine to uracil; the resulting U–G mismatches are resolved via repair mechanisms to form U–A base pairs and, subsequently, T–A base pairs. Thus, these base editors can be used to produce C-to-T point mutations (Figure 1A).
After deamination, the resulting DNA lesion can follow several routes: the uracil can be replaced by thymidine during DNA replication; base excision repair can remove the uracil and allow the incorporation of any base; and mismatch repair coupled to trans-lesion synthesis by an error-prone polymerase can increase mutations at nearby nucleotides (Figure 1A). The hydrolytic deamination of cytosine by deaminases generates uracil as a product; uracil is read as thymine by the cellular machinery (Figure 1A). Subsequently, another base editor, BE2, was developed by the addition of a uracil DNA glycosylase inhibitor (UGI) (Figure 1A,B). Cytidine deaminase converts C into U, and cellular uracil DNA glycosylase can then initiate error-free repair, converting the U back into the wild-type sequence. The addition of UGI inhibits the base excision repair pathway, resulting in a three-fold increase in editing efficiency (Figure 1A). Another major improvement of the system was achieved by the development of BE3, which uses the Cas9 D10A nickase, resulting in a six-fold increase in base-editing efficiency [38–41]. However, this increased activity was accompanied by an increased indel frequency (Figure 1B).
Multiple additional base-editing systems have been developed, with different deaminases and different targeting factors. In vivo, activation-induced cytidine deaminase (AID) facilitates antibody diversification via somatic hypermutation and class switch recombination. AID targets the Ig locus and generates diverse mutations that are selected through antigen binding. Loss of AID function or promiscuous AID activity leads to serious disease states [42,43]. The target-AID system uses a nickase to recruit the cytidine deaminase PmCDA1 (Figure 1B). dCas9–PmCDA1 exhibits modest catalytic activity; when Cas9 nickase was used instead, the efficiency was significantly improved, but with more indels. The BE3 system is composed of a Cas9 nickase, a cytidine deaminase, and UGI, and is thus similar in architecture to target-AID (Figure 1B). The BE3 system has been very successful in editing the genomes of mammalian and plant cells for a variety of applications [17,19]. Because the activity of UGI is essential for inhibiting base excision repair and improving the base-editing efficiency, BE4 was developed to include two UGI molecules at the C- and N-termini. Although BE4 is more efficient than BE3 and target-AID, the window of catalytic activity is different (Figure 1B). Target-AID has a more PAM-distal window of activity and is therefore more suitable for some applications, including directed evolution and generation of gain-of-function protein variants.
It is worth noting that base editing is not limited to the Cas9 backbone. For example, Cpf1-based deaminases were also generated for various base-editing purposes. Cpf1 is a type V class 2 CRISPR endonuclease. Cpf1 lacks the HNH endonuclease domain, favors T-rich (TTTN) protospacer adjacent motifs (PAMs), and generates staggered DNA double-strand breaks with four- or five-nucleotide 5′ overhangs. The ability of Cpf1 to process its own crRNA enhances its use in multiplex genome targeting. The PAM-distal cuts generated by Cpf1 enable successive rounds of sgRNA binding, thus resulting in more mutagenesis events. Piatek et al. and Tang et al. have achieved targeted transcriptional regulation in planta via the use of a catalytically inactive Cas9 (dCas9) or dead Cpf1 (dCpf1), respectively. Furthermore, Li et al. have generated the first Cpf1-based cytidine deaminase base editor. This base editor, called dLbCpf1-BE0, is a fusion of a rat APOBEC1 domain, catalytically inactive Lachnospiraceae bacterium Cpf1 (dLbCpf1), and uracil DNA glycosylase inhibitor (UGI). Cpf1 base editors would extend the base-editing capacity to sites where Cas9 cannot bind, specifically T-rich sequences. The editing window of this base editor ranges from positions 8 to 13 bp preceding the PAM and exhibits an editing efficiency of 20–22%. Similarly, Li et al. have generated other fusions based on the Cas9-BEs generated by Komor et al. These fusions include dCpf1-BE-YE, dCpf1-eBE, and dCpf1-eBE-YE. Intriguingly, CRISPR–Cpf1-based BEs exhibit low levels of indel formation and non-C-to-T substitutions, thereby enabling editing at A/T-rich positions.
What determines the best base editor for a given application? The choice of base editor will depend on the availability of a PAM sequence, the presence of a C nucleotide at a suitable position relative to the PAM, how much indel generation can be tolerated, and how the base-editor reagents are delivered to the target cell. Furthermore, the nature of the edits could also be determined by the base editor. For example, BE4 and target-AID mediate C-to-T conversion. Proper control over indels through base excision repair inhibition remains crucial to the development of base-editing platforms. Fusion of a bacteriophage Mu protein (Gam) to a base editor allows it to bind the ends of any DSBs that are generated and protect them from degradation by exonucleases, thereby decreasing the indel frequency (Figure 1A).
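To illustrate how PAM availability and the position of a C relative to the PAM constrain target choice, the following minimal Python sketch scans one strand of a sequence for NGG PAMs and reports protospacers that contain a C inside the editing window. It is an illustration only. The function name find_cbe_targets, the 20-nt protospacer length, and the coordinate convention are assumptions made for this sketch: offset −1 is taken to be the base immediately 5′ of the PAM, so the −16 to −12 window quoted above for the first base editor maps to protospacer positions 5–9 counted from the PAM-distal end. Every C in the window is treated as converted, which overstates real editing outcomes.

```python
import re


def find_cbe_targets(seq, window=(-16, -12), protospacer_len=20):
    """List (protospacer, PAM, edited protospacer) for NGG sites on one strand.

    Assumed convention (not from the source): offset -1 is the base
    immediately 5' of the PAM; offset -protospacer_len is the PAM-distal end.
    """
    seq = seq.upper()
    # Convert PAM-relative offsets into 0-based protospacer indices.
    lo = protospacer_len + window[0]
    hi = protospacer_len + window[1]
    hits = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < protospacer_len:
            continue  # not enough upstream sequence for a full protospacer
        proto = seq[pam_start - protospacer_len:pam_start]
        # Idealized outcome: every C inside the window becomes T.
        edited = "".join(
            "T" if lo <= i <= hi and base == "C" else base
            for i, base in enumerate(proto)
        )
        if edited != proto:
            hits.append((proto, match.group(1), edited))
    return hits
```

Passing a different window or protospacer length adapts the same scan to other editors; sites on the opposite strand would require running it on the reverse complement as well.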
Adenine deaminase-based DNA base editors
Adenine base editors have been generated to modify adenine bases [16,49]. The deamination of adenosine yields inosine, which base pairs with cytidine and is subsequently corrected to guanine, thereby converting A into G, or A–T into G–C (Figure 1A). It is worth noting that adenine DNA deaminases do not exist in nature; adenine base editors were therefore generated by directed evolution. Escherichia coli TadA (ecTadA), a tRNA adenine deaminase, was used as the starting point to develop adenine base editors, which catalyze adenine deamination on single-stranded DNA and convert adenine into inosine [16,51–53]. Gaudelli et al. replaced the rAPOBEC1 cytidine deaminase in BE3 with ecTadA, which converts adenine into inosine in the single-stranded anticodon loop of tRNA-Arg and shares sequence similarity with the APOBEC enzyme [16,51,52]. To test TadA on a DNA target, Gaudelli et al. used protein engineering and homology studies to determine the residual activity of ecTadA–dCas9 on DNA. The first-generation adenine base editors were developed through an antibiotic resistance complementation approach in bacteria, in which the bacteria had to mutate the adenine-editing domain of ecTadA–dCas9 in order to correct a targeted adenine in a mutant chloramphenicol resistance gene. The application of antibiotic selective pressure allowed for the generation of adenine base editors with different editing windows of activity. The most active adenine base editors (ABEs) generated include ABE5.3, with an activity window of 3–6 bp from the protospacer, and ABE7.8, ABE7.9, and ABE7.10, with an activity window of 4–9 bp from the protospacer (Figure 1B). In summary, base editors using cytosine and adenosine deaminases can convert C–G into T–A (through a U–G intermediate) and A–T into G–C. These base modifications can generate targeted sequence variation in a precise manner and without the need to generate DSBs.
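The effect of an activity window on the editing outcome can be sketched in a few lines of Python. This is a simplified illustration, not a model of real editing efficiencies: the helper name apply_abe is hypothetical, positions are assumed to be counted from the PAM-distal end of the protospacer (position 1), and every A inside the window is treated as converted to G.

```python
def apply_abe(protospacer, window=(4, 9)):
    """Return the idealized A-to-G edited protospacer and the edited positions.

    Assumption for this sketch: positions are 1-based and counted from the
    PAM-distal end of the protospacer; the default window of 4-9 is the one
    quoted in the text for ABE7.8, ABE7.9 and ABE7.10.
    """
    first, last = window
    edited = []
    positions = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if first <= pos <= last and base == "A":
            edited.append("G")  # A -> I, which is read and repaired as G
            positions.append(pos)
        else:
            edited.append(base)
    return "".join(edited), positions
```

Calling apply_abe(protospacer, window=(3, 6)) would apply the narrower window quoted for ABE5.3 under the same assumptions.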
ADAR2-based RNA base editors
Very recently, RNA base editors have been developed and used to modulate biological processes. Several systems, including ADAR2, which deaminates adenosine to inosine (read as guanine by the translational machinery), have been used for RNA editing. Intriguingly, an RNA-guided ribonuclease system using CRISPR/Cas13 has been recently repurposed to edit adenosine to inosine in mRNA sequences via the use of a catalytically inactive Cas13 protein and the deaminase activity of ADAR2. This system, referred to as RNA editing for programmable adenosine to inosine replacement (REPAIR), and similar systems hold great promise to treat genetic diseases. The major advantage of RNA editing systems is that there is no permanent change in the genome; these reagents are therefore expected to be safer than DNA base editors.
Directed evolution via DNA base editors
Two recent platforms, targeted AID-mediated mutagenesis (TAM) and CRISPR-X, generate localized sequence diversification, which is ideal for the generation of mutant variants with gain-of-function phenotypes [23,24]. Most base editors induce base changes in a very narrow window proximal to the PAM sequence [31,49]. To achieve localized sequence diversification, which enables accelerated directed evolution of proteins, base editors with a wider window of activity are needed. The base editor would serve as a hypermutator, rather than an editor, to produce sequence variants, some of which could have novel functions. For example, TAM and CRISPR-X systems are suitable for directed evolution [23,24,56]. The TAM system is composed of dCas9 and AID, and dCas9–AIDx exhibits strong activity (>20%) with transitions and transversions from cytidine and guanine to the other three bases. Notably, when UGI is co-expressed, an increase in mutagenesis was observed with a bias for C → T transitions and a catalytic window of activity between −16 and −12 bp from the PAM sequence (Figure 1A). In CRISPR-X, dCas9-AID* was used, but AID* was fused to MS2 and recruited to the target sequence via MS2 hairpins engineered in the sgRNA sequence. CRISPR-X exhibited a window of catalytic activity between −50 and +50 bp from the PAM sequence (Figure 1A). The catalytic activities of TAM and CRISPR-X can convert C into A, G, or T and G into A, C, or T [23,24]. TAM and CRISPR-X are ideal platforms for localized mutagenesis or applications that require the development of mutant variants.
Base editors provide effective reagents to potentially treat human genetic diseases, two-thirds of which are due to single-base alterations. Moreover, such base editors can help model, study, and correct various genetic diseases. Therefore, the most important use of base editors is in gene therapy to treat debilitating genetic diseases. However, there are many potential applications across eukaryotic systems for a variety of basic biology and biotechnology purposes. For example, several reports have demonstrated the applications of base editors in developing herbicide resistance. Indeed, targeted mutagenesis has been used to edit the ALS gene to develop herbicide resistance. Precise base editing has been successfully implemented in wheat, rice, maize, and tomato [18,19]. Important applications examining protein functions and producing functional variants are expected to revolutionize basic research and biotechnological applications. CRISPR/Cas9 base editors can be used to investigate allelic variation and the impact of certain variants on protein functions. Base editing can help to interrogate the function of the non-coding genome and regulatory elements. It can also be used to map protein–protein and protein–drug interactions and to understand the molecular underpinnings of protein regulation via drug interactions. Base editors can also be used for a variety of synthetic biology applications, including metabolic engineering of bacteria and yeast to produce select chemicals or identify protein variants with desired functionality or properties.
Base engineering technology is still in its infancy, and major developments are expected to expand the base-editing molecular toolbox with novel activities and modifications. The impact of this technology will be significant in plant science and agriculture, where off-target activities are low and the base-editor machinery can be segregated away from the intended modifications. These base editors would expand the molecular toolbox for targeted improvement of crop traits. One important use of base-editing reagents would be in genome-wide screens. Base editors can be used with a library of sgRNAs to generate mutants genome-wide; these mutants can then be screened for gain-of-function phenotypes. Once the desired phenotype is identified and selected, the causal gene can be easily identified through the sequencing of the sgRNA molecule. Therefore, a CRISPR-based genome-wide screen can be used to develop novel traits of value in crop species and answer basic questions in model and non-model plants through the generation of important localized variation in a protein of interest (Figure 2). Thus, base editors provide an excellent platform for precise gene editing in plants to develop mutant variants that can be easily screened and genotyped to determine the causal gene. Targeted generation of different alleles of a specific gene can be quite useful to generate mutant variants with gain-of-function phenotypes (Figure 2). Accelerated evolution of a particular gene can be undertaken, and selective pressure from biotic or abiotic factors can be applied to accelerate the evolution of specific mutant variants. Such screens would depend on the genome-wide activities of base editors and the identification of causal genes. However, since the off-target activities will be determined by the sgRNA sequence, it is of paramount importance to assess the off-target activities of base editors. This will help make sure the bona fide causal gene is identified.
Figure 2. CRISPR-mediated genome-wide screening (CRISPR-GWS) via different CRISPR platforms.
Several approaches have been used to determine the off-target activities of Cas9 [58–61]. Since base editors produce single-base changes rather than DSBs, it is more difficult to determine their off-target activities. One option is digested genome sequencing (Digenome-seq), which can be modified and adapted to determine the off-target activities of base editors. Further efforts will focus on developing technologies to assess and identify off-target activities of cytidine and adenine base editors. It is worth noting that other modifications can be used to reduce off-target activities of base editors, including the delivery of base editors as ribonucleoprotein complexes (RNPs) rather than their production from expression constructs (Figure 2). This would ensure that the base editors are active transiently and degraded by proteases after a brief window of time. A great advantage of the use of RNPs is that it reduces chimeric or mosaic modifications in progeny plants. There is also a pressing need to generate plants that carry no foreign DNA but have the user-desired edits. Therefore, base editors can be delivered as RNPs into protoplasts where base editing can be achieved, and protoplast cells can be regenerated into whole plants carrying the gene edits but with no foreign DNA (Figure 2). Such a platform has been shown previously to work with CRISPR/Cas9. Furthermore, delivery of RNP complexes of base editors into plant cells, for example germ-line cells, via other means would lead to the efficient generation of edited plants. In addition, the use of sgRNAs with specific architectures can reduce the off-target activities of Cas9 proteins. The use of such architectures might be useful to reduce the off-target activities of base editors. Such sgRNA modifications include the use of an extra G nucleotide at the 5′ end of the sgRNA molecule and/or the use of truncated sgRNA molecules.
Base editors can be used to develop virus resistance in prokaryotes, plants, and other eukaryotic species. Several reports have shown the feasibility of the use of the CRISPR/Cas systems to generate virus resistance against DNA and RNA viruses in plants [64–66]. Conceivably, base editors can be used to target the virus genome and generate stop codons through CRISPR-STOP or iSTOP, leading to the generation of nonfunctional proteins and subsequently limiting the virus propagation and systemic spread across the plant tissues. Similarly, base editors can be used to engineer plants and other eukaryotes with immunity against different single and multiple pathogens by targeting and modifying the genome.
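The idea of generating stop codons by C-to-T editing can be made concrete with a short Python sketch that lists the codons in a coding sequence that a cytidine base editor could, in principle, turn into a stop codon. On the coding strand, C-to-T conversion turns CAA, CAG, and CGA into TAA, TAG, and TGA; editing the template strand appears as G-to-A on the coding strand and turns TGG into a stop codon. The function name stop_candidates and the output format are hypothetical, and the sketch deliberately ignores PAM availability and the editing window, both of which would restrict the usable sites in practice.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# (base as read on the coding strand, base after editing, strand deaminated)
# Deaminating a C on the template strand reads as G -> A on the coding strand.
EDITS = [("C", "T", "coding"), ("G", "A", "template")]


def stop_candidates(cds):
    """Return (codon index, codon, resulting stop codon, edited strand).

    The codon index is 0-based and the reading frame is assumed to start at
    the first base of `cds`. All editable bases in a codon are converted at
    once, an idealization; for TGG, editing either G alone would already
    give a stop codon (TAG or TGA).
    """
    cds = cds.upper()
    hits = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon
        for src, dst, strand in EDITS:
            edited = codon.replace(src, dst)
            if edited in STOP_CODONS:
                hits.append((i // 3, codon, edited, strand))
    return hits
```

A practical design tool would intersect these candidates with the protospacers that place the target base inside the editor's window.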
The current base editors can efficiently produce C → T and A → G mutations [16,17]. However, they have several limitations. First, other point mutations are not feasible with the base-editor systems. Second, the window of activity of base editors may be narrow. Some CRISPR-X variants exhibit a window of activity from −50 to +50 bp relative to the PAM, but generally the window of activity is very narrow. Third, the precision of base editing is also presently lacking; for example, it is not possible to modify a certain cytosine base in a stretch of cytosine bases in a DNA sequence. Fourth, the PAM sequence dependency needs to be resolved through the use of other Cas9 variants to expand the targeting range of base editors. Moreover, engineered Cas9 variants with reduced off-target activities need to be applied. Furthermore, recent reports show improvement of the specificity of the Cas9 system when RNPs are used for base editing [32,69]. Delivery of RNPs would depend on the target cell and organism, and the base-editing efficiency would depend on the delivery method. Finally, base-editor efficiency should be significantly improved so that base editing can be applied to other molecules such as RNA. RNA editing has been recently reported, but significant improvements are needed to increase the efficiency of base editors in RNA editing to allow their use in practical applications and for basic research purposes [55,70]. Therefore, efforts are needed to develop efficient base-editing tools to expand their use in basic biology and biotechnology. These tools will help establish unique platforms for genome editing with applications across eukaryotic systems. These efforts will constitute an important chapter in the genome editing book.
In conclusion, CRISPR base editors apply chemical principles to gene editing and provide powerful systems to precisely edit the genome for functional biological studies, and for various applications in biotechnology and gene therapy.
Abbreviations
ABEs, adenine base editors
AID, activation-induced cytidine deaminase
CRISPR, clustered regularly interspaced short palindromic repeats
dLbCpf1, catalytically inactive Lachnospiraceae bacterium Cpf1
ecTadA, E. coli TadA
indels, insertions and deletions
PAM(s), protospacer adjacent motif(s)
REPAIR, RNA editing for programmable adenosine to inosine replacement
TAM, targeted AID-mediated mutagenesis
UGI, uracil DNA glycosylase inhibitor
Research in M.M.M.'s laboratory for genome engineering is supported by the King Abdullah University of Science and Technology.
We thank members of the genome-engineering laboratory at KAUST for discussions.
The Authors declare that there are no competing interests associated with the manuscript.