Recent advances in genome editing technologies are allowing investigators to engineer and study cancer-associated mutations in their endogenous genetic contexts with high precision and efficiency. Of these, base editing and prime editing are quickly becoming gold standards in the field due to their versatility and scalability. Here, we review the merits and limitations of these precision genome editing technologies and their application to modern cancer research, and we speculate on how they could be integrated to address future directions in the field.
Introduction
Cancer is a complex disease initiated and driven by a diverse spectrum of genetic and epigenetic alterations. Clinical DNA sequencing efforts across cancer patients and tumor types have revealed thousands of somatic mutations and allelic variants within cancer-associated genes — a number that continues to grow on a daily basis [1]. The biological function and significance of most genetic variants remain unknown; in fact, >50% of the genetic variants that have been cataloged in the ClinVar database are annotated as ‘variants of unknown significance’, or VUS [2]. Understanding how specific genetic variants affect cancer development, progression, and other important hallmarks of the disease [3] is critical for treating it and developing targeted therapeutics.
Functional genomic assays have long served as a critical pillar for dissecting the mechanistic basis by which variants produce diverse types of oncogenic phenotypes. Informative functional studies often require that genetic variants be modeled in a physiologically relevant manner and context. While certain variants, particularly gain-of-function mutations (e.g. KRASG12D) can be introduced into cells by way of exogenous cDNA overexpression constructs, it is important to note that this type of approach does not accurately recapitulate the temporal, stoichiometric, and regulatory variables associated with endogenous gene expression; in fact, many examples have shown that these can produce contrasting effects [4–9]. Thus, there remains a great need to study cancer mutations in their native genetic environment, which is becoming increasingly possible thanks to rapidly-evolving genome engineering tools. In particular, emerging precision genome editing technologies that enable the installation of genetic variants at defined loci are proving to be instrumental to dissect the cellular and molecular mechanics of variant-induced cancer phenotypes.
Genome editing: origins
Much of the initial work in genome engineering came from knowledge of DNA repair pathways. At the site of a double-strand break, DNA can be repaired by error-prone non-homologous end joining (NHEJ), where the two ends are ligated together, often with the inadvertent incorporation of insertions or deletions (indels). Alternatively, through homology directed repair (HDR), donor DNA can be used as a repair template to introduce a sequence at the break site [10,11]. The first major breakthrough in genome engineering came from proteins such as zinc finger nucleases (ZFNs), and later transcription activator-like effector nucleases (TALENs), which can be designed to bind distinct sequences of DNA and induce a double-strand break [12]. Resolution via the NHEJ pathway and the subsequent formation of indels will often disrupt the reading frame and knock out the gene of interest. For instance, the feasibility of this approach was demonstrated using ZFNs to target the oncogenic BCR-ABL fusion gene in murine Ba/F3 cells, rendering them growth factor dependent [13]. Specific variants can also be engineered at the break site with the addition of a donor DNA template, which enables repair via the HDR pathway, albeit at low efficiency. An early use of genome editing via ZFN-mediated HDR was to correct pathogenic p53 mutations in yeast [14]. While powerful, functional genetic experiments with these techniques remain inefficient and costly because both methods require protein engineering for each desired target site and rely on protein-DNA interactions for genome targeting that are challenging to predict and control. Nevertheless, ZFNs remain a promising therapeutic modality for treating many diseases.
Genome engineering technologies developed over the last decade or so have rapidly accelerated the field's ability to interrogate genetic variation in cancer. The discovery and repurposing of the ancient bacterial adaptive immune mechanism, clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR associated protein 9 (which together form CRISPR-Cas9), allowed researchers to generate double-strand breaks at specific target sites in the genome without extensive protein engineering [15–20]. Specifically, CRISPR-Cas9 relies on a single guide RNA (sgRNA) carrying a unique 20-nucleotide protospacer sequence to direct the Cas9 nuclease to its target site (Figure 1A). There, the protospacer portion of the sgRNA anneals to its DNA complement, forming a stable RNA-DNA duplex. Cas9 induces a double-strand break upon recognition of a specific protospacer adjacent motif (PAM) immediately downstream of the protospacer, often resulting in a genetic knockout via generation of out-of-frame indels. Among other examples, this approach has been used extensively during the last decade to identify cancer-specific genetic dependencies [21], mechanisms of drug response and resistance [22–24], essential domains for protein function [22], and mediators of metastasis [25–27]. Beyond loss-of-function mutations, other work has demonstrated that Cas9-mediated HDR can be used to engineer specific single nucleotide variants (SNVs) or indels (Figure 1A) in a multiplexed fashion in vitro [18,28] or in vivo [29]. Furthermore, Cas9-mediated knockout can be coupled with HDR [30] or autochthonous mouse models [31] to assess tumorigenicity of different genetic perturbation combinations. CRISPR-Cas9 has also been adapted to modulate gene expression by linking a catalytically dead Cas9 (dCas9) to various transcriptional proteins, allowing one to inhibit (CRISPRi) or activate (CRISPRa) gene expression (Figure 1B) [32–34].
Finally, CRISPR-induced perturbations can be coupled with single-cell RNA sequencing (sc-RNAseq) to understand how certain genes or non-coding variants modulate the transcriptome [35]. Despite these advances, the use of CRISPR-Cas9 to model specific alterations like SNVs remains imprecise, inefficient, and limited to one or a few loci. Furthermore, Cas9-based editing can produce genotoxic, off-target double-strand breaks, potentially resulting in undesired chromosomal rearrangements and/or indels [36–39]. These drawbacks have prompted the development of alternative strategies to model and assay genetic variation.
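The target-site rule described above (a 20-nucleotide protospacer immediately followed by a PAM) can be illustrated with a short Python sketch. It assumes the canonical NGG PAM of Streptococcus pyogenes Cas9, which is not specified in the text above, and the function names are hypothetical. It is a toy enumeration of candidate sites, not a guide design tool: it ignores off-target scoring and sequence-context effects.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_spcas9_sites(seq: str, protospacer_len: int = 20):
    """List candidate Cas9 sites as (strand, start, protospacer, pam).

    Assumption: the PAM is NGG (SpCas9) and sits directly 3' of the
    protospacer. For the '-' strand, `start` is an index into the
    reverse-complemented sequence, not the original one.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(template):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites


if __name__ == "__main__":
    # Made-up sequence used only to exercise the function.
    example = "TTTT" + "ACGT" * 5 + "TGG" + "AAAA"
    for site in find_spcas9_sites(example):
        print(site)
```

Scanning both strands matters because a PAM on either strand makes the locus targetable.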
Figure 1. The modular precision genome editing toolkit.
(A) Traditional CRISPR-Cas9 systems generate double-strand breaks (DSBs) after being directed to a locus by an sgRNA. This DSB can be resolved via non-homologous end joining (NHEJ), or via homology directed repair (HDR) when provided with an exogenous donor DNA template. (B) Different precision genome editors are generated by fusing effector domains to different Cas proteins. These editors take direction from the information encoded within sgRNAs or pegRNAs.
Precision genome editors
Cytosine and adenine base editors
A key scientific breakthrough that propelled the field from ‘standard genome editing’ to ‘precision genome editing’ was the development of cytosine base editors (CBEs) by Komor et al. [40] in the laboratory of David Liu. These enzymes couple a modified Cas9 enzyme with a cytosine deaminase to produce endogenous C•G to T•A transition mutations at specific sites in the genome without the need to induce a double-strand break (Figure 1B). The first CBEs consisted of a dCas9 enzyme fused to the rat APOBEC1 or AID protein, which resulted in sgRNA-dependent C•G to T•A conversions within targeted sites [40–43]. Subsequent protein engineering efforts led to the inclusion of a uracil glycosylase inhibitor, which increased editing efficiency by preventing corrective base excision repair mechanisms. Additional optimization work, including modifications to linker and nuclear localization sequences, as well as replacing dCas9 with a single-strand DNA nicking Cas9 (nCas9), further increased editing efficiency and precision [40]. Since the emergence of first-generation base editors, modifications to the Cas protein have increased editing efficiency and precision [44–51] as well as PAM site flexibility, and the incorporation of alternative deaminase domains like TadA has lowered off-target editing activity [52–55]. Finally, species-specific codon optimization has allowed these enzymes to be efficiently expressed in mammalian cells, organoids, and even in animal tissues in vivo [56–61]. New generations of CBEs that further increase editing efficiency while decreasing unwanted, off-target edits are continually being developed within the field [62].
Shortly after the development of CBEs, a similar approach was used by Gaudelli et al. [63] to engineer adenine base editors (ABEs). These editors link nCas9 to a transfer RNA adenosine deaminase, which is capable of generating A•T to G•C transition mutations (Figure 1B). Additional directed evolution efforts of these editors have progressively increased editing efficiency, narrowed editing windows, and decreased indel formation across each generation [64,65]. Much like CBEs, ABEs can be engineered to utilize different Cas enzymes, increasing PAM site flexibility and the number of sites that can be targeted by the protein [66]. ABEs have been shown to work with high efficiency across diverse cell types and contexts, including primary hematopoietic cells, organoids, and even non-human primates [55,67–70].
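To make the notion of an editing window concrete, the following Python sketch applies idealized CBE (C to T) or ABE (A to G) activity to a protospacer. The window of positions 4 to 8 (counting the PAM-distal end as position 1) is an illustrative assumption typical of early editors rather than a value taken from the text, and real editors convert each base with a position- and context-dependent probability rather than completely. All function names are hypothetical.

```python
# Base conversions written on the protospacer (PAM-containing) strand.
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def editable_positions(protospacer, editor, window=(4, 8)):
    """Return 1-based positions inside the window that carry the editor's substrate base.

    Assumption: position 1 is the PAM-distal end of the protospacer and
    the window bounds are inclusive.
    """
    source_base, _ = CONVERSIONS[editor]
    low, high = window
    return [i for i in range(low, high + 1) if protospacer[i - 1] == source_base]


def simulate_full_conversion(protospacer, editor, window=(4, 8)):
    """Convert every substrate base in the window (an idealized upper bound).

    Returns the edited protospacer and the list of converted positions.
    """
    _, product_base = CONVERSIONS[editor]
    positions = editable_positions(protospacer, editor, window)
    edited = list(protospacer)
    for pos in positions:
        edited[pos - 1] = product_base
    return "".join(edited), positions


def bystander_positions(protospacer, editor, intended, window=(4, 8)):
    """Positions other than the intended one that could also be edited."""
    positions = editable_positions(protospacer, editor, window)
    if intended not in positions:
        raise ValueError("intended position is not editable within the window")
    return [pos for pos in positions if pos != intended]


if __name__ == "__main__":
    # Made-up protospacer used only to exercise the functions.
    guide = "GACCATCGAGTTCAGGCTAA"
    print(simulate_full_conversion(guide, "CBE"))
    print(bystander_positions(guide, "CBE", intended=7))
```

A guide whose window contains a single substrate base has no potential bystanders under this model, which is one reason narrower editing windows are desirable.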
Prime editors
Beyond transition mutations, cancer is characterized by many other genetic alterations, including transversion mutations, insertions, and deletions. Until very recently, modeling these alterations at endogenous loci could only be achieved using HDR-based editing. The recent emergence of a new type of precision genome editing technology, prime editing, has circumvented this roadblock [71]. Prime editing, which was developed by Anzalone et al. [71], is capable of engineering all types of SNVs and small indels without the need to induce double-strand breaks. The editor, which is composed of nCas9 fused to a reverse transcriptase, can complex with a prime editing guide RNA (pegRNA) and travel to the target site specified by the protospacer region of the pegRNA (Figure 1B). Once the protospacer is bound to its DNA complement, nCas9 introduces a single-strand nick on the opposing strand, and the reverse transcriptase uses a template encoded within the pegRNA (referred to as the reverse transcription template, or RTT) to synthesize a new, single-stranded DNA sequence containing the desired edit on the nicked strand. The newly edited strand is incorporated into the genome at a variable frequency through a still poorly understood mechanism that can depend on the edit type, pegRNA design, and other factors. Due to its mechanism of action, prime editing is often referred to as a ‘search-and-replace’ method because prime editors are directed to engineer a mutation of interest at a specific site in the genome by virtue of the instructions encoded in a pegRNA, which contains both a protospacer (the ‘search’ sequence) and a 3′ extension sequence (the ‘replace’ sequence that dictates the mutation to be installed at the site). Because the mutation is encoded in a modular pegRNA, prime editing can be used to engineer any type of SNV and small indel.
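The ‘search-and-replace’ logic can be sketched in Python by deriving the 3′ extension of a pegRNA for a single-nucleotide substitution. Several details here are assumptions that go beyond the text: the nick is placed 3 nucleotides upstream of the PAM, the extension is split into the RTT and a primer binding site (PBS, a standard pegRNA component not named above), and the default lengths (13-nt PBS, 15-nt RTT) are arbitrary illustrative choices. The function names are hypothetical, and real pegRNA design requires empirical or model-based optimization.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_3prime_extension(pam_strand, protospacer_start, edit_index, new_base,
                            pbs_len=13, rtt_len=15):
    """Sketch the pegRNA 3' extension (5'-RTT-PBS-3') for one substitution.

    pam_strand        : target sequence on the PAM-containing (nicked) strand
    protospacer_start : 0-based index of the first protospacer base
    edit_index        : 0-based index of the base to replace
    new_base          : replacement base, written on the PAM strand
    """
    # Assumed nick site: between protospacer positions 17 and 18.
    nick = protospacer_start + 17
    if not nick <= edit_index < nick + rtt_len:
        raise ValueError("edit must lie within the RTT-templated region")
    if nick - pbs_len < 0 or nick + rtt_len > len(pam_strand):
        raise ValueError("sequence too short for requested PBS/RTT lengths")

    edited = pam_strand[:edit_index] + new_base + pam_strand[edit_index + 1:]
    # The PBS anneals to the free 3' end just upstream of the nick.
    pbs = reverse_complement(pam_strand[nick - pbs_len:nick])
    # The RTT templates the edited sequence downstream of the nick.
    rtt = reverse_complement(edited[nick:nick + rtt_len])
    return {
        "protospacer": pam_strand[protospacer_start:protospacer_start + 20],
        "pbs": pbs,
        "rtt": rtt,
        "extension": rtt + pbs,
    }


if __name__ == "__main__":
    # Made-up target: 2 nt flank, 20-nt protospacer, TGG PAM, 16 nt flank.
    target = "TT" + "GACCATCGAGTTCAGGCTAA" + "TGG" + "CATTCGATCGGATCCA"
    print(design_3prime_extension(target, 2, 21, "C"))
```

Reverse-complementing the extension recovers the PBS-annealed sequence followed by the edited DNA flap, which is a quick sanity check on any design.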
Much like base editors, significant work has been done to optimize prime editing components (or add new ones) to push the limits of what this technology can achieve. These include modifications to the pegRNA structure [72–75], such as the addition of specific motifs at the end of the pegRNA to prevent degradation [76]. Inhibition of mismatch repair systems, integration of multiple synonymous edits into the region of interest, and incorporation of a secondary, nicking sgRNA have also increased editing efficiency [72,77]. Recent efforts combining phage-assisted evolution of prime editors, modification or implementation of alternative reverse transcriptase domains, incorporation of site-specific integrases (e.g. Bxb1) and recombinases (e.g. Cre, Flp), and development of new methods that employ two or more pegRNAs have increased the size of indels and other types of genomic events that can be engineered using prime editing. For instance, dual pegRNA-based systems like twinPE, PRIME-Del, PEDAR, HOPE, GRAND, Bi-PE, PETI, bi-WT-PE, PASTE, and PrimeRoot, among others, have been shown to enable engineering of significantly larger (multi-kilobase) indels and genomic rearrangements [78–88]. Approaches like twinPE, PASTE, and PrimeRoot are particularly suitable for performing increasingly large genomic manipulations because they leverage the ability of site-specific integrases like Bxb1 (twinPE, PASTE) and recombinases like Cre/Flp (PrimeRoot) to insert or otherwise manipulate large fragments of DNA at defined genetic loci [78,87,88]. These enzymes rely on specific recognition sequences (attB/attP for Bxb1, loxP for Cre, and Frt for Flp) that are pre-installed at endogenous loci with prime editing before they can catalyze site-specific deletion, integration, replacement, or inversion of genetic material flanked by these sequence motifs [78,87,88]. 
Even though overall prime editing efficiencies remain low compared with base editors, many studies have developed or applied machine learning algorithms that have begun to clarify the ‘rules’ of prime editing, including ideal pegRNA parameters, optimal nucleotide contexts, and other determinants of efficient editing [89–92].
Other precision genome editors
While CBEs and ABEs have proven very effective at engineering transition mutations, they are unable to model transversion mutations, which are highly relevant in the context of cancer. For example, C•G to A•T transversions are a mutational signature of tobacco smoke and are frequently present in smoking-associated cancers [93]. Prime editing allows for all types of SNVs to be modeled, but editing efficiency remains unpredictable across different loci and cellular contexts. The recent development and ongoing optimization of new base editors, including C•G to G•C base editors (CGBEs), A•T to C•G base editors (ACBEs), and T•A to G•C base editors [94–97], may help circumvent these roadblocks. Orthogonal efforts combining base editing with CRISPRi screens to identify genetic determinants of successful C•G to G•C editing resulted in the production of a diverse panel of CGBEs capable of achieving high editing efficiency (often >80%) within many cellular and genetic contexts [97]. In head-to-head comparisons with prime editors, CGBEs demonstrated higher editing efficiency at certain genomic sites. These enzymes also avoid the laborious process of empirically optimizing pegRNA design. Furthermore, given that these proteins are relatively new in the field compared with transition base editors, they are likely to be the subject of future optimization efforts to further increase their efficiency and precision.
In addition to individual CBEs and ABEs, efforts have been made to combine the two and produce dual base editors (dual BEs) capable of engineering C•G to T•A and A•T to G•C mutations simultaneously. One approach, aimed at engineering distinct oncogenic mutations using CBE or ABE-based mechanisms, fused each deaminase to a unique Cas protein with different PAM requirements. Through this design, ABE-specific guides will only produce intended edits at the ABE-PAM-specific site, while CBE-specific guides will only edit at the CBE-PAM-specific site. This system has been used in organoid models to engineer cells harboring at least two types of endogenous mutations, with the CBE producing a nonsense mutation in TP53 and the ABE modeling the oncogenic CTNNB1 S45P variant [98]. Many dual editors that link cytosine and adenine deaminases to a single Cas protein have also been developed, the most recent of which shows equal amounts of editing of either nucleotide [52,99–104]. These enzymes could be particularly useful in mutagenesis studies, as they should edit any C or A nucleotide within the appropriate targeting window.
For mutagenesis that extends beyond engineering transition mutations, the CRISPR-X enzyme could be considered [41]. CRISPR-X contains a dCas9 enzyme that recruits a hyperactive variant of the somatic hypermutation protein AID to the target site. This AID variant can introduce transition and transversion mutations within the genomic region dictated by the sgRNA, resulting in more diverse mutagenesis than a dual BE. For larger-scale mutagenesis, the recently-described helicase-associated continuous editing system is able to perform AID-induced mutagenesis over a broad (>200 nucleotides) range over time, allowing interrogation of a wide range of genetic variants within a target region [105]. Finally, the recent application of prime editors to integrate sequence-specific recombinase recognition sites into repetitive genomic elements, such as LINE1 retrotransposons, can generate hundreds to thousands of large chromosomal rearrangements within the cell [106,107]. These types of tools could be used to probe essential amino acids within a protein or to more closely model the high tumor mutational burden and complex genomic rearrangements that are commonly observed in certain cancer types. For instance, Hess et al. [108] have shown that CRISPR-X can be used to identify mutations within drug targets that lead to therapeutic resistance.
A final, separate class of editors can directly modify RNA molecules, allowing genetic manipulation of, for example, protein-coding mRNAs without permanent changes to the genome. These RNA editors make use of adenosine deaminase enzymes, such as ADAR proteins, to convert adenosine to inosine [109], which is read as guanosine by the translation machinery [110]. To target specific mRNAs, the deaminase protein can be fused to a Cas ortholog, such as dCas9 [111–113] or RNA-specific dCas13 [114,115], and coupled with a targeting guide RNA. These editors have demonstrated promising therapeutic potential in vivo to correct pathogenic nonsense and missense mutations because their dosing can be controlled and their effects are transient [116].
Applications of precision genome editing to deconstruct cancer mechanisms
High-throughput assessment of genetic variants with multiplexed screening
Precision genome editors can expand the scope of gene knockout/overexpression-based genetic studies towards probing the effects of specific endogenous SNVs and/or indels. While they can also effectively model genetic knockouts or protein truncations in a manner that circumvents genotoxic double-strand breaks [117], for example through CBE-mediated nonsense mutations or ABE-mediated splice site mutations, the ability to engineer SNVs endogenously opens up the space to study a much greater sphere of genetic variation and gene regulation. As such, precision genome editing screens have become an increasingly popular method to study variants of known and unknown significance, particularly through fitness-based screening approaches. In these experiments, large sgRNA or pegRNA libraries targeting thousands of genomic sites can be synthesized and cloned into delivery vectors (usually viral) for parallel screening of these genetic variants (Figure 2A). The library is then introduced via lentivirus into editor-expressing cells at a low multiplicity of infection to ensure that most cells receive only one sgRNA/pegRNA (acquiring only one mutation, in theory). After a selection step to ensure that only transduced cells survive, the cells are cultured for a defined period of time, after which their genomic DNA is harvested and the number of sgRNAs/pegRNAs represented is quantified through next generation sequencing (NGS). Guides that enrich within the population over time suggest that those engineered mutations increase cellular fitness, while those that deplete may decrease fitness [118]. Despite being an effective method of analysis, there is potential for noise due to intragenic and/or extragenic off-target guide activity, low editing efficiency, bystander edits within the base editing targeting window, and variability in editing zygosity [118,119]. 
Many resources have been developed to circumvent these sources of bias, including guide RNA design tools to maximize editing efficiency and minimize off-target editing [8,89,120–126], recently-evolved genome editors with narrower editing windows [62,65], and haploid cell lines that eliminate the variable of zygosity [127,128].
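Two quantitative steps in the pooled screening workflow described above lend themselves to a brief Python sketch: choosing a multiplicity of infection (MOI) so that most transduced cells carry a single guide, and scoring guide enrichment or depletion from sequencing counts. The sketch assumes that integrations per cell follow a Poisson distribution and that counts are normalized to reads per million with a pseudocount; both are common simplifications rather than details stated in the text, and the function names are hypothetical.

```python
import math


def fraction_single_integration(moi: float) -> float:
    """Among transduced cells, the fraction carrying exactly one guide.

    Assumption: integrations per cell are Poisson-distributed with mean `moi`.
    """
    p_exactly_one = moi * math.exp(-moi)
    p_at_least_one = 1.0 - math.exp(-moi)
    return p_exactly_one / p_at_least_one


def log2_fold_changes(counts_start, counts_end, pseudocount=1.0):
    """Per-guide log2 fold change between two time points.

    Counts are scaled to reads per million within each sample so that
    differences in sequencing depth do not masquerade as fitness effects.
    """
    total_start = sum(counts_start.values())
    total_end = sum(counts_end.values())
    scores = {}
    for guide, start in counts_start.items():
        end = counts_end.get(guide, 0)
        rpm_start = (start + pseudocount) / total_start * 1e6
        rpm_end = (end + pseudocount) / total_end * 1e6
        scores[guide] = math.log2(rpm_end / rpm_start)
    return scores


if __name__ == "__main__":
    print(round(fraction_single_integration(0.3), 3))
    # Made-up counts for four hypothetical guides.
    start = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
    end = {"g1": 400, "g2": 100, "g3": 25, "g4": 275}
    print(log2_fold_changes(start, end))
```

Positive scores correspond to guides that enrich over time and negative scores to guides that deplete. As noted above, guide counts are only a proxy for the edit itself, so scores of this kind inherit the noise from variable editing efficiency, bystander edits, and zygosity.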
Figure 2. Applications of precision genome editing to interrogate cancer-associated phenotypes.
(A) In cell-based systems and animal models, single or multiplexed guides can be delivered to editor-expressing cells to generate cells harboring endogenous genetic variants. These variant-harboring cells can then be tested for various phenotypes. (B) Schematic of precision genome editing screens. After cells are edited to harbor the desired set of endogenous variants, selective pressures or perturbations, including drugs, co-culture assays, nutrient availability, and in vivo tumor microenvironments, can be applied to this cell population. Subsequently, cell phenotypes can be read out via e.g. next generation sequencing (NGS) to count the relative abundance of variant-harboring cells, different ‘omics’ approaches, or optical screening. Analysis of these phenotypic readouts can be used to gain biological insight into variant functions.
On top of proliferative fitness, more nuanced selective pressures and perturbations like drug treatment, co-culture with immune cells, in vivo microenvironmental pressure, and altered metabolic environments can be easily implemented into screening pipelines to understand how genetic variants influence cell survival in these different contexts (Figure 2A,B). While precision genome editors are in theory equally effective for studying one or a few loci (Figure 2A), the ability to perform high-throughput precision genome editing screens has rapidly accelerated the speed and scale of interrogation and characterization of genetic variants in cancer.
Base editors
Many initial studies using multiplexed base editing to study cancer mechanisms have consisted of tiling mutagenesis screens of a single protein. For example, multiple CBE and/or ABE screens have surveyed the proliferative fitness of annotated variants within BRCA1 and BRCA2, identifying hundreds of variants that had previously been labeled as ‘likely benign’ or of ‘unknown significance’ that displayed a loss-of-function-mediated reduced fitness phenotype when expressed within their native genetic context [129–131]. Similar mutagenesis approaches have been used for other cancer-associated genes, such as PARP1, MCL1, BCL2L1, and EGFR [130,132]. Other studies have used multiplexed base editing screens to simultaneously assay variants across multiple proteins. For instance, to study the fitness of cancer patient-derived mutations, a library of sgRNAs designed to engineer variants observed within the MSK-IMPACT cancer patient clinical sequencing cohort was screened within CBE-expressing pancreatic cells, resulting in the discovery and validation of multiple pathogenic TP53 mutations in vitro and in vivo that had been formerly uncharacterized [126]. These types of fitness-based screens provide a simple and high-throughput way of identifying variants that promote cellular proliferation, a key hallmark of cancer.
Beyond proliferative fitness, base editor screening has also been used to identify mediators of drug resistance and sensitivity to targeted and systemic therapies (Figure 2A). For example, a tiling mutagenesis screen designed to engineer different types of variants within the BRCA1 and BRCA2 genes was used to identify mechanisms of resistance against PARP inhibitors (talazoparib, olaparib) and general chemotherapies (cisplatin) [130,131,133]. Additional studies have examined the potential role of mutations within the MAPK pathway, EGFR, and BCL2, to drive resistance or sensitivity to BRAF inhibitors, tyrosine kinase inhibitors, and BCL2 inhibitors, respectively [131,132,134]. A broader study probing variants within DNA-damage response genes also identified many genetic alterations that promoted resistance to treatment with DNA damaging agents like cisplatin and doxorubicin [135].
In addition to determining which cancer-associated variants promote resistance to therapy, base editors can also be used to assay the role of specific amino acid residues in cellular fitness and drug resistance. One recent study used ABEs to perturb hundreds of thousands of lysine residues within protein coding regions, identifying many residues that were essential for cellular fitness [136]. A separate screening strategy perturbing nucleophilic cysteine residues within thousands of cancer-associated proteins was used to create an atlas of cysteine ‘functionality,’ annotating residues critical for cancer cell proliferation and putative strong targets for small molecule inhibitors [137]. Much like CRISPR-Cas9 based approaches, base editing will serve as a critical tool for determining how specific patient-derived mutations affect response to targeted and systemic therapies.
Base editors can also be used to probe how genetic variants affect the transcriptome and proteome (Figure 2A,B). For instance, cancer-associated mutations frequently affect genes involved in chromatin regulation and DNA modification; thus, a DNMT3A tiling base editing screen coupled to a measurable DNA methylation reporter identified many variants that increased or decreased DNA methylation [138]. A separate base editing screen in hematopoietic stem and progenitor cells used sc-RNAseq to characterize mechanisms of hematopoietic differentiation and identify genetic modulators of fetal hemoglobin expression [69]. A third study combined installation of cancer-associated TP53 variants into multiple cancer cell lines with sc-RNAseq to reveal the distinct gene expression programs induced by each variant [139]. Finally, the transcriptional effects of many non-coding GWAS loci in blood cell traits were recently identified with base editing [140]. This type of approach can be generalized to study the dynamic transcriptional effects of any variant of interest. Beyond transcription, work in yeast utilized a GFP reporter system to identify how genetic perturbations induced by base editing modulated protein abundance [141]. Thus, large-scale base editing screens coupled to various types of ‘omics’ approaches have tremendous potential to shed light on cellular and transcriptional changes induced by cancer-associated variants.
Beyond cell-autonomous phenotypes, base editing approaches can also be used to dissect cell extrinsic cancer mechanisms. A deep mutagenesis screen of JAK1 in human colorectal cancer cells identified novel loss and gain-of-function variants within the interferon-gamma signaling pathway. Importantly, many loss-of-function variants promoted resistance to cytotoxic T cell killing in in vitro co-culture assays, while a gain-of-function variant sensitized cells to killing, demonstrating the utility of this approach to identify mediators of tumor-T cell interactions [142]. Base editing mutagenesis was also recently employed within primary T cells to define functional domains of hundreds of genes involved in T cell function [143]. This screen identified many genetic variants that positively and negatively regulated cytokine production, T cell activation, and cytotoxicity. These exciting studies and approaches are paving the way for future work investigating how genetic variation impacts a tumor cell's interactions with neighboring cells.
While many base editor studies are initially conducted in vitro, these enzymes are also suitable for in vivo study of cancer mechanisms. In fact, base editing technologies allow for efficient and precise somatic genome editing in many tissues, thereby mimicking spontaneous disease development and reducing the labor and time required to produce traditional genetically-engineered autochthonous mouse models [56,144]. In a study of Ar and Hoxd13 genes, microinjection of mouse zygotes with ABE mRNA and synthetic sgRNAs produced editing efficiencies of up to 100% [145]. Base editor mRNA has also demonstrated effective editing in vivo through delivery via lipid nanoparticles [146]. Multiple viral based delivery methods, including lentivirus, lentivirus-derived nanoparticles, adenovirus, adeno-associated viral vectors (AAVs), and engineered viral-like particles (eVLPs) have led to successful genome editing in specific organs in vivo, including the liver, heart, and retina, among others [147–153]. To circumvent delivery of the (large) editor and minimize potential long-term toxicities, Lukas Dow and colleagues recently developed a transgenic doxycycline-inducible CBE mouse strain (called iBE) that shows efficient and precise base editing at one or more endogenous loci upon delivery of sgRNAs to different organs of adult mice [144]. Beyond in vivo editing, base editors are also commonly introduced into cells ex vivo prior to transplant. Delivery of the editor ex vivo can be performed via lentivirus (though constitutive expression of the Cas protein is known to be immunogenic [154]), electroporation of editor mRNA or a ribonucleoprotein particle (RNP), or transduction with an eVLP [126,149,155]. All of these methods should allow efficient and combinatorial modeling of cancer-associated mutations in biologically relevant in vivo settings.
Prime editors
Although prime editing is still a relatively new technology, the field has already begun to use it in high-throughput formats to interrogate cancer mechanisms through targeted gene/protein functional studies and detailed phenotypic analysis of genetic variants. It is worth noting that prime editing offers many theoretical advantages over base editing for high-throughput functional interrogation of genetic variants. First, the types of mutations that can be engineered with base editing are limited to nucleotide transitions and some transversions. While a significant fraction of disease-associated mutations are transitions and transversions, many variants are compound mutants (i.e. affecting ≥2 nucleotides that are not always adjacent, and including both transitions and transversions) or indels, which are not amenable to base editing. Second, base editors can sometimes install undesired ‘bystander’ mutations next to the intended SNV site at sequences in which additional cytosines or adenines flank the target nucleotide, leading to combinations of missense and/or nonsense mutations. Third, base editing guide RNA designs are limited to the targeted protospacer sequence and to the appropriate positioning of target nucleotides within an optimal editing window that varies among different CBEs and ABEs. Lastly, it can be challenging to design on-target control base editing guide RNAs to engineer silent mutations in protein coding genes, which can be quite useful for gene and variant functional studies.
The first variant mutagenesis study using prime editing identified deleterious mutations within the NPC1 gene, a key driver of Niemann–Pick disease type C, as well as the BRCA2 tumor suppressor gene [156]. More recently, a multiplexed saturation mutagenesis screen of a MYC enhancer coupled to quantification of relative pegRNA enrichment/depletion revealed enhancer nucleotides that are essential for cellular fitness [157]. Given that the rules that govern the optimal design of pegRNAs to maximize prime editing efficiency and precision remain an area of active investigation, our group recently developed a scalable prime editing ‘sensor’ assay that couples individual pegRNAs to their cognate target sites, which are designed to closely recapitulate the native sequence and genomic context of genes and sequences we intend to target [8,123]. This allows us to simultaneously deploy and quantify prime editing across thousands of sensor sites and endogenous genes by amplifying and sequencing the sensor target site from a population of cells that have been transduced with individual prime editing sensors. Scaling up this assay allowed us to perform high-throughput prime editing mutagenesis to screen more than a thousand patient-derived mutations in the TP53 tumor suppressor gene, identifying several bona fide pathogenic variants that were previously missed by exogenous cDNA overexpression approaches [8]. Moreover, analysis of sensor editing efficiency and pegRNA counts within the screen found that the correlation between the two markedly increased with as little as 1% correct editing within the sensor [8]. This correlation demonstrates the utility of a sensor-based screen, given that guide RNA quantities (counts) are the primary readout for cellular fitness in pooled screens. The reduction in noise provided by the sensor readout greatly increases the likelihood of identifying and validating genetic variants with real biological effects.
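To make the logic of a sensor-informed readout concrete, the following is a minimal sketch rather than the published analysis pipeline. It assumes that each pegRNA has a read count at the start and end of the screen plus a sensor-derived fraction of correctly edited reads, and it scores only pegRNAs whose sensor editing reaches a chosen threshold (1% here, echoing the figure quoted above). The record fields, function names, and pseudocount are hypothetical.

```python
import math


def log2_fold_change(count_start, count_end, total_start, total_end, pseudocount=1.0):
    """Library-size-normalized log2 fold change of one pegRNA."""
    start = (count_start + pseudocount) / total_start
    end = (count_end + pseudocount) / total_end
    return math.log2(end / start)


def score_pegrnas(records, min_sensor_editing=0.01):
    """Score pegRNAs whose sensor shows at least `min_sensor_editing`
    correct editing; pegRNAs below the threshold are left unscored."""
    # Totals use every pegRNA so that normalization reflects the full library.
    total_start = sum(r["count_start"] for r in records)
    total_end = sum(r["count_end"] for r in records)
    scores = {}
    for r in records:
        if r["sensor_editing"] < min_sensor_editing:
            continue  # no evidence of editing, so counts are uninformative
        scores[r["pegrna"]] = log2_fold_change(
            r["count_start"], r["count_end"], total_start, total_end
        )
    return scores


# Hypothetical toy data, not taken from any screen.
records = [
    {"pegrna": "variant_A", "count_start": 99, "count_end": 199, "sensor_editing": 0.30},
    {"pegrna": "variant_B", "count_start": 99, "count_end": 49, "sensor_editing": 0.05},
    {"pegrna": "variant_C", "count_start": 99, "count_end": 49, "sensor_editing": 0.0},
]
scores = score_pegrnas(records)
```

In this toy example, variant_C is depleted to the same extent as variant_B but is excluded because its sensor shows no editing, which is the kind of noise a sensor readout is meant to remove.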
Broader screens have surveyed the fitness of ClinVar and GWAS-identified breast cancer variants, shedding light on the pathogenicity (or lack thereof) of variants of unknown significance [156,157]. As with base editing, prime editing has also been used in drug resistance screens targeting proteins involved in EGFR signaling [158]. Finally, prime editing screens coupled with RNA in situ hybridization and flow cytometry have recently demonstrated the feasibility of using this approach to understand how genetic variants in regulatory elements, such as promoters and enhancers, affect gene expression [159].
Although the field is still in its early days, many studies have already demonstrated the feasibility of in vivo prime editing. Initial efforts successfully utilized dual AAVs expressing split PEs, which can produce editing in many tissues, including the brain, liver, and heart, with no off-target editing detected [160–163]. Additionally, ex vivo editing of cells with plasmid, lentiviral, mRNA, or RNP-based prime editors followed by in vivo transplantation has demonstrated moderate editing efficiencies [164–167]. Finally, Tyler Jacks and colleagues recently developed the first inducible prime editor mouse model and showed that it can be used for autochthonous cancer modeling upon delivery of variant-specific pegRNAs and Cre to target tissue sites [168]. This model was initially applied to test the tumor-forming potential of different Kras alleles in the lung, including G12A, G12D, and G12R, revealing variable tumorigenicity among the three. As the field of prime editing continues to grow at a rapid pace, new, robust in vivo models and technologies are also likely to continue to evolve.
Looking forward
Precision genome editing is rapidly revolutionizing the study of cancer mechanisms. As editor proteins and delivery methods continue to evolve, it will become increasingly possible to engineer desired mutations into any cell-based system in vitro or tissue type in vivo, reducing reliance on the laborious process of generating genetically engineered mouse models. In conjunction with the optimization of sequencing and single-cell based methods, it will become possible to tackle previously unanswerable questions in the field of cancer biology. For example, precision genome editors could be used to model multiple cooperating genetic events, and this could be coupled with CRISPR-Cas9 barcoding and sc-RNAseq [169] to understand how combinations of oncogenic mutations affect transcriptomic regulation and tumor evolution. Moreover, single-cell spatial transcriptomics will allow us to understand how different cells are functioning within a heterogeneous tumor, and optical screening could be used to study genotype-induced morphological changes [170,171]. Improvements to screen analysis, such as the integration of long-read sequencing to assess allele enrichment or depletion within complex growing tumors instead of sgRNA quantification, will allow thousands of variants to be simultaneously probed for a given phenotype of interest with a much-improved signal-to-noise ratio. These screens can be conducted in different cellular contexts, mutational landscapes, and genetic backgrounds, allowing one to assess how variants affect cancer hallmarks in both general and context-specific manners (Figure 3). All in all, the ability to model and understand how genetic alterations behave is key for developing targeted therapies, and the study of cancer mechanisms with precision genome editing brings the lofty goal of ‘personalized medicine’ one step closer to reality.
Figure 3. Dissecting the hallmarks of cancer with precision genome editing.
Precision genome editing has been used to interrogate a subset of the hallmarks of cancer (labeled as ‘tested’ in white). References for papers that have evaluated these hallmarks are listed with reference numbers. Developments in the field will empower future studies to dissect all of the hallmarks of cancer. Adapted from Hanahan and Weinberg [3].
Perspectives
Modeling and studying cancer-associated mutations in physiologically relevant contexts is needed to accurately understand and characterize their biological effects. Precision genome editing tools, including base and prime editors, can be used to engineer endogenous mutations on a one-by-one or high-throughput basis, allowing parallel studies of thousands of genetic variants of interest in their native genetic environment. These tools have greatly expanded the scope, precision, and scale of cancer-associated mutations that can be investigated.
Precision genome editors have been used for disease modeling and genetic screens to assess variant-specific effects on cellular fitness, drug response and resistance, interactions with the immune system, gene expression regulation, and overall protein abundance. Some screens have involved tiling mutagenesis of single or multiple genes to study the effects of perturbing specific amino acids or protein domains, while others have targeted specific variants across a series of genes and non-coding DNA regions. These tools have been implemented effectively in many in vitro and in vivo settings and are continually improving in editing efficiency, precision, and scalability. Ultimately, these studies are revealing how certain genetic variants contribute to various cancer ‘hallmarks’ and other types of relevant phenotypes, and may help with therapeutic development down the line.
As precision genome editors, sequencing, and single-cell based genomics methods continue to evolve and become a mainstay in the field, it will become possible to tackle previously unanswerable questions in the field of cancer biology. For instance, precision genome editors could be used to model multiple cooperating genetic events to understand how specific combinations of oncogenic or otherwise cancer-associated mutations differentially affect disease initiation and tumor evolution. Single-cell spatial genomics, transcriptomics, and other in situ ‘omics’ methods (including epigenomics and proteomics) will allow us to mechanistically understand how different cells with diverse types of mutations are contributing to an ever-evolving heterogeneous tumor ecosystem. We envision the field being able to conduct these experiments within an increasingly diverse spectrum of cellular and in vivo settings to elucidate both general and context-specific mechanisms, bringing the promise of individualized precision medicine much closer to reality.
Competing Interests
The authors declare that there are no competing interests associated with the manuscript.
Funding
Work in the Sánchez-Rivera laboratory is supported by the Howard Hughes Medical Institute (HHMI) (Hanna Gray Fellowship), the V Foundation for Cancer Research [V2022-028], NCI Cancer Center Support Grant P30-CA1405, the Ludwig Center at MIT [2036636], Koch Institute Frontier Awards [2036648 and 2036642], the MIT Research Support Committee [3189800], and the Upstage Lung Cancer Foundation. S.I.G. and G.A.J. are supported by T32GM136540 from the NIH/NIGMS. S.I.G. is also supported by the MIT School of Science Fellowship in Cancer Research. G.A.J. is also supported by a Margaret A. Cunningham Immune Mechanisms in Cancer Research Fellowship Award.
Abbreviations
- AAV
adeno-associated viral vector
- ABE
adenine base editor
- CBE
cytosine base editor
- CRISPR
clustered regularly interspaced short palindromic repeats
- DSB
double-strand break
- eVLP
engineered viral-like particle
- HDR
homology directed repair
- NGS
next generation sequencing
- NHEJ
non-homologous end joining
- PAM
protospacer adjacent motif
- RNP
ribonucleoprotein particle
- SNV
single nucleotide variant
- pegRNA
prime editing guide RNA
- sc-RNAseq
single-cell RNA sequencing
- sgRNA
single guide RNA
- ZFN
zinc finger nuclease
References
Author notes
These authors contributed equally to this work.