Many eukaryotic organisms, from ciliates to mammals, employ programmed DNA elimination during their postmeiotic reproduction. The process removes specific regions from the somatic DNA and has broad functions, including the irreversible silencing of genes, sex determination, and genome protection from transposable elements or integrating viruses. Multiple mechanisms have evolved that explain the sequence selectivity of the process. In some cases, the eliminated sequences lack centromeres and are flanked by conserved sequence motifs that are specifically recognized and cleaved by designated nucleases. Upon cleavage, all DNA fragments that lack centromeres are lost during the following mitosis. Alternatively, specific sequences can be destined for elimination by complementary small RNAs (sRNAs) as in some ciliates. These sRNAs enable a PIWI-mediated recruitment of chromatin remodelers, followed up by the precise positioning of a cleavage complex formed from a transposase like PiggyBac or Tc1. Here, we review the known molecular interplay of the cellular machinery that is involved in precise sRNA-guided DNA excision, and additionally, we highlight prominent knowledge gaps. We focus on the modes through which sRNAs enable the precise localization of the cleavage complex, and how the nuclease activity is controlled to prevent off-target cleavage. A mechanistic understanding of this process could enable the development of novel eukaryotic genome editing tools.
Programmed DNA elimination
Maintaining genetic information accurately is essential for proper cellular functions. Perturbations impair fitness through multimodal effects, including cancer evolution [1,2], and cause resistance to cytotoxic immune responses [3] and anti-cancer drugs [3] as well as the development of metastasis [4]. Chromosomal errors during embryonic development, such as DNA elimination due to stress and mitotic errors [5] or cytoplasmic DNA shedding [6] leading to aneuploidy, are a leading cause of pregnancy loss [6] or genetic disorders in the offspring [7]. Erroneous DNA elimination is thus a natural phenomenon related to genomic instability and causes a decrease in organismal fitness, since the eliminated sequences are random.
Surprisingly, another form of elimination exists: the programmed DNA elimination (PDE) of defined genomic regions. Notably, this process occurs reproducibly in each generation as part of meiotic or postmeiotic embryonic development in a normal developmental cycle. PDE events are present across diverse clades along the tree of life including various metazoans and vertebrates [8] (Figure 1). While many molecular details of the process remain elusive, it is clear that the known PDE events involve different machineries. This suggests that PDE evolved multiple times independently [37] and that this convergent evolution [37] must confer evolutionary advantages. These include the possibility to irreversibly silence unnecessary genes, to protect the genome from invasive sequences, or to enable sex determination (Figure 1) from the same genomic material. Additionally, PDE can enable a similar degree of innovation as alternative RNA splicing, which can generate novel protein products [38]. In this review, we aim to provide an overview of the known biological roles of programmed elimination of DNA in the various organisms, followed by the molecular mechanisms that enable the discrimination of retained and eliminated sequences. We include the present state-of-the-art hypotheses and future directions for experimental validation. Specific focus is placed on the involvement of small RNAs (sRNAs) and the potential of such sRNA-guided molecular machinery for the development of novel tools for genome editing.
Diverse groups of eukaryotic organisms use programmed DNA elimination (and chromatin diminution) in their postmeiotic development.
This includes the following genera: Ciliates: Paramecium [9], Tetrahymena [10], Oxytricha [10], Stylonychia [11], Euplotes [11]; Worms: Strongyloides [12], Oscheius [13], Ascaris [14], Parascaris [15]; Insects: Phragmatobia [8] (moths), Liposcelis [12] (booklice), Bacillus [16] (stick insects), Bradysia [17] and Sciara [8] (fungus gnats), Mayetiola [18] (hessian fly), Nasonia [19] (jewel wasp); Copepods: Cyclops [20], Mesocyclops [21]; Arachnids: Metaseiulus [22]; Jawless fish: Petromyzon [23] (lamprey), Myxine [24] (hagfish); Fish: Hydrolagus [25], Hypseleotris [26]; Amphibians: Pelophylax [27], Bufotes [28]; Songbirds: Taeniopygia [29] (zebra finch), Lonchura [30]; Mammals: Perameles [31] and Isoodon [32] (bandicoots), Acomys [33] (spiny mouse); Plants: Aegilops [34] (goat grass), Brachiaria [35], Hordeum [36].
Functions of PDE
In eukaryotes, the roles of PDE are diverse [39] and can be divided into three general groups: (i) sex determination and mating type or species compatibility; (ii) irreversible gene silencing for cell type differentiation; and (iii) genomic defense against invading sequences (Figure 1).
Sex determination based on chromosome segregation and elimination during embryogenesis is found in the booklice Liposcelis, with heterochromatin formation on the paternal chromosome followed by elimination [40]. PDE also enables sex determination in the roundworms from the genus Strongyloides [12]. In some plants, DNA elimination allows for species separation and determination of hybrid compatibility based on the loss or retention of paternal chromosomes [35,36]. Similar elimination of paternal genomes was observed in wasps [19] and arachnids [22]. Additionally, the whole paternal genome can be removed during gametogenesis in hybridizing species, [28] including frogs [27], insects [16,17], and fishes [25,26]. In mammals, various species also use PDE for the inactivation of sex chromosomes [41]. Marsupials eliminate the Y chromosome from specific cell lineages [32]. Additionally, the paternal X chromosome is eliminated in some female bandicoots, in a form of X-chromosomal dosage compensation [31]. An X0/XY sex chromosomal mosaicism is present in some rodents, like the spiny mouse [33], in which the male sexual chromosomes are eliminated from the somatic cells.
Genome streamlining is another prominent role of PDE. In jawless vertebrates like lampreys [23] and hagfish [24], the process targets both repetitive elements and also developmentally specific genes. The elimination enables an irreversible control mechanism of the embryogenesis transcriptional program and streamlines the genome content of the somatic cells. Differential chromosomal segregation during meiosis is a prominent form of PDE in thousands of songbird species that karyotypically differentiate their germline and somatic cells [29,30]. Notably, the indispensability of the songbird germline-restricted chromosome is likely due to a single highly conserved gene that is encoding for the RNA-binding protein CPEB1 involved in oocyte maturation [42]. Similarly, in the parasitic worm Ascaris, PDE has regulatory roles such as the permanent silencing of genes that are essential for gametogenesis and embryogenesis in somatic cells [14,15]. Recently, PDE was also discovered in Oscheius tipulae, a free-living member of the Rhabditidae nematodes to which also the common model organism Caenorhabditis elegans belongs, but the functional significance of the process for the organism remains to be determined [13]. Another form of genome streamlining called somatic mosaicism is formed by a selective elimination of DNA sequences in parts of an organism, where these genes are no longer necessary. This form of PDE has been detected in the roots of goat grass Aegilops speltoides, in which the B chromosome is eliminated [34]. Another prominent example for genome streamlining exists in unicellular organisms with a nuclear dimorphism that enables two distinct types of nuclei, a germline and a somatic nucleus, to coexist in a single cell [9]. The somatic genome in many ciliates is highly polyploid (e.g., in Paramecium, n = 1600) [11], thus streamlining its contents can significantly reduce the energetic costs for its replication. 
Such gains in DNA replication efficiency have previously been observed in other aquatic organisms [43].
In addition to genome streamlining by removing unnecessary content, the ability to selectively eliminate DNA can provide the means for a defense mechanism against the propagation of invasive sequences like transposons. These elements are particularly active during sexual reproduction in ciliates [44] and oogenesis in copepods [45]. Numerous eliminations of invasive DNA sequences are observed in ciliates, with some species removing as much as 97% of their germline genome [11]. In copepods, more than 80% of the germline genome consists of repetitive elements that are efficiently eliminated during embryogenesis [20,21]. The particular bias for the elimination of evolutionary younger and highly active transposon sequences suggests that PDE serves as a form of genome protection [21].
Cellular mechanisms for discrimination of eliminated and retained DNA
To initiate any form of PDE, the cells must correctly recognize the DNA regions destined for elimination and retention. Due to the ancient origins of the process, and a convergent evolution, various mechanisms exist in different species. This is also due to the different lengths of the eliminated sequences ranging from small fragments to whole chromosomes.
A simple mechanism for the selective elimination of whole chromosomes is present in plant hybrids and involves the differential affinity of centromeric proteins to the parental centromeric repeats. The lack of centromeres on a chromosome leads to its elimination during subsequent mitosis [36] (Figure 2A). Similarly, the parasitic worms from the genus Ascaris eliminate large regions on their polycentric chromosomes that lack centromeric repeats [46]. These regions are flanked by conserved cleavage sites and, upon excision, are excluded into micronuclei during mitosis [46,47] (Figure 2B). To retain these regions of the polycentric chromosomes in the germline, it is likely that the transcription of a specialized centromeric RNA is needed for RNA-mediated scaffolding that initiates the formation of transient germline-specific centromeres, similar to the ones already described in C. elegans [52].
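The centromere-dependent rule described above can be illustrated with a minimal toy model. The sketch below is only an illustration under simplifying assumptions: a chromosome is reduced to an ordered list of labeled regions, each flagged for the presence of centromeric repeats, and the region labels, the cut positions, and the function names `cleave` and `segregate` are hypothetical and not taken from any of the cited studies.

```python
from typing import List, Set, Tuple

# A region is (label, carries_centromeric_repeats). Labels are hypothetical.
Region = Tuple[str, bool]


def cleave(chromosome: List[Region], cut_after: Set[int]) -> List[List[Region]]:
    """Split the chromosome into fragments after the given region indices."""
    fragments: List[List[Region]] = []
    current: List[Region] = []
    for index, region in enumerate(chromosome):
        current.append(region)
        if index in cut_after:
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


def segregate(fragments: List[List[Region]]):
    """Toy rule: a fragment is retained in mitosis only if it has a centromere."""
    retained = [f for f in fragments if any(has_cen for _, has_cen in f)]
    lost = [f for f in fragments if not any(has_cen for _, has_cen in f)]
    return retained, lost


# Assumed layout: two end regions without centromeric repeats flank a core.
chromosome = [
    ("left_end", False),
    ("core_1", True),
    ("core_2", True),
    ("right_end", False),
]
fragments = cleave(chromosome, cut_after={0, 2})
retained, lost = segregate(fragments)

print("retained:", [[label for label, _ in f] for f in retained])
print("lost:", [[label for label, _ in f] for f in lost])
```

In this simplified picture, sequence selectivity comes entirely from where the cuts fall relative to the centromeric repeats; the model ignores micronucleus formation and any healing of the new chromosome ends.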
Differentiating DNA regions for elimination or retention.
(A) Differential binding of centromeric proteins leads to retention of bound and exclusion of unbound chromosomes [36]. (B) DNA regions lacking centromeric repeats are lost during mitosis [46,47]. (C) In different ciliated organisms, the sRNAs could determine either the eliminated or the maintained sequences [9,11]. (D) sRNAs guide a PRC2-like complex for the deposition of H3K9me3 and H3K27me3 marks on the histones of eliminated transposable elements in ciliates [48,49]. (E) An sRNA-guided methylation complex could potentially enable targeted sequence retention in ciliates [50,51]. sRNAs, small RNAs.
The precise elimination of short sequences requires more sophisticated demarcation mechanisms. Such PDE events are present in ciliates, and the eliminated sequences are often shorter than 100 bp [53] and lack long conserved sequence motifs [53,54]. Specific DNA recognition sequences typically allow a sequence-specific binding by proteins such as restriction nucleases or transposases [55]. Interestingly, the lack of conserved motifs at the ends of eliminated sequences in ciliates stems from an sRNA-mediated specific recognition [56]. This form of delineation suggests base pairing interactions of the sRNAs similar to other sRNA-guided systems such as CRISPR and the recently described ragath-18-derived sRNAs that guide precise DNA targeting of IS607-encoded nucleases [57]. Notably, both positive (protection of sequences) and negative (marking for elimination) mechanisms exist in different ciliate species [11]. In Paramecium and Tetrahymena, sRNAs are complementary to the eliminated regions [9,10], whereas in Oxytricha and Stylonychia, the sRNAs mark the sequences that need to be retained [10,11] (Figure 2C). While sRNA-to-DNA base pairing has not yet been detected in ciliates, the PDE-specific sRNAs resemble piRNAs in their biogenesis and are loaded into PIWI-like proteins [58]. In Paramecium, the sRNA–PIWI complexes recognize TFIIS4-driven noncoding transcripts, which are produced in the new somatic nucleus before DNA elimination and thus contain eliminated sequences [59]. Interestingly, the proper execution of meiotic recombination is essential for the subsequent DNA elimination step, suggesting that a DNA development cycle with multiple checkpoints exists [60]. In line with this, at least two meiosis-specific factors, Spt5m and Spt4m, have been identified as regulators of transcription of the long noncoding RNAs, from which the initial guiding sRNAs are produced [61,62].
Similar to transposon silencing in other eukaryotes, the ciliate sRNA–protein complex binds to the complementary transcripts and then recruits effector proteins leading to heterochromatinization. Specifically, it has been shown that sRNA-loaded PIWI proteins can recruit a PRC2-like complex to establish H3K9me3 and H3K27me3 modifications at the sites of DNA destined for elimination [48,49] (Figure 2D). In the diatom Phaeodactylum tricornutum, sRNAs of similar length (26–31 nt) were recently found to be essential for the establishment of the H3K9me3 and H3K27me3 repressive marks on transposable elements [63]. The similarity of the sRNA-guided heterochromatinization system of this diatom to the ciliate one suggests that it was probably present in the last common ancestor of the SAR clade and later repurposed or expanded in ciliates for genomic eliminations of the repressed regions.
Interestingly, heterochromatinization is also related to paternal genome elimination in some insects, since the eliminated genetic material accumulates heterochromatin-associated proteins before its elimination during meiosis [64,65]. The links between PDE and heterochromatin formation suggest that PDE probably evolved as a process for irreversible gene silencing. This is further corroborated by the discovery that many spermatogenesis genes in C. elegans subject to piRNA-mediated silencing [66] are permanently eliminated from the somatic cells of other worm species [13,67]. Similarly, the mammalian homologs of some eliminated genes in lampreys are silenced by PRC2-mediated heterochromatinization during embryonic development [68,69].
It must be noted that in ciliates, many of the eliminated fragments are actually smaller than the histone footprint [70]. The placement of nucleosomes and histone tail modifications is therefore insufficient to explain the base-pair precision of the cleavage. This suggests that the sRNA–PIWI complexes have pleiotropic effects. On the one hand, they lead to heterochromatin formation on specific longer sequences, akin to the canonical piRNA functions [71]. On the other hand, in the subset of eliminated sequences that require sRNA guidance, the sRNAs could also directly mark DNA for excision. Several reports suggest that in addition to histone modifications, direct nucleotide modifications are differentially deposited [50,51,72]. The role of such nucleotide modifications is presently unknown, but they could potentially regulate the nuclease activity (Figure 2E), similar to the toxin–antitoxin nuclease systems in bacteria. Proteins from these families form extensive defense systems against foreign nucleic acids, and the nuclease activity is controlled by RNA fragments and nucleotide modifications [73].
In lampreys, hypermethylation of cytosines is observed specifically on the eliminated sequences [72]. Similarly, DNA methylation presents a mechanism for suppression of transposable elements in plants. The target sequences are marked by complementary sRNAs that lead to sequence-dependent methylation of these regions in the germline [74]. The processing of the involved sRNAs requires Tudor domain Argonaute proteins [75], similar to the sRNA processing systems involved in PDE in ciliates [11]. Since PDE in ciliates also targets transposons and transposon remnants [76], it raises the possibility that sRNA-mediated DNA methylation marks in ciliates govern the elimination process. In line with such a hypothesis, a highly active eukaryotic DNA adenine methylation complex was discovered recently in Tetrahymena [50,77]. Furthermore, in Paramecium and Oxytricha, DNA 6-adenine methylation (6mA) has functional roles and disfavors nucleosome positioning on DNA [51,78] (Figure 2E). The disruption of the methylation complex causes significant lethality of the progeny that executed PDE [51]. Defects in the DNA elimination could also be caused by mispositioning of the nucleosomes, since correct nucleosome remodeling is required for DNA excision [79]. Additionally, the 6mA-modified nucleic acids could hypothetically allow RNA–DNA cross-talk during the sRNA-guided phase of PDE, similar to a recently discovered methylation-dependent transposable element suppression mechanism in human embryonic stem cells [80]. Further research into the DNA modification landscape of ciliates is needed to clarify the roles of nucleic acid modifications for the programmed genomic excisions.
Molecular machinery involved in programmed DNA cleavage
Since the PDE process likely evolved multiple times independently, it is presently assumed that various enzymes and principles are involved in the cleavage step. In the worm Oscheius tipulae, the eliminated DNAs are flanked by a conserved sequence motif, which likely recruits the cleavage complex to the correct site [13] (Figure 3A). In ciliates, the previously mentioned sRNA-guided cleavage coexists with an sRNA-independent cleavage. The latter likely serves for the elimination of centromeric regions [88] and for chromosomal breakage [89], which splits the germline chromosomes into smaller somatic chromosomes [81]. Chromosomal breakage is followed by de novo telomerization of the generated DNA ends [82] (Figure 3B). In contrast, sRNA-guided PDE usually leads to rejoining of the free DNA ends directly after cleavage, since this is required for the reconstitution of protein coding sequences (CDSs) interrupted by the eliminated DNA (Figure 3B). In Oxytricha, CDS reconstitution involves a comparison of the DNA fragments to long RNAs prior to DNA ligation in a process known as unscrambling [90]. This is necessary for reconstructing the correct order of the fragments, as these are dispersed across the germline genome in different orientations and order [91,92]. The complex downstream processing of the free DNA ends suggests that DNA cleavage and rejoining are coupled. In Paramecium, most excisions occur within a single genome endoreplication, from a polyploidy of 32n to 64n [44]. However, the DNA removal process seems to occur sequentially, and two distinct classes of sRNAs are involved [83]. The initial scanRNAs are transcribed from the germline, whereas the subsequent iesRNAs are produced directly from the eliminated DNA, thereby creating a positive feedback loop for excisions [83] (Figure 3C).
Molecular mechanics of different types of programmed DNA cleavage.
(A) Conserved flanking motifs can serve for recognition by sequence-specific nucleases [13]. (B) Chromosomal break sites (CBS) and sRNA-mediated DNA elimination can coexist, although the outcomes of the two processes differ, with CBS resulting in de novo telomerized DNA ends [81,82]. (C) The excised DNA fragments are circularized and serve the production of sRNA precursor transcripts as a positive feedback loop for excision [83]. (D) In Paramecium, the guiding sRNAs are produced from the germline, but sRNAs matching the old somatic nucleus are removed before the remaining sRNAs guide the excisions in the new somatic nucleus. The progress of the excision adds a feedback cross-talk to the old somatic and the germline nucleus, e.g., regulating gene expression there. (E) The PiggyBac-like nuclease (PGM) requires interactions with spt16-1 for import into the new developing somatic nucleus, where PGM in complex with PGM-like proteins and components of the NHEJ machinery conducts the genomic excisions [84-86]. (F) SMC proteins as part of a condensin complex could be involved in DNA looping that ensures that the cleaved DNA ends are in close proximity [87]. sRNA, small RNA.
The idea that PDE is a precisely executed multistep process is further supported by the fine-tuning of the expression of PDE-related genes in Paramecium, which is governed by the progression of the PDE process [84]. A nuclear cross-talk in the opposite direction is already necessary for the initial selection of the sRNAs [93] (Figure 3D), and the bi-directional cross-talk likely enables checkpoints during the process and the transition from a germline to a somatic nuclear phenotype. While the germline nucleus produces the transcripts from which the sRNAs are produced, it is considered otherwise transcriptionally inactive [94]. However, germline-limited sequences can be expressed during the development of the somatic nucleus as recently reported for an essential PiggyBac transposon-derived gene [95]. Typically, such PiggyBac transposases recognize long inverted terminal repeats (ITRs) [55], which are not present in the eliminated sequences in ciliates, as these are usually shorter than the canonical ITRs. Nonetheless, in some ciliate species such as Paramecium and Tetrahymena, a domesticated PiggyBac-like transposase executes the DNA cleavage step [96]. While a knockdown of the enzyme causes a complete retention of eliminated sequences [96], the selective nuclease activity has not been directly demonstrated for purified protein. This discrepancy probably stems from nucleolytic licensing through additional factors in the excision complex. These include scaffolding catalytically inactive PiggyBac-like proteins [97] and components linked to DNA repair (Ku70/80) [85,98] and ligation (Ligase IV) [86], whose knockdown prevented DNA cleavage (Figure 3E). Notably, the importance of the interactions for nucleolytic activity was demonstrated using a mutant Ku70 that lacks DNA affinity but can still interact with the nuclease. 
Expression of this protein re-enabled cleavage in wt Ku70 knockdown cells and also resulted in a massive increase of incorrectly rejoined chromosomes and CDS-internal telomeres [85].
The involvement of additional enzymes in the excision is likely. An RNA helicase is essential for establishing the base pairing between sRNAs and the targeted transcripts in Tetrahymena [99], an organism closely related to Paramecium, in which the excisions are also conducted by a domesticated PiggyBac transposon. An additional factor in the process is the histone chaperone spt16-1, which enables the proper localization of the PiggyBac-like transposase to the developing nucleus [100]. It would be interesting to understand if this is based on direct interactions with the cleavage complex or on the exchange of nucleosome subunits, corresponding to the established function of spt16 chaperones. The proper localization and recruitment of the excision complex could also be controlled by specific posttranslational modifications. One such example is SUMOylation, which is an essential regulatory process involved in sRNA-mediated transposon silencing and heterochromatinization [101]. This posttranslational modification is also widespread in ciliates and specifically up-regulated during PDE [101]. Furthermore, SUMOylation can be PIWI-dependent [102] and linked to piRNA-mediated heterochromatinization of transposable elements as reported in Drosophila, where Panoramix recruits heterochromatinization factors when SUMOylated [103].
Finally, a specific DNA organization might be necessary for DNA elimination and correct joining of the free DNA ends (Figure 3F). This is corroborated by the recent finding that a meiosis-specific SMC protein and the condensin complex are involved in DNA elimination in Paramecium [87]. Further investigation needs to clarify whether the complex enables developmental-specific gene expression or is required for the formation of DNA loops recognized by the cleavage machinery (Figure 3F). An example of such functions is the synaptonemal complex in C. elegans, which is formed by components similar to those of the PDE machinery in ciliates, namely Argonaute proteins, sRNAs, and a meiosis-specific SMC-1 [104]. Furthermore, the involvement of condensins in suppression of the LINE-1 retrotransposons [105] has been reported recently. Transposon silencing in higher eukaryotes parallels the excision of transposons in ciliates, further highlighting the potential function of condensins as facilitators of genomic excisions.
Outlook: the potential of the DNA elimination machinery for genome editing
In most eukaryotes, piRNA-like sRNAs orchestrate the cellular response to invading transposable elements [71,106]. In ciliates, this suppression response is driven to the extreme, resulting in the permanent elimination of transposon-derived sequences from their somatic genome. This is possible due to a nuclear dimorphism, which allows for a comparison of the contents of both nuclei, in a process known as scanning. Any newly appearing DNA can therefore be efficiently suppressed. Similar to piRNA-mediated transposon silencing in other eukaryotes, the excision process in ciliates involves two separate waves of sRNAs. The secondary sRNAs are produced directly from newly excised DNA fragments in a positive feedback loop [83] (Figure 3C). Because most of these DNA fragments are smaller than 100 bp, their transcription requires initial concatemerization and circularization to longer DNA molecules that can be transcribed [107]. This unorthodox solution exemplifies a general trend of ciliates to acquire and repurpose molecular machinery for eccentric biological properties [108,109]. Understanding such unconventional machineries is an essential step toward the ability to integrate them into engineered synthetic biology systems [110]. Engineered systems hold promise both for the development of futuristic biotechnological manufacturing methods [111] but also for establishing novel therapies for complex diseases [112].
The PiggyBac transposon suppression system in ciliates is an example of the significant evolutionary advancements achieved by transposon domestication. This process has been a prominent driver of evolution as evidenced by the adaptive immune system developed from an ancient transposase in vertebrates [113] and the placental development in mammals [9]. Additionally, the prokaryotic CRISPR-Cas system originated from RNA-guided transposon-derived proteins [114] similar to the Fanzor nucleases in eukaryotes [115-117]. Interestingly, it is hypothesized that the ability of ciliates to control transposable elements represents a double-edged sword. By limiting the detrimental consequences of transposon invasion, it also fostered the propagation of transposons in the germline [76,118]. This in turn favored an evolution of the genome of the host organism [9] facilitated by the additional raw sequence material.
Notably, the eukaryotic transposases from the PiggyBac family are among the most efficient molecular tools for genomic integration of long transgenes in human cells [119]. Vertebrate genomes encode many additional highly active transposases that could also be harnessed for genome editing [120]. However, a major obstacle for their use in a clinical setting stems from the safety concerns due to the uncontrollable localization of the insertions [121]. Consequently, RNA-guided transposases could enable efficient integration of DNA cargo into specific target sites, as exemplified by the bacterial transposases derived from omega and Tn7 CRISPR-associated transposons (CAST) (Figure 4A). Type I CASTs (evolved from Tn7 transposons) have recently allowed for an insertion of genetic cargo at locations that are defined by RNA guides into cyanobacteria [126] and human cells [122]. However, due to the complexity of the machinery, further improvements in efficiency are needed. Alternatively, type V-K CASTs (evolved from Tn5053 transposons) are more efficient but often result in cointegration of undesirable sequences such as the plasmid backbone through replicative instead of cut-and-paste transposition [127]. Using engineering approaches, a fusion protein of a type V-K CAST with a homing DNA nickase allowed for single-product insertions [123]. Dual-nickase activity was also employed previously with the aim of reducing off-target activity of the Cas9 nuclease, as it allows for using two separate guide RNAs for the cleavage at the correct target site [125] (Figure 4B). Notably, it was recently reported that negative DNA supercoiling leads to a striking increase in off-target binding by Cas enzymes [128]. A possibility to bypass the effects of DNA topology on binding specificity can be imagined through an indirect recruitment to the desired DNA location via binding to nascent transcripts using sRNA/PIWIs, like in some ciliates (Figure 4C). 
For achieving targeted insertion at this site, the integration activity of the transposase proteins from ciliates has to be restored through protein engineering. Thus, whether the sRNA-guided DNA elimination machinery including PiggyBac-like transposases meets all requirements for use for precise genome editing in eukaryotes, such as humans, remains to be further investigated. A better understanding of naturally occurring genome editing processes such as PDE across various life forms, and not only in ciliates, will likely augment the list of available genome editing tools.
Figure 4: sRNA-guided tools for high-precision targeted insertions of cargo DNA.
(A) Type I CRISPR-associated transposase (CAST) complex for RNA-guided genetic insertions [122-124]. (B) Cas9-nickase enzyme with dual-guide RNA to achieve a double-stranded break [125]. (C) A nascent transcript binding transposase complex corresponding to the ciliate genome excision complex [9,97]. sRNA, small RNA.
Perspectives
Importance of the field: Programmed DNA elimination (PDE) is a strikingly prevalent process in eukaryotes. It has diverse roles in both simple and complex organisms. Notably, its wide distribution is the product of convergent evolution. This highlights the importance of the underlying challenge that led to this development: what to do with unnecessary DNA? While in some organisms the unnecessary portions of DNA are compacted and suppressed, it seems that eliminating them altogether is also an efficient option. In the specific case of ciliates, PDE was perfected in terms of both processivity and precision, allowing the elimination of thousands of fragments with base-pair precision.
Current thinking: The precision of DNA elimination in ciliates allows it to serve as a form of immune system that protects the organisms against invasive DNA sequences such as transposons. The process involves an RNA-based comparison of the sequence contents of the germline and somatic genomes during sexual reproduction. As a first step, small RNAs (sRNAs) are produced from the germline. Those that do not find a match in the old version of the somatic genome are transferred to the new somatic nucleus while it forms from the germline sequence. The transferred sRNAs are therefore complementary only to sequences that appear in the germline but not in the old somatic nucleus. Through a complex cascade whose mechanism is still poorly understood, the sRNAs then guide the elimination of their matching DNA sequences.
Future directions: The functions of PDE in the suppression of invasive sequences could offer insights into the early evolutionary stages of eukaryotic chromatin. Additionally, advanced proteomic techniques developed in recent years will make it possible to probe the sRNA-directed DNA elimination process in ciliates in mechanistic detail. Understanding the link between the specific DNA recognition by sRNAs and the subsequent DNA elimination step holds the potential to uncover new molecular tools for the manipulation of DNA in vivo. Such tools could then be repurposed to expand the available gene editing toolbox. Safer and more efficient DNA editing systems are essential for the advancement of such methods toward therapeutic purposes and clinical implementation.
Competing Interests
The authors declare that there are no competing interests associated with the manuscript.
Funding
The preparation of the manuscript was funded by a Swiss National Science Foundation grant no. [229074] and an Initiator grant from the University of Bern awarded to B.-A.S. and a Swiss National Science Foundation grant no. [214853] awarded to M.N.
CRediT Author Contribution
Bozhidar-Adrian Stefanov: Investigation, Writing - Original Draft, Writing - Review & Editing, Funding acquisition. Mariusz Nowacki: Writing - Review & Editing, Supervision, Project administration, Funding acquisition.
Acknowledgments
The authors thank Robin Hogg and the other members of the Nowacki Lab for helpful discussions. Illustrations were generated using a BioRender license owned by the University of Bern.