The difficult case of an RNA-only origin of life

The RNA world hypothesis is probably the most extensively studied model for the emergence of life on Earth. Despite a large body of evidence supporting the idea that RNA is capable of kick-starting autocatalytic self-replication and thus initiating the emergence of life, seemingly insurmountable weaknesses in the theory have also been highlighted. These problems could be overcome by novel experimental approaches, including out-of-equilibrium environments, and the exploration of an early co-evolution of RNA and other key biomolecules such as peptides and DNA, which might be necessary to mitigate the shortcomings of RNA-only systems.

The conjecture that life on Earth evolved from an 'RNA World' remains one of the most popular hypotheses for abiogenesis, even 60 years after Alex Rich first put the idea forward [1]. For some, evidence based upon ubiquitous molecular fossils and the elegance of the idea that RNA once had a dual role as information carrier and prebiotic catalyst provide overwhelming support for the theory. Nevertheless, doubts remain surrounding the chemical evolution of an RNA world, whose classical scenario is based on a temporal sequence of nucleotide formation, enzyme-free polymerisation/replication, recombination, encapsulation in lipid vesicles (or other compartments), evolution of ribozymes and finally the innovation of the genetic code and its translation (Figure 1) [2,3]. Common criticisms are that RNA is too complex to emerge de novo in a prebiotic environment, that catalysis is a relatively rare property of RNA and requires implausibly long strands, that the catalytic repertoire of RNA is too limited and that it is difficult to envisage scenarios in which precursors and feedstocks occurred at sufficient concentrations to allow replication and evolution [4].
Breakthroughs in prebiotic chemistry demonstrating how the essential building blocks of RNA (and other biomolecules) may have formed under different primordial scenarios address many concerns about the plausibility of RNA or related nucleic acid emergence in a prebiotic world [5][6][7][8][9]. Similarly, demonstrations of enzyme-free polymerisation and copying of nucleic acids from activated building blocks [10][11][12][13] and the innate potential of random RNA strands to recombine and ligate show that the emergence of longer RNA strands capable of catalysis is, in principle, feasible [14][15][16]. The in vitro selection of ribozymes has over the years revealed the impressive catalytic repertoire of nucleic acids, despite their conformational and sequence-based limitations compared with proteins [17]. RNA is particularly adept at manipulations of its own phosphate backbone, which is precisely the chemistry needed to catalyse self-replication.
Despite these rebuttals, it has not yet been possible to demonstrate robust and continuous RNA self-replication from a realistic feedstock (i.e. activated mono- or short mixed-sequence oligonucleotides). Major obstacles for RNA copying, such as efficiency, regiospecificity and fidelity, are discussed elsewhere [18,19] and mostly apply to both non-enzymatic and enzymatic scenarios. The ever-looming strand dissociation problem is of particular concern (Figure 2). The high melting temperature (Tm) of long RNA duplexes, such as those that arise from template-directed replication, results in the formation of dead-end duplex complexes in the absence of highly evolved helicases. When complementary RNA strands are separated, for example, by heat denaturation, reannealing occurs orders of magnitude faster than known copying reactions [18].
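The kinetic competition behind the strand dissociation problem can be sketched with two back-of-the-envelope estimates: the half-time for two separated complementary strands to reanneal (simple second-order kinetics at equal strand concentrations) and the time needed to copy a template at a constant extension rate. The rate constant, strand concentration, template length and extension rate below are illustrative placeholders, not values taken from the text or from [18]; the point is only how the two timescales are compared.

```python
def reannealing_half_time(k_anneal, strand_conc):
    """Half-time in seconds for A + B -> AB with equal starting concentrations.

    For d[A]/dt = -k[A]^2 the half-time is 1 / (k * C0).
    k_anneal is in M^-1 s^-1, strand_conc in M.
    """
    if k_anneal <= 0 or strand_conc <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return 1.0 / (k_anneal * strand_conc)


def copying_time(template_length, extension_rate):
    """Time in seconds to copy a template at a constant rate (nucleotides per second)."""
    if template_length <= 0 or extension_rate <= 0:
        raise ValueError("length and extension rate must be positive")
    return template_length / extension_rate


# Hypothetical numbers, chosen only for illustration.
K_ANNEAL = 1e6            # M^-1 s^-1, assumed annealing rate constant
STRAND_CONC = 1e-6        # M, assumed concentration of each strand
TEMPLATE_LENGTH = 100     # nucleotides, assumed
EXTENSION_RATE = 1 / 3600 # nucleotides per second, assumed (one per hour)

t_anneal = reannealing_half_time(K_ANNEAL, STRAND_CONC)
t_copy = copying_time(TEMPLATE_LENGTH, EXTENSION_RATE)
print(f"reannealing half-time: {t_anneal:.3g} s")
print(f"copying time:          {t_copy:.3g} s")
print(f"copying is slower by a factor of about {t_copy / t_anneal:.3g}")
```

Under any such parameter choice in which reannealing is much faster than copying, most separated strands re-form the duplex before a primer can be extended along them, which is the dead-end behaviour described above.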
In the case of ribozymes, only 'simple' ligation or recombination-based RNA replication from defined oligonucleotides has been demonstrated [20][21][22][23]. Such systems have only a limited ability to transmit heritable information and so are not capable of open-ended evolution, i.e. the ability to indefinitely increase in complexity like living systems [24]. Open-ended evolution requires that a replicase must at least be able to efficiently copy generic sequences longer than that required to encode its own function. This topic has been reviewed in detail elsewhere [24,25].
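The link between copying fidelity and the length of sequence a replicase can usefully copy can be illustrated with a simple calculation. The sketch below assumes independent copying errors with the same per-nucleotide fidelity at every position, which is a simplification, and the numerical values are hypothetical rather than measured properties of any ribozyme discussed here.

```python
import math


def perfect_copy_probability(fidelity, length):
    """Probability that all `length` positions are copied correctly,
    assuming independent errors with per-nucleotide fidelity `fidelity`."""
    if not 0.0 < fidelity <= 1.0:
        raise ValueError("fidelity must be in (0, 1]")
    if length < 0:
        raise ValueError("length must be non-negative")
    return fidelity ** length


def max_length_for_probability(fidelity, min_probability):
    """Largest length L for which fidelity**L is still >= min_probability."""
    if not 0.0 < fidelity < 1.0:
        raise ValueError("fidelity must be strictly between 0 and 1")
    if not 0.0 < min_probability <= 1.0:
        raise ValueError("min_probability must be in (0, 1]")
    return math.floor(math.log(min_probability) / math.log(fidelity))


# Hypothetical example: 99% per-nucleotide fidelity, 200-nucleotide sequence.
print(perfect_copy_probability(0.99, 200))
# Longest sequence that is still copied without error at least half the time.
print(max_length_for_probability(0.99, 0.5))
```

In this simplified picture, a replicase whose own sequence is longer than the length it can copy with acceptable accuracy cannot reliably transmit the information that encodes it.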
The search for an RNA replicase ribozyme, a cornerstone of the RNA world hypothesis, is largely founded on improvements of the scaffold of the R18 RNA polymerase ribozyme, which itself is an optimised version of the complex class I ligase ribozyme [26]. The discovery of the class I ligase, which is capable of ligating RNA with higher efficiency and better turnover than most ribozymes, was perhaps a lucky coincidence (or misfortune, if better ribozymes were missed), and can optimistically be expected to occur on average once in every 2000 selection experiments [27]. Considering this, it is truly astonishing how far this single ribozyme family has been developed. Initially capable of copying only very simple templates [27], variants of the polymerase are now able to copy complex templates [28,29], including the synthesis of an entire catalytic domain of a polymerase itself from trinucleotides [30], achieved through copying and subsequent ligation of fragments, albeit with multiple human interventions. The much-anticipated 'riboPCR', the amplification of RNA sequences in a ribozyme-catalysed polymerase chain-like reaction, has so far only been successful for very short primer dimers, which can be melted rapidly at relatively low temperatures, therefore minimising temperature-induced RNA hydrolysis [29]. The apparent limitations of riboPCR with respect to amplification of long strands complicate self-replication scenarios, although schemes invoking the asymmetric replication of short RNAs (where 'antisense' strands are produced in excess over coding 'sense' strands) followed by ligation or recombination into an active replicase could still provide an elegant solution to the problem [31,32].
Figure caption (partial): Both long and short oligomers can fold into structures of varying complexity, resulting in the emergence of functional ribozymes. As complexity increases, the first RNA replicase emerges, and encapsulation results in protocells with distinct genetic identities capable of evolution. In reality, it is likely that multiple processes occurred in parallel, rather than in a strictly stepwise manner, and encapsulation may have occurred at any stage.
Looking back at the long history of the field, one might wonder why we have yet to achieve (self-sustained) RNA replication and transcription, despite its centrality to the RNA world hypothesis. There are three possibilities:
1. RNA is capable of this process but more time is needed to identify either conditions or replicators of sufficient complexity that are able to solve the various problems associated with protein-free RNA replication.
2. RNA in isolation (including ribozymes) is simply not sufficient to catalyse its own replication, and substantial help from either other molecules or the environment is essential.
3. RNA replication was never really central during early molecular evolution but rather the late result of a (crudely) replicating non-enzymatic metabolism [33][34][35][36][37] or an early 'polypeptide first' world [38][39][40]. We will not discuss the merits of these scenarios here, but believe it is crucial to test and challenge the predictions made by these alternative models experimentally.
For the first possibility, it may only be a matter of time and combined efforts to identify experimental model scenarios that are convincing enough to please critics of the field. In the worst case, the formation of life as we know it from RNA could be the result of a 'frozen accident', similar to the genetic code [41], that is generally hard or impossible to reproduce (e.g. a robust self-replicating ribozyme). However, the current consensus seems to be that while it may never be possible to identify the exact trajectory that led to our modern biochemistry, it should still be possible to emulate the process and find related routes that lead to a 'recapitulated' origin ex situ [42]. This notion is largely grounded on the assumption that enzyme-free replication requires no a priori sequence information and can, therefore, emerge spontaneously under suitable environmental and chemical conditions (e.g. a continuous supply of activated monomers and processes that enable repeated strand separation). Assuming the remaining experimental problems of continuous enzyme-free nucleic acid replication are solved, natural selection should spontaneously produce systems that are better or 'good enough' to persist under the given conditions. Whether these new replicators will necessarily evolve into more complex systems with advanced (and potentially emergent) properties (e.g. cooperative ribozyme networks) is of course another matter, although simulations predict that genomic complexity is forced to increase in a fixed environment [43][44][45].
To identify suitable conditions for such an in-laboratory origin, a 'flexible' approach is probably the best choice in light of the large number of possible geochemical conditions that have been proposed to host the emergence of life [46]. In other words, it is most sensible to perform key experiments under relaxed but plausible experimental boundary conditions instead of trying to implement strict restraints based on educated guesses about a specific prebiotic environment. Once a set of experimental conditions that can sustain certain crucial reactions such as RNA synthesis, building block activation and self-replication has been identified, it will help to pinpoint plausible geochemical scenarios automatically. There are several examples of such problem-oriented approaches, e.g. tackling the strand inhibition problem during the replication of long RNAs using viscous solvents and temperature oscillations [47], overcoming low substrate concentrations and the fragility of RNA by working under frozen conditions [48,49] or implementation of scenarios enabling multistep, uninterrupted synthesis of key building blocks of nucleotide synthesis [50] or nucleotides themselves [16]. In addition, combining typical model RNA world reactions with non-equilibrium settings based upon thermal gradients shows great promise [51]. For example, gas bubbles in combination with thermal gradients cause dissolved materials to cycle between dry and wet states and enable the key steps of precursor/oligonucleotide accumulation and RNA phosphorylation, while drastically increasing ribozyme activity and facilitating RNA encapsulation into vesicle aggregates [52]. Oscillating salt concentrations in such environments cause local melting of nucleic acid duplexes up to 20°C lower than the Tm, which could provide an environmental route to overcoming the strand dissociation problem [53]. It remains to be seen if such environments can eventually support coupled cycles of RNA activation, replication and encapsulation under continuous conditions.
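To give a feel for why salt oscillations can matter for duplex melting, the sketch below uses a commonly quoted empirical rule of thumb for long DNA duplexes in monovalent salt, in which the melting temperature shifts by roughly 16.6 °C per tenfold change in cation concentration. Applying this coefficient to RNA is an assumption made purely for illustration, the concentrations are hypothetical, and this is not the model or the measurement underlying [53].

```python
import math

# Empirical coefficient (degrees C per decade of monovalent cation concentration)
# often quoted for long DNA duplexes; its use for RNA here is an assumption.
SALT_COEFFICIENT_C_PER_DECADE = 16.6


def tm_shift_celsius(salt_before_molar, salt_after_molar):
    """Approximate change in duplex melting temperature (degrees C) when the
    monovalent salt concentration changes from salt_before to salt_after."""
    if salt_before_molar <= 0 or salt_after_molar <= 0:
        raise ValueError("salt concentrations must be positive")
    ratio = salt_after_molar / salt_before_molar
    return SALT_COEFFICIENT_C_PER_DECADE * math.log10(ratio)


# Hypothetical example: local dilution from 0.1 M to 0.01 M monovalent salt.
print(f"estimated Tm shift: {tm_shift_celsius(0.1, 0.01):.1f} degrees C")
```

On this rough estimate, a tenfold local drop in salt lowers the melting temperature by an amount comparable in size to the effect described above, which is why transient low-salt phases are an attractive route to strand separation without extreme heating.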
Similar combined efforts will also be necessary if RNA alone is insufficient to drive continuous self-replication and evolution. In this case, it may be necessary to diversify the pool of feedstock molecules by taking into account the chemical and conformational heterogeneities found in many experimental scenarios. For example, nucleic acid polymers with non-inheritable backbone heterogeneities (e.g. 2′-5′ versus 3′-5′ backbone heterogeneity for RNA [54] or mixed nucleotide backbones formed from RNA, DNA or other nucleic acid types [55]) have fascinating properties. In particular, some chimeric backbones decrease duplex stabilities, which could help to mitigate the strand dissociation problem [56]. Such a 'mixed' scenario seems plausible in view of the prebiotic clutter [57]. Recent synthesis strategies coming from different laboratories have found strong evidence that RNA and DNA could have arisen from the same set of precursor molecules [9,58], and ribozymes that can read and write both nucleotide backbone chemistries have already been found [59]. Even though heterogeneous nucleic acids pose a general problem for the heritability of genetic information, such chimaeras could have played an important role as non-genetic catalysts similar to modern proteins. Exploring such heterogeneous scenarios poses major experimental challenges, as many of the standard tools used to study RNA, particularly reverse transcription and (deep) sequencing, are harder or impossible with mixed backbone chemistries. Nevertheless, it remains important to investigate these scenarios and, if necessary, develop new molecular biological tools that can cope with non-homogeneous nucleic acid backbones [60].
Plausible help for RNA might also come from primitive polypeptides (thoroughly discussed elsewhere [61]). Nearly all ribozymes found in extant biology are associated with proteins that help them to carry out their function under intracellular conditions. These ribonucleoprotein complexes are thought to be remnants of an ancient biology where polypeptides could have supported folding and substrate binding of catalytic RNAs [62,63]. Before the advent of translation, these peptide cofactors would have been very simple or even composed of a pool of random peptides with sequence biases [64]. As such, they would probably not have been initially capable of precise functionalities requiring a well-defined active site (although there might have been some notable exceptions [65]). Even such simple peptides could have been crucial for RNA protection during non-enzymatic replication [18] and during ribozyme-catalysed RNA copying [66]. It is tempting to speculate that peptides might have also granted ancient 'ribonucleopeptide RNA replicases' improved nonspecific affinity to their substrates (i.e. a primer-template duplex), which is required for processivity but difficult to achieve with the polyanionic phosphate backbone of RNA alone. An advantage of this hypothesis is that an early co-evolution of RNA and peptides makes the transition to protein-dominated biology seem more plausible. Moreover, an early cooperation between RNA and peptides might also provide an elegant route to the formation of the first protocells before the advent of membrane-bound compartments [67]. As with nucleic acid heterogeneity, the inclusion of peptides represents an enormous analytical and experimental challenge, which will only be addressed by close collaboration between multiple disciplines within the origin of life field.
There remains the hope for origin of life scenarios where RNA plays a major role as an information carrier and catalyst. New experimental approaches using out-of-equilibrium settings could finally result in genuine RNA-based self-replicating systems capable of open-ended evolution. More complex scenarios involving RNA, DNA, peptides, simpler polynucleotides, chimeric intermediates or other yet unknown helper molecules may also be required, which will complicate the analytical understanding of the model systems and may ultimately render the term 'RNA world' in its traditional sense obsolete.

Summary
• Despite advances in prebiotic chemistry, it has not yet been possible to demonstrate robust and continuous RNA self-replication from a realistic feedstock.
• RNA in isolation may not be sufficient to catalyse its own replication and may require help from either other molecules or the environment.
• Non-equilibrium environments, backbone heterogeneity and polypeptide cofactors may address some of the remaining problems in the RNA world hypothesis.